Serializing C intermediate representations for efficient and portable parsing

Title: Serializing C intermediate representations for efficient and portable parsing
Publication Type: Journal Articles
Year of Publication: 2010
Authors: Meister JA, Foster JS, Hicks MW
Journal: Software: Practice and Experience
Pagination: 225 - 238
Date Published: 2010/03/01
ISBN Number: 1097-024X
Keywords: C, intermediate representations, parsing, static analysis, XDR, XML

C static analysis tools often use intermediate representations (IRs) that organize program data in a simple, well-structured manner. However, the C parsers that create IRs are slow, and because they are difficult to write, only a few implementations exist, limiting the languages in which a C static analysis can be written. To solve these problems, we investigate two language-independent, on-disk representations of C IRs: one using XML and the other using an Internet standard binary encoding called eXternal Data Representation (XDR). We benchmark the parsing speeds of both options, finding the XML to be about a factor of 2 slower than parsing C and the XDR over 6 times faster. Furthermore, we show that the XML files are far too large at 19 times the size of C source code, whereas XDR is only 2.2 times the C size. We also demonstrate the portability of our XDR system by presenting a C source code querying tool in Ruby. Our solution and the insights we gained from building it will be useful to analysis authors and other clients of C IRs. We have made our software freely available for download. Copyright © 2010 John Wiley & Sons, Ltd.
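
To make the idea of an XDR-serialized IR concrete, the following is a minimal sketch and not the authors' actual format. It assumes a toy expression IR with three hypothetical node kinds (`const`, `var`, `binop`) and hypothetical tag numbers, and hand-encodes it using the basic XDR rules: big-endian 4-byte integers, length-prefixed strings padded to a multiple of 4 bytes, and a discriminated union selected by a leading tag.

```python
import struct

# Hypothetical union discriminants for a toy IR (not from the paper).
TAG_CONST, TAG_VAR, TAG_BINOP = 0, 1, 2


def pack_uint(n):
    """XDR unsigned int: 4 bytes, big-endian."""
    return struct.pack(">I", n)


def pack_int(n):
    """XDR signed int: 4 bytes, big-endian two's complement."""
    return struct.pack(">i", n)


def pack_string(s):
    """XDR string: 4-byte length, the bytes, zero padding to a 4-byte boundary."""
    data = s.encode("utf-8")
    pad = (4 - len(data) % 4) % 4
    return pack_uint(len(data)) + data + b"\x00" * pad


def encode(node):
    """Encode a tuple-shaped IR node as an XDR discriminated union."""
    kind = node[0]
    if kind == "const":
        return pack_uint(TAG_CONST) + pack_int(node[1])
    if kind == "var":
        return pack_uint(TAG_VAR) + pack_string(node[1])
    if kind == "binop":
        _, op, left, right = node
        return pack_uint(TAG_BINOP) + pack_string(op) + encode(left) + encode(right)
    raise ValueError(f"unknown node kind: {kind!r}")


def unpack_string(buf, pos):
    """Read an XDR string at pos; return (string, position after padding)."""
    (length,) = struct.unpack_from(">I", buf, pos)
    start = pos + 4
    text = buf[start:start + length].decode("utf-8")
    pad = (4 - length % 4) % 4
    return text, start + length + pad


def decode(buf, pos=0):
    """Decode one node starting at pos; return (node, next position)."""
    (tag,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    if tag == TAG_CONST:
        (value,) = struct.unpack_from(">i", buf, pos)
        return ("const", value), pos + 4
    if tag == TAG_VAR:
        name, pos = unpack_string(buf, pos)
        return ("var", name), pos
    if tag == TAG_BINOP:
        op, pos = unpack_string(buf, pos)
        left, pos = decode(buf, pos)
        right, pos = decode(buf, pos)
        return ("binop", op, left, right), pos
    raise ValueError(f"unknown tag: {tag}")


# The C expression `x + 1` as a toy IR tree.
tree = ("binop", "+", ("var", "x"), ("const", 1))
blob = encode(tree)
roundtrip, end = decode(blob)
```

Because every field has a fixed width or an explicit length prefix, a reader can walk the buffer without tokenizing or re-parsing text, and the same byte layout can be read from any language that can unpack big-endian integers.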