WWW 2008 / Poster Paper    April 21-25, 2008 · Beijing, China

Protecting Web Services from Remote Exploit Code: A Static Analysis Approach

Xinran Wang, Yoon-Chan Jhi, Sencun Zhu
Dept. of Computer Science and Engineering, Pennsylvania State University, State College, PA
{xinrwang, jhi, szhu}@cse.psu.edu

Peng Liu
College of Information Sciences and Technology, Pennsylvania State University, State College, PA
pliu@ist.psu.edu

ABSTRACT
We propose STILL, a signature-free remote exploit binary code injection attack blocker to protect web servers and web applications. STILL is robust to almost all anti-signature, anti-static-analysis and anti-emulation obfuscation.

Categories and Subject Descriptors: C.2.0 [Computer-Communication Networks]: General - Security and protection.
General Terms: Security.
Keywords: HTTP, Code Injection Attack, Static Analysis.

Copyright is held by the author/owner(s). WWW 2008, April 21-25, 2008, Beijing, China. ACM 978-1-60558-085-2/08/04.

1. INTRODUCTION
A great number of remote binary execution vulnerabilities, including buffer overflow and format string vulnerabilities, have been found in web servers and web applications [1]. This type of vulnerability allows attackers to use a crafted HTTP request to inject a piece of exploit binary code into the "body" of the web servers and applications. Once such an exploit binary code injection attack succeeds, the attacker may gain full control of the victim machine. Depending on the attack, the exploit code may be either a piece of shellcode used to break into web servers or an infection vector for worms.

We propose STILL, a real-time, out-of-the-box, signature-free, remote exploit binary code injection attack blocker to protect web servers. STILL is motivated by an important observation: the request messages sent to web servers are exclusively data and not binary executable code. Since remote exploits are typically binary executable code, this observation indicates that if we can precisely distinguish (service requesting) messages that contain binary code from those that do not, we can protect web servers, as well as other Internet services that accept data only, from binary code injection attacks by blocking the messages that contain binary code. Figure 1 shows an application layer proxy-based STILL deployed between the web server and the corresponding firewall to protect web servers.

[Figure 1: Deployment of STILL. HTTP requests pass through the firewall, then through the proxy-based STILL (application layer), and then reach the web server.]

STILL (including static taint analysis and initialization analysis) detects not only unobfuscated exploit code and traditional polymorphic and metamorphic exploit code, but also self-modifying and indirect jump obfuscation code that could easily defeat previous static analysis approaches. Indeed, STILL is robust to almost all anti-signature, anti-static-analysis and anti-emulation obfuscation. STILL is signature free, so it can block new and unknown remote code injection attacks such as zero-day exploit code. STILL is also well suited to economical Internet-wide deployment, with very low deployment cost.

2. RELATED WORK
This paper is mainly relevant to the previous static analysis approaches to exploit code detection [3, 4, 6]. One benefit of these static analysis approaches is that they can detect both foreseen exploit code exploiting known vulnerabilities and zero-day exploit code exploiting unknown vulnerabilities. In addition, they are in general more resilient to polymorphism and metamorphism than string-matching signatures. However, Polychronakis et al. [5] demonstrated that some anti-static-analysis techniques, such as self-modification, can easily thwart these existing static analysis techniques. Polychronakis et al. [5] first proposed a CPU emulator to detect polymorphic shellcode. The emulator, being a dynamic analyzer, is immune to most anti-static-analysis techniques.
However, dynamic analysis is vulnerable to several anti-emulation techniques, which have existed in the virus-writer community for many years. Motivated by [5], we propose STILL, which is robust to both anti-static-analysis and anti-emulation techniques.

3. PROPOSED METHOD
Figure 2 depicts how STILL works. We next briefly describe its workflow. STILL works as a proxy-based blocker in the application layer. When it captures a data stream, it disassembles the data stream and generates a control flow graph. It then analyzes the disassembled result in two stages.

First, STILL detects self-modifying and indirect jump obfuscation code. Although the real exploit code may be hidden by self-modification and indirect jumps, the obfuscation code itself provides strong evidence of self-modifying and/or indirect jump behavior. STILL detects this behavior by static taint analysis and initialization analysis. Since polymorphism is a kind of self-modification, STILL can also detect polymorphic code in this stage.

However, attackers may use neither self-modifying nor indirect jump obfuscation. In the second stage, STILL therefore detects plain exploit code based on system calls and/or function calls, even when these have been obfuscated by metamorphism. STILL also exploits static analysis and initialization analysis in this stage to combat other obfuscation techniques. Below we describe the mechanisms in greater detail.

[Figure 2: The activity diagram of the STILL system. The data stream goes through disassembly and control flow generation. If self-modifying or indirect jump obfuscation code is found, the stream is blocked and an alert is raised. Otherwise, if plain exploit code, metamorphic code, etc. is found, the stream is blocked and an alert is raised. Otherwise, the stream is passed.]

3.1 Disassembly and Control Flow Graph Generation
We use the O(N) disassembly algorithm of SigFree [6] to disassemble the input data stream and generate a control flow graph, where N is the length of the data stream. The algorithm first decodes all possible instructions and finds all possible transfers of control in the data stream, and then creates a control flow graph based on these instructions and transfers of control. We note that in the presence of indirect jump and self-modifying obfuscation, it is impossible to completely and statically disassemble the entire body of the exploit code embedded in a data stream using the recursive traversal algorithm. Fortunately, the partially disassembled result may already provide strong evidence of self-modifying and/or indirect jump behavior.

3.2 Detection of Self-modifying and Indirect Jump Obfuscation Code
The new techniques we propose to detect self-modifying and indirect jump exploit code are called static taint analysis and initialization analysis. We observe that self-modifying and indirect jump exploit code must first acquire the absolute address of the payload. Accordingly, we first try to find, within an instruction sequence, the piece of code that acquires the absolute address of the payload at runtime. The variable that holds the absolute address is marked tainted. Then, we use static taint analysis to track the tainted values and detect whether tainted data are used in ways that could indicate the presence of self-modifying and indirect jump exploit code.

A tainted variable is propagated to a new tainted variable by data transfer instructions that move data (e.g., push, pop, mov) and by data operation instructions that perform arithmetic or bit-logic operations on data (e.g., add, sub, xor). For data transfer instructions, the destination operand is tainted if and only if the source operand is tainted. For data operation instructions, the destination operand is tainted if and only if either the source or the destination operand is tainted.

Finally, we use initialization analysis to reduce false positives. We observe that the operands of self-modifying and indirect jump code must be initialized. Specifically, the jump target of an indirect jump should be initialized, and the operands of memory updating or writing instructions in self-modifying code should be initialized. If these operands are uninitialized, we do not consider the code an attack.

4. EXPERIMENTAL RESULTS
To evaluate the detection effectiveness of STILL, we collected 12,000 polymorphic attack messages from 10 publicly available polymorphic engines, all of which encrypt the original shellcode. Seven of these ten engines are from the Metasploit Framework [2]: Countdown, Alpha2, JumpCallAdditive, Pex, PexFnstenvMov, PexFnstenvSub, and ShikataGaNai. The other three engines are CLET, ADMmutate, and JempiScodes. ShikataGaNai, CLET, ADMmutate, and JempiScodes are advanced polymorphic engines, which also obfuscate the decryption routine by metamorphism such as instruction replacement and garbage insertion. CLET also uses spectrum analysis to defeat data mining methods.

We generated 1,000 different attack messages for each of ADMmutate and CLET. For JempiScodes, we generated 3,000 different attack messages, 1,000 for each of its three obfuscation algorithms. We also generated 7,000 different attack messages using the Metasploit Framework, 1,000 for each of the following engines: Alpha2, JumpCallAdditive, Countdown, Pex, PexFnstenvMov, PexFnstenvSub, and ShikataGaNai. We tested the stand-alone prototype of STILL using these 12,000 attack messages. All of these messages were successfully detected.

5. CONCLUSION
We proposed STILL, a novel static taint and initialization analysis approach, to protect web servers from binary code injection attacks. Our experiments show that STILL detects self-modifying code and indirect jumps with high accuracy.

Acknowledgments
This research was supported by the National Science Foundation (CAREER NSF-0643906).

6. REFERENCES
[1] Computer Emergency Response Team (CERT).
http://www.cert.org.
[2] The Metasploit Project. http://www.metasploit.com.
[3] Ramkumar Chinchani and Eric Van Den Berg. A fast static analysis approach to detect exploit code inside network flows. In RAID, 2005.
[4] C. Kruegel, E. Kirda, D. Mutz, W. Robertson, and G. Vigna. Polymorphic worm detection using structural information of executables. In RAID, 2005.
[5] Michalis Polychronakis, Kostas G. Anagnostakis, and Evangelos P. Markatos. Network-level polymorphic shellcode detection using emulation. In DIMVA, 2006.
[6] Xinran Wang, Chi-Chun Pan, Peng Liu, and Sencun Zhu. SigFree: A signature-free buffer overflow attack blocker. In 15th USENIX Security Symposium, July 2006.
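
The taint propagation and initialization rules of Section 3.2 can be illustrated with a minimal Python sketch. It is not the STILL implementation. The toy instruction format, the pseudo-instructions `getpc` (standing in for whatever code sequence acquires the payload's absolute address), `store` (a memory write through an address register) and `jmp` (an indirect jump through a register), and the rule that an arithmetic result counts as initialized only when both operands are, are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    """Toy instruction (hypothetical format, not real x86 decoding)."""
    op: str
    dst: str = ""
    src: str = ""


def is_imm(operand: str) -> bool:
    # Assumption: immediates are written as numbers, e.g. "0x10" or "5".
    return operand[:1].isdigit()


def analyze(code):
    """Return (index, reason) alerts for a straight-line instruction list."""
    tainted, inited = set(), set()
    stack = []   # (tainted, initialized) flags of pushed values
    alerts = []

    def state(operand):
        if is_imm(operand):
            return (False, True)          # constants: untainted, initialized
        return (operand in tainted, operand in inited)

    def assign(reg, is_tainted, is_inited):
        (tainted.add if is_tainted else tainted.discard)(reg)
        (inited.add if is_inited else inited.discard)(reg)

    for i, insn in enumerate(code):
        if insn.op == "getpc":            # taint source: payload address
            assign(insn.dst, True, True)
        elif insn.op == "mov":            # transfer: dst tainted iff src is
            assign(insn.dst, *state(insn.src))
        elif insn.op == "push":
            stack.append(state(insn.src))
        elif insn.op == "pop":
            assign(insn.dst, *(stack.pop() if stack else (False, False)))
        elif insn.op in ("add", "sub", "xor"):
            # operation: dst tainted iff src or dst is tainted
            dt, di = state(insn.dst)
            st, si = state(insn.src)
            assign(insn.dst, dt or st, di and si)
        elif insn.op == "jmp":            # indirect jump via register dst
            t, init = state(insn.dst)
            if t and init:
                alerts.append((i, "indirect jump"))
        elif insn.op == "store":          # write src to memory at [dst]
            at, ai = state(insn.dst)
            _, vi = state(insn.src)
            if at and ai and vi:
                alerts.append((i, "self-modifying write"))
    return alerts


# A decoder-like sequence: get the payload address, patch it, jump to it.
decoder = [
    Insn("getpc", dst="eax"),
    Insn("add", dst="eax", src="0x10"),
    Insn("push", src="eax"),
    Insn("pop", dst="esi"),
    Insn("mov", dst="ecx", src="0x5a"),
    Insn("store", dst="esi", src="ecx"),
    Insn("jmp", dst="esi"),
]
print(analyze(decoder))  # [(5, 'self-modifying write'), (6, 'indirect jump')]
```

In this sketch, a jump through a register that never received the payload address, or through a tainted register that was combined with an uninitialized one, raises no alert. This mirrors the role of initialization analysis in reducing false positives.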