TY - CONF
T1 - A Statistical Analysis of Attack Data to Separate Attacks
Y1 - 2006
A1 - Michel Cukier
A1 - Berthier,R.
A1 - Panjwani,S.
A1 - Tan,S.
KW - attack data statistical analysis
KW - attack separation
KW - computer crime
KW - Data analysis
KW - data mining
KW - ICMP scans
KW - K-Means algorithm
KW - pattern clustering
KW - port scans
KW - statistical analysis
KW - vulnerability scans
AB - This paper analyzes malicious activity collected from a test-bed, consisting of two target computers dedicated solely to the purpose of being attacked, over a 109-day time period. We separated port scans, ICMP scans, and vulnerability scans from the malicious activity. In the remaining attack data, over 78% (i.e., 3,677 attacks) targeted port 445, which was then statistically analyzed. The goal was to find the characteristics that most efficiently separate the attacks. First, we separated the attacks by analyzing their messages. Then we separated the attacks by clustering characteristics using the K-Means algorithm. The comparison between the analysis of the messages and the outcome of the K-Means algorithm showed that 1) the mean of the distributions of packets, bytes and message lengths over time are poor characteristics to separate attacks and 2) the number of bytes, the mean of the distribution of bytes and message lengths as a function of the number of packets are the best characteristics for separating attacks.
M3 - 10.1109/DSN.2006.9
ER -