SNARE: Spatio-temporal Network-level Automatic Reputation Engine

Title: SNARE: Spatio-temporal Network-level Automatic Reputation Engine
Publication Type: Reports
Year of Publication: 2008
Authors: Feamster N, Gray AG, Krasser S, Syed NA
Date Published: 2008
Institution: Georgia Institute of Technology

Current spam filtering techniques classify email based on content and
IP reputation blacklists or whitelists. Unfortunately, spammers can
alter spam content to evade content-based filters, and spammers
continually change the IP addresses from which they send spam.
Previous work has suggested that filters based on network-level
behavior might be more efficient and robust, by making decisions based
on how messages are sent, as opposed to what is being sent or who is
sending them.
This paper presents a technique to identify spammers based on features
that exploit the network-level spatio-temporal behavior of email
senders to differentiate spamming IP addresses from legitimate senders.
Our behavioral classifier has two benefits: (1) it is early (i.e., it
can automatically detect spam without seeing a large amount of email
from a sending IP address, sometimes even upon seeing only a single
packet); (2) it is evasion-resistant (i.e., it is based on spatial and
temporal features that are difficult for a sender to change). We build
classifiers based on these features using two different machine
learning methods, support vector machines and decision trees, and we
study the efficacy of these classifiers using labeled data from a
deployed commercial spam-filtering system. Surprisingly, using only
features from a single IP packet header (i.e., without looking at
packet contents), our classifier can identify spammers with about 93%
accuracy and a reasonably low false-positive rate (about 7%). After
looking at a single message, spammer identification accuracy improves
to more than 94% with a false-positive rate of just over 5%. These
results suggest an effective sender reputation mechanism.
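
As a reading aid for the accuracy and false-positive figures quoted above, the following minimal sketch shows how both numbers can be computed from labeled classifier output. It assumes the standard definitions (accuracy is the fraction of all senders classified correctly; false-positive rate is the fraction of legitimate senders flagged as spammers), which the abstract itself does not spell out. The function name `accuracy_and_fpr` and the toy data are hypothetical and are not taken from the report.

```python
def accuracy_and_fpr(labels, predictions):
    """Return (accuracy, false_positive_rate) for a binary spam classifier.

    labels and predictions are equal-length sequences of booleans,
    where True means "spammer" and False means "legitimate sender".
    These definitions are an assumption, not quoted from the report.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    tp = fp = tn = fn = 0
    for actual, predicted in zip(labels, predictions):
        if predicted and actual:
            tp += 1  # spammer correctly flagged
        elif predicted and not actual:
            fp += 1  # legitimate sender wrongly flagged
        elif not predicted and actual:
            fn += 1  # spammer missed
        else:
            tn += 1  # legitimate sender correctly passed

    total = tp + fp + tn + fn
    legitimate = fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    false_positive_rate = fp / legitimate if legitimate else 0.0
    return accuracy, false_positive_rate


# Hypothetical toy data: 3 spammers followed by 5 legitimate senders.
toy_labels = [True, True, True, False, False, False, False, False]
toy_predictions = [True, True, False, True, False, False, False, False]
toy_accuracy, toy_fpr = accuracy_and_fpr(toy_labels, toy_predictions)
```

On the toy data, one spammer is missed and one legitimate sender is flagged, so 6 of 8 senders are classified correctly and 1 of 5 legitimate senders is a false positive.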