24 Apr 2006
Since we know we're not going to find a FUSSP (Final Ultimate Solution to the Spam Problem) any time soon, anti-spam work is concentrating on incremental efforts to make the current mail system, messy though it is, work better. Dealing with abuse reports is a particularly messy and labor-intensive area that desperately needs more automation.
When an ISP or other mail provider receives an abuse report, they typically triage it into categories such as immediate attention (phish, child porn, or other very illegal content), discard (false alarms due to forged headers, abusive rants), and eventual attention (everything else). Then they deal with the messages in each category.
Some providers still have people handling all of the abuse mail that comes in, but for any but the smallest that is far too slow and expensive. More likely they do the triage automatically by ad-hoc pattern recognition, looking for various keywords to figure out what kind of report it is, and looking for the Received: headers their mail programs produce to identify copies of mail they sent. For spam reports, if the automatic process can figure out the IP address or the sender, it just counts the report and acts when there are enough reports for a given IP or sender to indicate trouble. Often, if the reported message can be recognized as one from a mailing list, the process removes the complaining recipient from that list. Needless to say, this process is rather approximate and can make a lot of mistakes.
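The counting step described above can be sketched in a few lines of Python. This is only an illustration: the threshold value, the function name, and the assumption that each report has already been reduced to a source IP are hypothetical, not a description of how any particular provider does it.

```python
from collections import Counter

# Hypothetical threshold; a real provider would tune this to its mail volume.
REPORT_THRESHOLD = 3

def ips_needing_action(report_ips, threshold=REPORT_THRESHOLD):
    """Return the source IPs with at least `threshold` reports, sorted."""
    counts = Counter(report_ips)
    return sorted(ip for ip, n in counts.items() if n >= threshold)

# Usage: three reports about one address, one about another.
reports = ["192.0.2.1", "192.0.2.1", "198.51.100.7", "192.0.2.1"]
flagged = ips_needing_action(reports)  # only 192.0.2.1 reaches the threshold
```

The weak point is not the counting but the step before it: pulling a reliable IP or sender out of free-form report text.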
To make this process less approximate, an ad-hoc group including representatives from some of the largest ISPs has defined a simple common format for abuse messages known as ARF (Abuse Report Format). Building on existing practice as all the best standards do, an ARF message is a kind of multipart MIME message with three parts.
The first part is unstructured text not intended for computers to interpret, typically a note in case a live person reads the message. The second part is a series of lines in a format resembling mail message headers, each a keyword followed by a colon and the corresponding value, intended to be decoded by computers but still easy for people to read. The third part is a copy of the message about which the report is complaining (or as much of it as the recipient feels like sending back).
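As a rough sketch of that three-part layout, here is how such a message might be assembled with Python's standard email package. The MIME type names (multipart/report with report-type=feedback-report, and message/feedback-report for the second part) are assumptions taken from the form of ARF later published as RFC 5965; the description above does not name them. The function name, addresses, and field values are made up for illustration.

```python
from email import message_from_string
from email.mime.base import MIMEBase
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_arf_report(note, fields, original_text):
    """Assemble a three-part report: note, machine-readable fields, original mail."""
    # Assumed container type: multipart/report; report-type=feedback-report
    report = MIMEMultipart("report", **{"report-type": "feedback-report"})

    # Part 1: free text for a human reader.
    report.attach(MIMEText(note, "plain"))

    # Part 2: "Name: value" lines meant for programs (assumed MIME type).
    machine = MIMEBase("message", "feedback-report")
    machine.set_payload("".join(f"{name}: {value}\n" for name, value in fields.items()))
    report.attach(machine)

    # Part 3: a copy of the message being complained about.
    report.attach(MIMEMessage(message_from_string(original_text)))
    return report

# Usage with made-up values; From:, To:, and Subject: headers are omitted here.
report = build_arf_report(
    "This is an abuse report for a message received from 192.0.2.1.",
    {"Feedback-Type": "abuse", "Source-IP": "192.0.2.1",
     "Original-Mail-From": "<sender@example.com>"},
    "From: sender@example.com\nSubject: Buy now\n\nSpam body\n",
)
```

Calling report.as_string() yields the text to hand to a mail program once the usual addressing headers have been added.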
Some fields in the second part, such as Original-Mail-From:, Original-Mail-To:, Received-Date:, and Source-IP:, contain information that was parsed out of the original message or logged in the incoming SMTP session. The Feedback-Type: field describes the type of report. It is usually abuse, to report spam or other e-mail abuse, but it can also be fraud or virus. At the request of bulk mail providers, it can also be opt-out or opt-out-list, with a corresponding Removal-Recipient: field to say whom to remove, but those are unlikely to appear unless the recipient network has some reason to believe that the message is from a legitimate mailing list and some reason to do the sender's list management. The set of fields is intended to be extensible and will probably grow as ARF becomes more widely used.
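Because the second part looks like mail headers, an ordinary header parser can read it. The sketch below uses Python's standard email package to turn the fields into a dictionary and pick a handling queue from Feedback-Type:. The queue names and both function names are hypothetical; only the field names and feedback types come from the description above.

```python
from email import message_from_string

def parse_feedback_fields(text):
    """Parse 'Name: value' lines into a dict keyed by lowercased field name."""
    parsed = message_from_string(text)
    return {name.lower(): value.strip() for name, value in parsed.items()}

def queue_for(fields):
    """Pick a (hypothetical) handling queue from the Feedback-Type field."""
    queues = {
        "abuse": "spam-counter",
        "fraud": "urgent",
        "virus": "urgent",
        "opt-out": "list-removal",
        "opt-out-list": "list-removal",
    }
    # The field set is extensible, so unknown types go to a person.
    return queues.get(fields.get("feedback-type", "").lower(), "manual-review")

# Usage with made-up values.
sample = (
    "Feedback-Type: abuse\n"
    "Source-IP: 192.0.2.1\n"
    "Original-Mail-From: <sender@example.com>\n"
)
fields = parse_feedback_fields(sample)
queue = queue_for(fields)
```

Keying the dictionary on lowercased names keeps the lookup tolerant of senders that capitalize field names differently. A field that repeats keeps only its last value here.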
AOL and Earthlink can provide ARF reports for their feedback loop recipients, and it's used spottily elsewhere. If someone is already using an automated or semi-automated tool to send abuse reports, it's not hard to make it send ARF. I have some Perl scripts that send out abuse reports from my network, and it took under half an hour to change them from the previous ad-hoc text format to ARF.
While ARF won't do much on its own against spam, it should help cut down on bogus "this wasn't from us" responses from triage staff who misread reports. Also, since ARF is intended to be sent and received automatically, it makes it more likely that when a spam run hits a spamtrap and provokes an automated report, the report will be handled as soon as it's received, and maybe can even stop the spam run while it's in progress.
© 2005-2018 John R. Levine.