Short survey of email obfuscation techniques

Alec Jacobson

October 26, 2010


Although spam detection gets better each day, many people (and in the computer science community it seems like most or all of them) are paranoid about displaying their email addresses in plain text on the web. The HTML standard has gone out of its way to make displaying email addresses easy for both the programmer and the user.

<a href="mailto:joeschmo@cs.abc.edu">
joeschmo@cs.abc.edu
</a>

Displays as joeschmo@cs.abc.edu, which lets the user click on the address so that the browser opens the user's default mail client. But paranoia has driven many to obfuscate their addresses against would-be spammers by first removing the "mailto" link and then manipulating the text in some way:

joeschmo [AT] cs [DOT] abc [DOT] edu
firstnamelastname@cs.abc.edu

joeschmo[strudel]cs.abc.edu

The last one is my favorite (courtesy of Yotam Gingold, who told me strudel is how Israelis refer to the @ sign). I remember a student asking one of my professors who used one of these techniques, "Isn't it pretty easy for a spam bot to parse these common replacements?" And he replied, not without wisdom, "A person who obfuscates their email address is probably also a person who doesn't respond to spam, so why would the spammers bother?" This certainly holds some reason, but then the burden is placed on the user who wants to email you. Typically we've put our email addresses on the web because we want other people to email us. So it should be as easy as possible to do so, which is exactly what the mailto link provides. Recently a friend and I experimented with a few different techniques for achieving user-end ease of use while also satiating our paranoia about spam attacks. Admittedly, posting these techniques could be seen as escalating the cold war with spammers. But they're certainly not unknown and probably not worth the spammers' while (see the logic above).
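
To make the student's point concrete, here is a minimal sketch of how little work it would take to undo the common replacements listed above. The function name deobfuscate and the exact patterns are my own assumptions for illustration; I have no idea what real spam bots actually run.

```javascript
// Illustrative only: a hypothetical helper that undoes the bracketed
// "[AT]", "[strudel]" and "[DOT]" replacements shown above.
function deobfuscate(text) {
  return text
    // "[AT]" or "[strudel]", with optional surrounding spaces, becomes "@"
    .replace(/\s*\[\s*(?:AT|strudel)\s*\]\s*/gi, '@')
    // "[DOT]", with optional surrounding spaces, becomes "."
    .replace(/\s*\[\s*DOT\s*\]\s*/gi, '.');
}

console.log(deobfuscate('joeschmo [AT] cs [DOT] abc [DOT] edu')); // joeschmo@cs.abc.edu
console.log(deobfuscate('joeschmo[strudel]cs.abc.edu'));          // joeschmo@cs.abc.edu
```

Two regular expressions cover both variants, which is why these text tricks mostly cost the human reader rather than the bot.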

Just in time replacement with javascript

<a href="mailto:joeschmocs.abc.edu" onmouseover="this.href=this.href.replace('ocs','o@cs');return true;" >
joeschmo@cs.abc.edu
</a>

Displays as joeschmo@cs.abc.edu

The email address is still displayed in plain text in the source, so we're not done yet, but the href mailto link is only correct after the user has moused over the link. Thus, when the user clicks on the link, the correct email address shows up in the mail client. Doing this onmouseover is especially nice because the correct email address shows up in the browser's status bar, and the user gets the correct email address if he right-clicks and chooses "Copy email address" or the like. The one place where this fails is if the user is navigating via keyboard (tabbing) through the forms and links. Then focusing on the email link and activating it from the keyboard will not present the user with the correct link. You can fix this by adding the same functionality to the onfocus handler.

<a href="mailto:joeschmocs.abc.edu" onfocus="this.href=this.href.replace('ocs','o@cs');return true;" onmouseover="this.href=this.href.replace('ocs','o@cs');return true;">
joeschmo@cs.abc.edu
</a>

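Displays as joeschmo@cs.abc.edu

Note: You should be careful that the replacement you use only takes effect the first time, so that if the link is hovered over or focused more than once the correct email address remains. Here 'ocs' no longer occurs once the @ has been inserted, so repeated events leave the address alone.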
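
The point about repeated events can be sketched as follows. The helper name restoreAddress and the explicit check for an existing @ are my own assumptions, not something the links above use; they just make the "only once" property explicit instead of relying on the choice of search string.

```javascript
// Hypothetical helper: put the "@" back into a deliberately broken mailto
// href, and do nothing if an earlier event has already fixed it.
function restoreAddress(href, broken, fixed) {
  if (href.indexOf('@') !== -1) {
    return href; // already contains an "@", leave it alone
  }
  // With a string pattern, replace() only changes the first match.
  return href.replace(broken, fixed);
}

var once = restoreAddress('mailto:joeschmocs.abc.edu', 'ocs', 'o@cs');
var twice = restoreAddress(once, 'ocs', 'o@cs');
console.log(once);  // mailto:joeschmo@cs.abc.edu
console.log(twice); // mailto:joeschmo@cs.abc.edu

// A careless replacement with no guard breaks on the second hover:
var careless = once.replace('cs', '@cs');
console.log(careless); // mailto:joeschmo@@cs.abc.edu
```

In a link, this could be called as onmouseover="this.href=restoreAddress(this.href,'ocs','o@cs');", at the cost of having to define the function somewhere in the page.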

Address completion via CSS

My friend, Tino Weinkauf, thought of a very clever way to display his proper email address in a browser while being fairly sure that spam bots would not see it. The idea hinges on the assumption that spam bots won't waste time evaluating CSS code, because usually CSS doesn't change the content of the page. In your HTML head tag:

<style type="text/css">
.eobf:after {content:"o\000040cs";} /* six hex digits, so the "c" that follows is not read as part of the escape */
</style>

then in your link:

<a href="mailto:joeschmocs.abc.edu" 
onfocus="this.href=this.href.replace('ocs','o@cs');return true;" onmouseover="this.href=this.href.replace('ocs','o@cs');return true;" 
>
<span class="eobf">joeschm</span>.abc.edu
</a>

In a browser that applies the CSS, this displays as joeschmo@cs.abc.edu, while the link text in the page source is only joeschm.abc.edu.

The only thing I don't like about this technique is that (at least on Safari) selecting the displayed address and copying it doesn't seem to grab the part placed by the CSS. So if the user is copying the text, he'll have to fill in the missing characters on his own anyway.

Note: Michael Overton has an amusing way of preventing spam from reaching his inbox. His method certainly seems to place the most extreme burden on himself and the user; it is hard to imagine that he's not missing many non-spam emails.

Update: In the end I'm skipping the CSS trick and sticking to JavaScript and simple character replacement:

<a  href="mailto:jacobsoncs.nyu.edu" 
  onmouseover="this.href=this.href.replace('ncs','n@cs');return true;"
  onfocus="this.href=this.href.replace('ncs','n@cs');return true;"
  >
  jacobson<span style="font-size:1.25em">&#x263A;</span>cs.nyu.edu
</a>

Which displays as: jacobson☺cs.nyu.edu

Admittedly the user has to replace the ☺ with an at sign if he copies the displayed text, but it's a pleasant, easily noticed substitution.

Note: Another site on the issue has an in-browser spam parser you can use to check out your obfuscation technique.

Update: A friend pointed out that rather than waiting for the user to hover over the link, you could just have the switch occur when the page is loaded. The logic is that most spam bots that don't evaluate onmouseover probably don't evaluate onload either. The only problem I see is that the <a> element doesn't support the onload event, so you'd have to give your email link an id and do the replacement in your html's or body's onload event.
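
A rough sketch of that onload variant, assuming the link has been given the (hypothetical) id "email-link" and still uses the broken href from my link above:

```javascript
// Hypothetical helper: fix the href of a link-like object, but only if the
// "@" has not been put back already.
function fixEmailLink(link) {
  if (link && link.href.indexOf('@') === -1) {
    link.href = link.href.replace('ncs', 'n@cs');
  }
  return link;
}

// Only hook into the load event when actually running in a browser.
if (typeof window !== 'undefined') {
  window.addEventListener('load', function () {
    // "email-link" is an assumed id; use whatever id you give your <a> tag.
    fixEmailLink(document.getElementById('email-link'));
  });
}
```

Keeping the onmouseover and onfocus handlers on the link as well does no harm, since the replacement does nothing once the @ is in place.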

Comments

January 01, 2011, Tino

Hi Alec, Thanks for writing this up. Copy&Paste works in Opera and I never checked it in other browsers before. I just tried it in Firefox and it fails. So bad! Even worse, it works in IE. Happy New Year! Tino.