Short survey of email obfuscation techniques

Although spam detection gets better each day, many people (and in the computer science community it seems like most or all of them) are paranoid about displaying their email addresses in plain text on the web.

The HTML standard has gone out of its way to make displaying email addresses easy for both the programmer and the user.


<a href="mailto:joeschmo@cs.abc.edu">
joeschmo@cs.abc.edu
</a>

Displays as
joeschmo@cs.abc.edu
which allows the user to click on the address so that the browser opens the user’s default mail client.

But paranoia has driven many to obfuscate their addresses against would-be spammers by first removing the “mailto” link and then manipulating the text in some way:


joeschmo [AT] cs [DOT] abc [DOT] edu
firstnamelastname@cs.abc.edu
[image: the email address rendered as a picture instead of text]
joeschmo[strudel]cs.abc.edu

The last one is my favorite (courtesy of Yotam Gingold, who told me strudel is how Israelis refer to the @ sign).

I remember a student asking one of my professors who used one of these techniques, “Isn’t it pretty easy for a spam bot to parse these common replacements?” He replied, not without wisdom, “A person who obfuscates their email address is probably also a person who doesn’t respond to spam, so why would the spammers bother?”

There is certainly some sense in this, but it places the burden on the user who wants to email you. Typically we’ve put our email addresses on the web because we want other people to email us. So it should be as easy as possible to do so, which is exactly what the language provides.
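As for the student's question, here is a minimal sketch of how little it takes to undo the bracketed replacements listed above. The function name `normalizeObfuscated` and the exact patterns are assumptions for illustration, not taken from any real harvester.

```javascript
// Hypothetical sketch: undo the bracketed replacements shown earlier.
// The patterns are assumptions; a real harvester may do more or less.
function normalizeObfuscated(text) {
  return text
    .replace(/\s*\[(?:AT|strudel)\]\s*/gi, "@") // "[AT]" or "[strudel]" -> "@"
    .replace(/\s*\[DOT\]\s*/gi, ".");           // "[DOT]" -> "."
}

console.log(normalizeObfuscated("joeschmo [AT] cs [DOT] abc [DOT] edu")); // joeschmo@cs.abc.edu
console.log(normalizeObfuscated("joeschmo[strudel]cs.abc.edu"));          // joeschmo@cs.abc.edu
```

Two regular expressions cover three of the four examples, which is why these text-only tricks mostly cost the human reader rather than the bot.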

Recently a friend and I experimented with a few different techniques for achieving user-end ease of use while also satiating our paranoia about spam attacks. Admittedly, posting these techniques could be seen as escalating the cold war with spammers. But they’re certainly not unknown and probably not worth spammers’ while (see the logic above).

Just in time replacement with javascript


<a href="mailto:joeschmocs.abc.edu" onmouseover="this.href=this.href.replace('ocs','o@cs');return true;" >
joeschmo@cs.abc.edu
</a>

Displays as

joeschmo@cs.abc.edu

The email address is still displayed in plain text in the source so we’re not done yet, but the href mailto link is only correct after the user has moused over the link. Thus, when the user clicks on the link the correct email address shows up in the mail client. Doing this onmouseover is especially nice because the correct email address shows up in the browser’s status bar, and the user gets the correct email address if he right-clicks and chooses “Copy email address” or the like.
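The same mouseover fix can also live in a script rather than an inline attribute. This is only a sketch: the class name `jit-mail` and the helper name `restoreAt` are made up for illustration, and the script is assumed to run after the links exist (for example at the end of the body).

```javascript
// Hypothetical helper: put the "@" back into a mailto href that was
// published without it (e.g. "mailto:joeschmocs.abc.edu").
function restoreAt(href) {
  // String.replace with a string pattern changes only the first match.
  return href.replace("ocs", "o@cs");
}

// Browser-only wiring; skipped when no DOM is available (e.g. under Node).
if (typeof document !== "undefined") {
  // "a.jit-mail" is an assumed selector for the links to fix up.
  document.querySelectorAll("a.jit-mail").forEach(function (link) {
    link.addEventListener("mouseover", function () {
      link.href = restoreAt(link.href);
    });
  });
}

console.log(restoreAt("mailto:joeschmocs.abc.edu")); // mailto:joeschmo@cs.abc.edu
```

Keeping the replacement in an external file also means a harvester would have to fetch and run that file to recover the address.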

The one place where this fails is if the user is navigating via keyboard (alt-tabbing) through the forms and links. Then focusing on the email link and “clicking” it with space-bar will not present the user with the correct link. You can fix this by adding the same functionality to the onfocus action.


<a href="mailto:joeschmocs.abc.edu" onfocus="this.href=this.href.replace('ocs','o@cs');return true;" onmouseover="this.href=this.href.replace('ocs','o@cs');return true;">
joeschmo@cs.abc.edu
</a>

Displays as

joeschmo@cs.abc.edu

Note: Be careful that the replacement you use has an effect only the first time it runs, so that the correct email address remains if the link is hovered over or focused more than once.
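To make the note concrete, here is a small sketch comparing a replacement that is safe to repeat with one that is not. Both function names are hypothetical.

```javascript
// Safe to repeat: after the first call "ocs" no longer occurs in the href,
// so later calls find nothing to replace.
function safeFix(href) {
  return href.replace("ocs", "o@cs");
}

// Not safe to repeat: the pattern "cs" is still present after the fix,
// so each extra hover or focus inserts another "@".
function unsafeFix(href) {
  return href.replace("cs", "@cs");
}

const original = "mailto:joeschmocs.abc.edu";
console.log(safeFix(safeFix(original)));     // mailto:joeschmo@cs.abc.edu
console.log(unsafeFix(unsafeFix(original))); // mailto:joeschmo@@cs.abc.edu
```

The rule of thumb is that the search pattern should not be a substring of the corrected address.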

Address completion via CSS

My friend, Tino Weinkauf, thought of a very clever way to be able to display his proper email address in a browser and be sure that spammers would not see it. The idea hinges on the assumption that spam bots won’t waste time evaluating CSS code because usually CSS doesn’t change the content of the page.

In your HTML head tag:


<style type="text/css">
.eobf:after {content:"o\000040cs";} /* six-digit escape for "@", so the "c" that follows is not read as another hex digit */
</style>

then in your link:


<a href="mailto:joeschmocs.abc.edu" 
onfocus="this.href=this.href.replace('ocs','o@cs');return true;" onmouseover="this.href=this.href.replace('ocs','o@cs');return true;" 
>
<span class="eobf">joeschm</span>.abc.edu
</a>

Displays as

joeschmo@cs.abc.edu

The only thing I don’t like about this technique is that (at least in Safari) selecting the displayed address and copying it doesn’t seem to grab the part placed by the CSS. So if the user is copying the text, he’ll have to fill in the missing characters on his own anyway.
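The copy problem and the bot evasion come from the same fact: text inserted by `:after` is generated content and is not part of the document's own text. As a rough sketch, stripping the tags approximates what a markup-only reader gets. The function name `textOnly` is hypothetical, and regex tag stripping is a crude stand-in for a real HTML parser.

```javascript
// Rough approximation of the text a markup-only reader sees: drop the tags
// and keep the character data. CSS-generated content never appears here.
function textOnly(html) {
  return html.replace(/<[^>]*>/g, "").trim();
}

const linkMarkup =
  '<a href="mailto:joeschmocs.abc.edu">' +
  '<span class="eobf">joeschm</span>.abc.edu</a>';

console.log(textOnly(linkMarkup)); // joeschm.abc.edu
```

Anything that reads only this text, whether a scraper or a browser's copy command, gets the address without the “o@cs” part.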

Note: Michael Overton has an amusing way of preventing spam from reaching his inbox. His method certainly seems to place the most extreme burden on himself and the user; it’s hard to imagine that he’s not missing many non-spam emails.

Update:
In the end I’m skipping the CSS trick and sticking to javascript and simple character replacement:


<a  href="mailto:jacobsoncs.nyu.edu" 
  onmouseover="this.href=this.href.replace('ncs','n@cs');return true;"
  onfocus="this.href=this.href.replace('ncs','n@cs');return true;"
  >
  jacobson<span style="font-size:1.25em">&#x263A;</span>cs.nyu.edu
</a>

Which displays as:
jacobson☺cs.nyu.edu

Admittedly the user has to replace the ☺ with an at sign, but it’s a pleasant, easily-noticed substitution.

Note: Another site on the issue has an in-browser spam parser you can use to check out your obfuscation technique.

Update: A friend pointed out that rather than waiting for the user to hover over the link, you could just make the switch when the page is loaded. The logic is that most spam bots that don’t evaluate onmouseover probably don’t evaluate onload either. The only problem I see is that the <a> element doesn’t support the onload event, so you’d have to give your email link an id and do the replacement in your html’s or body’s onload handler.
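A minimal sketch of that onload variant follows. The id `email-link` and the helper name `fixHref` are assumptions for illustration; the replacement itself is the same one used in the hover handlers above.

```javascript
// Hypothetical helper: the same one-shot replacement used in the handlers.
function fixHref(href) {
  return href.replace("ncs", "n@cs");
}

// Browser-only wiring; skipped when no DOM is available (e.g. under Node).
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.addEventListener("load", function () {
    // "email-link" is an assumed id given to the <a> element.
    var link = document.getElementById("email-link");
    if (link) {
      link.href = fixHref(link.href);
    }
  });
}

console.log(fixHref("mailto:jacobsoncs.nyu.edu")); // mailto:jacobson@cs.nyu.edu
```

The trade-off is that the corrected href now sits in the live page from the start, so any bot that does run scripts sees it without needing to simulate a hover.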


4 Responses to “Short survey of email obfuscation techniques”

  1. I like your just-in-time javascript solution. I took the same approach at http://www.php-ease.com/functions/email_link.html – only I also put my encoded string in the title attribute, and move the decoded string into the href attribute just before they click for one more level of evasiveness.

  2. Tino says:

    Hi Alec,

    Thanks for writing this up. Copy&Paste works in Opera and I never checked it in other browsers before. I just tried it in Firefox and it fails. So bad! Even worse, it works in IE.

    Happy New Year!

    Tino.

  3. Guido Tonini says:

    We implemented the “just in time” solution back in year 2004.
    A sample page still exists on the Wayback Machine, so our vintage approach can be viewed at
    http://web.archive.org/web/20041207040024/www.irc-irene.org/ecomondo2004/
    (please be patient for it to load. Css not archived: awkward formatting, but it works)

    The main differences were:
    – the scrambled email is placed as a direct argument of the onmouseover function, so no need to take care of multiple calls
    – onfocus=this.onmouseover() avoids duplicating the decode function and reduces length
    – the decoding function is “external”, not hard-coded into the link. A harvester would in principle have to download all the .js files, then find and execute the proper js function to get the email
    – initial href is “mailto:please.Activ@te.javascri.pt” (or something similar). A sort of courtesy alert for noscript users
    – the email text in the link is camouflaged with a mix of several techniques to strengthen it against harvesters

    We didn’t get any spam over the years on email addresses included in these pages

    We still use today a very similar technique – just a bit improved – with good success

  4. Obfuscator says:

    Here’s one way to obfuscate using CSS only, and create a “clickable” link.

    http://albertogasparin.it/scraps/2011/02/rtl-better-email-obfuscation-css-anchor/
