shahine.com/omar/


yet another Microsoft blogger

# Thursday, July 07, 2005

Boomerang

Catchy name for a feature, huh? Boomerang is one of the cooler and more innovative things we've done at Hotmail, and chances are you have no idea what it is or that we're even using it.

I'm going to be upfront and tell you that I had nothing to do with Boomerang. However, I'm writing about it because it restores dependability to email. It's specifically aimed at keeping the spammers from ruining mail.

Background

Let’s go back to when I started using email. That was around 1994, my first year in college. Back then you could send and reply to email as much as you wanted and it generally got delivered. The only thing that prevented email from getting from user A to user B was some kind of routing/SMTP failure or a bug in the mail client (like a crash). There weren't any spammers out there to muck this up. Furthermore, if you made a typo when writing someone's email address, you typically got a Non Delivery Receipt (NDR) notifying you of the failure. Generally, things were good.

Today Hotmail blocks what we estimate to be about 3.2 billion messages. I say estimate because much of our anti-spam technology will simply prevent us from picking up the phone when you try and call. In other words, we drop you at the router, before we even spend any cycles figuring out if you are sending spam using software such as SmartScreen. We've invested a tremendous amount of time and energy into systems that try and figure out the behavior of your IP address, and the content of your messages. These systems are by no means fool-proof and unfortunately, it’s possible that you have experienced some false positives (messages that are not spam in your Junk folder), or, more likely, false negatives, which are messages in your Inbox that are spam. What's worse though are the messages you never even received. To give you an example, I didn't find out my old roommate was getting married till pretty late because his "save the date" email never made it to my Inbox. That sucks!

The Solution

Well, it's impossible to fix things perfectly. But there is one critical scenario that we did solve, and we did it using Boomerang. If you send mail to someone using your Hotmail account, replies to that message will go straight to your Inbox. In other words, messages you send your friends and family will have a straight shot back to you if they choose to reply. There is also an additional benefit. Over time spammers started sending spam to other people that looks like it came from you (“spoofing” you). Some of that doesn’t reach its victim and gets bounced back to you as an NDR, diluting the value of NDRs. The problem with spoofing is that anyone can pretend to be you and send mail using your email address. The result is that many mail providers will send NDRs back to you (not the spammer) telling you that you tried to send mail to a non-existent mailbox. Annoying. Well, Boomerang allows us to nuke any NDR messages that were not a result of YOU sending mail. How did we do all this? With smart people and simple technology. It's my favorite kind of work. No standards bodies, no politics, just an idea and some code.

Boomerang takes advantage of three very simple things: 1) the Message-ID header, 2) a hash, and 3) the In-Reply-To header. Basically, when you send a mail at Hotmail, we create a Message-ID that is unique to you. The hash is a one-way hash, and it's pretty much only possible for our servers to generate it. I say pretty much because, as with any simple technology, it's entirely possible that some motivated person will find a way around this. However, it's really not worth their time to do so given that each of these IDs is time limited and unique to an individual.
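To make the idea concrete, here is a minimal sketch of how a tagged Message-ID could be generated. It is not the actual Hotmail code: the `BMG` prefix, the field layout, the use of HMAC-SHA256 as the one-way hash, and the names `SERVER_SECRET`, `sign`, and `make_message_id` are all assumptions made up for illustration.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

# Hypothetical values for illustration only. The real secret, tag layout,
# and hash function are not described in this post.
SERVER_SECRET = b"replace-with-a-real-server-side-secret"
TAG_PREFIX = "BMG"


def sign(user: str, issued: int, nonce: str) -> str:
    """Keyed one-way hash that binds a tag to one sender and one timestamp."""
    payload = f"{user.lower()}:{issued}:{nonce}".encode("utf-8")
    return hmac.new(SERVER_SECRET, payload, hashlib.sha256).hexdigest()[:20]


def make_message_id(user: str, domain: str = "example.com",
                    now: Optional[float] = None) -> str:
    """Build a Message-ID of the assumed form <BMG.issued.nonce.mac@domain>."""
    issued = int(time.time() if now is None else now)
    nonce = secrets.token_hex(4)  # keeps two messages sent in the same second distinct
    return f"<{TAG_PREFIX}.{issued}.{nonce}.{sign(user, issued, nonce)}@{domain}>"
```

Because the hash is keyed with a secret that only the servers hold, nobody else can mint a valid tag for your address, and the embedded timestamp is what lets a tag expire.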

When another mail client receives the message from Hotmail, it contains this special Message-ID. Most mail clients that respect our Message-ID will use it as the basis for the In-Reply-To header. This behavior is specified in RFC 822, the basis of the format of internet email. In-Reply-To is another message header that allows mail clients to uniquely tie a reply to another email (the one you sent). Each subsequent response in that thread carries the Message-ID forward in the References header. Whenever we receive a piece of mail that has this special Boomerang identifier in either of those two headers, we place it in your Inbox, ensuring that it bypasses any content filtering. Now, some mail clients do not respect the Message-ID that we generate, or they may not generate an In-Reply-To header. Clients like Hotmail, Outlook, and OE do, and I suspect the list will continue to grow.
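The inbound check could look something like the sketch below. Again, this is an illustration under assumptions rather than the real implementation: it assumes a tag of the made-up form `<BMG.issued.nonce.mac@domain>`, an HMAC-SHA256 keyed hash, and a 30-day lifetime, and the names `sign`, `is_boomerang_reply`, and `MAX_AGE_SECONDS` are hypothetical.

```python
import hashlib
import hmac
import re
import time
from email import message_from_string
from typing import Optional

# Hypothetical values; the real secret, tag layout, and lifetime are not public.
SERVER_SECRET = b"replace-with-a-real-server-side-secret"
MAX_AGE_SECONDS = 30 * 24 * 3600
TAG_RE = re.compile(r"<BMG\.(\d+)\.([0-9a-f]+)\.([0-9a-f]+)@[^<>\s]+>")


def sign(user: str, issued: int, nonce: str) -> str:
    """Keyed one-way hash that binds a tag to one sender and one timestamp."""
    payload = f"{user.lower()}:{issued}:{nonce}".encode("utf-8")
    return hmac.new(SERVER_SECRET, payload, hashlib.sha256).hexdigest()[:20]


def is_boomerang_reply(raw_message: str, recipient: str,
                       now: Optional[float] = None) -> bool:
    """True if In-Reply-To or References carries a valid, unexpired tag
    that was issued for this recipient."""
    current = time.time() if now is None else now
    msg = message_from_string(raw_message)
    headers = " ".join(str(msg.get(name, ""))
                       for name in ("In-Reply-To", "References"))
    for match in TAG_RE.finditer(headers):
        issued, nonce, mac = int(match.group(1)), match.group(2), match.group(3)
        fresh = 0 <= current - issued <= MAX_AGE_SECONDS
        # compare_digest avoids leaking how many characters matched.
        if fresh and hmac.compare_digest(mac, sign(recipient, issued, nonce)):
            return True
    return False
```

A message that passes this check would skip content filtering and go to the Inbox; anything else simply takes the normal path through the filters.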

On NDRs: the format is such that the original message is almost always attached (to facilitate resending). In this case, we can tell whether the proper Boomerang identifier is present, and not only bypass filtering if it is, but automatically junk the NDR if it is not. This is a very important scenario for users who get confused by NDRs for messages they never sent. Typically, someone else received a virus claiming to be from them, and that recipient's mail system detected it, cleaned it, and refused the message. The result is that someone two degrees away from the actual infected computer gets worried, calls helpdesks, reformats their machine, etc.
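Here is a rough sketch of that NDR decision, assuming the bounce attaches the original as a `message/rfc822` part. The function names and the `is_valid_tag` callback (standing in for whatever check validates a Boomerang identifier) are hypothetical, and sending bounces with no attached original through normal filtering is my assumption, since that case isn't covered above.

```python
from email.message import Message
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from typing import Callable, Optional


def original_message_id(bounce: Message) -> Optional[str]:
    """Return the Message-ID of the first original message attached to an NDR."""
    for part in bounce.walk():
        if part.get_content_type() == "message/rfc822":
            attached = part.get_payload()  # a list holding the embedded message
            if isinstance(attached, list) and attached:
                return attached[0].get("Message-ID")
    return None


def classify_ndr(bounce: Message, is_valid_tag: Callable[[str], bool]) -> str:
    """Return 'inbox', 'junk', or 'filter' (assumed fallback: normal filtering)."""
    message_id = original_message_id(bounce)
    if message_id is None:
        return "filter"  # no attached original or no Message-ID to judge by
    return "inbox" if is_valid_tag(message_id) else "junk"


def demo_bounce(message_id: str) -> Message:
    """Build a tiny NDR-like message with the original attached, for experimenting."""
    original = MIMEText("Hello!")
    original["Message-ID"] = message_id
    bounce = MIMEMultipart("report")
    bounce.attach(MIMEText("Delivery failed: no such mailbox."))
    bounce.attach(MIMEMessage(original))
    return bounce
```

For example, `classify_ndr(demo_bounce("<some-id@example.com>"), lambda mid: False)` lands in the junk case, which is what you'd want for a bounce of a message you never sent.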

So, to summarize: we have some code that tags your outbound messages. If the email recipients reply to your message, and they are using a modern mail client, that message will find its way back to you, just like a boomerang. If the message reaches an address that does not exist, you will receive notification of that Non Delivery, and you will not get spam messages disguised as NDRs, or NDRs that are a result of someone hijacking your email address.

We've had this feature in Hotmail for a few months now and it's working great. You probably never noticed, but you're probably getting all the replies to emails you sent when people choose to reply. Making email more reliable is just one of the things we're bringing you this year.

You can thank Eliot Gillum, Aditya Bansod and Pablo Stern for this work. They are smart guys, and I'm glad I get to work with them.

Disclaimer: this work is Patent Pending.

 

Thursday, July 07, 2005 9:06:57 AM (Pacific Daylight Time, UTC-07:00)
Omar, do you have any plans to push this technology to Outlook/OE/Exchange server? AFAICT, there are no technological reasons stopping mail clients from doing exactly the same thing, and this way you would get this cool feature for all your mail accounts. I know that you are trying to promote Hotmail and all, but think about all those home users that have email a/cs from their ISPs that would benefit from this, as well as your corporate customers.
Thursday, July 07, 2005 10:21:56 AM (Pacific Daylight Time, UTC-07:00)
I totally agree with you. I would love to see this technology widely deployed. I can't comment on who is doing it and when they will do it, but rest assured that we'll do our part to evangelize this technique within Microsoft.
Thursday, July 07, 2005 12:11:17 PM (Pacific Daylight Time, UTC-07:00)
When I began reading the above post, I guessed it would be about tagging an email and checking it again when someone replied. What was unexpected was that you consider this to be such an original idea that you decided to patent it.

Goodbye hotmail. Thankyou EU.
RichB
Thursday, July 07, 2005 1:34:31 PM (Pacific Daylight Time, UTC-07:00)
What I am not going to do, on this post, or on this blog is have a discussion about Patents.
Thursday, July 07, 2005 5:34:33 PM (Pacific Daylight Time, UTC-07:00)
I had noticed replies were delivered noticeably faster than the original message, guess I wasn't imagining it! Impressive work, I'm glad you blog this sorta stuff to keep us in the loop :)
Thursday, July 07, 2005 5:37:03 PM (Pacific Daylight Time, UTC-07:00)
So, how does Boomerang solve your roommate's invitation email? You would still not get that email, would you? Well, except if he was inviting you by replying to some other email you sent from your Hotmail account...
Andre
Thursday, July 07, 2005 9:52:41 PM (Pacific Daylight Time, UTC-07:00)
We already have a solution for that problem. If you add someone to your address book, then they are excluded from any filtering. My point, though, was just that the current situation sucks. We don't have a perfect solution for that problem yet.
Tuesday, July 12, 2005 2:29:46 PM (Pacific Daylight Time, UTC-07:00)
I stumbled across this post via a link from Mary Jo Foley's "Microsoft Watch" ("New Hotmail Code Names: Kahuna and Boomerang"). You made two statements in your post which concern me.

The first is this: "How did we do all this? With smart people and simple technology. It's my favorite kind of work. No standards bodies, no politics, just an idea and some code." I don't want to make too much of it, but it seems you are validating a lot of people's complaints that Microsoft doesn't respect standards.

I understand your point that your team was able to quickly, efficiently, and elegantly implement this technique without breaking standards. It's just that the way you say it, it's as if standards often get in your way. I'd like to state (as countless others have done) that Microsoft causes a lot of the standards burden because they don't implement standards well in their software. I have spent countless hours reworking CSS for websites so they will render at least close to how they appear in more compliant browsers.

Standards let us all do things in the BEST way instead of just the MICROSOFT way. I hope that Microsoft's attitude towards standards changes.... or that I have read too much into the statement I quoted above.

The second issue I want to mention regards the patent comment. I understand you don't want to get into a debate about patents. But I just want to say that my ISP has offered a similar feature for quite a while now. If I receive a reply to a message I sent, my ISP's Spam Filter locates a string in my message which serves as the "key." Though the Hotmail feature appears to be a little more automated and user-transparent, I think we can all agree that there is prior art that precludes placing a patent on Boomerang.

All that being said, I do applaud Microsoft's efforts in the spam and adware arena. I am excited about Longhorn's more locked-down user access. And XP SP2 has been a success... it has done a lot to alleviate the support burden at my ISP. (I used to work there.)
MIchael B.
Tuesday, July 12, 2005 5:12:05 PM (Pacific Daylight Time, UTC-07:00)
Thanks for the comment Michael.

I don't think my post was meant to discredit the work of standards bodies. I've worked on a number of features in my career at Microsoft that have been a direct result of the work of standards bodies, such as S/MIME, IMAP, POP, SMTP, HTML, XHTML, iCal, and vCard, and on others that have not, such as RSS and numerous other "proprietary" features.

In some cases implementation of standards can be difficult as a result of the lowest common denominator solutions necessary to support interoperability, or the changes in technology over time, or the "design by committee" problems of standards.

However, they do afford you some luxuries such as interop. Either way, this particular functionality is one I was happy we could ship without any need for broad standards as the technology need not interoperate.
Tuesday, July 12, 2005 11:46:05 PM (Pacific Daylight Time, UTC-07:00)
I thought about this before, but one reason why I think it wouldn't work is because some people have multiple email clients set to use the same email address. For example I send email using both OE and web-mail. And in both of them I specify the same sender email address. Now if I send with OE and OE sets a specific msg-id and then I receive replies using web-mail, the web-mail client won't recognize that msg-id and might think it is spam.

Another scenario is with online greeting card services, which usually send out emails with my email address as the sender, but which did not originate from my email client. If the recipient of that greeting card decides to reply to it to contact me, the msg-id would again be different.
Nathar Leichoz
Wednesday, July 13, 2005 9:25:01 AM (Pacific Daylight Time, UTC-07:00)
Is your Blogroll broken? Or are the majority of them supposed to point back at your own blog?
J. Curious
Wednesday, July 13, 2005 1:21:09 PM (Pacific Daylight Time, UTC-07:00)
Hi Omar,

Why is it necessary to rename Delivery Status Notifications to Non-Delivery Receipts? Is it simply a wanton disrespect for all existing standards?

By the way, I've been testing Message IDs in DSNs for over five years. I hope your patent self-immolates.
jwb
Thursday, July 14, 2005 3:34:58 AM (Pacific Daylight Time, UTC-07:00)
As I understand the terminology, DSNs are the structured error reports defined in RFC 3464 (previously RFC 1894). Support of this standard is unfortunately quite rare. An NDR, on the other hand, is just another word for a bounce message, regardless of the format used within.

Boomerang is a cool idea, but as it forces(?) the msg-id it works best when you control the user's e-mail client. For more general deployment, it would be useful to have an SMTP extension for submission mode which reports the added msg-id back to the client at the end of DATA. (This could be a "passive" service extension, a la PIPELINING.)

As for the multiple outbound servers with same address problem, Hotmail are already doing their best to coax people into using only their servers with the SenderID records they publish. Boomerang has the same problems if you use it as a forgery detector, but you shouldn't do that. Boomerang is a whitelisting scheme, although Hotmail uses its non-presence to junk NDRs, which is probably okay for their environment. I think a scheme like BATV is less intrusive for more sites, though.
Kjetil T. Homme
Thursday, July 14, 2005 7:34:31 AM (Pacific Daylight Time, UTC-07:00)
"It's my favorite kind of work. No standards bodies, no politics, just an idea and some code."

Only a Microsoft employee would put "standards bodies" in the same category as "politics." Not that you care, but it's that kind of attitude that makes Microsoft so hated by so much of the non-Microsoft world.

(NB: I tried posting this 3 times from Mozilla, but your site didn't accept it. It just reloaded the page, without posting my comment, and with no warning or error or anything. I'm trying this post through IE.)
Thursday, July 14, 2005 11:13:56 PM (Pacific Daylight Time, UTC-07:00)
Anthony, I think there are lots of people who would put "standards bodies" in the same category as "politics". Standards Bodies don't ship software, they don't have customers, and they don't have ship schedules to worry about.