Is MSN Lying About Their Referrer Spam?

Introduction
So several months ago MSN started crawling the web with a new crawler. This one was designed to (in theory) detect sites that were cloaking. The idea: a cloaking site serves a different page to bots, but a lot of cloaking sites skip their own IP list check whenever a referrer is present (it saves CPU). So sending a fake “LIVSOP” search engine referrer should make those sites hand the crawler the page meant for humans, which exposes the cloaking.
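To make that logic concrete, here is a rough sketch of the check as I understand it. This is not MSN's actual code. The referrer URL, the `fetch` callable, and the toy sites are all made up for illustration; a real `fetch` would issue an HTTP request from a known crawler IP.

```python
from typing import Callable, Optional, Set

# Hypothetical referrer; only the "LIVSOP" tag comes from the post.
FAKE_REFERRER = "http://search.live.com/results.aspx?q=example&FORM=LIVSOP"

# A fetch takes an optional referrer and returns the page body.
Fetch = Callable[[Optional[str]], str]


def looks_cloaked(fetch: Fetch) -> bool:
    """Return True if the page changes once a search referrer is sent."""
    without_referrer = fetch(None)
    with_referrer = fetch(FAKE_REFERRER)
    return without_referrer != with_referrer


def make_cloaking_site(client_ip: str, bot_ips: Set[str]) -> Fetch:
    """Toy site that skips its IP list whenever a referrer is present."""
    def fetch(referrer: Optional[str]) -> str:
        if referrer:
            # Shortcut to save CPU: assume anyone with a referrer is human.
            return "page for humans"
        if client_ip in bot_ips:
            return "page for crawlers"
        return "page for humans"
    return fetch


def honest_site(referrer: Optional[str]) -> str:
    """Toy site that serves the same page no matter what."""
    return "same page for everyone"
```

A crawler whose IP is on the site's bot list sees two different pages and flags the site; the honest site never trips the check.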
Sounds like a good move for them, right? No. The way they did it was by sending out a lot of referrers that were pretending to be from Live.com search results. Like, an excessive amount. I still get 30+ per day on this blog.
This was supposed to bypass a lot of cloaking filters, but really only succeeded in dirtying up everyone's logs, and causing a handful of people to ban the abusive IP range. If you want to take a look, it is well documented everywhere. Eventually, they admitted they were doing it, and the dust settled.
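If you want to see how bad it is in your own logs, something like this would count the fake hits per day. It is only a sketch: it assumes Apache-style combined log format and assumes the fake referrers are live.com URLs carrying the "LIVSOP" tag, so adjust both to whatever your logs actually show.

```python
import re
from collections import Counter
from typing import Iterable

# Combined log format: ip - - [date:time zone] "request" status bytes "referrer" ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)


def is_fake_live_referrer(referrer: str) -> bool:
    """Assumed marker: a live.com URL that carries the LIVSOP tag."""
    lowered = referrer.lower()
    return "live.com" in lowered and "livsop" in lowered


def fake_hits_per_day(lines: Iterable[str]) -> Counter:
    """Count log lines per day whose referrer matches the assumed marker."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and is_fake_live_referrer(match.group("referrer")):
            hits[match.group("day")] += 1
    return hits
```

Feed it an open log file (for example `fake_hits_per_day(open("access.log"))`, where the file name is just a placeholder) and you get a day-by-day tally.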
The Reality
So. I got asked by Gab of SEO ROI to find a domain I had that was completely banned so that he could settle a discussion he was having about Google Bowling (yes, he was testing against one of his old sites). I trust Gab, so I said sure.
Now, at this point I’m running my blackhat off of 2 shared hosting accounts: my ancient one, and my super-shared host that is running my current software, which uses almost no CPU, so I can host a lot of domains on it. I reasonably figured that my latest host’s domains were probably still mostly alive in MSN/Yahoo, even if they were banned from Google, and decided to root around my old account for a domain.
Now, on the ancient hosting account, I NEVER updated any databases or software to combat MSN’s referrer spam cloaking detection. All of my domains (as has been verified) behaved exactly how MSN would expect them to (yes, every single domain on this account cloaks). And then I realized something. Every single domain I had was still indexed on either Yahoo or MSN. That is not a good sign for them. Some of these are way older than a cloaked domain should be. I eventually had to give Gab a less than successful domain I had that was still indexed!
When I woke up the next morning, I then ran some figures and realized that 85% of the domains I had on that account are still indexed by Live/MSN. It’s been MONTHS since they started screwing with all my logs. They have crawled these domains well over 50,000 times (my database on that account freaks out if I query for more than 50,000 logs in the DB, so that’s all I can confirm) since they began spewing their crap all over my log files.
In fact, my very first blackhat domain (a piece of crap to be honest) is still somehow indexed by Live.com. It even appears (as of this morning) to have grown its indexed page count since last night, and now easily exceeds 15,000 pages.
So Why The Hell are They Crawling Like This?
To be honest, I have no idea. A thought by Gab (somewhat jokingly) was that maybe they’re just trying to shuffle some traffic into their Search Engine. A “Hey, I’m still here!” kind of thing. That seems a bit of a stretch, but who knows? There’s either some other reason for them doing this referrer spam, or they’re just really really terrible at detecting cloaking. I have no idea. These sites were exactly what their referrer spam should’ve detected.
Any ideas?
-XMCP
P.S: I’m even harder to get ahold of than normal this week. My good laptop bit the dust (HD burnt out) so now I’m on my old ghetto rig that can officially keep fewer browser windows open than an iPhone. So apologies.

March 14th, 2008 at 2:23 pm
It’s just f**n retarded.
For all I know, MSN is trying to remind webmasters that they exist. I’ve got site after site to rank #1 on MSN for a term I wanted… It’s pretty easy, but doesn’t do you much good in traffic. #1 on MSN is about as good as #37 in Google.
Or who knows, maybe MSN figures they can get a few links from stats pages… Push up their pagerank a bit — they sure need it.
March 14th, 2008 at 2:29 pm
Fun chatting with ya bud, and appreciate the link love. Honestly, at this point I can’t see any other reason than to artificially increase their numbers of “searches powered this month”
March 14th, 2008 at 5:04 pm
They could just be doing the equivalent of prstorm for live.com? LOL
March 15th, 2008 at 8:50 pm
This is the second time I’ve seen you refer to banned domains and google bowling. Any chance you will be doing a post about banned domains?
I have a white hat domain that got banned (for BS reasons) and I am currently redirecting it to a new domain (same site, new domain). Is that hurting the new site?
March 16th, 2008 at 4:29 pm
@B-Gone, no one knows the answer to that question for sure. Your rankings may return, or they may not. The penalty will carry over to your new domain if you didn’t get rid of the offending thing that caused it to be banned.
March 17th, 2008 at 5:44 pm
Thanks for the input Jack. I believe I have corrected the problem and the site has started to rank for some non-competitive terms, so maybe I have avoided the problem.
It would be interesting though to hear what someone who has experimented with banned domains has to say.