Google's "Bad Data Push" Affecting Alexa, Too


Google's "Bad Data Push" is mysteriously affecting the Alexa rankings of the billion-page spammer's sites. The random letter-number domains, affected by the bad data push and the malfunctioning site: operator, seem to have been negatively impacted by Google's recall of the bad data:



Banned = Crash

The crash is probably from them getting banned. But going from a top-2000 rank to crashing probably proves that he had millions, if not billions, of URLs indexed. 10,000 URLs, like the spammer said he had, would not get you into the top 2000 in rankings!!!

I just posted about this in the other thread.

Look at the Alexa rank and you'll see he's not getting much traffic at all. Daily reach is not a direct measure of visitors.

The TP1see domain has a rank of 331000, which implies a few hundred visitors a day. Not many for a 5 billion page site :)

Look at the peak of the graph of the daily traffic rank:

His busiest site (the one receiving a lot of traffic from the other domains as redirects) easily made it to under #1,000 in daily rank before the crash.

Does anyone believe this was a one domain trick? Or that Google has solved this problem? I'm seeing plenty of others still ranking well in Google... Including (edit by DaveN: we don't out spammers)

Define "outing"?

I give two good outing examples in my blog today.

Possible outing rules:

1. Anything that hurts me directly should be outed.
2. Anything that does not hurt me should be ignored.
3. Anything that hurts us all should be crushed with extreme prejudice.


I work with Matt Cutts and other engineers in the Search Quality Team at Google. And yes, we noticed that lots of subdomains got indexed last week -- and sometimes listed in search results -- that shouldn't have been. Compounding the issue, our result count estimates in these contexts were MANY orders of magnitude off. For example, the one site that supposedly had 5.5 billion pages in the index actually had under 1/100,000th of that.
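To put the figures in that post in concrete terms, here is a quick back-of-the-envelope check. It only uses the two numbers quoted above (5.5 billion reported pages, and "under 1/100,000th of that"); the variable names are just for illustration.

```python
# Upper bound on the real page count implied by the figures in the post.
reported_estimate = 5_500_000_000  # "5.5 billion pages" shown in the result count
overcount_factor = 100_000         # actual count was "under 1/100,000th of that"

# Integer division keeps the arithmetic exact.
upper_bound = reported_estimate // overcount_factor

print(f"Actual indexed pages: fewer than {upper_bound:,}")
```

So the site had fewer than 55,000 pages actually indexed, which is an overestimate of at least five orders of magnitude.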

So how did this happen? We pushed some corrupted data with our index. Once we diagnosed the problem, we started rolling the data back and pushed something better... and we've been putting in place checks so that this kind of thing doesn't happen again.

Talking about stuff that you wouldn't want exposed if it was yours. Something that's already appeared elsewhere is fine (this is THREADwatch, after all), talking about something that's unmissable may be OK, talking about something that someone actually wants to work probably isn't....



Who thinks that this will now result in millions of legitimate subdomains being hurt by Google in an over-aggressive attempt to stop this 'bad data' push?

This should be relatively easy for G to fix. If a TLD suddenly shows up with thousands of one-page subdomain sites, then it shouldn't be indexed, right?
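As a rough illustration of the rule proposed in that post, the sketch below flags any domain that has at least some threshold of subdomains with exactly one known page each. Everything here is hypothetical: the `(domain, subdomain, url)` input format, the function name `flag_suspect_domains`, and the default threshold of 1000 are assumptions made for the example, and nothing in this thread describes how Google actually handles it. The sketch also reads "TLD" as "registered domain" and ignores the "suddenly" part, since it has no notion of time.

```python
from collections import defaultdict


def flag_suspect_domains(pages, min_subdomains=1000):
    """Return domains with at least `min_subdomains` one-page subdomains.

    `pages` is an iterable of (domain, subdomain, url) tuples; this input
    format is an assumption made for the sketch.
    """
    # Collect the distinct URLs seen under each (domain, subdomain) pair.
    urls_per_subdomain = defaultdict(set)
    for domain, subdomain, url in pages:
        urls_per_subdomain[(domain, subdomain)].add(url)

    # Count, per domain, how many subdomains have exactly one page.
    one_page_counts = defaultdict(int)
    for (domain, _subdomain), urls in urls_per_subdomain.items():
        if len(urls) == 1:
            one_page_counts[domain] += 1

    # The threshold is arbitrary; it is a tunable guess, not a known value.
    return {d for d, n in one_page_counts.items() if n >= min_subdomains}
```

A rule this simple would also flag any legitimate host that gives each user a single-page subdomain, so the threshold alone cannot separate spam from normal sites.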


Mark, I have legitimate domains, so far I see no negative result but I will let you know.



Mission accomplished. Party over.

This should be relatively easy for G to fix. If a TLD suddenly shows up with thousands of one-page subdomain sites, then it shouldn't be indexed, right?

Now we have an arbitrary definition of one-page subdomains == spam. Lovely.

Hey Matt Cutts, if you haven't gotten that million dollar bonus for setting up the SEO "propaganda" machine it must be coming now. Stifle innovation, brand SEO as evil, encourage self-imposed labeling such that new=spam, etc etc etc. Nice work.

Hey markdaoust, do you really believe you can swipe like that with integrity? Don't answer that -- instead spend your energies advancing your marketing skills, reading about billboards, one second radio spots, websites-as-advertisements, insertions, impression marketing, and examining the serps you live in. Chalk another one up for the big publishers who can dial into Google to protect their spam, and blow some taps for the independent webmaster.

We need a new T-shirt or new thread, perhaps titled "John Andrews: anti anti spam crusader" ;)


