Google Scraping & Spamming

13 comments

Wow, Google, what a great algo you've got, and what a really terrific user experience you're providing when you occupy 9 out of the top 10 results for a query with your own sites: [shredermanrules.com] (screen shot for posterity). Crappy results on a search term that made it into the top trends for June 9th and June 10th. Even better, Google is really providing unique content on all those pages too. Maybe I'll point out the recently updated webmaster guidelines to you:

Purely scraped content, even from high-quality sources, may not provide any added value to your users without additional useful services or content provided by your site. It's worthwhile to take the time to create original content that sets your site apart. This will keep your visitors coming back and will provide useful search results.

Comments

Well...

I don't know, but maybe if there were more results this would be notable... but as it is it seems rather ordinary. When I search I see 12 results with 11 of them supplemental, and even in your image all but the first one are supps.

If this is Google's version of a "great user experience"

I guess there are no more doubts left about how they're viewing their users in the first place...

Logic

I don't know, but maybe if there were more results this would be notable

Oh, so if there's nothing else, Google should backfill with their own scraped duplicate domain spam? I can remember many a site clinic where Matt admonished someone owning multiple domains with very similar content, saying that wasn't what Google wanted to see. Or do Google's rules apply to everyone but Google ...

You mean you didn't realise?

That

"..do Google's rules apply to everyone but Google ..."

could only be answered in the affirmative?

Why the whining?

I'd rather see those results than the actual site, which is just a big piece of crap with nothing but ads

Kudos to Google

Did you mean

I'd rather see those results than the actual site

Well, actually, the best result would be to suggest "did you mean Shreddermanrules.com" with two "D"s. However, when you look at the results [Shreddermanrules.com] you see the same scraping and spamming, just to a lesser extent.

There is sod-all search

There is sod-all search content on the web across any of the major search engines currently covering "shreddermanrules". Google: 40; Yahoo!: 10; MSN: 12. Could you have picked any smaller a vertical?

That Google should have a few trends subdomains open for indexing I shouldn't think is a bad thing - Google have a pretty significant amount of their internal content already blocked via robots.txt, so I figure the hottrends case is an oversight.

Even still, it would have been a better argument to encourage Google to open up more of its own proprietary content to indexing, so that third-party links and content published on Google's non-search applications can get their relative credit.

Really, screaming "scraping and spamming" on a keyword with so few results comes across as pedantic in the extreme. We've seen far, far worse behaviour from other sites in terms of content "spamming", so to start yelling about Google's "misbehaviours" on issues of little substance simply dilutes those more relevant criticisms we should be focusing on re: ISPs in general, not least Google, especially on privacy issues.

If those people who could perhaps be best suited to call out ISPs for their significant flaws waste time labouring over every tiny perceptible infraction, then the bigger message may be diluted and lost.

2c.

Is that even...

...a "vertical"? :)

SB

I imagine a day...

when the majority of first page placements on the major engines will be owned or directly controlled by traditional media or the engines themselves...the internet brick wall is solidifying at a faster rate these days.

Tip of the iceberg

I agree it's a very small thing in the scope of all the searches, but it's the tip of the iceberg used to make a point. Google engineers do site review clinics at conferences and scold people who have lots of domains with very similar, if not identical, content. However, in practice they do exactly what they tell everyone else not to do. They publish guidelines about how not providing original content creates a bad user experience, but again do exactly what they tell everyone else not to do. They keep preaching that it's all about the user experience and providing great results, yet for a search term that they describe as "volcanic" in volume they provide almost no value.

my 2 drachmas feel free to disagree

Looks like

Looks to me like not only do we have subdomain spam, we now have a new term: "TLD spam". Just buy every TLD possible with your trademarked name and put the same content on it. Voila!

We had previously added

We had previously added "/trends?" to our robots.txt to prevent crawling pages like this, but these URLs have a slash instead of a question mark. I'll check on making it /trends or adding a noindex tag. Thanks for mentioning it, Michael.
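
For anyone wondering why the old rule missed these pages: a Disallow line is basically a prefix match on the URL path, so "/trends?" only catches URLs that continue with a question mark. Here is a rough sketch of that idea in Python. The function name and the two example paths are made up for illustration (the real URLs aren't quoted above), and it ignores wildcards, Allow lines, and user-agent groups.

```python
def is_disallowed(path, disallow_rules):
    """Return True if `path` starts with any Disallow prefix.

    Simplified assumption: plain prefix matching only, no wildcards,
    no Allow lines, no per-user-agent groups.
    """
    return any(rule and path.startswith(rule) for rule in disallow_rules)


old_rules = ["/trends?"]  # the rule described in the comment above
new_rules = ["/trends"]   # the proposed broader rule

# Hypothetical example paths, not taken from the actual site.
query_url = "/trends?q=example"
slash_url = "/trends/hottrends"

print(is_disallowed(query_url, old_rules))  # True: blocked by the old rule
print(is_disallowed(slash_url, old_rules))  # False: slips past the old rule
print(is_disallowed(slash_url, new_rules))  # True: caught by the broader prefix
```

The broader "/trends" prefix would also match anything else that begins with those characters, so it trades precision for coverage.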

trends

no worries, but you know a searchable google hot trends would be a nice feature
