Is Google's Algo Dropping Bloggers?

Anybody who's ever had a site dropped can sympathize with his situation. As an SEO you know it's part of the game, but it can be pretty traumatic if you're not prepared. As more and more non-SEO'd websites get "tossed out with the bathwater" so to speak, what's Google going to do? Do they keep using more aggressive filters that become increasingly easy to trip? Do they ease things back a little and let a little spam slide through? How will the general public react to stories like this?

This is the second time that HackingNetflix has been dropped from Google, and I am going crazy trying to figure out why. I don't employ any search engine optimization tricks, and I don't think I've done anything that would upset the Google gods. I just write a bunch of stories about Netflix, Blockbuster, and the DVD-by-mail business and hope that people enjoy what I write.
It's extremely frustrating since the majority of regular traffic to almost any Web site comes from Google, and there is no way to dispute being dropped or even find out what happened. I've been writing about Netflix for about a year and a half now, and about 800 stories have virtually disappeared from the most popular search engine.

I think the problem is that some good blogs get blogrolled, which gives them too many similar sitewide links. That used to be offset by some of the casual comment links and the like, but it has got to be hard for Google to filter some sites for too much similar anchor text without nuking a bunch of blogs.

A few months back my site was filtered.

I think they keep tweaking the algorithms all the time and it's sorta like moon phases and the ebb and flow of the ocean tide...or something like that :)

How will the general public react to stories like this?

I do not think most people care in the least bit.


I've been asking similar questions ever since tw started gw, how will bloggers react. Aaron's right that the public don't know or care, but bloggers are numerous, and loud, and the public increasingly listen to them, and will thus be increasingly aware if this becomes widespread.

If the "google dropped me" meme starts in the blog world, all hell could break loose PR-wise for big G.


Well he's not been removed, but this makes for interesting reading...


Is this a CMS Issue?

Good catch Nick.

Check out the cache using IE. Shows the same page over and over. Doesn't do the same with Firefox... ???

Might be an issue with his CMS...


What the??

Nick, you're right, he hasn't been dropped, he's just been shuffled on some major keywords he was getting traffic with. That takes me back to the whole question of the post, and the simple answer is no: bloggers aren't being dropped, and one blogger doesn't account for the 60-million-odd other blogs out there. There is still a question, though, as to why he's had the drop across a number of keywords. I'm not an SEO so I'll leave that to others, but a good thought is that we do know that humans play a part in Google's placement of sites. Has a human hand played a role in this? Has something he's written or said pissed off one of the swarm of Uni students Google hires to filter and check its results?


Hidden text


color me dubious

on the unprepared part. He knows about Bourbon. He knows about PageRank. He knows about allintitle searching. Any one of those shows more sophistication than the average person.

Quick skim, I don't see anything wrong. Certainly don't get the impression he's done anything wrong. Certainly find it odd that when searching for "netflix," they come up on the third page of results when viewing 100 at a time (something like ranked 250) while http://blogshares.com/blogs.php?blog=http://www.hackingnetflix.com%2F beats them at around 150. Google's bad on that.

But dropped from Google? For what? For Netflix? He doesn't say where he was ranked or anything like that, that I've been able to see. He did say he was dropped for searches on that term, so if he was in the top ten, sure -- why'd he go if it's a good site? But he says he had traffic for a variety of Netflix-related terms, as well. Are those still coming? What's the overall traffic like? Or do we have to regress back into a world fixated on your rank for one particular term, always a bad way to measure your overall search engine health?

Hmm, what have we got elsewhere. Yahoo brings him up in the top results for Netflix. Well, sort of. This is 2nd, www.hackingnetflix.com/netflix/2004/12/netflix_launche.html, a pretty bad choice for the general term of Netflix. It's about a "Friends List" Feature being launched. Well, the home page is ranked 3. Wait, it looks like the home page, but it's an alternate URL for it, a mirror rather than a redirect. Probably accidental, but if Yahoo's picking that up as the home page, I can see Google just getting confused overall about which to go with.

MSN? Well, second page of results, suggesting again that this is an important site and that it's odd for it not to be in Google. Same thing with Ask.

So....I do agree, Graywolf, it is sad anyone has to go through these types of oddities. You're running what seems to be a good site, ranking well, then boom, out of nowhere you get dropped. Chances are, a technical bug. Wish Google had that dedicated support line for someone to use. In lieu of that, blogging to the world about your problems generally seems to be a good end run to get some GoogleGuy love to sort it out :)


I believe there is an RSS connection

I have seen increasing evidence over the past few months that Google may be downgrading sites with a lot of links from RSS-driven content. Blogs would naturally fall into that category.

The evidence is inconclusive as I've been too busy to study the issue very much myself. But I'm starting to see some consistent results where RSS feeds that are carried by certain sites supersede the original results.


DaveN talked about this recently though not in the context of feeds..


He makes a good point

The fact that certain RSS-feed driven sites are being crawled more frequently by Google would explain why Google gives them priority. That behavior seems to validate the belief that they have implemented their FreshRank technology to a certain degree -- at least with respect toward comparing two documents and determining which one is more recent. Of course, it doesn't indicate which site Google feels is fresher (does the most recent copy win, or the first copy crawled? -- in this case, maybe the first copy crawled).

I can see why they might trust their crawl dates better than the document dates. Anyone can touch or otherwise update the dates on their files, but only Google can set the crawl date. A site that gets crawled more frequently will therefore have an advantage over any source site.

This would only reinforce the "rich keep getting richer" effect with Google, as it now would be emphasizing information hubs over sources.

I think that explains why the screen scraper spam has been so successful lately. Google is favoring sites that take content from other sites in summary form.


Monoculture

Although it is slowly getting better, this is what happens when you have a search engine monoculture.


the "google dropped me" meme?

will always exist, and will never exist in a large way. just as a few blogs get dropped, others are sent a ton of traffic. such is the way of everflux.

i don't think this is a case of "blogs in general" being dropped... maybe 3% of blogs carrying similar link network characteristics get dropped with an algorithmic shuffle. And meanwhile another 3% got a spike in traffic. life goes on...


When did it become news that an affiliate site got dropped from google?

Quote:
If anyone should get the meaning behind "Hacking Netflix," I would have thought it would be Google.

And I believe they do.


...they do.

LOL jc. ;-)

As for the problem of sorting out near dup or dup content, our working theory, just for illustrative purposes:

Winner's criteria: 20% found first; 30% freshness; 40% domain backlinks. If true or even close to true, this is a poor way to sort out dup content IMO.

Also, if we're seeing it correctly, this is happening with more than just articles and the like. It extends in some cases to high level page content, and possibly even certain META data.
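
To make the working theory above concrete, here is a minimal sketch of how such a weighted "winner" score would play out. The three weights come straight from the post (they add up to 90%, and the thread never says what the other 10% is). Everything else is an assumption for illustration only: the `Copy` class, the 0-to-1 scales, the example URLs and the numbers. Nothing here is Google's actual method.

```python
from dataclasses import dataclass

# Weights as stated in the thread's working theory. They sum to 0.9;
# the remaining 10% is not specified anywhere in the discussion.
WEIGHTS = {"found_first": 0.20, "freshness": 0.30, "domain_backlinks": 0.40}


@dataclass
class Copy:
    """One copy of a piece of duplicated content (hypothetical model)."""
    url: str
    found_first: float       # 1.0 if this copy was crawled first, else 0.0
    freshness: float         # 0.0 (stale) to 1.0 (just crawled), assumed scale
    domain_backlinks: float  # 0.0 to 1.0, assumed normalized link strength


def score(copy: Copy) -> float:
    """Weighted sum of the three signals named in the thread."""
    return sum(weight * getattr(copy, name) for name, weight in WEIGHTS.items())


def pick_winner(copies):
    """Return the copy with the highest weighted score."""
    return max(copies, key=score)


# Made-up numbers: the source blog was found first, but the aggregator is
# crawled more often and sits on a domain with stronger backlinks.
original = Copy("http://blog.example/post", 1.0, 0.2, 0.3)
aggregator = Copy("http://hub.example/feed-item", 0.0, 0.9, 0.9)

winner = pick_winner([original, aggregator])
print(winner.url, round(score(winner), 2))
```

With these made-up inputs the aggregator beats the original even though the original was found first, which is the "hubs over sources" complaint raised earlier in the thread.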


Howdy & Thanks,

I'm the owner of the Web site in question, HackingNetflix.com. It's hardly a news story -- I just posted a story about my frustration in hopes that one of my readers would have an idea about what is happening. I never expected to find it here.

I started the site about 18 months ago, and there were no ads on the site until last summer. I added them later when I figured out that the time I'm spending on the site was more than a part-time job. It makes a modest amount of money, but not enough to quit my day job.

HackingNetflix is more of a community site, and I post almost everything my readers ask or request as a story. It's more like a moderated community blog, but I also write a lot of original content.

I have refused to do any SEO at all since the site is more of an experiment than anything else. The reason I know the terminology is that my day job employs an SEO firm and is in a high-cost keyword market (stock photography). I got bumped during the Bourbon update for about 3 weeks, and then I came back. I was able to see HN if I used the Google "allintitle: netflix" search. That no longer works.

Thanks for discussing my situation. I helped one of my readers start a similar blog, http://netflixfan.blogspot.com, that is ranked #4 for the term "Netflix" (I used to be #2 for most of the past year). We're friendly and share blogging tips. Why she is still ranked and I'm gone is a mystery (I refuse to believe that Google is penalizing me because I'm using TypePad and she's using Blogger).

I'm not doing this to be #2 in Google for Netflix. I really enjoy writing about the DVD industry. The problem is that I feel like I'm doing something to cause my content to be excluded from searches, and I can't figure out what. I have broken many exclusives, including Blockbuster Online, Netflix RSS feeds, Netflix Friends, and many more. I have always tried to mix the content like the above post suggested.

This week I broke the story about Netflix Player and was linked to by Slashdot, Engadget, Gizmodo, Red Herring, CNet, Lockergnome, and many more. I received 60,000 pageviews in one day, and yet you have to dig deep to find stories that were highly ranked in Google.

The frustrating thing is that I have no idea what I'm doing wrong. Another problem is that TypePad screwed up my site for a few days with some of the older pages when they did a major update. Since it was really old content, I didn't notice for a few days (4th of July weekend). The timing is right about when I got dropped. Could this have done it?

I'm a writer and want people to be able to read what I write. It's a shame that Google has to drop sites because they think they are SPAM or affiliate-only sites, because innocent sites become victims.

I'm not interested in learning and implementing SEO tricks. The content should be ranked by popularity and relevance, but it's not.

Thanks for any suggestions,

- Mike
mikek at hackingnetflix.com


Don't overlook Google's rel=nofollow tag

The different blogging systems, last time I checked, had settled on various approaches to handling REL=NOFOLLOW. You and your friend may be getting different treatment from Google based on who is using REL=NOFOLLOW.

I did not find any uses of the tag in the RSS feed-driven sites I monitor, but that doesn't mean the blogging indexes (as well as some blog systems) aren't using it.

Individual users may have elected to use it on some blogs, where they have that privilege, and others may not. I don't know if the blog systems are limiting the tag to their comment sections or if there is no real consistency in application.

I am still waiting to see what the fallout of REL=NOFOLLOW is, but in the long run I feel it's yet another stupid idea from a company that has made billions off of stupid ideas.


About the 4th of July mixup: YES, it could be the cause

One of my own domains has been taken offline and had other issues several times this year. The search engines have cached the wrong pages for specific URLs, and that has affected some of my rankings.

So your guess as to the immediate cause may be the most correct.


The 302 redirect from hackingblockbuster may not help

Searching Google for a string of text taken from HackingNetflix.com brings up an RSS display of the page, from another domain.

I see a number of other results, from other pages that are aggregating and displaying posts from the site when I click on the link under that result, in the text that reads:

In order to show you the most relevant results, we have omitted some entries very similar to the 1 already displayed.
If you like, you can repeat the search with the omitted results included.

Amongst those is another domain that you own (hackingblockbuster.com), which is using a temporary redirect (302) instead of a permanent one to send people to the hackingnetflix.com site. Google is caching the same information for the hackingblockbuster.com site, as for the hackingnetflix.com site.

Given a number of choices to display your content, it appears that Google is showing the RSS feed instead of your blog, and filtering out the rest with a duplicate content filter.

While you still may have a problem with the RSS feed (I had the same thing happen with a bloglines feed displaying snippets of text from a blog of mine tripping a duplicate content filter and showing the bloglines version instead of the blog), you might be able to eliminate one possible cause for your duplicate filter problem by changing the 302 redirect on the hackingblockbuster.com site to a 301 redirect.

See: The Rundown on 301 and 302 Redirects
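
For anyone who wants to check which kind of redirect a domain is sending, here is a rough sketch using only the Python standard library. It sends a single HEAD request and does not follow the redirect, so you see the raw status code. The function names `head_status` and `describe_redirect` are made up for this example; they are not from any tool mentioned in the thread.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=10):
    """Send one HEAD request without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe_redirect(status):
    """Label the two redirect codes discussed in the thread."""
    if status == 301:
        return "permanent"   # 301 Moved Permanently
    if status == 302:
        return "temporary"   # 302 Found
    return "not a 301/302"
```

Usage would look something like `status, location = head_status("http://hackingblockbuster.com/")` followed by `describe_redirect(status)`. If that comes back "temporary", that is the redirect the post above suggests switching to a 301.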


Looks to me like hackingnetflix.com currently has some htaccess issues.

I get the redirection limit and page won't load error.

googlebot hates that kind of stuff.


302

Great job, Bill!


The Google Cache

It's the same thing for all the bloody pages, like they are cloaking the same page.

So when you do a site:search of hackingnetflix.com with the omitted results you get a boatload of the same content on different pages. For one page you have this and for another page you have the exact same thing.

It looks like cloaking, but as I suggested above, it's not a Google problem. It may be a CMS problem, or something else that is causing the same bloody page to be shown to Google.
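
A quick way to confirm the "same page on every URL" symptom is to hash what each URL returns and group the matches. This is only a sketch: `find_duplicates` is a hypothetical helper, and the `pages` input (URL mapped to page text) is assumed to come from whatever you used to fetch the pages or the cached copies.

```python
import hashlib
from collections import defaultdict


def find_duplicates(pages):
    """Group URLs whose text is identical after normalizing whitespace and case.

    `pages` maps URL -> page text (hypothetical input for this sketch).
    Returns a list of URL groups, each with two or more members.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Made-up sample: two URLs serve the same text, one is different.
sample = {
    "/story-one": "Netflix  launches\nFriends list",
    "/story-two": "netflix launches friends list",
    "/about": "About this site",
}
print(find_duplicates(sample))
```

Exact hashing only catches identical text. Near-duplicates (same story with a different sidebar, say) would need a fuzzier comparison, which this sketch does not attempt.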


Cloaking

Page has a 302 on for bots


Redirection limit exceeded

I got that redirection limit thingy too, on that article linked above here. Firefox said it was due to blocked cookies. Doesn't sound right. I tested it in Sam Spade. It just keeps redirecting (302) to the same page. Looping until something breaks. I'm sure that's enough for Googlebot to get tired...
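
The looping behavior described above can be sketched as a small redirect tracer. The `fetch` argument is an assumption of this example: any function that takes a URL and returns `(status, location)` will do, so the logic can be tried without touching the network. `trace_redirects` and the `example.test` URLs are made up for illustration.

```python
REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, fetch, max_hops=10):
    """Follow redirects by hand and report how the chain ends.

    `fetch(url)` must return (status, location). Returns (chain, verdict)
    where verdict is "ok", "loop" or "too many redirects".
    """
    chain = []
    while len(chain) < max_hops:
        if url in chain:
            # We have been sent back to a URL we already visited.
            return chain + [url], "loop"
        chain.append(url)
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return chain, "ok"
        url = location
    return chain, "too many redirects"


def self_redirecting(url):
    """Fake fetcher: every page 302s to itself, like the symptom above."""
    return 302, url


print(trace_redirects("http://example.test/article", self_redirecting))
```

A browser gives up with a "redirection limit" error in this situation; a crawler presumably gives up too, though how Googlebot handles it exactly is not something the thread establishes.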


GoDaddy Screwed Up

I took the advice from one of my readers and renewed the domain with GoDaddy for 3 years yesterday. Late last night they updated the registration and killed my CNAME record. I called GoDaddy about it and they said the update wouldn't change anything, so I figured TypePad had a problem (they did a huge update over the weekend).

TypePad showed me it was GoDaddy and I went in and fixed it in a few minutes. Doggonit. I'm writing GoDaddy an e-mail about that one. I only renewed and didn't touch any of the site configuration settings.

I'm going to turn off the HackingBlockbuster redirect. I registered it as a joke (many people asked why there isn't a HackingBlockbuster).

Thanks for the feedback and suggestions. Sounds like part of the problem is that I'm shooting myself in the foot here.

- Mike