Directories Taking a Whack on Google - What's going on?

73 comments
Thread Title:
WebAtlas booted from Google
Thread Description:

The conversation thread linked above, started by some gloating knob, could be very worrying for directory owners. The knob in question is gleefully informing the members at seozip that webatlas.org has been booted from Google.

Speaking of knobs, this thread at IHU has Dastardly Doug crowing about "spammy directories" - there's not much to be gained from the thread itself (surprise..) but it's indicative of many threads floating around out there over the last couple of months about Directory sites having trouble with Google.

So, have a look at Threadwatch member nandini's site, Web Atlas, and tell us what you think. This would seem to be on a level with BlueFind's recent troubles.

Is Google actively seeking out directories or are there some inherent flaws in the scripts that generate and maintain such sites that are tripping some kind of site hazard at G?

Comments

Fix your site first :)

I don't have time to look into the specific case, but in 99% of the cases like this that I work on for clients it's a technical glitch or bad architecture (duplicate content, spider traps etc) on the site itself - not Google.

In some cases (and this could potentially be one of them) changes in Google's algorithms hit certain kinds of sites more than others. I would not be surprised if Google has made, or will make, updates that filter out some of the worst directories that really bring no value to the search results. I don't think they will remove them completely but they probably won't rank either.

Recently it was posted at V7

Recently it was posted at V7 that Webatlas was banned by Google, and a search for the domain in Google does suggest that it was removed from the index. But, after talking with the owner and creator of the site, it seems that she wanted to redirect http://webatlas.org to http://www.webatlas.org and she used a 302 redirect, which we all know from the Business.com instance that Google treats as spam. In other words, it's a problem/glitch with Google, not Webatlas. Now, I'd like to see how quickly it will return to the index. She emailed them and hopefully they will get back to her soon with reinclusion.
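The usual way to point the bare domain at the www host is a permanent (301) redirect rather than a temporary (302) one. A minimal Python sketch of that decision follows; the helper name `canonical_redirect` and the way the host and path are passed in are assumptions for illustration only, not anything Webatlas actually runs:

```python
from typing import Optional, Tuple

CANONICAL_HOST = "www.webatlas.org"  # the host everything was meant to live on


def canonical_redirect(host: str, path: str) -> Optional[Tuple[int, str]]:
    """Return (status, location) if the request should be redirected, else None.

    Hypothetical helper: 301 tells the client the move is permanent,
    whereas 302 only says the resource is temporarily somewhere else.
    """
    if host.lower() == CANONICAL_HOST:
        return None  # already on the canonical host, serve the page normally
    if not path.startswith("/"):
        path = "/" + path
    return 301, f"http://{CANONICAL_HOST}{path}"
```

A request for the bare domain comes back as a 301 pointing at the same path on the www host; a request that is already on the www host is left alone.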

Once again, directories will be banned or penalized for the exact same reasons that a normal site will be penalized or banned, and not just because they are a directory. Google & Y! know crap directories from quality ones, I assure you. :)

Bridges, trolls and Doug

>>gloating knob

Ever notice how the knives come out when you are down?

>>Doug crowing

Typical.

>> Web Atlas

If it is some sort of filter/penalty it is a shame; WebAtlas struck me as one of those directories that was trying to give some real value to human users.

>>Is Google actively seeking out directories

feh! Am I the only one getting really sick of Google?

All we have is a bunch of conjecture; nobody really knows what is going on or how far Google's alleged gulag for directories might extend. Does it apply to niche directories? What if a directory is using redirect links that do not transmit PR - hard to accuse such a directory of selling PR - are they affected? Is it just Gburp?

.

I'll wait two weeks before drawing any conclusion. I'm guessing Webatlas will be back in by then. If not, it's time to rewrite my link building knowledge base :)

Directories Rock - Best links to have.

These same problems happen with regular ole websites too....it's not just directories that have glitches like this.

Those who dismiss getting on directory pages, I feel, are missing out on one of the best places to get links from. What's not to like about directories...you get your link on a page with relevant title tags on the links pages, and the other people listed there are relevant to you, putting you in the prime linking neighborhood.

Since most sites don't want to link to their direct competitors, the next best thing is to have your links on pages that also link to your competitors....

Seems like no matter what methods of marketing come up...there are always those who want to knock them down...I think 2005 is going to be the year of tons of niche directories....some will utilize it, and some will bitch and predict downfalls.

I'm surprised at the level of "Directories are bad" views out there....everyone wants to get into DMOZ, but they also hate DMOZ because of the corruption and because it's so hard to get in...yet some people have built directories that are run better and smoother than DMOZ, and are marketing them heavily to boot....and some are scared of them??

Really good points jim..

Really good points jim..

Knocking

Yes, thanks Jim.

Some people will knock anything. It seems to be their aim in life - whatever, these folks rarely manage to do much more than bluster and blow like loons.

Human Aspect
I've been speaking with nandini a bit. It's easy to forget that behind a website is someone's hard work, time and love - she sounds quite upset over this whole business and I'd like to take the opportunity to ask anyone that has the ear of a G rep to give them a buzz - Webatlas is one of the better new directories out there and deserves better treatment...

This is news?

G has been killing directories based over here in the UK since December 03.

99% of them are spam - and the owners are very good at appearing again with another name and another design a couple of days later.

WebAtlas looks pretty clean to me, just looks like it was taken out by an algo tweak designed to take out the spam directories, as unfortunately its design is very similar to many of the spam directories.

Quote:
I'm surprised at the level of "Directories are bad" views out there....everyone wants to get into DMOZ, but they also hate DMOZ because of the corruption and because it's so hard to get in...yet some people have built directories that are run better and smoother than DMOZ, and are marketing them heavily to boot....and some are scared of them??

What?? - have you been seeing the same directories as me? I had compiled a list of about 50 directories which were all basically the same model: highly marketed, keyword-targeted pages, financed by PPC click-throughs. In some searches in Google it had got to the state that 18 of the top 20 results were crap directories. The SEO was great but it was just too, too easy to clone and so it became an issue that Google had to deal with.

Here are some WMW threads on the subject

http://www.webmasterworld.com/forum7/1065.htm
http://www.webmasterworld.com/forum7/1101.htm
http://www.webmasterworld.com/forum7/1086.htm

>What?? - have you been seein

>What?? - have you been seeing the same directories as me? I had compiled a list of about 50 directories which were all basically the same model: highly marketed, keyword-targeted pages, financed by PPC click-throughs.

Yup. Though I happen to run some very old directories in a niche market, I'm siding with Kali on this one.

Spam getting removed from goo

Spam getting removed from Google is not directory specific..once again it's any type of site. But to compare spam to Webatlas is a joke. Searchfeed-infested DMOZ clones are spam directories..not quality resources.

Perception

All a matter of perception I think, Shawn - algorithmic perception...

If what in real life is a great resource is spam to Google, then for all intents and purposes, it's spam. This sucks, clearly.

>algorithmic perception Co

>algorithmic perception

Collateral damage always sucks. I used to get some of my best program and utility tips from the old threaded boards that were swept away in the nuke-guestbook era.

I agree that there are a ton

I agree that there are a ton of directories out there that only take paid listings, will list anything, and do not add any quality resources to make them useful, but I do not think it's fair to put WebAtlas in that category. Sure, algorithms are not people, but if an algorithm targeted at directory-styled sites removed WebAtlas and left lots of the other directories that I have still seen in the index, then it was a bad tweak.

I am friends with Nandini so I might be partial, but from what else I see that is in Google's index I would guess this to be a temporary glitch of some sort...I have been away, but hopefully Shawn was right about the 302.

Most directories are spam

Though I think directory listings are great to have, most directories are junk. Out of the thousands we've looked at, we've found fewer than 100 that are "quality" in our eyes.

Webatlas being one of the quality directories in our eyes.

Most DMOZ rip-offs are nicely put in the "Supplemental Results" in Google, and dup content filters take out most of the rest.

Those who can pass an "algorithmic perception" will survive, and those who can't "stand out and be different" will not survive.

Nandini & WebAtlas

webatlas.org is a high-quality directory, Nandini is cool, and I trust the site will be reindexed by Google soon. I wish her and the project much success.

That said, I see five weaknesses in many of the directories I analyze:

1.) They're not chunking down to a small enough niche. Now, I like uncoverthenet.com and appreciate what they're trying to accomplish. However, has anyone done the math on how long it will take to *populate* these pages with high-quality site listings?

site:www.uncoverthenet.com

Kind of on par with building a pyramid by hand...or something.

2.) Related to #1, many directories are cross-referencing their entire taxonomy of categories with (what seems like) every geographical region on the planet. LOTS of pages, but most are full of nothing useful to visitors, and adding no value to the web. If AdSense comprises most of the content, hmmm...dangerous, in my view.

3.) They're selling run of site ad listings, allowing spammy Titles and Descriptions, and risking linking out to low-quality sites.

4.) They're eroding their brand, and the value to webmasters who are considering adding listings, by placing too many text and graphic ads that compete with the free or paid listings. Hey, you can't build a long-term directory project and have it only be about transferring PR. It has to provide value to humans in addition to spreading the PR wealth.

5.) They're publicly bragging on their forums about all the money they're spending on buying PR and link pop. They'd be well-advised to keep their business model and strategies a little closer to the vest, in my view.

Here are my suggestions:

1.) Don't get greedy. Niche WAY down.

2.) Don't get greedy. Let go of geo cross-referencing unless you have a very clear taxonomic reason for adding that dimension.

3.) Don't get greedy. Sell listings, not ads.

4.) Don't get greedy. If you use AdSense, do so carefully and with restraint. Put site listings first.

5.) Don't brag. And be aware that creating a high-quality directory that will become a long-term business is a LOT of work.

Here's how I'm practicing what I preach:

1.) In my latest directory project, I've chunked down to 514 carefully organized categories, and never more than three levels deep. I hope humans find this directory easy to navigate and can locate the web site info they're trying to find.

2.) I could have added, let's say...1,000 geo categories and ended up with a 514,000 page site. Tempting? Yes, a little. But I think of a directory as a retail store. I don't want to open the doors until the shelves are full. So, my editors have just added the 1000th human-generated high-quality listing, giving me an average of two listings per category. I'll launch soon.

3.) I'm not going to sell any run of site text ads. I see them as too risky, long-term.

4.) Since all listings will be free in this directory, I'm going to add some AdSense to monetize the site, but only after listings and in low volume.

5.) I'm not telling anyone but trusted collaborators about my promotion plans. And no, I'm not bragging about being confidential! ;-)

Will this directory be an amazing get rich quick venture for me? Nope. Will it be a solid, LONG-TERM venture that provides value to both visitors and webmasters, and earns reasonable income? I hope so. That's the plan, anyway.

Aloha,
Kirk out.

Condolences for whoever took

Condolences for whoever took a hit, but I don't tend to dwell on specifics.

I was just looking for a post, but couldn't find it. It was titled "paranoid about fingerprints" or somesuch. (Found it! http://www.threadwatch.org/node/874)

I spend hours and hours ripping and rewriting any stock code I might buy in order to make it unique. I even go so far as to rename css class and id, and certainly any generic graphics. Directories are usually leaving flaming roadmaps instead of just fingerprints and footprints.

longterm I think Kirk has a s

Long term, I think Kirk has a solid strategy...niche and stay focused. With that being said, there will still be some useful quality general directories on the market. I also agree that keeping the noise out is important for brand development. I do not think WebAtlas has any sitewide sponsorships or AdSense-type revenue generators at this point.

I believe Nandini wrote her script from scratch and she is not selling her directory script at this point so I do not think she has any footprint issues.

Yeah - I liked that thread too RC

Anything with a fingerprint should be avoided where possible - and I know it does take hours of work.

Bingo!

RC just hit the nail on the head. With the recent proliferation of sub-standard directories running OTC scripts, it makes nothing but sense to write your own scripts or at least make them unique. Anyone with long term plans using a common script would do well to deal with the footprint problem immediately.

>>or at least make them uniqu

>>or at least make them unique.

Agreed.

You guys didn't look very har

You guys didn't look very hard. This one doesn't have footprints ...more like "snow angels." Hell, it has 'Directory' plastered all over it. Even if it didn't use about every poison word (which have been out there since '99), check out the class codes. More than that, there are offers to "Get 5 links for the price of one!" Great big yellow flags dropping everywhere.

umm...

I don't see any red flags... not from an algorithmic point of view, nor from a human point of view. She wrote the whole thing from scratch. This has to be a glitch.

Absolutely, and from a 1999 a

Absolutely, and from a 1999 algo point of view. This 2000 newsletter was a wrap-up of directory poison words we were avoiding well before then.

http://www.searchengineworld.com/newsletter/2000/poison_words.htm

>from scratch

Who cares? It reeks of paid directory.

Empty Categories

Personally I think the empty categories are the biggest problem all of these new directories are having and will have. You can't just throw up a directory and generate hundreds of thousands of pages of empty content. I don't blame Google for wanting to rid its index of that stuff. It serves no purpose and it takes up valuable resources.

All of those directories mentioned so far have way too many indexed pages that are empty, way too many. And, when you start bragging about your directories at the public level, expect the worst to happen.

The whole group that those directories revolve around has poisoned that network itself with its lack of tact when it comes to promoting them at the public level.

Damn, I was saving that for l

Damn, I was saving that for later, P1! hhh!

Yeah, I always use NC travel. It's blank.

lol! I see all this talk revo

lol! I see all this talk revolving around a glitch. The freaking glitch is generating thousands of pages that are empty and contain nothing but adverts. These fall into the "Made for AdSense" or "Made for Our Advertisers" criteria.

It's unfortunate that Google will index anything. But, once it gets indexed, they then have the painstaking task of cleaning up after the fact. When you start bragging at high level public communities about having over 500,000 pages indexed and PR8 or PR7 or PR whatever, you've just painted a big red target on your back and you can believe that someone is going to be aiming for it.

If what I'm reading at various communities online is true, those using this particular script can kiss their directories good-bye. It has happened before, it is happening now, and, it will happen in the future. I'm sure the script developers had good intentions along with some who have purchased it. But, when the abuse starts, the search engines will then take a stance. And, these are not algo issues, these are manual bans that are happening.

Jeez guys, you're reading too

Jeez guys, you're reading too much into this. Look at biz-directory and his huge network of spammy AdSense directories with the same cat structure. All those are still indexed. Sure, not 500k pages, but that's just 'cause there isn't much link pop into them.

Regardless, it seems as though everyone here is stating that a directory should not be allowed to be indexed by search engines until it has sites in every single possible category? Give me a break. Look at Microsoft SBD directory. Half of the cats in that say:

"There are no sites listed in this category yet. Be the first and submit your site!"

So again, I have to disagree with you. Websites, whether directories or not, have to be given a time to age and grow.

rcjordan, are you really quoting something from almost 5 years ago? I have no faith in the theory that you pointed out...not anymore at least. There are now many better ways of evaluating website content than by penalizing sites for using a certain word.

I guess www.something.com should be banned 'cause it has an empty site and takes up the search results, right? And so should Microsoft's directory?

Now please don't mistake my opinions here for standing up for AdSense sites or spam. But a directory (a quality one) with empty categories does not constitute spam; it just reflects its age.

"Now please dont confuse my o

"Now please don't mistake my opinions here for standing up for AdSense sites or spam. But a directory (a quality one) with empty categories does not constitute spam; it just reflects its age."

No, it just reflects poor planning on the part of the directory administrator. If you were a search quality engineer, would you want 500,000 pages of empty categories in your index? I don't think you would.

No one ever said anything about spam, or at least I don't think so. What we are talking about though is a lack of understanding on the directory developer's part. I repeat, you cannot flood the index with empty pages and not expect something negative to happen. And, when you start bragging at various communities online that you have this many pages indexed, who do you think is going to be looking at that?

Also, for those of you blending your AdSense in with your directory results, be prepared to take a hit there too. I have no problem running AdSense here and there, but, when you take it to the level I've seen, you've crossed the line of deception. I've seen some implementations where you could not tell the difference between the AdSense and the natural listings. You can bet those will be some of the first to be manually removed from the index.

We all have our faiths, Shawn

We all have our faiths, Shawn.

Agreed. I respect yours. :)

Agreed. I respect yours. :)

>I repeat, you cannot flood t

>I repeat, you cannot flood the index with empty pages and not expect
>something negative to happen.

Yeah, they get dropped out of the index, like any other type of page on any other type of site.

Pages

Would it not be more sensible to take a forum-like approach?

Where you start out with a few very general categories and as submissions increase, you sub-divide them...?

Empty pages just seems such an obvious pitfall, why are people not thinking about this from the very start?

Truthfully Nick, I think that

Truthfully Nick, I think that would be one of the best scenarios. In retrospect, I kinda wonder if I shouldn't have done such a thing.

"Where you start out with a f

"Where you start out with a few very general categories and as submissions increase, you sub-divide them...?"

Absolutely! That would be the only approach. Why take a chance at jeopardizing the integrity of the site? Empty pages do that. They are also worthless to the search engines and the visitor.

Start out small, garner some PageRank (for all of you PR hounds) and then start sub-dividing them. It is much easier to harness PageRank with fewer pages than it is to try and spread it out across thousands or hundreds of thousands of pages.
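The "start general, sub-divide later" approach can be sketched in a few lines of Python. Everything here is an assumption for illustration: the threshold value, the function name `split_if_crowded`, and the idea that each listing carries a subtopic label are not from any directory script mentioned in this thread.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SPLIT_THRESHOLD = 20  # assumed cut-off; pick whatever suits the directory


def split_if_crowded(
    category: str,
    listings: List[Tuple[str, str]],
    threshold: int = SPLIT_THRESHOLD,
) -> Dict[str, List[str]]:
    """Split a category into sub-categories only once it holds enough listings.

    `listings` is a list of (site_url, subtopic) pairs. Below the threshold
    everything stays on the one general page, so no empty pages are created.
    """
    if len(listings) <= threshold:
        return {category: [url for url, _ in listings]}

    pages: Dict[str, List[str]] = defaultdict(list)
    for url, subtopic in listings:
        pages[f"{category}/{subtopic}"].append(url)
    return dict(pages)
```

Because sub-category pages are only created from listings that already exist, every page this produces has at least one site on it.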

I tend to think that search e

I tend to think that search engines aren't good weeders --when a category becomes problematic they'll nuke first and ask questions later, just as I would do in the same situation.

Ok, great advice for future b

Ok, great advice for future breeds of directories, but what about ones that have already got a ton of pages indexed with tons of categories? Just delete the empty ones for now?

Personally I think it is too

Personally I think it is too late for those who have already polluted the index with empty cats. The last thing any directory owner should do is generate an entire ontology for the directory and let it sit there in hopes of populating the cats. If you have lower level cats that will take some time to populate, take them offline. Wait until you get a submission to populate the cat or go out and find one to put there. Kind of hard to do when you have 14 million pages available according to one of the directories in this group. Yes, 14 million pages open for indexing. 90-95% of them empty. Yikes!

I ran a well-tuned spider 4-5

I ran a well-tuned spider 4-5 days solid, then filled the categories after I gave them a quick review. If a category couldn't be filled, I nuked it or combined it with another.

>Kind of hard to do when you

>Kind of hard to do when you have 14 million pages available according to one of the directories in this group.

I assume you mean Uncover the Net. I can't help but notice the sarcasm in your post, combined with the lack of useful advice. My question was legit and as an answer I receive sarcasm.

So back to the topic. Basically, those directories should either populate the cats themselves or just delete them?

Will deleting them create a problem with the page not being there anymore when Google spiders it again?

No, there was no sarcasm ther

No, there was no sarcasm there, just some solid advice. If you have 14 million pages available for indexing and Google decides to chomp on some more, you can expect to be in the next group who will be screaming at the forums in the next few weeks. This is webmaster 101. You cannot expect the search engines to sit back and let you pollute the index with empty categories; it won't happen. If it does, it will be short lived. As I stated above, Google will index anything first time around. It's what happens after the fact that counts.

>It's what happens after the

>It's what happens after the fact that counts.

Meaning take care of it while you still can?

Yes. Although, as I mentioned

Yes. Although, as I mentioned above it may be too late for UTN. You have 809,000 pages indexed and most are empty cats. I can only assume that some search quality engineer is working on reducing that number as we sit here chatting about all this. When you start to see mass numbers of Supplemental Results, you will know that the process has begun.

No it won't make any difference

Other than your log file filling up with 404s, no, it won't make any difference. But it's the best way forward if you want to be around this time next year, and according to your site you have paid to be listed on the W3C site for three years.

just my opinion

Help

I had quite a lot to say about IHU earlier today on another forum. Although I think P1 is right, let's please remember that Threadwatch is a nice place - let's help the folks in this thread that need to do some damage repair or move on.

I had a go at them for gloating at other people's misfortunes hehe - I'd say it's not too late to do some quick disaster limitation, wouldn't you?

If it were me I would cut the empty cats NOW and worry about tidying the mess later... this looks like the start of something big to me...

btw, I don't think P1 was being mean here, but as you're so personally involved it can be easy to read something as 'insensitive' or to take info the wrong way - let's just be aware that it's a sensitive subject for some members, thanks...

thanks guys.

thanks guys.

Another bit of advice in rega

Another bit of advice in regards to Google indexing pages. I personally believe that the G Team will put caps on sites that have the ability to produce millions of pages. In that scenario, I would surely want my quality pages to be the ones that are indexed and not those outside of the cap. Let's say your cap is 1 million pages and you have 14 million available. Which of those 14 million do you want indexed? Take control of that spider! ;)

It makes alot of sense. Than

It makes a lot of sense. Thanks for the help and advice. I just got rid of the locations, and now it's at 11k pages instead of 23 million. :)

Dynamic

Many dynamic sites have these problems - often cms/blog/dir packages just don't take the things we as search marketers think are important into account.

I have a couple of issues here that are on the agenda for fixing this week. Check out how many pages G thinks we have at TW... trackback pages have to be dealt with and I have to work out why my robots.txt is not working for /emailpage, yikes!

Glad I'm only in the thousands though...

Nick, your robots.txt file is

Nick, your robots.txt file is probably fine. What you need to do is allow the bot access to the page and then drop a Robots META Tag on the page with a robots-term of noindex.

Hmmm, the blog software wouldn't let me post the actual code.
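For readers who want to see it, the tag being described is presumably the standard robots META element, which goes in the page's head. A small Python sketch that builds it (the helper name `robots_meta` is made up here for illustration):

```python
def robots_meta(index: bool = True, follow: bool = True) -> str:
    """Build a robots META tag; it belongs inside <head>, not <body>."""
    terms = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    return '<meta name="robots" content="{}">'.format(",".join(terms))
```

Calling `robots_meta(index=False)` returns `<meta name="robots" content="noindex,follow">`. The spider has to be allowed to fetch the page for this to work; if robots.txt blocks the URL, the tag is never read.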

>Hmmm, the blog software woul

>Hmmm, the blog software wouldn't let me post the actual code.

If I'm not mistaken you can use [ code ] code here [ /code ], but close the spaces.

UTN and Locations

Shawn Walters ~

Gutsy move on deleting the locations at UTN. Shows me you're committed to long-term success with your directory (in which I've purchased some links, so you have my vote of confidence).

I'll recommend another strategy to you. You already have a great editorial team. In addition, consider hiring an offshore firm to populate your 11K categories pronto. Humans will do better than scraping a major SE or running your own crawler.

You'll need to write up VERY specific and detailed "internal use only" editorial guidelines if you don't already have them. Then place a virtual assistant (administrative category) bid on a freelance site like Elance.com. (FYI, most of your low bids will come from India.) You may have to weed through a few providers before you find some good ones. Also consider hiring multiple firms and dividing up your categories among them (exclusive, not overlapping).

If it helps, I currently pay $0.15 per completed listing, which includes initial research, editing the Title and Description and Keywords, and entering the listing. You may even be able to do better cost-wise. You're probably aware that many Indian firms--and firms in other offshore countries as well--run three 8-hour teams 24/7 and have large staffs. They're *always* looking for more work, and the relatively low wages you'll pay them translate into a good living in their country (personally, I'm not willing to be involved in "sweat shop" win-lose scenarios). So, for about $3K (or hopefully, even less) you should be able to add a couple of listings into each of your 11K categories...and fast.
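The "about $3K" figure can be checked with quick arithmetic, reading "a couple of listings" as two per category (that reading is an assumption). A short Python sketch, working in whole cents to avoid floating-point rounding:

```python
CATEGORIES = 11_000          # categories left after the locations were removed
LISTINGS_PER_CATEGORY = 2    # "a couple of listings" per category (assumed to mean two)
CENTS_PER_LISTING = 15       # $0.15 per completed listing

total_listings = CATEGORIES * LISTINGS_PER_CATEGORY
total_cents = total_listings * CENTS_PER_LISTING   # integer cents avoid float rounding
total_dollars = total_cents // 100

print(f"{total_listings:,} listings for about ${total_dollars:,}")
```

That works out to 22,000 listings for $3,300, in line with the rough estimate above.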

This is part of the strategy I use to build my directories. With this approach, I don't have to put up a small set of categories initially. I like to hit the ground running and create a strong first impression when visitors or webmasters make their first visit.

Hope that helps!

Best,
Kirk out.

Wow Kirk. Thanks so much for

Wow Kirk. Thanks so much for the advice. It's a great idea. I'm looking into it as we speak. And yes, I have long-term goals and too much time and money invested in it to just let it go to waste.

Other idea

Shawn, what about dynamically generating a robots noindex meta tag for the categories that don't have any websites listed?

I've already looked into that

I've already looked into that, and it seems the only way I can figure out how to do it would be in the body of the document. Would a robots noindex, nofollow work if it's placed in the body of the document (not the header tags)? If so, that would be the IDEAL way of doing it.

No it won't. It is metadata a

No it won't. It is metadata and needs to be placed in the head of your document.

Yeah I didnt think so. I did

Yeah, I didn't think so. I did figure out a way, though, to query the database to see if there are no listings. If there are zero listings, it would insert a noindex, nofollow meta tag in the header.

So in addition to removing the empty cats, and removing cross-referencing, I'm going to implement this today. Thanks for the suggestion fishyking.
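The check Shawn describes could look roughly like the Python sketch below. The `listings` table, its `category_id` column, and the function name `head_for_category` are hypothetical stand-ins, since the actual directory script's schema isn't shown in this thread.

```python
import html
import sqlite3


def head_for_category(conn: sqlite3.Connection, category_id: int, title: str) -> str:
    """Render the <head> for a category page, adding noindex when it is empty.

    Assumes a hypothetical `listings` table with a `category_id` column.
    """
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM listings WHERE category_id = ?", (category_id,)
    ).fetchone()

    lines = ["<head>", f"<title>{html.escape(title)}</title>"]
    if count == 0:
        # Empty category: ask spiders not to index it or follow its links.
        lines.append('<meta name="robots" content="noindex,nofollow">')
    lines.append("</head>")
    return "\n".join(lines)


# Tiny in-memory demo with one populated category (1) and one empty one (2).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (id INTEGER PRIMARY KEY, category_id INTEGER, url TEXT)")
conn.execute("INSERT INTO listings (category_id, url) VALUES (1, 'http://example.com/')")
```

In the demo, category 1 gets a plain head while category 2, which has no rows, gets the robots tag. Once a listing is added, the tag disappears on the next page render without any manual step.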

It will be...

...interesting to see how this pans out Shawn, the very best of luck to you!

I hope some of the other better-known directories out there have seen this; there's much to learn from some very knowledgeable folks, some of whom have been building directories for 7/8yrs or more...

"Yeah, they get dropped out o

"Yeah, they get dropped out of the index, like any other type of page on any other type of site."

I just wanted to clarify something with the above statement. The only pages that get dropped from Google's index are those that return a 404 or other forbidden response, and those that are manually and/or automatically removed for various reasons including but not limited to: penalties, banning, technical reasons, etc.

Pages that return a 200 status in the Server Header will probably remain in the Google index forever. They just won't be found through normal search routines. You have to dig for them.

410

pageoneresults, what do you think of sending a 410/Gone for all the pages that are currently listed in the Google index that Shawn's pulling off-line? This is the approach I usually take.

Absolutely!

10.4.11 410 Gone

The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval. If the server does not know, or has no facility to determine, whether or not the condition is permanent, the status code 404 (Not Found) SHOULD be used instead. This response is cacheable unless indicated otherwise.

The 410 response is primarily intended to assist the task of web maintenance by notifying the recipient that the resource is intentionally unavailable and that the server owners desire that remote links to that resource be removed. Such an event is common for limited-time, promotional services and for resources belonging to individuals no longer working at the server's site. It is not necessary to mark all permanently unavailable resources as "gone" or to keep the mark for any length of time -- that is left to the discretion of the server owner.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

410 Incorrect Advice

Wait a minute, the 410 is incorrect advice. Why? Because those resources will be populated at some point in time. You don't want to instruct the user agent that they are gone permanently. Sorry about that, wasn't thinking clearly. They need to be 404'd.

Posted by me: >Would a rob

Posted by me:

>Would a robots no index, no follow work if it's placed in the body of the document (not the header tags)?

Let me clarify this here 'cause I don't want people to get the wrong idea. I know where the meta data are placed. But there is a non-standard meta tag called a snippet tag. I was not aware of how Google uses this tag. E.g., I thought you place it before and after, say, a blockquote of something you don't want indexed, but apparently it's just another meta tag.

Side note: I'm really shocked at the maturity level of some of the people in the seo community.

>They need to be 404'd. Wh

>They need to be 404'd.

When a document URI is no longer in place, it is the default to return a 404, no?

nosnippet

There is a Google Specific Robots META Tag...

nosnippet
Google will not display snippets and will not archive a copy of the document (Google's Cached Page). A snippet is a text excerpt from the returned result page that has all query terms bolded.

The above is in response to Shawn's question about the nosnippet.

Yes

Though you might want to keep the resource but send a 404 header - it would save nuking it and then re-doing it. This may or may not be an issue depending on the software.

If your scripts work as I think they probably do, then you can identify those pages as you said earlier by finding them in the DB, and instruct the script to return a 404 before sending the page. That should work absolutely fine but still keep the page intact for future use.

P1, that sound right to you?

But make sure...

that the server doesn't send a 200 straight after the 404! heh...

Repopulation, 410, & 404

Actually, knowing this directory software's file naming schema, since Shawn has disabled location geo cross-referencing (I assume permanently), those pages will never be resurrected. Instead of removing the empty pages in the baseline directory, I recommend he simply leave them on-line and populate each page with a couple of listings, pronto. I'd go 410 for the pages that are permanently dead (location-based pages).
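Pulling the status-code suggestions in this thread together: 410 for pages that are gone for good (the location pages), 404 for empty categories that may be filled later, and 200 for everything else. A hedged Python sketch follows; the `Page` class and `status_for` function are hypothetical names, and whether to 404 empty categories at all or simply populate them is the judgment call discussed above.

```python
from dataclasses import dataclass


@dataclass
class Page:
    listing_count: int
    permanently_removed: bool = False  # e.g. the dropped location pages


def status_for(page: Page) -> int:
    """Pick the HTTP status to send before any page body is written."""
    if page.permanently_removed:
        return 410  # Gone: the URL will never come back
    if page.listing_count == 0:
        return 404  # Not Found for now; the category may be populated later
    return 200
```

The status has to be decided before any output is sent, which is also how you avoid the "200 straight after the 404" problem mentioned earlier.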

The script I use is set up to

The script I use is set up to have only one url forever, and if that url is deleted it will not be populated again with a new database entry. So if a category (in this case) is deleted it will return a 404.

>Shawn has disabled location

>Shawn has disabled location geo cross-referencing (I assume permanently), those pages will never be resurrected.

Another theory to consider is if the database inserts a no index tag on non populated categories, would it be beneficial to keep the geo locations? Eg, the only ones that would be spidered are the ones with listings..?

no index tag

If UTN were my directory, I'd want to present only useful and active pages to the search engines. Why make them crawl and crawl and crawl through a bunch of pages tagged with noindex? I see no benefit on any front. 11K pages is still a VERY healthy directory, especially when all those pages have high-quality listings.

Point taken kirkvan. Thank y

Point taken kirkvan. Thank you. :)

License

Hi all,

Just a quick note to say that if anyone has an interest, Kirk has some nice directory software for sale - he's not going to use it and would like to sell the license (all above board) to someone rather than have it go to waste.

If you want more details, just send a message to Kirk and I'm sure he can fill you in...
