Article Banks: Content Thieves?


Martinibuster posted a thread in the WMW supporters section about finding his content articles stolen and syndicated across many article banks.

Nuevojefe made the following suggestion:

The rule was that only original articles that had not been submitted elsewhere and didn't reside on the author's website would be accepted. The suggestion was to automate the process of searching for a Google snippet. The script would grab three random strings of text from each submitted article and run quoted searches in Google to identify whether the article was indexed elsewhere.
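A rough sketch of the snippet-picking half of that idea, in Python. The function names, the eight-word snippet length, and the `seed` parameter are my own assumptions, not anything Nuevojefe specified. It only builds the quoted query strings; how they get submitted to a search engine is left out on purpose, since the suggestion doesn't say.

```python
import random
import re


def pick_snippets(article_text, count=3, words_per_snippet=8, seed=None):
    """Return up to `count` random runs of consecutive words from the article.

    The snippet length and the seed parameter are illustrative assumptions.
    """
    words = re.findall(r"\S+", article_text)
    if len(words) <= words_per_snippet:
        # Article is too short to sample from; use the whole text, if any.
        return [" ".join(words)] if words else []
    rng = random.Random(seed)
    max_start = len(words) - words_per_snippet
    # Choose distinct starting positions so the same snippet is not picked twice.
    starts = rng.sample(range(max_start + 1), min(count, max_start + 1))
    return [" ".join(words[s:s + words_per_snippet]) for s in sorted(starts)]


def to_exact_match_queries(snippets):
    """Wrap each snippet in double quotes to form an exact-phrase query."""
    # Embedded double quotes are dropped so they cannot break the phrase.
    return ['"{}"'.format(s.replace('"', "")) for s in snippets]
```

An article bank could run each submission through `pick_snippets`, pass the result to `to_exact_match_queries`, and hold the article for manual review if any query turns up a match elsewhere.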

But I doubt most article bank owners even care.

I am sure this is nothing new, but what are the best ways to deal with it?



Just file a DMCA complaint with the article bank and their ISP and watch it hit the fan.

I periodically scan for exact strings using SEO tools and landed on a few this weekend which have me up in arms - someone might lose his web site tomorrow unless he complies.

May be easier just to e-mail the webhost - US hosts especially tend to be very compliant, and a polite e-mail to them usually works far faster than a DMCA, IMO.


People sell SEO article sites on eBay, claiming they're ready to go for advertising that'll earn hundreds, if not thousands, a day. They make no mention of whether they've gotten permission from the authors.

Um, Brian...

I email them DMCA complaints - it works wonders and is quick to do


Question: does it actually have to be your copy for you to file a DMCA complaint, or can it be anyone's copy?

Amen to that. We should probably be encouraging it, finding new ways to get content stolen. Soon enough people will be paying to give it away. After all, the future of media is distributed.

Why lose sleep?

Said by someone who obviously has no clue how rampant the scraping is, how bad the bandwidth charges can become, and how others are shamelessly profiting from your work.

I don't lose sleep anymore as I've got custom automated tools stopping crawl attempts 24/7 - If I don't give them permission they don't get in.

It's my server, my content, my bandwidth and I'm re-taking full control of what's mine.

You obviously don't understand the extent of the damage these idiots can cause, as my revenues are actually UP since I've been blocking them and giving them poison data.

Lose more than sleep

It's really all about the importance of the content and the site to YOU. I've got plenty of sites where I couldn't give a shit whether all the content gets stolen; I've got others which, if infringed, I'd have the lawyers fire the big guns over.

Also, there are many different levels of nefariousness when it comes to the use of stolen web copy. My organization has had everything from minor infringement where another party decided to borrow a paragraph to supplement their own content (or who knows, maybe they aggregated a few paragraphs from a few sources), to the whole opposite end of the spectrum where our entire site was lifted with the intent of duping visitors into giving up credit card information and enough other info to commit identity theft. How lovely of them to leave our fuckin' logo on the site!

