Scraper sites are sites composed of content taken from other websites. Usually scraper sites contain short excerpts of content, since republishing excerpts is easier to defend from a copyright perspective than republishing entire works. By scraping a site that has semantically well-crafted content in a given topical niche, you are basically mimicking a successful competitor's keyword presence. Lately they have become silent partners for free, generating sales for me at no cost!
Mimicry is the highest form of flattery, so they say. It is also "content theft", according to others. Scraper sites are often credited to "scraper scum", and scraper sites are often called SPAM. In my opinion scraper sites are only SPAM when the material scraped is search results. Certain search engines love to eat their own dog food -- that is, they give preference in the SERPs to scraper pages composed of... those very same SERPs. Go figure.
Now other people's scraper sites are making me money. I love it.
How can that be? Evolution may take time, but it's a wonderfully efficient process of rewarding the fittest and weeding out the weak. On a few projects, my content ranks at the top for nearly every worthwhile search phrase in my niche in the major search engines. I got there through a deliberate, carefully executed campaign based on almost a year of research and trials. Unless you copied each of my pages and put together a copy of my site, including its back links, you could not replicate my results. I have been examined repeatedly by webmasters (lots of referral traffic from sophisticated linkdomain constructs and the like) and scraped relentlessly. I was 302'd to death a few months ago, from all around the world. By the way, my site is ugly and appears haphazard. I can understand why it is so heavily researched by others... they probably can't see the obvious because the details they view as important are so seemingly random. Affiliate links out in plain sight, lots of (too many?) banners, HTML that won't validate, etc.
After a while at the top of the SERPs, my affiliate links started to appear in the SERPs. By that I mean an affiliate link of the form http://www.vendorsite.tld?id=my_affiliate_id showing up as a result in its own right. I can surmise that with the success of my pages at the top of the SERPs, and because those pages included multiple back links to affiliate landing pages which included my affiliate IDs as shown, those vendor landing pages started to rank for relevance. Not at the top, mind you, but certainly high enough to start generating affiliate sales for me, independent of my own websites.
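For anyone who wants to check which affiliate ID a given SERP result carries, here is a minimal sketch using Python's standard library. It assumes the vendor uses a query parameter named id, as in the example URL above; the function name affiliate_id is just something I made up for illustration, and real vendors may use a different parameter name.

```python
from urllib.parse import urlsplit, parse_qs


def affiliate_id(url, param="id"):
    """Return the affiliate ID carried in the query string, or None.

    `param` is an assumption: the example link uses ?id=..., but a real
    vendor may name the parameter differently.
    """
    query = urlsplit(url).query
    values = parse_qs(query).get(param)
    return values[0] if values else None


# The example link from the text carries the affiliate ID.
print(affiliate_id("http://www.vendorsite.tld?id=my_affiliate_id"))
# The vendor's bare home page carries none.
print(affiliate_id("http://www.vendorsite.tld/"))
```

Run over a list of result URLs for a search phrase, this would show at a glance which listings are the vendor's own pages and which are somebody's affiliate landing pages.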
Think about it. That's lead generation for FREE. No bandwidth, no use of my web pages, no liability (?), just pure profit. Sounds good so far.
Those experienced in the audience have already recognized the obvious... as scraper scum build scraper sites from the SERPs, my web pages are naturally included (as evidenced by the 302 hijacking of a few months ago) but SO ARE THE VENDOR PAGES WITH MY AFFILIATE IDs. Count those as backlinks to me, and backlinks to that vendor site, but with my affiliate ID. More scraper sites means more backlinks, and more references to me as being "important". Plus, more references to my vendor as being "important".
Now the theoretical questions.
Is a URL with a query string treated by the search engines as a unique page, or as a variant of the "canonical" page (without the query string)? This is argued frequently. As the SEs "improve" the engines, and start crawling those "dynamic pages" using some compromise along these lines, I win. My pages rank, and my vendor ranks FOR MY AFFILIATE LANDING PAGE. The rich get richer, no?
What about the *other* gazillion affiliates whose backlinks to the same vendor site are supposedly signaling the hierarchical importance of that vendor site for the topic of interest? Does it work on the canonical domain, on the page, or is the URL-with-query-string treated as a page? Is some "page rank" passed upwards to the vendor's domain (root), and if so, is it then distributed back downwards to other pages on that domain, including my affiliate landing page? More affiliates means more relevance for my pages again. The rich get richer again. I'm liking this for sure.
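To make the two readings concrete, here is a rough sketch that tallies the same set of backlinks both ways: once with each URL-plus-query-string counted as its own page, and once with everything collapsed onto the canonical URL. This is only an illustration of the question, not a claim about how any search engine actually counts links. The backlink list is invented, and the canonical() helper (drop the query string and fragment, lowercase the host) is my own simplifying assumption.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def canonical(url):
    """Hypothetical canonical form: same scheme, host and path, no query."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", "", "")
    )


# Made-up backlinks found on scraper pages and affiliate sites.
backlinks = [
    "http://www.vendorsite.tld?id=my_affiliate_id",
    "http://www.vendorsite.tld?id=my_affiliate_id",
    "http://www.vendorsite.tld?id=other_affiliate",
    "http://www.vendorsite.tld/",
]

# Reading 1: every distinct URL (query string included) is its own page.
per_url = Counter(backlinks)

# Reading 2: all variants are credited to the canonical page.
per_canonical = Counter(canonical(u) for u in backlinks)

print(per_url)
print(per_canonical)
```

Under the first reading my affiliate landing page collects its own backlinks and can rank on its own. Under the second, every affiliate's links pile up on the vendor's page. What I seem to be seeing is some compromise between the two.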
When we hear about the search engines addressing the affiliate situation, it is usually limited to those who link directly to the vendor site via PPC ads and such (not those who build pages which may link to the vendor site). I understand the desire of SEs to avoid representing the same vendor multiple times in the SERPs; that is a basic SEO strategy to be moderated. But if landing pages with affiliate IDs are now ranked independently from the sites that house them, they compete with the vendor's own pages. Why am I winning that competition? I am not generating the majority of traffic to that vendor (I may be one of their super affiliates, but my traffic is not larger than the combined traffic from all other sources -- not even close).
I can hope that the search engines have bigger problems than this, so they won't try to track affiliate IDs, or revert to not indexing "dynamic URLs". There's no need, because this current situation serves the user well. They're buying via the links provided, so they are being served.
So scrape away, scraper scum. Continue generating leads and sales for my vendor under my affiliate ID, and I will continue to not share the profits with you. It's been working well so far! I look forward to continuing our unilaterally beneficial relationship, and finding more like it.