Threadwatch to Build Killer Link Analysis Tool, Give it Away Free

79 comments

UPDATE: Due to comments made by Google and many, many warnings from 3rd parties, our major sponsor has decided to pull out rather than risk repercussions from the Search Engines - We are in talks with other companies, and are working to find a way to build the tool whilst remaining within Google TOS - See GoogleGuy's comments pointing out the TOS for more details.

At present, we are in a state of indecision: some questions need to be answered:

  • Can a link analysis tool be built within Google TOS and still be useful?
  • If we do get to build it, and it's within TOS, will our sponsor, and Threadwatch (and any other sites I build), be free from repercussions?

Back to our regularly scheduled programming...

Threadwatch, thanks to major sponsorship from Unspecified, will be building the link analysis tool to end all link analysis tools.

Here's how it works:

  • Unspecified pay for the initial development
  • JasonD and DaveN provide the programming and technical expertise respectively
  • We ask you what features you want in a dream link analysis tool
  • We build it, based on your input
  • We give it away for FREE

Follow the title link for more...

Some further Details

Here are some short answers to questions I think you may want answered. If you have more, please just comment, and I'll answer them.

What kind of Tool?
We have a very simple (in concept) goal in mind: To build the best link analysis tool on the market. Period. It will be a standalone tool that you can download and run in your browser, with no "phone home" shite associated with it. The only thing hosted at Threadwatch will be the download page where you'll be able to upgrade it as improvements are made.

Sponsorship
Unspecified are the main sponsor. The amount will not be disclosed, but it's considerable, and 100% of it will go into building the tool. There are opportunities for 10 smaller sponsors (this is why it's free...) and anyone wishing to get involved should PM me.

Jason is donating the project management (as well as collaborating with DaveN on the technical SEO aspects) and DaveN his considerable SEO expertise; these guys will also be named and credited. Both JasonD and DaveN have STELLAR reputations for SEO, and having their skills installed in the tool will make it truly awesome.

What's in it for Threadwatch?
Firstly, there is the viral nature of building the best link tool on the market and giving it away for free; secondly, additional minor sponsorships will help to monetize TW. We've talked about this subject so many times, but I've not managed to put much together to date, and this looks like a win-win-win to me. You win as you get a great free tool, the sponsors win as there are limited slots and this tool will go mentally viral, and I win as I get a few $$'s in my pocket and can stop eating boiled beef and cabbage for a week or two (only kidding :-)

This also carries on the initiative we started with the CashKeywords giveaway: while Threadwatch is still looking for viable ways to monetize, with the utmost care and attention given to the membership, you can enjoy another cool freebie.

More Questions and Feature Requests
Fire away, I'm here all evening and will happily answer the points I will have obviously missed, just ask...

Comments

Well...

Apart from the fact that they're not involved right? heh..

Welcome to Threadwatch kservik, do introduce yourself

Love the idea

Just want to throw in my support. I just love the idea, and it is an incredibly smart marketing move from TLA.

Yahoo API

I agree with SEOMike's sentiment, but I think the Yahoo API is the way to go.

It would be nice

Have the tool display a prominent banner showing SEOMike's post (my vote for TW comment of the year) :O)

Yahoo is the solution

Just use Yahoo!'s new API. It will work fine. Google doesn't want to play, so don't play. Google is not needed, nor wanted. :)

re: Yahoo is the solution

>> Yahoo is the solution
You're probably right; backlinks are more reliable there too.

Just do it

I say any organization or company that scrapes 8 billion pages of copyrighted content, then uses that content as a backbone to sell ad space, doesn't get the right to have a TOS that says it can't be scraped back.

You scrape our back we scrape yours. You make more money at it so shut the hell up and take it like a man :)

I say build the thing. It's a service with ads on it, just like a SERP. If they wanna get legal, then maybe we should get a nice fat class action lawsuit going for the royalties on all our scraped sites that have generated the AdWords revenues G makes. Would only be fair, right? Come on, would G be a 200 dollar stock without your content, mine, and the mom and pop stores'?

I know I have sites in the SERPs that are piggybacked by AdWords ads that cost 15 bucks a click. I'm sure there's enough to go around :)

TOS, GoogleGuy and AutoLink

n/a

seo fun

Wouldn't it be funny if the TOS included a clause that prohibits SE employees and agents from downloading, examining, or using it? (Or perhaps just certain SE employees and agents.)

Wouldn't it be funny if a "submit your site" webpage promised to submit your site for free, provided you agreed to contribute a certain small percentage of your background CPU power and local internet access to this new BLA-at-home distributed computing project? Think how attractive it would be to have your site submitted by Jason and Dave...

Googleguy - DaveN

It could have been my fault, and I'm giving Google the benefit of the doubt... as GG put it, "I would ask that it not scrape Google"... that's fine. I have sent an email to the contact I have at the plex and trust his judgement... if we are going to start shooting people, shoot me :) I'm much bigger than GG ;)

DaveN

If Google has an objection to this kind of thing

Why are Axandra et al still able to sell their link management software?

Googleguy did I miss something

So we all have our wish lists, and the tools will be developed and swapped amongst people, but with facilities to stop you tracking them.
You know this is already going on and will continue more and more.

Nick, Dave and Jason could build a tool that would become the de facto tool; you could have access to see what it does, and even point it at servers you own so as to not hit the rest of your network. Plus you could look at the data in isolation, see what is going on, and do your own analysis.

Overture already do this with their partner ppc management tools.

Bravo Nick for being so honourable and mentioning it to the SE's, rather than just building it and selling it under the counter.

DougS

Napster like sharing

The thing that would make this tool rock even more, is if there was a way to share info programmed in.

OK, many would be scared to, on the trust basis :)

But imagine the info a group could put together!

Free Link Analysis Tool - You Decide the Features

Your wish can come true if you participate in the following blog post on Threadwatch about the Killer Link Analysis Tool, Give It Away Free.

Several suggestions, aka my wishlist:

  • Customizable filters for viewing only certain details
  • Sort data
  • Abilit...

Do no evil?

heh - perhaps we should just be grateful that clearly we're more evil-minded than GG?

Driving Underground

The thread/idea above, with comments, will likely be done irrespective of whether it is even remotely connected to TW (maybe in private, methinks). I can almost say that it is probably already out there, but now with added TW input.

I can see why people are backing off; it is a dangerous game. SE's, well Google, who rely heavily on linking, will not want their privates on show to be reverse engineered using an easily accessible reverse engineering tool. I can see why they don't want a load of muppets releasing a beast upon them.

I am with you gurtie; keeping quiet would have given untold riches: proxies would have been identified, money terms identified and watched, etc etc.

>> Please consider this a polite request not to produce software that scrapes Google without our permission.

I can entirely see the position Google are coming from, but I feel it's rather short-sighted and naive of them [you].

Google likes collecting data, this is obvious. Tools that scrape your data will continue to be written without your permission -- whether this one is or not, I don't know, but should it not be, someone else will take note and do it anyway -- the crux of the matter is how much you know about it. I'd suggest that having tools scrape from one IP (block) or one UA string is a far better option than the alternative: proxied scraping. Inherently slower that may be, but the data would be far harder to track, and what good does that do you? At least when it's "above the water" you can make use of the data you can scrape about the scraper ;)

Using the API may be an alternative, but were it not, Google might be better off asking for restrictions - one query a second, one UA string - than explicitly denying the usage altogether and driving tools of this ilk underground.

(of course, the slower the tool goes, the less of an argument server load becomes, because I could do it all manually, and that would increase load, simply by virtue of images loading and whatnot)

Plus, faux queries still count as queries, and I'm sure handling more queries a day would not look bad on the investors' board reports :p
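
For what it's worth, the "restrictions rather than prohibition" idea is trivial to honour in code. Below is a minimal, hypothetical Python sketch of a fetcher that sticks to one fixed UA string and a maximum of one query per second; the names are invented, and it deliberately targets no particular engine, since scraping Google would still breach their TOS whatever the pace:

    # Hypothetical sketch: one fixed User-Agent, at most one query per
    # second, all from a single IP. The UA string is made up.
    import time
    import urllib.request

    USER_AGENT = "ExampleLinkTool/0.1 (contact@example.com)"
    MIN_INTERVAL = 1.0  # seconds between queries, per the suggested limit

    _last_request = 0.0

    def polite_fetch(url):
        """Fetch a URL no faster than once per second, with a fixed UA."""
        global _last_request
        wait = MIN_INTERVAL - (time.time() - _last_request)
        if wait > 0:
            time.sleep(wait)
        req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        _last_request = time.time()
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode("utf-8", errors="replace")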

Talks with Yahoo

Well, we've made contact with Yahoo! and talks have gone well. I didn't do much other than say hi, and I've not got all the details from DaveN yet, but it sounds promising so far...

No sign of Google yet, but we're hoping they respond still...

and here i am...

http://www.threadwatch.org/node/814

Welcome, and do say hi in the above thread, linkster...

A very good point Linkster an

A very good point Linkster, and hi and welcome (I'm sure Nick will pop in and send you over to the welcome thread :) )

I don't agree that the search engines are leeching our resources without us having any opportunity to stop them doing so, but I do feel a watered-down version of your points of view is extremely valid.

Search engines rose to prominence by breaking new ground and entering new territory. This tool will be entering similar areas, building upon research that the engines have already done. They make that information public already (by displaying the SERPs), so by partnering with TW in the development of this application they will be giving back to the community of webmasters that helped them rise to prominence and build multi-billion-dollar companies.

Hello First time poster

I may sound crazy here, but how can Google/GoogleGuy ask you not to make a tool that uses their public data? They seem to scrape data from the entire internet themselves without asking, and they do this to make a profit (indirectly). Their whole biz model depends on them taking the data you produced and displaying it for the purpose of making cash... think about that. Now, they may say "well, put up a robots.txt", but how many average webmasters know to do that? What if I put up a terms of service on all my sites tomorrow saying it was not OK to scrape my data and display it on any commercial site, would they stop? Hell no, they wouldn't care one bit, and you guys should not either. If they can do it, I don't see why you can't use their public data.
Thanks

I think you're close to spot

I think you're close to spot on Gurtie, but I actually believe that the search engines can get more out of it than that. They have the chance of knowing about, and liaising with, the developers, funders and supporters of what will become the leading SEO tool out there.

Through discussion, amazing things can happen. A touch extreme, I know, but if these guys can chat, I am sure that SEOers and SEs can as well!

the SE's would have benefitted indirectly surely?

If a really good and free tool were available, then not only would a lot of SEO's use it, but the SE's would be able to see exactly what we were seeing and know what we were working from.

If I were them, I would have kept my mouth shut and let it go ahead... by stopping it, you just get a load of good researchers/programmers using their own unique tools which do all of this stuff anyway, and a load of the rest of us paying some backstreet programmer to hack something up which does the same. How much better for them to reverse engineer the tool a lot of SEO's use and break it?

I really don't think I'll ever learn to think like an SE rep.

Limitations are Huge

You can't have a win-win situation on this topic. You either go the WPG route and don't care (SEOs win - Google loses), OR you go the API route and limit the users big time (SEOs lose - Google wins).

SE's

I would find it hard to imagine many major SE's being interested in helping such a project along. Google have made a great point of limiting the amount of useful data that SEO's can access over the past year.

And as Xan has made a point of in SEW, SE's can easily see SEO's as interfering with their processes, rather than trying to add anything useful to them.

So I guess I wouldn't expect too much offered from discussions with search engines.

I figure there are compelling reasons as to why such a wholly comprehensive tool suite has not yet been developed, and people like Barry Schwartz and Shawn Hogan are the ones to talk to about what can actually be added in terms of tool features, with regard to their perception of the limitations.

Legal Issues

Barry has a good thread on the legal issues concerned over at SEW - do go see, it's a great post...

I personally feel 90% certain

I personally feel 90% certain the app will happen, guys, so don't worry about that. As Nick said, we are chatting to the search engines (not just Google), hoping to work with them rather than do this behind their backs.

Ultimately, if we end up in a position where the app can't go ahead with the search engines being happy, then there are alternatives. Watch this space :D

Agreed

Good point - you know what I'm like, write first, think later :)

I have changed the original post, just prior to seeing your comment NFFC, so I'm with you on that.

I understand that the guys are attempting to open up discussions with the SE's, and until we hear from them, the project remains ON ICE.

Cheers

No and No

It's just a bad idea all round, imho.

And while I'm in that groove, the "SE repercussions" spin is a little too much; everybody should have the right to protect their own sites/servers.

Talks with Search Engines

We are currently talking to several search engines - the idea being that through discussion, we may move forward...

Marginalize Google

>>Can a link analysis tool be built within Google TOS and still be useful?

Probably. With enough other sources, yes.

>>If we do get to build it, and it's within TOS, will our sponsor, and Threadwatch (and any other sites I build) be free from repercussions?

No. You never are and never will be.

What about the API?

Something like Digitalpoint do - everyone has to sign up for the API to use the tools and they just supply the technology?


multilingual & theming

I would like to know where the links come from (countries...) and what theme the links are coming from (there might be a database with topics, sub-topics and keywords attached to these categories to identify a general theming structure...)

Hi Chris. Main reason for

Hi Chris.

Main reason for a browser-based version 1 is the speed and cost of development, to make it as multi-OS as possible.

If we output HTML, it's one interface that looks the same (pretty much) across all operating systems, which means we don't need to code a separate GUI for Mac, Windows, Linux etc. users.

:)
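
To make the model concrete, here is a minimal, hypothetical sketch of that architecture in Python (not the actual tool's code): a tiny HTTP server bound to localhost that serves plain HTML, so the same UI works on any OS with a browser and nothing leaves the machine:

    # Hypothetical sketch of the "runs in your browser, on your machine"
    # model. Binding to 127.0.0.1 means only this machine can reach it.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    PAGE = ("<html><body><h1>Link Analysis (local)</h1>"
            "<p>All data stays on this machine.</p></body></html>")

    class LocalUI(BaseHTTPRequestHandler):
        def do_GET(self):
            body = PAGE.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8080), LocalUI).serve_forever()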

Why a browser app?

As JasonD has seen, I have been developing my own link analysis tool (using the Google API, before I get a b'tch slapping off GG, heh) and quickly found that doing it in the browser is problematic; beyond basic link analysis, the queries required are just too long. Unless it will be a Java applet (that definitely would make it cross-platform)?

My only feature request that will be different to others' is: please do not make it MySQL-only if you are going to use a datastore. Either make it file-based (an XML data store would be cool, then you could extract the data) or DB-agnostic (web services, an abstraction layer or somesuch), as I really do not want to have to install stuff I don't need on a computer just for one app!

PM me if any of you want to see the aborted browser one.
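
On the file-based option, something like this would do: a hypothetical sketch of an XML datastore using only Python's standard library, so there is nothing extra to install (the element names are invented):

    # Hypothetical sketch of a file-based XML datastore: a portable file
    # you can open, grep, or transform, with no database server needed.
    import xml.etree.ElementTree as ET

    def save_backlinks(path, target, links):
        """Write (url, anchor) backlink pairs for `target` to an XML file."""
        root = ET.Element("backlinks", target=target)
        for url, anchor in links:
            ET.SubElement(root, "link", url=url).text = anchor
        ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)

    def load_backlinks(path):
        root = ET.parse(path).getroot()
        return root.get("target"), [(e.get("url"), e.text) for e in root]

    save_backlinks("links.xml", "example.com",
                   [("http://blog.example.org/post", "great widgets")])
    print(load_backlinks("links.xml"))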

Break down all of the backlin

Break down all of the backlinks by IP and then by C class (or have it as an option) so I can see if a certain site is buying up site-wide links with a certain forum or portal.

Once I can see that site X has 3000 links from a certain site, what % have the same anchor text, and the proximity within other text, i.e. are they just footer links or are we talking links in articles etc...

Not sure how possible it would be to then go off and query for whois data, to see if we are just talking about large networks owned by one guy/gal.
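
The C-class and anchor-text parts are straightforward once the raw backlink data is in hand, wherever it ends up coming from. A hypothetical Python sketch (the sample data is invented):

    # Hypothetical sketch: group backlink IPs by their first three octets
    # (the "C class") and report how concentrated the anchor text is in
    # each block, to spot site-wide or single-network link buys.
    from collections import Counter, defaultdict

    def c_class(ip):
        return ".".join(ip.split(".")[:3])  # "66.102.7.99" -> "66.102.7"

    def breakdown(backlinks):
        """backlinks: list of (ip, anchor_text) pairs."""
        by_block = defaultdict(list)
        for ip, anchor in backlinks:
            by_block[c_class(ip)].append(anchor)
        for block, anchors in sorted(by_block.items()):
            top_anchor, n = Counter(anchors).most_common(1)[0]
            share = 100.0 * n / len(anchors)
            print(f"{block}.*  {len(anchors)} links, {share:.0f}% say '{top_anchor}'")

    breakdown([("66.102.7.99", "blue widgets"),
               ("66.102.7.104", "blue widgets"),
               ("212.58.224.131", "example.com")])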

Hey GG and Tim

You both know how to get in touch with me; we will be staying within the guidelines, of course :) Would I ever bend the rules... ever :)

As for data collection, I don't think we are going to capture any; unless it gets run on servers, the local distribution model shouldn't anyway... unless we can find a real sneaky way without anyone knowing lol... (joke)

DaveN

Just to clarify a couple of p

Just to clarify a couple of points.

The tool will be running on YOUR machine, accessible via YOUR web browser. It is not a remotely hosted application and there is no central data repository. The only people who can see your research are those who have physical access to your computer.

The sponsors do not get access to any data at all. Nada, zilch, not one iota. All they get out of this tool is a warm fuzzy feeling in their bellies that they've helped deliver something very good indeed.

Oh, and a prominent position in the application noting them as the sponsor and hopefully your business :)

A "search within these results" option

I'd like to be able to search within the sites returned before the other info is collected. Often very helpful in narrowing the list.

Hilltop Type Thingumy

Pick a site, get all the backlinks for that site that also appear in the top 1000 for the same search, then split them down via category of IP.
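
At its core that's a set intersection once you have the two URL lists; where they come from (API, export, whatever) is left open. A hypothetical sketch, whose output could feed straight into the C-class breakdown requested above:

    # Hypothetical sketch of the "Hilltop type" filter: keep only the
    # backlinks that also rank in the top 1000 for the same query.
    def hilltop_overlap(backlink_urls, top_1000_urls):
        ranking = set(top_1000_urls)
        return [url for url in backlink_urls if url in ranking]

    print(hilltop_overlap(["http://a.example/", "http://b.example/"],
                          ["http://b.example/", "http://c.example/"]))
    # -> ['http://b.example/']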

Nice! This is my wish list:

- Flexible queries: I'd like to have a set of pre-defined models for queries, but also an option to allow changes of search engine. I mean that we can change the SE we query, the parameter that carries the query (i.e., in Google, q=), the operator, other custom parameters in the URL, etc., so that we can adapt to new search engines or to changes in current search engines, or make different queries (e.g., link:http://www versus linkdomain:www)
- What Optilink does (IP, PR, number of backlinks of the returned pages, title of the returned pages)
- To have a comprehensive set of results, an option so that after returning the first 100 results, we can keep widening the result list by filtering out the websites present in the first set. E.g., if we make a backlink check at Yahoo and it returns 100 links from 56 different websites, make the query again adding a -site:www.whatever.com for every one of the 56 different websites (see the sketch after this list)
- To return just the first backlink from every unique website.
- In the report: A column for all parameters present in the anchor tag other than the href, so that we can check if the backlinks have target, nofollow, title, etc.
- The anchor text, of course, or the ALT if it's an image.
- In the report: a column with all the anchor text for links that point to other external URLs, except for the URL we're checking.
- A custom field where we can place a snippet of code, and for every page returned, the tool checks if the snippet is present and says so in the report. So we can check if the pages where the links are have a certain word, or a certain tag, or a certain link.
- A list of co-occurring links ordered by number of occurrences: links to pages other than the one we are checking that are present in the returned set of pages.
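
The widening idea from the list above, sketched hypothetically in Python; search() stands in for whichever engine or API the tool ends up querying, which also keeps it engine-agnostic in the spirit of the "flexible queries" item:

    # Hypothetical sketch: after each batch of results, re-query with a
    # -site: exclusion for every domain already seen, so each round can
    # only surface links from new websites.
    from urllib.parse import urlparse

    def widen(target, search, rounds=3):
        seen_domains = set()
        all_links = []
        for _ in range(rounds):
            exclusions = " ".join(f"-site:{d}" for d in sorted(seen_domains))
            results = search(f"link:{target} {exclusions}".strip())
            new = [u for u in results if urlparse(u).netloc not in seen_domains]
            if not new:
                break
            all_links.extend(new)
            seen_domains.update(urlparse(u).netloc for u in new)
        return all_links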

Now, confess: you aren't planning to give this tool away, right? This is just a trick to harvest ideas for your own private backlink-checking tool, true?

I might be wrong, but I do no

I might be wrong, but I do not think the sponsors are gaining access to data. I believe they are just going to get ad space.

Giving a third party access to the linkage data you collect would make the tool worth way less than it otherwise would be.

How much data...

are the sponsors going to have access to?

Nice one Nick

RustyBrick said:

Quote:
I would love to see it plotted out visually

I second that. A tool that allows you to quickly visualise inter-relationships and patterns.

Me too - sorta

What mivox said about OS X, except not the X11 part, cos I never got that whole bit running (I think). A real OS X version would be cool.

Also ditto what SEOBook said.

For each site in the top 10 o

For each site in the top 10 of a query, I would like to see a breakdown of the anchor text in a pie graph, for each site, then merged overall for the top 10.
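
The tallying part is simple; this hypothetical sketch leaves the pie chart itself out and just produces the per-site and merged percentages (sample data invented):

    # Hypothetical sketch: per-site anchor-text shares, then the merged
    # distribution across the whole top 10 - the numbers behind the pies.
    from collections import Counter

    def anchor_shares(anchors):
        counts = Counter(anchors)
        total = sum(counts.values())
        return {text: 100.0 * n / total for text, n in counts.most_common()}

    top10 = {"site-a.com": ["blue widgets", "blue widgets", "widgets"],
             "site-b.com": ["cheap widgets", "blue widgets"]}
    merged = Counter()
    for site, anchors in top10.items():
        print(site, anchor_shares(anchors))
        merged.update(anchors)
    print("overall", anchor_shares(list(merged.elements())))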

Standalone

Quote:
Is this a tool that would be hosted exclusively at Threadwatch, or would it likely see deployment across multiple sites?

The idea is to build a standalone tool that will run in your browser, Brian, and allow other websites to distribute it, on the condition that they do not interfere with the interface and sponsorship stuff, of course.

Thanks GG

Thanks GG - Send regards to Rose. I am hurt.

concept

Is this a tool that would be hosted exclusively at Threadwatch, or would it likely see deployment across multiple sites?