New Scientist Digs up Old Google Patent

Source Title:
Google searches for quality not quantity
Story Text:

A New Scientist article says that Google will dramatically improve the results of internet news searches. Currently Google News items are ranked only by time of publication or by relevance. I had always (incorrectly) assumed that relevance was an indicator of quality, but it is based only on the search keyword. They will now add a "weighting" factor for the BBC, Reuters, CNN etc - blindingly simple! Yet as Gary Price points out, this patent is from 2003, which the NS article fails to mention...

The database will be built by continually monitoring the number of stories from all news sources, along with average story length, number with bylines, and number of the bureaux cited, along with how long they have been in business. Google's database will also keep track of the number of staff a news source employs, the volume of internet traffic to its website and the number of countries accessing the site.

So Bob's your uncle

Google will take all these parameters, weight them according to formulae it is constructing, and distil them down to create a single value. This number will then be used to rank the results of any news search.
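To make the "single value" idea concrete, here is a rough sketch in Python of how a few of the parameters listed above might be folded into one score per news source and then used to order search results. Everything specific in it is an assumption for illustration: the weights, the caps used for normalising, and the names `WEIGHTS`, `CAPS`, `source_score` and `rank_results` are all made up, and neither the article nor the patent summary gives Google's actual formulae.

```python
# Hypothetical weights: how much each parameter counts towards the score.
# These numbers are invented for illustration, not taken from the patent.
WEIGHTS = {
    "stories_per_day": 2,
    "avg_story_length": 1,
    "bylined_stories": 1,
    "bureaux": 2,
    "years_in_business": 1,
    "staff": 2,
    "site_traffic": 3,
    "countries_reached": 2,
}

# Hypothetical caps: a value at or above the cap counts as "full marks",
# which puts every parameter on the same 0..1 scale.
CAPS = {
    "stories_per_day": 500,
    "avg_story_length": 1000,
    "bylined_stories": 300,
    "bureaux": 50,
    "years_in_business": 100,
    "staff": 2000,
    "site_traffic": 10_000_000,
    "countries_reached": 150,
}


def source_score(stats):
    """Distil a source's parameters into a single value between 0 and 1."""
    total = 0.0
    for key, weight in WEIGHTS.items():
        # Missing parameters count as zero; large values are clipped at the cap.
        normalised = min(stats.get(key, 0) / CAPS[key], 1.0)
        total += weight * normalised
    return total / sum(WEIGHTS.values())


def rank_results(results, sources):
    """Order news results by the score of the source that published them."""
    return sorted(
        results,
        key=lambda item: source_score(sources.get(item["source"], {})),
        reverse=True,
    )
```

A weighted sum is only the simplest way to combine the numbers; the point is that whoever picks the weights and caps decides which kind of news source floats to the top.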

Looks as if ThreadWatch is going to have to employ more people to get included (again!)

and a bit at the end - obvious too when you think about it

The patent also reveals that the same system could be roped in to rank other search results, not simply news. So sales and services could in the future be listed on the basis of price and the reputation of the company involved.


And the bit in the tail

(sorry my edit function does not seem to work!)
bumped the quote up to the end of the story


>>along with how long they have been in business.

Not always a reliable indicator in the news business. UPI is just a pale shadow of what it used to be, but at one time (12 years ago, before bankruptcy and the current owners) it produced better quality articles with fewer staff than AP.

No algo could have told you that; you have to actually put your slide rule down, read them, and use sound human judgement.

what's the big deal?

It looks to me as though this article is based on a 2003 patent and no other source. We've known for some time that Google has played around with authority sources. Maybe I am missing something?

more of the same

More indications that "competitive intelligence" (and the "management" of publicly disclosed competitive information, of course) is getting bigger and bigger.

"number of employees" and "number of bylines" and "how long they have been in business" are to be learned from where? D&B? Most of this stuff is easily spoofed sans repercussions, and there are not enough reliable trusted authorities for a Google to depend on without losing autonomy.

I think there's a reason this is still interesting after 2 years -- it sounds good but is impractical beyond the boring basics everyone can access.


Why does the length of the article matter? I'd hate to see people filling news reports with fluff just to get the article to rank better on Google. That'll just suck.

There is no such thing as "Fundamentalist SEO"

Google's ranks are filled with folks from the Intelligence Community. They maintain a proactive information warfare campaign for their benefit against competitors and external forces (uhh, the SEO Community or anyone else interested in manipulating their business products). In doing so, sometimes you get an abundance of information. Sometimes you get none. Sometimes you get ambiguous suggestions. Sometimes you get actual targeted answers. Sometimes they forecast accurate forward statements. Sometimes they flood the horizon with permutations to every possible variable, which is what these patents are doing.

As somebody who's had a little experience with the Intelligence Community, the Defense Industry, and even notices much of the same behavior now in political circles, I can assure you that, at the end of the day, you're better off thinking like you were a search quality engineer at Google rather than reacting to the publishing endeavors of the same.

There *is* such thing as "Fundamentalist SEO"

I agree with every word of your post - Google watchers should be much more concerned with information *not* included in public information, and Google behaviour.

I disagree with your title, though; the "Fundamentalist SEO" is one who sees Google as the antichrist and fails to accept that even search engines can evolve. And mutters strange incantations at a computer screen when their sites get dropped.

uh oh

And mutters strange incantations at a computer screen when their sites get dropped.

Hmmm.... now I am getting worried. This is quite an inclusive description of SEOs in general. Then again, it applies equally to bloggers LOL

* Ranking a Stream of News, accepted at www05

"In this paper, we introduce this problem by proposing a ranking framework which models: (1) the process of generation of a stream of news articles, (2) the news articles clustering by topics, and (3) the evolution of news story over the time. The ranking algorithm proposed ranks news information, finding the most authoritative news sources and identifying the most interesting events in the different categories to which news article belongs. All these ranking measures take in account the time and can be obtained without a predefined sliding window of observation over the stream. The complexity of our algorithm is linear in the number of pieces of news still under consideration at the time of a new posting. This allow a continuous on-line process of ranking."
