MSN Using Neural Networks for Ranking Algorithm

17 comments
Source Title:
Local, Relevance, and Japan!
Story Text:

MSN Search's Ken Moss talks about recent innovations in their Search algorithm on the MSN blog. One interesting thing mentioned is the introduction of neural networks...

In collaboration with Chris Burges and other friends from Microsoft Research, we now have a brand new ranker. The new ranker has improved our relevance and perhaps most importantly gives us a platform we think we can move forward on quicker than before. This new ranker also is based on technology with an awesome name -- it's a "Neural Net."

Findory's Greg Linden says

While applying machine learning techniques to relevance rank for web search is common, using neural networks is not. I am surprised to see neural networks used as part of the relevance rank in a system of this size and scope.

I can't claim much understanding myself; perhaps someone can put that into simple terms for mere mortals? :)

Comments

Try it out! OHMIGOD the results are fantastic!

They have produced an incredible improvement in search results. This is far better than what I am seeing on Google.

I need to test it more, especially with Local Search (which is where I've been concentrating most of my efforts lately). But it looks really, really good right now.

Google needs to shape up or MSN may just become a serious contender.

I never thought I would give Kudos to Microsoft, but they have just earned a Thumbs Up from me.

neural networks

Greg Linden's quote just means that neural nets were about the last thing anybody imagined would be used for scoring in a mainstream SE at this point in time.

Personally I find it very interesting.

Neural networks are a sort of "self-organizing algo", so to speak. It's a kind of "black box" that you present with a set of inputs (e.g. a few billion interlinking web pages) and a set of desired results as well (e.g. the top two-three-ten for a query - looks like "top two" from the figure in the MS blog post). This process is known as "training".

That's basically it. You then set the network to rank something and it will turn out extremely horrible results. So, you have to train it again, and -- as you will quickly discover -- again, and again, and again.

Repeat ad nauseam, and eventually results will start looking somewhat like they should (most likely with some very strange and 100% inexplicable exceptions as well).
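A rough sketch of that train-again-and-again loop, for anyone who wants to poke at it. Everything here is an assumption made for illustration: the two-number page features, the `judged_quality` rule standing in for whoever supplies the desired results, the network size and the learning rate. It trains a tiny net on "this page should rank above that page" pairs, which is only loosely in the spirit of what the MSN post describes and is not their ranker.

```python
import math
import random

def hidden(W, b, x):
    """Hidden-layer activations: one tanh unit per row of W."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def score(params, x):
    """Relevance score of one page, given its feature vector x."""
    W, b, v = params
    return sum(vj * hj for vj, hj in zip(v, hidden(W, b, x)))

def mean_loss(params, pairs):
    """Average pairwise loss: small when 'better' outscores 'worse'."""
    total = 0.0
    for better, worse in pairs:
        diff = score(params, better) - score(params, worse)
        total += math.log(1.0 + math.exp(-diff))
    return total / len(pairs)

def train_round(params, pairs, lr=0.1):
    """One pass over the pairs, nudging the weights by gradient descent."""
    W, b, v = params
    for better, worse in pairs:
        diff = score(params, better) - score(params, worse)
        g = -1.0 / (1.0 + math.exp(diff))  # d(loss)/d(diff)
        h_b, h_w = hidden(W, b, better), hidden(W, b, worse)
        for j in range(len(v)):
            # Backpropagate through tanh for both pages of the pair.
            d_b = v[j] * (1.0 - h_b[j] ** 2)
            d_w = v[j] * (1.0 - h_w[j] ** 2)
            v[j] -= lr * g * (h_b[j] - h_w[j])
            b[j] -= lr * g * (d_b - d_w)
            for i in range(len(better)):
                W[j][i] -= lr * g * (d_b * better[i] - d_w * worse[i])

def judged_quality(x):
    """Hypothetical stand-in for the 'desired results' a judge would supply."""
    return x[0] + 0.5 * x[1]

random.seed(0)
pairs = []
for _ in range(40):
    p = [random.random(), random.random()]
    q = [random.random(), random.random()]
    pairs.append((p, q) if judged_quality(p) > judged_quality(q) else (q, p))

# Small random starting weights: 3 hidden units, 2 input features.
params = ([[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)],
          [0.0, 0.0, 0.0],
          [random.uniform(-0.5, 0.5) for _ in range(3)])

loss_before = mean_loss(params, pairs)
for _ in range(200):  # "again, and again, and again"
    train_round(params, pairs)
loss_after = mean_loss(params, pairs)
print(loss_before, loss_after)
```

The untrained net ranks more or less at random; the loss only comes down after many passes over the same pairs.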

Inexplicable because it's all. You'll never know what goes on inside the black box. There's no way to figure out why it places "page A" higher/lower than "page B". There's no "logic" as we know it, and no "reasoning" or "explanations" - it's all math.

Hence, the resulting "Algo" can, in fact, be different from query to query. Meaning: It might not be the same things that make # 1 for "blue widgets" as what makes # 1 for "red widgets".

Imagine that: Individual algos per keyword: How does that sound for SEO?

The more training, the better results, and the larger your data set the more training is generally needed. With the web as "corpus" (data set) they will need several.. no, strike that... a whole godawful extremely large amount of training rounds.

So, how are they doing that training I ask?

They will need at least a "number one" and a "number two" for each training round (read: for each keyword), so how do they find these? Google? Yahoo? (just kidding ..I think :-)

algorithm

I should add that it's probably entirely wrong to speak of "algorithm" in the context of neural nets. An algorithm is a set of known rules and/or actions that are laid out in advance to produce a result to some kind of problem. Like:

"Take this road 300 metres, turn left, and then right at the church."

That would be an algo. You know where you are, where you are supposed to end up, and the steps that will take you there. It is the steps that are the algo.

With a neural net you only know where you are, and where you end up. You don't know which steps will be taken along the way (neither in advance, nor in hindsight).
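To make the contrast concrete, here is a toy comparison. It is only a sketch: the page fields, the rule steps and every weight below are made up for illustration, and the weights merely stand in for whatever a trained net would end up with.

```python
import math

def rule_based_score(page):
    """An 'algo' in the sense above: each step is written down in advance."""
    score = 0.0
    if page["query_in_title"]:
        score += 2.0                      # step 1: reward a title match
    score += page["inbound_links"] / 10   # step 2: reward inbound links
    return score

# Made-up numbers standing in for trained weights; none of them names a rule.
HIDDEN_WEIGHTS = [[0.8, -1.3], [-0.4, 2.1]]
OUTPUT_WEIGHTS = [1.7, 0.9]

def net_score(page):
    """Same inputs, but the 'steps' are just arithmetic on the weights."""
    x = [float(page["query_in_title"]), page["inbound_links"] / 100]
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
              for row in HIDDEN_WEIGHTS]
    return sum(v * h for v, h in zip(OUTPUT_WEIGHTS, hidden))

page = {"query_in_title": True, "inbound_links": 30}
print(rule_based_score(page))
print(net_score(page))
```

In the first function you can point at the line that rewarded the title match. In the second, both functions return a number for the page, but there is no line to point at.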

MSN results are good..

..OK one tends to make that conclusion on how they rate your sites..but they appear to deliver better results for the selection of keywords I tried.

If you want to learn about neural networks for the layman (well for the non specialist) the place to try is the Scientific American and that link will give you a start.

(sheesh, why am I writing this, it's probably the best evening of the year here in Cornwall - the temperature is in the mid twenties C and the tide is full below me now, back to the balcony)

There have been some changes

There have been some recent changes in my primary msn serps, some hurt, most stayed the same. *IF* the neural-algo is now fully in place, it's weird in some spots (as claus also said it would be) yet decent enough coming out of the gate.

It's nowhere near fantastic, though. Not in my niche anyway.

I'm not just looking at my sites, folks

I am looking at specific queries where I have had, for months, to search past the first page of results on Google for relevant or meaningful content.

I'm not going to bury myself in the semantics of neural networking and algorithms. I have two degrees in Computer Science and Data Processing and almost 30 years' IT experience. I understand the fundamentals of the field well enough.

Microsoft has made a substantial and significant change in the way it determines search results rankings.

It remains to be seen whether they maintain this level of quality or if it degrades and if Google can match it.

well

Very impressed with the dozen or so search terms I looked at [one of yours too rc], fantastic may be a little over the top but it looks to have the edge on Google.

>There's no "logic" as we know it, and no "reasoning" or "explanations" - it's all math.

There is always a reason, maths has nothing to do with it, it goes beyond that.

Running a 3-way comparison

Running a 3-way comparison on a certain city in Nevada known for casinos, as an example, I don't see an edge. In one of my serps (the big dog, NFFC) I see a troubling reappearance of a long time high-roller in PPC and paid placement that has all but been eliminated in any other truly relevant serp.

That said, the msn T10 are definitely in the same quality range as Y and G and this is the first outing. And, there is some indication that the neural-algo will learn --wonder how? Clickpop? Bail time?

>>wonder how? What apps

>>wonder how?

What apps phone home on Win?

Toolbar?

Your mileage may vary, of course...

Yes, I was a bit over the top with my enthusiasm. I was just so surprised to see anything this good come out of Microsoft where they didn't have to buy a company.

Searching the low end of the

Searching the low end of the spectrum, however, MSN has some hits and misses --but so do the other two (I use very, very small towns as the litmus, i.e., town name and full state name).

now please

- before you dissect my posts above, and declare the level of detail too low or indeed interpret them as aimed at any specific person or scenario: In the first post Nick explicitly asked:

perhaps someone can put that into simple terms for mere mortals?

Which is exactly what I did. No more, no less. And deliberately too. People write large volumes on this stuff you know. It's not as easy to describe it in simple terms as you might think it is (or as it hopefully sounds after the fact).

If you think it sounds simple, that would be because it is explicitly and deliberately intended to sound simple. So, I choose to take it as a compliment for now. EOD.

---
< pun >
And, some will argue that it's all math. Even reason.
< /pun >

OK..so this may be way off base..

Quote:
That's basically it. You then set the network to rank something and it will turn out extremely horrible results. So, you have to train it again, and -- as you will quickly discover -- again, and again, and again.

How would you train a machine? Well, the way *I'd* train a machine to tell a 'good' result from a 'bad' one is by how long the person stays at the site before hitting the back button to try again.

Now, unless MSN is hiring a bunch of people to put in every keyword there is to bring up results and judge them, obviously, MSN is looking at the data they already have.

AND, perhaps those sites that have the 'fewest' back button hits will rise to the top.

How else could you do it?
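For what it's worth, the back-button idea can be sketched in a few lines. This is pure guesswork about how such data might be used, not anything MSN has said they do: the log format, the example URLs and the `min_gap` threshold are all invented here. It averages the time spent on each result per query and emits "better, worse" pairs of the kind a ranker could be trained on.

```python
from collections import defaultdict
from itertools import combinations

def preference_pairs(click_log, min_gap=10.0):
    """Turn (query, url, seconds_before_back_button) rows into
    (query, better_url, worse_url) pairs based on average dwell time."""
    dwell = defaultdict(list)
    for query, url, seconds in click_log:
        dwell[(query, url)].append(seconds)

    # Average dwell time per URL, grouped by query.
    by_query = defaultdict(dict)
    for (query, url), times in dwell.items():
        by_query[query][url] = sum(times) / len(times)

    pairs = []
    for query, urls in by_query.items():
        for a, b in combinations(sorted(urls), 2):
            gap = urls[a] - urls[b]
            if abs(gap) >= min_gap:  # ignore near-ties; they say little
                better, worse = (a, b) if gap > 0 else (b, a)
                pairs.append((query, better, worse))
    return pairs

# Hypothetical log: people bounce straight back from b.example.
log = [
    ("blue widgets", "a.example", 95.0),
    ("blue widgets", "a.example", 65.0),
    ("blue widgets", "b.example", 4.0),
    ("blue widgets", "c.example", 78.0),
]
pairs = preference_pairs(log)
print(pairs)
```

With this log, both a.example and c.example come out preferred over b.example, while the a/c comparison is dropped as too close to call.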

LOL

"My CPU is a neural net processor; a learning computer. But Skynet pre-sets the switch to read-only when we're sent out alone."

LOL

Comment by "official msn representative" at SEW

The comment itself is not very illuminating, but it is interesting to see Danny accrediting them as "Official" at SEW

Quote:
Finally, in regards to SERPs changing position with our new ranking algorithm that uses RankNet --- all of the major engines are working hard to stabilize towards the "best" ranks, but of course we've all got a decent way to go before we're getting almost all queries almost completely correct. People often talk of taking one step back for two steps forward; I look at things as taking two million steps forward and a million steps back. Great for end users, but less ideal if you're one of those one million.

What helps us out the most is direct, actionable feedback. Gripes about "my site doesn't show up as high anymore" isn't super useful, as (a) we get a ton of them, and (b) we don't know why, and we can't look at all of them (see a). Maybe other sites are better and we promoted them --- that's what we always intend when one site is lowered. However, it could be that the sites that are better are spamming or optimising in some new way that unduly affects them. Or perhaps we just got some of 'em plain wrong. Whatever it is, please let us know why the site you care about is relevant to the query, and why some of the sites now supplanting the site are less relevant. The folks reading the feedback are technical, and I know this crowd is as well, so the more detail, the faster we'll be able to track down bugs and loopholes and fix 'em up. So click on that "Help us improve!" link on the bottom of every search page --- we do appreciate the feedback.

Thanks,
Erik Selberg
MSN Search

Fancy that

A SE representative with a *real name* and everything! Shock!, Horror!, Etc!

You mean GoogleGuy ISN'T his real name?

I thought he'd changed it by deed poll :)
