Google has 41 DataCentres After All...

23 comments

I keep seeing people refer to a random rag-bag collection of IP addresses that bring up the "Google English" search box, and then refer to those as being from "56 Google DataCentres". I assumed that Google would be more organised than that, and always felt that there must have been an expansion in the number of IPs brought into use since the time that most of the Google IP address lists were compiled back in 2004. So I went looking...

You're probably aware that a few weeks ago I uncovered another 500 IP addresses that also bring up that search box, taking the total to well over 600 active IPs. The list is featured in several places, including Search Engines Forums, which is where I originally posted it.

Many years ago, Google had a collection of addresses like www-sj.google.com that brought up a search box. There were many others (-fi, -in, -va, -ab, -dc, -cw, -ex, -zu, -sj, -lm, -mc, -kr, -gv, etc). They retired those names in 2004, forcing everyone to refer to results using the bare IP address.

Whenever you access a search result at www.google.com you get your result from a random Google IP address. Usually that IP address ends in x.x.x.104 or x.x.x.99 or x.x.x.147 and the cache comes from the same Class-C IP block.

Much confusion has ensued, as Google has many hundreds of IPs that never seem to directly serve results. As well as IP addresses ending in 99, 104, and 147, they also have some ending in 44, 80, 91, 115, 184, 214, and many others.

Whenever you look at which Class-C block is serving the results, you seem to get a different one every few minutes, but over the course of a few weeks you might see only 7 or 8 different ones. Yet we now know that Google actually has 41 such blocks.
So there are several dozen Class-C blocks that don't seem to make it into the rotation either.
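
For anyone who wants to do the same tally on their own list of observed IPs, here is a minimal Python sketch. It assumes that a "Class-C block" simply means the /24 network sharing the first three octets. The helper names (`class_c_block`, `group_by_block`) and the sample addresses are made up for illustration; they are not taken from the real list.

```python
import ipaddress
from collections import defaultdict


def class_c_block(ip: str) -> str:
    """Return the /24 ("Class-C") block containing ip, e.g. '72.14.247.0/24'."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


def group_by_block(ips):
    """Map each /24 block to the observed IPs inside it, in numeric order."""
    blocks = defaultdict(list)
    for ip in ips:
        blocks[class_c_block(ip)].append(ip)
    return {
        block: sorted(members, key=ipaddress.ip_address)
        for block, members in blocks.items()
    }


# Illustrative sample only; not a real list of observed addresses.
observed = ["72.14.247.104", "72.14.247.99", "209.85.129.104", "72.14.247.147"]

for block, members in group_by_block(observed).items():
    print(block, members)
```

Counting the keys of the returned dictionary gives the number of distinct blocks seen, which is the figure being compared against 41 here.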

Well, today, I found the missing link to all of this and I think I can confirm that Google has 41 datacentres, each spanning a single Class-C block, and that the www-xx.google.com URLs weren't actually retired back in 2004 after all.

In fact, they were merely renamed to a new gfe-xx.google.com format and many retained the old two letter ID pair from the old www-xx style names too.

I have found 41 such names, and they correspond to the 41 Class-C IP blocks now known to be used by Google to serve results.

Each gfe-xx.google.com name serves results from an IP address ending x.x.x.104.

Additionally, for some of the gfe-xx.google.com names, there are extra entries at gfe-xx2.google.com that serve results from the same Class-C block but from an IP address ending x.x.x.99, and for some there is yet another entry at gfe-xx3.google.com that serves results from an IP like x.x.x.147, again from the same Class-C block.
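
The naming pattern described above can be written down as a small lookup. This is only a sketch of the observed convention (no digit for .104, "2" for .99, "3" for .147); the function name and the regular expression are my own, and nothing here says every name follows the pattern.

```python
import re

# Final octets as described in the post: no digit -> 104, "2" -> 99, "3" -> 147.
LAST_OCTET_BY_SUFFIX = {"": 104, "2": 99, "3": 147}

# Hypothetical pattern: "gfe-", a two-letter ID, an optional 2 or 3.
GFE_NAME = re.compile(r"^gfe-([a-z]{2})([23]?)\.google\.com$")


def expected_last_octet(hostname: str):
    """Return (two-letter ID, expected final octet), or None if the name does not match."""
    match = GFE_NAME.match(hostname.lower())
    if match is None:
        return None
    ident, suffix = match.groups()
    return ident, LAST_OCTET_BY_SUFFIX[suffix]


print(expected_last_octet("gfe-eh.google.com"))
print(expected_last_octet("gfe-eh2.google.com"))
```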

More can be found at this thread where all the Class-C blocks are listed, along with their newly found "gfe" names, and range of IPs that respond.

And GFE, what does it stand for? Google Fools Everyone?

If you're a "Google Toolmaker" I hope this post hasn't given you a heart attack.

Comments

in layman's terms

Okay, for those of us who "mock but don't understand" ... can you explain again in plain English the 600 IP and only 41 datacenter thing.

data centers in order

gfe-au, gfe-ar, gfe-bf, gfe-bp, gfe-bu, gfe-bx, gfe-cw, gfe-dc, gfe-ed, gfe-eh, gfe-ff, gfe-fg, gfe-gv, gfe-he, gfe-hk, gfe-hs, gfe-hu, gfe-ik, gfe-in, gfe-jc, gfe-jp, gfe-kc, gfe-kr, gfe-lm, gfe-lo, gfe-mc, gfe-nf, gfe-nz, gfe-od, gfe-po, gfe-py, gfe-qb, gfe-rn, gfe-ro, gfe-tw, gfe-ug, gfe-ui, gfe-va, gfe-wr, gfe-wx and gfe-yo.

add these before xxx-xx.google.com

if anyone else is interested, i found this:

http://66.249.93.104/translate_c?hl=en&u=http://search.web-sun.com/zatu/data_center_list.html

Google isn't hiding these from us either:
1. http://www.google.com/search?num=100&q=gfe+google
2. http://en.wikipedia.org/wiki/List_of_Google_server_types

If GWS means Google Web Server, what does GFE stand for? I highly doubt Google is referring to the Urban Dictionary for this either: http://www.urbandictionary.com/define.php?term=gfe (#5 LOL)

ah

okay, i think i get it now.

Huh?

Are we trying to guess the number of physical data centers by counting Class-C networks? I think it would be typical for one major datacenter to diversify network access with multiple Class-C blocks purchased from different providers, especially if that datacenter experienced 10 years of dramatic growth.

gfe - Google Front End?

How about Google Front End?

front controller

It's operating as a front controller, so antezeta wins the prize IMHO

IP and DNS round-robin

Any access to www.google.com will usually return 3 IP addresses if you use a Mozilla/Firefox extension such as ShowIP. These end in 104 and 99 and 147 each time, all from the same Class-C block. Next time you search you may well get returns from a completely different class-C block.

Access to any gfe-xxn.google.com address returns just one IP every time, and the pattern and numbering is consistent across -xx (104) and -xx2 (99) and -xx3 (147) every time. The IP address is completely static.
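
If you'd rather check this from a script than from a browser extension, a rough Python sketch follows. It uses the standard `socket.gethostbyname_ex` call; the function names are mine, and the resolver is passed in as a parameter so the logic can be tried without touching the network.

```python
import socket


def lookup_ips(hostname, resolver=socket.gethostbyname_ex):
    """Return the IPv4 addresses that a hostname resolves to."""
    _name, _aliases, addresses = resolver(hostname)
    return list(addresses)


def share_class_c(addresses):
    """True if every address has the same first three octets."""
    prefixes = {ip.rsplit(".", 1)[0] for ip in addresses}
    return len(prefixes) == 1


# Stand-in resolver returning made-up addresses in the pattern described above.
def fake_resolver(hostname):
    return hostname, [], ["72.14.247.104", "72.14.247.99", "72.14.247.147"]


ips = lookup_ips("www.google.com", resolver=fake_resolver)
print(ips, share_class_c(ips))
```

Calling `lookup_ips("www.google.com")` without the stand-in resolver does a real DNS lookup, and the answers you get may not match what is described in this thread.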

sorry

Sorry... I meant that rather than having a front controller, which would direct incoming requests as appropriate, those boxen have a front end (acting in the place of a front controller, since they don't distribute queries). So using gfe for "google front end" would make sense to me (developer wise).

Now about the data on those boxen...

Tracking views of cached pages with these IPs

The updated list of IPs is very useful in tracking views of a site's pages served from Google's cache. Assuming a page has embedded objects, such as images, css etc, the cache IPs and search keywords used to find the page will show up in a site's server log referrers (once for each object in a page).

The cache addresses are always numeric IPs. I'm not sure why there isn't a separate alphanumeric alias; an oversight? For the record, I'm just seeing IPs ending in 104, albeit in a limited data sample.
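
As a rough illustration of pulling this out of a referrer string, here is a Python sketch. It assumes the referrer has the shape `http://<numeric IP>/search?q=cache:<target>+<keyword>+<keyword>`; that shape, the function name and the sample URL are assumptions for the example, so check them against your own logs before relying on it.

```python
import ipaddress
from urllib.parse import urlsplit, parse_qs


def parse_cache_referrer(referrer: str):
    """Return (cache_ip, cache_target, keywords), or None if this does not
    look like a cache view served from a numeric IP."""
    parts = urlsplit(referrer)
    try:
        ipaddress.ip_address(parts.hostname or "")
    except ValueError:
        return None  # host is a name (or missing), not a bare IP
    query = parse_qs(parts.query).get("q", [""])[0]
    if not query.startswith("cache:"):
        return None
    # parse_qs turns "+" into spaces, so the keywords follow the cache target.
    target, *keywords = query.split()
    return parts.hostname, target[len("cache:"):], keywords


# Made-up referrer in the assumed format.
sample = "http://72.14.247.104/search?q=cache:www.example.com/page.html+blue+widgets&hl=en"
print(parse_cache_referrer(sample))
```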

Lots of nice questions follow:

  • Why did a user choose to view the cached copy of a page?
    • The page no longer exists (but maybe should)?
    • The site was/is slow or down?
    • The user was doing “competitor analysis” and didn't want to hit your site (but forgot to use the &strip=1 option)?
  • For a given set of keywords, was the user enticed to follow through to visit the site?

I wrote about Tracking Search Engine Cache Page Views with Web Analytics earlier this month.

Thank you g1smd for helping pull this info together.

What do you mean by

What do you mean by "&strip=1" ?

&strip=1 is a URL parameter

&strip=1 is a URL parameter when viewing some Google cached copies of web pages

&strip=1 - Text only cache display

This option shows only the text in Google's cache. Thus, images and other fun stuff don't get pulled from my server - and I cannot measure the cache view nor do I know you looked at my page.
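
If you want to add the option from a script, something like the following Python sketch would do it. The function name and the example cache URL are hypothetical; the only thing taken from the discussion is the `strip=1` parameter itself.

```python
from urllib.parse import urlsplit, urlunsplit


def with_strip(cache_url: str) -> str:
    """Return cache_url with strip=1 appended to its query, unless already there."""
    parts = urlsplit(cache_url)
    params = parts.query.split("&") if parts.query else []
    if "strip=1" in params:
        return cache_url
    params.append("strip=1")
    # Join by hand so the existing "cache:" query text is left untouched.
    return urlunsplit(parts._replace(query="&".join(params)))


# Made-up cache URL for illustration.
print(with_strip("http://72.14.247.104/search?q=cache:www.example.com/"))
```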

Ah, I see, thanks a lot for

Ah, I see, thanks a lot for the explanation :)

Does it show JavaScript? If so, then you can make an XMLHTTP request to your server and count it.

The advantage of &strip=1 is

The advantage of &strip=1 is you get to see the actual text parsed by Google - it doesn't pull the additional components from your server. So, for example, if you used JavaScript for nav, then viewing the &strip=1 version of the cache will show that Google can't see the JavaScript nav; or Flash nav - it's a text view of what was cached, without pulling the uncached stuff...

everybody's closet looked like that

Back then everybody's closet looked like that. And even today, when people see cables, they think "wow... complex technology".

If you have ever wired anything (prototyped anything) you know that it is foolish to route the cables until the design is fully vetted. Until then it looks like hell. No big deal.

Chinese registered....

Some new IP addresses for www.google.cn discovered today:

59.151.21.100
59.151.21.101

These are in a new Class-C block, that I haven't seen Google use before.

When accessed directly from outside of China, the IP addresses redirect to www.google.com, but www.google.cn itself responds as advertised.

gfe-eh.google.com

Looks like all the big changes in the SERPs are at gfe-eh.google.com recently.

Matt Cutts also hints to keep an eye on that one in the coming weeks...

Is this The Dalles starting up?

Five new Class-C IP blocks recently found:

gfe-ag - 72.14.247.*
gfe-td - 72.14.255.*

gfe-fk - 209.85.129.*
gfe-mu - 209.85.135.*
gfe-?? - 209.85.143.*

http://www.webmasterworld.com/google/3098099.htm
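
To check whether an IP from your logs falls in one of these new blocks, a small Python sketch like this would work. It treats `a.b.c.*` as the /24 network `a.b.c.0/24`; the helper names are mine, and "gfe-??" is kept as-is for the block whose name isn't known yet.

```python
import ipaddress

# Blocks and names exactly as listed above; "gfe-??" marks a name not yet known.
NEW_BLOCKS = {
    "72.14.247.*": "gfe-ag",
    "72.14.255.*": "gfe-td",
    "209.85.129.*": "gfe-fk",
    "209.85.135.*": "gfe-mu",
    "209.85.143.*": "gfe-??",
}


def wildcard_to_network(pattern: str):
    """Turn 'a.b.c.*' into the equivalent /24 network object."""
    return ipaddress.ip_network(pattern.replace("*", "0") + "/24")


def name_for(ip: str):
    """Return the gfe name for the block containing ip, or None if not listed."""
    addr = ipaddress.ip_address(ip)
    for pattern, name in NEW_BLOCKS.items():
        if addr in wildcard_to_network(pattern):
            return name
    return None


print(name_for("209.85.135.99"))
print(name_for("10.0.0.1"))
```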

And there's more....

Two more blocks now live:

gfe-?? - 209.85.133.x
gfe-?? - 209.85.139.x

And another one....

This one has gone live in just the past few days:

gfe-?? - 209.85.155.x

GFE confusion ?

Sorry, but I am just beginning to understand Google data center theory... What exactly is GFE?

And do we actually have the names of the locations where Google has its so-called 41 or more data centers?
