google china and linux

google internet

Google censors its results in China. Lately I have been googling a lot for a current project. Installing things on Linux machines means that you have to have access to all those how-tos, boards and forums. A lot of Linux technology exists in China and everywhere in the third world. How about calling the next redhat release not ‘tettnang’ but rather ‘democracy’? Or sprinkle those ‘bad words’ in your source code. Everybody who filters would cut himself off from a significant amount of information. The Chinese boom is fueled by technological information that is so freely available. By merging ‘bad terms’ into the things they need to read you essentially break filtering. No matter how eagerly Google complies: Either you censor and lose lots of information for your business, or you give up on the censoring and deal with the realities of history. Their choice really.

Right now the Chinese have the cake (free information on technology) and eat it too (censorship).

this is the day …

free of any reason google internet media

… that we have learned that whale vomit is valuable, Yahoo gives up on competing with Google, and what George Lucas sold for 10 million to finance a divorce is now worth seven billion dollars.

So, yes, this is a strange day.

google’s earnings

google internet

The BBC has a widely discussed article about Google out. It does not contain any real news, and its tone is missing the point.
But they write that Google’s ad revenue was 1.5 billion US$ in the quarter ending in September. According to Wikipedia, Google has 5,000 employees. Let’s stick with that number here even though they might have hired a couple hundred more people in the meantime. Probably smart ones.
Those numbers come down to the simple fact that Google makes 100,000 US$ in ad revenue per month and employee. That feels profitable to me.
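
The back-of-the-envelope math is easy to check. A minimal sketch in Python, using only the two figures quoted above (1.5 billion US$ of ad revenue per quarter, 5,000 employees); the variable names are mine. Note that it divides revenue, not profit, so it says nothing about costs.

```python
# Figures quoted in the post; everything else is derived from them.
quarterly_ad_revenue_usd = 1_500_000_000  # BBC figure, quarter ending in September
employees = 5_000                         # Wikipedia figure
months_per_quarter = 3

revenue_per_employee_per_quarter = quarterly_ad_revenue_usd / employees
revenue_per_employee_per_month = revenue_per_employee_per_quarter / months_per_quarter

print(f"per employee and quarter: {revenue_per_employee_per_quarter:,.0f} US$")
print(f"per employee and month:   {revenue_per_employee_per_month:,.0f} US$")
```

A couple hundred extra hires would lower the monthly figure only slightly, so the rough conclusion does not depend on the exact headcount.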

No wonder the telcos become greedy.

Their cause is lost though.
The internet is not an aqueduct.
It’s the content that matters.
TCP/IP is so commoditized that it will always be cheap.
Worst case somebody comes up with a hack to mesh wifis together. [kidding]

broken windows theory and splogs

BlogsNow google internet malware

You can apply the Broken Windows Theory to spam blogs as well. There was always ample opportunity for spammers in blogs. Now they are in, and they make revenue. So they enhance their spam blogs to stay in the game. Here are two splogs from a current campaign:

exhibit A
exhibit B

They certainly get better.

ads for god

communication google marketing media

looking down

it does not take long

google

For a new feature to be used. Recently Google allowed its content to be placed on websites.

somebody’s most liked list

there will be more, of course.

smart splogs

google malware

http://thisasseenontvpetsteps.blogspot.com/
or
http://thisdoggieramp.blogspot.com/

not your usual link collection.

google spams

BlogsNow google internet malware

OK, provocative title. Let’s rephrase: Google tolerates spam.

Blogger is owned by Google. It runs the biggest blog service on its blogspot domain.

It appears to be very simple to create hundreds of thousands of ‘weblogs’ like this:

http://p85.blogspot.com/

Created solely for spam purposes. So-called ‘splogs’. You set up a robot and there is nothing in the Blogger software that stops you from adding all the blogs you like.

This is not new. Google / Blogger / Blogspot knows about it. They have done nothing about it in the last few years.

It should be relatively easy to make sure that there is a human in front of the computer when a new weblog is created at blogspot.com. Simple captchas are very common today.

There are two possible explanations why this has not happened yet:

– blogspot engineering is amazingly incapable

or

– there is no real rush to get rid of splogs on Google’s side.

It might make sense:
You have to forget the “don’t be evil” and “organize the world’s information and make it easily accessible” Google dogmas for a second though. Google knows one thing very very well: how to run a scalable service. They have the lowest cost per stored bit due to their own file system technology. It uses commodity hardware and adds failover management brilliantly. It does not cost Google much to host millions of splogs.

But wouldn’t millions of false blogs pose a danger to the result quality of a search engine?

Exactly.

Google knows from which IP address a blog gets maintained. Nobody else does. They have the actual blog data readily available for further parsing. I doubt that the googlebot comes through the front door to blogspot. The bandwidth alone that could be saved by crawling blogspot internally should make up for the ‘exception’ that this would mean to the googlebot operations. I don’t know these things. It’s a guess.

Every search engine has to have spam combat tools these days. Google is one of the most useful search engines and in the US they have an OK handle on search engine spam. Isn’t it funny that they don’t use their insider knowledge and access together with their anti-spam tools to simply turn off splogs on blogspot?

Last October there was somebody who scraped famous bloggers’ sites and reposted that content on splogs. That got some attention, and stopped. But splogs did not.

Blogspot hosts lots of splogs. But also lots of legit and very powerful weblogs. Nobody can really afford to ignore the biggest weblog service. Yahoo, MSN and even my little BlogsNow have to crawl blogspot in order to find out what is going on. Google can skip the crawl, all others have to deal with it.

There is also a third theory that is the most plausible:

Splogs don’t matter to search engines. They have to crawl billions of pages anyway. Who cares about a couple of million spam blogs here and there? That’s probably what it is: The aircraft carrier keeps on going regardless of whether there are 50% more roaches in the kitchen or not.

google.de kaputt, so is Karstadt.de

google umlautfrei

Tomorrow we will drive 40 miles to the next city to do some offline shopping. Who cares? I know, nobody. But these little chores give you an interesting glimpse into the state of things. In Germany shops can’t just be open when they like to be. As crazy as it sounds, there is a Ladenschlussgesetz where more than 3000 words define when stores can be open or closed in Germany. It used to be simple: stores would close at 6:30 pm. Which was really helpful when I was young: It got me out of bed to return some empty beer bottles and get new ones. Sometimes I missed it. That was a long time ago, and there was a revision of the law. Of course it is still regulated, but it is no longer simple to tell when stores are open and when they are not. After all this is Germany: if you want to consume then you have to obey some rules. There need to be rules. Germans love their rules.
Back to the shopping trip: The biggest department store in Germany is called Karstadt. Think Sears blended with Macy’s. They usually occupy a big chunk of the inner city. My wife googles

karstadt bremen oeffnungszeiten

Which should do the trick: Karstadt is a very very rare name, only being used for the store. Bremen happens to be the city that the store is in. And oeffnungszeiten is German for “shop hours”. The results are complete spam. 100%.
Not one page in the right direction.

Karstadt has a website. But they seem to prefer to pay google money to be listed in the search results. Not a single result is from their own site.

The only question that remains: Who is more broken, ‘google.de’ or ‘Karstadt’?
Probably both.

google.de is as messed up as Apple Germany. They have always been a total pain to deal with.
I think they manage to tell their US motherships that it’s Germany’s fault that they have no success. Easier than actually getting something done here.

What might be the next Google, at the ‘big daddy data center’, shows the same amount of spam and junk.
Different junk, but the actual website of Germany’s biggest department store is equally missing.

cringley is an idiot

google history internet

or maybe I am one.
Last November Robert X Cringley wrote about a Google project.
He claims that Google is planning to put 5,000 Opteron CPUs and 2.5 petabytes in a 20 or 40 foot container.

Back in November the story got attention, and now it bubbles back up again in the context of the “Google PC Walmart CES” buzzword cluster.

I wondered if the “Cringleytainer” would actually be feasible:

Those pieces would barely fit in a 40 foot container. Forget about air flowing around. Maybe it’s all water cooled?

Which leads to the ultimate flaw in Cringley’s concept:
5,000 CPUs @ 90 W and 30,000 disks @ 15 W would use 0.9 megawatts. Let’s add 0.1 megawatts for boards and power supplies. Of course this would assume a couple of technology breakthroughs.
Ignoring the laws of thermodynamics we have to add the same power to cool the thing: 2 megawatts.
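
For anybody who wants to redo the basic math, here is a minimal sketch in Python. All inputs are the figures from this post (CPU and disk counts, watts per part, the 0.1 megawatt allowance for boards and power supplies, and the crude rule that cooling needs as much power as the hardware itself); the function and parameter names are made up for illustration.

```python
def container_power_watts(cpus=5_000, watts_per_cpu=90,
                          disks=30_000, watts_per_disk=15,
                          boards_and_psus_watts=100_000):
    """Return (hardware_watts, total_watts_including_cooling)."""
    cpu_watts = cpus * watts_per_cpu
    disk_watts = disks * watts_per_disk
    hardware_watts = cpu_watts + disk_watts + boards_and_psus_watts
    # Crude assumption from the post: cooling draws the same power again.
    total_watts = 2 * hardware_watts
    return hardware_watts, total_watts


hardware, total = container_power_watts()
print(f"hardware:     {hardware / 1e6:.1f} MW")
print(f"with cooling: {total / 1e6:.1f} MW")
```

Changing any of the defaults (fewer disks, more efficient CPUs) shows quickly how far the numbers would have to move before the container stops needing its own power plant.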

Googling around I found this power source for the Cringleytainer. Guesstimating optimistically again it would use a gallon of diesel every minute.

Of course Mr Cringley is not an idiot. Not more or less than anybody else. I am only certain that I am one,
since I have had to spend so much time with myself.

What strikes me is that such a story can float around without anybody doing the basic math. Or maybe people did and got ignored. It’s much more ‘newsworthy’ to toss around crazy ideas involving Google.

If I should be bored in mid March then I will try to inject the urban myth of a planned Apple Google merger into the world.