post progress

history

Yves Marchand & Romain Meffre

INTERDUBS podcast

interdubs

I would get the New York Yellow Pages from Audible if Jeff Heusser would read them. Sorry that I talk so much in this podcast.

what we do for a living

history

nicely visualized

3269 days later

google history internet

On September 26th, 2000, I started to count how many pages Google had for specific terms. I am moving some data around, and while it was going by in a terminal window it caught my eye. Here are some excerpts:

Peace:
was: 6,290,000 today: 258,000,000 41x

War:
was: 16,000,000 today: 865,000,000 54x

Sex:
was: 24,200,000 today: 650,000,000 26x

Love:
was: 24,900,000 today: 1,500,000,000 60x

Apple:
was: 5,920,000 today: 342,000,000 57x

Microsoft:
was: 15,000,000 today: 503,000,000 33x

Linux:
was: 27,500,000 today: 301,000,000 11x

ssh prime agent

linux

(Sorry if this doesn't make any sense to you. This is a note for me to go back to. Even though I bring new machines online regularly, I forget the exact steps for this.)

on X
ssh-keygen -t dsa

Add the content of .ssh/id_dsa.pub to the end of .ssh/authorized_keys2 on Y. That is the only thing we need to do on Y.

after boot of X run

ssh_info_file=~/.ssh-agent-info-`hostname`
ssh-agent >$ssh_info_file
chmod 600 $ssh_info_file
source $ssh_info_file
ssh-add ~/.ssh/id_dsa

before login from X to Y
source /root/.ssh-agent-info-`hostname`
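
Not part of the original note, but here is a minimal sketch of how these steps could be wrapped into one helper that can run after boot. The file name start-agent.sh and the check for an already running agent are my additions; the key and info file locations are taken from the steps above.

#!/bin/bash
# start-agent.sh -- hypothetical helper, run once after boot on X.
# It repeats the steps above: start ssh-agent, record its environment
# in the per-host info file, and load the DSA key.

info_file=~/.ssh-agent-info-$(hostname)

# reuse an existing agent if its info file and process are still around
if [ -f "$info_file" ]; then
    . "$info_file" > /dev/null
fi

if [ -z "$SSH_AGENT_PID" ] || ! kill -0 "$SSH_AGENT_PID" 2>/dev/null; then
    ssh-agent > "$info_file"
    chmod 600 "$info_file"
    . "$info_file" > /dev/null
    ssh-add ~/.ssh/id_dsa
fi

Any later shell that wants to log in from X to Y then only needs to source the same info file, as in the line above.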

keep the cat out in the cold

Apple

After a couple of days with OS X 10.6 aka Snow Leopard I have to say that it is not worth bothering with. It breaks things. Nothing major, but there are no rewards for the hassle. ZFS would have been nice, but it is not there. Applications that could use more than 2/3/4 gigabytes of memory would have been nice. But even the recently updated Final Cut is not using 64 bit yet, so it cannot address that much memory.

All those glorious changes under the hood? Well, all I noticed were the things that broke. I am so glad that I kept my main machines at 10.5.

“only takes a minute”

interdubs technology

So I wrote a script that will save me a minute. I pretty much assumed that I wrote it just because I like writing code, and because this task fit into the time slot before dinner. I chalked up the twenty minutes it took as wasted time. Others check Facebook, I write a script that can be done before the next thing on the schedule.

As I said, this one will only save a minute. But it will do so every day. Still no big deal, I thought. But - funny as it goes - I finished it a minute early, so I came to realize that I will have saved six hours after a year. Yes, in my head it takes 60 seconds to compute 365 / 60. Anyway: after two years I get one more day in Hawaii. That's actually not bad at all for something squeezed in before dinner.

It also goes to show how bad we actually are at estimating the impact of our actions. I didn't start out to save a workday in two years. I simply had twenty minutes to fill and a repeating task that could be sped up. Guess I got lucky. Again.

fonts installed as seen by browser

internet

a very nice tool that will detect the fonts a browser can use

mindsoft speak // technology integration

interdubs technology

“It usually takes about 12 to 18 months to build a new center,” she said. “We’re cutting that down to less than a year.”

from a NY Times article about Microsoft and Google and their respective data center operations.

It is interesting that after years in corporate culture people start saying this kind of thing and feel that there is nothing wrong with it.

The article poses the question whether Google benefits from looking at each level of the technology stack and inventing where there is a need. It does not come to a conclusive answer. I think it is rather obvious: Google was able to reduce its capex spending simply when it felt opportune to do so. To my knowledge and in my own experience there hasn't been any noticeable impact on Google's usability from this reduction in spending. I would guess Google simply turned down the pace of innovation while the influx of new equipment was slowing.

On a comparatively microscopic scale I experience the benefits of looking at the entire technology stack first hand. Part of what runs INTERDUBS is off the shelf, other parts are enhanced, customized and severely optimized, and some we actually build ourselves. We constantly look at the running service and identify room for improvement, be it in the user experience or in how efficiently the internals work. Having an understanding of the entire system on all levels lets us identify clearly where enhancements should be made. Each of these steps might only add a couple of percentage points. Having metrics and detailed information about all aspects of the system at all times not only gives us visibility into which areas should be tuned and enhanced next. It also reveals that all those little optimizations add up to a configuration that is dramatically faster than the unaltered, generic one would have been.

Having this culture of change and constant optimization is a lot of fun. I was plain scared of having to do this on live data and a running service. But the goal was that INTERDUBS is available 24/7. And it turns out that technology, used in the right way, is able to do this now. It is literally flying the airplane and rebuilding it at the same time. You start in LA in a 707 and land in New York on an A380.

machine memory afterburner

linux

sar from the sysstat package is nice. I think it keeps about a week's worth of history around. I'd like to have more than that. There might even be a command line switch to do that. But often it is just faster to write what you need when you can type with reasonable speed. This script will copy all sa files into a directory called /var/log/allsa in the form saYYYYMMDD. So today's sa file I can access forever via


sar -f /var/log/allsa/sa20090822

The script only cares about files that are older than a day, so it will take between 24 and 48 hours for a file to appear in its final destination.


#!/usr/bin/perl

#
# This will keep all daily sa files readable via sar.
# It seems to be a shame to throw them away.
# A year's worth of sa files is about 113 MB for my machines.
#
# This script is meant to run daily. It probably needs root permissions.
#
# Use as much as you like. No warranties or promises. Your problem if it eats your machine.
# Andreas Wacker, 090822

use strict;

my $sourcedir = "/var/log/sa";

my $targetdir = "/var/log/allsa";

if (! -d "$sourcedir"){
    die "can not find directory $sourcedir for sa files";
}

if (! -d "$targetdir"){
    system ("mkdir -p $targetdir");
    if (! -d "$targetdir"){
        die "was unable to create $targetdir. $0 would need it to proceed ";
    }
}

opendir (INDIR, $sourcedir) or die "unable to read directory $sourcedir";

my @allfiles = readdir (INDIR);

closedir (INDIR);

foreach my $file (@allfiles){
    if ($file =~ /^sa[0-9]+$/){
        my $completefilepath = "$sourcedir/$file";
        my $mtime = (stat $completefilepath)[9];
        my $dayage = (time() - $mtime) / (3600 * 24);
        # only touch files that are older than a day
        if ($dayage > 1){
            my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime ($mtime);
            # build the saYYYYMMDD name; month and day both need zero padding
            my $datestring = sprintf ("%d%02d%02d", $year + 1900, $mon + 1, $mday);
            my $targetfilepath = "$targetdir/sa$datestring";
            if (! -f "$targetfilepath"){
                #print "$file $dayage $datestring\n";
                system ("cp -p $completefilepath $targetfilepath");
                if (! -f "$targetfilepath"){
                    die "tried to copy from $completefilepath to $targetfilepath and it did not work. This is a very bad sign!";
                }
                # compare file sizes to make sure the copy is complete
                if (((stat $completefilepath)[7]) != ((stat $targetfilepath)[7])){
                    die "file sizes for $completefilepath and what should have been a copy $targetfilepath did not match. Not good!";
                }
            }
        }
    }
}
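
To have it run daily, as the comment suggests, a root cron entry along these lines would do; /usr/local/bin/keep_sa.pl is just a placeholder for wherever you save the script:

# run the sa archiving script every night at 02:30
30 2 * * * /usr/local/bin/keep_sa.pl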