glitch #1

BlogsNow

4:20 this morning. The computer that runs BlogsNow froze.
Not the first time, but it had been doing OK for the last 2 months.
Tricky problem to fix: the error frequency is so low that it would take a year or two to know whether a change fixed the issue or I was just lucky. Sigh.

Good thing: MySQL and its wonderful mysqlcheck auto-repair.

Does in minutes what took days on the old server.

Around the world in 100 Links

BlogsNow google

BlogsNow got its first ‘special view’ today. I mentioned it before, but this is the first sign of it: one reason I rewrote BlogsNow as Version 2 was to be able to whip up quick views that I feel might be interesting. Google Maps is an amazing web application. And, of course, with all those images out there, people will find interesting views and share them in their blogs. BlogsNow simply lists the most prominent ones. As usual, millions of bloggers act as a collective filter that turns up something interesting. Since longitude and latitude coordinates do not mean much to most of us, I listed the closest airports instead.
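A minimal sketch of how a "closest airport" lookup could work, not the actual BlogsNow code. It assumes a small table of airports with approximate coordinates (the three entries below are just illustrative samples) and uses the haversine great-circle distance; the names `AIRPORTS`, `haversine_km` and `closest_airport` are made up for this example.

```python
import math

# Hypothetical sample table: (IATA code, latitude, longitude), approximate values.
AIRPORTS = [
    ("JFK", 40.6413, -73.7781),
    ("LHR", 51.4700, -0.4543),
    ("SFO", 37.6213, -122.3790),
]

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_airport(lat, lon, airports=AIRPORTS):
    """Return the code of the airport nearest to the given coordinates."""
    return min(airports, key=lambda a: haversine_km(lat, lon, a[1], a[2]))[0]
```

A linear scan like this is fine for a few thousand airports; a spatial index would only matter for much larger tables.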

BlogsNow gets ads

BlogsNow

BlogsNow now runs ads.
I think one at the top is not too much.

Let’s see how this works out.

BlogsNow and the others

BlogsNow

I made a little overview of meme tracking tools.

It looks as if BlogsNow is one of the internet’s best-kept secrets.

the king is dead, long live the king

BlogsNow

BlogsNow Version 1 has been turned off. 16 months of solid service. Many hours of coding. People loved it, but now it’s offline.

Version 2 is the new black. The DNS will be switched right after I save this entry. And the rest is history.
Yes, there is stuff missing in Version 2, but on the other hand it updates quicker than you can read through it.
Try it.

BlogsNow Version2 one step closer to being done …

BlogsNow

BlogsNow Version 2 is getting there.

For now I cut some corners and just put the super fresh data online in the old design.
There are still no additional features such as different views, but I have a plan for that.

It all should go pretty quickly now. The plot thickens …

BlogsNow Version2: next feature

BlogsNow

BlogsNow Version2 just got the next feature: under ‘blgs’ there is a list of all the blogs linking to a given item.
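A list of "which blogs link to this item" is essentially a reverse index over the crawled links. A minimal sketch, assuming the crawl produces (blog, item) pairs; the function name `build_backlinks` and the example URLs are made up for illustration and are not taken from BlogsNow.

```python
from collections import defaultdict


def build_backlinks(links):
    """Map each linked item to the sorted list of blogs that link to it.

    links: iterable of (blog_url, item_url) pairs (assumed input shape).
    """
    index = defaultdict(set)
    for blog, item in links:
        index[item].add(blog)  # a set ignores repeated links from the same blog
    return {item: sorted(blogs) for item, blogs in index.items()}


# Hypothetical crawl data.
crawl = [
    ("http://blog-a.example/", "http://news.example/story"),
    ("http://blog-b.example/", "http://news.example/story"),
    ("http://blog-a.example/", "http://news.example/story"),
    ("http://blog-a.example/", "http://other.example/page"),
]
backlinks = build_backlinks(crawl)
```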

BlogsNow Version 2: Preview Version live

BlogsNow internet

The first public page of BlogsNow Version 2

The link above will go to a preview page of BlogsNow Version2. It is just the latest links. BlogsNow Version2 became a complete rewrite. None of the old code or data has been used. Just the experience.

BlogSpam is one of the biggest issues for a Meme Tracker like BlogsNow. Right now it looks as if 25% of all active blogs are spam. Created by programs, not people. Created to make a quick buck for somebody somewhere.

It was an interesting mental exercise to spend so much coding time on this subject. My spontaneous reaction to ‘spam’ is that I really hate it. I hate the idea of causing huge damage to many people just so that a very few get a little financial gain. But being furious is not a good mental state to write code in. At least not for me. So I had to get over it, and just turn it off. It looks as if it works right now.

Since spam filtering works, I could include blogspot.com-hosted blogs in BlogsNow Version 2 again. Which is nice, since there are just so many blogs on there. It was a sad day when I was forced to turn the crawl off for it in Version 1.

BlogsNow Version1 was fast. BlogsNow Version2 is even faster. It runs circles around Version1: an analysis of the last 50,000 links added to the blogosphere (covering something like 5-10 hours right now) takes about 30 seconds. Version1 is busy for about five minutes on the same task.

The Preview page gets a fresh data set every three minutes. Because it can 😉 The ranking is a mix of the number of links and the time since each link was added: links added right now have full weight, while the oldest one in the data set has no weight. I think I will be playing with the exact recipe for a little bit.
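To make the ranking idea concrete, here is a rough sketch and not the real recipe. It assumes the weight falls off linearly from 1.0 for a link added right now to 0.0 for the oldest link in the data set, and that an item's score is simply the sum of the weights of all links pointing to it. The function name `rank_links` and the input shape are assumptions for this example.

```python
from collections import defaultdict


def rank_links(links, now):
    """Rank linked items by time-weighted link count.

    links: iterable of (item_url, added_at) with added_at in seconds (assumed shape).
    now:   current time in the same unit.
    """
    links = list(links)
    if not links:
        return []
    oldest = min(ts for _, ts in links)
    window = now - oldest  # age of the oldest link in the data set
    scores = defaultdict(float)
    for url, ts in links:
        if window <= 0:
            weight = 1.0  # all links are brand new
        else:
            # Assumed linear decay: 1.0 at age 0, 0.0 at the oldest link.
            weight = min(1.0, max(0.0, 1.0 - (now - ts) / window))
        scores[url] += weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: "b" has one fresh and one half-aged link.
ranked = rank_links([("a", 100), ("b", 100), ("b", 50), ("c", 0)], now=100)
```

With these numbers "b" scores 1.5, "a" scores 1.0 and "c" scores 0.0, so an item with several recent links outranks one with a single fresh link. Swapping the linear decay for something steeper is the kind of recipe tweak that is easy to try here.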

The Preview page has no features. Of course there will be pages with who links to what, etc. I am somewhat undecided on RSS. Version1 had no RSS for half its life. People got all excited when I added it, but I did not see the use spread or become particularly interesting. I think that many people just ‘collect’ RSS feeds like they do bookmarks. But they never actually go back, since they are busy chasing the next butterfly. So I might as well skip RSS. Except for movies and mp3. Media Enclosures make sense. And yes, BlogsNow will have those lists as well.

Let me know what you think about this little glimpse of the future of BlogsNow Version2.

server crash

BlogsNow linux this weblog

It all went too well.

Suddenly the server was not reacting. Everything worked until the shell needed to do /anything/ with the disk.
Had to reset it. Sigh.

The syslog said:

May 29 19:14:25 andreaswacker kernel: kswapd0: page allocation failure. order:5, mode:0x50
May 29 19:14:25 andreaswacker kernel: [<0214b29a>] __alloc_pages+0x28b/0x298
May 29 19:14:25 andreaswacker kernel: [<0214b2bf>] __get_free_pages+0x18/0x24
May 29 19:14:25 andreaswacker kernel: [<0214e9c1>] kmem_getpages+0x15/0x94
May 29 19:14:25 andreaswacker kernel: [<0214f74c>] cache_grow+0x155/0x29a
May 29 19:14:25 andreaswacker kernel: [<0214fa9e>] cache_alloc_refill+0x20d/0x23d
May 29 19:14:25 andreaswacker kernel: [<0215004f>] __kmalloc+0x6b/0x7d
May 29 19:14:25 andreaswacker kernel: [<82964f54>] kmem_alloc+0x50/0x96 [xfs]
May 29 19:14:25 andreaswacker kernel: [<829479a1>] xfs_inode_item_format+0xe0/0x239 [xfs]
May 29 19:14:25 andreaswacker kernel: [<8295aaf3>] xfs_trans_fill_vecs+0x3a/0x86 [xfs]
May 29 19:14:25 andreaswacker kernel: [<8295a8a4>] xfs_trans_commit+0x18d/0x300 [xfs]
May 29 19:14:25 andreaswacker kernel: [<829494de>] xfs_iomap_write_allocate+0x248/0x436 [xfs]
May 29 19:14:25 andreaswacker kernel: [<82949520>] xfs_iomap_write_allocate+0x28a/0x436 [xfs]
May 29 19:14:25 andreaswacker kernel: [<0224e474>] generic_make_request+0x190/0x1a0
May 29 19:14:25 andreaswacker kernel: [<829485a3>] xfs_iomap+0x23b/0x3ed [xfs]
May 29 19:14:25 andreaswacker kernel: [<829486ba>] xfs_iomap+0x352/0x3ed [xfs]
May 29 19:14:25 andreaswacker kernel: [<8296c99f>] xfs_bmap+0x1a/0x1e [xfs]
May 29 19:14:25 andreaswacker kernel: [<82965219>] xfs_map_blocks+0x29/0x11e [xfs]
May 29 19:14:25 andreaswacker kernel: [<82965dd9>] xfs_page_state_convert+0x273/0x4e8 [xfs]
May 29 19:14:25 andreaswacker kernel: [<829664cb>] linvfs_writepage+0x91/0xc6 [xfs]
May 29 19:14:25 andreaswacker kernel: [<021522db>] pageout+0x83/0xc0
May 29 19:14:25 andreaswacker kernel: [<02152522>] shrink_list+0x20a/0x547
May 29 19:14:25 andreaswacker kernel: [<02151540>] __pagevec_release+0x15/0x1d
May 29 19:14:25 andreaswacker kernel: [<02152a92>] shrink_cache+0x233/0x4d5
May 29 19:14:25 andreaswacker kernel: [<0215357f>] shrink_zone+0x8f/0x9a
May 29 19:14:25 andreaswacker kernel: [<021538a5>] balance_pgdat+0x176/0x249
May 29 19:14:25 andreaswacker kernel: [<02153a3e>] kswapd+0xc6/0xc8
May 29 19:14:25 andreaswacker kernel: [<02120fcf>] autoremove_wake_function+0x0/0x2d
May 29 19:14:25 andreaswacker kernel: [<02120fcf>] autoremove_wake_function+0x0/0x2d
May 29 19:14:25 andreaswacker kernel: [<02153978>] kswapd+0x0/0xc8
May 29 19:14:25 andreaswacker kernel: [<021041d9>] kernel_thread_helper+0x5/0xb
May 29 19:14:25 andreaswacker kernel: deadlock in kmem_alloc (mode:0x50)
May 29 19:14:25 andreaswacker kernel: possible deadlock in kmem_alloc (mode:0x50)
May 29 19:14:25 andreaswacker last message repeated 85 times

I hope this is not common for Fedora Core 3 on an AMD machine with a big array. I am starting to load the machine with tasks now.
Let’s see if it happens again.
Sure enough, MySQL was not happy, since I run it with delay-key-write.

WordPress said:


WordPress database error: [Incorrect key file for table 'wp_comments'; try to repair it]

So I did a


mysqlcheck -pXXXXXX --auto-repair wordpress

which seems to have done the trick.

BlogsNow Version 2 and spam

BlogsNow malware

BlogsNow Version2 is coming along. Instead of moving code and data from Version1 over to this host I decided to write it again. Most changes go into spam detection and filtering.

Right now BlogsNow Version 2 flags and ignores –

– spam:
http://midwesternerslavished.blogspot.com/
http://pet-insurance-tips.blogspot.com/
http://guitar-rock.blogspot.com/

– indecent content:
http://spaces.msn.com/members/teen-galleries/
http://spaces.msn.com/members/adult-creampies/

[I thought that spaces had such a tight content filter, apparently not]

– ‘blogs’ that forward directly to porn sites:
http://jasmine-disney-hentai.blogspot.com

There is an ever-increasing number of blogs that were created only for spam purposes.
Right now it looks as if BlogsNow can start crawling blogspot.com blogs again in Version2.