Tuesday, July 22, 2008

Gamma

I liked this story (http://recursed.blogspot.com/2008/07/rutgers-graduate-student-finds-new.html) about a Rutgers grad student discovering a new prime-generating sequence.

Tuesday, July 15, 2008

Epsilon

I found some posts on this tagging issue:

Tags: Database Schemas
http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html

and some performance tests on the above schemas:

Tagsystems: Performance tests
http://www.pui.ch/phred/archives/2005/06/tagsystems-performance-tests.html


The way I thought about doing it in normal SQL with joins was basically the second approach: keep it mostly normalised, but just use the tag name as the tag id.
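A minimal sketch of that relational layout, using Python's built-in sqlite3 module. The table names (items, item_tags), the column names and the helper functions are assumptions made up for illustration; they are not taken from the linked posts. The tag name itself serves as the key in the link table, so there is no separate tags table.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical two-table tag schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        -- The tag name doubles as the tag id, so no separate tags table.
        CREATE TABLE item_tags (
            item_id INTEGER NOT NULL REFERENCES items(id),
            tag     TEXT NOT NULL,
            PRIMARY KEY (item_id, tag)
        );
        CREATE INDEX idx_item_tags_tag ON item_tags(tag);
    """)
    return conn


def add_item(conn, title, tags):
    """Insert one item and one link row per distinct tag."""
    cur = conn.execute("INSERT INTO items (title) VALUES (?)", (title,))
    item_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO item_tags (item_id, tag) VALUES (?, ?)",
        [(item_id, tag) for tag in set(tags)],
    )
    return item_id


def items_with_all_tags(conn, tags):
    """Return titles of items carrying every tag in `tags`, ordered by id."""
    tags = sorted(set(tags))
    if not tags:
        return []
    placeholders = ", ".join("?" for _ in tags)
    sql = f"""
        SELECT i.title
        FROM items AS i
        JOIN item_tags AS t ON t.item_id = i.id
        WHERE t.tag IN ({placeholders})
        GROUP BY i.id
        HAVING COUNT(DISTINCT t.tag) = ?
        ORDER BY i.id
    """
    return [row[0] for row in conn.execute(sql, (*tags, len(tags)))]
```

The intersection is expressed by counting: an item qualifies only if the number of distinct matching tags equals the number of tags asked for.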

The question remains: how do you do this in BigTable without any joins? The full-text search schema is still an option, but I don't think it scales, so I'm not really interested in that approach.
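One join-free idea, sketched here in plain Python rather than against any real datastore API: keep a sorted list of item ids per tag and intersect the lists by "leapfrogging", using binary search to skip ahead instead of scanning every entry. The function name and the list-of-lists input are assumptions for illustration, and the sketch assumes each list is sorted in ascending order with no duplicate ids. It is not a claim about how BigTable or App Engine work internally.

```python
from bisect import bisect_left


def intersect_sorted(postings):
    """Intersect ascending, duplicate-free id lists by leapfrogging.

    `postings` is a list of lists, one per tag. Returns the ids that
    appear in every list, in ascending order.
    """
    if not postings or any(not plist for plist in postings):
        return []

    positions = [0] * len(postings)   # current read position in each list
    result = []
    candidate = postings[0][0]        # id we are currently trying to confirm
    agree = 0                         # lists confirmed to contain candidate
    i = 0

    while True:
        plist = postings[i]
        # Skip ahead to the first id >= candidate in this list.
        pos = bisect_left(plist, candidate, positions[i])
        if pos == len(plist):
            return result             # this list is exhausted: no more matches
        positions[i] = pos

        if plist[pos] == candidate:
            agree += 1
            if agree == len(postings):
                result.append(candidate)
                # Move past the match and propose the next id from this list.
                positions[i] = pos + 1
                if positions[i] == len(plist):
                    return result
                candidate = plist[positions[i]]
                agree = 0
        else:
            # This list jumped past the candidate; its id becomes the new one.
            candidate = plist[pos]
            agree = 1

        i = (i + 1) % len(postings)
```

For example, intersect_sorted([[1, 3, 5], [3, 5, 7]]) walks both lists once and yields the ids 3 and 5. Whether a real datastore can serve such per-tag sorted lists cheaply is exactly the open question.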

Monday, July 14, 2008

Delta

Some links for today:

GeSHi Demo
Useful for posting highlighted code to the blog.

Hacker News @ Y-Combinator

Why you should play Go

Sunday, July 13, 2008

Gamma

While looking into my tag intersection problem I came across httpmr, which is (or aims to be) a MapReduce implementation over HTTP.

Beta

My second post.

As I've been hiding behind a rock for the last few months I only got wind of Google App Engine recently and have been looking at it over the weekend. It looks quite fun and I'm quite tempted to play around with it and write a simple app to test it out.

In particular I'm trying to get my head around their datastore and how to use it efficiently.

One particular problem that I just started thinking about is how you would do an intersection of sets given the lack of joins. Look at tags as used by Gmail, Flickr, del.icio.us or any other such app; for concreteness I'll talk about a Gmail-type example. How do you find all emails that have a given set of tags without pulling in all the records for each tag and doing the intersection in memory? This might not be a problem for an email app, where each account has a relatively small number of objects, but what would you do if you had a bigger app?
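To make the problem concrete, here is a minimal sketch of the in-memory baseline that the question is trying to avoid: fetch the full set of ids for each tag, then intersect. The TagIndex class and its method names are hypothetical, and a Python dict stands in for whatever storage an app would really use.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory index mapping each tag to a set of item ids."""

    def __init__(self):
        self._ids_by_tag = defaultdict(set)

    def add(self, item_id, tags):
        """Record that `item_id` carries every tag in `tags`."""
        for tag in tags:
            self._ids_by_tag[tag].add(item_id)

    def with_all_tags(self, tags):
        """Return the set of item ids that carry all of `tags`."""
        tags = set(tags)
        if not tags:
            return set()
        # This pulls in the complete id set for every requested tag.
        id_sets = [self._ids_by_tag.get(tag, set()) for tag in tags]
        # Start from the smallest set so the running result stays small.
        id_sets.sort(key=len)
        result = set(id_sets[0])
        for ids in id_sets[1:]:
            result &= ids
            if not result:
                break
        return result
```

The cost is driven by the size of the per-tag sets rather than by the size of the answer, which is why this works for a small mailbox but looks worrying for a bigger app.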

This is a standard problem so I'm sure it has been solved efficiently many times and I hope that I'll be able to find an answer out there if I don't find one myself first. I started thinking about this last night and think I have a solution which I'm busy coding up in Python along with the traditional relational db approach for comparison. Hopefully I'll be able to post this here tomorrow.

Alpha

Howdy there

Hopefully I will start putting some thoughts down here as I go along. I acknowledge that this has a high probability of becoming another blog stub, like the ones you see so often on the internet.


What's in a title?

I thought it was a suitable title for a beginning... and this is the beginning... in more ways than one.

It's the beginning of my blog.

It's the beginning of my Linux career. Actually more like a renaissance. I used to use Linux before but haven't in about 10 years or so. I also never really had my own Linux box where I could run root commands and actually install stuff. With my newly acquired Asus Eee I've been forced to get reacquainted, this time with the Xandros/Debian flavour, and so far I'm loving it!

I'm looking for a new job so I'm also refreshing my knowledge of C++ and learning some Python along the way as well.

Finally, Alpha, Beta and Gamma all have meanings in Quantitative Finance, and this was the title of a blog I wanted to write about 3 years ago, explaining these concepts and what I had learned about them. That's not really my focus right now, but I might pick up these topics again once I've taken care of my current, more pressing needs.