Porting Mapskrieg to App Engine
Lately I’ve been playing around a lot with Google App Engine. I think it’s finally at a point where it makes sense to develop for it, and it can be fun and profitable to do so. For example, I wrote Instascriber on App Engine, and so far it has cost me a few pennies in CPU cycles (and like $2 for a domain name or something).
Playing around with memcache and the like on Instascriber has made me a little obsessed with efficiency (not that I wasn’t already). I took a look at Mapskrieg and found that it was performing pretty slowly: Google Webmaster Tools was reporting average load times of 6-8 seconds! I wanted to bring that number down, and I figured I could make Mapskrieg work on App Engine, so I started on the port last week.
I modified the parser I use to grab craigslist listings to also send them to Mapskrieg. Then I basically rewrote all of the logic that existed in PHP and ported it over to App Engine. This took a little while but it wasn’t too difficult since Mapskrieg is a pretty simple web app. I hardly changed any code in the Maps API implementation, though I’m thinking of moving to v3 (which is apparently supposed to be faster).
I just switched the hosting over from my MediaTemple PHP-based host to the Google App Engine one. So far the results look good. Mapskrieg seems faster in terms of response time, and the caching is definitely smarter than before: a memcache entry that only gets cleared when its content actually changes, versus time-based cache expiration.
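The invalidate-on-write idea can be sketched in a few lines. This is just an illustration of the pattern, with a plain dict standing in for App Engine's memcache; the key names and helper functions are made up for the example:

```python
# Cache rendered pages, and clear an entry only when its underlying
# listings change -- never on a timer.
cache = {}

def get_listings_page(region, fetch):
    """Return the rendered page for a region, rebuilding only on a cache miss."""
    key = "listings:%s" % region
    page = cache.get(key)
    if page is None:
        page = fetch(region)  # expensive: query the datastore, render HTML
        cache[key] = page
    return page

def update_listings(region, save):
    """Write new listings, then drop the now-stale cached page for that region."""
    save(region)
    cache.pop("listings:%s" % region, None)  # cleared only because content changed
```

As long as listings for a region don't change, every request after the first is served straight from the cache.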
I’m going to watch closely to make sure nothing got borked in the transition, but so far so good. If I can keep Mapskrieg on App Engine, I might downgrade the MediaTemple server to save some money and maybe eventually move other services to App Engine as well.
One funny thing I noticed was the pricing plan for App Engine. I’m currently pruning listings from the database when they get too old or fall past the limit of listings per listing type (because I’ll never show them). This is actually older behavior from when I kept listing info in MySQL. Back then, the size of the database would affect the performance of selecting rows. Google’s datastore apparently does not get slower as it grows. This is actually pretty awesome.
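The pruning rule itself is simple. Here is a sketch of the selection logic over plain dicts (the real thing operates on datastore entities, and the specific cap and age cutoff below are hypothetical placeholders, not the values I actually use):

```python
from datetime import timedelta

MAX_PER_TYPE = 100            # hypothetical per-listing-type cap
MAX_AGE = timedelta(days=30)  # hypothetical age cutoff

def prune(listings, now):
    """Keep only listings that are fresh enough and within the per-type cap.

    Each listing is a dict with 'type' and 'posted' keys.
    """
    fresh = [l for l in listings if now - l["posted"] <= MAX_AGE]
    fresh.sort(key=lambda l: l["posted"], reverse=True)  # newest first
    kept, counts = [], {}
    for l in fresh:
        n = counts.get(l["type"], 0)
        if n < MAX_PER_TYPE:
            kept.append(l)
            counts[l["type"]] = n + 1
    return kept
```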
The weird thing is that adding and removing datastore objects costs quite a lot in CPU time. Additional CPU time past 6.5 hours is $.10 an hour. Storage, on the other hand, is $.01 per 2 gigabytes per day. From my usage, I’m predicting that the CPU cycles I would spend trimming listings out of the datastore would actually cost more than just leaving them in and paying for storage. Is that insane or what? I still have to test some things out, but it kind of surprises me that storage would be that much cheaper than the CPU cycles to remove something.
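A quick back-of-the-envelope check of that intuition, using the two prices quoted above plus two assumed figures (the entity size and the CPU cost of a delete are guesses for illustration, not measured numbers):

```python
# Prices from the post.
CPU_DOLLARS_PER_HOUR = 0.10              # $.10 per extra CPU hour
STORAGE_DOLLARS_PER_GB_DAY = 0.01 / 2.0  # $.01 per 2 GB per day

# Assumed figures (illustrative only).
ENTITY_SIZE_GB = 2.0 / (1024 ** 2)  # assume a ~2 KB listing entity
DELETE_CPU_MS = 100.0               # assume ~100 CPU-ms to delete one entity

store_per_day = ENTITY_SIZE_GB * STORAGE_DOLLARS_PER_GB_DAY
delete_once = (DELETE_CPU_MS / 3600000.0) * CPU_DOLLARS_PER_HOUR

# How many days of storage one delete's worth of CPU would buy.
break_even_days = delete_once / store_per_day
```

Under these assumptions, one delete costs as much as storing the same entity for hundreds of days, so keeping old listings around can genuinely come out cheaper than pruning them.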