Monday, October 27, 2008

This is depressing: Ken Thompson is also a googler

Just found out that Ken Thompson, one of the creators of Unix, works at Google. You can see his answers to various questions addressed to Google engineers by following the link with his name on this page.

So let's see, Google has hired:

* Ken Thompson == Unix
* Vint Cerf == TCP/IP
* Andrew Morton == #2 in Linux
* Guido van Rossum == Python
* Ben Collins-Sussman and Brian Fitzpatrick == Subversion
* Bram Moolenaar == Vim

...and I'm sure there are countless others that I missed.

If this isn't a march towards world domination, I don't know what is :-)

Thursday, October 16, 2008

The case of the missing profile photo

Earlier today I posted a blog entry, then went to view it on my blog, only to notice that my profile photo was conspicuously absent. I double-checked the URL for the source of the image -- it was http://agile.unisonis.com/gg.jpg. Then I remembered that I recently migrated agile.unisonis.com to my EC2 virtual machine. I quickly ssh-ed into my EC2 machine and saw that the persistent storage volume was not mounted. I ran uptime and noticed that it only showed 8 hours, so the machine had somehow been rebooted. In my experiments with setting up that machine, I had failed to add the line to /etc/fstab that causes the persistent storage volume to be mounted after a reboot. Easily rectified:

echo "/dev/sds /ebs1 ext3 defaults 0 0" >> /etc/fstab
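One caveat with a plain append is that running it twice leaves a duplicate entry in /etc/fstab. Here's a minimal sketch of a guarded variant; append_once is a hypothetical helper name of my own, not a standard tool, and it only adds the line when that exact line is not already in the file:

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact
# line is not already present (safe to run more than once).
append_once() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: treat the line as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage on the real machine (as root):
#   append_once "/dev/sds /ebs1 ext3 defaults 0 0" /etc/fstab
```

The exact-line match is deliberately strict: an entry for the same device with different options would not count as present, so this guards against repeated runs rather than against conflicting entries.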

I connected to my EC2 environment with ElasticFox and saw that the EBS volume was still attached to my machine instance as /dev/sds, so I mounted it via 'mount /dev/sds /ebs1', then restarted httpd and mysqld, and all my sites were again up and running.

I tested my setup by rebooting. After the reboot, another surprise: httpd and mysqld were not chkconfig-ed on, so they didn't start automatically. I fixed that, I rebooted again, and finally everything came back as expected.

A few lessons learned here in terms of hosting your web sites in 'the cloud':

1) you need to test your machine setup across reboots
2) you need automated tests for your machine setup -- things like 'is httpd chkconfig-ed on?'; 'is /dev/sds mounted as /ebs1 in /etc/fstab?'
3) you need to monitor your sites from a location outside the cloud which hosts your sites; I shouldn't have to eyeball a profile photo to realize that my EC2 instance is not functioning properly!
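
To make lesson 2 a bit more concrete, here's a rough sketch of what such checks could look like as shell functions. The function names (fstab_has_entry, listed_as_on, service_enabled) are made up for illustration, and the parsing assumes the usual 'chkconfig --list' output format of runlevel:on/off pairs, so treat it as a starting point rather than a finished test suite:

```shell
#!/bin/sh
# Sketch of post-reboot sanity checks. All function names are hypothetical.

# Succeeds if the fstab file (default: /etc/fstab) contains an entry whose
# first field is the device and whose second field is the mount point.
# Commented-out entries do not match, because their first field starts with '#'.
fstab_has_entry() {
    device=$1
    mount_point=$2
    fstab=${3:-/etc/fstab}
    awk -v dev="$device" -v mp="$mount_point" '
        $1 == dev && $2 == mp { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$fstab"
}

# Reads `chkconfig --list <service>` output on stdin and succeeds if the
# service is marked "on" for the given runlevel (default: 3).
listed_as_on() {
    runlevel=${1:-3}
    grep -q "[[:space:]]${runlevel}:on"
}

# Succeeds if the service is configured to start in the given runlevel.
service_enabled() {
    chkconfig --list "$1" 2>/dev/null | listed_as_on "${2:-3}"
}

# Example run on the machine itself:
#   fstab_has_entry /dev/sds /ebs1 || echo "FAIL: /ebs1 missing from /etc/fstab"
#   service_enabled httpd          || echo "FAIL: httpd not chkconfig-ed on"
#   service_enabled mysqld         || echo "FAIL: mysqld not chkconfig-ed on"
```

Keeping the parsing (listed_as_on) separate from the call to chkconfig means the logic can be exercised against canned output on any box, not just on the EC2 instance.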

I'll cover all these topics and more soon in some other posts, so stay tuned!

Recommended book: "Scalable Internet Architectures"

One of my co-workers, Nathan, introduced me to this book -- "Scalable Internet Architectures" by Theo Schlossnagle. I read it in one sitting. Recommended reading for anybody who cares about scaling their web site in terms of both web/application servers and database servers. It's especially appropriate in our day and age, when cloud computing is all the rage (more on this topic in another series of posts). My preferred chapters were "Static Content Serving" (talks about wackamole and spread) and "Static Meets Dynamic" (talks about web proxy caches such as squid).

I wish the database chapter contained more in-depth architectural discussions; instead, the author spends a lot of time showing a Perl script that is supposed to illustrate some of the concepts in the chapter, but falls very short of that in my opinion.

Overall though, highly recommended.

Wednesday, October 08, 2008

Example Django app needed

Dear lazyweb, I need a good sample Django application (with a database backend) to run on Amazon EC2. If the application has Ajax elements, even better.

Comments with suggestions would be greatly appreciated!

Thursday, October 02, 2008

Update on EC2 and EBS

I promised I'd give an update on my "Experiences with Amazon EC2 and EBS" post from a month ago. Well, I just got an email from Amazon, telling me:

Greetings from Amazon Web Services,

This e-mail confirms that your latest billing statement is available on the AWS web site. Your account will be charged the following:

Total: $73.74

So there you have it. That's how much it cost me to run the new SoCal Piggies wiki, as well as some other small sites, with very little traffic. Your mileage will definitely vary, especially if you run a high-traffic site.

I also said I'd give an update on running a MySQL database on EBS. It turns out it's really easy. On my Fedora Core 8 AMI, I did this:

* installed mysql packages via yum:

yum -y install mysql mysql-server mysql-devel

* moved the default data directory for mysql (/var/lib/mysql) to /ebs1/mysql (where /ebs1 is the mount point of my 10 GB EBS volume), then symlinked /ebs1/mysql back to /var/lib, so that everything continues to work as expected as far as MySQL is concerned:

service mysqld stop
mv /var/lib/mysql /ebs1/mysql
ln -s /ebs1/mysql /var/lib
service mysqld start

That's about it. I also used the handy snapshot functionality in the ElasticFox plugin and backed up the EBS volume to S3. In case you lose your existing EBS volume, you just create another volume from the snapshot, specify a size for it, and attach it to your EC2 instance. Then you mount it as usual.

Update 10/03/08

In response to comments inquiring about a more precise breakdown of the monthly cost, here it is:

$0.10 per Small Instance (m1.small) instance-hour (or partial hour) x 721 hours = $72.10

$0.100 per GB Internet Data Transfer - all data transfer into Amazon EC2 x 0.607 GB = $0.06

$0.170 per GB Internet Data Transfer - first 10 TB / month data transfer out of Amazon EC2 x 2.719 GB = $0.46

$0.010 per GB Regional Data Transfer - in/out between Availability Zones or when using public IP or Elastic IP addresses x 0.002 GB = $0.01

$0.10 per GB-Month of EBS provisioned storage x 9.958 GB-Mo = $1.00

$0.10 per 1 million EBS I/O requests x 266,331 IOs = $0.03

$0.15 per GB-Month of EBS snapshot data stored x 0.104 GB-Mo = $0.02

$0.01 per 1,000 EBS PUT requests (when saving a snapshot) x 159 Requests = $0.01

EC2 TOTAL: $73.69
Other S3 costs (outside of EC2): $0.05

GRAND TOTAL: $73.74
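
As a quick sanity check on the arithmetic, here's a small sketch that adds up the line items. It uses the per-item amounts exactly as billed above (already rounded to cents), since the statement doesn't say how each item gets rounded; the variable names are just my own:

```shell
#!/bin/sh
# Instance-hours are the bulk of the bill: rate x hours.
instance_cost=$(awk 'BEGIN { printf "%.2f", 0.10 * 721 }')

# Sum the eight EC2 line items as billed (amounts copied from the statement).
ec2_total=$(printf '%s\n' "$instance_cost" 0.06 0.46 0.01 1.00 0.03 0.02 0.01 |
    awk '{ sum += $1 } END { printf "%.2f", sum }')

# Add the S3 costs incurred outside of EC2.
grand_total=$(awk -v ec2="$ec2_total" -v s3=0.05 'BEGIN { printf "%.2f", ec2 + s3 }')

echo "Instance-hours: \$$instance_cost"
echo "EC2 total:      \$$ec2_total"
echo "Grand total:    \$$grand_total"
```

The totals match the statement: $73.69 for EC2 and $73.74 overall, with the instance-hours accounting for $72.10 of it.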
