Thursday, June 20, 2013

Amazon EC2, Bash, Python, Cloud-Init

I've been messing with Amazon Free Tier services, spinning up a t1.micro EC2 instance using the Amazon Linux AMI and dumping files in S3.

Originally I was writing some bash deployment scripts, but today I rewrote them in Python (using the 2.6 version on the AMI). I also tested running this Python script on startup using cloud-init, pulling the latest version of the script from S3.
This involved setting a bash script as User Data that downloads the Python script from S3 (aws s3 get-object) and runs it as a specific user (with su and -c).

I'm not much of a Python programmer, so I ended up needing to read through the docs a lot, and then re-read them for the older 2.6 version.

I had intended to use subprocess.check_output, but it isn't available in 2.6 (it was added in Python 2.7), so I had to fall back to subprocess.Popen. I also found out that the recommendation to pass the command as a string (instead of a list) when using shell=True is more of a requirement.
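
Here is a minimal sketch of the kind of fallback this means, not the exact code from my script. The helper name check_output_compat is made up for illustration. It takes the command as a single string because of shell=True, and it raises on a non-zero exit code the way check_output does.

```python
import subprocess


def check_output_compat(command):
    """Run a shell command string and return its stdout (hypothetical helper)."""
    # With shell=True the whole command goes in as one string, not a list.
    proc = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    if proc.returncode != 0:
        # CalledProcessError(returncode, cmd) is available in 2.6 as well.
        raise subprocess.CalledProcessError(proc.returncode, command)
    return out
```

Calling check_output_compat("echo hello") returns the command's output (a byte string on Python 3), and a failing command raises subprocess.CalledProcessError with the exit code in returncode.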

Although I'm sure the Amazon SDK is available in Python, the command-line tool is super easy to call from the shell (aws s3 get-object --bucket bname --key kname fileout). This is especially true with an S3 read access IAM role associated with the EC2 instance (so no access/secret keys are needed).
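
Calling that command from Python could look roughly like the sketch below. The function names are hypothetical, and the command form is the one shown above. As far as I know, newer releases of the AWS command-line tool expose the same operation as aws s3api get-object, so adjust the string for your version. The arguments are shell-quoted since the command is run as a single string.

```python
import subprocess

try:
    from shlex import quote      # Python 3.3+
except ImportError:
    from pipes import quote      # Python 2.x


def build_get_object_command(bucket, key, dest):
    """Build the shell command string, quoting each argument for the shell."""
    # Command form as used in this post; an assumption for other CLI versions.
    return "aws s3 get-object --bucket %s --key %s %s" % (
        quote(bucket), quote(key), quote(dest))


def download_from_s3(bucket, key, dest):
    """Run the command-line tool and raise if it exits non-zero."""
    command = build_get_object_command(bucket, key, dest)
    returncode = subprocess.Popen(command, shell=True).wait()
    if returncode != 0:
        raise RuntimeError("download failed (exit %d): %s" % (returncode, command))
```

With the IAM role attached to the instance, download_from_s3("bname", "kname", "fileout") needs no credentials in the script itself.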

I also found out that zipfile.extractall does not preserve file permissions, so I had to reset with os.chmod.
All in all, just minor issues. And I really appreciated the error handling over writing in bash.
Not that I really got away from bash completely.
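
For the zipfile permissions issue, one way to do the reset is sketched here. It assumes the archive was created on a Unix-like system, so the original mode is stored in the upper bits of each entry's external_attr. The function name is made up, and it is not necessarily how my script does it (setting fixed modes with os.chmod works just as well).

```python
import os
import zipfile


def extract_with_permissions(zip_path, dest_dir):
    """Extract all members, then reapply the Unix mode stored in the archive."""
    archive = zipfile.ZipFile(zip_path)
    try:
        archive.extractall(dest_dir)
        for info in archive.infolist():
            # The high 16 bits of external_attr hold the Unix mode when the
            # archive was built on a Unix-like system; 0 means none was stored.
            mode = (info.external_attr >> 16) & 0o7777
            if mode:
                os.chmod(os.path.join(dest_dir, info.filename), mode)
    finally:
        # ZipFile is not a context manager in 2.6, so close it explicitly.
        archive.close()
```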

I was going to use #include with an S3 HTTP URL to the .py script in the user-data, until I realized I could just call the same Amazon command-line tool from a bash script. This let me customize where to install the script, set its owner/permissions, and run it as a specific user.

Now the final hurdle was getting this script to actually run on every boot, which was not the case at first. When I would stop/start the instance, nothing happened.
My impression was that cloud-init was supposed to handle this, parsing the user-data and running scripts there.
I tried to wade through the documentation on cloud-init, but really was getting nowhere until I traced back from the /etc/init.d/cloud-init* scripts.


How I got Cloud-Init to run User-Data on every startup:

First I edited /usr/bin/cloud-init, changing "once-per-instance" to "always" for "consume_userdata".
This didn't work until I noticed the file at /var/lib/cloud/sem/consume_userdata.<instance>. Once I removed that file and restarted, it was recreated with the .always extension.
This at least caused my user-data to be downloaded as a script and stored in the new scripts dir in /var/lib/cloud/data, but it still wasn't run.
I also had to edit /etc/init.d/cloud-init-user-scripts, again changing "once-per-instance" to "always", but this time for /usr/bin/cloud-init-run-module and user-scripts.
Then I removed /var/lib/cloud/sem/user-scripts.<instance>.
Finally after stop/start, my Python script was downloaded from S3 and run on startup. Deploy success!
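
The semaphore cleanup from the steps above can also be scripted. This is only a sketch under a few assumptions: the function name is hypothetical, it needs to run as root on the instance, and it treats any file named after one of the two modules as a per-instance semaphore unless it ends in .always. It only removes the files and does not make the "once-per-instance" to "always" edits.

```python
import os

# Directory and module names as they appear in the steps above.
SEM_DIR = "/var/lib/cloud/sem"
MODULES = ("consume_userdata", "user-scripts")


def clear_instance_semaphores(sem_dir=SEM_DIR, modules=MODULES):
    """Remove per-instance semaphore files, keeping any '.always' ones."""
    removed = []
    for name in sorted(os.listdir(sem_dir)):
        for module in modules:
            if name.startswith(module + ".") and not name.endswith(".always"):
                os.remove(os.path.join(sem_dir, name))
                removed.append(name)
                break
    return removed
```

The function returns the names it removed, which is handy for checking what happened before the next stop/start.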