Encrypting Amazon EC2 boot volumes via Packer

To layer on some easy data-at-rest security, I want to encrypt the boot volumes of my Amazon EC2 instances. I also want to use the centos.org CentOS images, but those are not encrypted. How can I end up with an encrypted copy of those AMIs in the fewest steps?

In the past, I have used shell scripts and the AWS CLI to perform the boot volume encryption dance. The steps are basically:

  1. Deploy an instance running the source AMI.
  2. Create an image from that instance.
  3. Copy the image and encrypt the copy.
  4. Delete the unencrypted image.
  5. Terminate the instance.
  6. Add tags to new AMI.

The script needs a lot of VPC/subnet/security group preparation (which I suppose could have been added to the script), and if there were errors during execution, cleanup was very manual (more potential script work). The script is very flexible and meets my needs, but it is a codebase that requires expertise to maintain. And I have better things to do with my time.
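
For reference, the dance above can be sketched with the AWS CLI. Everything here is illustrative: the function name, instance type, AMI names, and tags are my placeholders, and real use needs the VPC/security group plumbing and error handling discussed above.

```shell
#!/bin/sh
# Hypothetical sketch of the manual encryption dance with the AWS CLI.
# Names and tags are placeholders; VPC/SG setup and error recovery elided.
encrypt_ami() {
    src_ami="$1"    # the unencrypted source AMI, e.g. the centos.org one

    # 1. Deploy an instance running the source AMI.
    instance_id=$(aws ec2 run-instances --image-id "$src_ami" \
        --instance-type t2.nano \
        --query 'Instances[0].InstanceId' --output text)
    aws ec2 wait instance-running --instance-ids "$instance_id"

    # 2. Create an image from that instance.
    tmp_ami=$(aws ec2 create-image --instance-id "$instance_id" \
        --name "tmp-unencrypted" --query 'ImageId' --output text)
    aws ec2 wait image-available --image-ids "$tmp_ami"

    # 3. Copy the image and encrypt the copy.
    enc_ami=$(aws ec2 copy-image --source-image-id "$tmp_ami" \
        --source-region us-east-1 --region us-east-1 \
        --name "centos-7-encrypted" --encrypted \
        --query 'ImageId' --output text)
    aws ec2 wait image-available --image-ids "$enc_ami"

    # 4. Delete the unencrypted image.
    aws ec2 deregister-image --image-id "$tmp_ami"

    # 5. Terminate the instance.
    aws ec2 terminate-instances --instance-ids "$instance_id"

    # 6. Add tags to the new AMI.
    aws ec2 create-tags --resources "$enc_ami" \
        --tags Key=OS,Value=CentOS Key=OSVER,Value=7

    echo "$enc_ami"
}
```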

A simpler solution is Packer.

I had looked at Packer around July 2016 and it was very promising, but it was missing one key feature: it could not actually encrypt the boot volume. Dave Konopka wrote a post describing the problem and his Ansible-based workaround in Encrypted Amazon EC2 boot volumes with Packer and Ansible. Luckily, there was an outstanding pull request, and as of version 0.11.0 Packer supports boot volume encryption while copying Marketplace AMIs.

The nice thing about a Packer template is that it takes care of dynamic generation of most objects. Temporary SSH keys and security groups are created just for the build and are then destroyed. The above steps for the boot volume encryption dance are followed with built-in error checking and recovery in case something goes wrong.

This template assumes automatic lookup of your AWS credentials. Read the docs (Specifying Amazon Credentials section) for more details.
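
For example, one way credentials can be picked up (an illustration only; the docs cover the full lookup order, including shared credential files and instance profiles) is from the standard environment variables:

```shell
# Placeholder values (AWS's documented example keys); Packer reads these
# standard environment variables when the template sets no credentials.
export AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
```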

Code can be downloaded from GitHub.

$ cat encrypt-centos.org-7-ami.json
{
    "description": "Copy the centos.org CentOS 7 AMI into our account so that we can add boot volume encryption.",
    "min_packer_version": "0.11.0",
    "variables": {
        "aws_region": "us-east-1",
        "aws_vpc": null,
        "aws_subnet": null,
        "ssh_username": "centos"
    },
    "builders": [
        {
            "type": "amazon-ebs",
            "ami_name": "CentOS Linux 7 x86_64 HVM EBS (encrypted) {{isotime \"20060102\"}}",
            "ami_description": "CentOS Linux 7 x86_64 HVM EBS (encrypted) {{isotime \"20060102\"}}",
            "instance_type": "t2.nano",
            "region": "{{user `aws_region`}}",
            "vpc_id": "{{user `aws_vpc`}}",
            "subnet_id": "{{user `aws_subnet`}}",
            "source_ami_filter": {
                "filters": {
                    "owner-alias": "aws-marketplace",
                    "product-code": "aw0evgkw8e5c1q413zgy5pjce",
                    "virtualization-type": "hvm"
                },
                "most_recent": true
            },
            "ami_virtualization_type": "hvm",
            "ssh_username": "{{user `ssh_username`}}",
            "associate_public_ip_address": true,
            "tags": {
                "Name": "CentOS 7",
                "OS": "CentOS",
                "OSVER": "7"
            },
            "encrypt_boot": true,
            "ami_block_device_mappings": [
                {
                    "device_name": "/dev/sda1",
                    "volume_type": "gp2",
                    "volume_size": 8,
                    "encrypted": true,
                    "delete_on_termination": true
                }
            ],
            "communicator": "ssh",
            "ssh_pty": true
        }
    ],
    "provisioners": [
        {
            "type": "shell",
            "execute_command": "sudo -S sh '{{.Path}}'",
            "inline_shebang": "/bin/sh -e -x",
            "inline": [
                "echo '** Shreding sensitive data ...'",
                "shred -u /etc/ssh/*_key /etc/ssh/*_key.pub",
                "shred -u /root/.*history /home/{{user `ssh_username`}}/.*history",
                "shred -u /root/.ssh/authorized_keys /home/{{user `ssh_username`}}/.ssh/authorized_keys",
                "sync; sleep 1; sync"
            ]
        }
    ]
}

To copy the CentOS 6 AMI, change any references to CentOS “7” to “6” and the product-code from “aw0evgkw8e5c1q413zgy5pjce” to “6x5jmcajty9edm3f211pqjfn2”.
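
Those substitutions can be scripted. A sketch, assuming the template text matches the listing above exactly (the function and file names are mine):

```shell
# Generate a CentOS 6 variant of the template by swapping the Marketplace
# product code and the version strings. Function name is illustrative.
make_centos6_template() {
    sed -e 's/aw0evgkw8e5c1q413zgy5pjce/6x5jmcajty9edm3f211pqjfn2/' \
        -e 's/CentOS Linux 7/CentOS Linux 6/g' \
        -e 's/CentOS 7/CentOS 6/g' \
        -e 's/"OSVER": "7"/"OSVER": "6"/' \
        "$1"
}

# Usage:
# make_centos6_template encrypt-centos.org-7-ami.json > encrypt-centos.org-6-ami.json
```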

When you build with this Packer template, you will have to pass in the variables aws_vpc and aws_subnet. The AWS region defaults to us-east-1, but can be overridden by setting aws_region. The newest centos.org CentOS AMI in that region will be automatically discovered.

$ packer build -var 'aws_vpc=vpc-12345678' -var 'aws_subnet=subnet-23456789' \
encrypt-centos.org-7-ami.json
amazon-ebs output will be in this color.

==> amazon-ebs: Prevalidating AMI Name...
    amazon-ebs: Found Image ID: ami-6d1c2007
==> amazon-ebs: Creating temporary keypair: packer_583c7438-d1d8-f33d-8517-1bdbbd84d2c9
==> amazon-ebs: Creating temporary security group for this instance...
==> amazon-ebs: Authorizing access to port 22 the temporary security group...
==> amazon-ebs: Launching a source AWS instance...
    amazon-ebs: Instance ID: i-5b68a2c4
==> amazon-ebs: Waiting for instance (i-5b68a2c4) to become ready...
==> amazon-ebs: Waiting for SSH to become available...
==> amazon-ebs: Connected to SSH!
==> amazon-ebs: Provisioning with shell script: /var/folders/42/drnmdknj7zz7bf03d91v8nkr0000gq/T/packer-shell797958164
    amazon-ebs: ** Shredding sensitive data ...
    amazon-ebs: shred: /root/.*history: failed to open for writing: No such file or directory
    amazon-ebs: shred: /home/centos/.*history: failed to open for writing: No such file or directory
==> amazon-ebs: Stopping the source instance...
==> amazon-ebs: Waiting for the instance to stop...
==> amazon-ebs: Creating the AMI: CentOS Linux 7 x86_64 HVM EBS (encrypted) 1480356920
    amazon-ebs: AMI: ami-33506f25
==> amazon-ebs: Waiting for AMI to become ready...
==> amazon-ebs: Creating Encrypted AMI Copy
==> amazon-ebs: Copying AMI: us-east-1(ami-33506f25)
==> amazon-ebs: Waiting for AMI copy to become ready...
==> amazon-ebs: Deregistering unencrypted AMI
==> amazon-ebs: Deleting unencrypted snapshots
    amazon-ebs: Snapshot ID: snap-5c87d7eb
==> amazon-ebs: Modifying attributes on AMI (ami-9d4b748b)...
    amazon-ebs: Modifying: description
==> amazon-ebs: Adding tags to AMI (ami-9d4b748b)...
    amazon-ebs: Adding tag: "OS": "CentOS"
    amazon-ebs: Adding tag: "OSVER": "7"
    amazon-ebs: Adding tag: "Name": "CentOS 7"
==> amazon-ebs: Tagging snapshot: snap-1eb5dc01
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Cleaning up any extra volumes...
==> amazon-ebs: Destroying volume (vol-aa727a37)...
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' finished.

==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:

us-east-1: ami-9d4b748b

Installing Livy on a Hadoop Cluster

Purpose

Livy is an open source component for Apache Spark that allows you to interact with your Apache Spark cluster through REST calls. You can view the source code here: https://github.com/cloudera/livy

In this post I will go over the steps needed to install Livy on a Hadoop cluster. The steps were derived from the source code link above; however, this post provides more information on how to test it in a simpler manner.

Install Steps

  1. Determine which node in your cluster will act as the Livy server
    1. Note: the server will need to have the Hadoop and Spark libraries and configurations deployed on it.
  2. Log in to the machine as root
  3. Download the Livy source code
    cd /opt
    wget https://github.com/cloudera/livy/archive/v0.2.0.zip
    unzip v0.2.0.zip
    cd livy-0.2.0
  4. Get the version of Spark that is currently installed on your cluster
    1. Run the following command
      spark-submit --version
    2. Example: 1.6.0
    3. Use this value in downstream commands as {SPARK_VERSION}
  5. Build the Livy source code with Maven
    /usr/local/apache-maven/apache-maven-3.0.4/bin/mvn -DskipTests=true -Dspark.version={SPARK_VERSION} clean package
  6. You're done!
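
Step 4 can be scripted as well. A sketch that pulls the version number out of the spark-submit --version banner; the regex assumes the first dotted version in the banner is Spark's, which may not hold on every distribution:

```shell
# Pull a dotted version number (e.g. "1.6.0") out of spark-submit's
# version banner. The banner format is an assumption; verify on your
# distribution before relying on this.
parse_spark_version() {
    grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Usage (on the cluster node):
# SPARK_VERSION=$(spark-submit --version 2>&1 | parse_spark_version)
# /usr/local/apache-maven/apache-maven-3.0.4/bin/mvn -DskipTests=true \
#     -Dspark.version=${SPARK_VERSION} clean package
```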

Steps to Control Livy

Get Status

ps -eaf | grep livy

It will be listed like the following:

root      9379     1 14 18:28 pts/0    00:00:01 java -cp /opt/livy-0.2.0/server/target/jars/*:/opt/livy-0.2.0/conf:/etc/hadoop/conf: com.cloudera.livy.server.LivyServer

Start

Note: Run as Root

cd /opt/livy-0.2.0/
export SPARK_HOME=/usr/lib/spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
./bin/livy-server start

Once started, the Livy Server can be called with the following host and port:

http://localhost:8998

If you’re calling it from another machine, you will need to replace “localhost” with the public IP or hostname of the Livy server.

Stop

Note: Run as Root

cd /opt/livy-0.2.0/
./bin/livy-server stop

Testing Livy

This assumes you are running the commands from the machine where Livy was installed, hence the use of localhost. If you would like to test it from another machine, just change “localhost” to the public IP or hostname of the Livy server.

  1. Create a new Livy Session
    1. Curl Command
      curl -H "Content-Type: application/json" -X POST -d '{"kind":"spark"}' -i http://localhost:8998/sessions
    2. Output
      HTTP/1.1 201 Created
      Date: Wed, 02 Nov 2016 22:38:13 GMT
      Content-Type: application/json; charset=UTF-8
      Location: /sessions/1
      Content-Length: 81
      Server: Jetty(9.2.16.v20160414)
      
      {"id":1,"owner":null,"proxyUser":null,"state":"starting","kind":"spark","log":[]}
  2. View Current Livy Sessions
    1. Curl Command
      curl -H "Content-Type: application/json" -i http://localhost:8998/sessions
    2. Output
      HTTP/1.1 200 OK
      Date: Tue, 08 Nov 2016 02:30:34 GMT
      Content-Type: application/json; charset=UTF-8
      Content-Length: 111
      Server: Jetty(9.2.16.v20160414)
      
      {"from":0,"total":1,"sessions":[{"id":0,"owner":null,"proxyUser":null,"state":"idle","kind":"spark","log":[]}]}
  3. Get Livy Session Info
    1. Curl Command
      curl -H "Content-Type: application/json" -i http://localhost:8998/sessions/0
    2. Output
      HTTP/1.1 200 OK
      Date: Tue, 08 Nov 2016 02:31:04 GMT
      Content-Type: application/json; charset=UTF-8
      Content-Length: 77
      Server: Jetty(9.2.16.v20160414)
      
      {"id":0,"owner":null,"proxyUser":null,"state":"idle","kind":"spark","log":[]}
  4. Submit job to Livy
    1. Curl Command
      curl -H "Content-Type: application/json" -X POST -d '{"code":"println(sc.parallelize(1 to 5).collect())"}' -i http://localhost:8998/sessions/0/statements
    2. Output
      HTTP/1.1 201 Created
      Date: Tue, 08 Nov 2016 02:31:29 GMT
      Content-Type: application/json; charset=UTF-8
      Location: /sessions/0/statements/0
      Content-Length: 40
      Server: Jetty(9.2.16.v20160414)
      
      {"id":0,"state":"running","output":null}
  5. Get Job Status and Output
    1. Curl Command
      curl -H "Content-Type: application/json" -i http://localhost:8998/sessions/0/statements/0
    2. Output
      HTTP/1.1 200 OK
      Date: Tue, 08 Nov 2016 02:32:15 GMT
      Content-Type: application/json; charset=UTF-8
      Content-Length: 109
      Server: Jetty(9.2.16.v20160414)
      
      {"id":0,"state":"available","output":{"status":"ok","execution_count":0,"data":{"text/plain":"[I@6270e14a"}}}
  6. Delete Session
    1. Curl Command
      curl -H "Content-Type: application/json" -X DELETE -i http://localhost:8998/sessions/0
    2. Output
      {"msg":"deleted"}