Codelab 3: AWS + S3 + CloudFront
This is codelab 3, which expands upon codelab-02 by introducing CloudFront. It is due on Thursday, February 22nd at 11:59:59PM.
In this codelab, you'll get to play around with CloudFront.
- You'll set up CloudFront via the AWS GUI
- You'll test latency
- You'll issue an invalidation
Before starting this codelab, run git pull in the 389Lspring18 directory to update your local copy of the class repository.
It's a good idea to read or skim the entire codelab first so you have an idea of what you're doing. Keep in mind that you will be submitting screenshots of some command outputs, so at a minimum read the submission section near the bottom to understand what you will be turning in.
Back to buckets
First, download the image canyon.jpg.
Now, using the AWS GUI, navigate to S3.
From the S3 dashboard, click
In the field labeled bucket name, use the format cmsc389l-<your directory id>-codelab-03.
In the field labeled region, select Asia Pacific (Sydney).
This is what your form should look like.
Click the Create button in the lower-left corner. For the purposes of this tutorial, the default properties and permissions are fine. Go ahead and skip past the rest of the configuration screens and create the bucket.
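If you want to double-check your bucket name before typing it into the form, the naming format above can be sketched in Python. This is only an illustration: the function name and the "jdoe" directory ID are made up, and the check covers just a simplified subset of the S3 bucket naming rules.

```python
import re

# Simplified subset of the S3 bucket naming rules: 3-63 characters,
# lowercase letters, digits, and hyphens, starting and ending with a
# letter or digit. (S3 also permits dots, which this sketch ignores.)
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def codelab_bucket_name(directory_id: str, codelab: str = "03") -> str:
    """Build a name in the cmsc389l-<your directory id>-codelab-03 format."""
    name = f"cmsc389l-{directory_id.strip().lower()}-codelab-{codelab}"
    if not _BUCKET_NAME.fullmatch(name):
        raise ValueError(f"not a valid S3 bucket name: {name!r}")
    return name


if __name__ == "__main__":
    # "jdoe" is a hypothetical directory ID used only for illustration.
    print(codelab_bucket_name("jdoe"))
```

Bucket names are global across all of S3, which is why the format includes your directory ID.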
Select your newly-created bucket (you have to click the bucket name itself).
Select canyon.jpg from where you downloaded it on your machine. By default, S3 object permissions are set to private, so we will need to change them to public.
Click Next and select Grant public read access to this object(s) from the drop-down menu labeled Manage public permissions.
Your form should look similar to this.
Click the Upload button in the lower-left corner.
Select your new object (again, you have to click the name itself). This page similarly lets you see some details and configure properties. What we're after is the URL at the bottom of the page, under the label
Copy the link and paste it in the URL bar of a new tab in your browser. The canyon should start loading, albeit slowly. Why so slow? Well, we did place it in a region half-way 'round the world.
Let's try to quantify that speed. If you haven't already, enter your environment now by running:
$ pipenv shell
Make sure you have the proper dependencies from the Pipfile with the command:
$ pipenv install
You may need to make the script executable. From the directory that contains the script, run:
$ chmod +x lat-test.sh
Now execute it by running:
$ ./lat-test.sh https://s3.amazonaws.com/cmsc389l-public/codelab-04/canyon.jpg
- What is the output?
- namelookup: The time, in seconds, it took from the start until the name resolving was completed.
- connect: The time, in seconds, it took from the start until the TCP connect to the remote host (or proxy) was completed.
- appconnect: The time, in seconds, it took from the start until the SSL/SSH/etc connect/handshake to the remote host was completed.
- pretransfer: The time, in seconds, it took from the start until the file transfer was just about to begin. This includes all pretransfer commands and negotiations that are specific to the particular protocol(s) involved.
- redirect: The time, in seconds, it took for all redirection steps including name lookup, connect, pretransfer and transfer before the final transaction was started. Redirect shows the complete execution time for multiple redirections.
- starttransfer: The time, in seconds, it took from the start until the first byte was just about to be transferred. This includes pretransfer and also the time the server needed to calculate the result.
- download: The average download speed, in bytes per second, that curl measured for the complete download.
- total: The total time, in seconds, that the full operation lasted.
Don't be too surprised if the total is several whole seconds (our image is nearly 7 MB). Unlike a ping, which simply tests reachability, this measurement includes server-side processing and download time. More information about curl may be found on the man page.
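Because every time in the list above is measured from the start of the request, it can help to subtract neighboring values to see how long each phase took on its own. Here is a minimal sketch of that arithmetic. The function name, the phase labels, and the sample numbers are all made up for illustration, and it assumes a request with no redirects.

```python
def phase_durations(t: dict) -> dict:
    """Turn curl's cumulative timings (seconds) into per-phase durations."""
    return {
        "dns": t["namelookup"],
        "tcp": t["connect"] - t["namelookup"],
        # appconnect is reported as 0 for plain HTTP, so clamp at zero.
        "tls": max(t["appconnect"] - t["connect"], 0.0),
        "server_wait": t["starttransfer"] - t["pretransfer"],
        "transfer": t["total"] - t["starttransfer"],
    }


if __name__ == "__main__":
    # Hypothetical timings, NOT real measurements from the codelab.
    sample = {
        "namelookup": 0.05,
        "connect": 0.25,
        "appconnect": 0.70,
        "pretransfer": 0.70,
        "starttransfer": 1.00,
        "total": 4.00,
    }
    for phase, seconds in phase_durations(sample).items():
        print(f"{phase:12s} {seconds:.2f}s")
```

For a large object fetched from a distant region, you would expect both the connection phases (several round trips) and the transfer phase to be noticeably long.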
For comparison, try some other URLs with the script. You can grab URLs from any website online (for example, ./lat-test.sh https://en.wikipedia.org/wiki/Sydney). If you use the link for the index file from codelab-02, the time should be on the order of 0.5 seconds (recall that we used us-east-1, which is located in Northern Virginia).
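If you would rather compare several URLs in one go, something like the following Python sketch could stand in for repeated manual runs. It is not the contents of lat-test.sh (which isn't shown here); it is a guess at how the same fields could be collected with curl's --write-out option. The function names and output labels are assumptions, and it requires curl on your PATH.

```python
import os
import subprocess
import sys

# Output label -> curl --write-out variable for each field listed earlier.
FIELDS = {
    "namelookup": "time_namelookup",
    "connect": "time_connect",
    "appconnect": "time_appconnect",
    "pretransfer": "time_pretransfer",
    "redirect": "time_redirect",
    "starttransfer": "time_starttransfer",
    "download": "speed_download",
    "total": "time_total",
}


def build_curl_command(url: str) -> list:
    """Build a curl command that discards the body and prints one field per line."""
    fmt = "".join(f"{label}: %{{{var}}}\\n" for label, var in FIELDS.items())
    return ["curl", "--silent", "--output", os.devnull, "--write-out", fmt, url]


def parse_timings(output: str) -> dict:
    """Parse 'label: value' lines into a dict of floats."""
    timings = {}
    for line in output.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() in FIELDS:
            # Some locales print a decimal comma; normalize it.
            timings[label.strip()] = float(value.strip().replace(",", "."))
    return timings


def measure(url: str) -> dict:
    """Run curl once against url and return the parsed fields."""
    result = subprocess.run(
        build_curl_command(url), capture_output=True, text=True, check=True
    )
    return parse_timings(result.stdout)


if __name__ == "__main__":
    for url in sys.argv[1:]:
        print(url, measure(url))
```

You would run it with one or more URLs as arguments. Your screenshots should still come from lat-test.sh itself.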
Let's return to the AWS GUI, this time navigating to CloudFront, which can be found under Storage & Content Delivery.
From the CloudFront console, click Create Distribution. This will allow us to tell AWS which origin to use for our content. CloudFront offers two types of delivery: Web (suited to static content, like our canyon image) and RTMP (suited to streaming media, like videos). Select the top Get Started button under the Web heading.
Now, click into the Origin Domain Name text box. You will see a list of possible origins for content. Select the bucket created in the first part of this tutorial. We'll be using the default values for all other fields in this tutorial.
Scroll to the bottom and click the blue Create Distribution button in the bottom-left. You will then be on a page showing your distributions that will look something like this.
Note that the status is in progress; deployment may take several minutes. When complete, the status will change to deployed. Then, go ahead and grab the Domain Name (it should end in
Now we're ready to test. The new URL of our content will be of the form http://<domainName>/<objectName>, so it should look something like http://3x4mpl3.cloudfront.net/canyon.jpg. Try this in your browser.
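The URL form above is simple enough to build by hand, but here is a small sketch in case you want to script it. The function name is made up, and the domain is the same placeholder used in the text, not a real distribution.

```python
from urllib.parse import quote


def cloudfront_url(domain_name: str, object_name: str, scheme: str = "http") -> str:
    """Build a URL of the form <scheme>://<domainName>/<objectName>."""
    # quote() percent-encodes characters such as spaces but keeps "/" intact.
    key = quote(object_name.lstrip("/"))
    return f"{scheme}://{domain_name}/{key}"


if __name__ == "__main__":
    # Placeholder domain from the text above, not a real distribution.
    print(cloudfront_url("3x4mpl3.cloudfront.net", "canyon.jpg"))
```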
Edit the shell script again and run it a few times. Does the total time change?
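Since a single run can be noisy, one way to answer that question is to jot down the total from a few runs against each URL and compare medians. The sketch below does just that. The helper name and the numbers are invented for illustration, so substitute your own measurements.

```python
from statistics import median


def speedup(baseline_totals: list, cached_totals: list) -> float:
    """Ratio of median total times: values above 1 mean the second set is faster."""
    return median(baseline_totals) / median(cached_totals)


if __name__ == "__main__":
    # Hypothetical totals in seconds, NOT real measurements.
    s3_totals = [4.0, 4.2, 3.8]
    # The first CloudFront request may be slow while the edge fetches
    # the object from the origin; later requests can hit the cache.
    cloudfront_totals = [3.0, 0.5, 0.5]
    print(f"median S3:         {median(s3_totals):.2f}s")
    print(f"median CloudFront: {median(cloudfront_totals):.2f}s")
    print(f"speedup:           {speedup(s3_totals, cloudfront_totals):.1f}x")
```

Using the median rather than the mean keeps one slow first request from dominating the comparison.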
Verify you can reach it by the S3 link from your browser. Then try to reach it with your CloudFront link... and you might not be able to. So what is happening here? Why is CloudFront serving a stale version of our canyon.jpg image? This is because every object cached by CloudFront has an associated "Time-To-Live" (TTL) which expresses how long a file should be served from an edge location before being considered expired. This defaults to 24 hours -- so if you waited 24 hours and then re-visited this file via the CloudFront distribution, you would get the new version of
Your assignment for this codelab is to reference the documentation and issue an invalidation via the AWS CLI.
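To build some intuition for what TTLs and invalidations do, here is a toy model of a single edge location in Python. It is a deliberately simplified sketch, not how CloudFront is implemented and not the AWS CLI command you need to find: the class name, its methods, and the explicit "now" argument are all assumptions made for illustration.

```python
class EdgeCache:
    """Toy model of one edge location caching objects from an origin."""

    def __init__(self, origin: dict, ttl_seconds: float = 24 * 60 * 60):
        self.origin = origin      # path -> current body at the origin
        self.ttl = ttl_seconds    # how long a cached copy stays fresh
        self._cache = {}          # path -> (body, time it was fetched)

    def get(self, path: str, now: float) -> str:
        """Serve from the cache while fresh; otherwise refetch from the origin."""
        entry = self._cache.get(path)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]
        body = self.origin[path]
        self._cache[path] = (body, now)
        return body

    def invalidate(self, path: str) -> None:
        """Drop the cached copy so the next request goes back to the origin."""
        self._cache.pop(path, None)


if __name__ == "__main__":
    origin = {"/canyon.jpg": "version 1"}
    edge = EdgeCache(origin)
    print(edge.get("/canyon.jpg", now=0))      # fetched from the origin
    origin["/canyon.jpg"] = "version 2"        # the origin changes
    print(edge.get("/canyon.jpg", now=3600))   # still within the TTL
    edge.invalidate("/canyon.jpg")
    print(edge.get("/canyon.jpg", now=3601))   # refetched after invalidation
```

Try changing ttl_seconds and the request times to see when the edge notices a change at the origin on its own.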
You will be submitting:
- Screenshot of lat-test.sh output on canyon.jpg in Sydney S3
- Two other screenshots of lat-test.sh output on any two internet URLs (images, websites, etc.)
- Screenshot of lat-test.sh output on canyon.jpg in CloudFront. Make sure to run this a few times before taking the screenshot, so that the local edge nodes can cache the resource, leading to decreased latency.
- Screenshot showing the invalidation command you used and its console output.
- A short paragraph explaining the trade-off associated with time duration (or TTL) of objects in CloudFront. Write this in a text file called
Submit this assignment to codelab3 on the submit server. Upload a zipped directory containing the following files:
<directory id>.zip
  s3.png
  screenshot1.png
  screenshot2.png
  cloudfront.png
  invalidation.png
  summary.txt
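If you'd like to script the packaging step, the sketch below bundles the files listed above into <directory id>.zip and complains if any are missing. The function name is made up, and placing the files at the top level of the archive is an assumption, so check it against whatever the submit server expects.

```python
import zipfile
from pathlib import Path

# File names taken from the submission list above.
REQUIRED_FILES = [
    "s3.png",
    "screenshot1.png",
    "screenshot2.png",
    "cloudfront.png",
    "invalidation.png",
    "summary.txt",
]


def build_submission(directory_id: str, source_dir=".", dest_dir=".") -> Path:
    """Zip the required files from source_dir into dest_dir/<directory id>.zip."""
    source = Path(source_dir)
    missing = [name for name in REQUIRED_FILES if not (source / name).is_file()]
    if missing:
        raise FileNotFoundError("missing files: " + ", ".join(missing))
    archive = Path(dest_dir) / f"{directory_id}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in REQUIRED_FILES:
            # Assumption: files sit at the top level of the archive.
            zf.write(source / name, arcname=name)
    return archive

# Example (with a hypothetical directory ID): build_submission("jdoe")
```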