Note: This post was originally written in July 2020, and later updated and published in June 2024.
## Introduction
This post is inspired by a workshop I ran for a team of data scientists who owned a number of large datasets stored in an S3 bucket and managed S3 only via the web console. That worked fine while they just needed to add a bucket once in a while. The real challenge came when the team had to reduce costs: they needed to update the storage class of their data to "Infrequent Access" and add lifecycle rules to automate the process, ensuring that unused data is moved to a more cost-effective storage class.
I was surprised that there was no declarative approach in place. Manual changes via the web console have multiple problems: they mean mindless, time-consuming clicking around, and they are error-prone, since it's very easy to forget a step or make a typo. It's also hard to organize peer review for such a workflow, unless multiple people sit and click together.
Terraform would have helped with all of those problems, but the team didn't have any experience with tools like that. To address this gap, I created a quick demo to show how Terraform can be used to manage S3 buckets.
If you’ve read my other posts, you may have noticed that I’m a huge fan of reproducible setups and research, using tools like Docker Compose. So I used it for this demo too, together with localstack, a tool that emulates the AWS environment, to avoid the need for a real AWS account. This makes the demo entirely local, enabling any team member to play with it and see how easy it is to use Terraform.
## The Setup
Ok, after such a long intro, let’s move to the actual demo. If you’re eager to try it immediately, check out this repository.
Note: Here I assume that you already have Docker (or any other similar tool, like Podman) installed.
For the purpose of the demo, we'll need two containers: one that runs localstack, and another one for running `terraform` commands and other tools, like `aws`, if needed.
Here is how we can do it.
`docker-compose.yaml`:

```yaml
services:
  localstack:
    image: localstack/localstack
    ports:
      - "4566:4566"

  terraform:
    build: .
    depends_on:
      - localstack
    stdin_open: true
    tty: true
    volumes:
      - ./terraform:/terraform
    working_dir: /terraform
```
The `localstack` configuration is pretty simple: we just use the official Docker image and set the port mapping, in case you want to access the service from the outside.
The `terraform` definition is slightly more complex, but still straightforward. First, we use the `build` parameter to build the container from a custom `Dockerfile`, so we can customize it and install extra tools; we'll discuss it later.
Next, we have a `depends_on` option that says the terraform container should start after localstack. It's not strictly needed for this setup, but we have it here for clarity. The next two settings, `stdin_open` and `tty`, allow us to use the container interactively to run command-line tools.
The `volumes` parameter maps our local `terraform` directory into the container; we'll add the Terraform files in there.
Lastly, we set the working directory (`working_dir`) to point to the mapped one.
Now, let's take a quick look at the Dockerfile we use for the `terraform` service.
`Dockerfile`:

```dockerfile
FROM hashicorp/terraform:latest

RUN apk add --no-cache aws-cli

ENTRYPOINT ["sh"]
```
It's quite straightforward, as we start from the pre-built Terraform image. For demonstration purposes, we add the AWS command-line tool, which might be useful for testing. We also redefine the entrypoint to use a shell, because the default one is set to `terraform`.
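A small caveat: the `latest` tag moves over time, so builds are not fully reproducible. If that matters to you, pin the image to a specific version instead; the exact tag below is an assumption, so check the tags actually published on Docker Hub:

```dockerfile
# Pin Terraform to a fixed version instead of the moving "latest" tag
FROM hashicorp/terraform:1.8.5

RUN apk add --no-cache aws-cli

ENTRYPOINT ["sh"]
```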
## Running the demo
Now let’s see how we can use this configuration.
First, we need to build the terraform container because we use a custom `Dockerfile`. To do this, we use the `docker-compose build` command.
Once the build is complete, we can launch our setup with `docker-compose up`.
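For reference, the whole startup sequence from the repository root looks like this:

```sh
# Build the custom terraform image defined in the Dockerfile
docker-compose build

# Start localstack together with the terraform helper container
docker-compose up
```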
After a short while, you should see something like:
```
localstack-1  | LocalStack build date: 2024-06-03
localstack-1  | LocalStack build git hash: 5085b532c
...
localstack-1  | Ready.
```
meaning that localstack has successfully started. You may notice a few errors in the log, but they’re not relevant for our setup.
Now we can connect to the terraform container by running `docker-compose exec terraform sh` in a separate terminal. Once you are in the helper container's shell, you will be in the mapped local `terraform` directory, because we set the working directory in the `docker-compose` file.
You can check that you actually have Terraform here by running `terraform --version`. If you did everything correctly, you should see something like:
```
/terraform # terraform --version
Terraform v1.8.5
on linux_arm64
```
Here we can see that the Terraform command is available, and we have version `1.8.5`, the latest at the time of writing.
Next, let’s look at how to configure terraform to work with localstack.
Let's add a `main.tf` file to the `terraform` directory:
`main.tf`:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.54.1"
    }
  }

  backend "local" {
    path = "terraform.tfstate"
  }
}

provider "aws" {
  access_key = "test"
  secret_key = "test"
  region     = "us-east-1"

  s3_use_path_style = true

  endpoints {
    s3  = "http://localstack:4566"
    sts = "http://localstack:4566"
  }
}
```
This is a pretty standard configuration for AWS. First, we require the AWS provider, which is needed to interact with AWS; you may want to update it to a more recent version. We also store the Terraform state as a local file, which is enough for local development purposes.
In the next block, we configure the AWS provider itself. We set the credentials to just `test`, which is all localstack needs, and then point the API endpoints to the localstack ones. We override the S3 endpoint, since that's what we'll use for the demo, and the STS endpoint, because by default Terraform tries to validate credentials via the AWS STS service.
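Depending on the provider version and your setup, you may also want to stop the provider from calling real AWS APIs altogether. The `skip_*` arguments below do exist in the hashicorp/aws provider, but whether you need them with localstack is setup-dependent, so treat this as an optional variant:

```hcl
provider "aws" {
  access_key = "test"
  secret_key = "test"
  region     = "us-east-1"

  s3_use_path_style = true

  # Skip checks that would otherwise reach out to real AWS services
  skip_credentials_validation = true
  skip_requested_account_id   = true
  skip_metadata_api_check     = true

  endpoints {
    s3 = "http://localstack:4566"
  }
}
```

With credential validation skipped, the STS endpoint override may no longer be necessary.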
And now we're all set to see our setup in action. Go back to the terminal and run `terraform init` to initialize Terraform. This command might take a little time because it pulls the necessary dependencies. When it's done, you should see:
```
/terraform # terraform init

Initializing the backend...

Successfully configured the backend "local"! Terraform will automatically
use this backend unless the backend configuration changes.

Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.54.1"...
- Installing hashicorp/aws v5.54.1...
- Installed hashicorp/aws v5.54.1 (signed by HashiCorp)

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!
```
After that, you will notice some new files in the terraform directory. The `.terraform.lock.hcl` lock file ensures consistent provider versions, and the `.terraform` directory contains all the downloaded plugins. We don't need to modify or do anything with those files.
Now, let's run `terraform plan`. You should see:
```
/terraform # terraform plan

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration
and found no differences, so no changes are needed.
```
As expected, it shows no changes because we haven’t set up any infrastructure yet. Let’s change that by adding an S3 bucket.
Just create a new file, `buckets.tf`:
`buckets.tf`:

```hcl
resource "aws_s3_bucket" "bucket_demo" {
  bucket = "demo"
}
```
Now go back to the terminal and run `terraform plan` again:
```
/terraform # terraform plan

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_s3_bucket.bucket_demo will be created
  + resource "aws_s3_bucket" "bucket_demo" {
      + acceleration_status         = (known after apply)
      + acl                         = (known after apply)
      + arn                         = (known after apply)
      + bucket                      = "demo"
      + bucket_domain_name          = (known after apply)
      + bucket_prefix               = (known after apply)
      + bucket_regional_domain_name = (known after apply)
      + force_destroy               = false
      + hosted_zone_id              = (known after apply)
      + id                          = (known after apply)
      + object_lock_enabled         = (known after apply)
      + policy                      = (known after apply)
      + region                      = (known after apply)
      + request_payer               = (known after apply)
      + tags_all                    = (known after apply)
      + website_domain              = (known after apply)
      + website_endpoint            = (known after apply)
    }

Plan: 1 to add, 0 to change, 0 to destroy.
```
Now we can see the difference: Terraform says that it plans to add one resource, which is our bucket. But to actually create the new bucket, we need to execute `terraform apply`. The `apply` command checks the state, shows the plan too, and asks if you want to continue; type `yes`.
```
/terraform # terraform apply

...

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_s3_bucket.bucket_demo: Creating...
aws_s3_bucket.bucket_demo: Creation complete after 1s [id=demo]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
```
`Apply complete!` confirms that the changes are applied. And you can double-check it by running `terraform plan` again.
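As a side note, when you script this (for example in CI), you can skip the interactive prompt with the standard `-auto-approve` flag:

```sh
# Apply the plan without asking for interactive confirmation
terraform apply -auto-approve
```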
Now you can see that there is a new file called `terraform.tfstate`. This is the file that we set up earlier in the local backend to keep the state.
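You can also ask Terraform what it tracks in that state with `terraform state list`. For this configuration I would expect a single entry:

```sh
/terraform # terraform state list
aws_s3_bucket.bucket_demo
```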
Also, to showcase the `aws` tool we installed via the Dockerfile, let's check that the bucket was created using the `aws` command.
First, let's run `aws configure`; you can use `test` as the credentials here too.
```
/terraform # aws configure
AWS Access Key ID [None]: test
AWS Secret Access Key [None]: test
Default region name [None]: us-east-1
Default output format [None]:
```
But `configure` didn't ask us for the `endpoint_url` that we need to override to point at localstack. We can set it explicitly by running `aws configure set endpoint_url http://localstack:4566`.
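Alternatively, you can pass the endpoint per command instead of storing it in the config; `--endpoint-url` is a standard aws-cli option:

```sh
# Point a single aws-cli invocation at localstack
aws --endpoint-url http://localstack:4566 s3 ls
```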
Now you can run `aws s3 ls`, and you'll see `demo` in the output.
```
/terraform # aws s3 ls
2024-06-15 00:00:01 demo
```
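If you want to poke at the bucket a bit more, you can round-trip an object; the file name here is just an example:

```sh
# Create a small test file and upload it to the demo bucket
echo "hello" > hello.txt
aws s3 cp hello.txt s3://demo/hello.txt

# List the bucket contents to confirm the upload
aws s3 ls s3://demo/
```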
And that's it! As you can see, with just a few files and a couple of containers, you have an environment to play with Terraform and see how it can be used to manage AWS infrastructure, without actually using AWS or installing any tools on your machine besides Docker itself. The demo is completely reproducible, so you can take this template and experiment with Terraform on your own.
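As a next step, you could recreate the scenario from the intro. Below is a hedged sketch of a lifecycle rule that transitions objects to the "Infrequent Access" storage class after 30 days; the `aws_s3_bucket_lifecycle_configuration` resource does exist in the AWS provider, but the rule name and the 30-day threshold are just examples, and it's worth checking how faithfully localstack emulates lifecycle rules:

```hcl
resource "aws_s3_bucket_lifecycle_configuration" "bucket_demo" {
  bucket = aws_s3_bucket.bucket_demo.id

  rule {
    id     = "move-to-infrequent-access"
    status = "Enabled"

    # Move objects to STANDARD_IA once they are 30 days old
    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }
  }
}
```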