Archiving your pictures on Amazon S3 with AWS Lambda & Rust

Introduction

Recently, I wanted to find a cost-effective solution to archive my pictures. I was using a NAS at home, but it can fail at any time, so it was not a good long-term solution. Being an AWS consultant, I naturally thought of using Amazon S3 to provide the durability needed.

I currently store around 40,000 pictures (120 GB), so using Amazon S3 Standard would cost about $3 per month ($0.024 per GB). This is more than I would like to pay. Fortunately, Amazon S3 offers other storage classes that are more cost-efficient, such as S3 Glacier Deep Archive ($0.0018 per GB), less than a tenth of the price.
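Using these numbers, the rough storage-only cost for my 120 GB library is:

120 GB × $0.024/GB  ≈ $2.88 per month on S3 Standard
120 GB × $0.0018/GB ≈ $0.22 per month on S3 Glacier Deep Archive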

The downside of Glacier Deep Archive is that once you archive an object, you need to restore it before you can download it, which is not practical when you are searching through thousands of pictures. So I decided to generate a thumbnail of each picture and store it alongside the original: I can easily browse through the thumbnails and then restore only the files I need. Thanks to their reduced size and quality, all the thumbnails together cost only a few cents per month to store.

To convert all my pictures automatically, I decided to write a custom AWS Lambda function in Rust that generates the thumbnail and archives the original picture.

Why Rust? I wanted to learn something new. Although AWS APIs and Lambda functions are my bread and butter, I mostly use Python at work. This project seemed like a good fit to transpose existing knowledge into a new language.

This blog post will go through the different steps to implement this solution.

Overview of the solution

The following architecture diagram represents the solution being built.

Setting up your environment to build AWS Lambda functions in Rust

My development environment is Ubuntu Linux 22.04. It relies on Rust, Cargo Lambda, and the AWS SAM CLI.

Install Rust

The first step is to install Rust on your computer if you have not already done so. The official documentation is here.

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Install cargo lambda

Cargo lambda is an open-source tool to streamline the development of AWS Lambda functions.

pip3 install cargo-lambda

Install SAM CLI

The SAM CLI streamlines the deployment of AWS resources using CloudFormation templates.

curl -LO https://github.com/aws/aws-sam-cli/releases/latest/download/aws-sam-cli-linux-x86_64.zip
unzip aws-sam-cli-linux-x86_64.zip -d sam-installation
sudo ./sam-installation/install

Initialize your folder

Using cargo lambda, we will bootstrap a new repository to host our code.

cargo lambda new image-archiver --no-interactive

This creates a new image-archiver folder; go into it and open it with your preferred editor. The folder contains a simple Rust project structure, with a Cargo.toml file defining the project and its dependencies, and a src/main.rs file.

The toml file contains the description of the Rust project as well as its dependencies.

[package]
name = "image-archiver"
version = "0.1.0"
edition = "2021"

# Starting in Rust 1.62 you can use `cargo add` to add dependencies 
# to your project.
#
# If you're using an older Rust version,
# download cargo-edit(https://github.com/killercup/cargo-edit#installation) 
# to install the `add` subcommand.
#
# Running `cargo add DEPENDENCY_NAME` will
# add the latest version of a dependency to the list,
# and it will keep the alphabetic ordering for you.

[dependencies]

lambda_runtime = "0.7"
serde = "1.0.136"
tokio = { version = "1", features = ["macros"] }
tracing = { version = "0.1", features = ["log"] }
tracing-subscriber = { version = "0.3", default-features = false, features = ["fmt"] }
  • lambda_runtime is a library that provides a Lambda runtime for applications written in Rust.
  • serde is a library to serialize and deserialize data.
  • tokio is a library for asynchronous workloads.
  • tracing and tracing-subscriber are tracing and logging libraries.

If we want to add new dependencies, we can use the cargo add command.

The src/main.rs file contains the actual code that will run in the function. I’ve stripped some parts that are not useful at the moment.

use lambda_runtime::{run, service_fn, Error, LambdaEvent};


/// This is the main body for the function.
/// Write your code inside it.

async fn function_handler(event: LambdaEvent<Request>) -> Result<Response, Error> {
    // Extract some useful info from the request
    let command = event.payload.command;

    // Prepare the response
    let resp = Response {
        req_id: event.context.request_id,
        msg: format!("Command {}.", command),
    };

    // Return `Response` (it will be serialized to JSON automatically by the runtime)
    Ok(resp)
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    tracing_subscriber::fmt()
        .with_max_level(tracing::Level::INFO)
        // disable printing the name of the module in every log line.
        .with_target(false)
        // disabling time is handy because CloudWatch will add the ingestion time.
        .without_time()
        .init();

    run(service_fn(function_handler)).await
}

Write your function

Implementation logic

The implementation logic is as follows:

  1. An Amazon EventBridge event triggers the Lambda function
  2. The function downloads the object
  3. The function generates the thumbnail
  4. The function uploads the thumbnail to Amazon S3
  5. The function archives the original object.

Parsing the Amazon EventBridge event in the function

Whenever a new object, a picture in our case, is uploaded to our Amazon S3 bucket, Amazon EventBridge reacts to this event and triggers the Lambda function with a payload similar to the following.

{
    "version": "0",
    "id": "2d4eba74-fd51-3966-4bfa-b013c9da8ff1",
    "detail-type": "Object Created",
    "source": "aws.s3",
    "account": "123456789012",
    "time": "2021-11-13T00:00:59Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:s3:::photos-archive"
    ],
    "detail": {
        "version": "0",
        "bucket": {
            "name": "photos-archive"
        },
        "object": {
            "key": "archive/2023/01/01/DSC_12345.jpg",
            "size": 99797,
            "etag": "7a72374e1238761aca7778318b363232",
            "version-id": "a7diKodKIlW3mHIvhGvVphz5N_ZcL3RG",
            "sequencer": "00618F003B7286F496"
        },
        "request-id": "4Z2S00BKW2P1AQK8",
        "requester": "348414629041",
        "source-ip-address": "72.21.198.68",
        "reason": "PutObject"
    }
}

To get the Amazon S3 bucket name and object key needed to download the picture and generate the thumbnail, we need to create structures (struct) to parse the event. We will map only the necessary fields. You can replace the initial sample Request and Response in the main.rs file with the following snippet.

use tracing::{error};
use serde::Deserialize;

#[derive(Deserialize, Debug)]
pub struct EventBridgeEvent {
    detail: EventBridgeDetail,
}

#[derive(Deserialize, Debug)]
struct EventBridgeDetail {
    bucket: EventBridgeBucket,
    object: EventBridgeObject,
}

#[derive(Deserialize, Debug)]
struct EventBridgeBucket {
    name: String,
}
#[derive(Deserialize, Debug)]
struct EventBridgeObject {
    key: String,
    size: u32
}

async fn function_handler(event: LambdaEvent<EventBridgeEvent>) -> Result<(), Error> {
    let (event, _context) = event.into_parts();
    Ok(())
}

We see here that we need to create one structure for each level of the payload until we reach the values we need. Rust is a strongly typed language, so we also need to update the function_handler input and output types accordingly.
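A quick way to check that these structures match the payload is to deserialize the sample event in a unit test. This is a minimal sketch, assuming serde_json is added as a development dependency (cargo add --dev serde_json); the JSON below only contains the fields we actually map.

#[cfg(test)]
mod tests {
    use super::EventBridgeEvent;

    #[test]
    fn parses_a_sample_object_created_event() {
        // Only the fields mapped by our structs; serde ignores the rest of the payload.
        let payload = r#"{
            "detail": {
                "bucket": { "name": "photos-archive" },
                "object": { "key": "archive/2023/01/01/DSC_12345.jpg", "size": 99797 }
            }
        }"#;
        let event: EventBridgeEvent = serde_json::from_str(payload).expect("valid event");
        assert_eq!(event.detail.bucket.name, "photos-archive");
        assert_eq!(event.detail.object.key, "archive/2023/01/01/DSC_12345.jpg");
    }
}

Running cargo test confirms the parsing works; any extra fields present in the real EventBridge payload are simply ignored by serde.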

Download the object

To download the object, we need the AWS SDK for Rust (in developer preview at the time of writing). The SDK comes as a set of libraries that can be added by running cargo add. In our project, we need aws-config to handle the environment configuration and retrieve credentials, as well as aws-sdk-s3 to interact with the Amazon S3 APIs. Packaged dependencies are also called crates and are available on crates.io.

cargo add aws-config
cargo add aws-sdk-s3

You can now see the two added dependencies in the Cargo.toml file.

[dependencies]
aws-config = "0.52.0"
aws-sdk-s3 = "0.22.0"

Now we will instantiate the SDK by loading the configuration from the environment and creating a client.

async fn function_handler(event: LambdaEvent<EventBridgeEvent>) -> Result<(), Error> {
    let (event, _context) = event.into_parts();
    
    // load the environment configuration (credentials, region) and creates a client for S3 API.
    let config = aws_config::load_from_env().await;
    let client = aws_sdk_s3::Client::new(&config);

    Ok(())
}

We create a config variable using the let statement and aws_config::load_from_env() to load the environment configuration. As the runtime is asynchronous, we add the await statement to retrieve the value. We then instantiate the S3 client with aws_sdk_s3::Client::new(&config). The & operator means that we pass a reference to config to the new function. Passing references is a key aspect of how Rust manages memory: if we passed config by value instead of &config, ownership would move into the function and we could no longer use config afterwards. You can read more in the ownership section of the Rust book.
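As a standalone illustration (not part of our function), the following sketch shows the difference between borrowing a value with & and moving it into a function:

// Standalone example of move vs. borrow, unrelated to the AWS SDK.
fn borrows(s: &String) -> usize {
    s.len() // only the reference goes out of scope here; the String is untouched
}

fn takes_ownership(s: String) -> usize {
    s.len() // `s` is dropped at the end of this function
}

fn main() {
    let config = String::from("some configuration");

    borrows(&config);        // we lend `config`, so it can still be used afterwards
    takes_ownership(config); // ownership moves into the function...
    // borrows(&config);     // ...so uncommenting this line would not compile
}

Passing &config to the S3 client follows the same logic: the client only needs to read the configuration, and we keep the option to reuse it, for example to create clients for other services.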

We then retrieve the object using the SDK.

    // We retrieve the bucket name and key from the event
    let bucket_name = event.detail.bucket.name;
    let key = event.detail.object.key;
    
    let result = client.get_object().bucket(&bucket_name).key(&key).send().await;
    

I’d like to stop here for a second as we start interacting with the AWS APIs. We build an API call by starting from the client and chaining the operation get_object() followed by the parameters bucket(&bucket_name) and key(&key). We then send() the request and, as it is an asynchronous call, we await the response. With let result = we assign the result of the API call to the result variable.

Every API call returns a Result, which is Rust's error handling mechanism. We use the match operator to evaluate the two cases (success or error).

    let body = match result {
        Ok(output) => output.body.collect().await?,
        Err(e) => {
            error!("Unable to retrieve the object {}", &key);
            return Err(Error::from(e.into_service_error()))
        }
    };

If the operation is a success, Ok(output), we execute the expression on the right-hand side of => which retrieves the body of the object by calling the collect() function. It returns a Future that we wait for using the await statement. Again, this call returns a Result that needs to be processed. Here we use the question mark operator ? to tell Rust to return the value on success or to propagate the error. It is a shortcut that avoids nesting too many match operators.

If the GetObject operation fails, Err(e), we write an error log and exit the handler with the error.
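To make the shortcut concrete, here is a small standalone sketch (not part of the handler) showing that the two functions below behave the same way; the ? operator roughly expands to the explicit match.

use std::num::ParseIntError;

// Explicit match: unwrap the value on success, return early with the error otherwise.
fn parse_port_verbose(input: &str) -> Result<u16, ParseIntError> {
    let port = match input.trim().parse::<u16>() {
        Ok(value) => value,
        Err(e) => return Err(e), // propagate the error to the caller
    };
    Ok(port)
}

// Same behavior with the `?` operator.
fn parse_port(input: &str) -> Result<u16, ParseIntError> {
    let port = input.trim().parse::<u16>()?;
    Ok(port)
}

The only extra thing ? does is convert the error type with From when needed, which is why it also works with the broad lambda_runtime Error type returned by our handler.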

Generate the thumbnail

To generate a thumbnail, we will use another crate built to handle pictures. The crate is called image and, according to its documentation, it can read an image from memory and generate a thumbnail from it.

Let’s get started by adding the crate,

cargo add image

then modify the main.rs file to use the image module and transform our image.

    let image = match image::load_from_memory(&body.into_bytes()){
        Ok(img) => img,
        Err(e) => {
            error!("Unable to process the image {}", &key);
            return Err(Error::from(e.to_string()));
        }
    };
    let thumbnail = image.thumbnail(1280,720);

We invoke the load_from_memory function to try to import the object as an image, then apply the thumbnail function to resize it. The pattern is similar to the get_object API call earlier: load_from_memory returns a Result that we process with a match operator. If the operation succeeds, the image is assigned to the image variable; if not, we log the error and exit.
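Note that thumbnail(1280, 720) preserves the aspect ratio: the picture is scaled down to fit within those bounds rather than stretched to them. Below is a minimal standalone sketch to check the resulting dimensions locally, assuming a hypothetical test file named sample.jpg on disk.

// Standalone check of the dimensions produced by `thumbnail`, outside the Lambda handler.
use image::GenericImageView;

fn main() -> Result<(), image::ImageError> {
    let original = image::open("sample.jpg")?; // hypothetical local test file
    let thumb = original.thumbnail(1280, 720);

    println!("original:  {:?}", original.dimensions());
    println!("thumbnail: {:?}", thumb.dimensions()); // fits within 1280x720, aspect ratio preserved
    Ok(())
}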

Upload the thumbnail

To upload the thumbnail, we write the image into a buffer as a JPEG, then generate a filename derived from the initial object key so that we do not overwrite the original picture. We finally call the put_object operation to upload the buffer as an Amazon S3 object.

    // we create the buffer and write the thumbnail into it as a Jpeg image.
    let mut buffer: Vec<u8> = Vec::new();
    thumbnail.write_to(
        &mut Cursor::new(&mut buffer),
        image::ImageOutputFormat::Jpeg(50),
    )?;

    // we generate a key for the thumbnail by prefixing thumbnail and changing the extension to jpeg.
    let split_key: Vec<&str> = key.split('.').collect();
    let thumbnail_key = format!("thumbnail/{}.jpeg", split_key[0]);
    

    // upload the thumbnail to S3
    let resp = client
        .put_object()
        .body(aws_sdk_s3::types::ByteStream::from(buffer))
        .bucket(&bucket_name)
        .key(&thumbnail_key)
        .send()
        .await;
    if resp.is_err() {
        error!("Unable to upload the thumbnail. Error {:?}", resp);
        return Err(Error::from(resp.err().unwrap()));
    };

Archive the original object

To change the storage class of an object, we copy the object onto itself while changing the storage class parameter, using the copy_object() method of the S3 client. Its copy_source parameter is a string that must be URL-encoded, so we will add the urlencoding crate to the project.

cargo add urlencoding
    let source = format!("{}/{}", &bucket_name, &key);
    let encoded_source = urlencoding::encode(source.as_str());
    let resp = client.copy_object().copy_source(encoded_source).key(&key).bucket(&bucket_name)
        .storage_class(aws_sdk_s3::model::StorageClass::DeepArchive)
        .send()
        .await;
    
    if resp.is_err() {
        error!("Unable to upload the thumbnail. Error {:?}", resp);
        return Err(Error::from(resp.err().unwrap()));
    };

    Ok(())
        

Below is the complete main.rs file that we created.

use lambda_runtime::{run, service_fn, Error, LambdaEvent};
use tracing::{error};
use serde::Deserialize;
use std::io::Cursor;


#[derive(Deserialize, Debug)]
pub struct EventBridgeEvent {
    detail: EventBridgeDetail,
}

#[derive(Deserialize, Debug)]
struct EventBridgeDetail {
    bucket: EventBridgeBucket,
    object: EventBridgeObject,
}

#[derive(Deserialize, Debug)]
struct EventBridgeBucket {
    name: String,
}
#[derive(Deserialize, Debug)]
struct EventBridgeObject {
    key: String,
    size: u32
}

async fn function_handler(event: LambdaEvent<EventBridgeEvent>) -> Result<(), Error> {
    let (event, _context) = event.into_parts();
    
    let config = aws_config::load_from_env().await;
    let client = aws_sdk_s3::Client::new(&config);
    
    // We retrieve the bucket name and key from the event
    let bucket_name = event.detail.bucket.name;
    let key = event.detail.object.key;
    
    let result = client
        .get_object()
        .bucket(&bucket_name)
        .key(&key)
        .send()
        .await;

    let body = match result {
        Ok(output) => output.body.collect().await?,
        Err(e) => {
            error!("Unable to retrieve the object {}", &key);
            return Err(Error::from(e.into_service_error()))
        }
    };

    let image = match image::load_from_memory(&body.into_bytes()){
        Ok(img) => img,
        Err(e) => {
            error!("Unable to process the image {}", &key);
            return Err(Error::from(e.to_string()));
        }
    };
    let thumbnail = image.thumbnail(1280,720);

    // we create the buffer and write the thumbnail into it as a Jpeg image.
    let mut buffer: Vec<u8> = Vec::new();
    thumbnail.write_to(
        &mut Cursor::new(&mut buffer),
        image::ImageOutputFormat::Jpeg(50),
    )?;

    // we generate a key for the thumbnail by prefixing thumbnail and changing the extension to jpeg.
    let split_key: Vec<&str> = key.split('.').collect();
    let thumbnail_key = format!("thumbnail/{}.jpeg", split_key[0]);
    

    // upload the thumbnail to S3
    let resp = client
        .put_object()
        .body(aws_sdk_s3::types::ByteStream::from(buffer))
        .bucket(&bucket_name)
        .key(&thumbnail_key)
        .send()
        .await;
    if resp.is_err() {
        error!("Unable to upload the thumbnail. Error {:?}", resp);
        return Err(Error::from(resp.err().unwrap()));
    };

    let source = format!("{}/{}", &bucket_name, &key);
    let encoded_source = urlencoding::encode(source.as_str());
    let resp = client
        .copy_object()
        .copy_source(encoded_source)
        .key(&key)
        .bucket(&bucket_name)
        .storage_class(aws_sdk_s3::model::StorageClass::DeepArchive)
        .send()
        .await;
    
    if resp.is_err() {
        error!("Unable to upload the thumbnail. Error {:?}", resp);
        return Err(Error::from(resp.err().unwrap()));
    };

    Ok(())

}



#[tokio::main]
async fn main() -> Result<(), Error> {
    tracing_subscriber::fmt()
        .with_max_level(tracing::Level::INFO)
        // disable printing the name of the module in every log line.
        .with_target(false)
        // disabling time is handy because CloudWatch will add the ingestion time.
        .without_time()
        .init();

    run(service_fn(function_handler)).await
}

Test your function locally

cargo lambda provides tools to run your function locally and test its behavior before deploying it. cargo lambda watch launches a server listening on port 9000 and recompiles your code after each change. Use the command cargo lambda invoke --data-ascii '{}' to invoke the function. This way you can iterate on your code and spot bugs early.

Now open two terminals in the project directory and enter the following command in the first terminal:

cargo lambda watch

and in the second terminal enter:

cargo lambda invoke --data-ascii '{}'

Once you hit enter, you will see the compiler build the Lambda function on the fly and start the execution. The first terminal should return something like the following, with the details of the invocation:

ERROR Lambda runtime invoke{requestId="fca26b37-d8a1-4890-9c2f-7a8c350d9356" xrayTraceId="Root=1-63c1ae81-6a215122a9eb40caa0b16056;Parent=77fbc8d0e0c59adb;Sampled=1"}: Error("missing field `detail`", line: 1, column: 2)

while the second terminal returns only the inner error message:

Error: serde_json::error::Error

  × missing field `detail` at line 1 column 2

Now let’s try with a valid event:

cargo lambda invoke --data-ascii '{"detail":{"object":{"key":"picture.jpg", "size": 1234}, "bucket": {"name":"my_bucket"}}}'

Error: alloc::boxed::Box<dyn core::error::Error + core::marker::Send + core::marker::Sync>

  × unhandled error

We now see that the event is parsed correctly, but the S3 call fails with an unhandled error. We have more details in the first terminal, where the error has been logged.

ERROR Lambda runtime invoke{requestId="8c641830-6358-443d-b266-f398bdc36b0d" xrayTraceId="Root=1-63c1badb-6cc969048a7388afb192699b;Parent=23fc8aaca58ee2c3;Sampled=1"}: GetObjectError { kind: Unhandled(Unhandled { source: Error { code: Some("PermanentRedirect"), message: Some("The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint."), request_id: Some("9WZ3EF00KEGW0KDM"), extras: {"s3_extended_request_id": "J1SO1A62y0NES+MpsKfjNWoI5Krh/4bZ6EqNc6U5NDYXtJwVGDknFRj53lJkYDH6ed3ugdHtiHI="} } }), meta: Error { code: Some("PermanentRedirect"), message: Some("The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint."), request_id: Some("9WZ3EF00KEGW0KDM"), extras: {"s3_extended_request_id": "J1SO1A62y0NES+MpsKfjNWoI5Krh/4bZ6EqNc6U5NDYXtJwVGDknFRj53lJkYDH6ed3ugdHtiHI="} } }

Now let’s jump to the next part where we will create a new bucket and start uploading a picture.

Deploy to your AWS Account with SAM

We will create a CloudFormation template using the Serverless Application Model to build our application. Let’s get started with creating a simple bucket.

Deploy the bucket

Create a new file named template.yaml in your project directory and paste the following code. It will create a new Amazon S3 bucket in your account with a generated name. We will add the AWS Lambda integration afterwards.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  Bucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: "AES256"
      PublicAccessBlockConfiguration:
        BlockPublicAcls: True
        BlockPublicPolicy: True
        IgnorePublicAcls: True
        RestrictPublicBuckets: True
      OwnershipControls:
        Rules:
          - ObjectOwnership: BucketOwnerEnforced
      NotificationConfiguration: # Integrates the events with EventBridge
        EventBridgeConfiguration:
          EventBridgeEnabled: true

Outputs:
  BucketName:
    Value: !Ref Bucket
    Description: Name of the bucket

Then we use the SAM CLI to deploy the template.

sam deploy --guided

Configuring SAM deploy
======================

        Looking for config file [samconfig.toml] :  Not found

        Setting default arguments for 'sam deploy'
        =========================================
        Stack Name [sam-app]: img-archiver # enter the name of your application here
        AWS Region [eu-west-1]: 
        #Shows you resources changes to be deployed and require a 'Y' to initiate deploy
        Confirm changes before deploy [y/N]: 
        #SAM needs permission to be able to create roles to connect to the resources in your template
        Allow SAM CLI IAM role creation [Y/n]: 
        #Preserves the state of previously provisioned resources when an operation fails
        Disable rollback [y/N]: 
        Save arguments to configuration file [Y/n]: 
        SAM configuration file [samconfig.toml]: 
        SAM configuration environment [default]: 

Select the default options. At the end of the deployment, the SAM tool will display the name of the bucket created.

----------------------------------------------------------------------------------------------------------------------------
Outputs                                                                                                                     
----------------------------------------------------------------------------------------------------------------------------
Key                 BucketName                                                                                              
Description         Name of the bucket                                                                                      
Value               img-archiver-bucket-4isywdroe8s1                                                                        
----------------------------------------------------------------------------------------------------------------------------

Upload a picture to the newly created bucket (here img-archiver-bucket-4isywdroe8s1) and check that your Lambda function works.

aws s3 cp my-cool-picture.png s3://img-archiver-bucket-4isywdroe8s1/my-cool-picture.png

Invoke your function using the following command

cargo lambda invoke --data-ascii '{"detail":{"object":{"key":"my-cool-picture.png", "size": 1234}, "bucket": {"name":"img-archiver-bucket-4isywdroe8s1"}}}'

Once the processing is done, it should return null. You can then confirm that your file has been archived by checking its storage class, and that the thumbnail file exists, using the AWS CLI or the AWS Console.

aws s3api head-object --bucket img-archiver-bucket-4isywdroe8s1 --key my-cool-picture.png
{
    "AcceptRanges": "bytes",
    "LastModified": "Fri, 13 Jan 2023 20:29:34 GMT",
    "ContentLength": 6692553,
    "ETag": "\"ef0eeecca3eee585ffcd78dba8295f0c\"",
    "ContentType": "image/png",
    "ServerSideEncryption": "AES256",
    "Metadata": {},
    "StorageClass": "DEEP_ARCHIVE"
}
aws s3 ls s3://img-archiver-bucket-4isywdroe8s1/thumbnail/
2023-01-13 15:29:34      49700 my-cool-picture.jpeg

Deploy the application

Now that the code is working, let’s add a new resource in our CloudFormation template for the function.


  ImageProcessorFunction:
    Type: AWS::Serverless::Function
    Properties:
      MemorySize: 256
      Architectures: ["arm64"] 
      Handler: bootstrap
      Runtime: provided.al2
      Timeout: 30
      CodeUri: target/lambda/image-archiver/
      Policies:
        S3CrudPolicy:
          BucketName: !Ref Bucket
      Events:
        Trigger:
          Type: CloudWatchEvent
          Properties:
            Pattern:
              source:
                - aws.s3
              detail-type:
                - Object Created
              detail:
                bucket:
                  name:
                  - !Ref Bucket
                object:
                  key:
                    - prefix: "archive/"

We told CloudFormation to deploy an ARM-based function and trigger it when EventBridge (CloudWatchEvent) matches an event for a newly created object under the archive/ prefix. We also need to compile our code before we can upload it. For that step we use cargo lambda, which supports cross-compilation to arm64.

cargo lambda build --release --arm64

Run sam deploy to update the previous deployment with the Lambda function.

sam deploy

Upload a nice picture to the archive prefix.

aws s3 cp my-cool-picture.png s3://img-archiver-bucket-4isywdroe8s1/archive/my-cool-picture.png

Validate that it has been archived.

aws s3api head-object --bucket img-archiver-bucket-4isywdroe8s1 --key archive/my-cool-picture.png
{
    "AcceptRanges": "bytes",
    "LastModified": "Fri, 13 Jan 2023 20:29:34 GMT",
    "ContentLength": 6692553,
    "ETag": "\"ef0eeecca3eee585ffcd78dba8295f0c\"",
    "ContentType": "image/png",
    "ServerSideEncryption": "AES256",
    "Metadata": {},
    "StorageClass": "DEEP_ARCHIVE"
}
aws s3 ls s3://img-archiver-bucket-4isywdroe8s1/thumbnail/archive/
2023-01-13 15:29:34      49700 my-cool-picture.jpeg

Conclusion

In this blog post, we built a Lambda function in Rust to automate the archiving of pictures. It loads each picture into memory, saves a thumbnail to S3, and then changes the storage class to Glacier Deep Archive for cost savings.

In terms of performance, I have not run a formal benchmark, but the function kept up with the pictures as I copied them, with a P99 of about 1.5 s for image conversion.

In a future post, I will cover some improvements to this Lambda function, such as refactoring some parts into a library, better error handling, and checking the first few bytes of a file to make sure that it is a picture; I had some videos in my library and did not need thumbnails for them.
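As a teaser for that last point, here is a hedged sketch of what such a check could look like, using the image crate's guess_format on the first bytes of the object (not part of the current function, and the helper name is mine):

use image::ImageFormat;

/// Returns true when the buffer starts with a signature that the `image` crate
/// recognises as one of the picture formats we care about; videos and other
/// files fail the check and can be archived without generating a thumbnail.
fn looks_like_a_picture(first_bytes: &[u8]) -> bool {
    matches!(
        image::guess_format(first_bytes),
        Ok(ImageFormat::Jpeg | ImageFormat::Png | ImageFormat::Gif | ImageFormat::Tiff)
    )
}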

If you want to get the complete source code, you can find it here
