AWS Lambda can transform S3 objects on request through a feature called S3 Object Lambda. This article shows how to put a Lambda function in front of an S3 bucket so that objects are transformed as they are retrieved, for example by redacting data or converting JSON to XML. To follow along, you will need an AWS account. The steps below use the web console. If you prefer the command line, note that there is no separate "Lambda CLI": Lambda functions are managed through the standard AWS CLI, which you can install with AWS's official installer or, on Debian and Ubuntu releases that package it, with $ sudo apt-get install awscli. Functions are created with the aws lambda create-function subcommand, which takes --function-name along with a runtime, an execution role, a handler, and a deployment package. It has no options for passing S3 input and output paths; the link between the function and your S3 data is made through the Object Lambda Access Point described below.
Object Lambda lets you put a Lambda function in front of S3 objects, allowing them to be transformed on request by your own custom code. Since it runs automatically on Lambda, you don’t have to worry about running your own proxy layer.
What Is Object Lambda?
Object Lambda basically takes the place of an API in front of S3. Previously, you’d have to set up a proxy layer on your own infrastructure to handle transforming objects on request. This adds complexity, so AWS added a better solution.
Instead of accessing objects directly, you’ll do so through an Object Lambda Access Point. When you make a GET request for a file through that access point, its Lambda function is called automatically, is given access to the original object, and returns a transformed object to the application.
The uses for this can be basic, like redacting info or converting JSON to XML, but since it’s your own code, you can do whatever you’d like. You could, for example, run a database lookup and return a transformed object with new data, or make requests to external APIs.
You can have multiple access points per bucket, each representing a different “view” of the underlying data. Switching between them doesn’t require changes to your client logic. Simply pass the ARN of the Object Lambda Access Point wherever you would normally pass the bucket name.
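As a minimal sketch of that swap, assuming a Python client that uses boto3, the only thing that changes is the value passed as Bucket. The ARN and key below are placeholders, and read_object is a hypothetical helper rather than part of any AWS library.

```python
# Placeholder ARN for illustration; substitute the ARN of your own
# Object Lambda Access Point.
OLAP_ARN = "arn:aws:s3-object-lambda:us-east-1:111122223333:accesspoint/my-olap"


def read_object(s3_client, bucket_or_arn: str, key: str) -> bytes:
    """Fetch an object and return its bytes.

    `bucket_or_arn` can be a plain bucket name or an access point ARN;
    the get_object call itself is the same in both cases.
    """
    response = s3_client.get_object(Bucket=bucket_or_arn, Key=key)
    return response["Body"].read()


# Typical usage (requires boto3 and AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   raw = read_object(s3, "my-bucket", "report.txt")   # original object
#   transformed = read_object(s3, OLAP_ARN, "report.txt")  # through Lambda
```

Passing the client in as an argument keeps the helper easy to exercise with a stub when no AWS account is available.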
You also don’t need to access the original object by the exact name. For example, your application could request picture_1920x1080.jpg, which would find picture.jpg and resize it to the given dimensions. In this case, the Lambda function would need extra permissions to access the bucket contents.
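One way to handle that kind of request is to parse the dimensions out of the requested key before fetching the source image. The sketch below assumes a naming scheme of name_WIDTHxHEIGHT.ext; the pattern and the ResizeRequest type are illustrative choices, not something Object Lambda defines.

```python
import re
from typing import NamedTuple, Optional


class ResizeRequest(NamedTuple):
    source_key: str  # key of the original object, e.g. "picture.jpg"
    width: int
    height: int


# Assumed convention: "<stem>_<width>x<height><.ext>"
_RESIZE_PATTERN = re.compile(
    r"^(?P<stem>.+)_(?P<w>\d+)x(?P<h>\d+)(?P<ext>\.[^./]+)$"
)


def parse_resize_key(requested_key: str) -> Optional[ResizeRequest]:
    """Split a requested key into the source key and target dimensions.

    Returns None when the key does not follow the assumed convention,
    so the caller can fall back to serving the object unchanged.
    """
    match = _RESIZE_PATTERN.match(requested_key)
    if match is None:
        return None
    source_key = match["stem"] + match["ext"]
    return ResizeRequest(source_key, int(match["w"]), int(match["h"]))


print(parse_resize_key("picture_1920x1080.jpg"))
# ResizeRequest(source_key='picture.jpg', width=1920, height=1080)
```

Inside the function, the requested key would typically be taken from the path of the URL in the event's userRequest section, and the source object would then be read from the bucket directly, which is why the extra permissions are needed.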
Of course, you’ll need to pay for all the time spent running Lambda functions. If you’re running a lot of functions through a user-facing access point, this could start to add up. If your transformations are static, you might want to consider caching the objects in a separate S3 bucket. For example, if you have a function that applies filters/compression to an image, you might want to cache the results instead of rebuilding on every request. For things that depend on external state, though, this won’t be possible.
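The caching idea can be sketched as a simple cache-aside lookup. This is only an outline under stated assumptions: the cache is any dict-like store (in practice it might be a thin wrapper around get_object and put_object on the second bucket), and TRANSFORM_VERSION, cache_key, and get_or_transform are hypothetical names.

```python
from typing import Callable, MutableMapping

# Hypothetical version tag; bump it when the transformation logic changes
# so that stale cached results are no longer matched.
TRANSFORM_VERSION = "v1"


def cache_key(source_key: str, variant: str) -> str:
    """Build a cache key that is unique per transformation variant."""
    return f"{TRANSFORM_VERSION}/{variant}/{source_key}"


def get_or_transform(
    source_key: str,
    variant: str,
    cache: MutableMapping[str, bytes],
    load_original: Callable[[str], bytes],
    transform: Callable[[bytes], bytes],
) -> bytes:
    """Return the cached result if present; otherwise transform and store it."""
    key = cache_key(source_key, variant)
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = transform(load_original(source_key))
    cache[key] = result
    return result
```

This only makes sense when the output depends solely on the source object and the variant. If the source object can change, the cache key would also need something like the object's ETag or version ID.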
Using Object Lambda
Head over to the S3 Management Console to get started. Each Object Lambda Access Point needs a regular access point behind it. You’ll need to create this from Access Points > Create in the sidebar.
Enter a name and select a bucket, and make sure to select “Internet” unless this bucket is limited to a single VPC. Once it’s created, copy the ARN for the access point.
Create an Object Lambda Access Point:
Give it a name and paste in the ARN of the access point you just created. The console should then display the name of the underlying bucket.
At this point, you’ll need to select a Lambda function. If you have one prepared, you can enter the ARN or select it from the list. Otherwise, you’ll need to head over to the Lambda Management Console to create one.
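If you would rather script the console steps above, a rough equivalent with boto3's s3control client might look like the following. It assumes the Lambda function already exists; the function names, the "-supporting" suffix, and the arguments are illustrative, and the configuration shown is the minimum needed to route GetObject requests through the function.

```python
def build_object_lambda_config(supporting_access_point_arn: str,
                               function_arn: str) -> dict:
    """Build the Configuration argument for an Object Lambda Access Point."""
    return {
        # The regular access point that sits behind the Object Lambda one.
        "SupportingAccessPoint": supporting_access_point_arn,
        "TransformationConfigurations": [
            {
                "Actions": ["GetObject"],
                "ContentTransformation": {
                    "AwsLambda": {"FunctionArn": function_arn}
                },
            }
        ],
    }


def create_access_points(s3control, account_id: str, bucket: str,
                         name: str, function_arn: str) -> str:
    """Create both access points and return the Object Lambda Access Point ARN."""
    supporting = s3control.create_access_point(
        AccountId=account_id,
        Name=f"{name}-supporting",  # naming scheme is an assumption
        Bucket=bucket,
    )
    response = s3control.create_access_point_for_object_lambda(
        AccountId=account_id,
        Name=name,
        Configuration=build_object_lambda_config(
            supporting["AccessPointArn"], function_arn
        ),
    )
    return response["ObjectLambdaAccessPointArn"]


# Typical usage (requires boto3 and AWS credentials):
#   import boto3
#   arn = create_access_points(boto3.client("s3control"), "111122223333",
#                              "my-bucket", "my-olap", "<function ARN>")
```

The returned ARN is the value clients pass in place of the bucket name.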
At this point, the code is up to you, although AWS provides an example that takes the original object and transforms it to uppercase. No matter what language you end up using, the flow is the same: read the getObjectContext section of the event, fetch the original object from S3 using the presigned URL it contains (inputS3Url), transform the object, and then write the result with the WriteGetObjectResponse API, passing along the outputRoute and outputToken values from the same section. The function returns an HTTP status code afterward.
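A minimal sketch of that flow in Python is shown below. It follows the same shape as the uppercase example but is not a copy of it: it fetches the object with the standard library's urllib instead of a third-party HTTP client, and it accepts an optional client and fetch function (an addition made here for testability) so the handler can be run without AWS access.

```python
import urllib.request


def transform(body: bytes) -> bytes:
    """Uppercase a UTF-8 text object."""
    return body.decode("utf-8").upper().encode("utf-8")


def fetch_original(presigned_url: str) -> bytes:
    """Download the original object using the presigned URL from the event."""
    with urllib.request.urlopen(presigned_url) as response:
        return response.read()


def lambda_handler(event, context, s3_client=None, fetch=fetch_original):
    ctx = event["getObjectContext"]

    # 1. Get the original object from S3.
    original = fetch(ctx["inputS3Url"])

    # 2. Transform it.
    transformed = transform(original)

    # 3. Send the result back to the caller through S3 Object Lambda.
    if s3_client is None:
        import boto3  # available in the Lambda Python runtime
        s3_client = boto3.client("s3")
    s3_client.write_get_object_response(
        Body=transformed,
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )

    return {"status_code": 200}
```

Lambda itself only passes event and context, so the two extra parameters fall back to their defaults in production. The function's execution role also needs permission to call WriteGetObjectResponse.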
The event object that Lambda receives will look something like this:
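The payload itself is not reproduced here, so what follows is a trimmed sketch of its general shape, written as a Python dict. Every value is a placeholder, real events carry additional fields, and the two helper functions are hypothetical conveniences for pulling out commonly used values.

```python
from urllib.parse import unquote, urlparse

# Trimmed, illustrative event; all values are placeholders.
SAMPLE_EVENT = {
    "xAmzRequestId": "example-request-id",
    "getObjectContext": {
        "inputS3Url": "https://example-ap-111122223333.s3-accesspoint.us-east-1.amazonaws.com/example.txt?X-Amz-Security-Token=...",
        "outputRoute": "example-route",
        "outputToken": "example-token",
    },
    "configuration": {
        "accessPointArn": "arn:aws:s3-object-lambda:us-east-1:111122223333:accesspoint/example-olap",
        "supportingAccessPointArn": "arn:aws:s3:us-east-1:111122223333:accesspoint/example-ap",
        "payload": "{}",
    },
    "userRequest": {
        "url": "https://example-olap-111122223333.s3-object-lambda.us-east-1.amazonaws.com/example.txt",
        "headers": {"Accept-Encoding": "identity"},
    },
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "example-principal",
        "arn": "arn:aws:sts::111122223333:assumed-role/Admin/example",
        "accountId": "111122223333",
    },
    "protocolVersion": "1.00",
}


def requested_key(event: dict) -> str:
    """Return the object key the client asked for, taken from the request URL."""
    path = urlparse(event["userRequest"]["url"]).path
    return unquote(path.lstrip("/"))


def caller_arn(event: dict) -> str:
    """Return the ARN of the identity that made the request."""
    return event["userIdentity"]["arn"]


print(requested_key(SAMPLE_EVENT))  # example.txt
```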
There are two important pieces of info here: the userRequest section, which contains info about the initial request, like the URL and HTTP headers, and the userIdentity section, which can be used to personalize the response based on the IAM identity that made the request.