Connecting to a database from an AWS Lambda function

Lambda is the backbone of the AWS serverless portfolio. You focus on the application's business logic while AWS does the hard infrastructure work. But nothing is free; I'll talk about some complexities and considerations for using a database within Lambda functions.

Packaging the database driver

The Lambda function will contain the AWS packages for the selected platform by default, so you don't need to include boto3, for example, in your package if you are using Python. But this is not the case for DB drivers. So the following needs to be considered if your Lambda function needs to access a database:

Your lambda function must be deployed as a zip package that contains the needed DB drivers.
If the driver needs to be compiled or depends on other binary libraries, make sure to bundle all binaries in the package; all binaries must be compiled for the Linux x86-64 platform.
Your zip package can't exceed 50 MB zipped, or 250 MB unzipped.
The Lambda function cold start time increases with the size of the deployment package. So if you have multiple options, it is recommended to select the driver with the smaller package size, assuming it meets your requirements.
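The size limits above can be checked before uploading. The following is a minimal sketch using only the Python standard library; the helper name check_package and the returned keys are hypothetical, and the limits are the ones quoted in this article.

```python
import os
import zipfile

# Limits quoted in the text above for a zip deployment package.
MAX_ZIPPED_BYTES = 50 * 1024 * 1024
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024


def check_package(zip_path):
    """Report the zipped and unzipped size of a deployment package.

    check_package is a hypothetical helper name, not part of any AWS tool.
    """
    zipped = os.path.getsize(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        # file_size is the uncompressed size of each archive member.
        unzipped = sum(info.file_size for info in archive.infolist())
    return {
        "zipped_bytes": zipped,
        "unzipped_bytes": unzipped,
        "within_limits": zipped <= MAX_ZIPPED_BYTES
        and unzipped <= MAX_UNZIPPED_BYTES,
    }
```

Running this against the packages produced with two candidate drivers is also a quick way to compare their size before choosing one.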

Connecting to the database server

Like any other application, your Lambda function needs network connectivity to the DB server. By default the Lambda function runs in a VPC managed by AWS with internet access, so in this case it can access only resources exposed to the internet. There are two options:

(I don't recommend this option) Make your database internet accessible, so the Lambda function will access it using its public IP. This option is not secure as it exposes your database to possible attacks from the internet.
Configure the lambda function to use your VPC. You do this by specifying one or more subnets and security groups during the function creation.

Although the 2nd option is the more secure one, it has several drawbacks:

Slower cold start time of the Lambda function. On the 1st invocation of the Lambda function (after deployment, or after being recycled), or during scale-out, the 1st call can take several extra seconds to create an ENI in your VPC for the Lambda function.
The Lambda function by default doesn't have internet access (including access to other AWS services) unless the used subnet(s) are configured with a NAT gateway.

To create a Lambda function with VPC access:

Create a security group (name it for example lambda-sg).
Add a rule to the security group used by the DB to allow inbound access from the lambda-sg to the DB port.
During Lambda function creation, add one or more subnets in the same VPC as the DB server to the Lambda function, and specify lambda-sg in the list of security groups.
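The steps above can be sketched with boto3. This is only an illustration under assumptions: the security group IDs, subnet IDs and function name are placeholders you would supply, port 3306 assumes MySQL, and the sketch attaches the VPC settings to an already created function with update_function_configuration rather than at creation time. The two helper functions only build the request arguments, so they can be inspected without AWS access.

```python
DB_PORT = 3306  # assumption: MySQL default port, change for other engines


def db_ingress_rule(db_sg_id, lambda_sg_id, port=DB_PORT):
    """Arguments for ec2.authorize_security_group_ingress.

    Allows inbound TCP traffic on the DB port from members of lambda-sg.
    """
    return {
        "GroupId": db_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                "UserIdGroupPairs": [{"GroupId": lambda_sg_id}],
            }
        ],
    }


def lambda_vpc_config(subnet_ids, lambda_sg_id):
    """Value for the VpcConfig argument of the Lambda API."""
    return {"SubnetIds": list(subnet_ids), "SecurityGroupIds": [lambda_sg_id]}


def apply_vpc_access(function_name, db_sg_id, lambda_sg_id, subnet_ids):
    """Hypothetical wrapper that performs the two AWS calls."""
    import boto3  # imported here so the helpers above work without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(**db_ingress_rule(db_sg_id, lambda_sg_id))

    lambda_client = boto3.client("lambda")
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig=lambda_vpc_config(subnet_ids, lambda_sg_id),
    )
```

Referencing lambda-sg in the DB security group rule, instead of an IP range, means the rule keeps working whichever subnet or ENI the function ends up using.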

Managing DB connections

Lambda manages the lifecycle of the function. A Lambda function runs in a container. The container is created when the function is 1st accessed or when more instances of the function are needed due to the load. Each Lambda container can serve only one request at a time. After serving the request it can serve another one. After some timeout the container is deleted. So we can say each instance of the Lambda has 4 main states:

Initializing: Initialization takes time which can be several seconds. This includes creating the container, unpacking the function package and its layers, creating the VPC ENI if needed then executing the bootstrap and the initialization code of the function. This adds up to the 1st request execution time.
Serving a request: The function handler is called to serve a new request.
Idle waiting for a new request: It starts after returning the response of the previous request. During this state the function container is kept frozen. This means any per-request clean-up must be done before returning the response. The container will be resumed when a new request arrives.
Terminated: After a timeout (controlled by AWS, not configurable by the customer) the container is terminated. It just gets terminated without any notification to the function, so there is no opportunity to run any instance-wide clean-up.
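The difference between initialization code and handler code can be seen in a minimal Python sketch. The names INSTANCE_STARTED_AT and invocation_count are hypothetical; the point is only that module-level code runs once per function instance, while the handler runs once per request and sees the state left by earlier requests on the same instance.

```python
import time

# Initialization code: executed once, when the function instance is created.
INSTANCE_STARTED_AT = time.time()
invocation_count = 0


def handler(event, context):
    """Executed once per request; module-level state survives between calls."""
    global invocation_count
    invocation_count += 1
    return {
        # True only for the first request served by this instance.
        "cold_start": invocation_count == 1,
        "invocations_on_this_instance": invocation_count,
        "instance_age_seconds": time.time() - INSTANCE_STARTED_AT,
    }
```

Anything stored at module level, including an open DB connection, follows the same rule: it is kept while the instance is frozen and is lost without notice when the instance is terminated.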

It is important to understand this lifecycle while dealing with DB connections. In DB terms:

Connection pooling is useless inside a Lambda function. If a pool is used, it should contain at most one connection; any extra connections will remain idle and will never be used. Remember, a Lambda function instance can serve only one request at a time.
If connections are created in the handler, they should be closed before returning the response, because the container is frozen after the response is returned until the next request arrives.
If the connection is created in the initialization code (outside the handler), it remains open until the TTL (idle timeout) expires and it is closed by the DB server. This can cause severe issues for the DB server if the Lambda function has high traffic. Assume that, due to the load, AWS created 1000 instances of the Lambda function (the default limit per region); this means 1000 database connections are created. If some of the instances were recycled, their old connections will be kept open (leaked) until the DB idle timeout (the default is 8 hours in MySQL), and the new instances will create new connections. The new connections will keep accumulating and can cause extra resource consumption on the DB server, or connections can be rejected if the server reaches its maximum connections limit.

Some common solutions to correctly manage the DB connections:

Connection is created in the Lambda handler

This is the simplest solution and will prevent connection leakage. The connection is created when needed, and closed before returning or, on failure, before propagating the error.

But creating new connections is slow, and the DB server runs extra logic to process new connections, which increases the CPU load.

This option is suitable for Lambda functions with a low execution rate.
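A minimal sketch of this pattern in Python follows. To keep it runnable anywhere, sqlite3 from the standard library stands in for the real DB driver; that is an assumption for illustration only, and connect and query_once are hypothetical names. A real function would call its own driver's connect function with the DB host and credentials instead.

```python
import sqlite3
from contextlib import closing


def connect():
    """Hypothetical connection factory.

    sqlite3 is only a stand-in so the sketch runs locally; replace the body
    with your DB driver's connect call.
    """
    return sqlite3.connect(":memory:")


def query_once(connect_fn):
    # closing() guarantees conn.close() runs on success and on error,
    # before the response is returned and the container is frozen.
    with closing(connect_fn()) as conn:
        return conn.execute("SELECT 1 + 1").fetchone()[0]


def handler(event, context):
    # A new connection is opened and closed on every invocation.
    return {"result": query_once(connect)}
```

Because the connection never outlives the request, nothing is left open when the instance is frozen or terminated.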

Connection is created during instance initialization

When the Lambda function execution rate is high enough, the function instance is re-used for multiple requests. So it is logical to cache heavy resources like open DB connections between calls instead of creating a new one with each request. But as there is no clean-up handler in Lambda, the function can't clean up open connections, which will lead to connection leakage as I described earlier.

Some solutions can be used to minimize the leakage issue:

Reduce the DB connection idle timeout, so the connections are garbage collected by the DB server faster.
Add connection validation, retry and old connection clean-up logic to the Lambda function. Serverless MySQL is a good example.
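The validation part of this approach can be sketched as follows. This is not how Serverless MySQL is implemented; it is a minimal illustration that caches one connection per function instance and replaces it when a cheap test query fails, which is what happens after the DB server has closed an idle connection. As an assumption to keep the sketch runnable, sqlite3 stands in for the real driver, and connect, is_alive and get_connection are hypothetical names. It does not clean up connections leaked by other instances.

```python
import sqlite3

_conn = None  # cached per function instance, survives between invocations


def connect():
    """Hypothetical factory; sqlite3 stands in for a real DB driver."""
    return sqlite3.connect(":memory:")


def is_alive(conn):
    """Validate the cached connection with a cheap query."""
    try:
        conn.execute("SELECT 1").fetchone()
        return True
    except sqlite3.Error:
        return False


def get_connection():
    """Return the cached connection, reconnecting if it is missing or dead."""
    global _conn
    if _conn is None or not is_alive(_conn):
        _conn = connect()
    return _conn


def handler(event, context):
    conn = get_connection()
    return {"result": conn.execute("SELECT 41 + 1").fetchone()[0]}
```

The validation query costs one round trip per request, which is usually much cheaper than opening a new connection each time.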

Adding an external connection pooling layer

A proxy server can be added in the middle between the lambda function and the DB server:

The Lambda function opens a new connection to the DB proxy server inside the handler with each request. Proxy server connections are light-weight, so they take far fewer resources than DB server connections and are created much faster.
The proxy server keeps a pool of open connections between itself and the DB server. These DB connections are re-used by several connections coming from the Lambda function. This results in fewer open connections to the DB server, and a much lower rate of new DB connection creation. This reduces the Lambda function execution time and reduces the load on the DB server.

RDS Proxy is one solution that is provided by AWS. Currently it supports only Amazon RDS for MySQL and Amazon Aurora with MySQL compatibility. Other open source and commercial options are available for different DB engines, but you need to install and maintain them.

Securely storing the DB credentials

This is a very old dilemma: where should I store the DB credentials so my code can read them to connect to the DB server?

There are 3 recommended solutions for Lambda functions:

Environment variables: You can specify the values of some environment variables during Lambda function deployment, and the function will read them during initialization or handler execution. This is the simplest solution. Optionally the environment variables can be encrypted with a custom KMS key, and you can control access to this key using IAM policies. Lambda environment variables have the drawback that the credentials must be available during function deployment. Also, you have to configure each Lambda function if you have many accessing the DB. Credentials rotation is a challenge here.
AWS Secrets Manager: This is another option, but you have to add extra code to the Lambda function to read the credentials from the secret store. This can be done during initialization, with the result cached for all handler calls. It has the benefit that credentials are managed centrally and can be configured for automatic password rotation. Access to the credentials in Secrets Manager is controlled using IAM policies.
IAM authentication: It is supported for RDS/Aurora MySQL and Postgres, in addition to RDS Proxy. This is a custom authentication method and doesn't need to keep any passwords. The Lambda function calls an RDS API (generate-db-auth-token) to generate temporary credentials that can be used for authentication. Access is managed using IAM policies (who can use these credentials) and using normal DB grants/permissions (authorization to the DB resources). The 1st two options are generic to any DB engine, but this one is restricted to MySQL and Postgres RDS/Aurora, if enabled. It also has a limit on the rate of new connections.
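The last two options can be sketched with boto3. The clients are passed in as arguments (in a real function they would typically be created once at initialization with boto3.client("secretsmanager") and boto3.client("rds")). The helper names are hypothetical, and the sketch assumes the secret is stored as a JSON string with username and password keys.

```python
import json

_cached_secret = None  # filled once per function instance


def get_db_credentials(secrets_client, secret_id):
    """Read the secret once and reuse it for later handler calls.

    Assumption: the secret value is JSON such as
    {"username": "...", "password": "..."}.
    """
    global _cached_secret
    if _cached_secret is None:
        response = secrets_client.get_secret_value(SecretId=secret_id)
        _cached_secret = json.loads(response["SecretString"])
    return _cached_secret


def get_iam_auth_token(rds_client, host, port, user):
    """Generate a temporary token that is used in place of the DB password."""
    return rds_client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user
    )
```

If password rotation is enabled, a cached secret can become stale, so a function using this pattern would likely need to re-read the secret when authentication fails. With IAM authentication the token is short-lived, so it is generated when a new connection is opened rather than cached for the life of the instance.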