Amazon DynamoDB at a glance
I'll try to give, in a few points, a very short overview of Amazon DynamoDB. This is based on my understanding of the available AWS documentation.
DynamoDB is a NoSQL, key-value and document database service.
- Single-digit millisecond latency whatever the load is.
- Data is stored as key-value pairs. The key can be a composite of a partition key and a sort key, or it can be just a partition key.
- The partition key is used to locate the value in a specific partition. Partitions are distributed across multiple servers.
- Read operations can be strongly consistent reads (more expensive, and they may carry a latency penalty) or eventually consistent reads.
- One or more global indexes (up to 20 by default) can be added to a table at any time. A global index can have a partition key and, optionally, a sort key that differ from the table's partition and sort keys. Maintaining a global index incurs additional write requests as well as additional storage. The read and write capacity of a global index is managed separately from the table's.
- One or more local indexes (up to 5) can be added to a table, but only at table creation. A local index has the same partition key as the table, but its sort key can be different. Maintaining a local index incurs additional write requests as well as additional storage. Local indexes share the table's provisioned capacity.
- DynamoDB Streams can be used to capture table changes in near-real-time.
- DAX is an acceleration layer for repeatable (eventually consistent) reads with microsecond latency. A DAX cluster is composed of a primary node and optionally one or more replicas (up to 9 replicas). Client requests are distributed evenly across the nodes in the cluster. More nodes in the cluster increase the throughput, and a larger node instance type increases the caching capacity.
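To make the key and index points above more concrete, here is a minimal sketch of a table definition with a composite key, one local index, one global index and a stream. The table, index and attribute names (Orders, customer_id, order_date, status, total, by_total, by_status) and the capacity numbers are hypothetical. The dictionary follows the parameter names of the boto3 `create_table` call as I understand them, but the code only builds the dictionary and does not contact AWS.

```python
def build_table_definition():
    """Return keyword arguments shaped like a boto3 create_table request."""
    return {
        "TableName": "Orders",  # hypothetical table name
        # Only attributes that appear in a key schema are declared up front.
        "AttributeDefinitions": [
            {"AttributeName": "customer_id", "AttributeType": "S"},
            {"AttributeName": "order_date", "AttributeType": "S"},
            {"AttributeName": "status", "AttributeType": "S"},
            {"AttributeName": "total", "AttributeType": "N"},
        ],
        # Composite primary key: partition key (HASH) plus sort key (RANGE).
        "KeySchema": [
            {"AttributeName": "customer_id", "KeyType": "HASH"},
            {"AttributeName": "order_date", "KeyType": "RANGE"},
        ],
        # Local index: same partition key as the table, different sort key.
        "LocalSecondaryIndexes": [
            {
                "IndexName": "by_total",
                "KeySchema": [
                    {"AttributeName": "customer_id", "KeyType": "HASH"},
                    {"AttributeName": "total", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        # Global index: its own partition key and its own capacity settings.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "by_status",
                "KeySchema": [
                    {"AttributeName": "status", "KeyType": "HASH"},
                    {"AttributeName": "order_date", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": 5,
                    "WriteCapacityUnits": 5,
                },
            }
        ],
        "BillingMode": "PROVISIONED",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": 10,
            "WriteCapacityUnits": 10,
        },
        # Capture item changes in a DynamoDB stream.
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
    }
```

With boto3 installed and credentials configured, this dictionary could be passed as `boto3.client("dynamodb").create_table(**build_table_definition())`. I have left that call out so the sketch runs without an AWS account.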
Serverless: You don't manage any servers, except when using DynamoDB Accelerator (DAX).
DynamoDB Accelerator (DAX) is server-based: you need to specify the number and type of the servers to manage its capacity.
There are two modes regarding how the table capacity scales to the load:
- On-Demand: You don't have to plan for the needed capacity; the service automatically adapts to your workload.
- Provisioned: You need to specify the read and write capacity units you need, or the allowed range if the auto scaling option is enabled.
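For the provisioned mode, a rough sizing calculation can help. The sketch below is based on the capacity unit definitions in the AWS documentation as I understand them: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half of that), and one write capacity unit covers one write per second of an item up to 1 KB. The function names are my own, and transactional requests are not covered.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read unit covers up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one write unit covers up to 1 KB


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the read capacity units needed for a steady read rate."""
    # Item size is rounded up to the next 4 KB boundary.
    units_per_read = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads consume half the units.
        total = total / 2
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate the write capacity units needed for a steady write rate."""
    # Item size is rounded up to the next 1 KB boundary.
    units_per_write = max(1, math.ceil(item_size_bytes / WRITE_UNIT_BYTES))
    return units_per_write * writes_per_second
```

For example, 100 strongly consistent reads per second of 6,000-byte items need 200 read capacity units, or 100 if eventual consistency is acceptable. This does not account for the extra writes caused by indexes.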
- Two types of backups are available per table:
- On-demand backup: you can create a full, consistent backup of a single table and keep it for as long as needed. Taking the backup has no impact on performance. This type of backup is preserved when the table is deleted.
- Point-in-time recovery, which is managed automatically once enabled. It allows restoring the table content to its state at any point in time within the last 35 days. This backup is lost when the table is deleted.
- By default, the data is replicated across multiple Availability Zones (AZs) in the same region. The global tables option can be used to replicate data across multiple regions.
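The 35-day window of point-in-time recovery can be checked before asking for a restore. This is a minimal sketch: the helper name and the table names are hypothetical, and the returned dictionary uses the parameter names of the boto3 `restore_table_to_point_in_time` call as I understand them. The service reports the exact earliest and latest restorable times itself, so this local check is only an approximation.

```python
from datetime import datetime, timedelta, timezone

PITR_WINDOW = timedelta(days=35)  # retention window mentioned above


def build_restore_request(source_table, target_table, restore_to, now=None):
    """Validate a restore time and return restore request parameters."""
    now = now or datetime.now(timezone.utc)
    earliest = now - PITR_WINDOW
    if not (earliest <= restore_to <= now):
        raise ValueError(
            f"restore time must be between {earliest.isoformat()} "
            f"and {now.isoformat()}"
        )
    # The restore always goes to a new table, not over the source table.
    return {
        "SourceTableName": source_table,
        "TargetTableName": target_table,
        "RestoreDateTime": restore_to,
    }
```

A call such as `build_restore_request("Orders", "Orders-restored", some_time)` returns the parameters only; sending them to AWS is left out so the sketch runs offline.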
- AWS Lambda: AWS Lambda functions can be called from triggers on a DynamoDB stream.
- Amazon Redshift: Load data directly from a DynamoDB table into an Amazon Redshift table.
- Amazon EMR: Query the data directly from DynamoDB tables using external Hive tables.
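As an illustration of the Lambda integration, here is a minimal sketch of a function that could be triggered by a DynamoDB stream. It assumes the usual stream event shape, a `Records` list where each record has an `eventName` (INSERT, MODIFY or REMOVE) and a `dynamodb.Keys` map in DynamoDB JSON. The handler name and the returned summary are my own choices, not something the service requires.

```python
def handler(event, context):
    """Count stream records by change type and collect the changed keys."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    changed_keys = []
    for record in event.get("Records", []):
        event_name = record.get("eventName")
        if event_name in counts:
            counts[event_name] += 1
        # Keys arrive in DynamoDB JSON, e.g. {"customer_id": {"S": "c-1"}}.
        changed_keys.append(record.get("dynamodb", {}).get("Keys", {}))
    return {"counts": counts, "keys": changed_keys}


# A hand-written sample event (hypothetical data) for local testing.
SAMPLE_EVENT = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {"Keys": {"customer_id": {"S": "c-1"}}},
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"customer_id": {"S": "c-2"}}},
        },
    ]
}
```

Calling `handler(SAMPLE_EVENT, None)` locally returns the counts and keys for the two sample records, which is a convenient way to test the logic before attaching the function to a real stream.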
- Data encryption at rest
- Connections are TLS encrypted
- Requests are authorized against IAM policies.
The service is priced based on:
- Read and write requests - per million request units (on-demand mode) or per provisioned capacity unit per hour (provisioned mode)
- Storage - per GB-Month
- Backup - per GB-Month
- Restoration - per restored GB
- DAX - per node-hour, based on the number of nodes and their instance type
- Streams - per 100,000 read request units
- Data transfer in is free
- Data transfer out is per GB