icon / menu / white V2Created with Sketch.
Switch LanguageSwitch Language
English
AWS Athena with DynamoDB as Data sources in 5 minutes

AWS Athena with DynamoDB as Data sources in 5 minutes

DynamoDB is a NoSQL database offered by AWS. It’s designed for low latency and high availability, making it an optimal choice for applications that demand seamless scaling. With DynamoDB, AWS provides a feature-rich solution that equips you to handle growing workloads effortlessly. 

Athena is a robust data analytics tool within the AWS ecosystem. It empowers users with the ability to query a variety of data sources using familiar SQL language. Whether you’re diving into business intelligence tasks or other analytical endeavours, Athena simplifies the process and enables you to derive meaningful insights from your data. 

Connecting Amazon Athena to Amazon DynamoDB offers a powerful way to analyze data stored in DynamoDB using SQL queries. This integration can be valuable in various real-world scenarios: 

  1. Analyzing Large Datasets: If your DynamoDB tables store large volumes of data, running complex queries directly on the DynamoDB tables might be slow and resource-intensive. With Amazon Athena, you can run queries using SQL, which can be faster and more efficient for analyzing large datasets.
  2. Business Intelligence and Reporting: Athena can help you generate reports and perform business intelligence tasks on the data in DynamoDB. You can use SQL queries to aggregate, filter, and transform data to gain insights into customer behaviour, sales trends, or operational metrics. 

There are many more you can do with both tools, you get the idea. In order to connect DynamoDB as Athena’s data source you need to create a lambda function to be a connector between both tools but don’t worry AWS Console is pretty much telling you what to do. 

Let’s get into the hands-on 

Prerequisite 

  • AWS Account(For experimental purposes, use an Admin account of your own) 
    DynamoDB Table with the dataset that you want to play with(You can use this : https://github.com/pacozaa/DynamoDB-Coffee-Data-Generator) 
  • Cost should be FREE but you can set the AWS budget and check AWS Cost Explorer for your peace of mind. 
  • Basic Lambda experience is a plus 
  • Basic Athena experience is a plus 
  • Basic S3 experience is a plus 

Athena Part 1

  1. Go to the Athena console then open the left-hand side menu and click Data sources 

step2-data-sources-view

2. Click on Create data source 

3. Choose Amazon DynamoDB as the data source 

step3-select-dynamodb

4. Fill in the form — Add the Data source name that you’d like. 

step4-data-sources-detail

5. Scroll down and you will see Connection details This is where you choose or create the Lambda function you’d like to use as a connector to your DynamoDB. 

In case you don’t have it click the Create Lambda function This will open the AWS Lambda page with the Application configuration that is almost ready for you to deploy 

step5-data-sources-detail-lambda

Lambda Function 

  1. AWS will bring you to the Lambda> Application creation page with the default setting for AthenaDynamoDBConnector 

step6-go-to-lambda

  1. Scroll down and you will see the form to fill in the Application name and SpillBucket
  • Application name -> Put in the custom name that you’d like 
  • SpillBucket -> Put in the bucket name you’d like to use as SpillBucket 

step7-lambda-configstep11-lambda-config-5

3. Next ConnectorConfig

  • AthenaCatalogName->Put in a custom name that you’d like as well
  • SpillPrefix->Put in the prefix that will be used inside the bucket 

step8-lambda-config-2step10-lambda-config-4

4. Click Deploy 

step9-lambda-config-3

5. After deployment you can check the result at the Lambda Dashboard, Applications, Functions 

step13-check-lambda-dashboard

Athena Part 2 

  1. Now we can switch back or go back to AWS Athena that we paused from Part 1 

    step12-back-to-data-sources
  2. Click Next to finish creating the data source. Your data source should have Associated databases with the database you want as the data source

step14-check-data-sources-associate-table

step15-data-sources-page

Athena Query Editor 

  1. Go to Athena Query Editor
  2. Select the DynamoDB data source you just created in the Data sourcedropdown and the database you want which in this case should be default 
  3. Try to run interactive SQL editor with this query 

SELECT count(*) FROM “table_name”; 

This will count how many rows your database has(In this case my table is named coffeemagic) 

Run the query 

step16-query-1

4. You will see this error If you haven’t created S3 bucket with the same name yet. Go to S3 and create it. 

step17-handle-surprised-error-redacted_dot_app-redacted_dot_app

5. You can check bucket name in Lambda>Functions>[Your function name for DynamoDB connector] and go to tab Configuration>Environment variables>spill_bucket 

step18-go-to-function

step19-check-lambda-config-bucket-name

6. Run the Query again, this time you should see the desired result 

step21-run-the-query-again

Note: If you have a permission issue please check your IAM user policy or Lambda>Configuration>Permissions 

step20-lambda-config

Conclusion 
There you have it. Let’s have fun with DynamoDB with SQL in Athena Query Editor.
Elevate your data analytics game, streamline operations, and make informed, real-time decisions.

Contact us now to revolutionize your data infrastructure and propel your business towards unparalleled efficiency and success!

 

CONTACT US

Related articles

The Rise of Hybrid Cloud: Balancing Security, Flexibility and Cost Optimization
4 mins
Tech trends
The Rise of Hybrid Cloud: Balancing Security, Flexibility and Cost Optimization
AI Use Case: Intelligent Document Processing and Summarization
1 mins
Tech trends
AI Use Case: Intelligent Document Processing and Summarization
AI Use Case: Automating Insurance Claim Processing from End to End
2 mins
Tech trends
AI Use Case: Automating Insurance Claim Processing from End to End

Button / CloseCreated with Sketch.