How to Read CSV File from S3 Bucket in AWS Lambda: Complete Step by Step Guide
By Braincuber Team
Published on March 21, 2026
Reading CSV files from S3 buckets in AWS Lambda is a common requirement for data processing pipelines and serverless applications. This comprehensive guide will teach you two powerful methods to accomplish this task: using the Requests library for HTTP-based access and the Boto3 client for direct AWS SDK integration. You'll learn how to set up proper IAM permissions, handle errors effectively, and process CSV data efficiently in your Lambda functions.
What You'll Learn:
- Create Lambda execution roles with S3 read permissions
- Read CSV files using Requests library with HTTP GET requests
- Implement Boto3 client get_object() method for direct S3 access
- Handle CSV parsing with Python csv module and Pandas
- Troubleshoot common errors and dependency issues
Understanding AWS Lambda and S3 Integration
AWS Lambda is a serverless compute service that runs your code in response to events and automatically manages the underlying compute resources. When combined with Amazon S3's scalable object storage, you can create powerful data processing workflows without managing servers. The key to successful integration lies in proper permissions configuration and choosing the right method for accessing your CSV files.
Step 1: Create Lambda Execution Role with S3 Read Permissions
For the Lambda service to read files from the S3 bucket, you need to create a Lambda execution role that has S3 read permissions. This role acts as the identity that your Lambda function assumes when executing.
Navigate to IAM Console
Sign in to the AWS Console and navigate to the Identity and Access Management (IAM) console. Click Roles and then click Create role.
Select Trusted Entity
Choose AWS service as a Trusted Entity and Lambda as the Use Case.
Add S3 Permissions
Add the AmazonS3ReadOnlyAccess policy for read-only S3 access.
Create and Name Role
Give the role a name and description, review the attached policies, and click Create role.
Attach Role to Lambda
Go to your Lambda function, click Edit in the Execution role section, choose the IAM role you created, and save to attach it.
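If you prefer least-privilege access over the broad AmazonS3ReadOnlyAccess managed policy, you can attach a custom policy scoped to a single bucket instead. A minimal sketch, assuming a bucket named my-data-bucket (replace with your own bucket name):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::my-data-bucket/*"
    }
  ]
}
```

Add `s3:ListBucket` on the bucket ARN itself (without the `/*`) if your function also needs to enumerate objects.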
Step 2: Read CSV File Using Requests Library
The Requests library is a popular Python module for making HTTP requests and interacting with web services. It simplifies the process of sending HTTP requests, handling responses, and managing various aspects of web communication. You can use it to send a GET request to your S3 object URL and read the CSV data.
HTTP GET Request
Send GET request to S3 object URL and receive response with status code 200 on success.
CSV Parsing
Parse response.text using csv.reader to iterate through CSV rows efficiently.
```python
import requests
import csv

# URL of the CSV file
url = "https://mrcloudgurudemo.s3.us-east-2.amazonaws.com/sample_csv.csv"

try:
    # Send an HTTP GET request to the URL
    response = requests.get(url)

    # Check if the request was successful (HTTP status code 200)
    if response.status_code == 200:
        # Parse the CSV data from the response content
        csv_data = response.text

        # Use csv.reader to read the data
        reader = csv.reader(csv_data.splitlines())

        # Iterate through the rows in the CSV
        for row in reader:
            # Process each row as needed
            print(row)
    else:
        print(f"Failed to fetch data. Status code: {response.status_code}")
except requests.exceptions.RequestException as e:
    print(f"Error: {e}")
```
| Expected Output |
|---|
| ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'] |
| ['5.1', '3.5', '1.4', '0.2', 'Iris-setosa'] |
| ['4.9', '3', '1.4', '0.2', 'Iris-setosa'] |
| ['4.7', '3.2', '1.3', '0.2', 'Iris-setosa'] |
Dependency Issue Warning
If you're using requests version 2.30.0 or later in AWS Lambda, you might encounter a "cannot import name 'DEFAULT_CIPHERS' from 'urllib3.util.ssl_'" error, caused by urllib3 2.x removing that constant. One fix is to downgrade requests to version 2.29.0 (or pin urllib3 below 2.0).
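One way to apply the fix when building your Lambda deployment package is to pin versions in requirements.txt; a sketch (either pin alone is usually sufficient, since the error originates in urllib3 2.x):

```text
requests==2.29.0
urllib3<2
```

Install into your package directory with `pip install -r requirements.txt -t ./package` before zipping.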
Step 3: Read CSV File Using Boto3 Client get_object() Method
The Boto3 library is the AWS SDK for Python, providing an object-oriented API as well as low-level access to AWS services. Boto3 makes it easy to create, configure, and manage AWS resources from your Python applications. The get_object() method provides direct access to S3 objects without going through HTTP URLs.
Direct S3 Access
Use Boto3 client for secure, direct access to S3 objects without public URLs.
Pandas Integration
Seamlessly convert CSV data to Pandas DataFrame for advanced data analysis.
```python
import boto3
import pandas as pd
import io

def lambda_handler(event, context):
    # Initialize the S3 client
    s3 = boto3.client('s3')

    # Specify the S3 bucket and object key of the CSV file
    bucket_name = 'mrcloudgurudemo'
    file_key = 'sample_csv.csv'

    try:
        # Read the CSV file from S3
        response = s3.get_object(Bucket=bucket_name, Key=file_key)
        csv_content = response['Body'].read().decode('utf-8')

        # Create a Pandas DataFrame
        df = pd.read_csv(io.StringIO(csv_content))

        # Now you have your DataFrame (df) for further processing
        # Example: print the first 5 rows
        print(df.head(5))

        return {
            'statusCode': 200,
            'body': 'File read successfully into DataFrame.'
        }
    except Exception as e:
        return {
            'statusCode': 500,
            'body': str(e)
        }
```
| Pandas DataFrame Output |
|---|
| sepal_length sepal_width petal_length petal_width species |
| 0 5.1 3.5 1.4 0.2 Iris-setosa |
| 1 4.9 3.0 1.4 0.2 Iris-setosa |
| 2 4.7 3.2 1.3 0.2 Iris-setosa |
| 3 4.6 3.1 1.5 0.2 Iris-setosa |
Credentials Configuration
To interact with AWS services using Boto3, you must provide security credentials. Inside Lambda, the execution role you attached in Step 1 supplies them automatically; for local development and testing, configure them with the aws configure command. Missing credentials cause "Unable to locate credentials" errors.
Step 4: Compare Methods and Choose Best Approach
Both methods have their advantages and use cases. Understanding the differences will help you choose the right approach for your specific requirements.
| Method | Best For | Security | Performance |
|---|---|---|---|
| Requests Library | Public S3 objects, simple use cases | Lower (requires public access) | Good for small files |
| Boto3 Client | Private S3 objects, enterprise use | Higher (IAM-based access) | Better for large files |
- Security: Boto3 uses IAM roles for secure access without exposing credentials
- Scalability: Boto3 handles large files better with streaming capabilities
- Dependencies: Requests may require version downgrading for Lambda compatibility
Frequently Asked Questions
How do I handle large CSV files in AWS Lambda?
Use Boto3's streaming capabilities with response['Body'] and process data in chunks. Consider using Lambda's ephemeral /tmp storage (512 MB by default) or S3 for intermediate results.
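The streaming idea can be sketched without AWS access by treating any binary file-like object the way you would treat response['Body'] from get_object(); the sample data here is a stand-in for a real S3 object. Wrapping the stream in io.TextIOWrapper lets csv.reader consume it line by line instead of loading the whole file into memory (with recent botocore versions response['Body'] can be wrapped the same way; older versions expose iter_lines() instead):

```python
import csv
import io

def process_csv_stream(body, encoding="utf-8"):
    """Iterate over CSV rows from a binary file-like object without
    reading the whole file into memory."""
    text_stream = io.TextIOWrapper(body, encoding=encoding)
    for row in csv.reader(text_stream):
        yield row

# Simulated S3 body; with boto3 you would pass response['Body'] instead.
fake_body = io.BytesIO(b"sepal_length,species\n5.1,Iris-setosa\n4.9,Iris-setosa\n")
rows = list(process_csv_stream(fake_body))
print(rows[0])   # → ['sepal_length', 'species']
print(len(rows)) # → 3
```

Because the function is a generator, rows are produced one at a time, so peak memory stays roughly constant regardless of file size.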
What permissions does my Lambda function need to read from S3?
Attach AmazonS3ReadOnlyAccess policy or create custom IAM policy with s3:GetObject permission for specific buckets and objects.
Can I read CSV files from private S3 buckets?
Yes, use Boto3 client with proper IAM roles. Requests library requires public access or presigned URLs for private buckets.
How do I debug CSV parsing errors in Lambda?
Enable CloudWatch logging, add try-catch blocks around CSV parsing, and log the raw content to identify encoding or formatting issues.
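A sketch of that pattern (the log messages and the 200-character truncation are illustrative choices, not part of any AWS API; in Lambda, anything written via logging appears in CloudWatch automatically):

```python
import csv
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def parse_csv_safely(raw_text):
    """Parse CSV text, logging a snippet of the raw content on failure
    so the CloudWatch log shows what the function actually received."""
    try:
        return list(csv.reader(raw_text.splitlines()))
    except csv.Error as exc:
        logger.error("CSV parsing failed: %s", exc)
        logger.error("Raw content (first 200 chars): %r", raw_text[:200])
        raise

rows = parse_csv_safely("a,b\n1,2\n")
print(rows)  # → [['a', 'b'], ['1', '2']]
```

Re-raising after logging keeps the Lambda invocation marked as failed, so retries and dead-letter queues still work as configured.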
What's the best way to process CSV data after reading?
Use Pandas DataFrames for data manipulation and analysis, or process row-by-row with csv.reader for memory efficiency with large files.
Need Help with AWS Lambda Development?
Our experts can help you build serverless applications, optimize Lambda functions, and implement robust data processing pipelines.
