What is AWS?
AWS is a cloud service provider that offers storage, computing resources, mobile application services, databases and other cloud computing functionality. But what is cloud computing? Cloud computing gives a company access to large volumes of information about its activity, and the means to work with it, without having to invest in its own infrastructure. The savings in hardware, software, security, support and maintenance can be significant.
One of its core components is AWS S3, the object storage service offered by AWS. Its high availability and durability have made it a benchmark for storing video, images and data. You can combine S3 with other services to build highly scalable applications.
Boto3 is the library that lets you interact with Amazon Web Services from Python. Here is how the official Amazon reference describes its SDK (software development kit):
The AWS SDK for Python (Boto3) provides a Python API for AWS infrastructure services. Using the Python SDK, you can build applications on top of Amazon S3, Amazon EC2, Amazon DynamoDB, and more.
This is just a brief overview of AWS and its Python SDK for creating, updating and deleting AWS resources directly from your Python scripts. Today I’m going to explain the basics of the AWS SDK with a focus on S3 bucket management:
- Installation
- Configuration
- Client vs. Resource (Connection type).
- Common operations.
- File management.
Installation
Install the latest version of Boto3 with pip by running pip install boto3.
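Once pip has finished, a quick way to confirm that the package is visible to your interpreter is to ask the import system for it. This is only a small sketch; the helper name is_installed is made up for this article and is not part of Boto3.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the named top-level package can be found by the import system."""
    # Hypothetical helper for illustration; it only looks the package up,
    # it does not import or run it.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Expected to report True after `pip install boto3` has succeeded
    # in the environment that runs this script.
    print("boto3 installed:", is_installed("boto3"))
```

If the check reports False, pip most likely installed the package into a different Python environment than the one running the script.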
Configuration
Before you can start using Boto3, you must set up the authentication credentials. If you already have an IAM user with full access to S3, you can use that user's credentials (the access key and the secret access key) without having to create a new user. If not, you can create a new user with only the AmazonS3FullAccess policy attached and then store the new credentials.
Once the user has been created, the final screen displays the credentials generated for it. Click on the Download .csv button to save a copy of the credentials.
Now that you have your new user, you need to create a new file, ~/.aws/credentials.
Open the file and paste the credentials you just downloaded with the following structure, filling in the placeholders:
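For reference, this is the standard INI layout that the AWS SDKs and the AWS CLI read from ~/.aws/credentials. The two upper-case values are placeholders, to be replaced with the keys from the CSV file you downloaded:

```ini
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
```

The [default] header names the profile that Boto3 picks up when no other profile is requested.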
Save the file.
Now that you have set up these credentials, you have a default profile that Boto3 will use to interact with your AWS account.
There is one more setting to configure: the default region (AWS service endpoint) that Boto3 should interact with. Choose your preferred region, e.g. eu-west-1 (Ireland).
Create a new file, ~/.aws/config and fill it with this structure.
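A minimal version of that file, assuming the eu-west-1 region chosen above, looks like this:

```ini
[default]
region = eu-west-1
```

Replace eu-west-1 with the code of whichever region you prefer.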
Save the file. Now we are ready for the rest of this tutorial.
Below, you will see the different options Boto3 gives you to connect to S3 and other AWS services.
Client vs Resource
Essentially, all Boto3 does is call the AWS API on your behalf. For most AWS services, Boto3 offers two different ways to access these abstracted APIs:
- Client: low-level service access.
- Resource: higher-level, object-oriented service access.
You can use either to interact with S3.
To connect to the low-level client interface, call Boto3’s client() function with the name of the service you want to connect to, in this case "s3".
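A minimal sketch of that call, assuming the default profile and region set up in the Configuration section. The wrapper name get_s3_client is only for illustration; the Boto3 call itself is boto3.client("s3").

```python
def get_s3_client():
    """Return a low-level S3 client that uses the default profile and region."""
    # Imported inside the function so that this sketch can be loaded even
    # where Boto3 has not been installed yet. The wrapper name is hypothetical.
    import boto3

    return boto3.client("s3")
```

Creating the client does not send a request to S3; the first request goes out when you call one of its methods, for example s3_client.list_buckets().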
To connect to the high-level interface, follow the same approach, but use resource() instead.
After successfully connecting with both versions, you may be asking yourself: “Which one should I use?”
With clients, there is more programming work to be done. Most client operations return a response in the form of a dictionary. To get the exact information you need, you will have to parse that dictionary yourself. With the resource methods, the SDK does that work for you.
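The resource version can be sketched in the same way, again assuming the default profile and region. The wrapper name get_s3_resource is made up for this example; the Boto3 call is boto3.resource("s3").

```python
def get_s3_resource():
    """Return a high-level S3 resource that uses the default profile and region."""
    # Imported inside the function so that this sketch can be loaded even
    # where Boto3 has not been installed yet. The wrapper name is hypothetical.
    import boto3

    return boto3.resource("s3")
```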
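To make the difference concrete, here is a small sketch that lists bucket names both ways. The two function names are hypothetical; list_buckets() and buckets.all() are the Boto3 calls being compared, and the client or resource is passed in as an argument.

```python
def bucket_names_with_client(s3_client):
    """Extract bucket names from the dictionary returned by list_buckets()."""
    response = s3_client.list_buckets()
    # The client hands back a plain dictionary, so we dig the names out ourselves.
    return [bucket["Name"] for bucket in response["Buckets"]]


def bucket_names_with_resource(s3_resource):
    """Read bucket names from the Bucket objects that the resource yields."""
    # The resource has already turned the response into objects with attributes.
    return [bucket.name for bucket in s3_resource.buckets.all()]
```

Both functions return the same list; the resource version simply hides the dictionary handling.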
With the client, you may see some slight performance improvements. The downside is that your code becomes less readable than it would be if you were using the resource. Resources offer better abstraction, and your code will be easier to understand.
Understanding how the client and the resource are generated is also important when considering which to choose:
- Boto3 generates the client from a JSON service definition file. Client methods support all types of interaction with the target AWS service.
- Resources, on the other hand, are generated from JSON resource definition files.
Boto3 generates the client and the resource from different definitions. As a result, you may find cases where an operation supported by the client is not offered by the resource. Here’s the interesting part: you don’t need to change your code to use the client everywhere. For that operation, you can access the client directly through the resource as follows: s3_resource.meta.client.
One such client operation is .generate_presigned_url(), which allows you to give your users access to an object within your bucket for a set period of time, without requiring them to have AWS credentials.
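As a sketch of how this might look in practice, the function below reaches the client through the resource and asks it for a temporary download link. The function name and its default of one hour are choices made for this example; generate_presigned_url() and meta.client are the Boto3 pieces mentioned above.

```python
def presigned_download_url(s3_resource, bucket_name, key, expires_in=3600):
    """Return a time-limited URL for downloading one object."""
    # The resource exposes its underlying low-level client as .meta.client.
    s3_client = s3_resource.meta.client
    return s3_client.generate_presigned_url(
        "get_object",  # the client operation the URL will perform
        Params={"Bucket": bucket_name, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )
```

Anyone who holds the returned URL can download the object until it expires, so treat it like a temporary password.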
Common operations
This section collects code snippets for some common operations on AWS S3:
Create a bucket:
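One possible way to do it with the client is sketched below. The function name is ours, and the sketch assumes the client was created for the same region in which the bucket should live. Keep in mind that bucket names must be unique across all of AWS, not just within your account.

```python
def create_bucket(s3_client, bucket_name, region):
    """Create a bucket in the given region and return the raw API response."""
    if region == "us-east-1":
        # us-east-1 is the default location and must not be sent as a constraint.
        return s3_client.create_bucket(Bucket=bucket_name)
    # Every other region has to be named explicitly.
    return s3_client.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={"LocationConstraint": region},
    )
```

With the configuration used in this article, a call could look like create_bucket(s3_client, "my-example-bucket", "eu-west-1"), where the bucket name is a placeholder.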
Delete a bucket with files:
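S3 refuses to delete a bucket that still contains objects, so the bucket has to be emptied first. A minimal sketch using the resource interface follows; the function name is hypothetical, and the extra step for object versions only matters if versioning was ever enabled on the bucket.

```python
def delete_bucket_with_contents(s3_resource, bucket_name):
    """Empty a bucket and then delete it."""
    bucket = s3_resource.Bucket(bucket_name)
    # Remove the current objects first; a non-empty bucket cannot be deleted.
    bucket.objects.all().delete()
    # If versioning was ever enabled, older versions and delete markers
    # are still stored and have to be removed as well.
    bucket.object_versions.all().delete()
    bucket.delete()
```

This is irreversible, so double-check the bucket name before running it.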
Upload files:
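A simple sketch built on the client's upload_file() method is shown below. The wrapper and its rule of falling back to the file's base name as the object key are assumptions made for this example, not Boto3 behaviour.

```python
import os


def upload_file(s3_client, file_path, bucket_name, key=None):
    """Upload a local file and return the object key it was stored under."""
    # Fall back to the file's base name when no key is given (our own convention).
    object_key = key if key is not None else os.path.basename(file_path)
    s3_client.upload_file(file_path, bucket_name, object_key)
    return object_key
```

The resource interface offers equivalent methods, for example s3_resource.Bucket(bucket_name).upload_file(file_path, object_key).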
Conclusion
I hope this article makes it easy for you to get started with the Boto3 Python API, make your first connection to the AWS cloud computing infrastructure and take advantage of its storage system. Here at Bounsel, we use Boto3 to connect to the AWS infrastructure and make the most of its capabilities. In particular, we use AWS S3 buckets for mass storage of legal documents, and AWS cloud computing resources (such as Cloud9) to train Artificial Intelligence models quickly and bring you features such as named entity recognition.