AWS Certified Solutions Architect - Associate Exam Notes > Fundamentals > Amazon S3

Amazon S3

Amazon S3 is one of the main building blocks of AWS being advertised as “infinitely scaling” storage

S3 Buckets

Amazon S3 allows to store objects (files) in “buckets”
Buckets must have a globally unique name
Buckets are defined at the region level
Bucket naming conventions:
- No uppercase
- No underscore
- 3-63 character long names
- Name must no be an IP
- Must start with a lowercase letter or number

S3 Objects

In a bucket an object is a file
Objects do have a key as an identifier, which is basically the full path of the file:
- s3://my-bucket/my_file.txt
- s3://my-bucket/my_folder/another_folder/my_file.txt
The key is composed of the prefix + object name
- s3://my-bucket/my_folder/another_folder/my_file.txt
In S3 there is no concept of directories within the buckets, just keys with very long names that contain slashes ("/")
Object values are the content of the body:
- Max object size in S3 is 5TB
- In case of a upload bigger than 5GB, we must use multi-part upload
Each object can have metadata: list of text key/value pairs - system or user added
Each object can have tags: unicode key/value par, useful for security, lifecycle. A bucket can have up to 10 tags
If versioning is enabled each object has a version ID

S3 Versioning

Files can have a version in S3, this should be enabled at the bucket level
If a files is uploaded with the same key (same filename) the version of the file will be changed, the existing file wont be overridden, we will have both files available with different versions
It is best practice to version the files, because:
- The files will be protected against unintended deletes
- The files can be rolled back to previous versions
Notes:
- Any file that is not versioned prior to enabling versioning will have the version “null”
- Suspending versioning does not delete the previous versions of the file
Deleting versioned files:
- When deleting a versioned file adds a delete marker to the file, but the file wont be deleted
- The file can be restored by deleting the delete marker
- Deleting the delete marker and the file together is a permanent delete, meaning the file wont be able to be restored

S3 Security

Encryption at Rest

AWS provides 4 methods of encryption for objects in S3
1. SSE-S3: encrypts S3 objects using keys handled and managed by AWS
2. SSE-KMS: leverage AWS Key Management Service to manage encryption keys
3. SSE-C: the encryption keys are managed by the user
4. Client Side Encryption
SSE-S3:
- Keys used for encryption is managed by Amazon S3
- Objects are encrypted in the server side
- It uses AES-256 encryption
- In order to have SSE-S3 encryption clients must set a header, which is x-amz-server-side-encryption": “AES256”
SSE-KMS:
- Encryption keys are handled and managed by KMS
- KMS allows the manage which keys will be used for the encryption, moreover it has audit trails in order to be able to see who was using the KMS key
- Objects are encrypted in the server-side
- In order to have SSE-S3 encryption clients must set a header, which is x-amz-server-side-encryption": “aws:kms”
SSE-C:
- Server side encryption using data keys provided by the user from the outside of AWS
- Amazon S3 does not store the encryption key provided by the user
- Data should be transmitted through HTTPS, because a key is sent the AWS
- Encryption key must be provided in the header of the request for every request
- When retrieving the object, the same encryption key must be provided in the header
Client Side Encryption:
- Data must be encrypted before sending it to S3
- This is usually accomplished by using a third party encryption library, like Amazon S3 Encryption Client
- The user is solely responsible for encryption-decryption
- The keys and the encryption cycle is managed by the user

Encryption in transit (SSL/TLS)

Amazon S3 exposes:
- HTTP endpoint for non-encrypted data
- HTTPS endpoint for encryption in flight which relies on SSL/TLS
Most clients for S3 will use HTTPS by default
In case of SSE-C encryption HTTPS in mandatory

S3 Security Overview

User based security:
- Accomplished by using IAM policies: specified which calls should be allowed for a specified sued from IAM console
Resource based security:
- Accomplished by using bucket policies which are bucket wide rules from the S3 console. These rules may allow cross account access to the bucket
- We also have Object Access Control Lists (ACL) and Bucket Access Control Lists which allow finer grain control over the bucket Note: an IAM principle can access an S3 object if:
- The user IAM permission allows it or the resource policy allows it
- The is no explicit deny

S3 Bucket Policies

Bucket policies are JSON based documents
They can be applied to both buckets and objects in buckets
The effect of a statement in the bucket policy can be either allow or deny
The principle in the policy represents the account or the user for which the policy applies to
Common use cases for S3 bucket policies:
- Grant public access to the bucket
- Force objects to be encrypted at the upload
- Grant access to another account (cross account access)

Bucket Settings for Block Public Access

Relatively new settings that was created to block public access to buckets and objects if the account has some restrictions:
S3 provides 4 different kind of block public access settings:
- new access control lists
- any access control lists
- new public bucket or access point policies
- block public and cross-account access to buckets and objects through any public bucket or access point policies
These settings were created to prevent company data leaks

S3 Other Security Features

Networking:
- S3 supports VPC Endpoints (for instances in VPC without internet access)
Logging and Audit:
- S3 Access Logs can be stored in other S3 buckets
- API calls can be logged in AWS CloudTrail
User Security:
- MFA Delete can be required in versioned buckets in order to protect for accidental deletions
- Pre-Signed URLs: ULRs that are valid only for a limited time

S3 Websites

S3 can host static websites and have them accessible from the internet
The website URL will be something like this:
- <bucket-name>.s3-website-<AWS-region>.amazonaws.com
- <bucket-name>.s3-website.<AWS-region>.amazonaws.com
In case of 403 errors we have to make sure that the bucket policy allows public reads

CORS

An origin is a scheme (protocol), host (domain) or port
CORS: Cross Origin Resource Sharing
CORS is a web browser based mechanism to allow requests to other origins while visiting the main one
Same origin example: http://example.com/app1 and http://example.com/app2
Different origins: http://example.com and http://otherexample.com
The request wont be fulfilled unless the other origin allows for the request, using CORS headers (example: Access-Control-Allow-Origin, Access-Control-Allow-Method)

S3 CORS

If a client does a cross-origin request on an S3 bucket, the correct CORS headers need to be enabled in order for the request to succeed
Request can be allowed for a specified origin (by specifying the URL of the origin) or for all origins (by using *)

S3 Consistency Model

S3 is strong read-after-write consistent