Amazon S3

Amazon S3

  • Amazon S3 is one of the main building blocks of AWS, advertised as “infinitely scaling” storage

S3 Buckets

  • Amazon S3 allows users to store objects (files) in “buckets”
  • Buckets must have a globally unique name
  • Buckets are defined at the region level
  • Bucket naming conventions:
    • No uppercase
    • No underscore
    • 3-63 character long names
    • Name must not be an IP address
    • Must start with a lowercase letter or number
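
A minimal sketch of creating a bucket with boto3 (the Python SDK); the bucket name and region below are placeholders, and the location constraint has to be dropped when creating a bucket in us-east-1:

```python
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")

# Bucket names are globally unique: lowercase, no underscores, 3-63 chars,
# and must not look like an IP address.
s3.create_bucket(
    Bucket="my-globally-unique-bucket-name",  # placeholder name
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)
```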

S3 Objects

  • Within a bucket, an object is a file
  • Each object has a key as an identifier, which is basically the full path of the file:
    • s3://my-bucket/my_file.txt
    • s3://my-bucket/my_folder/another_folder/my_file.txt
  • The key is composed of the prefix + object name
    • s3://my-bucket/my_folder/another_folder/my_file.txt
  • In S3 there is no concept of directories within the buckets, just keys with very long names that contain slashes ("/")
  • Object values are the content of the body:
    • Max object size in S3 is 5TB
    • In case of an upload bigger than 5 GB, we must use multi-part upload (see the upload sketch after this list)
  • Each object can have metadata: list of text key/value pairs - system or user added
  • Each object can have tags: Unicode key/value pairs, useful for security and lifecycle rules. An object can have up to 10 tags
  • If versioning is enabled each object has a version ID
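
A hedged boto3 sketch of how keys, metadata and tags are set on upload; the bucket and key names are placeholders, and the high-level `upload_file` helper takes care of multi-part uploads for large files:

```python
import boto3

s3 = boto3.client("s3")

# The key "my_folder/another_folder/my_file.txt" is just a long name with
# slashes; S3 has no real directories.
s3.put_object(
    Bucket="my-bucket",
    Key="my_folder/another_folder/my_file.txt",
    Body=b"hello world",
    Metadata={"project": "demo"},          # user-defined metadata
    Tagging="department=finance&env=dev",  # up to 10 tags per object
)

# For big files the high-level helper performs multi-part upload under the
# hood (mandatory for objects larger than 5 GB).
s3.upload_file("local_big_file.bin", "my-bucket", "uploads/big_file.bin")
```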

S3 Versioning

  • Files can be versioned in S3; versioning has to be enabled at the bucket level
  • If a file is uploaded with the same key (same filename), the version of the file will change; the existing file won’t be overridden, and both files will be available with different versions
  • It is best practice to version the files, because:
    • The files will be protected against unintended deletes
    • The files can be rolled back to previous versions
  • Notes:
    • Any file that is not versioned prior to enabling versioning will have the version “null”
    • Suspending versioning does not delete the previous versions of the file
  • Deleting versioned files:
    • Deleting a versioned file adds a delete marker to the file, but the file won’t actually be deleted
    • The file can be restored by deleting the delete marker
    • Deleting the delete marker and the file version together is a permanent delete, meaning the file won’t be able to be restored
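
A short boto3 sketch (placeholder bucket and key names) of enabling versioning, creating a delete marker and removing it again to restore the file:

```python
import boto3

s3 = boto3.client("s3")

# Versioning is enabled at the bucket level
s3.put_bucket_versioning(
    Bucket="my-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)

# A delete without a VersionId only adds a delete marker
s3.delete_object(Bucket="my-bucket", Key="my_file.txt")

# Listing versions shows both the versions and the delete markers
resp = s3.list_object_versions(Bucket="my-bucket", Prefix="my_file.txt")
for marker in resp.get("DeleteMarkers", []):
    # Deleting the delete marker itself restores the file
    s3.delete_object(
        Bucket="my-bucket",
        Key=marker["Key"],
        VersionId=marker["VersionId"],
    )
```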

S3 Security

Encryption at Rest

  • AWS provides 4 methods of encryption for objects in S3
    1. SSE-S3: encrypts S3 objects using keys handled and managed by AWS
    2. SSE-KMS: leverage AWS Key Management Service to manage encryption keys
    3. SSE-C: the encryption keys are managed by the user
    4. Client Side Encryption
  • SSE-S3:
    • Keys used for encryption are handled and managed by Amazon S3
    • Objects are encrypted on the server side
    • It uses AES-256 encryption
    • In order to have SSE-S3 encryption, clients must set the header "x-amz-server-side-encryption": "AES256"
  • SSE-KMS:
    • Encryption keys are handled and managed by KMS
    • KMS allows the user to manage which keys will be used for encryption; moreover, it provides an audit trail to see who used the KMS key
    • Objects are encrypted on the server side
    • In order to have SSE-KMS encryption, clients must set the header "x-amz-server-side-encryption": "aws:kms"
  • SSE-C:
    • Server-side encryption using data keys provided by the user from outside of AWS
    • Amazon S3 does not store the encryption key provided by the user
    • Data must be transmitted over HTTPS, because an encryption key is sent to AWS
    • Encryption key must be provided in the header of the request for every request
    • When retrieving the object, the same encryption key must be provided in the header
  • Client Side Encryption:
    • Data must be encrypted before sending it to S3
    • This is usually accomplished by using a client encryption library, such as the Amazon S3 Encryption Client
    • The user is solely responsible for encryption-decryption
    • The keys and the encryption cycle are managed by the user
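
A boto3 sketch of the three server-side encryption options; the SDK sets the x-amz-server-side-encryption header for us, and the bucket name, KMS key alias and customer key below are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# SSE-S3: sends "x-amz-server-side-encryption": "AES256"
s3.put_object(Bucket="my-bucket", Key="sse-s3.txt", Body=b"data",
              ServerSideEncryption="AES256")

# SSE-KMS: sends "x-amz-server-side-encryption": "aws:kms"
s3.put_object(Bucket="my-bucket", Key="sse-kms.txt", Body=b"data",
              ServerSideEncryption="aws:kms",
              SSEKMSKeyId="alias/my-kms-key")  # placeholder key alias

# SSE-C: the key travels with every request, so HTTPS is mandatory and the
# same key must be supplied again when reading the object
customer_key = b"0" * 32  # placeholder 256-bit key
s3.put_object(Bucket="my-bucket", Key="sse-c.txt", Body=b"data",
              SSECustomerAlgorithm="AES256", SSECustomerKey=customer_key)
s3.get_object(Bucket="my-bucket", Key="sse-c.txt",
              SSECustomerAlgorithm="AES256", SSECustomerKey=customer_key)
```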

Encryption in transit (SSL/TLS)

  • Amazon S3 exposes:
    • HTTP endpoint for non-encrypted data
    • HTTPS endpoint for encryption in flight which relies on SSL/TLS
  • Most clients for S3 will use HTTPS by default
  • In case of SSE-C encryption, HTTPS is mandatory

S3 Security Overview

  • User based security:
    • Accomplished by using IAM policies: they specify which API calls should be allowed for a specific user from the IAM console
  • Resource based security:
    • Accomplished by using bucket policies which are bucket wide rules from the S3 console. These rules may allow cross account access to the bucket
    • We also have Object Access Control Lists (ACL) and Bucket Access Control Lists which allow finer-grained control over the bucket
  • Note: an IAM principal can access an S3 object if:
    • The user’s IAM permissions allow it or the resource policy allows it
    • There is no explicit deny

S3 Bucket Policies

  • Bucket policies are JSON based documents
  • They can be applied to both buckets and objects in buckets
  • The effect of a statement in the bucket policy can be either allow or deny
  • The principal in the policy represents the account or the user to which the policy applies
  • Common use cases for S3 bucket policies:
    • Grant public access to the bucket
    • Force objects to be encrypted at upload
    • Grant access to another account (cross account access)
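
As an illustration of the public-access use case above, a bucket policy can be applied with boto3 roughly as follows (the bucket name is a placeholder):

```python
import json
import boto3

s3 = boto3.client("s3")

# Bucket policies are JSON documents; this one grants public read access
public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",                       # Allow or Deny
            "Principal": "*",                        # who the policy applies to
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket/*",  # placeholder bucket
        }
    ],
}

s3.put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(public_read_policy))
```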

Bucket Settings for Block Public Access

  • Relatively new settings that were created to block public access to buckets and objects
  • S3 provides 4 different kinds of Block Public Access settings:
    • Block public access to buckets and objects granted through new access control lists (ACLs)
    • Block public access to buckets and objects granted through any access control lists (ACLs)
    • Block public access to buckets and objects granted through new public bucket or access point policies
    • Block public and cross-account access to buckets and objects through any public bucket or access point policies
  • These settings were created to prevent company data leaks
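
A sketch of enabling all four Block Public Access settings on a bucket with boto3 (placeholder bucket name):

```python
import boto3

s3 = boto3.client("s3")

s3.put_public_access_block(
    Bucket="my-bucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,        # block new ACLs
        "IgnorePublicAcls": True,       # ignore any existing ACLs
        "BlockPublicPolicy": True,      # block new public bucket policies
        "RestrictPublicBuckets": True,  # restrict any public bucket policies
    },
)
```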

S3 Other Security Features

  • Networking:
    • S3 supports VPC Endpoints (for instances in VPC without internet access)
  • Logging and Audit:
    • S3 Access Logs can be stored in other S3 buckets
    • API calls can be logged in AWS CloudTrail
  • User Security:
    • MFA Delete can be required in versioned buckets in order to protect against accidental deletions
    • Pre-Signed URLs: URLs that are valid only for a limited time
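
A minimal sketch of generating a pre-signed URL with boto3; the bucket, key and expiry below are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# The URL embeds the caller's credentials and is only valid for 1 hour
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "private/report.pdf"},
    ExpiresIn=3600,
)
print(url)
```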

S3 Websites

  • S3 can host static websites and have them accessible from the internet
  • The website URL will be something like this:
    • <bucket-name>.s3-website-<AWS-region>.amazonaws.com
    • <bucket-name>.s3-website.<AWS-region>.amazonaws.com
  • In case of 403 errors we have to make sure that the bucket policy allows public reads
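
A boto3 sketch (placeholder names) of enabling static website hosting on a bucket; the bucket policy still has to allow public reads, otherwise the site returns 403 errors:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_website(
    Bucket="my-bucket",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)
# The site is then served at a URL of the form
# http://my-bucket.s3-website-<AWS-region>.amazonaws.com
```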

CORS

  • An origin is the combination of a scheme (protocol), host (domain) and port
  • CORS: Cross Origin Resource Sharing
  • CORS is a web browser based mechanism to allow requests to other origins while visiting the main one
  • Same origin example: http://example.com/app1 and http://example.com/app2
  • Different origins: http://example.com and http://otherexample.com
  • The request won’t be fulfilled unless the other origin allows for the request, using CORS headers (example: Access-Control-Allow-Origin, Access-Control-Allow-Methods)

S3 CORS

  • If a client does a cross-origin request on an S3 bucket, the correct CORS headers need to be enabled in order for the request to succeed
  • Requests can be allowed for a specific origin (by specifying the URL of the origin) or for all origins (by using *)
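
A hedged boto3 sketch of applying a CORS configuration to a bucket; the allowed origin is a placeholder and could be replaced with "*" to allow all origins:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_cors(
    Bucket="my-bucket",
    CORSConfiguration={
        "CORSRules": [
            {
                "AllowedOrigins": ["http://example.com"],  # or ["*"]
                "AllowedMethods": ["GET", "HEAD"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,
            }
        ]
    },
)
```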

S3 Consistency Model

  • S3 provides strong read-after-write consistency