AWS TL;DR: S3 Fundamentals

AWS S3 Fundamentals

/tldr: Scalable, highly durable object storage for any kind of data.


1. Core Concepts: Buckets and Objects

S3 is **Object Storage**, not block storage (like EBS), a file system (like EFS), or a database. It stores data as discrete items called Objects in a flat namespace rather than in a directory hierarchy.

The Bucket

  • **Global Namespace:** Bucket names must be globally unique across all of AWS (like a website domain name).
  • **Region Specific:** A bucket is created in a specific AWS Region.
  • **Storage Container:** It serves as a container for objects (data files).
Bucket Name Example: my-globally-unique-app-data-2024
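
Because names share one global namespace, it can help to check a candidate name locally before trying to create the bucket. The sketch below covers only a subset of the naming rules for general purpose buckets (length, allowed characters, start and end characters, no adjacent periods, not shaped like an IP address). The function name is hypothetical, and a local check cannot tell whether the name is already taken by another account.

```python
import re

# 3-63 characters; lowercase letters, digits, periods, hyphens;
# must start and end with a letter or digit.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Names formatted like an IPv4 address are rejected.
_IPV4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes a subset of the S3 bucket naming rules."""
    if not _NAME_RE.fullmatch(name):
        return False
    if ".." in name:  # adjacent periods are not allowed
        return False
    if _IPV4_RE.fullmatch(name):
        return False
    return True


print(looks_like_valid_bucket_name("my-globally-unique-app-data-2024"))
print(looks_like_valid_bucket_name("My_Bucket"))
```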
                    

The Object

  • **Key/Value Pair:** Data is stored as an Object (Value) and retrieved via its Key, the object's full name within the bucket. Slashes in a key look like folders but are simply part of the name.
  • **Metadata:** Each object includes system metadata (size, date, owner) and optional user-defined metadata.
  • **Size Limit:** Objects can range from 0 bytes up to 5 TB.
Object Key Example: project-folder/images/user-avatar-123.jpg
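
Since keys are flat strings, "folders" only appear when keys are grouped by a prefix and a delimiter. The following is a small in-memory sketch of that grouping, loosely modelled on how S3 listings separate matching objects from shared prefixes. The function name and the sample keys (other than the one above) are made up for illustration, and no S3 call is involved.

```python
def list_keys(keys, prefix="", delimiter="/"):
    """Group flat keys the way a prefix + delimiter listing would.

    Returns (objects, common_prefixes): keys directly "inside" the prefix,
    and the folder-like prefixes one level below it.
    """
    objects, common = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Everything up to the next delimiter acts like a sub-folder.
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common)


keys = [
    "project-folder/images/user-avatar-123.jpg",
    "project-folder/images/logo.png",
    "project-folder/readme.txt",
    "other/file.bin",
]
print(list_keys(keys, prefix="project-folder/"))
```

With the prefix `project-folder/`, the listing shows one object (`project-folder/readme.txt`) and one folder-like prefix (`project-folder/images/`), even though no folder exists as a separate entity.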
                    

2. Durability and Consistency

Extreme Durability

S3 is designed for eleven nines (99.999999999%) of durability. In most storage classes this is achieved by storing data redundantly across a minimum of three Availability Zones (AZs) within a Region, using techniques like erasure coding. One Zone classes keep data in a single AZ.

  • **Durability vs. Availability:** Durability is the likelihood that stored data won't be lost. Availability (S3 Standard is designed for 99.99%) is the likelihood that the service is accessible and operational when you make a request.
  • **Shared Responsibility:** AWS is responsible for the durability of the underlying platform; you are responsible for securing access (IAM, Bucket Policies).
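
To make these percentages concrete, here is a rough back-of-the-envelope sketch. It assumes the durability figure is read as an average annual per-object loss rate and that availability is measured over a 365-day year. The function names are illustrative, and the results describe design targets rather than guarantees.

```python
from fractions import Fraction

DURABILITY = Fraction(99_999_999_999, 100_000_000_000)  # eleven nines
AVAILABILITY = Fraction(9_999, 10_000)                   # 99.99%


def expected_annual_object_loss(object_count: int) -> Fraction:
    """Average number of objects expected to be lost per year."""
    return object_count * (1 - DURABILITY)


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_object_loss(object_count)


def unavailable_minutes_per_year(availability: Fraction = AVAILABILITY) -> float:
    """Minutes per 365-day year the service could be unreachable."""
    return float((1 - availability) * 365 * 24 * 60)


print(years_per_lost_object(10_000_000))   # 10000
print(unavailable_minutes_per_year())      # 52.56
```

Under these assumptions, storing 10,000,000 objects corresponds to losing one object roughly every 10,000 years on average, while 99.99% availability still allows for about 52.56 minutes of unavailability per year. The two numbers measure different things.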

Read-After-Write Consistency

S3 provides strong **Read-After-Write Consistency** for object writes and deletes (PUT, POST, and DELETE requests), with no extra configuration. This means:

  • When you write a new object, you can immediately read it back.
  • When you overwrite or delete an existing object, subsequent read requests will return the latest version (or a "Not Found" error if deleted).
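
The behavior described in these two points can be exercised with a short script. The sketch below assumes `s3` is an object that behaves like a boto3 S3 client (for example, one created with `boto3.client("s3")`); the function name, bucket, and key are placeholders, and the function deletes the test object it creates.

```python
def check_read_after_write(s3, bucket: str, key: str) -> dict:
    """Write, overwrite, and delete one object, reading back after each step.

    `s3` is assumed to expose put_object, get_object, delete_object and
    exceptions.NoSuchKey in the same way a boto3 S3 client does.
    """
    results = {}

    # New object: a read issued right after the write should see it.
    s3.put_object(Bucket=bucket, Key=key, Body=b"v1")
    results["after_create"] = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    # Overwrite: the next read should return the latest data.
    s3.put_object(Bucket=bucket, Key=key, Body=b"v2")
    results["after_overwrite"] = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    # Delete: the next read should fail with a "not found" style error.
    s3.delete_object(Bucket=bucket, Key=key)
    try:
        s3.get_object(Bucket=bucket, Key=key)
        results["after_delete"] = "still readable"
    except s3.exceptions.NoSuchKey:
        results["after_delete"] = "not found"

    return results
```

Against a real bucket you have write access to, the expected result is `b"v1"`, then `b"v2"`, then `"not found"`.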

3. Storage Classes (Cost Tiers)

You select a storage class based on how often the data is accessed, which directly impacts cost. In general, classes meant for less frequent access charge less per GB stored each month but more to retrieve data, and some also have minimum storage durations.
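
The trade-off can be written as simple arithmetic: monthly cost is roughly storage charges plus retrieval charges. The sketch below uses made-up prices in arbitrary units, not AWS prices, and ignores request fees, minimum storage durations, and data transfer. All names are illustrative.

```python
def monthly_cost(stored_gb, retrieved_gb, storage_price, retrieval_price):
    """Storage charge plus retrieval charge for one month (prices per GB)."""
    return stored_gb * storage_price + retrieved_gb * retrieval_price


def break_even_retrieved_gb(stored_gb, hot, cold):
    """GB retrieved per month at which the hot and cold class cost the same.

    `hot` and `cold` are (storage_price, retrieval_price) tuples; the cold
    class must have the higher retrieval price.
    """
    hot_storage, hot_retrieval = hot
    cold_storage, cold_retrieval = cold
    if cold_retrieval <= hot_retrieval:
        raise ValueError("cold class must have the higher retrieval price")
    return stored_gb * (hot_storage - cold_storage) / (cold_retrieval - hot_retrieval)


# Placeholder prices in arbitrary units (NOT real AWS pricing).
HOT = (4.0, 0.0)    # pricier storage, free retrieval
COLD = (1.0, 2.0)   # cheaper storage, paid retrieval

print(monthly_cost(100, 50, *HOT))              # 400.0
print(monthly_cost(100, 50, *COLD))             # 200.0
print(break_even_retrieved_gb(100, HOT, COLD))  # 150.0
```

With these placeholder numbers, the colder class is cheaper as long as less than 150 GB of the 100 GB stored is read back each month (that is, each object is read fewer than 1.5 times on average).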

S3 Standard (General Purpose)

**Best For:** Frequently accessed data (e.g., website content, mobile apps, default storage).

*Highest availability, lowest latency, highest storage cost.*

S3 Intelligent-Tiering & S3 Standard-IA

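**Best For:** Data accessed less frequently but requiring immediate access when needed (e.g., backups, logs). Intelligent-Tiering suits data whose access pattern is unknown or changes over time.

*Intelligent-Tiering automatically moves objects between frequent-access and infrequent-access tiers based on access patterns, in exchange for a small per-object monitoring charge. Standard-IA has a lower storage price than Standard but charges a per-GB retrieval fee.*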

S3 Glacier Flexible Retrieval & Deep Archive

**Best For:** Long-term archival and compliance data.

*Lowest storage cost, but retrieval requires a delay (minutes to hours for Glacier Flexible Retrieval, typically hours for Deep Archive) and incurs a retrieval fee.*
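
The tiers above map onto the `StorageClass` values that the S3 API accepts when an object is written. The lookup below is a simplified sketch of that mapping. The access-pattern labels and the function name are invented for illustration, and real choices should also weigh pricing, minimum storage durations, and retrieval needs.

```python
# Access-pattern labels (hypothetical) -> S3 API StorageClass values.
STORAGE_CLASS_BY_PATTERN = {
    "frequent": "STANDARD",
    "unknown": "INTELLIGENT_TIERING",
    "infrequent": "STANDARD_IA",
    "archive": "GLACIER",            # S3 Glacier Flexible Retrieval
    "deep-archive": "DEEP_ARCHIVE",  # S3 Glacier Deep Archive
}


def suggest_storage_class(access_pattern: str) -> str:
    """Return the StorageClass value for a coarse access-pattern label."""
    try:
        return STORAGE_CLASS_BY_PATTERN[access_pattern]
    except KeyError:
        valid = ", ".join(sorted(STORAGE_CLASS_BY_PATTERN))
        raise ValueError(f"unknown access pattern {access_pattern!r}; use one of: {valid}")


print(suggest_storage_class("infrequent"))  # STANDARD_IA
```

The returned string could then be passed as the `StorageClass` argument when uploading, for example with a boto3 client's `put_object` call.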

S3 serves as the storage backbone for many other AWS services.

AWS Fundamentals Series: S3