AWS TL;DR: Amazon Macie (Sensitive Data Discovery)

Amazon Macie

/tldr: Security service that uses machine learning and pattern matching to discover, monitor, and protect sensitive data in S3.

Sensitive Data Discovery | Data Security & Privacy | S3 Focus

1. Core Function: Sensitive Data Discovery

Macie's primary job is continuous, automated data security analysis. It inventories your S3 buckets and then uses a combination of machine learning and pattern matching to find and classify files containing sensitive information.

What Macie Looks For (Managed Data Identifiers)

  • **Personally Identifiable Information (PII):** Names, addresses, phone numbers, email addresses.
  • **Financial Data:** Credit card numbers, bank account numbers, routing numbers.
  • **Credentials:** API keys, secret keys, passwords.
  • **Protected Health Information (PHI):** HIPAA-related data.
  • **Custom Data:** Beyond the managed identifiers, you can create custom data identifiers with your own regular expressions (RegEx), optionally paired with keywords, to search for proprietary data formats (e.g., specific project IDs).
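
As a rough illustration of the custom data item above, the sketch below tests a regular expression locally and builds a request shaped like the Macie `CreateCustomDataIdentifier` API. The project ID format (`PRJ-2024-00123`), the identifier name, and the helper functions are assumptions made up for this example. Python's `re` module is only an approximation here, since Macie supports its own subset of regex syntax.

```python
import re

# Hypothetical project ID format, e.g. "PRJ-2024-00123" (an assumption for illustration).
PROJECT_ID_REGEX = r"PRJ-\d{4}-\d{5}"


def find_project_ids(text: str) -> list:
    """Return every substring of `text` that matches the project ID pattern."""
    return re.findall(PROJECT_ID_REGEX, text)


def build_identifier_request(name: str, regex: str, keywords: list) -> dict:
    """Build keyword arguments shaped like a CreateCustomDataIdentifier request."""
    return {
        "name": name,
        "regex": regex,
        "keywords": keywords,        # words that must appear near a match
        "maximumMatchDistance": 50,  # max characters between a keyword and a match
    }


request = build_identifier_request("project-id", PROJECT_ID_REGEX, ["project"])

# With boto3 installed and Macie enabled, the request could be sent as:
#   boto3.client("macie2").create_custom_data_identifier(**request)
```

Checking the pattern against sample text before creating the identifier helps catch overly broad matches before any data is scanned.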

2. Continuous Monitoring and Findings

Macie continuously monitors your S3 buckets and generates two kinds of findings: policy findings, which flag bucket-level exposure risks such as public access or disabled encryption, and sensitive data findings, which report sensitive data detected in S3 objects.

Key Features

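  • **Discovery Jobs:** You can run one-time or scheduled (daily, weekly, or monthly) jobs to deeply inspect specific buckets or your entire S3 environment. Automated sensitive data discovery can additionally sample objects on an ongoing basis.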
  • **Risk Scoring:** Macie assigns a severity level (High, Medium, Low) to findings, prioritizing the most critical risks (e.g., an S3 bucket with PII that is publicly accessible).
  • **Security Hub Integration:** Findings are published to AWS Security Hub (policy findings by default, sensitive data findings if you enable them in the publication settings), allowing for centralized security management and remediation.
  • **Automated Response:** Findings are published to Amazon EventBridge as events, enabling automated remediation actions (e.g., triggering a Lambda function to restrict public access to an exposed bucket).
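
The automated response idea can be sketched as an EventBridge rule pattern plus a small Lambda handler. This is a minimal sketch under stated assumptions: the event field paths (`detail.severity.description`, `detail.resourcesAffected.s3Bucket.name`) reflect the Macie finding shape as commonly documented and should be verified against a real event, and the policy of remediating only High-severity findings is a choice made for this example.

```python
from typing import Optional

# EventBridge rule pattern that matches Macie findings (used when creating the rule).
MACIE_FINDING_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
}

# Settings that block every form of public access on a bucket.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def plan_remediation(event: dict) -> Optional[dict]:
    """Return arguments for put_public_access_block, or None if no action is needed."""
    detail = event.get("detail", {})
    severity = detail.get("severity", {}).get("description")
    bucket = detail.get("resourcesAffected", {}).get("s3Bucket", {}).get("name")
    # Assumed policy for this sketch: act only on High-severity findings.
    if severity != "High" or not bucket:
        return None
    return {"Bucket": bucket, "PublicAccessBlockConfiguration": PUBLIC_ACCESS_BLOCK}


def lambda_handler(event, context):
    """Lambda entry point: apply the remediation plan if there is one."""
    import boto3  # imported here so plan_remediation can be tested without AWS access

    plan = plan_remediation(event)
    if plan is not None:
        boto3.client("s3").put_public_access_block(**plan)
    return plan
```

Keeping the decision logic in `plan_remediation` separate from the AWS call makes it easy to test against sample events before the function is allowed to change bucket settings.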

3. Security Dashboard & Multi-Account Support

The Macie dashboard provides a visual overview of your data security posture across all managed accounts.

Dashboard Key Metrics

  • **Sensitive Data Volume:** Total amount of sensitive data found.
  • **Top Risky Buckets:** A list of buckets with the most critical findings.
  • **Data Access and Exposure:** Insights into how buckets can be accessed, such as which are publicly accessible, unencrypted, or shared with other accounts.

Multi-Account Management

Macie supports AWS Organizations, allowing a delegated administrator account to manage Macie jobs and findings centrally across all member accounts.
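
To make the "Top Risky Buckets" metric concrete, the sketch below ranks buckets from a list of finding records, such as an administrator account might collect across member accounts. The severity weights and the `top_risky_buckets` helper are illustrative assumptions, not how the Macie console computes its ranking. The record shape (`severity.description`, `resourcesAffected.s3Bucket.name`) is assumed to follow the Macie finding format.

```python
from collections import Counter

# Illustrative weights; Macie does not publish a scoring formula like this.
SEVERITY_WEIGHT = {"High": 3, "Medium": 2, "Low": 1}


def top_risky_buckets(findings, limit=3):
    """Rank buckets by the summed severity weight of their findings."""
    scores = Counter()
    for finding in findings:
        bucket = finding["resourcesAffected"]["s3Bucket"]["name"]
        severity = finding["severity"]["description"]
        scores[bucket] += SEVERITY_WEIGHT.get(severity, 0)
    # most_common returns (bucket, score) pairs, highest score first.
    return scores.most_common(limit)
```

In practice the input could come from the `macie2` `list_findings` and `get_findings` calls in boto3, run from the delegated administrator account.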
            

Billing Model

Macie's cost is based on the following main usage types, typically with a free tier covering initial usage:

  • **S3 Bucket Inventory Monitoring:** Lowest cost, often covered by the free tier. Billed by the number of buckets monitored.
  • **Sensitive Data Discovery:** Highest cost. Billed by the volume of data scanned (GB) during discovery jobs.
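
The billing model above reduces to simple arithmetic, sketched below. The function name, its parameters, and every rate in the usage example are placeholders chosen for illustration, not actual AWS prices. Check the current Macie pricing page for real per-bucket and per-GB rates and for any free allowance.

```python
def estimate_monthly_cost(buckets, gb_scanned, price_per_bucket, price_per_gb, free_gb=0.0):
    """Estimate a monthly bill from bucket count and scanned volume.

    All prices are caller-supplied; nothing here encodes real AWS rates.
    """
    # Only the volume above the free allowance is billed.
    billable_gb = max(gb_scanned - free_gb, 0.0)
    return buckets * price_per_bucket + billable_gb * price_per_gb


# Placeholder rates for illustration only.
estimate = estimate_monthly_cost(
    buckets=20,
    gb_scanned=500,
    price_per_bucket=0.10,
    price_per_gb=1.00,
    free_gb=1,
)
```

Because the scanned volume usually dominates, narrowing discovery jobs to specific buckets or prefixes tends to be the most effective cost lever.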

Macie is your automated PII watchdog, helping you maintain compliance and prevent data leaks in S3.

AWS Fundamentals Series: Amazon Macie