AWS Attribution From Day One: No Orphans, No Click-Ops Mystery
Every AWS resource should trace back to its creator from the moment it exists. Here's the CloudTrail pipeline that makes that automatic — and why waiting makes it impossible.
Someone on your security team files a ticket. An S3 bucket with a wide bucket policy — public read, created two years ago. The tag says Environment: prod. No owner tag. The team that originally owned that account was reorged eight months ago.
Who created it? Why? Is it still in use?
The CloudTrail console goes back 90 days. The bucket is two years old. The trail was enabled six months after the account was created. You’re not going to find out.
That’s the orphan problem. And the answer to it isn’t better tooling after the fact — it’s a decision you make on day one.
Why attribution matters, and why waiting kills it
Every AWS resource has a creator. The creator’s IAM identity, the time, the eventName — all of it flows through CloudTrail the moment the API call is made. That data is free to capture and trivially small to store.
But it has a shelf life. The default CloudTrail console window is 90 days. Without a trail shipping events to S3, that context evaporates. Once it’s gone, you can’t reconstruct it — not with better tools, not with more budget.
This is the cloud-infrastructure version of a broader security context problem: context that’s free to capture in real time is expensive — or impossible — to reconstruct later. An orphan resource isn’t a resource nobody owns. It’s a resource where the ownership context was never captured, and then the window closed.
The fix is not complicated. It’s a pipeline you set up once, before anything exists that you might later wish you could trace.
The architecture
The goal: every Create*, Put*, and Run* call across your entire AWS organization, queryable forever, attributed to the IAM principal that made it.
AWS Organization (all member accounts)
        │
        │  Organization Trail (management events, free)
        ▼
Raw S3 bucket
(lifecycle: IA after 30 days)
        │
        ▼
EventBridge schedule (nightly)
        │
        ▼
Step Function
  ├─► Lambda: read prior day's CloudTrail gzip
  ├─► Lambda: parse + aggregate per resource ARN
  └─► Lambda: write parquet
        │
        ▼
WORM S3 bucket
(Object Lock compliance, 7-year retention)
        │
Athena queries ◄── SecOps / Audit / IR
Why each piece:
- Organization Trail: a single trail that captures events from every account in the org. Management events (the Create*/Put* calls you care about) are free for the first copy. No per-account setup.
- Raw bucket: CloudTrail dumps here first. Standard storage with a lifecycle transition to Infrequent Access after 30 days. Input buffer.
- Nightly Step Function: runs after midnight UTC and processes the prior day’s logs. Batch is intentional: cheaper, simpler, and attribution doesn’t need to be real-time.
- Lambda aggregation: reads the prior day’s gzipped JSON, extracts (resourceARN, userIdentity.arn, eventTime, eventName), groups by resource, writes Parquet.
- WORM bucket: Object Lock in compliance mode. 7-year retention. Cannot be deleted or shortened, not by an admin, not by the root account. This is your audit trail.
- Athena: partition projection on date / account / region. Queries run against the WORM bucket. No cluster to manage; you pay per byte scanned.
Single-account setup: same architecture, swap the organization trail for a single-account trail. One-line change in Terraform.
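To make the "prior day’s logs" step concrete, here is a minimal sketch of how the first Lambda might compute the S3 prefix to list. It assumes the standard organization-trail layout (AWSLogs/<org-id>/<account-id>/CloudTrail/<region>/YYYY/MM/DD/) with no extra key prefix configured on the trail. The function name and the sample IDs are hypothetical and not taken from the reference repo.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Optional


def prior_day_prefix(org_id: str, account_id: str, region: str,
                     today: Optional[date] = None) -> str:
    """Return the S3 key prefix that holds yesterday's (UTC) CloudTrail log files.

    Assumes an organization trail with no custom key prefix. `today` can be
    passed in explicitly to make the function deterministic in tests.
    """
    if today is None:
        today = datetime.now(timezone.utc).date()
    day = today - timedelta(days=1)
    return (
        f"AWSLogs/{org_id}/{account_id}/CloudTrail/{region}/"
        f"{day:%Y/%m/%d}/"
    )


if __name__ == "__main__":
    # Hypothetical org ID and account ID, for illustration only.
    print(prior_day_prefix("o-exampleorgid", "111122223333", "us-east-1"))
```

For the single-account variant, the org-ID segment is not part of the path, so the prefix would start at AWSLogs/<account-id>/ instead.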
Implementation
Organization trail (Terraform)
resource "aws_cloudtrail" "org_trail" {
  name                          = "org-management-trail"
  s3_bucket_name                = aws_s3_bucket.cloudtrail_raw.id
  is_organization_trail         = true
  include_global_service_events = true
  is_multi_region_trail         = true
  enable_log_file_validation    = true
}
Requires: running in the management account, with CloudTrail trusted access enabled for the organization (aws organizations enable-aws-service-access --service-principal cloudtrail.amazonaws.com).
Lambda aggregation core (Python)
import gzip
import json

import boto3


def handler(event, context):
    s3 = boto3.client("s3")
    records = []
    for key in event["keys"]:
        obj = s3.get_object(Bucket=event["raw_bucket"], Key=key)
        # CloudTrail delivers gzipped JSON; decompress the streamed body as text.
        with gzip.open(obj["Body"], "rt") as f:
            trail = json.load(f)
        for r in trail.get("Records", []):
            # One output row per resource the event touched.
            # "resources" can be absent or null, so fall back to an empty list.
            for resource in r.get("resources") or []:
                records.append({
                    "resource_arn": resource.get("ARN", "unknown"),
                    "principal_arn": r["userIdentity"].get("arn", "unknown"),
                    "event_time": r["eventTime"],
                    "event_name": r["eventName"],
                    "account_id": r.get("recipientAccountId", "unknown"),
                    "region": r["awsRegion"],
                })
    # write_parquet(records, event["worm_bucket"], event["partition_key"])
    return {"count": len(records)}
The repo has the full version: multi-resource events (RunInstances creates several resources in one call), pagination, error retry, and the Parquet write. The snippet shows the core extraction loop.
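The snippet above produces one flat row per (event, resource) pair; the "aggregate per resource ARN" step still has to fold those rows together. A minimal sketch of one way to do that follows. It treats the earliest Create*/Put*/Run* event seen for an ARN as the creation event, which is a simplifying assumption, and the function name and output fields are hypothetical rather than the repo's actual schema.

```python
from collections import defaultdict

# Event-name prefixes the article treats as "creation-like" calls.
CREATE_PREFIXES = ("Create", "Put", "Run")


def aggregate_by_resource(records):
    """Group flat attribution rows by resource ARN and pick a likely creator.

    `records` is a list of dicts shaped like the rows built in the handler.
    Returns {resource_arn: {"created_by", "created_at", "event_count"}}.
    """
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["resource_arn"]].append(rec)

    summary = {}
    for arn, events in grouped.items():
        # CloudTrail eventTime is ISO 8601 in UTC, so string order is time order.
        events.sort(key=lambda e: e["event_time"])
        creators = [e for e in events
                    if e["event_name"].startswith(CREATE_PREFIXES)]
        first = creators[0] if creators else None
        summary[arn] = {
            "created_by": first["principal_arn"] if first else "unknown",
            "created_at": first["event_time"] if first else None,
            "event_count": len(events),
        }
    return summary
```

A resource whose creation happened before the trail existed will show up with created_by set to "unknown", which is exactly the orphan case the pipeline is meant to prevent going forward.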
WORM bucket (Terraform)
resource "aws_s3_bucket_object_lock_configuration" "worm" {
  bucket = aws_s3_bucket.cloudtrail_worm.id

  rule {
    default_retention {
      mode  = "COMPLIANCE"
      years = 7
    }
  }
}
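COMPLIANCE mode: the lock cannot be shortened or removed, not by an admin, not by the root account. Object Lock only works on a bucket with versioning enabled, so the WORM bucket needs that turned on as well. Relevant for SOC 2 and HIPAA where auditors want tamper-proof chain of custody.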
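Since a misconfigured lock is easy to miss, a small check can confirm the bucket really is in the state described. The sketch below validates a response dict in the shape returned by boto3's s3.get_object_lock_configuration(Bucket=...); the function name and the 7-year default are illustrative assumptions.

```python
def is_worm_locked(response: dict, min_years: int = 7) -> bool:
    """Return True if an Object Lock configuration enforces COMPLIANCE retention.

    `response` is expected to look like the result of
    s3.get_object_lock_configuration(Bucket=...).
    """
    config = response.get("ObjectLockConfiguration", {})
    retention = config.get("Rule", {}).get("DefaultRetention", {})

    # Default retention is given in either Days or Years; normalise to days.
    days = retention.get("Days", 0) + retention.get("Years", 0) * 365

    return (
        config.get("ObjectLockEnabled") == "Enabled"
        and retention.get("Mode") == "COMPLIANCE"
        and days >= min_years * 365
    )


if __name__ == "__main__":
    sample = {
        "ObjectLockConfiguration": {
            "ObjectLockEnabled": "Enabled",
            "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": 7}},
        }
    }
    print(is_worm_locked(sample))
```

A check like this could run as part of the day-one verification step, failing loudly if someone deployed the bucket in GOVERNANCE mode or with a shorter retention.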
Athena — resource timeline
SELECT event_time, event_name, principal_arn
FROM cloudtrail_attribution
WHERE resource_arn = 'arn:aws:s3:::my-bucket'
AND dt BETWEEN '2024-01-01' AND '2024-12-31'
ORDER BY event_time;
The table uses partition projection so Athena skips partitions automatically — no MSCK REPAIR TABLE, no manual partition registration. Full DDL is in the repo.
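Partition projection only works if the Parquet files land at keys that match the table's storage.location.template. As a rough illustration, the writer Lambda could build its output key like the sketch below. The Hive-style layout, the account and region column names, and the default file name are all assumptions; only the dt column appears in the query above, and the real layout is whatever the repo's DDL defines.

```python
from datetime import date


def partition_key(dt: str, account_id: str, region: str,
                  filename: str = "part-0000.parquet") -> str:
    """Build the S3 object key for one day/account/region slice of output.

    Raises ValueError if `dt` is not a valid date in strict YYYY-MM-DD form,
    so malformed partitions never reach the bucket.
    """
    if date.fromisoformat(dt).isoformat() != dt:
        raise ValueError(f"dt must be YYYY-MM-DD, got {dt!r}")
    return f"dt={dt}/account={account_id}/region={region}/{filename}"


if __name__ == "__main__":
    # Hypothetical account ID, for illustration only.
    print(partition_key("2024-01-01", "111122223333", "us-east-1"))
```

Putting the date first keeps a query with a dt range, like the one above, confined to the matching day prefixes.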
What it costs
Management events via an organization trail: free for the first copy to S3. No per-account charge, no per-event charge.
S3 storage for management events at a typical org (a few hundred accounts): a few dollars a month. Parquet is compact.
Data events (S3 GetObject, Lambda invokes, DynamoDB reads): $0.10 per 100K events. Enable selectively on high-value buckets only, not org-wide. This pipeline intentionally skips data events unless you add them.
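To see why org-wide data events are a bad default, it helps to run the arithmetic. The sketch below uses the $0.10 per 100K rate quoted above; the function name and the sample volume are hypothetical.

```python
from decimal import Decimal

# Rate quoted in the text: $0.10 per 100,000 data events.
RATE_PER_100K = Decimal("0.10")


def data_event_cost(events_per_month: int) -> Decimal:
    """Estimated monthly CloudTrail data-event charge in USD."""
    return RATE_PER_100K * Decimal(events_per_month) / Decimal(100_000)


if __name__ == "__main__":
    # Hypothetical volume: one busy bucket serving 50 million reads a month.
    print(data_event_cost(50_000_000))
```

At that rate a single bucket with 50 million GetObject calls a month works out to $50, and the figure scales linearly with every bucket and table you add.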
CloudTrail Insights and real-time delivery are paid. This architecture skips both deliberately — nightly batch is sufficient for attribution and significantly cheaper.
Day-one checklist
- Enable CloudTrail trusted access in AWS Organizations
- Deploy org trail via Terraform (management account)
- Create raw S3 bucket with lifecycle policy (→ IA after 30 days)
- Create WORM S3 bucket with Object Lock in COMPLIANCE mode, 7-year retention
- Deploy Step Function + Lambda aggregation (nightly EventBridge schedule)
- Create Athena table with partition projection
- Verify: query yesterday’s events, confirm attribution is flowing
Reference implementation: github.com/LuD1161/aws-attribution-day-one (coming soon)
Part of the Security Engineering Is a Context Problem series.
Idea, framing, and edits: Aseem. Drafting assistance: Claude.