URL: https://dev.to/dspv/how-to-control-cloudwatch-logs-costs-on-ecs-li3


How to Control CloudWatch Logs Costs on ECS?

Originally published at https://fortem.dev/blog/cloudwatch-costs-ecs
ECS sends all logs to CloudWatch with retention set to Never Expire by default. 4 steps to cut your CloudWatch bill by 60-80%: retention, log level filtering, Insights queries, and per-service monitoring.


Use Case

Your AWS bill shows CloudWatch at $400 this month. You have 15 ECS services logging INFO-level to CloudWatch — with retention set to Never Expire. You didn't configure this. ECS did it by default. The fix takes 4 steps.

TL;DR

  • ECS default log driver sends everything to CloudWatch with retention = Never Expire. You didn't set this, ECS did.
  • 4-step fix: set retention (90% impact), filter by log level (5%), Insights instead of streaming (3%), monitor per-service (2%)
  • One Terraform line: retention_in_days = 30 cuts storage cost by 60-80% immediately
  • Real example: 15 services, 3 GB/day → $135/mo (before) → $30/mo (after), 78% savings
  • Download the skill file: your AI agent can audit and fix this for you in 5 minutes

Why CloudWatch is silently eating your AWS bill

ECS creates CloudWatch log groups with no retention policy by default. Every container's stdout goes to CloudWatch, where it is billed at $0.50/GB for ingestion plus $0.03/GB/month for storage, with no upper bound. You did not configure this.

The part that surprises most teams: no retention policy means Never Expire, so logs accumulate forever and the bill grows every month. We audited a 15-service fleet where CloudWatch was $135/month, more than the compute cost for two of the environments combined. Retention is one lever; right-sizing and scheduling are the rest of the picture.

Cost component               | 15 services, INFO level, 3 GB/day
Ingestion ($0.50/GB)         | $45/mo
Storage ($0.03/GB/month)     | $54/mo (grows every month)
Insights queries ($0.50/GB)  | $36/mo (5 queries/day)
Total                        | $135/mo

KEY INSIGHT: Three separate charges on the same data. Ingestion is pay-what-you-send. Storage is pay-what-you-keep. Insights is pay-what-you-scan. ECS defaults mean you pay all three, with no upper bound, on every log line your application prints.
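
The three charges in the table can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the prices are the ones quoted in this article, and the 1,800 GB stored and 72 GB scanned figures are assumptions back-calculated from the table's $54 and $36 rows, not measured values.

```python
# Prices quoted in the article (USD); check current AWS pricing for your region.
INGEST_PER_GB = 0.50
STORAGE_PER_GB_MONTH = 0.03
INSIGHTS_PER_GB_SCANNED = 0.50


def monthly_cloudwatch_cost(ingested_gb, stored_gb, scanned_gb):
    """Return the three CloudWatch Logs charges and their total for one month."""
    costs = {
        "ingestion": round(ingested_gb * INGEST_PER_GB, 2),
        "storage": round(stored_gb * STORAGE_PER_GB_MONTH, 2),
        "insights": round(scanned_gb * INSIGHTS_PER_GB_SCANNED, 2),
    }
    costs["total"] = round(sum(costs.values()), 2)
    return costs


# 3 GB/day for 30 days. The stored and scanned volumes are back-calculated
# from the table ($54 / 0.03 and $36 / 0.50), so treat them as assumptions.
example = monthly_cloudwatch_cost(ingested_gb=3 * 30, stored_gb=1800, scanned_gb=72)
print(example)
```

Plugging in your own monthly volumes shows which of the three charges dominates before you decide where to start.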

Download the skill file — let AI fix it

The downloadable skill file lets your AI agent scan all CloudWatch log groups, identify which ones lack retention, estimate monthly cost per group, and apply fixes — without writing a line of code. Everything runs locally on your machine against your AWS account.

Step 1 — Set retention on every log group

Adding retention_in_days = 30 to every aws_cloudwatch_log_group Terraform resource cuts CloudWatch storage cost by 60–80%, and it is the single highest-impact change in this guide. Every log group left at Never Expire keeps accumulating data you will never query. CloudWatch deletes expired events asynchronously, so stored bytes can take up to 72 hours to drop after you set a policy. The commands below find the groups without retention and set a sensible ceiling.

Find groups without retention:

aws logs describe-log-groups \
 --query 'logGroups[?retentionInDays==`null`].[logGroupName,storedBytes]' \
 --output table

Set 30-day retention on one group:

aws logs put-retention-policy \
 --log-group-name "/aws/ecs/your-service" \
 --retention-in-days 30
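
To apply the same policy to every group that lacks one, the two commands above can be combined. The following is a minimal sketch, assuming you have saved the JSON output of describe-log-groups; the function name retention_commands and the sample data are hypothetical. It only prints the commands, so you can review them before running anything.

```python
import json
import shlex


def retention_commands(describe_output, days=30):
    """Build one put-retention-policy command per log group with no retention."""
    commands = []
    for group in describe_output.get("logGroups", []):
        # Groups set to Never Expire have no retentionInDays key in the response.
        if group.get("retentionInDays") is None:
            name = shlex.quote(group["logGroupName"])
            commands.append(
                f"aws logs put-retention-policy --log-group-name {name} "
                f"--retention-in-days {days}"
            )
    return commands


# Hypothetical sample shaped like `aws logs describe-log-groups --output json`.
sample = json.loads("""
{"logGroups": [
  {"logGroupName": "/aws/ecs/prod-api", "storedBytes": 1200000000},
  {"logGroupName": "/aws/ecs/staging-api", "retentionInDays": 30, "storedBytes": 50000000}
]}
""")

for command in retention_commands(sample):
    print(command)
```

Groups that already have a policy are left alone, so running this again later will not overwrite a longer retention you set on purpose.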

Terraform — the one-liner that saves you $$$:

resource "aws_cloudwatch_log_group" "ecs_service" {
 name = "/ecs/${var.env_prefix}-${var.service_name}"
 retention_in_days = 30 # ← was null (Never Expire). Now 30 days.
}

Environment    | Retention | Why
Production     | 90 days   | Compliance + incident investigation
Staging        | 30 days   | Recent deploy history
Dev / QA       | 7 days    | Active development only
CI/CD / Build  | 1 day     | Don't store ephemeral build logs
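
A quick way to see what each retention setting costs is to compute the steady-state storage charge. This sketch assumes, as the article's arithmetic does, that stored volume equals ingested volume and that volume stays constant; the function names are illustrative only.

```python
STORAGE_PER_GB_MONTH = 0.03  # USD, the storage price quoted in the article


def storage_cost_with_retention(daily_gb, retention_days):
    """Monthly storage cost once the log group holds a full retention window."""
    return round(daily_gb * retention_days * STORAGE_PER_GB_MONTH, 2)


def storage_cost_never_expire(daily_gb, months_elapsed):
    """Monthly storage cost after N 30-day months with no retention set."""
    return round(daily_gb * 30 * months_elapsed * STORAGE_PER_GB_MONTH, 2)


# 3 GB/day is the article's example volume.
for days in (1, 7, 30, 90):
    print(days, "days:", storage_cost_with_retention(3, days))
print("never expire, month 20:", storage_cost_never_expire(3, 20))
```

With retention the storage charge is capped by the window length, while with Never Expire it keeps growing by the same amount every month.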

Step 2 — Filter by log level

Switching ECS production services from INFO to WARN log level reduces ingested log volume by one to two orders of magnitude, cutting both the $0.50/GB ingestion and $0.03/GB storage charges. Switch production to WARN, keep INFO for staging.

“CloudWatch Logs charges $0.50 per GB ingested, $0.03 per GB stored per month, and $0.50 per GB scanned by Logs Insights queries — beyond the 5 GB/month free tier.”

— aws.amazon.com/cloudwatch/pricing, verified June 2026

Spring Boot, Express, Django — they all default to INFO-level logging. That means every HTTP request, every database query, every cache hit generates a log line. Production doesn't need INFO. Switch to WARN.

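# Find which log streams in one group emit the most events (last 7 days)
# Note: `date -v-7d` is BSD/macOS syntax; on GNU/Linux use `date -d '7 days ago' +%s`
aws logs start-query \
 --log-group-name "/aws/ecs/prod-api" \
 --start-time $(date -v-7d +%s) \
 --end-time $(date +%s) \
 --query-string "stats count(*) as events by @logStream | sort events desc | limit 10"
# start-query returns a queryId; fetch the rows with: aws logs get-query-results --query-id <queryId>

# Set your framework's log level:
# Spring Boot: logging.level.root=WARN in application.properties
# Express: no built-in logger; set the level in your logging library (e.g. LOG_LEVEL=warn if your app reads it)
# Django: LOGGING['root']['level'] = 'WARNING'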

KEY INSIGHT: An INFO-level web server can generate one to two orders of magnitude more log volume than the same server at WARN. At $0.50/GB for ingestion, every unnecessary log line costs you money twice: once to ingest, once to store.
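
To make the effect of a log level concrete, here is a small sketch using Python's standard logging module. It assumes the level comes from a LOG_LEVEL environment variable, which is a common convention and not something ECS sets for you; build_logger is a hypothetical helper.

```python
import io
import logging
import os

LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def build_logger(name, stream):
    """Create a logger whose threshold comes from LOG_LEVEL (default WARNING)."""
    level_name = os.environ.get("LOG_LEVEL", "WARNING").upper()
    logger = logging.getLogger(name)
    logger.setLevel(LEVELS.get(level_name, logging.WARNING))
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Demo: at WARNING the INFO line is never written, so in a container it would
# never reach stdout and never be ingested by CloudWatch.
os.environ["LOG_LEVEL"] = "WARNING"
buffer = io.StringIO()
log = build_logger("prod-api", buffer)
log.info("GET /orders 200 12ms")
log.warning("payment provider timeout, retrying")
print(buffer.getvalue(), end="")
```

Reading the level from the environment lets the same image run at WARN in production and INFO in staging without a rebuild.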

Step 3 — Use Insights instead of streaming everything

Use CloudWatch Logs Insights to query on demand at $0.50/GB scanned rather than streaming every log line to a third-party tool that charges separately for ingestion and indexing. For compliance, use a subscription filter to archive logs to S3.

Datadog's log pricing is two-part: ingestion is billed separately from indexing (making logs searchable). Once you index everything for debugging — which is the point of streaming logs there — the combined cost per GB is several times CloudWatch's ingest ($0.50/GB) + storage ($0.03/GB) total. For debugging, use CloudWatch Logs Insights instead — query on demand, pay per GB scanned ($0.50/GB), not per GB ingested or indexed.

“Datadog charges separately for log ingestion and for indexing logs to make them searchable — to query logs during incident response, they need to be indexed.”

— datadoghq.com/pricing, verified June 2026

# Find errors in the last hour in one log group (repeat per service, or pass several with --log-group-names)
aws logs start-query \
 --log-group-name "/aws/ecs/prod-api" \
 --start-time $(date -v-1H +%s) \
 --end-time $(date +%s) \
 --query-string "fields @timestamp, @message | filter @message like /ERROR/ | sort @timestamp desc | limit 50"

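# For compliance: subscription filter → Firehose → S3 (cheap, durable)
# A Firehose destination also needs --role-arn: an IAM role that lets CloudWatch Logs write to the delivery stream
aws logs put-subscription-filter \
 --log-group-name "/aws/ecs/prod-api" \
 --filter-name "AllToS3" \
 --filter-pattern "" \
 --destination-arn "arn:aws:firehose:..." \
 --role-arn "<your-cloudwatch-to-firehose-role-arn>"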
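
Because Insights bills per GB scanned, the time range of a query drives its cost. The sketch below is a rough estimate only, assuming the query scans every event in the range and that volume is evenly spread; the 0.2 GB/day figure is a hypothetical per-service share of the article's 3 GB/day across 15 services.

```python
INSIGHTS_PER_GB_SCANNED = 0.50  # USD, the Insights price quoted in the article


def insights_query_cost(daily_gb, days_scanned):
    """Rough cost of one query that scans every log event in the time range."""
    return round(daily_gb * days_scanned * INSIGHTS_PER_GB_SCANNED, 4)


# Hypothetical log group ingesting 0.2 GB/day.
print("7-day window:", insights_query_cost(0.2, 7))
print("1-hour window:", insights_query_cost(0.2, 1 / 24))
```

Keeping incident queries to the shortest window that covers the problem is usually the cheapest habit to adopt.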

Step 4 — Find which service costs the most

One Insights query, grouping by log stream and sorting by byte volume, identifies which ECS service is responsible for the majority of your CloudWatch bill.

Total CloudWatch cost is $400, but which of your 15 services is responsible for $300 of it? This Insights query tells you in 5 minutes.

# Top log streams by approximate byte volume (last 7 days); strlen counts characters, a close proxy for bytes
aws logs start-query \
 --log-group-name "/aws/ecs/prod-api" \
 --start-time $(date -v-7d +%s) \
 --end-time $(date +%s) \
 --query-string "stats sum(strlen(@message)) as totalBytes by @logStream | sort totalBytes desc | limit 10"
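
Once the query finishes, its rows can be turned into a rough monthly ingestion estimate per stream. This is a sketch under stated assumptions: the sample mimics the shape of `aws logs get-query-results` output with made-up values, rank_streams is a hypothetical helper, and character counts are treated as bytes.

```python
INGEST_PER_GB = 0.50  # USD, the ingestion price quoted in the article


def rank_streams(query_results, window_days=7):
    """Turn Insights rows into (stream, estimated monthly ingestion cost) pairs."""
    ranked = []
    for row in query_results.get("results", []):
        # Each row is a list of {"field": ..., "value": ...} cells.
        fields = {cell["field"]: cell["value"] for cell in row}
        gb_in_window = float(fields["totalBytes"]) / 1e9
        monthly_gb = gb_in_window / window_days * 30
        ranked.append((fields["@logStream"], round(monthly_gb * INGEST_PER_GB, 2)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample shaped like `aws logs get-query-results` output.
sample = {
    "status": "Complete",
    "results": [
        [{"field": "@logStream", "value": "ecs/api/abc123"},
         {"field": "totalBytes", "value": "14000000000"}],
        [{"field": "@logStream", "value": "ecs/worker/def456"},
         {"field": "totalBytes", "value": "700000000"}],
    ],
}

for stream, cost in rank_streams(sample):
    print(stream, cost)
```

The estimate extrapolates a 7-day window to 30 days, so a one-off traffic spike in that week will inflate it.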

Once you know which service generates the most logs, go to that service and do three things: (1) check its log level, (2) check if it's logging stack traces on every request, (3) check if it's logging health check pings. Those three fix 90% of high-volume log problems. And when you're done with CloudWatch, the next invisible cost is per-environment attribution.
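
Health check pings are often the easiest of the three to remove. As an illustration, the sketch below drops them with a standard-library logging filter; the /health path and the "access" logger name are assumptions, so adjust both to whatever your load balancer and framework actually use.

```python
import io
import logging


class DropHealthChecks(logging.Filter):
    """Discard access-log records that mention the health check path."""

    def __init__(self, path="/health"):
        super().__init__()
        self.path = path

    def filter(self, record):
        # Returning False drops the record before this handler writes it.
        return self.path not in record.getMessage()


buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.addFilter(DropHealthChecks())
access_log = logging.getLogger("access")
access_log.setLevel(logging.INFO)
access_log.propagate = False
access_log.handlers.clear()
access_log.addHandler(handler)

access_log.info("GET /health 200")
access_log.info("GET /orders/42 200")
print(buffer.getvalue(), end="")
```

Filtering at the handler keeps the rest of the access log intact, which is useful if you still want INFO-level request lines in staging.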


Find what your fleet is spending: fortem.dev/audit