Cloud Costs Monitor – Multi-Cloud Cost Management System | Dmitry Konovalov

This project is a self-hosted, multi-cloud cost management system for AWS and Azure that I designed and built from scratch. It runs daily to collect granular cost line items at the service × region × account level, detects spend anomalies, sends configurable alerts, and produces a rich suite of HTML reports — from executive summaries and rolling-window trend views to per-owner and per-department drilldowns and an annual budget tracker. All data is stored locally in a SQLite database, keeping the system lightweight and self-contained with no external SaaS dependencies.

Key Contributions

Multi-cloud data collection via AWS Cost Explorer and Azure Cost Management APIs, stored as granular service × region × account × date line items in SQLite.
27 HTML report types generated by an interactive PowerShell menu (reports.ps1), including Executive Summary, Rolling Window, Executive/Rolling Owners variants, Per-Owner, Department, and Alert Owner screening reports.
Anomaly detection engine that flags spend changes only when both a percentage threshold (default 15 %) and an absolute USD threshold (default $100) are exceeded simultaneously, with a configurable noise floor to suppress false positives from negligible-cost services.
Budget Tracker showing YTD spend vs annual budget with a pro-rated period budget, an annualised forecast based on daily run rate, a progress bar with a pace marker, and four status badges (ON TRACK / AT RISK / OVER BUDGET / UNBUDGETED).
Owner-centric YAML model (owners.yaml) mapping cloud account IDs to named owners and departments with annual budgets, email delivery targets, and integrity checks (double-count and over-allocation warnings).
Efficient historical backfill using date-range API calls rather than day-by-day calls, supporting a -FillGaps mode and safe re-runs via UPSERT semantics.
Alerting via SMTP email and webhooks (Slack, PagerDuty, custom HTTP) with per-owner email delivery that auto-CCs department heads and attaches the HTML report as a MIME attachment.
PowerShell wrappers for Windows (with Task Scheduler support) and Bash/Python entry points for Linux/macOS cron.
Modular code architecture: refactored a 2,500-line executive_report.py monolith into three focused modules — report_utils (formatting, I/O, email), report_queries (all SQL), and report_html (CSS, print helpers, HTML renderers) — with full backward-compatibility re-exports so no consumer files required changes.
88 automated tests: 12 analyzer tests, 26 budget calculation tests, 22 config-loader tests, 8 database round-trip tests, and 20 PowerShell smoke tests for all report types.

Highlights

All HTML reports are self-contained, fully styled files — emailable directly or opened in any browser with no server required.
Dual-threshold detection prevents alert fatigue: a service jumping from $1 to $2 (100 % but negligible USD) does not trigger an alert.
Azure auto-discovery: if subscriptions: is omitted from config.yaml, the system discovers all subscriptions for a tenant via az account list.
Azure rate-limit handling: automatic exponential backoff with Retry-After header respect, plus a configurable inter-subscription delay to avoid 429 errors during large backfills.
Budget over-allocation and account double-count issues surface as visible warning banners in every affected report rather than silently skewing numbers.
The system explicitly does not replace cloud-native billing UIs — it aggregates and reports across providers for a single unified view.

Architecture

The system follows a layered, plugin-style design:

Cloud APIs (AWS Cost Explorer / Azure Cost Management)
        │
        ▼
src/providers/*       ← BaseProvider ABC; one implementation per cloud
        │
        ▼
src/database.py       ← SQLite upsert into daily_costs + cost_line_items
        │
   ┌────┴────────────────┐
   ▼                     ▼
src/analyzers/*      src/*_report.py
(anomaly detect)     (HTML generation)
   │                     │
   ▼                     ▼
src/notifiers/*      reports/*.html
(email / webhook)

Key source modules

Module	Purpose
`src/main.py`	Daily run orchestrator — CLI entry point
`src/config_loader.py`	YAML → typed dataclasses
`src/database.py`	SQLite read/write with UPSERT
`src/budget_calc.py`	`BudgetStatus` dataclass, YTD/forecast maths
`src/report_utils.py`	Shared formatting helpers
`src/report_queries.py`	All SQL query functions
`src/report_html.py`	CSS constant, inline HTML helpers
`src/executive_report.py`	Executive summary assembler
`src/rolling_report.py`	Rolling N-day HTML report
`src/owner_report.py`	Per-owner scoped rolling report
`src/department_report.py`	Per-department scoped rolling report
`src/alert_owner_report.py`	Screens all owners; generates reports for flagged ones only
`src/aws_backfill.py`	AWS bulk historical backfill
`src/azure_backfill.py`	Azure bulk historical backfill
`src/analyzers/threshold.py`	Percentage and absolute threshold checks
`src/notifiers/email.py`	SMTP delivery with HTML attachment
`src/notifiers/webhook.py`	HTTP POST (Slack / PagerDuty format)

Database schema (SQLite)

Two tables with UNIQUE constraints and UPSERT semantics make all operations idempotent:

daily_costs — aggregated daily total per account (provider, account_id, cost_date, total_cost, currency).

cost_line_items — granular service × region records (provider, account_id, cost_date, service_name, region, cost, currency, plus JSON columns for provider-specific dimensions, tags, and metadata).

Reports Overview

Report	`-Report` flag	Default window	Budget Tracker
Executive Summary	`Executive`	Full month	Yes
Executive Owners	`ExecutiveOwners`	Full month	Yes
Rolling Window	`Rolling`	7 days	Yes
Rolling Owners	`RollingOwners`	7 days	Yes
Per-Owner	`OwnerReport`	30 days	Yes
All Owners (batch)	`AllOwners`	30 days	Yes
Department	`DeptReport`	30 days	Yes
All Departments (batch)	`AllDepts`	30 days	Yes
Alert Owner	`AlertOwners`	Configurable	No
Daily Trend → Forecast	SQL menu 1–12	Varies	No

# Executive report for January, emailed after generation
.\reports.ps1 -Report Executive -Start 2026-01-01 -End 2026-01-31 -Format HTML -Email

# Rolling 14-day owners report with tighter thresholds
.\reports.ps1 -Report RollingOwners -Days 14 -Format HTML -Threshold 10 -ThresholdUsd 500

# Daily alerting run — email flagged owners only, department head CC'd
.\reports.ps1 -Report AlertOwners -Days 1 -Format HTML -Threshold 10 -ThresholdUsd 200 -Email

# Backfill last 90 days, filling only missing dates
.\backfill.ps1 -Days 90 -FillGaps

Anomaly Detection

On each daily run the engine compares every account × service × region trio from yesterday against the prior day. A notable change is flagged only when all of the following conditions hold simultaneously, preventing noise:

|Δ%| ≥ percentage_change threshold (default 15 %)
|Δ USD| ≥ absolute_change threshold (default $100)
Prior cost ≥ min_cost_for_percentage floor (default $10) — waived for tiny services

Decreases are suppressed by default (alert_on_decrease: false) and can be enabled when budget reclamation tracking is needed.

Budget Tracker

Each HTML report with a Budget Tracker section displays:

Column	Description
Yr Budget	Annual budget in USD from `owners.yaml`
YTD Spend	Spend from fiscal year start to report end date
YTD %	Progress bar with a pace marker at the expected YTD %
Period Budget	Pro-rated share of the annual budget for the report period
Period Spend	Actual spend for the period
Forecast	Annualised forecast based on daily run rate
Status	ON TRACK / AT RISK / OVER BUDGET / UNBUDGETED badge

Status thresholds: ON TRACK ≤ 90 % of budget; AT RISK 90–110 %; OVER BUDGET > 110 %; UNBUDGETED when no budget is configured.

Scheduling

Windows — Task Scheduler

$action  = New-ScheduledTaskAction -Execute "powershell.exe" `
               -Argument "-NonInteractive -File C:\cloudcosts\run.ps1" `
               -WorkingDirectory "C:\cloudcosts"
$trigger = New-ScheduledTaskTrigger -Daily -At "08:00"
Register-ScheduledTask -TaskName "CloudCostsMonitor" -Action $action -Trigger $trigger -RunLevel Highest

Linux / macOS — cron

# Collect costs daily at 8 AM
0 8 * * * /opt/cloudcosts/.venv/bin/python -m src.main >> /var/log/cloudcosts.log 2>&1

# Weekly executive report every Monday at 9 AM
0 9 * * 1 /opt/cloudcosts/.venv/bin/python -m src.executive_report \
    --start $(date -d "last month" +%Y-%m-01) \
    --end $(date -d "today" +%Y-%m-01) \
    --html --email >> /var/log/cloudcosts.log 2>&1

Testing

# All Python unit tests
.\.venv\Scripts\python.exe -m pytest tests/ -v

# PowerShell smoke tests (requires populated database)
.\tests\Test-Reports.ps1

# PowerShell linter (zero warnings expected)
Invoke-ScriptAnalyzer -Path . -Recurse -Settings PSScriptAnalyzerSettings.psd1

Test file	Count	Coverage
`test_analyzers.py`	12	Threshold engine, % and absolute checks, `alert_on_decrease`
`test_budget_calc.py`	26	`compute_budget_status()`, status thresholds, edge cases
`test_config_loader.py`	22	Config parsing, dataclass defaults, env-var resolution
`test_database.py`	8	Store/query round-trips, UPSERT idempotency
`Test-Reports.ps1`	20	Smoke tests for all 20 `reports.ps1` report types

Key Contributions#

Highlights#

Architecture#

Key source modules#

Database schema (SQLite)#

Reports Overview#

Anomaly Detection#

Budget Tracker#

Scheduling#

Windows — Task Scheduler#

Linux / macOS — cron#

Testing#