dwh-auditor: DWH Cost Audit & Governance Tool
dwh-auditor is an open-source CLI tool that parses BigQuery's INFORMATION_SCHEMA to perform cost optimization, security auditing, and governance enforcement for your cloud data warehouse.
Tip
It never accesses actual table data. Since it only reads metadata (INFORMATION_SCHEMA), it can be deployed instantly even in enterprise environments with strict security policies.
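To make the metadata-only idea concrete, here is a minimal sketch of the kind of query that reads job history from INFORMATION_SCHEMA rather than from any user table. It is not taken from dwh-auditor's source: the function name `build_top_jobs_query`, the default region, and the choice of the `JOBS_BY_PROJECT` view are assumptions made for illustration.

```python
def build_top_jobs_query(project: str, region: str = "us",
                         days: int = 30, limit: int = 10) -> str:
    """Return SQL that ranks recent query jobs by billed bytes.

    Hypothetical helper: it only touches the job-history metadata view,
    never the contents of a user table.
    """
    if days <= 0 or limit <= 0:
        raise ValueError("days and limit must be positive")
    return (
        "SELECT job_id, user_email, total_bytes_billed\n"
        f"FROM `{project}`.`region-{region}`.INFORMATION_SCHEMA.JOBS_BY_PROJECT\n"
        "WHERE creation_time >= "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {int(days)} DAY)\n"
        "  AND job_type = 'QUERY'\n"
        "ORDER BY total_bytes_billed DESC\n"
        f"LIMIT {int(limit)}"
    )


print(build_top_jobs_query("my-gcp-project"))
```

The resulting string could be passed to any BigQuery client; the point of the sketch is only that the `FROM` clause names a metadata view.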
Key Features
| Feature | Description |
|---|---|
| Ad-hoc High-Cost Query Detection | Displays the Top-N ranking of single queries with the highest billed bytes over the past N days. |
| Recurring Execution Alert (Periodic High-Cost Queries) | Detects queries that batches or dashboards run periodically and that accumulate high total cost. |
| Full Scan Detection | Warns of inefficient full table scans caused by missing partition filters. |
| Zombie Table Detection | Identifies tables that have not been referenced for a long time, making unnecessary storage costs visible. |
| Multi-format Output (Markdown / JSON) | Integrates into CI/CD: save reports as GitHub Actions artifacts or emit jq-parsable results. |
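The detection features above can be pictured as plain aggregations over job-history and table-access metadata. The sketch below is illustrative only and does not reproduce dwh-auditor's internals: `JobRecord`, `top_n_queries`, `recurring_costs`, `zombie_tables`, and the 90-day idle threshold are all hypothetical names and values.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class JobRecord:
    """Hypothetical job-history row; field names are illustrative."""
    query: str
    bytes_billed: int


def top_n_queries(jobs, n=3):
    """Rank single queries by billed bytes, highest first."""
    return sorted(jobs, key=lambda j: j.bytes_billed, reverse=True)[:n]


def recurring_costs(jobs, min_runs=2):
    """Sum billed bytes per query text for queries seen at least min_runs times."""
    totals = defaultdict(lambda: [0, 0])  # query -> [run count, total bytes]
    for job in jobs:
        key = " ".join(job.query.split())  # crude normalization: collapse whitespace
        totals[key][0] += 1
        totals[key][1] += job.bytes_billed
    return {q: (runs, total) for q, (runs, total) in totals.items() if runs >= min_runs}


def zombie_tables(last_access, now, max_idle_days=90):
    """Return tables never read, or last read before the idle cutoff."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(t for t, ts in last_access.items() if ts is None or ts < cutoff)


jobs = [
    JobRecord("SELECT * FROM sales", 500),
    JobRecord("SELECT id FROM users", 10),
    JobRecord("SELECT  id FROM users", 15),
    JobRecord("SELECT * FROM logs", 900),
]
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
last_access = {
    "ds.orders": datetime(2024, 5, 20, tzinfo=timezone.utc),
    "ds.old_events": datetime(2023, 1, 1, tzinfo=timezone.utc),
    "ds.never_read": None,
}

print(top_n_queries(jobs, n=2))
print(recurring_costs(jobs))
print(zombie_tables(last_access, now))
```

Separating a single-query ranking from a per-query accumulation mirrors the distinction the table draws between ad-hoc and recurring cost: a cheap query that runs often can outweigh one expensive run.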
Quickstart
pip install dwh-auditor
# Generate a configuration file
dwh-auditor init
# Audit BigQuery project (Console output)
dwh-auditor analyze --project my-gcp-project --days 30
# Generate Markdown report
dwh-auditor analyze --project my-gcp-project --output markdown
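For scripted or CI use, the commands above can be wrapped in a small Python helper. This is a sketch under assumptions: `build_analyze_command` and `run_audit` are hypothetical names, only the flags shown in the Quickstart (`--project`, `--days`, `--output`) are used, and it is assumed (not confirmed by this page) that those flags can be combined and that the report is written to standard output.

```python
import shlex
import subprocess
from typing import List, Optional


def build_analyze_command(project: str,
                          days: Optional[int] = None,
                          output: Optional[str] = None) -> List[str]:
    """Assemble the argv list for `dwh-auditor analyze` from the Quickstart flags."""
    cmd = ["dwh-auditor", "analyze", "--project", project]
    if days is not None:
        cmd += ["--days", str(days)]
    if output is not None:
        cmd += ["--output", output]
    return cmd


def run_audit(project: str,
              days: Optional[int] = None,
              output: Optional[str] = None) -> str:
    """Run the CLI and return whatever it prints (assumes the report goes to stdout)."""
    result = subprocess.run(
        build_analyze_command(project, days, output),
        check=True,            # raise CalledProcessError on a non-zero exit code
        capture_output=True,
        text=True,
    )
    return result.stdout


# Show the command that would be executed, without running it.
print(shlex.join(build_analyze_command("my-gcp-project", days=30, output="markdown")))
```

Building the argument list separately from executing it keeps the command easy to inspect and test in a pipeline before any BigQuery call is made.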
Documentation Table of Contents
Architecture
API Reference
- API Reference
- dwh_auditor.config: Load configuration files and manage DWH settings
- dwh_auditor.models: Internal data model for DWH auditing (Pydantic)
- dwh_auditor.extractor: BigQuery metadata extraction layer
- dwh_auditor.analyzer: DWH cost analysis/security diagnosis logic
- dwh_auditor.reporter: Audit result output/report generation layer
Deployment
Required IAM Permissions
dwh-auditor reads only metadata, so it requires minimal permissions.
| IAM Role | Usage |
|---|---|
| | View dataset and table metadata |
| | View job history |
Warning
Permissions beyond roles/bigquery.dataViewer are NOT required. It does not access the actual data (records) in the tables.