dwh-auditor β€” DWH Cost Audit & Governance ToolΒΆ

Python Versions License MIT PyPI version

dwh-auditor is an open-source CLI tool that parses BigQuery’s INFORMATION_SCHEMA to instantly perform cost optimization, security auditing, and governance enforcement for your cloud data warehouse.

DWH-Auditor Console Report Sample

Tip

It never accesses actual table data. Since it only reads metadata (INFORMATION_SCHEMA), it can be deployed instantly even in enterprise environments with strict security policies.

Key FeaturesΒΆ

#

Feature

Description

πŸ’Έ

Ad-hoc High-Cost Query Detection

Displays the Top-N ranking of single queries with the highest billed bytes over the past N days.

πŸ”„

Recurring Execution Alert (Periodic High-Cost Queries)

Detects queries executed periodically from batches or dashboards that have high accumulated costs.

🚨

Full Scan Detection

Warns of inefficient full table scans caused by missing partition filters in the WHERE clause.

🧟

Zombie Table Detection

Identifies tables that have not been referenced for a long time to visualize unnecessary storage costs.

πŸ“Š

Multi-format Output (Markdown / JSON)

Integrate into CI/CD to save as GitHub Actions Artifacts or output jq-parsable results.

QuickstartΒΆ

pip install dwh-auditor

# Generate a configuration file
dwh-auditor init

# Audit BigQuery project (Console output)
dwh-auditor analyze --project my-gcp-project --days 30

# Generate Markdown report
dwh-auditor analyze --project my-gcp-project --output markdown

Documentation Table of ContentsΒΆ

Deployment

Required IAM PermissionsΒΆ

dwh-auditor only reads metadata, requiring minimal permissions.

IAM Role

Usage

roles/bigquery.metadataViewer

View dataset and table metadata

roles/bigquery.resourceViewer

View job history (INFORMATION_SCHEMA.JOBS)

Warning

Permissions beyond roles/bigquery.dataViewer are NOT required. It does not access the actual data (records) in the tables.

Index and SearchΒΆ