Quickstart: Installation and Initial Audit
This page walks you through installing dwh-auditor and running your first BigQuery audit in under 5 minutes.
Prerequisites
| Requirement | Details |
|---|---|
| Python Version | 3.9 or higher |
| GCP Authentication | |
| IAM Permissions | |
Installation
Installation via pip
pip install dwh-auditor
Verify the installation:
dwh-auditor --help
Installation via uv (Recommended)
uv resolves dependencies quickly, and its lockfile makes installs reproducible.
uv add dwh-auditor
Installation from Source (Developers)
git clone https://github.com/shirokurolab/dwh-auditor.git
cd dwh-auditor
# Create virtual environment and install dependencies
uv venv && uv pip install -e ".[dev]"
GCP Authentication Settings
Application Default Credentials (Recommended)
Using ADC from the Google Cloud SDK is the simplest approach for local development.
gcloud auth application-default login
Using Service Account Keys (CI/CD Environments)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
Warning
Service account keys (JSON files) must never be committed to Git repositories. For CI/CD environments, use GitHub Actions Secrets or Workload Identity Federation.
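Before running an audit, it can help to confirm which credential source is likely to be picked up. The sketch below mirrors the first two steps of Google's Application Default Credentials lookup order: the GOOGLE_APPLICATION_CREDENTIALS variable, then the file written by gcloud auth application-default login. The helper name and the returned labels are illustrative assumptions, not part of dwh-auditor, and the path shown is the Linux/macOS default.

```python
import os
from pathlib import Path

# Default file written by `gcloud auth application-default login`
# on Linux and macOS (Windows stores it under the APPDATA directory instead).
ADC_RELATIVE_PATH = Path(".config") / "gcloud" / "application_default_credentials.json"


def detect_credential_source(env, home):
    """Return a label for the credential source ADC would likely use.

    Hypothetical helper for illustration only. It checks that a file is
    present, not that the credentials are valid or have enough permissions.
    """
    explicit_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit_path:
        # An explicitly configured file takes precedence over gcloud user credentials.
        if Path(explicit_path).is_file():
            return "env-var-file"
        return "env-var-file-missing"
    if (Path(home) / ADC_RELATIVE_PATH).is_file():
        return "gcloud-adc"
    return "none-found"


if __name__ == "__main__":
    print(detect_credential_source(os.environ, Path.home()))
```

A result of env-var-file-missing usually means the exported path is wrong; none-found on a workstation suggests that gcloud auth application-default login has not been run yet.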
Generating a Configuration File (init command)
Generate a configuration file template before running an audit.
dwh-auditor init
A config.yaml file will be generated in the current directory. You can use it to customize cost rates and thresholds. For details, refer to Configuration File (config.yaml).
Running an Audit (analyze command)
Basic Usage
# Analyze my-gcp-project for the past 30 days and output to console
dwh-auditor analyze --project my-gcp-project --days 30
Specifying the Tokyo Region
dwh-auditor analyze \
--project my-gcp-project \
--region region-asia-northeast1 \
--days 30
Generating a Markdown Report
dwh-auditor analyze \
--project my-gcp-project \
--days 30 \
--output markdown \
--report-path audit_report.md
Tip
Specifying --output markdown is convenient for saving the report as a GitHub Actions artifact and sharing it with your team.
Generating JSON Output
dwh-auditor analyze \
--project my-gcp-project \
--output json > result.json
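The structure of the JSON output is not described on this page, so the following sketch makes no assumptions about field names. It only outlines the shape of whatever result.json contains, which is a reasonable first step before writing code against it. The helper names outline and outline_file are illustrative.

```python
import json


def outline(value, depth=2):
    """Describe the shape of parsed JSON down to `depth` levels of nesting."""
    if isinstance(value, dict):
        if depth == 0:
            return "object"
        return {key: outline(child, depth - 1) for key, child in value.items()}
    if isinstance(value, list):
        return f"array[{len(value)}]"
    # Scalars: report the Python type name (str, int, float, bool, NoneType).
    return type(value).__name__


def outline_file(path="result.json"):
    """Load a JSON file and return its outline."""
    with open(path, encoding="utf-8") as fh:
        return outline(json.load(fh))
```

Calling outline_file() in the directory where result.json was written returns a nested dictionary of key names and value types that can be printed or inspected interactively.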
Analyzing Across Multiple Compute Projects
When storage (the project that holds the data) and compute (the projects that run queries) are separate, pass the projects where queries run as a comma-separated list.
dwh-auditor analyze \
--project my-storage-project \
--job-projects my-compute-prj1,my-compute-prj2 \
--days 30
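When the list of compute projects comes from a script or an inventory file, the comma-separated value can be assembled programmatically. This is a minimal sketch; the function name is hypothetical, and it only trims whitespace, skips blank entries, and removes duplicates while preserving order.

```python
def join_job_projects(project_ids):
    """Join project IDs into one comma-separated value for --job-projects."""
    unique_ids = []
    for raw in project_ids:
        project_id = raw.strip()
        # Skip blanks and keep only the first occurrence of each ID.
        if project_id and project_id not in unique_ids:
            unique_ids.append(project_id)
    return ",".join(unique_ids)


print(join_job_projects(["my-compute-prj1", " my-compute-prj2 ", "my-compute-prj1"]))
# prints: my-compute-prj1,my-compute-prj2
```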
Command Reference
analyze Command Options
Usage: dwh-auditor analyze [OPTIONS]
Options:
  -p, --project TEXT        GCP project ID to analyze [required]
  -jp, --job-projects TEXT  Comma-separated list of projects where queries run
  -r, --region TEXT         BigQuery region [default: region-us]
  -d, --days INTEGER        Number of past days to analyze [default: 30]
  -c, --config TEXT         Path to the configuration file [default: config.yaml]
  -o, --output TEXT         Output format: console, markdown, or json [default: console]
  --report-path TEXT        Output path for the Markdown report [default: report.md]
  --help                    Show help
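For scheduled or scripted audits, the command line can be assembled from Python and handed to subprocess. The sketch below uses only the long option names listed above; build_analyze_command is an illustrative helper, not part of the dwh-auditor package, and it leaves any option set to None at the CLI default.

```python
import shlex


def build_analyze_command(project, days=None, region=None, job_projects=None,
                          config=None, output=None, report_path=None):
    """Build the argv list for `dwh-auditor analyze` from the documented options."""
    argv = ["dwh-auditor", "analyze", "--project", project]
    optional = [
        ("--job-projects", job_projects),
        ("--region", region),
        ("--days", days),
        ("--config", config),
        ("--output", output),
        ("--report-path", report_path),
    ]
    for flag, value in optional:
        # Omit unset options so the CLI applies its own defaults.
        if value is not None:
            argv += [flag, str(value)]
    return argv


command = build_analyze_command(
    "my-gcp-project",
    days=30,
    region="region-asia-northeast1",
    output="markdown",
    report_path="audit_report.md",
)
print(shlex.join(command))
```

Passing the list to subprocess.run(command, check=True) runs the audit without going through a shell, which avoids quoting problems when values come from configuration or user input.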
Next Steps
Configuration File (config.yaml): Customize thresholds and cost rates in config.yaml
Architecture (3-Tier DWH Audit Architecture): Understand the internal design and 3-tier architecture of dwh-auditor
API Reference: Learn how to use dwh-auditor from Python