Quickstart — Installation and Initial Audit

This page walks you through installing dwh-auditor and running your first BigQuery audit in under 5 minutes.

Prerequisites

| Requirement | Details |
| --- | --- |
| Python Version | 3.9 or higher |
| GCP Authentication | gcloud auth application-default login or GOOGLE_APPLICATION_CREDENTIALS |
| IAM Permissions | roles/bigquery.metadataViewer / roles/bigquery.resourceViewer |
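
The Python requirement above can be checked before installing. The following is a minimal sketch, assuming `python3` is the interpreter you plan to install into; `python_version_ok` is a hypothetical helper written for this page, not something dwh-auditor provides.

```shell
# Hypothetical helper (not part of dwh-auditor): succeeds when a
# "major.minor[.patch]" version string is 3.9 or newer.
python_version_ok() {
  local major rest minor
  major="${1%%.*}"      # text before the first dot
  rest="${1#*.}"        # text after the first dot
  minor="${rest%%.*}"   # text before the next dot, if any
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 9 ]; }
}

# Check the interpreter on PATH, if there is one.
if command -v python3 >/dev/null 2>&1; then
  found="$(python3 -c 'import platform; print(platform.python_version())')"
  if python_version_ok "$found"; then
    echo "Python $found meets the 3.9+ requirement"
  else
    echo "Python $found is too old; 3.9 or higher is required" >&2
  fi
fi
```

The comparison is done on the major and minor numbers only, so patch levels and pre-release suffixes do not affect the result.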

Installation

Installation via pip

pip install dwh-auditor

Verify the installation:

dwh-auditor --help

Installation via uv (Recommended)

uv resolves dependencies faster and makes installs more reproducible.

uv add dwh-auditor

Installation from source (Developers)

git clone https://github.com/shirokurolab/dwh-auditor.git
cd dwh-auditor

# Create virtual environment and install dependencies
uv venv && uv pip install -e ".[dev]"

GCP Authentication Settings

Application Default Credentials (Recommended)

Using ADC from the Google Cloud SDK is the simplest approach for local development.

gcloud auth application-default login

Using Service Account Keys (CI/CD environments)

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

Warning

Service account keys (JSON files) must never be committed to Git repositories. For CI/CD environments, use GitHub Actions Secrets or Workload Identity Federation.
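
Before running an audit, it can help to confirm which of the two credential sources described above is visible from your shell. The sketch below is an assumption-laden check, not a feature of dwh-auditor or gcloud: `credential_source` is a hypothetical helper, and the ADC path it probes is the usual default on Linux and macOS (it differs on Windows).

```shell
# Hypothetical helper (not part of dwh-auditor or gcloud): report which
# credential source is visible from the current shell.
credential_source() {
  # Assumed default ADC location on Linux/macOS; CLOUDSDK_CONFIG can move it.
  local adc_file="${CLOUDSDK_CONFIG:-$HOME/.config/gcloud}/application_default_credentials.json"
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    # The environment variable takes precedence when it is set.
    if [ -f "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
      echo "env-file"
    else
      echo "env-file-missing"
    fi
  elif [ -f "$adc_file" ]; then
    echo "adc"
  else
    echo "none"
  fi
}

echo "Credential source: $(credential_source)"
```

A result of `env-file-missing` means the variable points at a path that does not exist, and `none` means neither source was found. The check only looks for files; it does not validate the credentials or the IAM roles attached to them.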

Generating a Configuration File (init command)

Generate a configuration file template before running an audit.

dwh-auditor init

A config.yaml file will be generated in the current directory. You can use it to customize cost rates and thresholds. For details, refer to Configuration File (config.yaml).

Running an Audit (analyze command)

Basic Usage

# Analyze my-gcp-project for the past 30 days and output to console
dwh-auditor analyze --project my-gcp-project --days 30

Specifying the Tokyo Region

dwh-auditor analyze \
  --project my-gcp-project \
  --region region-asia-northeast1 \
  --days 30
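
Both region values shown on this page (region-us and region-asia-northeast1) carry a `region-` prefix, the same qualifier BigQuery uses for INFORMATION_SCHEMA views. Whether dwh-auditor also accepts a bare location such as `asia-northeast1` is not stated here, so the sketch below normalizes the value before passing it on. `region_qualifier` is a hypothetical helper, not part of the tool.

```shell
# Hypothetical helper (not part of dwh-auditor): add the "region-" prefix
# seen in the documented values when it is missing.
region_qualifier() {
  case "$1" in
    region-*) printf '%s\n' "$1" ;;        # already qualified, pass through
    *)        printf 'region-%s\n' "$1" ;; # bare location, add the prefix
  esac
}

region_qualifier asia-northeast1   # prints: region-asia-northeast1
region_qualifier region-us         # prints: region-us
```

The result can then be passed as `--region "$(region_qualifier asia-northeast1)"`.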

Generating Markdown Report

dwh-auditor analyze \
  --project my-gcp-project \
  --days 30 \
  --output markdown \
  --report-path audit_report.md

Tip

Specifying --output markdown is handy for saving the report as a GitHub Actions artifact and sharing it with your team.

Outputting JSON

dwh-auditor analyze \
  --project my-gcp-project \
  --output json > result.json

Analyzing Across Multiple Compute Projects

When storage (the project that holds the data) and compute (the projects that run queries) are separate, pass the projects where queries run as a comma-separated list.

dwh-auditor analyze \
  --project my-storage-project \
  --job-projects my-compute-prj1,my-compute-prj2 \
  --days 30
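
When the compute projects come from a script or a CI variable rather than being typed by hand, the comma-separated value can be assembled in the shell. This is a minimal sketch: `join_by_comma` is a hypothetical helper, the project IDs are the placeholders used above, and the audit itself only runs if the `dwh-auditor` command is found on PATH.

```shell
# Hypothetical helper (not part of dwh-auditor): join all arguments
# with commas, e.g. "a b c" -> "a,b,c".
join_by_comma() {
  local IFS=,
  printf '%s\n' "$*"   # "$*" joins arguments with the first character of IFS
}

job_projects="$(join_by_comma my-compute-prj1 my-compute-prj2)"
echo "$job_projects"   # prints: my-compute-prj1,my-compute-prj2

# Only run the audit when the CLI is installed.
if command -v dwh-auditor >/dev/null 2>&1; then
  dwh-auditor analyze \
    --project my-storage-project \
    --job-projects "$job_projects" \
    --days 30
fi
```

Quoting `"$job_projects"` keeps the list as a single argument.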

Command Reference

Options for the analyze Command

Usage: dwh-auditor analyze [OPTIONS]

Options:

  -p, --project TEXT        GCP project ID to analyze  [required]
  -jp, --job-projects TEXT  Comma-separated list of projects where queries run
  -r, --region TEXT         BigQuery region  [default: region-us]
  -d, --days INTEGER        Number of past days to analyze  [default: 30]
  -c, --config TEXT         Path to the configuration file  [default: config.yaml]
  -o, --output TEXT         Output format: console, markdown, or json  [default: console]
      --report-path TEXT    Output path for the Markdown report  [default: report.md]
  --help                    Show this help message

Next Steps