dwh_auditor.analyzer — DWH cost analysis/security diagnosis logic¶
The dwh_auditor.analyzer package is pure Python logic that runs diagnostics by comparing the Pydantic models received from the Extractor against the thresholds in config.yaml.
Note
This package does not import google.cloud.bigquery at all. Because there is no external API communication, unit tests complete in milliseconds. You can test it by simply passing in dummy QueryJob / TableStorage objects.
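The no-import rule above can be enforced mechanically. The following is a minimal sketch, not part of dwh_auditor: the helper name imports_forbidden is hypothetical, and it only inspects static import statements in a source string using the standard ast module.

```python
import ast

# The module that analyzer code must never import (taken from the note above).
FORBIDDEN = "google.cloud.bigquery"


def imports_forbidden(source: str, forbidden: str = FORBIDDEN) -> bool:
    """Return True if `source` statically imports `forbidden` or a submodule.

    Hypothetical helper for illustration; dynamic imports
    (importlib, __import__) are not detected.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # e.g. `import google.cloud.bigquery as bq`
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # e.g. `from google.cloud import bigquery`
            module = node.module or ""
            names = [module] + [f"{module}.{alias.name}" for alias in node.names]
        else:
            continue
        if any(n == forbidden or n.startswith(forbidden + ".") for n in names):
            return True
    return False


clean_source = "from pydantic import BaseModel\nimport math\n"
dirty_source = "from google.cloud import bigquery\n"

print(imports_forbidden(clean_source))
print(imports_forbidden(dirty_source))
```

In a real test suite, the same check could be run over the text of every file in the analyzer package; the location of those files depends on how the project is laid out and is not assumed here.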
Cost analysis (analyzer.cost)¶
Analysis logic for high-cost queries.
Note: This module must not import google.cloud.bigquery at all. Keep it to pure Python logic only so that unit tests complete in milliseconds.
Recurring cost analysis (analyzer.recurring)¶
Analysis logic for recurring queries.
Note: This module must not import google.cloud.bigquery at all.
Full scan detection (analyzer.scan)¶
Full scan (inefficient query) detection logic.
Note: This module must not import google.cloud.bigquery at all. Keep it to pure Python logic only so that unit tests complete in milliseconds.
- dwh_auditor.analyzer.scan.detect_full_scans(jobs, tables, config)[source]¶
Detect queries that may result in a full scan.
Byte-ratio validation method: a query is treated as a full scan if its billed bytes are at least 90% of the physical size of the referenced table.
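The byte-ratio rule can be illustrated with a small standalone sketch. This is not the implementation of detect_full_scans: the DummyJob and DummyTable classes, their field names, and the helper looks_like_full_scan are all hypothetical stand-ins, and the 0.9 ratio is hard-coded here even though the real function takes its thresholds from config.

```python
from dataclasses import dataclass

# 90% threshold from the description above; hard-coded for this sketch only.
FULL_SCAN_RATIO = 0.9


@dataclass(frozen=True)
class DummyTable:
    """Hypothetical stand-in for TableStorage (field names are assumptions)."""
    table_id: str
    physical_bytes: int


@dataclass(frozen=True)
class DummyJob:
    """Hypothetical stand-in for QueryJob (field names are assumptions)."""
    job_id: str
    referenced_table: str
    billed_bytes: int


def looks_like_full_scan(job, tables_by_id, ratio=FULL_SCAN_RATIO):
    """Return True when billed bytes reach `ratio` of the table's physical size."""
    table = tables_by_id.get(job.referenced_table)
    # Unknown or empty tables cannot be judged, so they are not flagged.
    if table is None or table.physical_bytes <= 0:
        return False
    return job.billed_bytes / table.physical_bytes >= ratio


tables = [DummyTable("sales.orders", 1_000_000_000)]
# Build a lookup once so each job check is a dictionary access.
tables_by_id = {t.table_id: t for t in tables}

jobs = [
    DummyJob("job_a", "sales.orders", 950_000_000),  # 95% of the table
    DummyJob("job_b", "sales.orders", 100_000_000),  # 10% of the table
]

flagged = [j.job_id for j in jobs if looks_like_full_scan(j, tables_by_id)]
print(flagged)
```

Because the inputs are plain objects, a check like this runs without any BigQuery client, which is what keeps the unit tests fast.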
- Parameters:
tables (list[TableStorage]) – List used to look up table size information
config (AppConfig) – Thresholds
- Returns:
List of FullScanInsight
- Return type:
Zombie table detection (analyzer.zombie)¶
Table profiling and zombie (unused) table detection logic.
Note: This module must not import google.cloud.bigquery at all. Keep it to pure Python logic only so that unit tests complete in milliseconds.
Analysis Runner (analyzer.runner)¶
Analysis runner: Calls each analyzer and aggregates the results into an AuditResult.
Note: This module must not import google.cloud.bigquery at all.
- dwh_auditor.analyzer.runner.run_analysis(top_cost_jobs, heavy_scan_jobs, recurring_stats, table_usages, tables, config, analyzed_days, project_id)[source]¶
Runs all analyzers and returns comprehensive audit results.
- Parameters:
- Returns:
AuditResult object
- Return type: