dwh_auditor.models: internal data models for DWH auditing (Pydantic)

The dwh_auditor.models package defines the "type contracts" that the layers (Extractor → Analyzer → Reporter) use when passing data to one another. Passing Pydantic models instead of raw dicts provides both static type checking and runtime validation.
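To illustrate what runtime validation adds over a raw dict, here is a minimal sketch assuming Pydantic v2. `JobRow` and `try_build` are hypothetical names made up for this example and are not part of the package.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class JobRow(BaseModel):
    """Hypothetical two-field stand-in for a model passed between layers."""

    job_id: str
    total_bytes_billed: int = 0


def try_build(raw: dict) -> Optional[JobRow]:
    """Return a validated JobRow, or None if the raw dict breaks the contract."""
    try:
        return JobRow(**raw)
    except ValidationError:
        return None


# A well-formed row passes validation.
good = try_build({"job_id": "job_1", "total_bytes_billed": 1024})

# A non-numeric byte count is rejected at construction time.
bad = try_build({"job_id": "job_2", "total_bytes_billed": "not a number"})

# A missing required field is rejected as well.
missing = try_build({"total_bytes_billed": 10})
```

With a plain dict, both malformed rows would travel on to the next layer and fail later, far from where the bad data entered.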

Query job model (`models.job`)

Data model for BigQuery query jobs.

class dwh_auditor.models.job.QueryJob(*, job_id, user_email, query, creation_time, total_bytes_billed=0, cache_hit=False, referenced_tables=<factory>, statement_type='SELECT')

Bases: BaseModel

Model representing an entry in the BigQuery query job history.

Serves as the data contract when the Extractor layer fetches rows from BigQuery's INFORMATION_SCHEMA.JOBS and passes them to the Analyzer layer.

Parameters:

job_id (str)
user_email (str)
query (str)
creation_time (datetime)
total_bytes_billed (int)
cache_hit (bool)
referenced_tables (list[str])
statement_type (str)

job_id: str

user_email: str

query: str

creation_time: datetime

total_bytes_billed: int

cache_hit: bool

referenced_tables: list[str]

statement_type: str

model_config = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
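The following sketch reconstructs a model with the same fields and defaults as the signature above, assuming Pydantic v2. It is not the package's source: in particular, the `<factory>` default for `referenced_tables` is assumed here to be `list`, and the sample values are invented.

```python
from datetime import datetime, timezone

from pydantic import BaseModel, Field


class QueryJob(BaseModel):
    """Sketch rebuilt from the documented signature, not the real source."""

    job_id: str
    user_email: str
    query: str
    creation_time: datetime
    total_bytes_billed: int = 0
    cache_hit: bool = False
    # Assumption: the documented <factory> is `list`, so each instance
    # gets its own empty list.
    referenced_tables: list[str] = Field(default_factory=list)
    statement_type: str = "SELECT"


# Only the four required fields are supplied; the rest fall back to defaults.
job = QueryJob(
    job_id="job_1",
    user_email="analyst@example.com",
    query="SELECT 1",
    creation_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
```

All fields are keyword-only, which matches the leading `*` in the documented signature.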

Table storage model (`models.table`)

Data model for BigQuery table storage.

class dwh_auditor.models.table.TableStorage(*, project_id, dataset_id, table_id, total_logical_bytes=0, total_physical_bytes=0, active_logical_bytes=0)

Bases: BaseModel

Model representing per-table storage information in BigQuery.

Serves as the data contract when the Extractor layer fetches rows from BigQuery's INFORMATION_SCHEMA.TABLE_STORAGE and passes them to the Analyzer layer (zombie table detection).

Parameters:

project_id (str)
dataset_id (str)
table_id (str)
total_logical_bytes (int)
total_physical_bytes (int)
active_logical_bytes (int)

project_id: str

dataset_id: str

table_id: str

total_logical_bytes: int

total_physical_bytes: int

active_logical_bytes: int

property full_table_id: str: The fully qualified name, formed by joining the project, dataset, and table names.

model_config = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
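A minimal sketch of how a derived property such as `full_table_id` can sit alongside validated fields, assuming Pydantic v2. The page does not state the separator, so joining with `.` (the usual BigQuery style) is an assumption, as are the sample identifiers.

```python
from pydantic import BaseModel


class TableStorage(BaseModel):
    """Sketch rebuilt from the documented signature, not the real source."""

    project_id: str
    dataset_id: str
    table_id: str
    total_logical_bytes: int = 0
    total_physical_bytes: int = 0
    active_logical_bytes: int = 0

    @property
    def full_table_id(self) -> str:
        # Assumption: the three parts are joined with "."; the documentation
        # only says they are combined into a fully qualified name.
        return f"{self.project_id}.{self.dataset_id}.{self.table_id}"


table = TableStorage(project_id="my-project", dataset_id="sales", table_id="orders")
qualified = table.full_table_id
```

Because it is a plain property rather than a field, `full_table_id` is computed on access and does not appear in `table.model_dump()`.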

Analysis result models (`models.result`)

Data models for analysis results.

Type definitions for the diagnostic results that the Analyzer layer passes to the Reporter layer.

class dwh_auditor.models.result.CostInsight(*, job, estimated_cost_usd, scanned_tb)

Bases: BaseModel

Analysis result for a high-cost query.

Parameters:

job (QueryJob)
estimated_cost_usd (float)
scanned_tb (float)

job: QueryJob

estimated_cost_usd: float

scanned_tb: float

model_config = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
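This page does not say how `estimated_cost_usd` and `scanned_tb` are computed. One plausible derivation from `QueryJob.total_bytes_billed` is sketched below. The helper names, the choice of a binary terabyte (1024^4 bytes), and the rate of 5.0 USD per TB are all assumptions for illustration, not values taken from the package.

```python
# Assumption: "TB" means a binary terabyte (1024**4 bytes).
BYTES_PER_TB = 1024**4


def bytes_to_tb(total_bytes_billed: int) -> float:
    """Convert a billed byte count to terabytes."""
    return total_bytes_billed / BYTES_PER_TB


def estimate_cost_usd(total_bytes_billed: int, usd_per_tb: float) -> float:
    """Estimate query cost as scanned terabytes times a per-terabyte rate."""
    return bytes_to_tb(total_bytes_billed) * usd_per_tb


billed = 2 * BYTES_PER_TB  # a job billed for exactly 2 TB
scanned_tb = bytes_to_tb(billed)
# Hypothetical rate; a real rate depends on the pricing plan in use.
cost = estimate_cost_usd(billed, usd_per_tb=5.0)
```

Keeping the rate as a parameter avoids hard-coding a price that may differ by region or billing model.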

class dwh_auditor.models.result.FullScanInsight(*, job, scanned_gb)

Bases: BaseModel

Analysis result for a query judged to be a full scan.

Parameters:

job (QueryJob)
scanned_gb (float)

job: QueryJob

scanned_gb: float

model_config = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class dwh_auditor.models.result.TableUsageProfile(*, table, is_zombie, last_accessed_at=None, top_users=<factory>, access_count_30d=0, size_gb)

Bases: BaseModel

Usage profile of a table, together with its zombie-detection result.

Parameters:

table (TableStorage)
is_zombie (bool)
last_accessed_at (datetime | None)
top_users (list[str])
access_count_30d (int)
size_gb (float)

table: TableStorage

is_zombie: bool

last_accessed_at: datetime | None

top_users: list[str]

access_count_30d: int

size_gb: float

model_config = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class dwh_auditor.models.result.RecurringCostInsight(*, query_hash, query_sample, execution_count, total_estimated_usd, total_scanned_tb, last_executed_at)

Bases: BaseModel

Analysis result for a high-cost query that is executed routinely, for example by batch jobs or BI tools.

Parameters:

query_hash (str)
query_sample (str)
execution_count (int)
total_estimated_usd (float)
total_scanned_tb (float)
last_executed_at (datetime)

query_hash: str

query_sample: str

execution_count: int

total_estimated_usd: float

total_scanned_tb: float

last_executed_at: datetime

model_config = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class dwh_auditor.models.result.AuditResult(*, analyzed_days, project_id, total_jobs_analyzed, total_tables_analyzed, top_expensive_queries=<factory>, recurring_expensive_queries=<factory>, full_scans=<factory>, table_profiles=<factory>)

Bases: BaseModel

Overall audit result report that the Analyzer layer ultimately outputs.

Parameters:

analyzed_days (int)
project_id (str)
total_jobs_analyzed (int)
total_tables_analyzed (int)
top_expensive_queries (list[CostInsight])
recurring_expensive_queries (list[RecurringCostInsight])
full_scans (list[FullScanInsight])
table_profiles (list[TableUsageProfile])

analyzed_days: int

project_id: str

total_jobs_analyzed: int

total_tables_analyzed: int

top_expensive_queries: list[CostInsight]

recurring_expensive_queries: list[RecurringCostInsight]

full_scans: list[FullScanInsight]

table_profiles: list[TableUsageProfile]

model_config = {}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
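Because `AuditResult` nests other models, Pydantic validates the whole tree at once. The sketch below, assuming Pydantic v2, uses deliberately trimmed versions of `CostInsight` and `AuditResult` (several documented fields are omitted to keep it short) and invented sample values. The JSON round trip is only an illustration of what a consumer could do, not a description of how the Reporter layer works.

```python
from pydantic import BaseModel, Field


class CostInsight(BaseModel):
    # Trimmed for brevity: the documented model also has a `job: QueryJob` field.
    estimated_cost_usd: float
    scanned_tb: float


class AuditResult(BaseModel):
    # Trimmed for brevity: the other documented list fields are omitted.
    analyzed_days: int
    project_id: str
    total_jobs_analyzed: int
    total_tables_analyzed: int
    top_expensive_queries: list[CostInsight] = Field(default_factory=list)


# Nested dicts are validated and converted into CostInsight instances.
result = AuditResult.model_validate(
    {
        "analyzed_days": 30,
        "project_id": "my-project",
        "total_jobs_analyzed": 2,
        "total_tables_analyzed": 1,
        "top_expensive_queries": [{"estimated_cost_usd": 10.0, "scanned_tb": 2.0}],
    }
)

# Serialize to JSON and load it back; the restored object compares equal.
payload = result.model_dump_json()
restored = AuditResult.model_validate_json(payload)
```

The list fields default to empty lists, so an audit that finds nothing still yields a valid `AuditResult`.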
