dwh_auditor.models: Internal data model for DWH auditing (Pydantic)
The dwh_auditor.models package defines the “type contract” for data passed between layers (Extractor → Analyzer → Reporter). Passing Pydantic models instead of plain dicts provides both static type checking and runtime validation.
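The difference between a dict and a model can be illustrated with a minimal sketch. `JobRecord` below is a hypothetical stand-in written for this example, not a class from dwh_auditor; it only shows that a malformed value passes through a dict unnoticed but is rejected when a Pydantic model is constructed.

```python
from pydantic import BaseModel, ValidationError


class JobRecord(BaseModel):
    """Hypothetical stand-in model for illustration; not part of dwh_auditor."""

    job_id: str
    total_bytes_billed: int = 0


# A plain dict accepts a malformed value without complaint.
raw = {"job_id": "job_1", "total_bytes_billed": "not-a-number"}

# The model rejects the same payload at construction time.
try:
    JobRecord(**raw)
    validation_failed = False
except ValidationError as exc:
    validation_failed = True
    # Each error entry names the offending field in its "loc" tuple.
    error_fields = [err["loc"][0] for err in exc.errors()]

# A well-formed payload is accepted and exposes typed attributes.
record = JobRecord(job_id="job_1", total_bytes_billed=1024)
```

Because the fields are declared with annotations, a static type checker can also flag misuse of `record.total_bytes_billed` before the code runs.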
Query job model (models.job)
BigQuery query job data model.
- class dwh_auditor.models.job.QueryJob(*, job_id, user_email, query, creation_time, total_bytes_billed=0, cache_hit=False, referenced_tables=<factory>, statement_type='SELECT')
Bases: BaseModel
A model that represents one query job from the BigQuery job history.
It is the data contract for rows that the Extractor layer retrieves from BigQuery’s INFORMATION_SCHEMA.JOBS and passes to the Analyzer layer.
- Parameters:
- model_config = {}
Configuration for the model. It should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).
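As a rough illustration of how such a contract behaves, the sketch below defines `QueryJobSketch`, a stand-in that mirrors the keyword names and defaults of the `QueryJob` signature. The field types are assumptions (the signature shows only names and defaults), and the row values are invented.

```python
from datetime import datetime

from pydantic import BaseModel, Field


class QueryJobSketch(BaseModel):
    """Stand-in mirroring the QueryJob signature; all field types are assumed."""

    job_id: str
    user_email: str
    query: str
    creation_time: datetime
    total_bytes_billed: int = 0
    cache_hit: bool = False
    # <factory> in the signature suggests a default_factory; list[str] is a guess.
    referenced_tables: list[str] = Field(default_factory=list)
    statement_type: str = "SELECT"


# A row shaped like one the Extractor might return (hypothetical values).
row = {
    "job_id": "job_abc123",
    "user_email": "analyst@example.com",
    "query": "SELECT id FROM sales.orders",
    "creation_time": "2024-01-15T09:30:00+00:00",
}

job = QueryJobSketch(**row)
other = QueryJobSketch(**row)
```

Omitted fields fall back to their defaults, the ISO 8601 string is parsed into a `datetime`, and each instance receives its own `referenced_tables` list rather than a shared one.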
Table storage model (models.table)
BigQuery table storage data model.
- class dwh_auditor.models.table.TableStorage(*, project_id, dataset_id, table_id, total_logical_bytes=0, total_physical_bytes=0, active_logical_bytes=0)
Bases: BaseModel
A model that represents storage information for a single BigQuery table.
It is the data contract for rows that the Extractor layer retrieves from BigQuery’s INFORMATION_SCHEMA.TABLE_STORAGE and passes to the Analyzer layer (zombie table detection).
- Parameters:
- model_config = {}
Configuration for the model. It should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).
Analysis result model (models.result)
Data models for analysis results.
Type definitions for the diagnostic results that the Analyzer layer passes to the Reporter layer.
- class dwh_auditor.models.result.CostInsight(*, job, estimated_cost_usd, scanned_tb)
Bases: BaseModel
Analysis results for high-cost queries.
- model_config = {}
Configuration for the model. It should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).
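The relationship between billed bytes, `scanned_tb`, and `estimated_cost_usd` could be sketched as follows. This is an assumption about how such values might be derived, not the library's actual formula: the helper names are hypothetical, the price per TiB is a placeholder that should be checked against current BigQuery pricing, and whether the package uses binary (TiB) or decimal (TB) units is not stated in the source.

```python
# Assumption: binary terabytes (TiB). The package may use 10**12 instead.
BYTES_PER_TIB = 1024 ** 4

# Placeholder on-demand rate in USD per TiB; verify against current pricing.
PRICE_PER_TIB_USD = 6.25


def scanned_tb(total_bytes_billed: int) -> float:
    """Convert billed bytes to terabytes (TiB under the assumption above)."""
    return total_bytes_billed / BYTES_PER_TIB


def estimated_cost_usd(
    total_bytes_billed: int, price_per_tib_usd: float = PRICE_PER_TIB_USD
) -> float:
    """Estimate the cost of a query from its billed bytes."""
    return scanned_tb(total_bytes_billed) * price_per_tib_usd


two_tib = 2 * BYTES_PER_TIB
cost = estimated_cost_usd(two_tib)
```

Keeping the rate as a parameter lets a caller substitute a regional or contract price without touching the conversion logic.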
- class dwh_auditor.models.result.FullScanInsight(*, job, scanned_gb)
Bases: BaseModel
Analysis results for queries determined to be full scans.
- model_config = {}
Configuration for the model. It should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).
- class dwh_auditor.models.result.TableUsageProfile(*, table, is_zombie, last_accessed_at=None, top_users=<factory>, access_count_30d=0, size_gb)
Bases: BaseModel
Table usage profile and zombie determination result.
- Parameters:
- table: TableStorage
- model_config = {}
Configuration for the model. It should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).
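The source does not state how `is_zombie` is decided. One plausible rule, sketched here purely as an assumption, treats a table as a zombie when it has never been accessed or has not been accessed within a window; the 30-day default is borrowed from the `access_count_30d` field name, and the function itself is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_zombie(
    last_accessed_at: Optional[datetime],
    now: datetime,
    window_days: int = 30,
) -> bool:
    """Assumed rule: never accessed, or last access older than the window."""
    if last_accessed_at is None:
        return True
    return now - last_accessed_at > timedelta(days=window_days)


# A fixed reference time keeps the example deterministic.
now = datetime(2024, 3, 1, tzinfo=timezone.utc)

never_accessed = is_zombie(None, now)
recently_accessed = is_zombie(now - timedelta(days=5), now)
long_unused = is_zombie(now - timedelta(days=45), now)
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, makes the rule easy to test.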
- class dwh_auditor.models.result.RecurringCostInsight(*, query_hash, query_sample, execution_count, total_estimated_usd, total_scanned_tb, last_executed_at)
Bases: BaseModel
Analysis results for high-cost queries that are executed on a regular basis, for example by batch jobs or BI tools.
- Parameters:
- last_executed_at: datetime
- model_config = {}
Configuration for the model. It should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).
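The fields `query_hash`, `query_sample`, `execution_count`, and `total_scanned_tb` suggest that jobs are grouped by a hash of the query text. The sketch below shows one way such grouping could work; the normalization (trim, lowercase, collapse whitespace) and the truncated SHA-256 digest are assumptions, not the package's documented behavior, and the job data is invented.

```python
import hashlib
import re
from collections import defaultdict


def query_hash(sql: str) -> str:
    """Hash a crudely normalized query (note: lowercasing also alters string literals)."""
    normalized = re.sub(r"\s+", " ", sql.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


# Hypothetical (query text, scanned TB) pairs standing in for query jobs.
jobs = [
    ("SELECT * FROM sales", 1.0),
    ("select *   from sales", 2.0),
    ("SELECT id FROM users", 0.5),
]

groups = defaultdict(
    lambda: {"execution_count": 0, "total_scanned_tb": 0.0, "query_sample": ""}
)
for sql, tb in jobs:
    group = groups[query_hash(sql)]
    group["execution_count"] += 1
    group["total_scanned_tb"] += tb
    # Keep the first query text seen as the representative sample.
    group["query_sample"] = group["query_sample"] or sql

sales_group = groups[query_hash("SELECT * FROM sales")]
```

The two `sales` queries differ only in case and spacing, so they collapse into a single group with an execution count of two.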
- class dwh_auditor.models.result.AuditResult(*, analyzed_days, project_id, total_jobs_analyzed, total_tables_analyzed, top_expensive_queries=<factory>, recurring_expensive_queries=<factory>, full_scans=<factory>, table_profiles=<factory>)
Bases: BaseModel
Comprehensive audit report that the Analyzer layer produces as its final output.
- Parameters:
analyzed_days (int)
project_id (str)
total_jobs_analyzed (int)
total_tables_analyzed (int)
top_expensive_queries (list[CostInsight])
recurring_expensive_queries (list[RecurringCostInsight])
full_scans (list[FullScanInsight])
table_profiles (list[TableUsageProfile])
- top_expensive_queries: list[CostInsight]
- recurring_expensive_queries: list[RecurringCostInsight]
- full_scans: list[FullScanInsight]
- table_profiles: list[TableUsageProfile]
- model_config = {}
Configuration for the model. It should be a dictionary conforming to ConfigDict (pydantic.config.ConfigDict).
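A nested result of this shape can be handed to a Reporter as JSON and restored without loss. The following is a reduced sketch assuming Pydantic v2: `AuditResultSketch` and `FullScanSketch` are hypothetical stand-ins that keep only a few of the fields listed above (the real `FullScanInsight` also carries a `job`), and the values are invented.

```python
from pydantic import BaseModel, Field


class FullScanSketch(BaseModel):
    """Reduced stand-in for FullScanInsight; the real model also has a job field."""

    scanned_gb: float


class AuditResultSketch(BaseModel):
    """Reduced stand-in for AuditResult with a single nested list."""

    analyzed_days: int
    project_id: str
    total_jobs_analyzed: int
    total_tables_analyzed: int
    full_scans: list[FullScanSketch] = Field(default_factory=list)


result = AuditResultSketch(
    analyzed_days=30,
    project_id="my-project",
    total_jobs_analyzed=2,
    total_tables_analyzed=1,
    # Nested dicts are validated and converted into FullScanSketch instances.
    full_scans=[{"scanned_gb": 12.5}],
)

# Serialize for a reporter, then restore and validate on the other side.
payload = result.model_dump_json()
restored = AuditResultSketch.model_validate_json(payload)
```

Lists that are omitted default to empty, so a Reporter can iterate over every section without checking for missing keys.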