dwh_auditor.extractor — BigQuery metadata extraction layer
The dwh_auditor.extractor package is the only layer that reads metadata from BigQuery's INFORMATION_SCHEMA and converts it into Pydantic models.
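As a rough illustration of the row-to-model conversion described above, here is a minimal sketch. The JobRecord model and the rows_to_jobs helper are hypothetical names chosen for this example; only the job_id and user_email fields appear in this page, and the real models may look different.

```python
from pydantic import BaseModel


class JobRecord(BaseModel):
    """Hypothetical model; only job_id and user_email are taken from this page."""

    job_id: str
    user_email: str


def rows_to_jobs(rows):
    """Validate mapping-like rows and return them as typed JobRecord objects."""
    # dict(row) copies each mapping-like row into a plain dict before validation.
    return [JobRecord(**dict(row)) for row in rows]


# Plain dicts stand in for query result rows here.
jobs = rows_to_jobs([{"job_id": "j1", "user_email": "u@e.com"}])
print(jobs[0].job_id)
```

Because validation happens at this boundary, a row with a missing or mistyped field fails here rather than deeper in the other layers.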
Warning
Only bigquery.py in this package may import the google.cloud.bigquery library directly. Never import it from the Analyzer, Reporter, or CLI layers. Because of this restriction, BigQueryExtractor is the only thing you need to mock in tests; all other layers can be tested without mocks.
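One way to keep this rule from eroding over time is to check it automatically. The sketch below is an assumption about how such a check could look, not part of dwh_auditor: find_forbidden_imports is a hypothetical helper that uses the standard-library ast module to list any google.cloud.bigquery imports in a piece of Python source.

```python
import ast

FORBIDDEN = "google.cloud.bigquery"


def find_forbidden_imports(source: str) -> list[str]:
    """Return the sorted, de-duplicated BigQuery imports found in Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import google.cloud.bigquery [as bq]
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # from google.cloud import bigquery  -> "google.cloud.bigquery"
            # from google.cloud.bigquery import Client -> module and member names
            names = [node.module]
            names += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if name == FORBIDDEN or name.startswith(FORBIDDEN + "."):
                found.add(name)
    return sorted(found)
```

A test could read each file under analyzer/, reporter/, and main.py, pass its text to this helper, and assert that the result is empty.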
Test method:
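# Assumed import path: BigQueryExtractor is taken to live in bigquery.py.
from dwh_auditor.extractor.bigquery import BigQueryExtractor


def test_get_job_history(mocker):  # `mocker` is the pytest-mock fixture
    # Only mock google.cloud.bigquery.Client
    mock_client = mocker.patch("dwh_auditor.extractor.bigquery.bq.Client")
    # client.query(...).result() returns the rows below
    mock_client.return_value.query.return_value.result.return_value = [
        {"job_id": "j1", "user_email": "u@e.com"},  # ... (remaining fields elided)
    ]
    extractor = BigQueryExtractor(project_id="my-project", region="region-us")
    jobs = extractor.get_job_history(days=30)
    assert len(jobs) == 1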
BigQuery metadata extraction layer.
Warning: Only this module may import google.cloud.bigquery. Other modules (analyzer/, reporter/, main.py) must not import it directly.