Enricher Agent¶
enricher ¶
Enrichment agent logic for processing URLs and media.
This module implements the enrichment workflow using Pydantic-AI agents. It provides:
- EnrichmentWorker (orchestrates URL and media enrichment)
- Async orchestration via enrich_table
EnrichmentOutput ¶
Bases: BaseModel
Structured output for enrichment agents.
EnrichmentRuntimeContext dataclass ¶
EnrichmentRuntimeContext(
cache: EnrichmentCache,
output_sink: Any,
site_root: Path | None = None,
duckdb_connection: Backend | None = None,
target_table: str | None = None,
usage_tracker: UsageTracker | None = None,
pii_prevention: dict[str, Any] | None = None,
task_store: Any | None = None,
)
Runtime context for enrichment execution.
MediaEnrichmentConfig dataclass ¶
Config for media enrichment enqueueing.
EnrichmentWorker ¶
Bases: BaseWorker
Worker for media enrichment (e.g. image description).
Source code in src/egregora/agents/enricher.py
enrichment_config property ¶
Get effective enrichment configuration.
close ¶
Explicitly close the ZIP handle to release resources.
Should be called when done with the worker. Also called by __exit__ for context manager support.
__enter__ ¶
__exit__ ¶
__exit__(
_exc_type: type[BaseException] | None,
_exc_val: BaseException | None,
_exc_tb: TracebackType | None,
) -> None
Context manager exit - ensures ZIP handle is closed.
run ¶
Process pending enrichment tasks in batches.
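As a rough illustration of batch processing, the sketch below splits a list of pending tasks into fixed-size batches and marks each task as completed. The task dictionaries, the batch size, the "status" values, and the helper names (iter_batches, run_sketch) are all assumptions made for this example; the page does not show the real task store API or what run returns.

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def iter_batches(tasks: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield consecutive lists of at most batch_size tasks."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(tasks)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def run_sketch(pending: Iterable[dict], batch_size: int = 2) -> int:
    """Process every pending task batch by batch; return the number processed."""
    processed = 0
    for batch in iter_batches(pending, batch_size):
        for task in batch:
            # Placeholder for the real enrichment call.
            task["status"] = "completed"
            processed += 1
    return processed


tasks = [{"id": i, "status": "pending"} for i in range(5)]
batch_sizes = [len(b) for b in iter_batches(tasks, 2)]
count = run_sketch(tasks, batch_size=2)
```

Batching keeps the amount of in-flight work bounded, which matters when each task triggers a model or network call.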
load_file_as_binary_content ¶
Load a file as BinaryContent for pydantic-ai agents.
fetch_url_with_jina async ¶
Fetch URL content using Jina.ai Reader.
Use this tool ONLY if the standard 'WebFetchTool' fails to retrieve meaningful content. Examples of when to use this:
- The standard fetch returns "JavaScript is required" or "Access Denied" (403/429).
- The content is empty or contains only cookie/GDPR banners.
- The page is a Single Page Application (SPA) that didn't render.
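The fallback conditions above can be approximated with a simple check on the first fetch result. The sketch below is an assumption-laden illustration: needs_jina_fallback and jina_reader_url are hypothetical helpers, not functions from this module, and the check covers only the status codes, the two quoted messages, and empty content (it does not try to detect cookie banners or unrendered SPAs). It relies on the Jina Reader convention of prefixing the target URL with https://r.jina.ai/.

```python
from urllib.parse import urlsplit

JINA_READER_PREFIX = "https://r.jina.ai/"

# Signals taken from the docstring's examples.
BLOCKED_STATUSES = {403, 429}
BLOCKED_MARKERS = ("javascript is required", "access denied")


def needs_jina_fallback(status_code: int, body: str) -> bool:
    """Return True when the standard fetch result looks unusable."""
    if status_code in BLOCKED_STATUSES:
        return True
    text = body.strip().lower()
    if not text:
        return True  # empty content
    return any(marker in text for marker in BLOCKED_MARKERS)


def jina_reader_url(url: str) -> str:
    """Build the Reader URL by prefixing the original absolute URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    return JINA_READER_PREFIX + url


reader_url = jina_reader_url("https://example.com/article")
```

Keeping the decision separate from the request makes the "only on failure" rule easy to test without network access.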
schedule_enrichment ¶
schedule_enrichment(
messages_table: Table,
media_mapping: MediaMapping,
enrichment_settings: EnrichmentSettings,
context: EnrichmentRuntimeContext,
run_id: UUID | None = None,
) -> None
Schedule enrichment tasks for background processing.