Configuration¶
MODERN (Phase 2-4): Egregora configuration lives in `.egregora/config.yml`, separate from the rendering configuration (MkDocs).
Configuration sources (priority order):

1. CLI arguments - Highest priority (one-time overrides)
2. `.egregora/config.yml` - Main configuration file
3. Defaults - Defined in the Pydantic `EgregoraConfig` model
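The priority order above amounts to a layered lookup. The following is a minimal sketch, assuming each source is a plain dictionary; the key names and the `resolve_config` helper are hypothetical and are not Egregora's actual loader, which is built on the Pydantic model.

```python
from collections import ChainMap

# Hypothetical subset of defaults, taken from the option tables on this page.
DEFAULTS = {
    "model": "models/gemini-flash-latest",
    "step_size": 1,
    "step_unit": "days",
    "retrieval_mode": "ann",
}


def resolve_config(cli_args: dict, file_config: dict) -> dict:
    """Merge settings so that CLI > config file > defaults."""
    # Drop CLI options the user did not pass (None) so they do not mask lower layers.
    explicit_cli = {k: v for k, v in cli_args.items() if v is not None}
    # ChainMap looks keys up left to right, so the first mapping that has a key wins.
    return dict(ChainMap(explicit_cli, file_config, DEFAULTS))


file_config = {"step_size": 7, "retrieval_mode": "exact"}  # as if read from config.yml
cli_args = {"step_size": 3, "model": None}                 # as if parsed from the CLI

resolved = resolve_config(cli_args, file_config)
print(resolved["step_size"], resolved["retrieval_mode"], resolved["model"])
```

Here `step_size` comes from the CLI, `retrieval_mode` from the file, and `model` falls through to the default because the CLI value was not set.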
CLI Configuration¶
The `egregora process` command accepts many options, grouped in the tables below.
Core Options¶
| Option | Description | Default |
|---|---|---|
| `--output` | Output directory for blog | `.` |
| `--timezone` | Timezone for message timestamps | System timezone |
| `--step-size` | Size of each processing window | 1 |
| `--step-unit` | Unit: messages, hours, days | days |
| `--min-window-size` | Minimum messages per window | 10 |
| `--from-date` | Start date (YYYY-MM-DD) | First message |
| `--to-date` | End date (YYYY-MM-DD) | Last message |
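To illustrate how `--step-size`, `--step-unit`, and `--min-window-size` interact, here is a rough sketch for the `days` unit. The `window_by_days` helper is hypothetical, and dropping undersized windows is an assumption made for this sketch; Egregora may handle them differently (for example, by merging them).

```python
from collections import defaultdict
from datetime import datetime, timedelta


def window_by_days(timestamps, step_size=1, min_window_size=10):
    """Group timestamps into consecutive windows of `step_size` days.

    Windows with fewer than `min_window_size` messages are dropped
    (an assumption made for this sketch).
    """
    if not timestamps:
        return []
    ordered = sorted(timestamps)
    # Windows are aligned to midnight of the first message's day.
    start = ordered[0].replace(hour=0, minute=0, second=0, microsecond=0)
    span = timedelta(days=step_size)
    buckets = defaultdict(list)
    for ts in ordered:
        index = (ts - start) // span  # timedelta // timedelta gives an int
        buckets[index].append(ts)
    return [msgs for _, msgs in sorted(buckets.items()) if len(msgs) >= min_window_size]


base = datetime(2025, 1, 1, 9, 0)
# 12 messages on day one, 3 on day two, 11 on day three.
stamps = (
    [base + timedelta(minutes=i) for i in range(12)]
    + [base + timedelta(days=1, minutes=i) for i in range(3)]
    + [base + timedelta(days=2, minutes=i) for i in range(11)]
)
windows = window_by_days(stamps, step_size=1, min_window_size=10)
print([len(w) for w in windows])  # → [12, 11]
```

With the defaults shown in the table, the three-message day falls below the minimum and produces no window.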
Model Configuration¶
| Option | Description | Default |
|---|---|---|
| `--model` | Gemini model for writing | `models/gemini-flash-latest` |
| `--enricher-model` | Model for URL/media enrichment | `models/gemini-flash-latest` |
| `--embedding-model` | Model for embeddings | `models/text-embedding-004` |
RAG Configuration¶
| Option | Description | Default |
|---|---|---|
| `--retrieval-mode` | `ann` (approximate) or `exact` | `ann` |
| `--retrieval-nprobe` | ANN search quality (1-100) | 10 |
| `--embedding-dimensions` | Embedding dimensions | 768 |
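The trade-off between `exact` and `ann` retrieval can be seen in a toy example. This sketch assumes `--retrieval-nprobe` has the usual inverted-file meaning (the number of clusters scanned per query); all function names are hypothetical, and Egregora's real index lives in its vector store, not in code like this.

```python
import math
import random


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def exact_search(query, vectors, k):
    """Score every vector: always correct, but cost grows with corpus size."""
    ranked = sorted(range(len(vectors)), key=lambda i: cosine(query, vectors[i]), reverse=True)
    return ranked[:k]


def build_index(vectors, centroids):
    """Assign each vector to its most similar centroid (one list per cluster)."""
    clusters = [[] for _ in centroids]
    for i, vec in enumerate(vectors):
        best = max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))
        clusters[best].append(i)
    return clusters


def ann_search(query, vectors, centroids, clusters, k, nprobe):
    """Score only vectors in the `nprobe` clusters closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: cosine(query, centroids[c]), reverse=True)
    candidates = [i for c in order[:nprobe] for i in clusters[c]]
    candidates.sort(key=lambda i: cosine(query, vectors[i]), reverse=True)
    return candidates[:k]


rng = random.Random(0)
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
centroids = rng.sample(vectors, 10)  # toy centroids; real indexes train them
clusters = build_index(vectors, centroids)
query = [rng.gauss(0, 1) for _ in range(8)]

exact = exact_search(query, vectors, k=5)
approx = ann_search(query, vectors, centroids, clusters, k=5, nprobe=3)
full = ann_search(query, vectors, centroids, clusters, k=5, nprobe=len(centroids))
print(len(set(exact) & set(approx)), "of 5 exact hits found with nprobe=3")
```

Raising `nprobe` scans more clusters, so results converge on the exact ranking at higher cost; scanning every cluster is equivalent to `exact`.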
Privacy Options¶
| Option | Description | Default |
|---|---|---|
| `--anonymize/--no-anonymize` | Enable/disable name anonymization | True |
| `--detect-pii/--no-detect-pii` | Enable/disable PII detection | True |
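As a rough illustration of what name anonymization involves, the sketch below replaces author names with stable pseudonyms. It assumes a deterministic UUID5-based mapping; the namespace, the `pseudonym` and `anonymize` helpers, and the 8-character length are all hypothetical and may not match Egregora's scheme. PII detection is not covered here.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it stays constant.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def pseudonym(author: str) -> str:
    """Map an author name to a stable short identifier."""
    # Normalize so "Alice" and " alice " map to the same pseudonym.
    normalized = author.strip().lower()
    return uuid.uuid5(NAMESPACE, normalized).hex[:8]


def anonymize(messages):
    """Replace the author in each (author, text) pair with its pseudonym."""
    return [(pseudonym(author), text) for author, text in messages]


chat = [("Alice", "Did you read the paper?"), ("Bob", "Yes!"), ("alice", "Thoughts?")]
for author_id, text in anonymize(chat):
    print(author_id, text)
```

Because the mapping is deterministic, the same person keeps the same identifier across runs, which keeps threads readable without exposing names.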
Feature Flags¶
| Option | Description | Default |
|---|---|---|
| `--enrich/--no-enrich` | Enable URL/media enrichment | False |
| `--profile/--no-profile` | Generate author profiles | False |
Environment Variables¶
MODERN: Only credentials live in environment variables (keep them out of git).
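A minimal sketch of reading a credential from the environment and failing early when it is missing. The variable name `GOOGLE_API_KEY` and the `load_api_key` helper are assumptions for illustration; check which variable your installation expects.

```python
import os


def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Read a credential from the environment and fail fast if it is missing."""
    # The default variable name is an assumption, not confirmed by this page.
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} in your environment before running.")
    return key
```

Failing at startup is usually friendlier than a rejected API call halfway through a long run.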
.egregora/config.yml¶
MODERN (Phase 2-4): Main configuration file (maps to Pydantic EgregoraConfig model).
Generated automatically by `egregora init`, or by `egregora process` on first run.
Location: .egregora/config.yml in site root (next to mkdocs.yml)
Advanced Configuration¶
Custom Prompt Templates¶
MODERN (Phase 2-4): Override prompts by placing custom Jinja2 templates in .egregora/prompts/.
Directory structure:
Priority: Custom prompts (.egregora/prompts/) override package defaults (src/egregora/prompts/).
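The override rule can be pictured as a two-step file lookup. This is a sketch under assumptions: `resolve_prompt` and the template name `writer.jinja` are hypothetical, and the real agents may resolve templates through a Jinja2 loader instead.

```python
import tempfile
from pathlib import Path


def resolve_prompt(name: str, site_root: Path, package_prompts: Path) -> Path:
    """Return the custom template if present, else the packaged default."""
    custom = site_root / ".egregora" / "prompts" / name
    if custom.is_file():
        return custom
    default = package_prompts / name
    if default.is_file():
        return default
    raise FileNotFoundError(f"No template named {name!r} in custom or default prompts")


# Demo in a throwaway directory; "writer.jinja" is a hypothetical template name.
with tempfile.TemporaryDirectory() as tmp:
    site_root = Path(tmp) / "site"
    package_prompts = Path(tmp) / "package_prompts"
    package_prompts.mkdir(parents=True)
    (package_prompts / "writer.jinja").write_text("default prompt")

    before = resolve_prompt("writer.jinja", site_root, package_prompts).read_text()

    custom_dir = site_root / ".egregora" / "prompts"
    custom_dir.mkdir(parents=True)
    (custom_dir / "writer.jinja").write_text("custom prompt")

    after = resolve_prompt("writer.jinja", site_root, package_prompts).read_text()

print(before, "->", after)  # → default prompt -> custom prompt
```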
Example: Override writer prompt
Agents automatically detect and use custom prompts, and log a message when they do.
Database Configuration¶
Egregora stores persistent data in DuckDB:
- Location: `.egregora/egregora.db` (by default)
- Tables: `rag_chunks`, `annotations`, `elo_ratings`
To use a different database:
Cache Configuration¶
Egregora caches LLM responses to reduce API costs:
- Location: `.egregora/cache/` (by default)
- Type: Disk-based LRU cache using `diskcache`
To clear the cache:
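One way to do this from Python is to delete the cache directory. This sketch assumes the default location and that no Egregora process is running with the cache open; `clear_cache` is a hypothetical helper, not an Egregora command.

```python
import shutil
import tempfile
from pathlib import Path


def clear_cache(site_root: Path) -> bool:
    """Delete the cache directory; return True if something was removed."""
    cache_dir = site_root / ".egregora" / "cache"
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True


# Demo against a throwaway site root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".egregora" / "cache").mkdir(parents=True)
    (root / ".egregora" / "cache" / "entry.bin").write_bytes(b"cached")
    first = clear_cache(root)   # directory existed and was removed
    second = clear_cache(root)  # nothing left to remove

print(first, second)  # → True False
```

After clearing, the next run has to pay for every LLM call again, so clear the cache only when stale responses are the problem.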
Model Selection¶
Writer Models¶
For blog post generation:
- `gemini-flash-latest`: Fast, creative, excellent for blog posts (recommended)
Enricher Models¶
For URL/media descriptions:
- `gemini-flash-latest`: Fast, cost-effective (recommended)
Embedding Models¶
For RAG retrieval:
- `text-embedding-004`: Latest, 768 dimensions (recommended)
- `text-embedding-003`: Older, 768 dimensions
Performance Tuning¶
Batch Sizes¶
Adjust batch sizes in src/egregora/utils/batch.py or through configuration:
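For readers unfamiliar with what a batch size controls, here is a generic batching helper. It is an illustrative sketch only and does not reproduce the contents of `src/egregora/utils/batch.py`; the `batched` name is hypothetical.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield lists of at most `batch_size` items, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    # islice pulls the next chunk; an empty list means the input is exhausted.
    while batch := list(islice(iterator, batch_size)):
        yield batch


print(list(batched(range(7), 3)))  # → [[0, 1, 2], [3, 4, 5], [6]]
```

Larger batches mean fewer API round trips but bigger requests; smaller batches make individual failures cheaper to retry.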
Rate Limiting¶
Egregora automatically handles rate limits with exponential backoff. To customize:
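The general pattern behind exponential backoff looks like the sketch below. The names `RateLimitError` and `call_with_backoff`, and all parameter defaults, are hypothetical stand-ins; they are not Egregora's actual retry settings.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for whatever error the API client raises on HTTP 429."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call `fn`, retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, delay))


# Demo: a call that is rate limited twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("slow down")
    return "ok"


delays = []
result = call_with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, calls["n"], len(delays))  # → ok 3 2
```

The delay ceiling doubles on each attempt (1 s, 2 s, 4 s, ...) up to `max_delay`, which keeps retries from hammering an already saturated API.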
Examples¶
High-Quality Blog¶
Fast, Cost-Effective¶
Privacy-Focused¶
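A privacy-focused invocation might combine the privacy and feature flags documented above. The sketch below only assembles and prints the command line; it uses flags from the option tables on this page, omits the input argument (not specified here), and uses a hypothetical output directory.

```python
import shlex

# Every flag below comes from the option tables on this page.
command = [
    "egregora", "process",
    "--anonymize",           # keep name anonymization on (the default)
    "--detect-pii",          # keep PII detection on (the default)
    "--no-enrich",           # skip URL/media enrichment
    "--no-profile",          # skip author profiles
    "--output", "./blog",    # hypothetical output directory
]
print(shlex.join(command))
# → egregora process --anonymize --detect-pii --no-enrich --no-profile --output ./blog
```

These values match the documented defaults, so spelling them out mainly guards against a `config.yml` that changes them.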
Next Steps¶
- Architecture Overview - Understand the pipeline
- Privacy Model - Learn about anonymization
- API Reference - Dive into the code