URL Convention System¶
Overview¶
The URL Convention is the single source of truth for canonical URLs in the generated site. It defines how documents are addressed and referenced throughout the system, independent of how they are physically stored.
Key Distinction: - URL Convention: Generates canonical URLs for the site (e.g., /profiles/abc123) - Adapter Implementation: Decides how to persist/serve those URLs (filesystem, database, API, etc.)
For Static Site Generators (SSG) like MkDocs, URLs map to filesystem paths. But other adapters (e.g., database + FastAPI) could serve the same URLs without any files.
Architecture¶
Core Components¶
UrlConvention(Protocol/Interface)- Defines the contract for canonical URL generation
-
Location:
src/egregora/data_primitives/protocols.py -
StandardUrlConvention(Implementation) - Concrete implementation of URL generation rules
- Generates URLs like
/profiles/{uuid},/posts/{slug}, etc. -
Location:
src/egregora/output_adapters/conventions.py -
Output Adapter (Implementation-Specific)
- Uses URL convention to get canonical URLs
- Decides how to persist/serve documents at those URLs
- MkDocs: Translates URLs to filesystem paths
- Database: Could insert into DB with URL as identifier
- API: Could register URL routes
Document Persistence Flow¶
URL Convention Rules by Document Type¶
These are the canonical URLs used across the site, regardless of adapter implementation.
PROFILE Documents¶
Metadata Required: uuid (author's UUID)
Canonical URL: /profiles/{uuid}
Example:
| Python | |
|---|---|
Generated URL: /profiles/d65bbd29-dccb-55bb-a839-bed92ffe262b
Adapter-Specific Storage: - MkDocs: docs/profiles/d65bbd29-dccb-55bb-a839-bed92ffe262b.md - Database: profiles table with url column - API: GET /profiles/d65bbd29-dccb-55bb-a839-bed92ffe262b
Key Rules: - Uses full UUID from metadata (not truncated) - No date prefix in URL - URL is stable across regenerations
POST Documents¶
Metadata Required: slug, date
Canonical URL: /posts/{slug}
Example:
| Python | |
|---|---|
Generated URL: /posts/my-blog-post
Adapter-Specific Storage: - MkDocs: docs/posts/2025-03-15-my-blog-post.md (date in filename) - Database: posts table with url, date, slug columns - API: GET /posts/my-blog-post (date in metadata)
Key Rules: - URL uses slug only (clean URLs) - Slug is normalized (lowercase, hyphens) - Date stored in metadata, not URL (for Jekyll/MkDocs compat)
JOURNAL Documents¶
Metadata Required: window_label or slug
Canonical URL: /journal/{safe_label}
Example:
| Python | |
|---|---|
Generated URL: /journal/2025-03-15-08-00-12-00
Key Rules: - Label is slugified (special characters converted to hyphens) - Represents time windows or agent memory snapshots
MEDIA Documents¶
Metadata Required: filename, media_hash
Canonical URL: /media/{type}/{identifier}.{ext}
Example:
| Python | |
|---|---|
Generated URL: /media/images/abc123.jpg
Key Rules: - Hash-based naming prevents collisions - Preserves file extension - Organized by media type (images/, videos/, etc.)
ENRICHMENT_MEDIA Documents¶
Metadata Required: parent_id, parent_slug
Canonical URL: /media/{type}/{parent_slug}
Example:
| Python | |
|---|---|
Generated URL: /media/images/vacation-photo (matches /media/images/vacation-photo.jpg)
Key Rules: - Enrichment documents describe media files - URL pattern aligns with paired media file
ENRICHMENT_URL Documents¶
Canonical URL: Variable (uses suggested_path if available)
Example:
| Python | |
|---|---|
Generated URL: /media/urls/article-abc123/
Key Rules: - Often uses suggested_path for flexibility - Falls back to hash-based naming if no suggested_path
Why This Matters¶
1. Single Source of Truth for URLs¶
The URL convention is the only authority for canonical URLs: - How documents are addressed in the site - How documents are referenced across templates - How links are generated
What it does NOT dictate: - Physical storage location (adapter-specific) - Storage format (files, DB, API, etc.) - Implementation details
2. Adapter Independence¶
The same URL can be served by different adapters:
3. Consistency Across References¶
All parts of the system use the same canonical URL: - Writer agent: [Profile](/profiles/abc123) - Template rendering: <a href="/profiles/abc123"> - Navigation menus: {url: "/profiles/abc123"} - Internal cross-references: /profiles/abc123
This ensures no broken links regardless of adapter.
4. Prevents Duplicates¶
The duplicate profile files issue occurred because: - ❌ Legacy code bypassed URL convention and manually created files - ✅ Modern code uses URL convention → adapter persistence
With a single URL source, duplicates are impossible.
Implementation: URL Generation¶
conventions.py¶
Implementation: MkDocs Adapter (Filesystem)¶
adapter.py¶
This is MkDocs-specific - other adapters would implement differently.
Anti-Patterns to Avoid¶
❌ Don't Bypass persist()¶
| Python | |
|---|---|
Why it's wrong: - Bypasses URL convention (no canonical URL generated) - May use wrong path format - Breaks link consistency across site - Not tracked in adapter's index
❌ Don't Assume Storage Format¶
| Python | |
|---|---|
Why it's wrong: - Assumes SSG adapter (might be database) - Path format is adapter-specific (MkDocs adds date prefix) - Fragile to adapter changes
❌ Don't Manually Construct URLs¶
| Python | |
|---|---|
Why it's wrong: - May not match canonical URL (short vs full UUID) - Leads to broken links - Inconsistent with URL convention
✅ Always Use URL Convention¶
Testing URL Conventions¶
Unit Tests Should Verify¶
- Canonical URL generation for each document type
- URL stability (same metadata → same URL)
- URL format matches expected patterns
- Metadata extraction uses correct fields
- Edge cases (missing metadata, special characters, etc.)
Do NOT test: Adapter-specific storage paths (those are adapter tests)
Example Test¶
Summary¶
Remember: 1. URL convention generates canonical URLs for the site 2. Adapters decide how to persist/serve those URLs 3. Same URL can be served by files (SSG), database, API, etc. 4. All documents must flow through persist() to get canonical URLs 5. Never bypass URL convention with manual storage code 6. Document metadata drives URL generation 7. Each document type has specific URL patterns
When designing: - Think: "What URL should this document have in the site?" - Don't think: "Where should I save this file?"
When in doubt: Check StandardUrlConvention.canonical_url() to see how your document type generates URLs.