Aevum Data Standards
Technical specification for structured knowledge ingestion, schema definition, metadata requirements, and interoperability guidelines for Aevum Encyclopedia contributors and integrators.
Overview
The Aevum Data Standards v2.0 define the canonical format for all encyclopedia entries, cross-references, and supplementary media. Built upon JSON-LD and semantic web principles, v2.0 introduces stricter typing, enhanced provenance tracking, and native support for multilingual knowledge graphs.
This specification is mandatory for all API submissions, bulk imports, and editorial pipeline integrations. Non-compliant data will be rejected during validation.
Core Principles
- Machine-Readable & Human-Readable: Every entry must maintain structural clarity while remaining editable by human contributors.
- Provenance-First: All claims require traceable source attribution with confidence scoring.
- Versioned & Immutable History: Data objects are append-only; updates create new versions rather than overwriting.
- Interoperable: Native mapping to schema.org, Wikidata, and DCMI standards ensures cross-platform compatibility.
Schema Structure
All entries must conform to the ae:EncyclopediaEntry context. Below is the minimal valid structure:
{
  "@context": "https://schema.aevum.org/v2/entry",
  "@type": "ae:EncyclopediaEntry",
  "ae:id": "ae:entry/quantum_computing/0042",
  "ae:title": {
    "en": "Quantum Computing",
    "es": "Computación Cuántica",
    "zh": "量子计算"
  },
  "ae:summary": {
    "en": "Quantum computing leverages quantum mechanical phenomena..."
  },
  "ae:body": [
    {
      "@type": "ae:Section",
      "ae:heading": "Fundamental Principles",
      "ae:content": "Superposition, entanglement, and interference..."
    }
  ],
  "ae:metadata": {
    "ae:version": "2.0.0",
    "ae:created": "2024-03-15T10:00:00Z",
    "ae:modified": "2025-01-20T14:30:00Z",
    "ae:contributors": ["ae:user/dr_rachel_kim", "ae:user/ai_verifier_v3"],
    "ae:provenance": [
      {
        "ae:source": "https://arxiv.org/abs/quant-ph/2024.01",
        "ae:confidence": 0.98,
        "ae:type": "peer_reviewed"
      }
    ]
  }
}
The @context URL must exactly match https://schema.aevum.org/v2/entry. Any other context value will fail schema validation.
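As a rough pre-flight check, the shape of the minimal example can be tested before a payload is sent anywhere. The sketch below is illustrative only: the function name `check_minimal_entry` is hypothetical, and treating every top-level key from the example as required is an assumption, since the published JSON Schema is the real authority.

```python
REQUIRED_CONTEXT = "https://schema.aevum.org/v2/entry"

# Top-level keys seen in the minimal example; treating all of them as
# required is an assumption, not something the spec states key by key.
MINIMAL_KEYS = (
    "@context", "@type", "ae:id", "ae:title",
    "ae:summary", "ae:body", "ae:metadata",
)


def check_minimal_entry(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = [f"missing key: {key}" for key in MINIMAL_KEYS if key not in entry]
    # The context must match exactly, so even a trailing slash is a failure.
    if entry.get("@context") != REQUIRED_CONTEXT:
        problems.append("@context does not exactly match the v2 entry context")
    if entry.get("@type") != "ae:EncyclopediaEntry":
        problems.append("@type is not ae:EncyclopediaEntry")
    return problems


sample_entry = {
    "@context": REQUIRED_CONTEXT,
    "@type": "ae:EncyclopediaEntry",
    "ae:id": "ae:entry/quantum_computing/0042",
    "ae:title": {"en": "Quantum Computing"},
    "ae:summary": {"en": "Quantum computing leverages quantum mechanical phenomena..."},
    "ae:body": [],
    "ae:metadata": {"ae:version": "2.0.0", "ae:provenance": []},
}
```

A check like this only catches gross shape errors early; it does not replace validation against the schema itself.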
Metadata Fields
Metadata governs indexing, search ranking, and editorial workflow. The following table outlines required and optional fields:
| Field | Type | Status | Description |
|---|---|---|---|
| ae:id | string (IRI) | Required | Unique identifier following ae:entry/{slug}/{hash} |
| ae:tags | array[string] | Optional | Classification tags for discovery. Max 12 per entry. |
| ae:geo | object | Optional | Geospatial coordinates for location-based topics |
| ae:multimedia | array[object] | Optional | Images, audio, video, or interactive datasets |
| ae:disclaimer | string | Optional | Editorial notes on disputed or evolving topics |
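Two of the constraints in the table, the ae:id shape and the 12-tag limit, are easy to check locally. The following is a minimal sketch under stated assumptions: the character sets allowed in `{slug}` and `{hash}` are not given by the table, so the regular expression is a guess, and the sketch assumes ae:tags sits at the top level of the entry next to ae:id. The name `check_metadata_fields` is hypothetical.

```python
import re

# Assumed character sets: the table only gives the shape ae:entry/{slug}/{hash}.
ENTRY_ID = re.compile(r"^ae:entry/[a-z0-9_]+/[A-Za-z0-9]+$")
MAX_TAGS = 12  # "Max 12 per entry" from the table


def check_metadata_fields(entry: dict) -> list[str]:
    """Check the ae:id shape and the ae:tags limit; return any problems found."""
    problems = []

    entry_id = entry.get("ae:id")
    if not isinstance(entry_id, str) or not ENTRY_ID.match(entry_id):
        problems.append("ae:id must look like ae:entry/{slug}/{hash}")

    # ae:tags is optional, so a missing field is treated as an empty list.
    tags = entry.get("ae:tags", [])
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        problems.append("ae:tags must be an array of strings")
    elif len(tags) > MAX_TAGS:
        problems.append(f"ae:tags has {len(tags)} items; the maximum is {MAX_TAGS}")

    return problems
```

For example, `check_metadata_fields({"ae:id": "ae:entry/quantum_computing/0042"})` reports no problems, while an entry carrying 13 tags reports one.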
Validation & Quality
Before ingestion, all payloads pass through the Aevum Validation Pipeline (AVP). Key checks include:
- Schema Compliance: Strict JSON Schema draft-2020-12 validation against ae:EncyclopediaEntry
- Provenance Verification: External URLs are pinged; dead links or non-indexable sources trigger warnings.
- Linguistic Consistency: Multilingual fields undergo tone and register analysis to prevent machine-translation artifacts.
- Conflict Detection: NLP cross-references each submission against the existing knowledge graph to flag contradictory claims.
Submissions with critical errors are rejected immediately. Warnings are logged but allow conditional acceptance pending editorial review.
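The accept/reject rule described above can be modelled as a small decision function. This is a sketch, not the AVP's actual implementation: the `Severity`, `Finding`, and `decide` names and the three outcome strings are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"


@dataclass(frozen=True)
class Finding:
    check: str          # e.g. "Schema Compliance" or "Provenance Verification"
    severity: Severity
    message: str


def decide(findings: list[Finding]) -> str:
    """Map a list of validation findings to an ingestion outcome."""
    # Any critical error rejects the submission immediately.
    if any(f.severity is Severity.CRITICAL for f in findings):
        return "rejected"
    # Warnings alone allow conditional acceptance pending editorial review.
    if findings:
        return "accepted_pending_review"
    return "accepted"
```

A dead source link, for instance, would appear as a `Finding` with `Severity.WARNING` and lead to `"accepted_pending_review"` rather than a rejection.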
Versioning Strategy
Aevum follows Semantic Versioning (SemVer) for data standards:
- MAJOR: Breaking changes (new required fields, deprecated contexts, structural overhauls)
- MINOR: Backwards-compatible additions (new optional fields, extended vocabularies)
- PATCH: Bug fixes, validation rule corrections, context URL updates
The API endpoint /v2/entries will remain stable for the lifetime of v2.x. Deprecation notices will be published 12 months prior to major transitions.
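An integrator can use the version scheme to decide whether a newer release of the standard needs attention. The sketch below assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffixes; `parse_version`, `change_kind`, and `is_compatible` are hypothetical helper names.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def change_kind(old: str, new: str) -> str:
    """Name the most significant component that differs between two versions."""
    old_v, new_v = parse_version(old), parse_version(new)
    for label, before, after in zip(("MAJOR", "MINOR", "PATCH"), old_v, new_v):
        if before != after:
            return label
    return "NONE"


def is_compatible(client: str, standard: str) -> bool:
    """Same MAJOR number means no breaking changes are expected."""
    return parse_version(client)[0] == parse_version(standard)[0]
```

Under this reading, a client built against 2.0.0 stays compatible with any 2.x.y release and needs migration work only when the MAJOR number changes.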
Migration from v1.0
Existing v1.0 datasets can be upgraded using the official ae-migrate CLI tool. Key changes include:
- Replaced flat content string with structured ae:body array
- Deprecated author_name in favor of ae:contributors with IRI resolution
- Added mandatory ae:provenance array
- Language objects now require ISO 639-1 codes as keys
npm install -g @aevum/migrate
ae-migrate --input v1_dump.json --output v2_compliant.json --strict
ae:body restructuring is recommended before submission.
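To make the listed changes concrete, here is a sketch of how a single v1 record might be reshaped by hand. It does not describe what ae-migrate does internally. The v1 layout (a `title` language object plus `content` and `author_name`), the `contributor_iris` lookup table, the empty section heading, and the function name `upgrade_v1_record` are all assumptions made for illustration.

```python
import re

# Shape check only (two lowercase letters); not a lookup in the ISO 639-1 registry.
ISO_639_1_SHAPE = re.compile(r"^[a-z]{2}$")


def upgrade_v1_record(v1: dict, contributor_iris: dict[str, str]) -> dict:
    """Reshape an assumed v1 record into the v2 field layout."""
    bad_keys = [key for key in v1["title"] if not ISO_639_1_SHAPE.match(key)]
    if bad_keys:
        raise ValueError(f"title keys are not ISO 639-1 shaped: {bad_keys}")

    return {
        "ae:title": dict(v1["title"]),
        # The flat content string becomes a single section; splitting it into
        # properly headed sections is left to manual restructuring.
        "ae:body": [
            {"@type": "ae:Section", "ae:heading": "", "ae:content": v1["content"]}
        ],
        "ae:metadata": {
            # author_name is resolved to a contributor IRI via the lookup table.
            "ae:contributors": [contributor_iris[v1["author_name"]]],
            # Mandatory in v2, but sources cannot be invented: left empty here
            # so an editor has to supply real provenance before submission.
            "ae:provenance": [],
        },
    }
```

The empty ae:provenance array is deliberate: a record upgraded this way is structurally closer to v2 but still needs sources added before it can be submitted.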
API & Implementation
Integrations should use the RESTful ingestion endpoint:
POST https://api.aevum.org/v2/entries
Headers:
Authorization: Bearer <YOUR_TOKEN>
Content-Type: application/ld+json
X-Aevum-Version: 2.0.0
Response codes:
- 201 Created: Validated and queued for publication
- 400 Bad Request: Schema or validation failure
- 409 Conflict: Duplicate or contradictory entry detected
- 429 Too Many Requests: Rate limit exceeded (100 req/min for standard tier)
For high-volume partners, request a dedicated webhook pipeline with async processing and batch validation support.
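The request above can be assembled with Python's standard library alone. This is a minimal sketch, not an official client: the function names `build_request`, `describe_status`, and `submit` are hypothetical, and the sketch treats only 429 as worth retrying, which is an assumption rather than documented behaviour.

```python
import json
import urllib.error
import urllib.request

ENDPOINT = "https://api.aevum.org/v2/entries"

STATUS_MEANINGS = {
    201: "validated and queued for publication",
    400: "schema or validation failure",
    409: "duplicate or contradictory entry detected",
    429: "rate limit exceeded",
}


def build_request(entry: dict, token: str) -> urllib.request.Request:
    """Build the POST request with the three headers listed above."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(entry).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/ld+json",
            "X-Aevum-Version": "2.0.0",
        },
        method="POST",
    )


def describe_status(code: int) -> str:
    """Translate a response code into the meaning given in the list above."""
    return STATUS_MEANINGS.get(code, "unexpected status")


def should_retry(code: int) -> bool:
    """Assumption: only a rate-limit response is worth retrying later."""
    return code == 429


def submit(entry: dict, token: str) -> int:
    """Send the entry and return the HTTP status code."""
    try:
        with urllib.request.urlopen(build_request(entry, token)) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still the useful part.
        return err.code
```

A caller would typically wait before resubmitting when `should_retry` returns true, and surface the `describe_status` text to the contributor for 400 and 409 responses.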
Support & Feedback
Technical questions, schema proposals, and bug reports should be directed to:
- GitHub Issues: github.com/aevum-encyclopedia/data-standards
- Developer Discord: discord.gg/aevum-dev
- Email: standards@aevum.org
Contributors are encouraged to review the Editorial Style Guide alongside this technical specification to ensure semantic and tonal consistency across the encyclopedia.