Lemma Editor Specification (Canonical)

Purpose and scope

The Lemma Editor is the canonical administrative interface for editing lexical entries in Spoke Plus Content Bank. It is not a generic CRUD form; it is a structured editorial workspace aligned with the System Contract, the current schema snapshot, and the Content Bank operations architecture.

This specification is normative for all future changes to the editor, whether automated or manual.

Source-of-truth alignment

The Lemma Editor contract must remain aligned with:

System Contract (docs/system/system-contract.md)
System Map (docs/system/system-map.json)
Current Database Schema (docs/system/schema-current.json)
Latest Schema docs (docs/schema/latest-schema.md + docs/schema/latest-schema.json)
Content Bank operational docs (docs/content-bank/WORD_DETAIL_PANEL_SPEC.md, docs/content-bank/CONTENT_BANK_OPS_AUTOMATION.md)

UI architecture contract (required)

The Lemma Editor UI must follow this component hierarchy:

- LemmaEditorPage
  - LemmaEditorProvider (state orchestration + data loading)
    - LemmaEditorLayout
    - Tab components:
      - CoreTab
      - TaxonomiesTab
      - MorphologyTab
      - SensesTab
      - TranslationsTab
      - SentencesTab
      - SemanticRelationsTab
      - PronunciationTab
      - ChunksTab

Non-regression rule

The Lemma Editor must not be replaced by a simplified CRUD interface.
Any refactor must preserve the tabbed editorial architecture and section-level responsibilities.
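
One way to make the non-regression rule checkable is a small guard that compares the tabs an implementation registers against the required list. The sketch below is illustrative only: the tab names come from the hierarchy above, while `REQUIRED_TABS` and `missingTabs` are hypothetical names that do not exist in the codebase as far as this specification states.

```typescript
// Tab component names taken from the UI architecture contract.
const REQUIRED_TABS = [
  "CoreTab",
  "TaxonomiesTab",
  "MorphologyTab",
  "SensesTab",
  "TranslationsTab",
  "SentencesTab",
  "SemanticRelationsTab",
  "PronunciationTab",
  "ChunksTab",
] as const;

type TabName = (typeof REQUIRED_TABS)[number];

// Hypothetical helper: returns every required tab that is absent from
// `registered`. An empty result means the tabbed architecture is intact.
function missingTabs(registered: readonly string[]): TabName[] {
  return REQUIRED_TABS.filter((tab) => registered.indexOf(tab) === -1);
}
```

A refactor that drops tabs would then surface as a non-empty result, for example `missingTabs(["CoreTab", "SensesTab"])` lists the seven tabs that were removed.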

Editorial sections and canonical database mapping

Each editor section maps to canonical tables and must preserve these relationships:

Core → lemmas
Taxonomies → taxonomy_categories + taxonomy_values (linked through the content_item_taxonomies pivot table)
Morphology → lemma_forms
Senses → senses
Translations → translations
Sentences → sentences + sentence_lemmas
Semantic Relations → lexical_relations
Pronunciation / TTS → lemma_pronunciations + tts_assets
Chunks / Sentence usage → chunk_components

These mappings are mandatory and must not be collapsed into ad-hoc structures.
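
The mapping above can be held as a single typed constant so that it is not re-derived ad hoc in each component. This is a minimal sketch, not the project's actual code: the table names are those listed above, while `SECTION_TABLES` and `missingTables` are hypothetical. Because the layout of docs/system/schema-current.json is not described here, the check takes a plain list of table names rather than parsing that file.

```typescript
type Section =
  | "Core"
  | "Taxonomies"
  | "Morphology"
  | "Senses"
  | "Translations"
  | "Sentences"
  | "Semantic Relations"
  | "Pronunciation / TTS"
  | "Chunks / Sentence usage";

// Section-to-table mapping copied from the specification.
const SECTION_TABLES: Readonly<Record<Section, readonly string[]>> = {
  "Core": ["lemmas"],
  "Taxonomies": ["taxonomy_categories", "taxonomy_values", "content_item_taxonomies"],
  "Morphology": ["lemma_forms"],
  "Senses": ["senses"],
  "Translations": ["translations"],
  "Sentences": ["sentences", "sentence_lemmas"],
  "Semantic Relations": ["lexical_relations"],
  "Pronunciation / TTS": ["lemma_pronunciations", "tts_assets"],
  "Chunks / Sentence usage": ["chunk_components"],
};

// Hypothetical check: returns mapped tables that do not appear in
// `schemaTables` (a list of table names supplied by the caller).
function missingTables(schemaTables: readonly string[]): string[] {
  const missing: string[] = [];
  for (const section of Object.keys(SECTION_TABLES) as Section[]) {
    for (const table of SECTION_TABLES[section]) {
      if (schemaTables.indexOf(table) === -1) missing.push(table);
    }
  }
  return missing;
}
```

Declaring the keys as a closed `Section` union means that removing or renaming a section fails type-checking instead of silently collapsing the mapping.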

Section behavior contract

Core

Owns lexical identity and base editorial fields (for example: lemma, lifecycle status fields, media references used by editorial workflows).
Reads and persists canonical lemma-level data.

Taxonomies

Presents category/value selections from taxonomy_categories and taxonomy_values.
Applies lemma classification through the taxonomy join model used by Content Bank (content_item_taxonomies).

Morphology

Manages inflected or derived forms in lemma_forms.

Senses

Manages semantic sense entries in senses.

Translations

Exposes the lemma/sense translation data contract, backed by the translations table family.

Sentences

Manages usage examples as sentence rows (sentences) linked to lemmas through sentence_lemmas.

Semantic Relations

Manages lexical links and collocational relations in lexical_relations.

Pronunciation / TTS

Handles pronunciation metadata and audio assets, whether synthesized or manually managed (lemma_pronunciations, tts_assets).

Chunks / Sentence usage

Manages multiword composition references in chunk_components.

Governance rule: protected implementation files

The following files are protected and may only receive minimal, targeted patches:

web/src/app/admin/content-bank/lemmas/[id]/page.tsx
web/src/app/components/content-bank/LemmaEditor.tsx

Automation tools must never fully rewrite these files.
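
An automation tool could enforce this rule with a pre-write guard along the following lines. This is a sketch under stated assumptions: the two paths are the protected files listed above, but `ProposedChange`, `isChangeAllowed`, and the 20% threshold are hypothetical. The specification does not define "minimal" numerically, so the ratio stands in for whatever limit the project adopts.

```typescript
// Protected paths as listed in the governance rule.
const PROTECTED_FILES: readonly string[] = [
  "web/src/app/admin/content-bank/lemmas/[id]/page.tsx",
  "web/src/app/components/content-bank/LemmaEditor.tsx",
];

// Hypothetical description of a change an automation tool wants to apply.
interface ProposedChange {
  path: string;         // repository-relative path
  linesChanged: number; // lines added, removed, or modified
  linesInFile: number;  // line count of the file before the change
}

// Assumed threshold; not defined by the specification.
const MAX_CHANGED_RATIO = 0.2;

// Returns true when the change may proceed. Unprotected files are always
// allowed; protected files are allowed only for small, targeted patches.
function isChangeAllowed(
  change: ProposedChange,
  maxRatio: number = MAX_CHANGED_RATIO,
): boolean {
  if (PROTECTED_FILES.indexOf(change.path) === -1) return true;
  if (change.linesInFile === 0) return false; // writing the file from scratch
  return change.linesChanged / change.linesInFile <= maxRatio;
}
```

A line-ratio check is a coarse proxy for "targeted"; a stricter guard might also require that changes stay within a single function or hunk.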