Lemma Editor Specification (Canonical)

Purpose and scope

The Lemma Editor is the canonical administrative interface for editing lexical entries in the Spoke Plus Content Bank. It is not a generic CRUD form; it is a structured editorial workspace aligned with the System Contract, the current schema snapshot, and the Content Bank operations architecture.

This specification is normative for future automation and manual changes.

Source-of-truth alignment

The Lemma Editor contract must remain aligned with:

  • System Contract (docs/system/system-contract.md)
  • System Map (docs/system/system-map.json)
  • Current Database Schema (docs/system/schema-current.json)
  • Latest Schema docs (docs/schema/latest-schema.md + docs/schema/latest-schema.json)
  • Content Bank operational docs (docs/content-bank/WORD_DETAIL_PANEL_SPEC.md, docs/content-bank/CONTENT_BANK_OPS_AUTOMATION.md)

UI architecture contract (required)

The Lemma Editor UI must follow this component hierarchy:

  • LemmaEditorPage
  • LemmaEditorProvider (state orchestration + data loading)
    • LemmaEditorLayout
    • Tab components:
      • CoreTab
      • TaxonomiesTab
      • MorphologyTab
      • SensesTab
      • TranslationsTab
      • SentencesTab
      • SemanticRelationsTab
      • PronunciationTab
      • ChunksTab
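
The tab list above can be captured as a typed registry so that a missing tab is detectable in a test or review step. This is a minimal sketch, not the actual implementation: the names LEMMA_EDITOR_TABS and missingTabs are hypothetical, and only the tab component names come from this specification.

```typescript
// Hypothetical registry: the identifiers mirror the tab components listed in
// this specification; the constant and function names are illustrative only.
const LEMMA_EDITOR_TABS = [
  "CoreTab",
  "TaxonomiesTab",
  "MorphologyTab",
  "SensesTab",
  "TranslationsTab",
  "SentencesTab",
  "SemanticRelationsTab",
  "PronunciationTab",
  "ChunksTab",
] as const;

type LemmaEditorTab = (typeof LEMMA_EDITOR_TABS)[number];

// Returns the required tabs that are absent from a list of rendered tab names,
// in the order the specification lists them.
function missingTabs(rendered: readonly string[]): LemmaEditorTab[] {
  return LEMMA_EDITOR_TABS.filter((tab) => rendered.indexOf(tab) === -1);
}
```

A refactor check could call missingTabs with the tab names the layout actually renders and fail when the result is non-empty.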

Non-regression rule

  • The Lemma Editor must not be replaced by a simplified CRUD interface.
  • Any refactor must preserve the tabbed editorial architecture and section-level responsibilities.

Editorial sections and canonical database mapping

Each editor section maps to canonical tables and must preserve these relationships:

  1. Core → lemmas
  2. Taxonomies → taxonomy_categories + taxonomy_values (+ pivot usage via content_item_taxonomies)
  3. Morphology (lemma_forms) → lemma_forms
  4. Senses → senses
  5. Translations → translations
  6. Sentences → sentences + sentence_lemmas
  7. Semantic Relations (lexical_relations) → lexical_relations
  8. Pronunciation / TTS → lemma_pronunciations + tts_assets
  9. Chunks / Sentence usage → chunk_components

These mappings are mandatory and must not be collapsed into ad-hoc structures.
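
One way to keep these mappings explicit in code is a single typed lookup from section to table names. The sketch below assumes hypothetical names (SectionId, SECTION_TABLES, sectionsForTable); only the table names are taken from the list above, and no column-level schema is implied.

```typescript
// Hypothetical section identifiers, one per editorial section.
type SectionId =
  | "core"
  | "taxonomies"
  | "morphology"
  | "senses"
  | "translations"
  | "sentences"
  | "semanticRelations"
  | "pronunciation"
  | "chunks";

// Table names are copied from the mapping list in this specification.
const SECTION_TABLES: Readonly<Record<SectionId, readonly string[]>> = {
  core: ["lemmas"],
  taxonomies: ["taxonomy_categories", "taxonomy_values", "content_item_taxonomies"],
  morphology: ["lemma_forms"],
  senses: ["senses"],
  translations: ["translations"],
  sentences: ["sentences", "sentence_lemmas"],
  semanticRelations: ["lexical_relations"],
  pronunciation: ["lemma_pronunciations", "tts_assets"],
  chunks: ["chunk_components"],
};

// Reverse lookup: which sections are responsible for a given table.
function sectionsForTable(table: string): SectionId[] {
  return (Object.keys(SECTION_TABLES) as SectionId[]).filter(
    (section) => SECTION_TABLES[section].indexOf(table) !== -1,
  );
}
```

Because Record<SectionId, ...> requires every section key, removing or collapsing a section in such a lookup would surface as a compile-time error rather than a silent regression.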

Section behavior contract

Core

  • Owns lexical identity and base editorial fields (for example: lemma, lifecycle status fields, media references used by editorial workflows).
  • Reads and persists canonical lemma-level data.

Taxonomies

  • Presents category/value selections from taxonomy_categories and taxonomy_values.
  • Applies lemma classification through the taxonomy join model used by Content Bank.

Morphology

  • Manages inflected or derived forms in lemma_forms.

Senses

  • Manages semantic sense entries in senses.

Translations

  • Exposes the lemma/sense translation data contract, backed by the translations table family.

Sentences

  • Manages usage examples with sentence rows and lemma linkage (sentences, sentence_lemmas).

Semantic Relations

  • Manages lexical links and collocational relations in lexical_relations.

Pronunciation / TTS

  • Handles pronunciation metadata and synthesized/manually managed audio assets (lemma_pronunciations, tts_assets).

Chunks / Sentence usage

  • Manages multiword composition references in chunk_components.

Governance rule: protected implementation files

The following files are protected and may only receive minimal, targeted patches:

  • web/src/app/admin/content-bank/lemmas/[id]/page.tsx
  • web/src/app/components/content-bank/LemmaEditor.tsx

Automation tools must never fully rewrite these files.
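
An automation tool could enforce this rule with a guard that runs before a change is written. The following is a sketch under stated assumptions: this specification does not define "minimal", so the 20% threshold (MAX_CHANGED_RATIO) is purely illustrative, and ProposedChange and isChangeAllowed are hypothetical names. Only the two file paths come from the list above.

```typescript
// Protected paths, copied from this specification.
const PROTECTED_FILES: readonly string[] = [
  "web/src/app/admin/content-bank/lemmas/[id]/page.tsx",
  "web/src/app/components/content-bank/LemmaEditor.tsx",
];

// Hypothetical description of a change an automation tool wants to apply.
interface ProposedChange {
  path: string;
  originalLineCount: number;
  changedLineCount: number;
}

// Assumption: "minimal, targeted" is approximated as touching at most 20% of
// the file's lines. The specification does not state a numeric limit.
const MAX_CHANGED_RATIO = 0.2;

// Returns true when the change may proceed. Unprotected files are always
// allowed; protected files are allowed only for small patches.
function isChangeAllowed(change: ProposedChange): boolean {
  if (PROTECTED_FILES.indexOf(change.path) === -1) {
    return true;
  }
  if (change.originalLineCount === 0) {
    // Creating or replacing a protected file from scratch counts as a full rewrite.
    return false;
  }
  return change.changedLineCount / change.originalLineCount <= MAX_CHANGED_RATIO;
}
```

A line-ratio check is a coarse proxy; a real guard might instead inspect the diff hunks or require human review for any change to these paths.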