Schema Snapshot (Documentation Canonical View)
- Snapshot date: 2026-03-08
- Scope: Content Bank + Language Learning Engine
- Purpose: authoritative documentation aligned with the implemented, taxonomy-driven architecture
Canonical Content Bank tables
- vocabulary
- lemma_forms
- senses
- sense_translations
- sentences
- sentence_tokens
Canonical taxonomy tables
- taxonomy_categories
- taxonomy_values
- content_item_taxonomies
Taxonomy relationship:
vocabulary -> content_item_taxonomies -> taxonomy_values -> taxonomy_categories
Common category systems include:
- parts_of_speech
- cefr_levels
- frequency_bands
- semantic_domains
- registers
- lemma_types
- grammar_topics
- languages
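The relationship chain above can be sketched as a three-way join. This is a minimal illustration using an in-memory SQLite database; the column names (`item_id`, `taxonomy_value_id`, `category_id`, `name`) and the sample rows are assumptions made for the example and are not taken from the documented schema.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE vocabulary (id INTEGER PRIMARY KEY, lemma TEXT NOT NULL);
        CREATE TABLE taxonomy_categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE taxonomy_values (
            id INTEGER PRIMARY KEY,
            category_id INTEGER NOT NULL REFERENCES taxonomy_categories(id),
            name TEXT NOT NULL
        );
        CREATE TABLE content_item_taxonomies (
            item_id INTEGER NOT NULL REFERENCES vocabulary(id),
            taxonomy_value_id INTEGER NOT NULL REFERENCES taxonomy_values(id)
        );
        INSERT INTO vocabulary VALUES (1, 'run');
        INSERT INTO taxonomy_categories VALUES (1, 'parts_of_speech'), (2, 'cefr_levels');
        INSERT INTO taxonomy_values VALUES (1, 1, 'verb'), (2, 2, 'a1');
        INSERT INTO content_item_taxonomies VALUES (1, 1), (1, 2);
    """)
    return conn


def taxonomy_for_lemma(conn, lemma):
    """Walk vocabulary -> content_item_taxonomies -> taxonomy_values -> taxonomy_categories."""
    rows = conn.execute(
        """
        SELECT c.name, v.name
        FROM vocabulary AS voc
        JOIN content_item_taxonomies AS cit ON cit.item_id = voc.id
        JOIN taxonomy_values AS v ON v.id = cit.taxonomy_value_id
        JOIN taxonomy_categories AS c ON c.id = v.category_id
        WHERE voc.lemma = ?
        """,
        (lemma,),
    ).fetchall()
    # Each row is a (category, value) pair, so it maps directly to a dict.
    return dict(rows)


conn = build_demo_db()
print(taxonomy_for_lemma(conn, "run"))
```

In this sketch each lemma carries at most one value per category; a category that allows several values per item (for example semantic_domains) would need a list per key instead of a plain dict.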
Chunk and composition tables
- vocabulary_components
- chunk_components
Chunks are stored in vocabulary with type='chunk', and their components must reference valid lemmas.
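The chunk rule can be expressed as a small integrity check. The sketch below is only illustrative: it assumes that a "valid lemma" means an existing vocabulary row, that component links are `(chunk_id, component_id)` pairs, and that non-chunk rows use a hypothetical type value `'word'`. None of these details are confirmed by the schema.

```python
def invalid_chunk_components(vocabulary_rows, component_links):
    """Return (chunk_id, component_id, reason) for every link that breaks the rule."""
    types_by_id = {row["id"]: row["type"] for row in vocabulary_rows}
    problems = []
    for chunk_id, component_id in component_links:
        if types_by_id.get(chunk_id) != "chunk":
            # The parent side of a link must be a vocabulary row with type='chunk'.
            problems.append((chunk_id, component_id, "parent is not a chunk"))
        elif component_id not in types_by_id:
            # The component side must point at an existing vocabulary row.
            problems.append((chunk_id, component_id, "component lemma not found"))
    return problems


# Hypothetical sample data: "give up" is a chunk built from "give" and "up".
VOCABULARY = [
    {"id": 1, "lemma": "give", "type": "word"},
    {"id": 2, "lemma": "up", "type": "word"},
    {"id": 3, "lemma": "give up", "type": "chunk"},
]
LINKS = [(3, 1), (3, 2), (3, 99), (1, 2)]

print(invalid_chunk_components(VOCABULARY, LINKS))
```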
Grammar and media extension tables
- lemma_assets
- lemma_grammar
- verb_conjugations
verb_conjugations includes: tense_key, person_key, form, is_irregular.
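As an illustration of how the four listed fields fit together, here is a minimal sketch that models one verb_conjugations row as a dataclass. Only the four documented fields are modeled; the key values (`present_simple`, `1sg`, `3sg`) and the helper `find_form` are hypothetical, and a real row would presumably also link back to its lemma.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbConjugation:
    tense_key: str
    person_key: str
    form: str
    is_irregular: bool


def find_form(rows, tense_key, person_key):
    """Return the first row matching the tense/person pair, or None."""
    for row in rows:
        if row.tense_key == tense_key and row.person_key == person_key:
            return row
    return None


# Hypothetical rows for the English verb "be".
ROWS = [
    VerbConjugation("present_simple", "1sg", "am", True),
    VerbConjugation("present_simple", "3sg", "is", True),
    VerbConjugation("past_simple", "1sg", "was", True),
]

match = find_form(ROWS, "present_simple", "3sg")
print(match.form if match else "no form stored")
```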
Selected vocabulary fields (current documented schema)
- id
- lemma
- lemma_normalized
- type
- language_id
- base_lang
- language_code
- pos
- pos_id
- lemma_type_id
- cefr_level
- cefr_level_id
- frequency_rank
- introduced_chapter
- introduced_unit_id
- introduced_step
- difficulty_score
- editorial_status
- created_at
- updated_at
Progression + difficulty notes:
- introduced_unit_id and introduced_step feed Lexical Unlock Graph progression logic.
- difficulty_score is normalized to [0.0, 1.0] (0.0 easiest, 1.0 most difficult) and can come from taxonomy heuristics or AI suggestion.
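The [0.0, 1.0] range for difficulty_score could be enforced with a small guard like the one below. This is a sketch under stated assumptions: the function name is hypothetical, and clamping out-of-range values (rather than rejecting them) is a choice made for the example, not documented behavior.

```python
import math


def normalize_difficulty(raw):
    """Clamp a raw score into [0.0, 1.0]; 0.0 is easiest, 1.0 is most difficult."""
    value = float(raw)
    if math.isnan(value):
        # NaN cannot be ordered, so it cannot be clamped meaningfully.
        raise ValueError("difficulty score must be a number")
    return min(1.0, max(0.0, value))


print(normalize_difficulty(0.35))
print(normalize_difficulty(1.7))
print(normalize_difficulty(-0.2))
```

The same guard applies regardless of whether the raw score came from taxonomy heuristics or from an AI suggestion.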
Grammar-topic language scope
grammar_topics values are language-scoped at the taxonomy level: each topic value belongs to a specific language.
Examples:
- English: present_simple, past_simple, modal_verbs
- Portuguese: preterito_perfeito, preterito_imperfeito
- Turkish: aorist, evidential_past
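Language scoping can be pictured as a lookup keyed by language. The sketch below reuses the example topics listed above; the language codes (`en`, `pt`, `tr`), the in-memory mapping, and the function name are assumptions for illustration only, since the actual scoping lives in the taxonomy tables.

```python
# Hypothetical in-memory view of language-scoped grammar_topics values.
GRAMMAR_TOPICS_BY_LANGUAGE = {
    "en": {"present_simple", "past_simple", "modal_verbs"},
    "pt": {"preterito_perfeito", "preterito_imperfeito"},
    "tr": {"aorist", "evidential_past"},
}


def is_topic_in_scope(language_code, topic):
    """True only if the topic is defined for the given language."""
    return topic in GRAMMAR_TOPICS_BY_LANGUAGE.get(language_code, set())


print(is_topic_in_scope("tr", "aorist"))
print(is_topic_in_scope("en", "aorist"))
```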
Compatibility note
No documented feature is removed. Legacy entities and mirror fields remain supported for backward compatibility while taxonomy mappings remain canonical.
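One way to read "taxonomy mappings remain canonical" is that a reader prefers the taxonomy-backed value and falls back to the legacy mirror field only when no mapping exists. The sketch below illustrates that reading for part of speech. It assumes that pos_id points at a taxonomy value and that pos is its mirror text field; the schema snapshot does not confirm either the pairing or the precedence, and the function name is hypothetical.

```python
def resolve_pos(vocabulary_row, taxonomy_values_by_id):
    """Prefer the taxonomy mapping; fall back to the legacy mirror field."""
    pos_id = vocabulary_row.get("pos_id")
    if pos_id is not None and pos_id in taxonomy_values_by_id:
        return taxonomy_values_by_id[pos_id]
    # No usable taxonomy mapping: use the mirror field kept for compatibility.
    return vocabulary_row.get("pos")


# Hypothetical data: id 10 is the taxonomy value for "verb".
TAXONOMY_VALUES = {10: "verb"}

print(resolve_pos({"lemma": "run", "pos": "v", "pos_id": 10}, TAXONOMY_VALUES))
print(resolve_pos({"lemma": "old", "pos": "adj", "pos_id": None}, TAXONOMY_VALUES))
```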
Language-engine expansion tables (implemented additive layers)
- lemma_roles
- lemma_semantic_classes
- verb_object_constraints
- modifier_constraints
- student_lemma_progress
- pattern_difficulty
- lemma_frequency
- collocation_strength
- sentence_patterns
- conversation_turns
- conversation_lemmas
Audit-required table status:
- lemma_roles: implemented and used.
- student_lemma_progress: implemented and used.
- lemma_frequency: implemented and used.
- sentence_patterns: implemented and used.
- conversation_turns: implemented and used.
- conversation_lemmas: implemented and used.