Spoke Plus Course Engine Specification

1. Purpose

This document defines the current Course Engine behavior and data model, and separates implemented capabilities from planned adaptive logic.

2. Architectural Principle

Single Linguistic Source of Truth

The official Content Bank of SPOKE PLUS is the universal morphological linguistic model.

The system uses vocabulary as the root lexical registry (lemma-level source of truth).
word_forms stores morphological variations derived from each lemma (inflection, conjugation, gender/number variants, etc.).
sentences are governed by explicit tokenization through sentence_tokens, with each token linked to a valid lexical reference (vocabulary and, when applicable, word_forms).
The model is multilingual by design and not segmented into isolated per-language content banks.
All future engine evolution must preserve this single linguistic source.

3. Linguistic Engine (Morphological Core)

3.1 Content Bank Core (Implemented)

The Content Bank is modeled as a lemma-first linguistic graph.

Core entities:

vocabulary: canonical lemma/root entry.
word_forms: morphological realizations of a lemma.
word_tags, word_translations, word_assets, word_grammar: lexical enrichment layers.
semantic_relations: semantic links between lexical entries.
sentences, sentence_tokens, sentence_tags: contextual usage and tokenized sentence structure.
pos_colors, tts_assets: rendering and speech-support layers.
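
As an illustration of the lemma-first shape described above, the sketch below models three of the core entities as plain Python records. It is a minimal sketch, not the actual schema: every field name other than the table names (for example lemma, language, form, features, position) is an assumption introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Vocabulary:
    """Canonical lemma/root entry (field names are hypothetical)."""
    id: int
    lemma: str
    language: str


@dataclass(frozen=True)
class WordForm:
    """Morphological realization of a lemma (field names are hypothetical)."""
    id: int
    vocabulary_id: int
    form: str
    features: str


@dataclass(frozen=True)
class SentenceToken:
    """One token of a sentence, linked back to the lexical bank."""
    sentence_id: int
    position: int
    surface: str
    vocabulary_id: int
    word_form_id: Optional[int] = None


def forms_of(lemma_id: int, word_forms: List[WordForm]) -> List[str]:
    """Return every stored surface form that derives from the given lemma."""
    return [wf.form for wf in word_forms if wf.vocabulary_id == lemma_id]


# Example data: one lemma with two derived forms.
go = Vocabulary(id=1, lemma="go", language="en")
example_forms = [
    WordForm(id=10, vocabulary_id=1, form="went", features="past"),
    WordForm(id=11, vocabulary_id=1, form="goes", features="3sg-present"),
]
example_result = forms_of(go.id, example_forms)
```

The point of the shape is that forms and tokens never carry their own lexical identity; they always point back to a vocabulary entry.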

3.2 Sentence Integrity Rule (Implemented)

No sentence is considered valid content without token-level linkage.

sentence_tokens is mandatory as the structural bridge between sentence text and the lexical bank.
Every token must reference an existing lexical source (vocabulary, with morphological form support via word_forms where applicable).
This guarantees consistency between authored sentence text and reusable linguistic units.
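
The rule above can be sketched as a small validation function. This is an illustrative check under stated assumptions, not the engine's actual implementation: the Token record, its field names, and the word_form_owner lookup (word form id to owning vocabulary id) are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Token(NamedTuple):
    """Hypothetical in-memory view of one sentence_tokens row."""
    position: int
    surface: str
    vocabulary_id: Optional[int]
    word_form_id: Optional[int] = None


def validate_sentence(
    tokens: List[Token],
    vocabulary_ids: Set[int],
    word_form_owner: Dict[int, int],
) -> List[str]:
    """Return a list of problems; an empty list means the sentence is valid."""
    if not tokens:
        return ["sentence has no tokens"]
    problems = []
    for t in tokens:
        # Every token must point at an existing lemma.
        if t.vocabulary_id not in vocabulary_ids:
            problems.append(f"token {t.position} ({t.surface!r}): unknown vocabulary id")
            continue
        # A word form, when present, must belong to the same lemma.
        if t.word_form_id is not None:
            owner = word_form_owner.get(t.word_form_id)
            if owner is None:
                problems.append(f"token {t.position} ({t.surface!r}): unknown word form id")
            elif owner != t.vocabulary_id:
                problems.append(f"token {t.position} ({t.surface!r}): form belongs to another lemma")
    return problems


# Example data: lemma 1 = "she", lemma 2 = "go" with form 20 = "went".
vocab = {1, 2}
owners = {20: 2}
valid = [Token(0, "she", 1), Token(1, "went", 2, 20)]
broken = [Token(0, "she", 1), Token(1, "went", 1, 20)]
```

Here validate_sentence(valid, vocab, owners) returns an empty list, while the broken example is rejected because form 20 is attached to the wrong lemma.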

3.3 Universal Multilingual Model (Implemented)

The linguistic bank is universal and shared.

The same lexical infrastructure supports all language pairs.
Language metadata is carried in records and relations, not by duplicating isolated schemas.
Reuse happens across courses/lessons through mappings, while lexical truth remains centralized.

3.4 Evaluation Direction (Planned)

Assessment logic is evolving toward token-level linguistic diagnostics.

Future evaluation will be token-by-token.
Scoring will combine morphological correctness and syntactic role/ordering checks.
This structure is the foundation for adaptive remediation and precision feedback.
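
To make the planned direction concrete, the sketch below shows one possible shape for a token-by-token diagnostic. It is purely illustrative: the label names, the (vocabulary_id, word_form_id) pair representation, and the position-based comparison are assumptions made here, and the normative behavior is defined in docs/MORPHOLOGICAL_EVALUATION_SPEC.md.

```python
from typing import List, Optional, Tuple

# Hypothetical token reference: (vocabulary_id, word_form_id).
TokenRef = Tuple[int, Optional[int]]


def diagnose(expected: List[TokenRef], answer: List[TokenRef]) -> List[str]:
    """Label each expected token by comparing it with the learner's answer."""
    labels = []
    for i, exp in enumerate(expected):
        got = answer[i] if i < len(answer) else None
        if got == exp:
            labels.append("ok")  # right lemma, right form, right slot
        elif got is not None and got[0] == exp[0]:
            labels.append("wrong_form")  # right lemma in this slot, wrong morphology
        elif exp in answer:
            labels.append("wrong_position")  # correct token exists, but elsewhere
        else:
            labels.append("missing")
    return labels


# Example: "she went home" with lemma ids 1, 2, 3; form 20 = "went", form 21 = "goes".
target = [(1, None), (2, 20), (3, None)]
wrong_morphology = diagnose(target, [(1, None), (2, 21), (3, None)])
wrong_order = diagnose(target, [(1, None), (3, None), (2, 20)])
```

Separating the morphology label from the position label is what would let later remediation target the specific kind of error rather than the sentence as a whole.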

4. Course Structure

4.1 Hierarchy (Implemented)

The engine uses a normalized progression hierarchy:

courses
units
skills
lessons
lesson_content_map

Ordering is maintained through order_index columns and supporting indexes.

4.2 Skill-Level Progression Thresholds (Implemented)

skills includes:

min_sessions_required
min_mastery_required

These fields store per-skill progression thresholds in the data model. The unlock checks that will consume them are planned (see Section 6.2).

5. Skill Dependency Model

5.1 Current State

A dedicated skill_dependencies table is planned but is not yet present in the current migrations.

5.2 Planned Graph Concept

The target model is a directed graph where edges represent prerequisites:

Parent skill → child skill
Unlock evaluation through dependency checks

5.3 Planned Unlock Types

Sequential: unlock by structural order.
Dependency: unlock when prerequisite graph constraints are satisfied.
Manual: unlock by admin/override controls.
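
The three unlock types and the parent-to-child prerequisite graph could be evaluated along the following lines. This is a sketch of a planned feature, so everything in it is an assumption: the function name, the string values for the unlock types, the in-memory prerequisites mapping standing in for the future skill_dependencies table, and the reading of "sequential" as "every earlier skill is completed".

```python
from typing import Dict, List, Set


def is_unlocked(
    skill: str,
    unlock_type: str,
    *,
    order: List[str],
    prerequisites: Dict[str, Set[str]],
    completed: Set[str],
    manual_overrides: Set[str],
) -> bool:
    """Decide whether a skill is unlocked under one of the three planned types."""
    if unlock_type == "manual":
        # Unlocked only by an explicit admin/override entry.
        return skill in manual_overrides
    if unlock_type == "sequential":
        # Assumed meaning: every skill earlier in structural order is completed.
        idx = order.index(skill)
        return all(s in completed for s in order[:idx])
    if unlock_type == "dependency":
        # All parent skills (prerequisites) must be completed.
        return prerequisites.get(skill, set()) <= completed
    raise ValueError(f"unknown unlock type: {unlock_type!r}")


# Example data: "food" depends only on "greetings".
ctx = dict(
    order=["greetings", "numbers", "food"],
    prerequisites={"food": {"greetings"}},
    completed={"greetings"},
    manual_overrides=set(),
)
```

With this data, "food" is unlocked under the dependency type but not under the sequential type, because "numbers" has not been completed.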

6. Unlock Simulation Concept

6.1 Current Foundation (Implemented Data Inputs)

Current platform data already captures inputs needed for unlock simulation:

Session records (sessions)
Scoring and correctness aggregates (vw_session_summary, attempts data)
Skill progress metrics (user_skill_progress)

6.2 Planned Simulation Rules

Future unlock checks are expected to evaluate:

sessions_completed against min_sessions_required
completion_score (or an equivalent mastery metric) against min_mastery_required
Dependency graph validation (once dependencies are modeled)
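
The checks listed above can be sketched as a single simulation function. This is a hedged illustration of planned behavior: the record types, the use of >= for both comparisons, and the assumption that completion_score and min_mastery_required share the same scale are not confirmed by the current schema.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class SkillThresholds:
    """Per-skill thresholds, mirroring the two columns on skills."""
    min_sessions_required: int
    min_mastery_required: float


@dataclass
class SkillProgress:
    """Hypothetical per-user snapshot, e.g. derived from user_skill_progress."""
    sessions_completed: int
    completion_score: float


def simulate_unlock(
    thresholds: SkillThresholds,
    progress: SkillProgress,
    prerequisites_met: bool = True,
) -> Tuple[bool, Dict[str, bool]]:
    """Return the overall decision plus the result of each individual check."""
    checks = {
        "sessions": progress.sessions_completed >= thresholds.min_sessions_required,
        "mastery": progress.completion_score >= thresholds.min_mastery_required,
        # Stays True until dependencies are actually modeled.
        "dependencies": prerequisites_met,
    }
    return all(checks.values()), checks


# Example: enough sessions, but mastery is below the threshold.
decision, detail = simulate_unlock(
    SkillThresholds(min_sessions_required=3, min_mastery_required=0.8),
    SkillProgress(sessions_completed=3, completion_score=0.75),
)
```

Returning the per-check breakdown alongside the boolean makes it possible to explain why a skill is still locked, which is useful for a simulation tool.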

7. XP Economy Concept

7.1 Current State (Implemented)

XP is tracked in user-facing progression tables (user_xp) and session-level earned XP (sessions.earned_xp).
Cookie balance is tracked in user_wallet with transaction history.

7.2 Planned Reward Expansion

XP reward balancing per skill/session type.
Cookie reward balancing tied to session outcomes.
Streak multiplier mechanics as a future progression modifier.

8. Future Adaptive Engine Overview

The current system already exposes weak-item and due-item analytics views that will feed adaptive logic:

vw_practice_most_wrong
vw_practice_due_items

Planned adaptive capabilities:

Weak skill detection and ranking.
Error-rate weighted recommendations.
Smart session generation from weakest/due targets.
Difficulty balancing based on learner outcomes.
Linguistic adaptation based on token-level morphology and syntax signals.

These capabilities are roadmap design targets and are not yet fully automated runtime behavior.
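
As one example of how error-rate weighted ranking might work, the sketch below orders items by their share of wrong attempts. It is an assumption-laden illustration: the row keys (item_id, attempts, wrong), the min_attempts cutoff, and the tie-break by item id are hypothetical and do not describe the actual columns of vw_practice_most_wrong.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def rank_weak_items(rows: List[Row], min_attempts: int = 3) -> List[str]:
    """Return item ids ordered from highest to lowest error rate."""
    scored = [
        (r["wrong"] / r["attempts"], r["item_id"])
        for r in rows
        # Skip items with too few attempts to give a meaningful rate.
        if r["attempts"] >= min_attempts
    ]
    # Highest error rate first; ties broken by item id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored]


# Example rows: "c" is excluded because it has only one attempt.
sample_rows = [
    {"item_id": "a", "attempts": 10, "wrong": 5},
    {"item_id": "b", "attempts": 4, "wrong": 3},
    {"item_id": "c", "attempts": 1, "wrong": 1},
]
ranking = rank_weak_items(sample_rows)
```

A minimum-attempts cutoff is one simple way to keep a single unlucky answer from dominating the ranking; a production design might prefer a smoothed rate instead.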

9. Official Morphological Specs (Reference)

The official detailed specs for token-level morphology and lexical engineering are maintained in:

docs/MORPHOLOGICAL_EVALUATION_SPEC.md
docs/MORPHOLOGY_GENERATION_SPEC.md
docs/WORD_DETAIL_PANEL_SPEC.md
docs/irregulars/en_verbs.json (EN irregular overrides)

These documents are normative for implementation details and should be treated as the source for future API/UI behavior in token-level evaluation and word-detail tooling.