
OPERATIONS — SpokePlus Process Governance

Process architecture

SpokePlus runs exclusively under PM2, from a single ecosystem file (ecosystem.config.cjs):

  1. spokeplus: backend API (server.js)
  2. spokeplus: TTS worker (workers/ttsWorker.js)
  3. spokeplus-web: Next.js (web, port 3001)

Fixed policies for every process:

  • instances: 1
  • exec_mode: fork
  • autorestart: true
  • max_memory_restart: 500M
  • NODE_ENV=production
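
The fixed policies above can be expressed as a small check over one entry of the ecosystem file's apps list. This is a minimal sketch, not code from the repository: the AppConfig type and the policyViolations helper are hypothetical, while the field names (instances, exec_mode, autorestart, max_memory_restart, env) are standard PM2 ecosystem options.

```typescript
// Hypothetical shape of one app entry in ecosystem.config.cjs.
interface AppConfig {
  name: string;
  script: string;
  instances?: number;
  exec_mode?: string;
  autorestart?: boolean;
  max_memory_restart?: string;
  env?: Record<string, string>;
}

// Returns one message per fixed policy that the entry does not satisfy.
function policyViolations(app: AppConfig): string[] {
  const violations: string[] = [];
  if (app.instances !== 1) violations.push("instances must be 1");
  if (app.exec_mode !== "fork") violations.push("exec_mode must be fork");
  if (app.autorestart !== true) violations.push("autorestart must be true");
  if (app.max_memory_restart !== "500M") {
    violations.push("max_memory_restart must be 500M");
  }
  if (app.env?.NODE_ENV !== "production") {
    violations.push("NODE_ENV must be production");
  }
  return violations;
}

// Example entry for the backend API, using the values listed above.
const apiApp: AppConfig = {
  name: "spokeplus",
  script: "server.js",
  instances: 1,
  exec_mode: "fork",
  autorestart: true,
  max_memory_restart: "500M",
  env: { NODE_ENV: "production" },
};
```

A check like this could run in CI before the deploy step, so that an entry that drifts from the policy fails early instead of at runtime.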

Ports in use

  • 3000: backend API (spokeplus)
  • 3001: web frontend (spokeplus-web)

Command to check that the API is listening:

ss -ltnp | grep :3000

Deploy flow

Automated deploys (GitHub Actions) must follow this flow:

  1. npm ci --omit=dev at the repository root
  2. npm ci && npm run build in web/
  3. pm2 start ecosystem.config.cjs || pm2 reload ecosystem.config.cjs
  4. pm2 save

Forbidden: starting the backend on its own with pm2 start server.js.

Standard troubleshooting

EADDRINUSE error

Symptom: Port 3000 already in use. Exiting.

Diagnosis:

lsof -i :3000

Action:

  1. Identify the process outside PM2 that is holding the port.
  2. Terminate the stray process.
  3. Restart through PM2 (pm2 reload ecosystem.config.cjs).

Backend does not start

  1. Check whether it was started manually (blocked by governance).
  2. Confirm the PM2 state: pm2 list.
  3. Check the logs: pm2 logs spokeplus --lines 100.

Safe restart procedure

  • Apply changes without manually taking the processes down:
pm2 reload ecosystem.config.cjs
pm2 save
  • To start from scratch:
pm2 start ecosystem.config.cjs
pm2 save

Officially supported commands

npm run pm2:start
npm run pm2:reload
npm run pm2:stop
pm2 list
pm2 logs spokeplus --lines 100
pm2 logs spokeplus-web --lines 100

Spok chat: routes, auth, and conversation

  • The frontend uses web/lib/api.ts and sends every /admin/* route to NEXT_PUBLIC_API_BASE_URL (fallback https://api.spokeplus.com).
  • apiFetch sends Authorization: Bearer <supabase session access_token> and also credentials: include, to support httpOnly cookies where applicable.
  • On 401/403, the app marks the session as expired and redirects to /admin/login (without a redirect loop).
  • SpokPanel always sends conversation_id in POST /admin/copilot/messages and GET /admin/copilot/history.
  • SpokPanel uses POST /admin/copilot/messages/stream (SSE) for messages without attachments and keeps POST /admin/copilot/messages (multipart) for messages with attachments.
  • New Conversation (Nova Conversa) clears local state immediately and generates a new conversation_id; by API contract, the history for a new conversation is empty.
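
The request rules above (API host with fallback, Bearer token plus credentials: include, 401/403 handling) could look roughly like the sketch below. It is an illustration only: buildAdminRequest, isAuthFailure, and the AdminRequest type are hypothetical names, and the real apiFetch in web/lib/api.ts may be structured differently.

```typescript
// Fallback host taken from the rule above.
const DEFAULT_API_BASE_URL = "https://api.spokeplus.com";

interface AdminRequest {
  url: string;
  init: {
    headers: Record<string, string>;
    credentials: "include";
  };
}

// Builds the URL and fetch options for an /admin/* call.
// baseUrl stands in for NEXT_PUBLIC_API_BASE_URL; when it is missing or
// empty, the documented fallback host is used.
function buildAdminRequest(
  path: string,
  accessToken: string,
  baseUrl?: string,
): AdminRequest {
  const base = (baseUrl || DEFAULT_API_BASE_URL).replace(/\/+$/, "");
  return {
    url: `${base}${path}`,
    init: {
      headers: { Authorization: `Bearer ${accessToken}` },
      credentials: "include",
    },
  };
}

// 401 and 403 both mean the session must be treated as expired.
function isAuthFailure(status: number): boolean {
  return status === 401 || status === 403;
}
```

A caller would pass the result to fetch(request.url, request.init) and, when isAuthFailure(response.status) is true, mark the session expired once and redirect to /admin/login.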

Route map

  • app.spokeplus.com → Next.js (port 3001).
  • api.spokeplus.com → Express API (port 3000).
  • The admin frontend must call the API through the API host to avoid collisions with Next page routes.

Spok smoke test

Run node scripts/spokSmokeTest.mjs with these variables:

  • SPOKEPLUS_API_BASE_URL (default https://api.spokeplus.com)
  • ADMIN_TOKEN (admin Bearer token)

Checks:

  1. POST to /admin/copilot/messages returns assistant_message.
  2. GET to /admin/copilot/history with a new conversation_id returns an empty history.

SSE streaming in Spok (POST /admin/copilot/messages/stream)

  • Request Content-Type: application/json.
  • Response Content-Type: text/event-stream.
  • Events sent by the backend:
      • data: {"content":"..."} for incremental chunks;
      • data: {"done":true} at the end of the stream.
  • Headers required to prevent buffering:
      • Cache-Control: no-cache, no-transform
      • Connection: keep-alive
      • X-Accel-Buffering: no
  • If the response enters confirmation governance (needsConfirmation), the endpoint sends a single event carrying the confirmation payload and closes without incremental content streaming.
  • The attachment flow stays outside this endpoint (use multipart on /admin/copilot/messages).
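
On the client side, the event format above can be consumed by buffering the response text and splitting it into complete events. The sketch below is one possible approach, not the SpokPanel implementation: parseSseBuffer, collectReply, and SpokEvent are hypothetical names, and it assumes events are separated by a blank line with LF line endings and that each data: line carries one complete JSON object.

```typescript
// Assumed payload shape, based on the two events documented above.
type SpokEvent = { content?: string; done?: boolean };

// Splits buffered SSE text into complete events plus the unfinished tail.
function parseSseBuffer(buffer: string): { events: SpokEvent[]; rest: string } {
  const blocks = buffer.split("\n\n");
  // The last piece is either empty or an event that has not fully arrived.
  const rest = blocks.pop() ?? "";
  const events: SpokEvent[] = [];
  for (const block of blocks) {
    for (const line of block.split("\n")) {
      if (line.startsWith("data:")) {
        events.push(JSON.parse(line.slice(5).trim()) as SpokEvent);
      }
    }
  }
  return { events, rest };
}

// Accumulates the assistant text from network chunks that may split an
// event at any position.
function collectReply(chunks: string[]): { text: string; done: boolean } {
  let buffer = "";
  let text = "";
  let done = false;
  for (const chunk of chunks) {
    const parsed = parseSseBuffer(buffer + chunk);
    buffer = parsed.rest;
    for (const event of parsed.events) {
      if (event.content !== undefined) text += event.content;
      if (event.done) done = true;
    }
  }
  return { text, done };
}
```

Keeping the unfinished tail in a buffer matters because a proxy or the network can deliver half an event in one read; parsing each chunk in isolation would fail on the truncated JSON.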

Nginx (proxy in front of the API)

In the API location/proxy block, make sure these directives are set:

proxy_buffering off;
proxy_read_timeout 3600;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

Attachments in Spok (POST /admin/copilot/messages)

  • The endpoint accepts multipart/form-data with these fields:
      • message
      • conversation_id
      • confirm (optional)
      • files[] (0..5 files)
  • Governance limits enforced by the backend:
      • at most 5 files per message;
      • up to 10MB per file;
      • in-memory upload via multer.memoryStorage().
  • Storage:
      • Supabase Storage bucket: copilot-attachments (created automatically if it does not exist);
      • path: copilot-attachments/<conversationId>/<messageId>/<sanitized_filename>;
      • files stay private and the backend returns a signedUrl per attachment.
  • Metadata (name, size, mime, url, storage_path) is recorded in request_payload.attachments of ai_action_logs, preserving history and compatibility with older conversations.
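
The limits and the storage path layout above can be illustrated with a short sketch. Everything named here is hypothetical (validateAttachments, sanitizeFilename, storagePath, IncomingFile); in particular, the backend's actual filename sanitization rule is not documented, so the character whitelist below is only an assumption, as is reading 10MB as 10 × 1024 × 1024 bytes.

```typescript
const MAX_FILES = 5;
// Assumption: "10MB" is interpreted as 10 MiB.
const MAX_FILE_BYTES = 10 * 1024 * 1024;

interface IncomingFile {
  name: string;
  size: number; // bytes
}

// Returns one error message per violated limit; an empty list means OK.
function validateAttachments(files: IncomingFile[]): string[] {
  const errors: string[] = [];
  if (files.length > MAX_FILES) {
    errors.push(`too many files: ${files.length} (max ${MAX_FILES})`);
  }
  for (const file of files) {
    if (file.size > MAX_FILE_BYTES) {
      errors.push(`${file.name} exceeds the per-file limit`);
    }
  }
  return errors;
}

// Assumed rule: keep letters, digits, dot, underscore and hyphen,
// and replace every other character with an underscore.
function sanitizeFilename(name: string): string {
  return name.replace(/[^A-Za-z0-9._-]/g, "_");
}

// Builds the documented path layout for one attachment.
function storagePath(
  conversationId: string,
  messageId: string,
  filename: string,
): string {
  return `copilot-attachments/${conversationId}/${messageId}/${sanitizeFilename(filename)}`;
}
```

Validating before upload keeps oversized requests from occupying memory, which matters here because multer.memoryStorage() holds every file in RAM.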

Admin polling rules (stability / anti-storm)

To prevent /admin/* request storms and 429 cascades, all admin polling must follow these constraints:

  • Use web/lib/polling.ts:startPolling for interval-based fetching.
  • Minimum interval is 5000ms for any poll.
  • Recommended interval is 15000ms for system health/status cards and queues.
  • /admin/system/errors poll should be >=20000ms.
  • Logs auto-refresh must be OFF by default and manual refresh is preferred.
  • Poll loops must stop when:
      • component unmounts (always return cleanup);
      • tab is hidden (document.visibilityState !== 'visible');
      • auth is no longer allowed (unauthenticated, sessionExpired, forbidden);
      • global API rate-limit cooldown is active.
  • Avoid useEffect dependency loops:
      • fetch effects should depend on stable routing keys (e.g. courseId) and auth state;
      • never use loaded arrays/objects (e.g. items, themes, status) as fetch-trigger dependencies.
  • web/lib/api.ts provides global single-flight dedupe and client-side 429 backoff; callers should surface errors but must not implement aggressive retry loops.
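
The interval floor and the stop conditions above reduce to two small pure functions. This is a sketch of the rules only, not the startPolling implementation in web/lib/polling.ts: PollGate, effectiveInterval, and shouldPoll are hypothetical names, and the "authenticated" state label is an assumption (the source names only the disallowed states).

```typescript
const MIN_POLL_INTERVAL_MS = 5000;

interface PollGate {
  mounted: boolean;
  // In the browser this would come from document.visibilityState.
  visibilityState: string;
  authState: "authenticated" | "unauthenticated" | "sessionExpired" | "forbidden";
  rateLimitCooldownActive: boolean;
}

// Never poll faster than the 5000ms floor, whatever the caller requests.
function effectiveInterval(requestedMs: number): number {
  return Math.max(requestedMs, MIN_POLL_INTERVAL_MS);
}

// A tick may fire only when every stop condition is clear.
function shouldPoll(gate: PollGate): boolean {
  return (
    gate.mounted &&
    gate.visibilityState === "visible" &&
    gate.authState === "authenticated" &&
    !gate.rateLimitCooldownActive
  );
}
```

Evaluating shouldPoll on every tick, rather than once at setup, lets a loop stop as soon as the tab is hidden or a 429 cooldown begins, without waiting for the component to unmount.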

System Integrity scans and Feature Registry operations

How integrity scans work

POST /admin/system/integrity/run executes additive sections:

  • api_contract
  • ui_health
  • schema_taxonomy
  • unregistered_features (UNREGISTERED FEATURES category)

The new unregistered_features section cross-checks runtime/UI/backend usage against the System Feature Registry.

How to register features

Use registerSystemFeature in services/systemFeatureRegistry.js:

registerSystemFeature({
  id: 'tts_generation',
  type: 'workflow',
  routes: ['/admin/tts'],
  endpoints: ['POST /admin/tts/generate'],
  schema_dependencies: ['tts_assets'],
  taxonomy_dependencies: ['languages'],
  critical_actions: ['generate_audio'],
  tests: ['tests/ttsPipeline.test.js'],
});

Minimum expectation for new features:

  • Declare all routes/pages and API endpoints.
  • Declare schema and taxonomy dependencies.
  • Declare critical actions.
  • Attach integrity coverage metadata (tests and/or playwright_scenario).

How to interpret unregistered-feature warnings

Common warnings:

  • unregistered_feature_route: UI route found but not registered.
  • unregistered_feature_endpoint: UI endpoint usage found but not registered.
  • endpoint_without_contract_check: endpoint declared but not part of explicit contract checks.
  • ui_taxonomy_dependency_undeclared / unregistered_taxonomy_dependency: taxonomy usage missing in registry declarations.
  • unregistered_schema_dependency: backend table usage missing in registry declarations.

Critical signal:

  • critical_flow_missing_coverage: feature has critical actions but no integrity coverage metadata.
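
The critical signal above amounts to a simple predicate over a registry entry. The sketch below is an illustration under assumptions, not the scanner's code: SystemFeature and hasMissingCoverage are hypothetical names, and playwright_scenario is assumed to be a string because the source does not state its type.

```typescript
// Subset of the registry fields relevant to coverage.
interface SystemFeature {
  id: string;
  critical_actions?: string[];
  tests?: string[];
  playwright_scenario?: string; // assumed type
}

// True when a feature declares critical actions but neither tests
// nor a playwright_scenario.
function hasMissingCoverage(feature: SystemFeature): boolean {
  const hasCritical = (feature.critical_actions ?? []).length > 0;
  const hasCoverage =
    (feature.tests ?? []).length > 0 || Boolean(feature.playwright_scenario);
  return hasCritical && !hasCoverage;
}

// The tts_generation entry from the registration example passes the check.
const ttsFeature: SystemFeature = {
  id: "tts_generation",
  critical_actions: ["generate_audio"],
  tests: ["tests/ttsPipeline.test.js"],
};
```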

Operational policy: do not treat new feature work as complete while UNREGISTERED FEATURES findings remain unresolved for that feature.