Veit F. 422fa5b3ab feat: implement bot intelligence system with 8 archetypes and coaching
Add personality-driven bots with 8 archetypes (Nit, TAG, LAG, Maniac, Calling Station, Loose Fish, Old Man, Monster TAG) across 5 skill levels.

Includes:

- Three-layer decision pipeline (base strategy → personality filter → skill noise)

- Decision timer system with archetype-specific timeout defaults

- Observation tracking engine (VPIP, PFR, Fold-to-CBet, WTSD, bet sizing, timing tells)

- Player classification engine with weighted scoring and confidence scaling

- Table setup UI with visual seat editor and quick presets

- Info display system with 4 visibility levels

- Teaching coach with post-hand analysis and real-time suggestions

Archives bot-intelligence change and syncs all 8 delta specs to main specs.
2026-05-17 22:41:09 +02:00

## Context
The current bot implementation uses coin-flip decisions with no strategic reasoning, personality, or skill variation. The `PlayerSeat` type has basic fields (chips, status, cards) but no personality or skill attributes. Bot turns execute instantly, with no timing consideration. No observation tracking exists, so the player learns nothing about opponent behavior.
The game uses a functional state pattern: actions return new GameState objects. Bot decisions are invoked during turn processing in `turn.ts`.
## Goals / Non-Goals
**Goals:**
- Replace coin-flip bot decisions with a personality-driven, skill-aware decision engine
- Provide 8 distinct bot archetypes across 5 skill levels with realistic play patterns
- Enable player-configurable table setup with a visual seat editor
- Track opponent statistics and classify player types automatically
- Offer configurable learning assistance (info levels, feedback, coaching)
- Implement a sequential decision timer with timing tells
- Keep all logic client-side with no external dependencies
**Non-Goals:**
- Multiplayer networking (single-player vs bots only)
- Machine learning-based classification (rule-based for v1)
- Bot-to-bot adaptation (bots adapt to human player only)
- Tournament mode (cash game focus)
- Persistent cross-session memory (optional toggle in v2)
## Decisions
### Decision 1: Three-layer bot decision pipeline
**Chosen**: Base Strategy → Personality Filter → Skill Noise
```
GTO-informed baseline → Archetype bias injection → Mistake/noise layer
```
**Rationale**: Separates concerns cleanly. The base strategy is computed once, personality reshapes the probabilities, and skill injects errors. Each layer can be tuned independently, and new archetypes can be added without rewriting core logic.
**Alternatives considered:**
- Lookup table per archetype × skill: Too many combinations (8 × 5 = 40 tables), hard to maintain
- ML model: Overkill for v1, requires training data, opaque decision process defeats teaching purpose
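
The three layers can be sketched as pure functions over an action-probability record. This is a minimal illustration, not the project's implementation: every name here (`BotAction`, `ActionProbs`, `baseStrategy`, `personalityFilter`, `skillNoise`, `decideAction`) is hypothetical, the base strategy is a placeholder driven by a single hand-strength number, and the noise layer is simplified to a blend toward uniform play.

```typescript
// Hypothetical names throughout; only the three-layer ordering comes from the design.
type BotAction = "fold" | "call" | "raise";
type ActionProbs = Record<BotAction, number>;

// Rescale weights so they sum to 1.
function normalizeProbs(p: ActionProbs): ActionProbs {
  const total = p.fold + p.call + p.raise;
  return { fold: p.fold / total, call: p.call / total, raise: p.raise / total };
}

// Layer 1: placeholder base strategy driven by a hand-strength estimate in [0, 1].
function baseStrategy(handStrength: number): ActionProbs {
  return normalizeProbs({ fold: 1 - handStrength, call: 0.5, raise: handStrength });
}

// Layer 2: multiply each action by an archetype bias, then renormalize.
function personalityFilter(p: ActionProbs, bias: ActionProbs): ActionProbs {
  return normalizeProbs({
    fold: p.fold * bias.fold,
    call: p.call * bias.call,
    raise: p.raise * bias.raise,
  });
}

// Layer 3: blend toward a uniform distribution; errorRate 0 means no noise.
function skillNoise(p: ActionProbs, errorRate: number): ActionProbs {
  const uniform = 1 / 3;
  return {
    fold: (1 - errorRate) * p.fold + errorRate * uniform,
    call: (1 - errorRate) * p.call + errorRate * uniform,
    raise: (1 - errorRate) * p.raise + errorRate * uniform,
  };
}

// Turn probabilities into one action using a roll in [0, 1).
function sampleAction(p: ActionProbs, roll: number): BotAction {
  if (roll < p.fold) return "fold";
  if (roll < p.fold + p.call) return "call";
  return "raise";
}

// Compose the layers in the order the design specifies.
function decideAction(
  handStrength: number,
  bias: ActionProbs,
  errorRate: number,
  rng: () => number = Math.random
): BotAction {
  const probs = skillNoise(personalityFilter(baseStrategy(handStrength), bias), errorRate);
  return sampleAction(probs, rng());
}
```

Passing the random source as a parameter keeps the pipeline deterministic under test, and adding an archetype only requires a new bias record.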
### Decision 2: Weighted scoring classification over pure rules
**Chosen**: Each archetype has a scoring function; stats contribute weighted points; highest score wins with confidence based on total.
**Rationale**: Handles edge cases naturally (e.g., VPIP=32% is clearly LAG-adjacent but rigid rules might miss it). Produces natural confidence percentages for UI. Tunable via weight adjustments.
**Alternatives considered:**
- Rule-based if/else: Brittle at boundaries, no confidence metric, confusing when it misclassifies
- Naive Bayes classifier: More accurate but adds complexity, less transparent for teaching purposes
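
A weighted scoring classifier along these lines might look like the sketch below. All numbers are illustrative assumptions rather than values from the spec: the target VPIP/PFR per archetype, the equal weights, the 15-point tolerance, and the 50-hand mark for full confidence. The names `ObservedStats`, `ArchetypeProfile`, and `classifyPlayer` are hypothetical as well, and only four archetypes and two stats are shown.

```typescript
// All targets, weights, and thresholds below are illustrative assumptions.
interface ObservedStats { hands: number; vpip: number; pfr: number } // vpip/pfr in percent
interface ArchetypeProfile { label: string; vpip: number; pfr: number }
interface Classification { label: string; confidence: number }

const PROFILES: ArchetypeProfile[] = [
  { label: "Nit", vpip: 12, pfr: 9 },
  { label: "TAG", vpip: 20, pfr: 16 },
  { label: "LAG", vpip: 30, pfr: 24 },
  { label: "Calling Station", vpip: 45, pfr: 5 },
];
const WEIGHTS = { vpip: 0.5, pfr: 0.5 };
const TOLERANCE = 15;             // percentage points at which a stat stops scoring
const MIN_HANDS = 10;             // below this, report "insufficient data"
const FULL_CONFIDENCE_HANDS = 50; // sample size at which confidence stops scaling

// 1 when the stat hits the target, falling linearly to 0 at TOLERANCE points away.
function closeness(observed: number, target: number): number {
  return Math.max(0, 1 - Math.abs(observed - target) / TOLERANCE);
}

// Weighted sum of per-stat closeness, in [0, 1].
function scoreProfile(s: ObservedStats, p: ArchetypeProfile): number {
  return WEIGHTS.vpip * closeness(s.vpip, p.vpip) + WEIGHTS.pfr * closeness(s.pfr, p.pfr);
}

// Highest score wins; confidence is the score scaled by sample size.
function classifyPlayer(s: ObservedStats): Classification {
  if (s.hands < MIN_HANDS) return { label: "insufficient data", confidence: 0 };
  const sampleFactor = Math.min(1, s.hands / FULL_CONFIDENCE_HANDS);
  return PROFILES.reduce<Classification>(
    (best, p) => {
      const confidence = scoreProfile(s, p) * sampleFactor;
      return confidence > best.confidence ? { label: p.label, confidence } : best;
    },
    { label: "unknown", confidence: 0 }
  );
}
```

With these placeholder targets, a player at VPIP 32% and PFR 25% scores highest as LAG even though neither stat matches a target exactly, which is the boundary behavior the rationale describes.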
### Decision 3: Archetype-specific mistake libraries
**Chosen**: Each archetype has its own set of possible mistakes; skill level controls frequency. A Novice TAG folds too much; a Novice Fish calls too much — same skill, different errors.
**Rationale**: Realistic — bad players fail in ways consistent with their style. More educational for the player to observe pattern-consistent mistakes.
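
One way to express this is a per-archetype table of mistake functions, with the skill level reduced to a mistake frequency. The sketch below is hypothetical: `MISTAKE_LIBRARY`, `applySkillErrors`, the two example mistakes, and the use of a plain `mistakeRate` number in place of named skill levels are all assumptions for illustration.

```typescript
type PlayAction = "fold" | "call" | "raise";

interface Mistake {
  id: string;
  apply: (intended: PlayAction) => PlayAction;
}

// Hypothetical libraries for two archetypes; a real table would cover all eight.
const MISTAKE_LIBRARY: Record<string, Mistake[]> = {
  TAG: [{ id: "overfold", apply: (a) => (a === "call" ? "fold" : a) }],
  "Loose Fish": [{ id: "overcall", apply: (a) => (a === "fold" ? "call" : a) }],
};

// mistakeRate in [0, 1] stands in for skill level: higher means more frequent errors.
function applySkillErrors(
  archetype: string,
  intended: PlayAction,
  mistakeRate: number,
  rng: () => number = Math.random
): PlayAction {
  const library = MISTAKE_LIBRARY[archetype] || [];
  // No library, or the roll says "play correctly this time".
  if (library.length === 0 || rng() >= mistakeRate) return intended;
  // Pick one mistake from this archetype's own library.
  const index = Math.min(library.length - 1, Math.floor(rng() * library.length));
  const mistake = library[index];
  return mistake ? mistake.apply(intended) : intended;
}
```

The same `mistakeRate` produces different errors for different archetypes: here a TAG turns a correct call into a fold, while a Loose Fish turns a correct fold into a call.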
### Decision 4: Sequential timer with configurable duration
**Chosen**: Turns pass sequentially, and each player gets an independent countdown. The duration is configurable (5-30 s). The human player can use the same timer, no timer, or a custom timer. A timeout triggers an archetype-appropriate default action.
**Rationale**: Mimics online poker flow. Configurable duration accommodates different learning paces. Timeout defaults add realism (real players sometimes fold from inaction).
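
The timeout rule can be kept as a pure function of elapsed time, separate from whatever drives the countdown. In this sketch the names (`clampTimer`, `timeoutDefault`, `resolveTurn`), the specific per-archetype defaults, and the "check when checking is free" rule are assumptions; the design only fixes the 5-30 s range and that defaults are archetype-appropriate.

```typescript
type SeatAction = "fold" | "check" | "call" | "raise";

const MIN_TIMER_SEC = 5;
const MAX_TIMER_SEC = 30;

// Keep a configured duration inside the 5-30 s range.
function clampTimer(seconds: number): number {
  return Math.min(MAX_TIMER_SEC, Math.max(MIN_TIMER_SEC, seconds));
}

// Hypothetical defaults; the design only says they are archetype-appropriate.
const TIMEOUT_DEFAULTS: Record<string, SeatAction> = {
  Nit: "fold",
  "Calling Station": "call",
};

function timeoutDefault(archetype: string, canCheck: boolean): SeatAction {
  if (canCheck) return "check"; // assumption: never fold when checking is free
  return TIMEOUT_DEFAULTS[archetype] || "fold";
}

// Returns the action to apply, or null while the seat is still thinking.
// timerSec === null models the "no timer" option for the human player.
function resolveTurn(
  chosen: SeatAction | null,
  elapsedMs: number,
  timerSec: number | null,
  archetype: string,
  canCheck: boolean
): SeatAction | null {
  if (chosen !== null) return chosen;
  if (timerSec === null) return null;
  const expired = elapsedMs >= clampTimer(timerSec) * 1000;
  return expired ? timeoutDefault(archetype, canCheck) : null;
}
```

Because `resolveTurn` takes elapsed time as an argument, it can be called from turn processing without depending on real timers.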
### Decision 5: Bet sizing tracked per street
**Chosen**: Track average bet sizes on pre-flop, flop, turn, and river separately. Flag patterns like "always bets 1/3 pot" or "polarized river betting."
**Rationale**: Different streets reveal different information. Pre-flop sizing shows aggression level, post-flop sizing reveals hand reading ability. Per-street tracking enables richer classification.
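
Per-street tracking can be as small as a running mean of bet size as a fraction of the pot, kept separately for each street. The sketch below assumes hypothetical names (`SizingByStreet`, `recordBet`, `sizeBucket`) and bucket thresholds of 0.4 and 0.8 of the pot; the design does not fix either, and bucketing is one of the open questions.

```typescript
type Street = "preflop" | "flop" | "turn" | "river";

interface SizingStat {
  count: number;           // bets observed on this street
  meanPotFraction: number; // running mean of bet / pot
}
type SizingByStreet = Record<Street, SizingStat>;

function emptySizing(): SizingByStreet {
  const zero = (): SizingStat => ({ count: 0, meanPotFraction: 0 });
  return { preflop: zero(), flop: zero(), turn: zero(), river: zero() };
}

// Returns a new object rather than mutating, matching the functional state pattern.
function recordBet(
  stats: SizingByStreet,
  street: Street,
  bet: number,
  potBeforeBet: number
): SizingByStreet {
  if (potBeforeBet <= 0) return stats; // fraction is undefined for an empty pot
  const prev = stats[street];
  const count = prev.count + 1;
  const fraction = bet / potBeforeBet;
  // Incremental mean update: no need to store every bet.
  const meanPotFraction = prev.meanPotFraction + (fraction - prev.meanPotFraction) / count;
  const next: SizingByStreet = {
    preflop: stats.preflop,
    flop: stats.flop,
    turn: stats.turn,
    river: stats.river,
  };
  next[street] = { count, meanPotFraction };
  return next;
}

// Hypothetical bucket thresholds for pattern flags such as "always bets 1/3 pot".
type SizeBucket = "small" | "medium" | "large";
function sizeBucket(potFraction: number): SizeBucket {
  if (potFraction < 0.4) return "small";
  if (potFraction < 0.8) return "medium";
  return "large";
}
```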
### Decision 6: Timing data as observable tell
**Chosen**: Record decision time for every action. Track fast vs slow distributions per action type (call/fold/raise/check). Skill level controls timing consistency: a Novice's timing is erratic, while an Ultra randomizes deliberately.
**Rationale**: Timing is a real poker skill. Teaches players to notice hesitation patterns. Adds depth without UI complexity (just track timestamps).
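
Since only timestamps are stored, a tell can be derived by comparing mean decision times between action types. This is a rough sketch with hypothetical names (`TimingSample`, `meanDecisionMs`, `timingGapMs`, `hasTimingTell`) and an assumed 1500 ms threshold; a fuller version would compare distributions rather than means.

```typescript
type TimedAction = "call" | "fold" | "raise" | "check";

interface TimingSample {
  action: TimedAction;
  ms: number; // time taken to decide, in milliseconds
}

// Mean decision time for one action type, or null if it was never observed.
function meanDecisionMs(samples: TimingSample[], action: TimedAction): number | null {
  const matching = samples.filter((s) => s.action === action);
  if (matching.length === 0) return null;
  return matching.reduce((sum, s) => sum + s.ms, 0) / matching.length;
}

// Positive when `slow` takes longer on average than `fast`.
function timingGapMs(
  samples: TimingSample[],
  slow: TimedAction,
  fast: TimedAction
): number | null {
  const slowMean = meanDecisionMs(samples, slow);
  const fastMean = meanDecisionMs(samples, fast);
  if (slowMean === null || fastMean === null) return null;
  return slowMean - fastMean;
}

// Hypothetical threshold: flag a tell when the means differ by at least thresholdMs.
function hasTimingTell(
  samples: TimingSample[],
  a: TimedAction,
  b: TimedAction,
  thresholdMs: number = 1500
): boolean {
  const gap = timingGapMs(samples, a, b);
  return gap !== null && Math.abs(gap) >= thresholdMs;
}
```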
## Risks / Trade-offs
- [Complex decision engine] → Start with simplified base strategy (position-based hand ranking) rather than full GTO solver. Can be upgraded incrementally.
- [Performance with many tracked stats] → Observation tracking is per-hand, not per-decision. Stats update once per completed hand, keeping computation minimal.
- [Classification accuracy in early hands] → System shows "insufficient data" until minimum sample size reached (~10 hands). Confidence starts low and increases. Player learns patience.
- [UI complexity from many settings] → Table setup uses progressive disclosure: basic presets first, advanced options expandable. Info/feedback settings use simple dropdowns.
- [Mistake injection feeling unrealistic] → Mistakes are probabilistic, not deterministic. Same bot plays differently each game. Testing with human review to calibrate feel.
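
The per-hand update mentioned in the performance mitigation can be illustrated with two of the tracked stats. VPIP is the share of hands in which a player voluntarily put chips into the pot pre-flop, and PFR is the share in which they raised pre-flop. The counter shape and the names `HandSummary`, `OpponentCounters`, and `applyHand` are hypothetical.

```typescript
// One record per completed hand for a given opponent.
interface HandSummary {
  voluntarilyPutInPot: boolean; // called or raised pre-flop (posting blinds does not count)
  raisedPreflop: boolean;
}

interface OpponentCounters {
  hands: number;
  vpipHands: number;
  pfrHands: number;
}

const EMPTY_COUNTERS: OpponentCounters = { hands: 0, vpipHands: 0, pfrHands: 0 };

// Called once per completed hand; returns a new counters object.
function applyHand(c: OpponentCounters, h: HandSummary): OpponentCounters {
  return {
    hands: c.hands + 1,
    vpipHands: c.vpipHands + (h.voluntarilyPutInPot ? 1 : 0),
    pfrHands: c.pfrHands + (h.raisedPreflop ? 1 : 0),
  };
}

// Percentage helper that avoids dividing by zero before any hands are seen.
function toPercent(part: number, whole: number): number {
  return whole === 0 ? 0 : (part / whole) * 100;
}

function vpipPercent(c: OpponentCounters): number {
  return toPercent(c.vpipHands, c.hands);
}

function pfrPercent(c: OpponentCounters): number {
  return toPercent(c.pfrHands, c.hands);
}
```

Each update is three additions, so the cost per hand stays constant regardless of how many hands have been observed.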
## Migration Plan
No migration is needed because this is new functionality added to an early-stage project (v0.0.1). The existing bot logic will be replaced entirely. The `PlayerSeat` type will be extended with personality/skill fields, but existing fields remain unchanged for backward compatibility.
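
A backward-compatible extension could add the new fields as optional properties, so seat objects created before this change still type-check. The field types shown for `PlayerSeat` are guesses at the existing shape, and `BotPlayerSeat`, `SkillLevel`, and `assignPersonality` are hypothetical names; only the archetype list comes from the spec.

```typescript
// Assumed shape of the existing type; the real field types may differ.
interface PlayerSeat {
  chips: number;
  status: string;
  cards: string[];
}

type Archetype =
  | "Nit" | "TAG" | "LAG" | "Maniac"
  | "Calling Station" | "Loose Fish" | "Old Man" | "Monster TAG";

type SkillLevel = 1 | 2 | 3 | 4 | 5; // hypothetical encoding of the five skill levels

// New fields are optional, so existing seat objects remain valid.
interface BotPlayerSeat extends PlayerSeat {
  archetype?: Archetype;
  skillLevel?: SkillLevel;
}

// Returns a new seat instead of mutating, in line with the functional state pattern.
function assignPersonality(
  seat: PlayerSeat,
  archetype: Archetype,
  skillLevel: SkillLevel
): BotPlayerSeat {
  return { ...seat, archetype, skillLevel };
}
```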
## Open Questions
- Should the base strategy use a simplified GTO approximation or position-based hand ranking charts? Hand charts are simpler to implement and debug.
- How granular should bet sizing tracking be? (e.g., track exact percentages vs bucket into "small/medium/large")
- Should timeout actions feel like mistakes (count against skill) or neutral decisions?