Add personality-driven bots with 8 archetypes (Nit, TAG, LAG, Maniac, Calling Station, Loose Fish, Old Man, Monster TAG) across 5 skill levels. Includes:

- Three-layer decision pipeline (base strategy → personality filter → skill noise)
- Decision timer system with archetype-specific timeout defaults
- Observation tracking engine (VPIP, PFR, Fold-to-CBet, WTSD, bet sizing, timing tells)
- Player classification engine with weighted scoring and confidence scaling
- Table setup UI with visual seat editor and quick presets
- Info display system with 4 visibility levels
- Teaching coach with post-hand analysis and real-time suggestions

Archives bot-intelligence change and syncs all 8 delta specs to main specs.
## Context

The current bot implementation makes coin-flip decisions with no strategic reasoning, personality, or skill variation. The `PlayerSeat` type has basic fields (chips, status, cards) but no personality or skill attributes. Bot turns execute instantly, with no timing consideration. No observation tracking exists, so the player learns nothing about opponent behavior.

The game uses a functional state pattern: actions return new `GameState` objects. Bot decisions are invoked during turn processing in `turn.ts`.

## Goals / Non-Goals

**Goals:**
- Replace coin-flip bot decisions with a personality-driven, skill-aware decision engine
- Provide 8 distinct bot archetypes across 5 skill levels with realistic play patterns
- Enable player-configurable table setup with a visual seat editor
- Track opponent statistics and classify player types automatically
- Offer configurable learning assistance (info levels, feedback, coaching)
- Implement a sequential decision timer with timing tells
- Keep all logic client-side with no external dependencies

**Non-Goals:**
- Multiplayer networking (single-player vs bots only)
- Machine learning-based classification (rule-based for v1)
- Bot-to-bot adaptation (bots adapt to the human player only)
- Tournament mode (cash game focus)
- Persistent cross-session memory (optional toggle in v2)

## Decisions

### Decision 1: Three-layer bot decision pipeline

**Chosen**: Base Strategy → Personality Filter → Skill Noise

```
GTO-informed baseline → Archetype bias injection → Mistake/noise layer
```

**Rationale**: Separates concerns cleanly. The base strategy is computed once, the personality layer reshapes its probabilities, and the skill layer injects errors. Each layer can be tuned independently, and new archetypes can be added without rewriting core logic.

**Alternatives considered:**
- Lookup table per archetype × skill: Too many combinations (8 × 5 = 40 tables), hard to maintain
- ML model: Overkill for v1, requires training data, and its opaque decision process defeats the teaching purpose
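
The three layers can be sketched as pure functions over an action probability distribution. This is a minimal illustration, not the project's implementation: every name here (`baseStrategy`, `personalityFilter`, `skillNoise`, `ARCHETYPE_BIAS`), the bias multipliers, the noise ceiling, and the numbering of skill levels from 1 (Novice) to 5 (Ultra) are assumptions made for the example.

```typescript
// Hypothetical sketch: all names and numbers below are illustrative assumptions.
type Action = "fold" | "call" | "raise";
type ActionProbs = Record<Action, number>;
type SkillLevel = 1 | 2 | 3 | 4 | 5; // assumed: 1 = Novice, 5 = Ultra

// Rescale the three weights so they sum to 1.
function normalize(p: ActionProbs): ActionProbs {
  const total = p.fold + p.call + p.raise;
  return { fold: p.fold / total, call: p.call / total, raise: p.raise / total };
}

// Layer 1: baseline from a hand strength in [0, 1]
// (a stand-in for a position-based hand ranking chart).
function baseStrategy(handStrength: number): ActionProbs {
  return normalize({ fold: 1 - handStrength, call: 0.5, raise: handStrength });
}

// Layer 2: archetype bias expressed as per-action multipliers (made-up values).
const ARCHETYPE_BIAS: Record<string, ActionProbs> = {
  Nit: { fold: 1.6, call: 0.8, raise: 0.6 },
  Maniac: { fold: 0.4, call: 0.8, raise: 2.0 },
};

function personalityFilter(probs: ActionProbs, archetype: string): ActionProbs {
  const bias = ARCHETYPE_BIAS[archetype] ?? { fold: 1, call: 1, raise: 1 };
  return normalize({
    fold: probs.fold * bias.fold,
    call: probs.call * bias.call,
    raise: probs.raise * bias.raise,
  });
}

// Layer 3: blend toward a uniform distribution; lower skill means more noise.
function skillNoise(probs: ActionProbs, skill: SkillLevel): ActionProbs {
  const noise = ((5 - skill) / 4) * 0.5; // assumed ceiling: 50% blend at skill 1
  const mix = (x: number): number => x * (1 - noise) + noise / 3;
  return { fold: mix(probs.fold), call: mix(probs.call), raise: mix(probs.raise) };
}

// Draw one action from the final distribution using an injectable RNG.
function sampleAction(probs: ActionProbs, rng: () => number): Action {
  const r = rng();
  if (r < probs.fold) return "fold";
  if (r < probs.fold + probs.call) return "call";
  return "raise";
}

function decide(
  handStrength: number,
  archetype: string,
  skill: SkillLevel,
  rng: () => number = Math.random,
): Action {
  const shaped = personalityFilter(baseStrategy(handStrength), archetype);
  return sampleAction(skillNoise(shaped, skill), rng);
}
```

Passing the random source as a parameter keeps each layer deterministic under test, which fits the functional state pattern described in the Context section.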

### Decision 2: Weighted scoring classification over pure rules

**Chosen**: Each archetype has a scoring function. Observed stats contribute weighted points, and the highest-scoring archetype wins, with a confidence value based on the score total.

**Rationale**: Handles edge cases naturally (e.g., VPIP=32% is clearly LAG-adjacent, but rigid rules might miss it). Produces natural confidence percentages for the UI. Tunable via weight adjustments.

**Alternatives considered:**
- Rule-based if/else: Brittle at boundaries, no confidence metric, and confusing when it misclassifies
- Naive Bayes classifier: More accurate but adds complexity, and is less transparent for teaching purposes
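
A weighted scoring classifier along these lines might look as follows. This is a sketch under stated assumptions: the archetype target values in `PROFILES`, the weights, the linear `closeness` function, and the confidence formula (the winner's share of the score total, scaled by sample size) are all invented for illustration. Only the roughly 10-hand minimum comes from this document.

```typescript
// Hypothetical sketch: profiles, weights, and the confidence formula are assumptions.
interface ObservedStats {
  vpip: number;  // percent of hands voluntarily played, 0-100
  pfr: number;   // percent of hands raised pre-flop, 0-100
  hands: number; // number of completed hands observed
}

interface Profile {
  archetype: string;
  vpip: number;
  pfr: number;
}

// Made-up target values for four of the eight archetypes.
const PROFILES: Profile[] = [
  { archetype: "Nit", vpip: 12, pfr: 9 },
  { archetype: "TAG", vpip: 22, pfr: 18 },
  { archetype: "LAG", vpip: 30, pfr: 24 },
  { archetype: "Calling Station", vpip: 45, pfr: 6 },
];

const WEIGHTS = { vpip: 0.5, pfr: 0.5 };
const MIN_HANDS = 10;              // below this, report "insufficient data"
const FULL_CONFIDENCE_HANDS = 100; // assumed sample size for unscaled confidence

// 1 when the stat matches the target, falling linearly to 0 at 30 points away.
function closeness(observed: number, target: number): number {
  return Math.max(0, 1 - Math.abs(observed - target) / 30);
}

function classify(stats: ObservedStats): { archetype: string; confidence: number } | null {
  if (stats.hands < MIN_HANDS) return null; // insufficient data

  const scored = PROFILES.map((p) => ({
    archetype: p.archetype,
    score:
      WEIGHTS.vpip * closeness(stats.vpip, p.vpip) +
      WEIGHTS.pfr * closeness(stats.pfr, p.pfr),
  }));

  const total = scored.reduce((sum, s) => sum + s.score, 0);
  if (total === 0) return null; // no profile is even close

  const best = scored.reduce((a, b) => (b.score > a.score ? b : a));
  const sampleFactor = Math.min(1, stats.hands / FULL_CONFIDENCE_HANDS);
  return { archetype: best.archetype, confidence: (best.score / total) * sampleFactor };
}
```

With these illustrative numbers, a player at VPIP=32% and PFR=25% scores highest against the LAG profile even though no exact threshold is crossed, which is the boundary behavior the rationale describes.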

### Decision 3: Archetype-specific mistake libraries

**Chosen**: Each archetype has its own set of possible mistakes, and skill level controls how often they occur. A Novice TAG folds too much while a Novice Fish calls too much: same skill, different errors.

**Rationale**: Realistic, because bad players fail in ways consistent with their style. Observing pattern-consistent mistakes is also more educational for the player.
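
One way to structure such a library is a per-archetype list of mistake rules plus a per-skill frequency table. The sketch below is hypothetical: the `Mistake` shape, the two example mistakes, and the `MISTAKE_RATE` values are assumptions, and skill levels are assumed to run from 1 (Novice) to 5 (Ultra).

```typescript
// Hypothetical sketch: rule shapes and frequencies are illustrative assumptions.
type Action = "fold" | "call" | "raise";

interface Mistake {
  name: string;
  applies: (intended: Action) => boolean; // can this mistake replace the intended action?
  replacement: Action;
}

// Each archetype errs in a direction consistent with its style.
const MISTAKES: Record<string, Mistake[]> = {
  TAG: [{ name: "overfold", applies: (a) => a === "call", replacement: "fold" }],
  "Loose Fish": [{ name: "overcall", applies: (a) => a === "fold", replacement: "call" }],
};

// Assumed mistake frequency per skill level; index 0 is skill 1 (Novice).
const MISTAKE_RATE = [0.3, 0.2, 0.12, 0.05, 0.01];

function applyMistake(
  intended: Action,
  archetype: string,
  skill: number,
  rng: () => number = Math.random,
): Action {
  const rate = MISTAKE_RATE[skill - 1] ?? 0;
  if (rng() >= rate) return intended; // no mistake on this decision

  const candidates = (MISTAKES[archetype] ?? []).filter((m) => m.applies(intended));
  const pick = candidates[Math.floor(rng() * candidates.length)];
  return pick ? pick.replacement : intended; // keep the action if nothing applies
}
```

Because the frequency table is shared and the rules are per archetype, adding an archetype only requires a new entry in `MISTAKES`.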

### Decision 4: Sequential timer with configurable duration

**Chosen**: The turn passes sequentially, and each player gets an independent countdown. The duration is configurable (5 to 30 seconds). The human player can use the same timer, no timer, or a custom timer. A timeout triggers an archetype-appropriate default action.

**Rationale**: Mimics online poker flow. A configurable duration accommodates different learning paces. Timeout defaults add realism (real players sometimes fold from inaction).
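
The timeout rule can be expressed as a small pure function. In this sketch the names (`clampDuration`, `timeoutAction`, `resolveTurn`), the specific defaults per archetype, and the rule that a timed-out player checks when checking is free are assumptions. The document only specifies the 5 to 30 second range and that defaults depend on the archetype.

```typescript
// Hypothetical sketch: default actions per archetype are illustrative assumptions.
type Action = "check" | "fold" | "call";

// Keep a configured duration inside the 5-30 second range.
function clampDuration(seconds: number): number {
  return Math.min(30, Math.max(5, seconds));
}

// Assumed defaults: a Calling Station calls on timeout, everyone else folds.
const TIMEOUT_DEFAULT: Record<string, Action> = {
  "Calling Station": "call",
  Nit: "fold",
};

function timeoutAction(archetype: string, canCheck: boolean): Action {
  if (canCheck) return "check"; // assumed: never fold when checking is free
  return TIMEOUT_DEFAULT[archetype] ?? "fold";
}

// Resolve one turn. `decision` is null if the player never acted.
// A "no timer" setting can be modeled by passing Infinity as the limit.
function resolveTurn(
  decision: { action: Action; atSeconds: number } | null,
  limitSeconds: number,
  archetype: string,
  canCheck: boolean,
): Action {
  if (decision !== null && decision.atSeconds < limitSeconds) {
    return decision.action; // acted in time
  }
  return timeoutAction(archetype, canCheck);
}
```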

### Decision 5: Bet sizing tracked per street

**Chosen**: Track average bet sizes on pre-flop, flop, turn, and river separately. Flag patterns like "always bets 1/3 pot" or "polarized river betting."

**Rationale**: Different streets reveal different information. Pre-flop sizing shows aggression level; post-flop sizing reveals hand-reading ability. Per-street tracking enables richer classification.
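
Per-street tracking could be kept as an immutable record updated with a running mean. This is a sketch only: the `SizingStats` fields, the choice to store bets as a fraction of the pot, and the `fixedSizeTell` thresholds are assumptions, and the open question about exact percentages versus buckets is not settled here.

```typescript
// Hypothetical sketch: field names and thresholds are illustrative assumptions.
type Street = "preflop" | "flop" | "turn" | "river";

interface SizingStats {
  count: number;
  mean: number; // mean bet size as a fraction of the pot
  min: number;
  max: number;
}

type SizingByStreet = Record<Street, SizingStats>;

const emptyStats = (): SizingStats => ({ count: 0, mean: 0, min: Infinity, max: -Infinity });

function emptySizing(): SizingByStreet {
  return { preflop: emptyStats(), flop: emptyStats(), turn: emptyStats(), river: emptyStats() };
}

// Returns a new record rather than mutating the input.
function recordBet(stats: SizingByStreet, street: Street, bet: number, pot: number): SizingByStreet {
  const prev = stats[street];
  const fraction = bet / pot;
  const count = prev.count + 1;
  const updated: SizingStats = {
    count,
    mean: prev.mean + (fraction - prev.mean) / count, // incremental running mean
    min: Math.min(prev.min, fraction),
    max: Math.max(prev.max, fraction),
  };
  return { ...stats, [street]: updated };
}

// Flags an "always bets X of the pot" pattern: enough samples and a narrow range.
// Returns the typical pot fraction, or null when no fixed-size pattern is visible.
function fixedSizeTell(s: SizingStats, minSamples = 5, tolerance = 0.05): number | null {
  if (s.count < minSamples) return null;
  return s.max - s.min <= tolerance ? s.mean : null;
}
```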

### Decision 6: Timing data as observable tell

**Chosen**: Record the decision time for every action. Track fast vs slow distributions per action type (call/fold/raise/check). Skill level controls timing consistency: a Novice's timing is random, while an Ultra bot randomizes its timing deliberately.

**Rationale**: Reading timing is a real poker skill. Tracking it teaches players to notice hesitation patterns and adds depth without UI complexity (only timestamps need to be tracked).
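
A fast-versus-slow summary per action type needs only the recorded durations. In this sketch the `TimingSample` shape and the 2-second `FAST_MS` cutoff are assumptions chosen for illustration.

```typescript
// Hypothetical sketch: the sample shape and the "fast" cutoff are assumptions.
type Action = "check" | "call" | "fold" | "raise";

interface TimingSample {
  action: Action;
  ms: number; // time taken to act, in milliseconds
}

const FAST_MS = 2000; // assumed cutoff between a fast and a slow decision

// For each action type seen, return the share of decisions made quickly.
function fastRatioByAction(samples: TimingSample[]): Partial<Record<Action, number>> {
  const totals: Partial<Record<Action, { fast: number; n: number }>> = {};
  for (const s of samples) {
    const t = totals[s.action] ?? { fast: 0, n: 0 };
    totals[s.action] = { fast: t.fast + (s.ms < FAST_MS ? 1 : 0), n: t.n + 1 };
  }

  const ratios: Partial<Record<Action, number>> = {};
  for (const key of Object.keys(totals) as Action[]) {
    const t = totals[key];
    if (t) ratios[key] = t.fast / t.n;
  }
  return ratios;
}
```

A player who always raises quickly but splits evenly between fast and slow calls would show a raise ratio of 1 and a call ratio of 0.5, which is the kind of contrast a timing tell relies on.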

## Risks / Trade-offs

[Complex decision engine] → Start with a simplified base strategy (position-based hand ranking) rather than a full GTO solver. It can be upgraded incrementally.

[Performance with many tracked stats] → Observation tracking is per-hand, not per-decision. Stats update once per completed hand, keeping computation minimal.

[Classification accuracy in early hands] → The system shows "insufficient data" until a minimum sample size is reached (~10 hands). Confidence starts low and increases as more hands are observed. The player learns patience.

[UI complexity from many settings] → Table setup uses progressive disclosure: basic presets first, with advanced options expandable. Info/feedback settings use simple dropdowns.

[Mistake injection feeling unrealistic] → Mistakes are probabilistic, not deterministic, so the same bot plays differently each game. Human review during testing will calibrate the feel.

## Migration Plan

No migration is needed: this is new functionality added to an early-stage project (v0.0.1). Existing bot logic will be replaced entirely. The `PlayerSeat` type will be extended with personality and skill fields, while existing fields remain unchanged for backward compatibility.
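
One backward-compatible way to extend the type is to add the new fields as optional, so existing code that builds a seat keeps compiling. The sketch below is hypothetical: the field types on `PlayerSeat` are stand-ins (only the field names chips, status, and cards come from this document), and `archetype`, `skill`, and `withPersonality` are assumed names.

```typescript
// Hypothetical sketch: the real PlayerSeat field types may differ.
type Archetype =
  | "Nit" | "TAG" | "LAG" | "Maniac"
  | "Calling Station" | "Loose Fish" | "Old Man" | "Monster TAG";

type SkillLevel = 1 | 2 | 3 | 4 | 5; // assumed: 1 = Novice, 5 = Ultra

interface PlayerSeat {
  // Existing fields, unchanged (types assumed for this example).
  chips: number;
  status: string;
  cards: string[];
  // New optional fields: seats created by older code remain valid.
  archetype?: Archetype;
  skill?: SkillLevel;
}

// Returns a new seat with personality attached, leaving the input untouched.
function withPersonality(seat: PlayerSeat, archetype: Archetype, skill: SkillLevel): PlayerSeat {
  return { ...seat, archetype, skill };
}
```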

## Open Questions

- Should the base strategy use a simplified GTO approximation or position-based hand ranking charts? Hand charts are simpler to implement and debug.
- How granular should bet sizing tracking be? For example, should it track exact percentages or bucket bets into "small/medium/large"?
- Should timeout actions feel like mistakes (and count against skill) or like neutral decisions?