PokeR/spec.md at 422fa5b3ab5d662d41dc46ff909225e1552eb03b

Veit F. 422fa5b3ab feat: implement bot intelligence system with 8 archetypes and coaching

Add personality-driven bots with 8 archetypes (Nit, TAG, LAG, Maniac, Calling Station, Loose Fish, Old Man, Monster TAG) across 5 skill levels.

Includes:

- Three-layer decision pipeline (base strategy → personality filter → skill noise)

- Decision timer system with archetype-specific timeout defaults

- Observation tracking engine (VPIP, PFR, Fold-to-CBet, WTSD, bet sizing, timing tells)

- Player classification engine with weighted scoring and confidence scaling

- Table setup UI with visual seat editor and quick presets

- Info display system with 4 visibility levels

- Teaching coach with post-hand analysis and real-time suggestions

Archives bot-intelligence change and syncs all 8 delta specs to main specs.

2026-05-17 22:41:09 +02:00



ADDED Requirements

Requirement: Weighted scoring classification algorithm

The system SHALL classify each observed opponent using a weighted scoring algorithm. Each archetype SHALL have a scoring function that evaluates the opponent's tracked statistics (VPIP, PFR, Fold-to-CBet, WTSD) and assigns weighted points. The archetype with the highest score SHALL be the inferred type. The system SHALL compute a confidence percentage based on the score magnitude and sample size.

Scenario: LAG is correctly classified

WHEN an opponent has VPIP=38%, PFR=25%, Fold-to-CBet=45%, WTSD=60% after 40 hands
THEN the LAG scoring function produces the highest score and the bot is classified as LAG

Scenario: Calling Station is correctly classified

WHEN an opponent has VPIP=42%, PFR=8%, Fold-to-CBet=15%, WTSD=75% after 30 hands
THEN the Calling Station scoring function produces the highest score and the bot is classified as Calling Station
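
A minimal sketch of how such a weighted scoring pass could look. The spec does not define the scoring functions, so the archetype profiles, stat weights, and tolerance below are assumed values chosen only so that the two scenarios above come out as described; only four of the eight archetypes are shown.

```python
# Illustrative only: profiles, weights, and tolerance are assumptions,
# not values taken from the spec.
STAT_WEIGHTS = {"vpip": 0.3, "pfr": 0.3, "fold_to_cbet": 0.2, "wtsd": 0.2}
TOLERANCE = 25.0  # distance (percentage points) at which a stat earns no points

ARCHETYPE_PROFILES = {
    "Nit": {"vpip": 12, "pfr": 9, "fold_to_cbet": 70, "wtsd": 40},
    "TAG": {"vpip": 22, "pfr": 18, "fold_to_cbet": 55, "wtsd": 48},
    "LAG": {"vpip": 35, "pfr": 26, "fold_to_cbet": 45, "wtsd": 55},
    "Calling Station": {"vpip": 45, "pfr": 8, "fold_to_cbet": 20, "wtsd": 70},
}


def archetype_score(stats, profile):
    """Sum weighted points; a stat earns more the closer it is to the profile."""
    total = 0.0
    for stat, weight in STAT_WEIGHTS.items():
        distance = abs(stats[stat] - profile[stat])
        total += weight * max(0.0, 1.0 - distance / TOLERANCE)
    return total


def classify(stats):
    """Return the highest-scoring archetype and the full score table."""
    scores = {
        name: archetype_score(stats, profile)
        for name, profile in ARCHETYPE_PROFILES.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


lag_like = {"vpip": 38, "pfr": 25, "fold_to_cbet": 45, "wtsd": 60}
station_like = {"vpip": 42, "pfr": 8, "fold_to_cbet": 15, "wtsd": 75}
print(classify(lag_like)[0])      # LAG
print(classify(station_like)[0])  # Calling Station
```

Returning the full score table alongside the winner keeps the gap between the best and second-best score available, which is one plausible input for the confidence percentage.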

Requirement: Confidence scales with sample size

The system SHALL require a minimum number of observed hands before producing a classification. With fewer than 10 hands, the system SHALL report "Insufficient data" and assign no type. From 10 to 19 hands, confidence SHALL be capped at 60%. From 20 to 39 hands, confidence SHALL be capped at 80%. At 40 or more hands, the cap is lifted and confidence SHALL be calculated from score magnitude alone.

Scenario: Insufficient data shown early

WHEN only 5 hands have been played against an opponent
THEN the classification displays "Insufficient data" with no type assigned

Scenario: Confidence increases with more hands

WHEN an opponent has been observed for 50 hands and scores strongly for TAG
THEN confidence may reach 85-90% for a consistent Medium-skill bot
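
The sample-size rule can be sketched as a simple cap applied to a raw confidence value. This assumes the bands are read as 10 to 19 hands, 20 to 39 hands, and 40 or more; the function names and the 0.0 to 1.0 confidence scale are hypothetical.

```python
def confidence_cap(hands_observed):
    """Return the maximum confidence allowed for a sample size, or None
    when there are too few hands to classify at all."""
    if hands_observed < 10:
        return None   # "Insufficient data"
    if hands_observed < 20:
        return 0.60
    if hands_observed < 40:
        return 0.80
    return 1.00       # no cap: confidence comes from score magnitude alone


def capped_confidence(raw_confidence, hands_observed):
    """Apply the sample-size cap to a raw confidence in the range 0.0 to 1.0."""
    cap = confidence_cap(hands_observed)
    if cap is None:
        return None
    return min(raw_confidence, cap)


print(capped_confidence(0.90, 5))   # None
print(capped_confidence(0.90, 12))  # 0.6
print(capped_confidence(0.90, 50))  # 0.9
```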

Requirement: Classification difficulty scales with bot skill

The system SHALL adjust classification reliability based on the bot's skill level. Novice and Beginner bots SHALL produce clear, easily classifiable patterns (high confidence achieved quickly). Hard and Ultra bots SHALL produce ambiguous or mixed statistics that reduce achievable confidence and may require 70+ hands for reliable classification.

Scenario: Novice bot is quickly classified

WHEN a Novice LAG has played 12 hands with extreme loose-aggressive patterns
THEN the system classifies them as LAG with 60% confidence (capped by sample size)

Scenario: Ultra bot resists easy classification

WHEN an Ultra TAG has played 40 hands with balanced, position-dependent play
THEN the system may show lower confidence (~65%) due to overlapping stat ranges
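
One way to picture why higher-skill bots are harder to classify is to model their expected statistics as a blend between the archetype's profile and a neutral table baseline. This is only a sketch under stated assumptions: the per-skill ambiguity factors, the baseline, and the LAG profile are invented for illustration, and the spec does not say the bots are implemented this way.

```python
# Assumed values: neither the ambiguity factors nor the baseline come from the spec.
SKILL_AMBIGUITY = {
    "Novice": 0.0, "Beginner": 0.1, "Medium": 0.25, "Hard": 0.45, "Ultra": 0.6,
}
TABLE_BASELINE = {"vpip": 25.0, "pfr": 17.0, "fold_to_cbet": 50.0, "wtsd": 50.0}


def expected_stats(profile, skill):
    """Pull an archetype profile toward the baseline as skill increases."""
    a = SKILL_AMBIGUITY[skill]
    return {s: (1.0 - a) * profile[s] + a * TABLE_BASELINE[s] for s in profile}


def separation(stats):
    """Total distance from the baseline; smaller means harder to tell apart."""
    return sum(abs(stats[s] - TABLE_BASELINE[s]) for s in stats)


lag_profile = {"vpip": 35, "pfr": 26, "fold_to_cbet": 45, "wtsd": 55}
for skill in ("Novice", "Hard", "Ultra"):
    print(skill, round(separation(expected_stats(lag_profile, skill)), 1))
```

Under these assumptions the Ultra bot's statistics sit much closer to the baseline than the Novice bot's, so more hands are needed before any archetype's score stands out.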

Requirement: Secondary pattern detection

The system SHALL detect secondary behavioral patterns beyond core statistics, including bet sizing tendencies (small/standard/polarized), timing tells (fast calls vs slow folds), and bluff frequency estimation. These patterns SHALL supplement the primary archetype classification and be available for teaching hints.

Scenario: Bet sizing pattern is detected

WHEN a bot consistently bets 30% of pot on all post-flop streets over 25 hands
THEN the system flags "characteristically small bet sizes" as a secondary pattern

Scenario: Timing tell is identified

WHEN a bot's calls average 1.5s and folds average 8.5s over 30 decisions
THEN the system flags "fast on comfort, slow on discomfort" as a timing tell
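
The two scenarios above can be sketched as small detectors over recorded bet sizes and decision times. The thresholds (what counts as a small or large bet, the minimum sample, the fold-to-call time ratio) and the function names are assumptions made for illustration; the spec does not fix them.

```python
from statistics import mean


def bet_sizing_pattern(pot_fractions, min_samples=20):
    """Label bet sizing as 'small', 'polarized', or 'standard'.

    pot_fractions holds each post-flop bet as a fraction of the pot (0.30 = 30%).
    Returns None when there are too few samples to judge.
    """
    n = len(pot_fractions)
    if n < min_samples:
        return None
    small = sum(1 for f in pot_fractions if f <= 0.40)
    large = sum(1 for f in pot_fractions if f >= 0.90)
    if small / n >= 0.8:
        return "small"
    if small and large and (small + large) / n >= 0.8:
        return "polarized"  # mostly small or large bets, few in between
    return "standard"


def timing_tell(call_secs, fold_secs, min_decisions=20, ratio=3.0):
    """Flag a tell when folds take much longer on average than calls."""
    if len(call_secs) + len(fold_secs) < min_decisions:
        return None
    if not call_secs or not fold_secs:
        return None
    if mean(fold_secs) >= ratio * mean(call_secs):
        return "fast on comfort, slow on discomfort"
    return None


print(bet_sizing_pattern([0.30] * 25))         # small
print(timing_tell([1.5] * 15, [8.5] * 15))     # fast on comfort, slow on discomfort
```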
