Beating Blackjack with Deep Reinforcement Learning

Ever wondered if a machine can beat the house at Blackjack? This project uses Deep Reinforcement Learning (DRL) to teach an agent how to play (and win) one of the most iconic casino games of all time. Using techniques like Proximal Policy Optimization (PPO), it explores what happens when you mix machine learning with a deck of cards and a bit of Vegas flair.

What You’ll Find

Blackjack isn’t just about luck—it’s about smart choices under uncertainty. That makes it a perfect playground for DRL. Here’s what this project brings to the table:

✨ PPO in action: Watch an agent learn to make optimal moves in a stochastic game environment.
🤖 Smarter than the average player: The agent doesn’t just rely on what the average human player sees; it learns how to count cards, in its own unique way.
⚡ Lightning-fast training: The agent is trained using JAXAgents and JAX for smooth tensor ops, because speed matters when you’re simulating millions of hands.
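
For readers new to PPO, its core idea is a clipped surrogate objective that keeps each policy update close to the policy that collected the data. The sketch below is plain Python for readability and is not this project's JAX training code; the function name `ppo_clip_loss`, its arguments, and the default `clip_eps=0.2` are illustrative assumptions.

```python
import math


def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Clipped surrogate loss (to be minimized), averaged over a batch.

    new_logps:  log-probabilities of the taken actions under the current policy
    old_logps:  log-probabilities of the same actions under the data-collecting policy
    advantages: advantage estimates for those actions
    All names here are hypothetical; they are not taken from the project code.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    # PPO maximizes the surrogate, so the loss is its negative mean.
    return -total / len(advantages)
```

When an action with positive advantage becomes much more likely, the clip caps the reward for that change, so a single batch of hands cannot push the policy too far in one step.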

Features

Deep Reinforcement Learning: Harnesses the power of PPO to train a smart Blackjack player.
Creative Card Counting: The agent discovers its own internal card-counting strategy.
Performance Insights: Rigorous evaluation (including hypothesis testing) suggests the agent might actually have what it takes to beat the game.
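
As a rough idea of what hypothesis testing can look like for a Blackjack agent, here is a minimal sketch of a one-sided z-test on the mean return per hand. Which test the project actually uses is not stated here, so the choice of test, the function name `one_sided_z_test`, and the null hypothesis (mean return at most zero) are all assumptions.

```python
import math
import statistics


def one_sided_z_test(returns, null_mean=0.0):
    """Test H0: mean return <= null_mean against H1: mean return > null_mean.

    returns: per-hand payoffs (e.g. +1 win, -1 loss, 0 push), at least two values
             that are not all identical.
    Returns (sample_mean, z_statistic, p_value). Hypothetical helper, not project code.
    """
    n = len(returns)
    mean = statistics.fmean(returns)
    std = statistics.stdev(returns)  # sample standard deviation
    # Standardized distance of the sample mean from the null value.
    z = (mean - null_mean) / (std / math.sqrt(n))
    # Upper-tail probability of a standard normal: 1 - Phi(z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return mean, z, p_value
```

A small p-value would count as evidence that the agent's average return per hand is above zero. With the millions of simulated hands mentioned above, the normal approximation behind the z-test is a reasonable choice.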

[Figure: blackjack_hypothesis]