DEV Community

# reinforcementlearning

Posts

Why Most Game NPCs Feel Dead (And How Emotion and Memory Fix It)

1
Comments
4 min read
[Meta-RL] We told an AI agent 'you can fail 3 times.' Accuracy went up 19%.

4
Comments
4 min read
Challenging Dogma: Simple Fine-Tuning Enables Continual Learning in VLA Models

Comments
2 min read
Reinforcement Learning for Robotics: A Comprehensive 2025 Guide

1
Comments
52 min read
How I Built a Readable AlphaZero From Scratch — A Deep Dive Into the Code

1
Comments
10 min read
From Pixels to Physicality ☃️: Engineering Olaf with Reinforcement ✨ Learning, Control Systems, and Illusion Design 🤖

2
Comments
8 min read
I Built an AI Arena and Trained AlphaZero to Play Gomoku: Here’s How

1
Comments
4 min read
Fixing an Off-By-One Bug in PufferLib's PPO Implementation

Comments
2 min read
Multi-armed bandit exercise 2.5 with C#

Comments
4 min read
Sutton & Barto Gridworld example in C#

Comments
5 min read
HRPO-X v1.0.1: from the HRPO paper to production-hardened, runnable code

Comments
2 min read