PRL: Reversal Learning

One of two colors pays +10. When the rule reverses, switch.

About this trainer

PRL is a probabilistic reversal-learning task. You face two options and pick one on each trial; the better option pays off most of the time but not always, so you learn by trial and error which one is currently 'good'. Without warning the rule flips and the other option becomes better, and your job is to notice the change and switch, instead of stubbornly sticking with what used to work.
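To make the trial structure concrete, here is a minimal Python sketch of such a task. It is not this trainer's actual code. The names (run_task, choose), the 80% payoff rate, the 20% rate for the worse option, the +10 reward and the 15 to 25 trial gap between reversals are all illustrative assumptions.

```python
import random

def run_task(choose, n_trials=200, p_good=0.8, reward=10,
             min_block=15, max_block=25, seed=0):
    """Simulate a two-option probabilistic reversal task (illustrative only).

    choose(history) must return 0 or 1; history is a list of
    (choice, payoff) tuples for the trials played so far.
    Returns (total_payoff, reversal_trials).
    """
    rng = random.Random(seed)
    good = rng.randrange(2)                      # which option is currently better
    next_reversal = rng.randint(min_block, max_block)
    history, reversals = [], []
    for t in range(n_trials):
        if t == next_reversal:                   # unannounced rule flip
            good = 1 - good
            reversals.append(t)
            next_reversal = t + rng.randint(min_block, max_block)
        choice = choose(history)
        # Assumption: the worse option pays with probability 1 - p_good.
        p_win = p_good if choice == good else 1 - p_good
        payoff = reward if rng.random() < p_win else 0
        history.append((choice, payoff))
    return sum(p for _, p in history), reversals

# A player who stubbornly sticks with option 0 throughout.
total, reversals = run_task(lambda history: 0)
```

The player never sees `good` or `reversals`; only the stream of payoffs is available, which is what makes the switch hard to time.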

What it develops

It trains cognitive flexibility and feedback-driven learning under uncertainty: holding onto a rule while it pays, distinguishing a genuine reversal from a run of bad luck, and updating your choice without over-reacting to a single misleading outcome.

History

The idea grew out of mid-20th-century animal learning research, where animals were taught a simple discrimination and then had the reward contingencies reversed to see how fast they could relearn. The probabilistic version for humans took shape in cognitive neuroscience around the early 2000s, when noisy feedback was added to better mimic real-world uncertainty and to probe how the brain copes with changing rules.

Who created it — and when

There is no single inventor. Reversal learning comes from the behaviourist tradition of discrimination-reversal studies from the 1940s through the 1960s, associated with researchers such as Harry Harlow, the Kendlers and N. J. Mackintosh. The modern probabilistic reversal task used in human imaging is commonly credited to Roshan Cools, Luke Clark and colleagues at Cambridge around 2002, building on that older lineage rather than founding it.

How to train

Treat one bad outcome as noise, not proof: only conclude the rule has flipped after several misses in a row from the option you thought was best. Keep a rough mental tally of recent results rather than reacting to the very last trial, and once you have switched, commit to the new choice long enough to confirm it before doubting it again.
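The advice above can be written down as a simple rule of thumb. The sketch below is one possible reading of it, not a recommended or optimal strategy: the function name make_streak_chooser and the threshold of three misses are assumptions chosen for illustration.

```python
def make_streak_chooser(misses_needed=3):
    """Build a chooser that switches only after several misses in a row.

    The returned choose(history) takes a list of (choice, payoff) tuples
    and returns the next choice, 0 or 1. The threshold is hypothetical.
    """
    def choose(history):
        if not history:
            return 0                      # arbitrary first pick
        current = history[-1][0]
        streak = 0
        # Count consecutive misses on the current option, newest first.
        for choice, payoff in reversed(history):
            if choice != current or payoff > 0:
                break
            streak += 1
        # Switch only when the misses have piled up; otherwise stay put.
        return 1 - current if streak >= misses_needed else current
    return choose

choose = make_streak_chooser(misses_needed=3)
```

Because the streak is counted only on the option you are currently playing, a fresh switch starts with a clean slate, which matches the advice to commit to the new choice before doubting it again.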

How long to practise

Short, regular sessions work best: roughly 5 to 10 minutes, a few times a week. The skill being exercised is fast updating, so several short blocks beat one long grind, where fatigue makes you either too twitchy or too rigid.

Evidence base

Evidence is strongest for the obvious thing: with practice you get better at the task itself and at telling real rule changes apart from unlucky streaks, and the task reliably tracks differences in flexibility between groups in clinical and neuroscience research. Claims that this kind of training broadly transfers to everyday decision-making or general 'cognitive flexibility' are weak and contested, and the wider brain-training literature gives little reason to expect far transfer, so treat any grand promise with caution.

Recommendations

Before you switch, ask yourself whether you have really seen a pattern of failures or just one unlucky result, and only flip when the evidence has piled up.

FAQ

Why did I lose even though I picked the 'right' option?

Because the good option only pays off most of the time, not every time. A single loss is often just noise; the rule has not necessarily reversed.
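A little arithmetic shows why one loss says so little. The sketch below assumes an 80/20 setup (the good option pays 80% of the time, the other 20%) and independent trials; both are assumptions, and the function name is made up for this example.

```python
def streak_probability(p_win, k):
    """Chance of k losses in a row from an option that pays with probability p_win."""
    return (1 - p_win) ** k

for k in (1, 2, 3):
    good = streak_probability(0.8, k)   # assumed payoff rate of the good option
    bad = streak_probability(0.2, k)    # assumed payoff rate of the other option
    print(f"{k} losses in a row: good option {good:.1%}, bad option {bad:.1%}")

# 1 losses in a row: good option 20.0%, bad option 80.0%
# 2 losses in a row: good option 4.0%, bad option 64.0%
# 3 losses in a row: good option 0.8%, bad option 51.2%
```

Under these assumptions a single loss happens one time in five even when nothing has changed, while three losses in a row from a still-good option are rare.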

How do I know when the rule has actually flipped?

Look for a run of poor outcomes from the choice that used to work, not one bad result. Once misses cluster together, that is your cue to switch.

Will this make me more flexible in real life?

It will reliably make you better at this task and similar ones. Broad transfer to everyday decisions is not well supported, so enjoy the practice without banking on life-changing gains.

Variants

Variations change the difficulty by adjusting how reliable the good option is (for example 80/20 versus a noisier 70/30), how often reversals happen, whether you track two options or several, and whether feedback comes as rewards, as losses, or both. Deterministic versions remove the luck entirely and simply flip a rule that always holds.
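One way to picture these variants is as a handful of settings. The sketch below is purely illustrative: the Variant class, its field names, the default values and the preset names are assumptions, not options exposed by this trainer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    """Hypothetical settings describing one flavour of the task."""
    p_good: float = 0.8            # how often the good option pays
    n_options: int = 2             # two options, or several
    mean_block_length: int = 20    # average trials between reversals (assumed)
    feedback: str = "rewards"      # "rewards", "losses", or "both"

    def payoff(self, won: bool, amount: int = 10) -> int:
        """Translate a win or a miss into points under this feedback mode."""
        if self.feedback == "rewards":
            return amount if won else 0
        if self.feedback == "losses":
            return 0 if won else -amount
        if self.feedback == "both":
            return amount if won else -amount
        raise ValueError(f"unknown feedback mode: {self.feedback!r}")

STANDARD = Variant(p_good=0.8)          # 80/20
NOISY = Variant(p_good=0.7)             # 70/30, harder to read
DETERMINISTIC = Variant(p_good=1.0)     # the rule always holds until it flips
```

Lowering p_good or shortening mean_block_length both make a genuine reversal harder to tell apart from bad luck, which is why they act as difficulty settings.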
