Reinforcement Schedules

People occasionally quit comfortable, well-paying jobs over gambling addictions. Why? The excitement of unpredictable jackpots can be intoxicating, even for the most intelligent species on the planet. In addition to building such a strong reinforcement history that desirable behaviours (e.g. sitting politely for petting) become self-rewarding, we can harness the same pull that drives gambling addiction to build extreme reliability in our dogs, reliability that lasts through long periods with little contrived reinforcement.

| | Reinforcement is Predictable | Reinforcement is Unpredictable |
| A certain number of behaviours produces reinforcement. | Fixed Ratio Schedules | Variable Ratio Schedules |
| Performing behaviour for a certain amount of time produces reinforcement. | Fixed Interval Schedules | Variable Interval Schedules |

Thus far, you have been using a fixed ratio schedule of 1 behaviour : 1 reward, known as a continuous reinforcement schedule. This is the clearest way to communicate while the learner is still developing an idea of what the behaviour actually is. A continuous reinforcement schedule also builds the strongest reinforcement history, which is why it is generally maintained for emergency recalls, and why almost all working detection K-9 handlers reward every single time the dog makes a find, throughout the life of the K-9.

Why Switch?

While the continuous reinforcement schedule does build the strongest reinforcement history, it is not practical to toss a treat or toy every time our dogs respond to a verbal cue. For duration behaviours, such as stay, variable interval schedules are essential to building a strong behaviour of variable duration. Unpredictability also protects against extinction, the process in operant conditioning by which a behaviour fades once reinforcement stops. Everyday behaviours survive best on variable ratio and interval schedules; for infrequent life-or-death behaviours, such as the emergency recall or a search and rescue find that may occur twice a month, it is both practical and advisable to have a reward on hand for each and every repetition.

Small Steps

If you simply switch from a 1:1 to a 1:10 reinforcement schedule (one reward for every ten behaviours) for sit and down, you're likely to experience extinction of the behaviour, with the puppy deciding that sit and down are no longer worthwhile behaviours to offer you. The easiest step for owners to take is from a 1:1 fixed ratio to a 1:2 fixed ratio, and only for your puppy's strongest behaviour, which is usually the sit. This means that you will reward the puppy with a treat or toy every other time she sits on cue. Continue to praise her each time, and incorporate sitting on cue into your daily life as a way for her to earn the small things that make her happy, such as going outside. Once you have mastered the idea of deliberately rewarding every other time, switch from a 1:2 fixed ratio schedule to a 1:2 variable ratio schedule, meaning that on average she gets rewarded for sitting 50% of the time, but she cannot predict which sit is going to earn her a reward.
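For readers who like to see the logic spelled out, the difference between the fixed and variable ratio schedules described above can be sketched in a few lines of Python. This is only an illustration: the function names `fixed_ratio` and `variable_ratio` are made up for this example, and rewarding each sit with probability 1/ratio is just one simple way to approximate a variable ratio schedule, not a prescribed training protocol.

```python
import random


def fixed_ratio(n_sits, ratio):
    """Reward every `ratio`-th sit, so the pattern is fully predictable."""
    return [(i + 1) % ratio == 0 for i in range(n_sits)]


def variable_ratio(n_sits, ratio, rng=None):
    """Reward each sit with probability 1/ratio (an assumed approximation).

    On average one sit in `ratio` is rewarded, but no single sit is predictable.
    """
    if rng is None:
        rng = random.Random()
    return [rng.random() < 1.0 / ratio for _ in range(n_sits)]


if __name__ == "__main__":
    print(fixed_ratio(6, 2))     # [False, True, False, True, False, True]
    print(variable_ratio(6, 2))  # differs from run to run
```

On the 1:2 fixed schedule the puppy could learn that every second sit pays; on the variable version the long-run average is the same, but any given sit might be the one that earns the treat.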

Once you are comfortable rewarding approximately half of her sits, and only if her enthusiasm and reliability have retained their initial strength, decrease your rewards to approximately one third, then one quarter, and so on. Once your dog is reliably and enthusiastically sitting with contrived rewards delivered on a 1:50 variable ratio schedule (on average, one reward for every 50 sits), she is literally operating on a slot machine schedule!
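The gradual thinning described above (one half, one third, one quarter, and so on toward 1:50) can be written as a simple rule of thumb. The sketch below is hypothetical: the function name `next_ratio` is invented for illustration, and stepping back one level when the dog falters is an assumption on our part, since the text only says to thin rewards while her enthusiasm and reliability hold.

```python
def next_ratio(current, still_reliable, target=50):
    """Pick the ratio (sits per reward, on average) for the next stage.

    current:        the ratio in use now, e.g. 2 means roughly every other sit
    still_reliable: True if enthusiasm and reliability are as strong as before
    target:         the leanest schedule to aim for (1:50 in the text)
    """
    if still_reliable:
        # Thin the rewards by one small step, never past the target.
        return min(current + 1, target)
    # Assumption: if she falters, return to the previous, richer schedule.
    return max(current - 1, 1)


if __name__ == "__main__":
    ratio = 2
    for reliable in [True, True, False, True]:
        ratio = next_ratio(ratio, reliable)
        print(f"next stage: reward about 1 sit in {ratio}")
```

The point of the small steps is the same as in the text: each change is slight enough that the dog keeps offering the behaviour while the rewards become rarer.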