Learning science by doing chores

Chores are generally undesirable and boring tasks that we perform regularly. They don’t really pose any danger to us (unless of course you’re in charge of cleaning the toilet after taco night), so we could just… not do them, right?

Well, the thing about chores is that, while skipping them is not dangerous in any way, it does come with consequences. And all of those consequences leave us worse off. If we never dusted our home, we would likely be sneezing more. If we never cleaned the shower drain, it would get funky and we would trip over all the clogged-up hair.

And if we never did the dishes… well, yuck. So while we don’t necessarily have to do them, we are certainly better off for it. And the same applies to dogs and training. Obedience is not necessarily a fun undertaking for most dogs, but we teach them the importance of it with consequences in order to build a strong relationship and communication.

THE CHAMBER OF SECRETS

Operant conditioning – also known as instrumental learning – was first studied extensively by Edward Thorndike, who built escape puzzles for cats. But the really deep understanding of this model of learning came from B. F. Skinner a few decades later. Skinner believed that classical conditioning was too simple an explanation for complex behaviours in living, thinking beings. He wanted to prove that we learn more through cause and effect than simply through repetition.

He built a so-called operant conditioning chamber – more commonly referred to as a Skinner box (see above) – where he could expose small animals to highly controlled stimuli and study the effects on the animals. The purpose of this was to get the animals to perform certain behaviours in order to gain reinforcement (the machine dispensed food) or escape punishment (the floor became electrified). I won’t go into too much detail here, but you can watch one of his experiments with hungry rats on YouTube.

MOTIVATION – TOWARDS OR AWAY FROM

Motivation is what drives us all. It makes us do or not do certain things, because of the current state we are in. If you are hungry, that sensation holds enough power to push you off the couch, onto your legs, make you walk to the kitchen, make food and eat, in order to relieve the pressure. Yes, hunger is pressure. And pressure is a motivator. But you can also have a craving without being hungry. This may hold the same power over you, and make you repeat the previous steps all over again, only to go get some chocolate this time.

While both the above examples produce the same outcome (behaviour change), their motivation is very different. One is negatively reinforcing (eating to remove the hunger), while the other is positively reinforcing (enjoying a chocolate treat, simply because it is desirable). 

THE FANTASTIC FOUR

There are four ways to motivate someone to do something: positive reinforcement (+R), negative reinforcement (-R), positive punishment (+P) and negative punishment (-P).

Okay, first, let’s get this out of the way: positive and negative do NOT mean good and bad. They mean additive and subtractive. In other words, think of it as maths (I mean, this is science after all). So positive means we are adding something to create reinforcement or punishment, while negative means we are removing something to create reinforcement or punishment. For this to happen, we need to determine what is appetitive (desirable) and what is aversive (undesirable) to the person or animal we are training. It is also very important to understand that before you can use punishment or correction, the subject needs to understand completely, 100%, what is being asked of it.
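If it helps to see the maths spelled out, here is a minimal sketch of the four quadrants as a two-question lookup: is something being added or removed, and does the subject find that something appetitive or aversive? The names `Quadrant` and `classify` are made up for this illustration and are not part of any training tool.

```python
from enum import Enum


class Quadrant(Enum):
    POSITIVE_REINFORCEMENT = "+R"
    NEGATIVE_REINFORCEMENT = "-R"
    POSITIVE_PUNISHMENT = "+P"
    NEGATIVE_PUNISHMENT = "-P"


def classify(stimulus_added: bool, stimulus_is_appetitive: bool) -> Quadrant:
    """Name the quadrant for a single consequence.

    stimulus_added: True if something is added, False if it is removed.
    stimulus_is_appetitive: True if the subject finds it desirable,
        False if the subject finds it aversive.
    """
    if stimulus_added:
        # Adding something desirable reinforces; adding something aversive punishes.
        if stimulus_is_appetitive:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.POSITIVE_PUNISHMENT
    # Removing something desirable punishes; removing something aversive reinforces.
    if stimulus_is_appetitive:
        return Quadrant.NEGATIVE_PUNISHMENT
    return Quadrant.NEGATIVE_REINFORCEMENT


# Chocolate (desirable) is added after walking to the kitchen.
print(classify(stimulus_added=True, stimulus_is_appetitive=True).value)    # +R
# Hunger (undesirable) is removed by eating.
print(classify(stimulus_added=False, stimulus_is_appetitive=False).value)  # -R
```

Note that the sketch takes "appetitive" as an input. It cannot decide that for you, because only the subject can.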

REINFORCEMENT VS. PUNISHMENT

“What is a punishment?” I get this question almost daily.
“My dog isn’t food motivated, so how can I train it?” is another frequent complaint.

And, sadly, I often hear this as well: “Punishment? So, you hit your dogs?”
Let me make one thing absolutely clear, before I go any further. Hitting a dog is not punishing it. That’s abusing it. This is my stance on that matter.

Let’s look at the definitions of reinforcement and punishment, as we use them in dog training:


Reinforcement increases the likelihood that a behaviour will be repeated.
Punishment decreases the likelihood that a behaviour will be repeated.


At its very essence, it is that simple.

Unfortunately, the only one who can determine what is reinforcing and what is punishing is your dog. What one dog finds extremely enticing, another might find aversive. Where one dog may not even notice a leash correction, another may run away at a quiet ‘No’.
My dogs are absolutely crazy about any type of food, whereas some dogs would nearly starve themselves in order to get a toy. But by following the simple definition above (increase and decrease of certain behaviours) it is usually pretty easy to work out what your dog finds appetitive and what it finds aversive.
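As a rough sketch of that working-out, you could tally how often a behaviour happens before and after you introduce a consequence, then apply the definition directly. The function name `label_consequence` and the numbers are invented for illustration; real observation is messier than two tallies.

```python
def label_consequence(count_before: int, count_after: int) -> str:
    """Label a consequence by how the behaviour's frequency changed."""
    if count_after > count_before:
        return "reinforcing"
    if count_after < count_before:
        return "punishing"
    return "no clear effect"


# Hypothetical tallies: how often a behaviour occurred in a week,
# before and after a given consequence was introduced.
print(label_consequence(count_before=10, count_after=3))  # punishing
print(label_consequence(count_before=4, count_after=9))   # reinforcing
```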

But this is all getting very technical. Let’s try and break it down and make it a little more fun!


SUBJECT #1 – THE VIKING DOG TRAINER

Species:
Homo sapiens (barely)
Age:
Dead in dog years
Sex:
Male
Distinguishing features:
Hair in all the wrong places
Conditioned markers (reinforcer and punisher):
“Yes” and “bad Viking”


Subject’s appetitive:
Wine (red, Shiraz)


Subject’s aversive:
Dirty dishwater and pop music


Task to perform:
Doing the dishes
Subject’s knowledge of the task:
Has been taught using positive reinforcement, knows every step of the process and clearly understands what is asked of him. 
Criteria for successful completion of task:
All dishes have been cleaned with soap and water, until there is no food residue left, then set aside to dry.
Verbal command:
“Dishes”. 

STUDY COMMENCEMENT AND OUTCOMES

We have put the subject in a control position (sitting on a chair in front of all the equipment he needs to complete the task). We have then said the subject’s name to gain his attention and given him the verbal command ‘dishes’.

The subject was given the command four times and below is the result (consequence) for each of the four repetitions. 

#1 – POSITIVE REINFORCEMENT

Subject completed the task satisfactorily and was rewarded with one unit of wine (wine is added in order to create reinforcement, hence it is positive). However, we believe the subject can complete the task a little faster, so next round we will add a low-level aversive pressure, in order to encourage a faster and better result.

#2 – NEGATIVE REINFORCEMENT

In order to make the subject complete the task faster, we start playing pop music for him when he begins the task. As the subject hates pop music, and he knows the music will go away when he completes the task, he will do the task faster and more effectively. The moment of reinforcement is the second the music is turned off (the music is removed from the equation, therefore it is negative).

#3 – NEGATIVE PUNISHMENT

The subject knows the task but didn’t perform it successfully. He only cleaned half the dishes before giving up. We therefore punish the subject by withholding his primary reinforcer, the glass of wine. Next time, the subject should perform the task to a more satisfactory standard, as he knows there is an aversive consequence for not completing it (the reinforcer is removed, hence the punishment is negative).

#4 – POSITIVE PUNISHMENT

The subject actively refused to undertake the task. We know with 100% certainty that he knows how to complete the task, so there is no confusion on the subject’s part. Therefore, we add a positive punishment by throwing dirty dishwater in the subject’s face (water is added, therefore the punishment is positive). This is a strong correction, which will lead to the subject not refusing to do the task again, as he knows there is a clear aversive consequence for not performing.

Utilising all four quadrants of operant learning, we can now build a really strong response to any trained task, by adding appetitive and aversive consequences. This will ensure that the subject completes the dishes every time he is asked to, with speed and proficiency, to a standard that is acceptable. 

UNMISTAKABLY CLEAR COMMUNICATION

As you can see, operant conditioning allows us to clearly guide our dog through whatever we may be aiming for. We’re telling the dog yes when it’s right and – just as importantly – we’re telling it no when it’s wrong. By doing this, we are allowing the dog to understand the full picture of what we are attempting to communicate, reducing stress and frustration, and increasing engagement.

There are many trainers who encourage people to ignore bad behaviour and simply wait for good behaviour, so they can reward that. I think this is plain silly. Why would you not want your dog to understand both sides of the behaviour? Also, if a dog is repeatedly jumping on you, and you’re just turning your nose up at it, don’t you think that builds more frustration in the dog? And have we forgotten that there is such a thing as a ‘learning event’, where the dog may now think it has to jump on you for two minutes, then sit, in order to get its reward? With operant conditioning, you simply say no, add some pressure, and reward the alternative behaviour. Task complete.

Beautifully simple, beastly effective!


Stian Berg
Dog trainer & behaviourist