Theory of Learning, “Operant Conditioning”

by Nazneen Shah
Theory of Learning, “Operant Conditioning” by B.F. Skinner

Learning: What goes on in the process of learning? How do we learn?

Various theories throw light on the phenomenon of learning. Each theory, with its systematic body of knowledge, explains the nature and process of learning. These theories represent broad principles and techniques of learning. They also put forth various methods of learning and suggest steps that teachers and learners can take to make learning effective.

Modern learning theories may be classified into two broad types, namely:
A) Stimulus-response associationist type of learning theory.
B) Gestalt field or field cognitive type of learning theory.

A) The S-R associationist type of theory interprets learning in terms of the change in the learner’s behavior brought about by associating a response with a series of stimuli. The chief exponents of this type of theory are:
I. Edward L. Thorndike: his idea and system is called “connectionism”, “Trial and Error” or “S-R learning theory”.
II. John B. Watson and Ivan Petrovich Pavlov: their idea and system is known as classical conditioning.
III. Burrhus Frederic Skinner: his idea and system is called “operant conditioning”.

None of these theories is said to be complete in all respects in explaining the phenomenon of learning. Each of them gives a partial description. For example, one theory explains the learning process well in one situation, while the others hold equally well in other situations.

Trial and Error or S-R Learning theory

Thorndike put a hungry cat in a puzzle box. There was only one door for exit, which could be opened by correctly manipulating a latch. A fish was placed outside the box. The smell of the fish worked as a strong motive for the hungry cat to come out of the box. Consequently, the cat made every possible effort to come out. It tried to squeeze through every opening, and it clawed and bit at the bars or wires. In this way, it made a number of random movements. In one of these random movements, the latch was manipulated by chance. The cat came out and got its reward. Over repeated trials, it became able to open the door without any error; that is, it learnt the way of opening the door. This experiment sums up the following stages in the process of learning.

Drive: Hunger, intensified by the sight and smell of the food, i.e. the fish.

Goal: To get the food by getting out of the box.

Block: The cat was confined in the box with a closed door.

Random movements: The cat persistently tried to get out of the box.

Chance success: As a result of this striving and these random movements, the cat, by chance, succeeded in opening the latch.

Selection (of the proper movement): Gradually, the cat picked out of its random movements the correct way of opening the door, namely manipulating the latch.

Fixation: At last the cat learned the proper way of opening the door by eliminating all the incorrect responses and fixing only the right response. Now it was able to open the door without any error.

Thorndike named the learning in his experiment “Trial and Error”. He maintained that learning is nothing but the stamping in of correct responses and the stamping out of incorrect responses through trial and error. In trying for the correct solution, the cat made many vain attempts. It committed error after error before succeeding. On subsequent trials, it tried to avoid the erroneous ways and repeat the correct way of manipulating the latch. Thorndike called it “learning by selecting and connecting”, as it provides an opportunity to select the proper responses and connect or associate them with appropriate stimuli. In this connection Thorndike wrote, “Learning is connecting. The mind is man’s connection system.”
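The stamping in and stamping out described above can be illustrated with a small simulation. The Python below is only a sketch under stated assumptions: the list of actions, the starting weights, and the multipliers `stamp_in` and `stamp_out` are hypothetical values chosen for the illustration, not quantities from Thorndike’s experiment.

```python
import random

# Hypothetical repertoire of the cat; only one action opens the door.
ACTIONS = ["squeeze through opening", "claw at bars", "bite at wires", "press latch"]
CORRECT = "press latch"


def run_trials(n_trials, seed=0, stamp_in=1.5, stamp_out=0.8):
    """Simulate repeated trials in the puzzle box.

    Every action starts with the same weight. A failed action has its
    weight reduced (stamped out); the successful action has its weight
    increased (stamped in). Returns the final weights and the number of
    errors made in each trial.
    """
    rng = random.Random(seed)
    weights = {action: 1.0 for action in ACTIONS}
    errors_per_trial = []
    for _ in range(n_trials):
        errors = 0
        while True:
            # Pick an action at random, in proportion to its current weight.
            action = rng.choices(ACTIONS, weights=[weights[a] for a in ACTIONS])[0]
            if action == CORRECT:
                weights[action] *= stamp_in   # reward strengthens the connection
                break
            weights[action] *= stamp_out      # failure weakens the connection
            errors += 1
        errors_per_trial.append(errors)
    return weights, errors_per_trial


weights, errors = run_trials(30)
print("errors in first five trials:", errors[:5])
print("errors in last five trials: ", errors[-5:])
```

With these assumed multipliers the weight of the correct action grows on every trial while the others can only shrink, so errors tend to become rarer in later trials, which is the pattern the fixation stage describes.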

Ivan Petrovich Pavlov’s “Classical conditioning” learning theory

Pavlov gave birth to a new theory of learning known as conditioned response theory or simply as learning by conditioning.

Experiment: In one of his experiments, Pavlov kept a dog hungry for the night and then tied him to an experimental table fitted with certain mechanically controlled devices. The dog was made comfortable and distractions were excluded as far as possible. The observer kept himself hidden from the dog’s view but was able to watch the experiment by means of a set of mirrors. Arrangements were made to give food to the dog through automatic devices. Every time food was presented to the dog, Pavlov also arranged for a bell to ring. When the food was presented and the bell was rung, the dog automatically secreted saliva. The presentation of food accompanied by the ringing of the bell was repeated several times, and the amount of saliva secreted was measured.

After several trials, the dog was given no food but the bell was rung. In this case also the amount of saliva secreted was recorded and measured. It was found that even in the absence of food (the natural stimulus), the ringing of the bell (an artificial stimulus) caused the dog to secrete saliva (the natural response).

The theory regards learning as habit formation, based on the principle of association and substitution. It is simply a stimulus-response type of learning in which, in place of a natural stimulus such as food or water, an artificial stimulus such as the sound of a bell can evoke a natural response. When the artificial or neutral stimulus (the ringing of the bell) and the natural stimulus (food) are presented together several times, the dog becomes habituated or conditioned to respond to this situation. An association forms between the two stimuli presented together. As a result, after some time the natural stimulus can be substituted or replaced by the artificial stimulus, and this artificial stimulus is able to evoke the natural response.

Diagrammatic Presentation of the Experiment

S1: Natural stimulus (presentation of food) → R1: Natural response (salivation)

S2: Artificial stimulus (ringing bell) → R2: General alertness

Experiment No. 2: In one of the experiments done by Watson, the subject was a human baby eleven months old. The baby was given a rabbit to play with. The baby liked it very much and was pleased to touch its fur. Watson carefully watched the baby’s pleasant responses. After some time, in the course of the experiment, a loud noise was produced to frighten the baby as soon as the baby touched the rabbit, and the baby was frightened. Each time he wanted to touch the rabbit, the loud noise was produced and he gave a fear response.

From these experiments, Watson and Pavlov concluded that all types of learning can be explained through the process of conditioning. What is this process?
It is a learning process by means of which an artificial stimulus becomes able to behave like a natural stimulus when the natural and artificial stimuli are presented together. In this type of learning, association plays a great role, since the individual responds to an artificial stimulus because he associates it with the natural stimulus.

Burrhus Frederic Skinner (1904-1990) was born in Susquehanna, a Pennsylvania railroad town close to the New York State border.

At school and college, Skinner was interested in literature and biology and considered becoming a poet and novelist. However, he became interested in psychology after reading books by Pavlov and Watson. He enrolled in the psychology department at Harvard University, gaining a PhD in psychology in 1931. In 1948 he joined the psychology department at Harvard University, where he remained professionally active until his death in 1990.

During World War II Skinner participated in a government research project, the results of which were not made public until 1959. He had been conditioning pigeons to pilot missiles and torpedoes. The pigeons were so highly trained that they could guide a missile right down into the smokestack of a naval destroyer.

Skinner’s Experiments regarding ‘operant conditioning’

B.F. Skinner conducted a series of experiments with animals. For his experiments with rats, he designed a special apparatus known as Skinner’s Box. It was a much modified form of the puzzle box used by Thorndike for his experiments with cats. The darkened, soundproof box mainly consists of a grid floor, a lever, a food cup, and a system that produces a light or sound when a pellet of food is delivered into the food cup. It is arranged so that when a rat (hungry or thirsty) presses the lever, the feeder mechanism is activated, a light or a special sound is produced, and a small pellet of food or a few drops of water are released into the food cup. To record the observations, the lever is connected to a recorder system which produces a graphical tracing of the lever presses against the length of time the rat is in the box.

To begin with, Skinner, in one of his experiments, placed a hungry rat in the box. In this experiment, when the rat pressed the bar in the desired way, a click sound was produced. The click acted as a cue or signal indicating to the rat that if it responded by going to the food cup, it would be rewarded. The rat was rewarded for each proper press of the lever. The lever-press response, having been rewarded, was repeated, and when it occurred it was again rewarded, which further increased the probability of its repetition, and so on. In this way the rat ultimately learned to press the lever as desired by the experimenter.

For experiments with pigeons, Skinner used another specific apparatus called the ‘pigeon box’. A pigeon in this experiment had to peck at a lighted plastic key mounted on the wall at head height and was subsequently rewarded with grain. With the help of such experiments, Skinner put forward his theory of operant conditioning, which accounts for the learning not only of simple responses like pressing a lever or latch but also of the most difficult and complex series of responses.

Although classified and included in the category of conditioning, operant conditioning differs a lot from the classical conditioning advocated by Watson and Pavlov. The most outstanding difference lies in the order of initiation and response, i.e. in the stimulus-response mechanism. In classical conditioning the organism is passive. It must wait for something to happen before responding. The presence of a stimulus is essential for evoking a response. The behavior cannot be emitted in the absence of a cause. The child expresses fear when he hears a loud noise; the dog waits for food to arrive before salivating. In each of these instances, the subject has no control over what happens. He is made to behave in response to the stimulus situation. Thus, the behavior is said to be initiated by the environment; the organism simply responds.

Skinner revolted against the ‘no stimulus, no response’ mechanism in the evolution of behavior. He argued that in the practical situations of life we cannot wait for things to happen in the environment. Man is not a victim of the environment. He may often manipulate things in the environment on his own initiative. Therefore, it is not always essential that there be some known stimulus or cause to evoke a response. Quite often, our responses cannot be attributed to known stimuli. The organism itself initiates the behavior. A dog, a child, or any individual ‘does’ something or ‘behaves’ in some manner; it ‘operates’ on the environment, and the environment in turn responds to the activity. How the environment responds to the activity, rewarding it or not, largely determines whether the behavior will be repeated, maintained or avoided.

Where Skinner got the cue for such ideas is a question that can arise at this stage. It was from the studies and observations of an earlier psychologist, Thorndike. Through the experiments behind his famous trial and error theory of learning, Thorndike concluded that the rewards of a response (like getting food after chance success through random movements) lead to the repetition of an act and the strengthening of S-R associations. These conclusions led Skinner to begin a series of experiments to find the consequences of rewards in repeating and maintaining behavior. Based on the findings of his experiments, he concluded that “behavior is shaped and maintained by its consequences”. It is operated by the organism and maintained by its consequences. Such behavior was named operant behavior, and the process by which such behavior is learned was named by him operant conditioning.

Some concepts used by Skinner in bringing out his theory of learning: operant conditioning.

Operant: Skinner considers an operant to be a set of acts that constitutes an organism’s doing something, e.g. raising its head, walking about, pushing a lever, etc.

Reinforcer and Reinforcement: The concept of reinforcement is identical to the presentation of a reward. A reinforcer is a stimulus whose presentation or removal increases the probability of a response re-occurring. Skinner thinks of two kinds of reinforcers: positive and negative.
A positive reinforcer is any stimulus whose introduction or presentation increases the likelihood of a particular behavior. Food, water, etc. are classified as positive reinforcers. A negative reinforcer is any stimulus whose removal or withdrawal increases the likelihood of a particular behavior. Electric shock, a loud noise, etc. are said to be negative reinforcers.

Schedules of Reinforcement: Skinner put forward the idea of planning schedules of reinforcement for conditioning the operant behavior of the organism.

1. Continuous Reinforcement Schedule: This is a hundred percent reinforcement schedule, in which provision is made to reinforce or reward every correct response of the organism during the acquisition of learning. For example, a student may be rewarded for every correct answer he gives to questions or problems put by the teacher.
2. Fixed Interval Reinforcement Schedule: In this schedule the organism is rewarded for a response made only after a set interval of time, e.g. every 3 minutes or every 5 minutes. How many times he has given the correct response during this fixed interval does not matter; it is only on the expiry of the fixed interval that he is presented with some reinforcement.
3. Fixed Ratio Reinforcement Schedule: In this schedule the reinforcement is given after a fixed number of responses. A rat, for example, might be given a pellet of food after a certain number of lever presses. A child solves five sums and gets a chocolate.
4. Variable Reinforcement Schedule: When reinforcement is given at varying intervals of time or after a varying number of responses, it is called a variable reinforcement schedule. In this case reinforcement is intermittent or irregular. The individual does not know when he is going to be rewarded and consequently remains motivated throughout the learning process in anticipation of reinforcement. Examples are card games and gambling, and the slogan “try and try again”. In classroom teaching and learning, a VR schedule operates when a student is not reinforced each time he raises his hand to answer a question, but the more often he raises his hand, the more likely he is to be called upon by the teacher. Good marks and promotion may also come at unpredictable times.
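The four schedules can be read as simple decision rules that answer one question for each response: is it reinforced or not? The Python below is a minimal sketch, not Skinner’s procedure. The class names are hypothetical; the ratio of 5 and the interval of 180 seconds echo the “five sums” and “every 3 minutes” examples above; the one-response-per-minute timing and the 0.25 chance are assumptions made for the illustration. The variable schedule is modelled as a fixed chance per response, which is only one simple way of making the reward unpredictable.

```python
import random


class ContinuousSchedule:
    """Reinforce every correct response."""

    def reinforce(self, now):
        return True


class FixedRatioSchedule:
    """Reinforce every `ratio`-th response."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0

    def reinforce(self, now):
        self.count += 1
        if self.count == self.ratio:
            self.count = 0
            return True
        return False


class FixedIntervalSchedule:
    """Reinforce the first response made once `interval` seconds have elapsed."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reward = 0.0

    def reinforce(self, now):
        if now - self.last_reward >= self.interval:
            self.last_reward = now
            return True
        return False


class VariableSchedule:
    """Reinforce each response with chance `p`, so the learner cannot predict it."""

    def __init__(self, p, seed=0):
        self.p = p
        self.rng = random.Random(seed)

    def reinforce(self, now):
        return self.rng.random() < self.p


def run(schedule, response_times):
    """Return one True/False per response: was that response reinforced?"""
    return [schedule.reinforce(t) for t in response_times]


# Assumed timing: one correct response every 60 seconds for ten minutes.
times = [60.0 * i for i in range(1, 11)]
for schedule in (ContinuousSchedule(), FixedRatioSchedule(ratio=5),
                 FixedIntervalSchedule(interval=180.0), VariableSchedule(p=0.25)):
    rewarded = run(schedule, times)
    print(type(schedule).__name__, rewarded.count(True), "of", len(rewarded))
```

Keeping each schedule behind the same `reinforce` method makes it easy to swap one for another while holding the learner’s responses fixed, which is how the schedules are compared in the text.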

Defining Operant Conditioning: Operant conditioning refers to a kind of learning process in which a response is made more probable or more frequent by reinforcement. It helps in the learning of operant behavior, that is, behavior not necessarily associated with a known stimulus.
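The idea that reinforcement makes a response more probable can be pictured with a toy simulation of the rat in the box. This is a sketch under stated assumptions, not a model taken from Skinner: the linear update in `strengthen`, the starting probability of 0.05, and the rate of 0.2 are all hypothetical choices made for the illustration.

```python
import random


def strengthen(probability, rate=0.2):
    """Move the response probability a fraction `rate` of the way toward 1.0."""
    return probability + rate * (1.0 - probability)


def simulate(trials, start=0.05, rate=0.2, seed=1):
    """Simulate lever pressing where every press is followed by a reward.

    On each trial the organism emits the response with the current
    probability. No stimulus triggers it. Each rewarded press makes the
    next press more likely. Returns the probability after every trial.
    """
    rng = random.Random(seed)
    probability = start
    history = []
    for _ in range(trials):
        pressed = rng.random() < probability
        if pressed:
            probability = strengthen(probability, rate)
        history.append(probability)
    return history


history = simulate(trials=200)
print(f"probability of pressing: start 0.05, after 200 trials {history[-1]:.2f}")
```

Under this assumed rule the probability can only rise, because every press is rewarded; an unrewarded press would need a separate weakening rule, which the sketch leaves out.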

The difference between the two types of conditioning

Classical respondent conditioning
Operant conditioning
