Concept: Operant conditioning
Dopamine signaling is implicated in reinforcement learning, but the neural substrates targeted by dopamine are poorly understood. We bypassed dopamine signaling itself and tested how optogenetic activation of dopamine D1 or D2 receptor–expressing striatal projection neurons influenced reinforcement learning in mice. Stimulating D1 receptor–expressing neurons induced persistent reinforcement, whereas stimulating D2 receptor–expressing neurons induced transient punishment, indicating that activation of these circuits is sufficient to modify the probability of performing future actions.
Noncontingent reinforcement (NCR) is the response-independent delivery of a reinforcer (Vollmer, Iwata, Zarcone, Smith, and Mazaleski in Journal of Applied Behavior Analysis, 26: 9-21, 1993). Two staff members (preservice education majors) implemented NCR procedures for two students with autism spectrum disorder (ASD) who exhibited problem behavior and attended an after-school program. The amount of training on NCR and procedural fidelity were measured for each staff member, and the effects of the treatment on problem behavior were evaluated. The results indicate that NCR is a low-effort procedure that reduced the problem behavior of both participants with ASD.
• NCR can reduce problem behaviors of clients who engage in difficult behaviors (Carr, Severtson, & Lepper, 2009).
• NCR can be used for clients for whom extinction-induced behaviors are dangerous (Tucker, Sigafoos, and Bushell in Behavior Modification, 22: 529-547, 1998).
• Nonbehavioral providers can implement NCR with high fidelity, making it a good procedure to use when collaborating with other professionals (teachers, SLP, parents, etc.; Matson, 2009).
• NCR can be used when clinicians first begin working with a client until more detailed interventions are created.
Smoking tobacco remains one of the leading causes of preventable deaths in North America. Nicotine reinforces smoking behavior, in part, by enhancing the reinforcing properties of reward-related stimuli, or conditioned stimuli (CSs), associated with tobacco intake. To investigate how pharmaceutical interventions may affect this property of nicotine, we examined the effect of four US Food and Drug Administration (FDA) approved drugs on the ability of nicotine to enhance operant responding for a CS as a conditioned reinforcer. Thirsty rats were exposed to 13 Pavlovian sessions where a CS was paired with water delivery. Nicotine (0.4 mg/kg) injections were administered before each Pavlovian session. Then, in separate groups of rats, the effects of varenicline (1 mg/kg), bupropion (10 and 30 mg/kg), lorcaserin (0.6 mg/kg), and naltrexone (2 mg/kg), and their interaction with nicotine on responding for conditioned reinforcement were examined. Varenicline and lorcaserin each reduced nicotine-enhanced responding for conditioned reinforcement, whereas naltrexone had a modest effect of reducing response enhancements by nicotine. In contrast, bupropion enhanced the effect of nicotine on this measure. The results of these studies may inform how pharmaceutical interventions can affect smoking cessation attempts and relapse through diverse mechanisms, either substituting for, or interacting with, the reinforcement-enhancing properties of nicotine.
This article introduces the ArduiPod Box, an open-source device built using two main components (i.e., an iPod Touch and an Arduino microcontroller), developed as a low-cost alternative to the standard operant conditioning chamber, or “Skinner box.” Because of its affordability, the ArduiPod Box provides an opportunity for educational institutions with small budgets seeking to set up animal laboratories for research and instructional purposes. A pilot experiment is also presented, which shows that the ArduiPod Box, in spite of its extraordinary simplicity, can be effectively used to study animal learning and behavior.
The dopaminergic system is involved in reward encoding and reinforcement learning. Dopaminergic neurons from this system in the substantia nigra/ventral tegmental area complex (SN/VTA) fire in response to unexpected reinforcing cues. The goal of this study was to investigate whether individuals can gain voluntary control of SN/VTA activity, thereby potentially enhancing dopamine release to target brain regions. Neurofeedback and mental imagery were used to self-regulate the SN/VTA. Real-time functional magnetic resonance imaging (rtfMRI) provided abstract visual feedback of the SN/VTA activity while the subject imagined rewarding scenes. Skin conductance response (SCR) was recorded as a measure of emotional arousal. To examine the effect of neurofeedback, subjects were assigned to either receiving feedback directly proportional (n = 15, veridical feedback) or inversely proportional (n = 17, inverted feedback) to SN/VTA activity. Both groups of subjects were able to up-regulate SN/VTA activity initially without feedback. Veridical feedback improved the ability to up-regulate SN/VTA compared to baseline while inverted feedback did not. Additional dopaminergic regions were activated in both groups. The ability to self-regulate SN/VTA was differentially correlated with SCR depending on the group, suggesting an association between emotional arousal and neurofeedback performance. These findings indicate that SN/VTA can be voluntarily activated by imagery and voluntary activation is further enhanced by neurofeedback. The findings may lead the way towards a non-invasive strategy for endogenous control of dopamine.
Long Evans rats (n = 32) were trained for 2 weeks to respond to an auditory conditioned stimulus (CS) which signaled the delivery of a 20% sucrose unconditioned stimulus (US) with varying probabilities. Animals were randomly assigned to 1 of 4 groups. In the control groups, the CS signaled sucrose delivery with the same probability in both weeks: 100% (Group 100-100) or 25% (Group 25-25). In the experimental groups (Group 100-25 and Group 25-100), sucrose probabilities were switched between weeks 1 and 2. Three behavioral measures were recorded: latency to enter the sucrose port upon CS presentation, head entries throughout the session, and ultrasonic vocalizations. The results suggest that all groups formed associations between the CS and US, as evidenced by a decrease in latency to respond to the CS across days. The experimental groups were also able to detect when sucrose probability changed, as evidenced by Group 25-100’s increase in head entries, to the level of Group 100-100 in week 2, and Group 100-25’s decrease in head entries, to the level of Group 25-25 in week 2. Group 100-25 also produced an increase in “22 kHz” ultrasonic vocalizations following the downshift on the first day of week 2. The increase in this ultrasonic frequency range, which is associated with negative affect in rats, preceded both the decrease in head entries and the increase in missed trials, consistent with a multistage model of behaviors resulting from US probability reduction.
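The four-group probability design described in this abstract can be laid out as a small Python sketch. This is only an illustration, not the authors' procedure: the equal group size (8 animals per group), the random seed, the number of trials per session, and the names `GROUPS`, `assign_groups`, and `sucrose_schedule` are all assumptions that do not appear in the abstract.

```python
import random

# Sucrose probability given the CS in (week 1, week 2), taken from the group labels.
GROUPS = {
    "100-100": (1.00, 1.00),  # control: always reinforced
    "25-25": (0.25, 0.25),    # control: reinforced on a quarter of trials
    "100-25": (1.00, 0.25),   # experimental: downshift in week 2
    "25-100": (0.25, 1.00),   # experimental: upshift in week 2
}


def assign_groups(n_animals=32, seed=0):
    """Randomly split animal IDs across the four groups.

    Equal group sizes are an assumption; the abstract only states that
    assignment was random.
    """
    rng = random.Random(seed)
    ids = list(range(1, n_animals + 1))
    rng.shuffle(ids)
    names = list(GROUPS)
    return {name: sorted(ids[i::len(names)]) for i, name in enumerate(names)}


def sucrose_schedule(group, week, n_trials, rng):
    """Return one session's trial outcomes: True means the CS was followed by sucrose."""
    p = GROUPS[group][week - 1]
    return [rng.random() < p for _ in range(n_trials)]


if __name__ == "__main__":
    rng = random.Random(1)
    for name, animals in assign_groups().items():
        week1 = sum(sucrose_schedule(name, 1, 40, rng))
        week2 = sum(sucrose_schedule(name, 2, 40, rng))
        print(f"Group {name}: n={len(animals)}, rewarded trials wk1={week1}, wk2={week2}")
```

Drawing each trial independently is one way to realize a 25% schedule; a fixed proportion of rewarded trials per session would be an equally plausible reading of the abstract.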
Recent work has advanced our knowledge of phasic dopamine reward prediction error signals. The error signal is bidirectional, reflects well the higher order prediction error described by temporal difference learning models, is compatible with model-free and model-based reinforcement learning, reports the subjective rather than physical reward value during temporal discounting and reflects subjective stimulus perception rather than physical stimulus aspects. Dopamine activations are primarily driven by reward, and to some extent risk, whereas punishment and salience have only limited activating effects when appropriate controls are respected. The signal is homogeneous in terms of time course but heterogeneous in many other aspects. It is essential for synaptic plasticity and a range of behavioural learning situations.
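The bidirectional prediction error that temporal difference learning models describe can be illustrated with a minimal TD(0) sketch in Python. It is a textbook-style toy, not a model from the article: the two-state trial (cue, then delay, with reward delivered at the end of the delay), the learning rate, the discount factor, and the function name `run_trial` are all assumptions chosen for illustration.

```python
def run_trial(values, reward, alpha=0.1, gamma=1.0):
    """Run one trial through the states in order and apply TD(0) updates.

    values : list of learned state values, updated in place
    reward : reward delivered on leaving the final state (0.0 if omitted)
    Returns the prediction error computed at each state:
        delta = r + gamma * V(next state) - V(current state)
    """
    n = len(values)
    deltas = []
    for s in range(n):
        r = reward if s == n - 1 else 0.0
        # The value after the final state is taken to be zero (end of trial).
        next_value = values[s + 1] if s + 1 < n else 0.0
        delta = r + gamma * next_value - values[s]
        values[s] += alpha * delta
        deltas.append(delta)
    return deltas


if __name__ == "__main__":
    values = [0.0, 0.0]  # state 0 = cue, state 1 = delay before reward
    print("first rewarded trial:", run_trial(values, reward=1.0))
    for _ in range(500):
        run_trial(values, reward=1.0)
    print("well-learned trial:  ", run_trial(values, reward=1.0))
    print("reward omitted:      ", run_trial(values, reward=0.0))
```

In this sketch an unexpected reward gives a positive error, a fully predicted reward gives an error near zero, and an omitted reward gives a negative error, which is the sense in which the signal is bidirectional.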
Reward enhancement by nicotine has been suggested as an important phenomenon contributing toward tobacco abuse and dependence. Reinforcement value is a multifaceted construct not fully represented by any single measure of response strength. The present study evaluated the changes in the reinforcement value of a visual stimulus in 16 male Sprague-Dawley rats using the reinforcer demand technique proposed by Hursh and Silberberg. The different parameters of the model have been shown to represent differing facets of reinforcement value, including intensity, perseverance, and sensitivity to changes in response cost. Rats lever-pressed for 1-min presentations of a compound visual stimulus over blocks of 10 sessions across a range of response requirements (fixed ratio 1, 2, 4, 8, 14, 22, 32). Nicotine (0.4 mg/kg, base) or saline was administered 5 min before each session. Estimates from the demand model were calculated between nicotine and saline administration conditions within subjects and changes in reinforcement value were assessed as differences in Q0, Pmax, Omax, and essential value. Nicotine administration increased operant responding across the entire range of reinforcement schedules tested, and uniformly affected model parameter estimates in a manner suggesting increased reinforcement value of the visual stimulus.
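For readers unfamiliar with the demand approach, the following Python sketch evaluates the exponential demand equation of Hursh and Silberberg, log10 Q = log10 Q0 + k(e^(-αQ0C) - 1), at the fixed-ratio requirements listed in the abstract. The parameter values are hypothetical and are not the study's estimates, and the function names are illustrative. Pmax and Omax are approximated here as the best price and peak output on the tested grid, which is a simplification; essential value is not computed.

```python
import math

# Response requirements (fixed ratios) listed in the abstract, treated as unit prices.
FR_PRICES = [1, 2, 4, 8, 14, 22, 32]


def consumption(price, q0, alpha, k):
    """Predicted reinforcers earned at a given price under exponential demand."""
    log_q = math.log10(q0) + k * (math.exp(-alpha * q0 * price) - 1.0)
    return 10 ** log_q


def demand_summary(prices, q0, alpha, k):
    """Approximate Pmax and Omax on the tested price grid.

    Output (total responding) at each price is price * consumption; Omax is
    the largest output and Pmax is the price at which it occurs.
    """
    outputs = [(p * consumption(p, q0, alpha, k), p) for p in prices]
    omax, pmax = max(outputs)
    return {"Q0": q0, "Pmax": pmax, "Omax": omax}


if __name__ == "__main__":
    # Hypothetical parameters: a smaller alpha means demand is less sensitive to price.
    conditions = {"less elastic": 0.005, "more elastic": 0.01}
    for label, alpha in conditions.items():
        print(label, demand_summary(FR_PRICES, q0=20.0, alpha=alpha, k=2.0))
```

Under this equation, a lower α with Q0 and k held fixed yields a higher Omax, which is the direction of change that would indicate greater reinforcement value.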
This article describes a laboratory system for running learning experiments in operant chambers with various species. It is based on a modern version of a classical learning chamber for operant conditioning, the so-called “Skinner box”. Rather than constituting a stand-alone unit, as is usually the case, it is an integrated part of a comprehensive technical solution, thereby eliminating a number of practical problems that are frequently encountered in research on animal learning and behavior. The Vienna comparative cognition technology combines modern computer, stimulus presentation, and reinforcement technology with flexibility and user-friendliness, which allows for efficient, widely automatized across-species experimentation, and thus makes the system appropriate for use in a broad range of learning tasks.
RATIONALE: The orexin (Orx)/hypocretin system has been implicated in reward-seeking, especially for highly salient food and drug rewards. We recently demonstrated that signaling at the OxR1 receptor is involved in sucrose reinforcement and reinstatement of sucrose-seeking elicited by sucrose-paired cues in food-restricted rats. Because sucrose reinforcement has both a hedonic and caloric component, it remains unknown what aspect of this reward drives its reinforcing value. OBJECTIVES: The present study examined the involvement of the Orx system in operant responding for saccharin, a noncaloric, hedonic (sweet) reward, and in cue-induced reinstatement of extinguished saccharin-seeking in ad libitum-fed vs food-restricted male subjects. METHODS: Male Sprague Dawley rats were fed ad libitum or food-restricted and trained to self-administer saccharin. We determined the effects of pretreatment with the OxR1 receptor antagonist SB-334867 (SB; 10-30 mg/kg) on fixed ratio (FR) saccharin self-administration and on cue-induced reinstatement of extinguished saccharin-seeking. RESULTS: SB decreased responding and number of reinforcers earned during FR responding for saccharin and decreased cue-induced reinstatement of extinguished saccharin-seeking. All of these effects were obtained similarly in food-restricted and ad libitum-fed rats. CONCLUSIONS: These results indicate that signaling at the OxR1 receptor is involved in saccharin reinforcement and reinstatement of saccharin-seeking elicited by saccharin-paired cues regardless of food restriction. These findings lead us to conclude that the Orx system contributes to the motivational effects of hedonic food rewards, independently of caloric value and homeostatic needs.