Competition between memories for reactivation as a mechanism for long-delay credit assignment
Subhadra Mokashe*1, Paul Miller2
1 Neuroscience Graduate Program, Brandeis University, Waltham, USA
2 Department of Biology, Brandeis University, Waltham, USA
*Email: subhadram@brandeis.edu
Introduction Animals learn to associate an event with its outcome. In conditioned taste aversion, an animal acquires an aversion to a conditioned stimulus (CS, a recently experienced taste) if sickness is induced later [1]. Overshadowing arises when an intervening taste (the interfering stimulus, IS) gains some of the credit for causing the outcome, thereby reducing the aversion to the CS [2]. Known short-term correlational plasticity mechanisms do not fully explain how networks of neurons achieve long-delay credit assignment. We hypothesize that reactivation of stimuli during sickness drives specific associative learning between those stimuli and the sickness, and that competition between the stimuli for reactivation could explain overshadowing.
Methods We build a spiking recurrent network model with clustered connectivity among excitatory neurons and unstructured inhibitory feedback. We assume that the recurrent strengths are enhanced at the time of stimulus presentation by Hebbian mechanisms and then decay over time. Because the IS is introduced after the CS, its ensemble has had less time to decay and so has a higher recurrent strength than the CS ensemble. When we simulate the network, we see reactivation of both tastes (Fig. 1 A). We take the fraction of time the network spends reactivating a stimulus as a readout of that stimulus's association with the outcome (sickness). We vary the interstimulus interval by changing the difference in recurrent strengths (Δ), and we vary the delay to sickness by varying the recurrent strengths themselves.
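The readout described above, the fraction of time spent reactivating each stimulus, can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the input format (mean firing rate per ensemble per time bin), the winner-take-all labeling, and the threshold value are all assumptions introduced here.

```python
import numpy as np

def occupancy_fractions(rates, threshold):
    """Fraction of time bins in which each ensemble is reactivated.

    rates: array of shape (T, n_ensembles) holding the mean firing rate of
           each excitatory cluster in each time bin (hypothetical input).
    threshold: rate above which an ensemble counts as active (assumed).
    """
    rates = np.asarray(rates, dtype=float)
    # Assign each bin to the ensemble with the highest rate, but only if
    # that rate exceeds the threshold; otherwise no ensemble is active.
    winner = rates.argmax(axis=1)
    active = rates.max(axis=1) > threshold
    n_ensembles = rates.shape[1]
    return np.array([np.mean(active & (winner == k)) for k in range(n_ensembles)])

# Toy data: column 0 is the CS ensemble, column 1 is the IS ensemble.
rates = np.array([
    [2.0, 30.0],   # IS reactivated
    [1.0, 28.0],   # IS reactivated
    [25.0, 3.0],   # CS reactivated
    [2.0, 2.0],    # neither ensemble active
    [1.0, 31.0],   # IS reactivated
])
frac_cs, frac_is = occupancy_fractions(rates, threshold=10.0)
print(f"CS fraction: {frac_cs:.2f}, IS fraction: {frac_is:.2f}")
```

In this toy data the IS ensemble wins three of the five bins and the CS ensemble one, so the IS would receive the larger share of the association under this readout.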
Results As Δ increases, the time spent in the IS state increases and the time spent in the CS state decreases (Fig. 1 B). Although we changed only the recurrent strength of the IS ensemble, the time spent in the CS state was also affected. This indicates competition between the memories for reactivation, which accounts for overshadowing. Paradoxically, when the CS-to-IS interval is held constant, later sickness onset produces more conditioning to the CS than earlier sickness onset [2]. We can explain this result through greater time spent in the CS state (Fig. 1 D), given an appropriate decay profile of the recurrent weights (Fig. 1 C), such that the reduction in overshadowing outweighs the reduction in conditioning with increased delay.
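The two opposing effects of a longer delay to sickness can be illustrated with a small calculation. The sketch below assumes an exponential decay of the Hebbian boost with a shared time constant; the abstract only states that an appropriate decay profile is needed, so the functional form and every parameter value here are hypothetical. It shows that, for a fixed CS-to-IS interval, both the CS recurrent strength and Δ shrink as the delay grows. Whether the net effect is a rebound in CS reactivation depends on the network dynamics, which this sketch does not model.

```python
import numpy as np

def recurrent_strength(t_since_stimulus, w_base=1.0, w_boost=0.5, tau=4.0):
    """Recurrent strength after a Hebbian boost that decays over time.

    The exponential form and all parameter values are illustrative
    assumptions, not taken from the model in the abstract.
    """
    t = np.asarray(t_since_stimulus, dtype=float)
    return w_base + w_boost * np.exp(-t / tau)

cs_to_is_interval = 1.0                          # held constant (arbitrary units)
delay_to_sickness = np.array([0.5, 2.0, 8.0])    # time from IS to sickness onset

w_is = recurrent_strength(delay_to_sickness)
w_cs = recurrent_strength(delay_to_sickness + cs_to_is_interval)
delta = w_is - w_cs                              # advantage of the IS ensemble

for d, wc, dl in zip(delay_to_sickness, w_cs, delta):
    print(f"delay={d:4.1f}  w_CS={wc:.3f}  delta={dl:.3f}")
```

Under these assumptions a longer delay weakens the CS ensemble, which on its own reduces conditioning, while also narrowing the IS advantage Δ, which reduces overshadowing.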
Discussion How actions are associated with delayed outcomes is not well understood. We explore the reactivation of memories as a mechanism for long-delay credit assignment in conditioned taste aversion (CTA). We show that competition between memories for reactivation could explain how credit is assigned when the cause of an outcome is ambiguous. We use theoretical predictions to constrain our model and are able to explain experimental findings on overshadowing [2]. This mechanism could account for credit assignment not only in CTA and overshadowing but also in other forms of long-delay learning.
Figure 1. A. Reactivation of the stimuli. B. Fraction of time spent by the network in the stimulus states as a function of Δ. C. Time spent in the CS state as a function of recurrent strength and Δ, with a specific decay profile of the recurrent weights (red line). D. Rebound in the time spent in the CS state as a function of delay to sickness onset, seen only in the presence of the IS (red line).
Acknowledgements
We acknowledge Donald Katz and Hannah Germaine for discussions about the work. We thank the NIH (NINDS) for funding via R01 NS104818.
References
● https://doi.org/10.1037/h0029807
● https://doi.org/10.3758/s13420-016-0246-x