
G. Neural activity related to reward prediction error and

IV. DISCUSSION

We compared the activities of CA3 and CA1 in a dynamic foraging situation to gain insight into the hippocampal processes that underlie value-based decision making. We found that neural signals for reward prediction error and updated chosen value were significantly stronger in CA1 than in CA3 in rats. Collectively, these results indicate that CA1 plays a more important role than CA3 in keeping track of the values of potential choices based on the history of past choices and their outcomes.

In rats, reward prediction error and updated chosen value signals were significantly stronger in CA1 than in CA3 when the animal’s choice outcome was revealed. However, advanced choice signals were found in neither structure. This result suggests that CA1 is not directly involved in the action selection process per se or in controlling the exploitation vs. exploration trade-off (randomness in action selection), but is likely to influence action selection indirectly via its role in updating the values associated with different options. Our results consistently indicate a selective role of CA1 in value learning, but not in action selection.
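As an illustration of the quantities discussed above, the following minimal sketch shows how a reward prediction error and an updated chosen value can be computed on a single trial under a simple model-free update rule, together with a softmax rule in which one parameter sets the randomness of action selection. It is not the model fitted in this study; the function names, the learning rate `alpha`, and the inverse temperature `beta` are assumptions introduced only for this example.

```python
import math


def update_chosen_value(values, choice, reward, alpha):
    """Return (rpe, new_values) after one trial.

    values : dict mapping each option to its current value estimate
    choice : key of the option that was chosen on this trial
    reward : obtained outcome (e.g., 1.0 for reward, 0.0 for no reward)
    alpha  : learning rate (hypothetical parameter, 0 < alpha <= 1)
    """
    # Reward prediction error: obtained outcome minus the chosen value.
    rpe = reward - values[choice]
    # Only the chosen option is updated; unchosen values are left unchanged.
    new_values = dict(values)
    new_values[choice] = values[choice] + alpha * rpe
    return rpe, new_values


def prob_choose_left(values, beta):
    """Softmax probability of choosing 'left' over 'right'.

    beta is a hypothetical inverse temperature: a larger beta gives more
    exploitation, a smaller beta gives more random (exploratory) choices.
    """
    diff = values["left"] - values["right"]
    return 1.0 / (1.0 + math.exp(-beta * diff))


if __name__ == "__main__":
    values = {"left": 0.5, "right": 0.5}
    rpe, values = update_chosen_value(values, "left", reward=1.0, alpha=0.2)
    print("RPE:", rpe)
    print("updated values:", values)
    print("P(left) on next trial:", prob_choose_left(values, beta=3.0))
```

In this sketch, starting from equal values of 0.5, a rewarded left choice with `alpha = 0.2` gives a reward prediction error of 0.5 and an updated left value of 0.6, while the unchosen right value stays at 0.5. Value updating and action selection are kept in separate functions, which mirrors the distinction drawn above between value learning and action selection.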

Several lines of evidence indicate that different types of memory are supported by different neural systems. From the standpoint of reinforcement learning (RL) theory, the dorsolateral striatum has been proposed to mediate model-free RL (or incremental value learning based on actual outcomes), whereas the hippocampus has been proposed to contribute to model-based RL (or knowledge-based value learning) based on its role in remembering facts and events and simulating hypothetical episodes (Lee et al., 2012b; Doll et al., 2012). These theories often assume that the striatum is in charge of gradual associative learning based on actual experiences, whereas the hippocampus is involved in ‘cognitive’ learning of facts and events (such as one-trial learning, vicarious learning, and forming cognitive maps). However, our results indicate that the hippocampus, especially CA1, may also contribute to incremental value learning in a dynamic foraging situation.

What is the neural basis of a value signal that is stronger in CA1 than in CA3? One possibility is that dopaminergic projections and their effects differ between CA3 and CA1. Dopaminergic projections from the ventral tegmental area (VTA) differ between CA3 and CA1, as do the distributions of dopamine receptor subtypes (Gasbarri et al., 1997; Shohamy and Adcock, 2010). Dopamine is known to carry reward prediction error (RPE) signals (Schultz et al., 1997; Roesch et al., 2007; Cohen et al., 2012) and regulates synaptic plasticity and transmission in CA1 (e.g., Frey et al., 1990; Otmakhova and Lisman, 1996; Li et al., 2003; O’Carroll and Morris, 2004; Zhang et al., 2009; Brzosko et al., 2015; Rosen et al., 2015; see also Hansen and Manahan-Vaughan, 2014 for review). Dopamine might therefore affect CA3 and CA1 neurons differently through these mechanisms, so that value-related activity in CA1 is regulated independently of CA3. This possibility is supported by a previous study showing that inactivation of the VTA changes the spatial firing of place cells in CA1, but not in CA3 (Martig and Mizumori, 2011). Moreover, other afferent projections to CA1, such as CA2 projections (Tamamaki et al., 1988; Shinohara et al., 2012; Kohara et al., 2014), direct layer III entorhinal cortical projections (Witter, 1986, 1993; Amaral, 1993), thalamic projections (Herkenham, 1978; Wouterlood et al., 1990) and prefrontal cortical projections (Rajasethupathy et al., 2015), may contribute to value-related neural activity of CA1 neurons. Future studies that combine manipulation of specific afferent projections with monitoring of CA1 neuronal activity may help clarify the role of dopaminergic and other afferent projections in CA1 value processing.

Recent research indicates that the hippocampus plays an important role in imagining future episodes (Buckner, 2010; Schacter et al., 2012; Gaesser et al., 2013; Mullally and Maguire, 2014). In rats, hippocampal place cells show sequential neural activity (replay) during sleep and during awake immobility in a task, and these sequences reflect not only experienced but also unexperienced trajectories (e.g., Louie and Wilson, 2001; Lee and Wilson, 2002; Foster and Wilson, 2006; Diba and Buzsaki, 2007; Carr et al., 2011; Johnson and Redish, 2007; Gupta et al., 2010; Dragoi and Tonegawa, 2011; Pfeiffer and Foster, 2013). Our results raise the possibility that value signals represented in CA1 contribute to the replay of place cells. Consistent with this possibility, trajectories reconstructed from replays of CA1 place cells are preferentially directed not only to previously visited but also to unvisited (but observed) reward locations in rats (Foster and Wilson, 2006; Pfeiffer and Foster, 2013; Olafsdottir et al., 2015). The involvement of value-related CA1 neurons in replay may provide a way of evaluating the expected values of replayed place cell sequences, which would be useful for simulating the most probable and most rewarding event sequences (or trajectories) in order to maximize value.

V. CONCLUSIONS

The hippocampus is known to play a crucial role in declarative memory (remembering facts and events), but not in gradual stimulus-response or response-outcome association, in which the striatum is known to play an important role (McDonald and White, 1993; Packard and Knowlton, 2002). From the standpoint of reinforcement learning (RL) theory, the dorsolateral striatum has been proposed to mediate model-free RL (or incremental value learning based on actual outcomes), whereas the hippocampus has been proposed to contribute to model-based RL (or knowledge-based value learning) based on its role in remembering facts and events and simulating hypothetical episodes (Lee et al., 2012b; Doll et al., 2012). However, our results indicate that the hippocampus, especially CA1, contributes to incremental value learning in a dynamic foraging situation. The modality of the information to be remembered might be a factor that determines whether the hippocampus is involved in incremental learning.

REFERENCES

1. Amaral DG. Emerging principles of intrinsic hippocampal organization. Curr. Opin. Neurobiol. 1993; 3: 225–229.

2. Amaral DG, Ishizuka N, Claiborne B. Neurons, numbers and the hippocampal network. Prog Brain Res. 1990; 83: 1-11.

3. Ambrose RE, Pfeiffer BE, Foster DJ. Reverse Replay of Hippocampal Place Cells Is Uniquely Modulated by Changing Reward. Neuron. 2016 Sep 7; 91(5): 1124-36.

4. Baeg EH, Kim YB, Jang JH, Kim HT, Mook-Jung IH, Jung MW. Fast spiking and regular spiking neural correlates of fear conditioning in the medial prefrontal cortex of the rat. Cereb Cortex. 2001 May; 11(5): 441-51.

5. Barnes CA, McNaughton BL, Mizumori SJ, Leonard BW, Lin LH. Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Prog Brain Res. 1990; 83: 287-300.

6. Barraclough DJ, Conroy ML, Lee D. Prefrontal cortex and decision making in a mixed-strategy game. Nat Neurosci 2004; 7: 404-410.

7. Baum WM. On two types of deviation from the matching law: bias and undermatching. J Exp Anal Behav. 1974 Jul; 22(1): 231-42.

8. Bornstein AM, Daw ND. Cortical and hippocampal correlates of deliberation during model-based decisions for rewards in humans. PLoS Comput Biol. 2013; 9(12).

9. Brzosko Z, Schultz W, Paulsen O. Retroactive modulation of spike timing-dependent plasticity by dopamine. Elife. 2015 Oct 30; 4.

10. Buckner RL. The role of the hippocampus in prediction and imagination. Annu Rev Psychol. 2010; 61: 27–48.

11. Carr MF, Jadhav SP, and Frank LM. Hippocampal replay in the awake state: a potential substrate for memory consolidation and retrieval. Nat. Neurosci. 2011; 14: 147–153.

12. Cohen JY, Haesler S, Vong L, Lowell BB, Uchida N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature. 2012 Jan 18; 482(7383): 85-8.

13. Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ. Cortical substrates for exploratory decisions in humans. Nature. 2006; 441: 876-879.

14. Diba K, and Buzsáki G. Forward and reverse hippocampal place-cell sequences during ripples. Nat. Neurosci. 2007; 10: 1241–1242.

15. Doll BB, Simon DA, Daw ND. The ubiquity of model-based reinforcement learning. Curr. Opin. Neurobiol. 2012 Dec; 22(6): 1075-1081.

16. Dragoi G, and Tonegawa S. Preplay of future place cell sequences by hippocampal cellular assemblies. Nature. 2011; 469: 397–401.

17. Fenton AA, Lytton WW, Barry JM, Lenck-Santini PP, Zinyuk LE, Kubik S, Bures J, Poucet B, Muller RU, Olypher AV. Attention-like modulation of hippocampus place cell discharge. J Neurosci. 2010; 30: 4613-4625.

18. Foster DJ, and Wilson MA. Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature. 2006; 440: 680–683.

19. Frey U, Schroeder H, Matthies H. Dopaminergic antagonists prevent long-term maintenance of posttetanic LTP in the CA1 region of rat hippocampal slices. Brain Res. 1990 Jul 2; 522(1): 69-75.

20. Gaesser B, Spreng RN, McLelland VC, Addis DR, Schacter DL. Imagining the future: evidence for a hippocampal contribution to constructive processing. Hippocampus. 2013 Dec; 23(12): 1150-61.

21. Gasbarri A, Sulli A, Packard MG. The dopaminergic mesencephalic projections to the hippocampal formation in the rat. Prog Neuropsychopharmacol Biol Psychiatry. 1997 Jan; 21(1): 1-22.

22. Gupta AS, van der Meer MA, Touretzky DS, and Redish AD. Hippocampal replay is not a simple function of experience. Neuron. 2010; 65: 695–705.

23. Hansen N, Manahan-Vaughan D. Dopamine D1/D5 receptors mediate informational saliency that promotes persistent hippocampal long-term plasticity. Cereb Cortex. 2014 Apr; 24(4): 845-58.

24. Herkenham M. The connections of the nucleus reuniens thalami: evidence for a direct thalamo-hippocampal pathway in the rat. J. Comp. Neurol. 1978; 177: 589–609.

25. Huh N, Jo S, Kim H, Sul JH, Jung MW. Model-based reinforcement learning under concurrent schedules of reinforcement in rodents. Learn Mem. 2009 Apr 29;16(5):315-23.

26. Ito M, Doya K. Multiple representations and algorithms for reinforcement learning in the cortico-basal ganglia circuit. Curr Opin Neurobiol. 2011; 21: 368-373.

27. Jackson JC, Johnson A, Redish AD. Hippocampal sharp waves and reactivation during awake states depend on repeated sequential experience. J Neurosci. 2006 Nov 29; 26(48): 12415-26.

28. Johnson A, and Redish AD. Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. J. Neurosci. 2007; 27: 12176–12189.

29. Kim H, Sul JH, Huh N, Lee D, Jung MW. Role of striatum in updating values of chosen actions. J Neurosci. 2009 Nov 25; 29(47): 14701-12.

30. Kim H, Lee D, Jung MW. Signals for previous goal choice persist in the dorsomedial, but not dorsolateral striatum of rats. J Neurosci. 2013 Jan 2; 33(1): 52-63.

31. Kohara K, Pignatelli M, Rivest AJ, Jung HY, Kitamura T, Suh J. Cell type-specific genetic and optogenetic tools reveal hippocampal CA2 circuits. Nat. Neurosci. 2014; 17: 269–279.

32. Kuhl BA, Shah AT, DuBrow S, Wagner AD. Resistance to forgetting associated with hippocampus-mediated reactivation during new learning. Nat Neurosci. 2010 Apr; 13(4): 501-6.

33. Lau B, Glimcher PW. Dynamic response-by-response models of matching behavior in rhesus monkeys. J Exp Anal Behav. 2005 Nov; 84(3): 555-79.

34. Lee AK, and Wilson MA. Memory of sequential experience in the hippocampus during slow wave sleep. Neuron. 2002; 36: 1183–1194.

35. Lee D, Conroy ML, McGreevy BP, Barraclough DJ. Reinforcement learning and decision making in monkeys during a competitive game. Brain Res Cogn Brain Res. 2004a; 22: 45-58.

36. Lee H, Ghim JW, Kim H, Lee D, Jung MW. Hippocampal neural correlates for values of experienced events. J Neurosci. 2012a, Oct 24; 32(43): 15053-65.

37. Lee D, Seo H, Jung MW. Neural basis of reinforcement learning and decision making. Annu Rev Neurosci. 2012b; 35: 287-308.

38. Lee I, Griffin AL, Zilli EA, Eichenbaum H, Hasselmo ME. Gradual translocation of spatial correlates of neuronal firing in the hippocampus toward prospective reward locations. Neuron. 2006; 51: 639-650.

39. Lee I, Yoganarasimha D, Rao G, Knierim JJ. Comparison of population coherence of place cells in hippocampal subfields CA1 and CA3. Nature. 2004b, July; 430(6998): 456-459.

40. Leutgeb JK, Leutgeb S, Treves A, Meyer R, Barnes CA, McNaughton BL, Moser MB, Moser EI. Progressive transformation of hippocampal neuronal representations in "morphed" environments. Neuron. 2005; 48: 345-358.

41. Leutgeb S, Leutgeb JK, Treves A, Moser MB, Moser EI. Distinct ensemble codes in hippocampal areas CA3 and CA1. Science. 2004 Aug 27; 305(5688): 1295-8.

42. Li S, Cullen WK, Anwyl R, Rowan MJ. Dopamine-dependent facilitation of LTP induction in hippocampal CA1 by exposure to spatial novelty. Nat Neurosci. 2003 May; 6(5): 526-31.

43. Louie K, and Wilson MA. Temporally structured replay of awake hippocampal ensemble activity during rapid eye movement sleep. Neuron. 2001; 29: 145–156.

44. Lu L, Igarashi KM, Witter MP, Moser EI, Moser MB. Topography of place maps along the CA3-to-CA2 axis of the hippocampus. Neuron. 2015; 87: 1078-1092.

45. Markus EJ, Qin YL, Leonard B, Skaggs WE, McNaughton BL, Barnes CA. Interactions between location and task affect the spatial and directional firing of hippocampal neurons. J Neurosci. 1995; 15: 7079-7094.

46. Marr D. Simple memory: a theory for archicortex. Philos Trans R Soc Lond B Biol Sci. 1971; 262: 23-81.

47. Martig AK, Mizumori SJ. Ventral tegmental area disruption selectively affects CA1/CA2 but not CA3 place fields during a differential reward working memory task. Hippocampus. 2011 Feb; 21(2): 172-84.

48. McDonald RJ, White NM. A triple dissociation of memory systems: hippocampus, amygdala, and dorsal striatum. Behav Neurosci. 1993; 107(1): 3-22.

49. McNaughton BL, and Morris RG. Hippocampal synaptic enhancement and information storage within a distributed memory system. Trends Neurosci. 1987; 10: 408-415.

50. Mizuseki K, Royer S, Diba K, Buzsaki G. Activity dynamics and behavioral correlates of CA3 and CA1 hippocampal pyramidal neurons. Hippocampus. 2012 Aug; 22(8): 1659-80.

51. Moita MA, Rosis S, Zhou Y, LeDoux JE, Blair HT. Putting fear in its place: remapping of hippocampal place cells during fear conditioning. J Neurosci. 2004; 24: 7015-7023.

52. Mullally SL, Maguire EA. Memory, Imagination, and Predicting the Future: A Common Brain Mechanism? Neuroscientist. 2013 Jul 11; 20(3): 220-234.

53. Muller RU, Kubie JL. The effects of changes in the environment on the spatial firing of hippocampal complex-spike cells. J Neurosci. 1987; 7: 1951-1968.

54. Neter J, Kutner MH, Nachtsheim CJ, and Wasserman W. Applied Linear Statistical Models, Vol. 4. Chicago, IL: Irwin; 1996, p. 318.

55. O'Carroll CM, Morris RG. Heterosynaptic co-activation of glutamatergic and dopaminergic afferents is required to induce persistent long-term potentiation. Neuropharmacology. 2004 Sep; 47(3): 324-32.

56. O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ. Temporal difference models and reward-related learning in the human brain. Neuron. 2003; 38: 329-337.

57. Olafsdottir HF, Barry C, Saleem AB, Hassabis D, Spiers HJ. Hippocampal place cells construct reward related sequences through unexplored space. Elife. 2015; 4.

58. O'Keefe J, Nadel L. The hippocampus as a cognitive map. Oxford: Clarendon Press; 1978.

59. O'Keefe J, Dostrovsky J. The hippocampus as a spatial map. Preliminary evidence from unit activity in the freely-moving rat. Brain Res. 1971; 34: 171-175.

60. Otmakhova NA, Lisman JE. D1/D5 dopamine receptor activation increases the magnitude of early long-term potentiation at CA1 hippocampal synapses. J Neurosci. 1996 Dec 1; 16(23): 7478-86.

61. Packard MG, Knowlton BJ. Learning and memory functions of the basal ganglia. Annu Rev Neurosci. 2002; 25(1): 563-593.

62. Pfeiffer BE, Foster DJ. Hippocampal place-cell sequences depict future paths to remembered goals. Nature. 2013; 497: 74-79.

63. Rajasethupathy P, Sankaran S, Marshel JH, Kim CK, Ferenczi E, Lee SY. Projections from neocortex mediate top-down control of memory retrieval. Nature. 2015; 526: 653–659.

64. Ranck JB Jr. Studies on single neurons in dorsal hippocampal formation and septum in unrestrained rats. I. Behavioral correlates and firing repertoires. Exp Neurol. 1973 Nov; 41(2): 461-531.

65. Roesch MR, Calu DJ, Schoenbaum G. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat Neurosci. 2007 Dec; 10(12): 1615-24.

66. Rolls ET, and Treves A. Neural networks and brain function, Vol 572. Oxford: Oxford University Press; 1998.

67. Rosen ZB, Cheung S, Siegelbaum SA. Midbrain dopamine neurons bidirectionally regulate CA3-CA1 synaptic drive. Nat Neurosci. 2015 Dec; 18(12): 1763-71.

68. Samejima K, Ueda Y, Doya K, Kimura M. Representation of action-specific reward values in the striatum. Science. 2005; 310: 1337-1340.

69. Schacter DL, Addis DR, Hassabis D, Martin VC, Spreng RN, Szpunar KK. The future of memory: remembering, imagining, and the brain. Neuron. 2012 Nov 21; 76(4): 677-94.

70. Schmitzer-Torbert N, Jackson J, Henze D, Harris K, Redish AD. Quantitative measures of cluster quality for use in extracellular recordings. Neuroscience. 2005; 131: 1-11.

71. Schultz W, Dayan P, Montague PR. A neural substrate of prediction and reward. Science. 1997 Mar 14; 275(5306): 1593-9.

72. Shohamy D, Adcock RA. Dopamine and adaptive memory. Trends Cogn Sci. 2010 Oct; 14(10): 464-72.

73. Singer AC, Frank LM. Rewarded outcomes enhance reactivation of experience in the hippocampus. Neuron. 2009 Dec 24; 64(6): 910-21.

74. Shinohara Y, Hosoya A, Yahagi K, Ferecskó AS, Yaguchi K, Sík A. Hippocampal CA3 and CA2 have distinct bilateral innervation patterns to CA1 in rodents. Eur. J. Neurosci. 2012; 35: 702–710.

75. Smith DM, Mizumori SJ. Hippocampal place cells, context, and episodic memory. Hippocampus. 2006; 16: 716-729.

76. Song EY, Kim YB, Kim YH, Jung MW. Role of active movement in place-specific firing of hippocampal neurons. Hippocampus. 2005; 15: 8-17.

77. Squire LR. Memory systems of the brain: a brief history and current perspective. Neurobiol Learn Mem. 2004 Nov; 82(3): 171-7.

78. Sul JH, Kim H, Huh N, Lee D, Jung MW. Distinct roles of rodent orbitofrontal and medial prefrontal cortex in decision making. Neuron. 2010 May 13; 66(3): 449-60.

79. Sul JH, Jo S, Lee D, Jung MW. Role of rodent secondary motor cortex in value-based action selection. Nat Neurosci. 2011 Aug 14; 14(9): 1202-8.

80. Sutton RS, Barto AG. Reinforcement Learning. Cambridge MA, MIT Press; 1998.

81. Tamamaki N, Abe K, and Nojyo Y. Three-dimensional analysis of the whole axonal arbors originating from single CA2 pyramidal neurons in the rat hippocampus with the aid of a computer graphic technique. Brain Res. 1988; 452: 255–272.

82. Tanaka SC, Doya K, Okada G, Ueda K, Okamoto Y, Yamawaki S. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nat Neurosci. 2004; 7: 887–893.

83. Tulving E. Episodic memory: from mind to brain. Annu Rev Psychol. 2002; 53: 1-25.

84. Vazdarjanova A, Guzowski JF. Differences in hippocampal neuronal population responses to modifications of an environmental context: evidence for distinct, yet complementary, functions of CA3 and CA1 ensembles. J Neurosci. 2004 Dec; 24(29): 6489-6496.

85. Wimmer GE, Shohamy D. Preference by association: How memory mechanisms in the hippocampus bias decisions. Science. 2012; 338: 270–273.

86. Witter MP. A survey of the anatomy of the hippocampal formation, with emphasis on the septotemporal organization of its intrinsic and extrinsic connections. Adv. Exp. Med. Biol. 1986; 203: 67–82.

87. Witter MP. Organization of the entorhinal-hippocampal system: a review of current anatomical data. Hippocampus. 1993; 3: 33–44.

88. Wood ER, Dudchenko PA, Robitsek RJ, Eichenbaum H. Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron. 2000; 27: 623-633.

89. Wouterlood FG, Saldana E, and Witter MP. Projection from the nucleus reuniens thalami to the hippocampal region: light and electron microscopic tracing study in the rat with the anterograde tracer Phaseolus vulgaris-leucoagglutinin. J. Comp. Neurol. 1990; 296: 179–203.

90. Zhang JC, Lau PM, Bi GQ. Gain in sensitivity and loss in temporal contrast of STDP by dopaminergic modulation at hippocampal synapses. Proc Natl Acad Sci USA. 2009 Aug 4; 106(31): 13028-33.

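-Korean Abstract-

A Comparative Study of Value-Related Neural Signals in Hippocampal CA3 and CA1

Ajou University, Department of Biomedical Sciences, Neuroscience Major, 이성현

(Advisors: 정민환, 김병곤)

To investigate how the hippocampus contributes to value-based decision making, we examined CA1 and CA3 of the rat hippocampus during a dynamic foraging task performed in a T-maze. The neural signals needed to update the value of the chosen action, namely the chosen value and the reward resulting from that choice, converged in both CA1 and CA3 when the reward was delivered at the reward site. However, signals for the choice outcome were stronger in CA1 than in CA3. In addition, whereas chosen value signals in CA3 decayed quickly, those in CA1 remained significant while the animal stayed at the reward site. Furthermore, reward prediction error and updated chosen value signals were also stronger in CA1 than in CA3. Together with a previous finding that chosen value signals are stronger in CA1 than in the subiculum, these results suggest that CA1 may be an important region for value judgments based on experienced events. Further studies are needed to determine whether and how these value-related signals contribute to hippocampal neural processes for evaluating experienced events.

Keywords: decision making, hippocampus, CA1, CA3, value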
