A Reconfigurable Two-WSe₂-Transistor Synaptic Cell for Reinforcement Learning

Advanced Materials (2022)

Abstract
Reward-modulated spike-timing-dependent plasticity (R-STDP) is a brain-inspired reinforcement learning (RL) rule with potential for decision-making tasks and artificial general intelligence. However, hardware implementation of the reward-modulation process in R-STDP usually requires complicated Si complementary metal–oxide–semiconductor (CMOS) circuit design, which incurs high power consumption and a large footprint. Here, a design with two synaptic transistors (2T) connected in parallel is experimentally demonstrated. The 2T unit, based on WSe₂ ferroelectric transistors, exhibits reconfigurable polarity: one channel can be tuned to n-type and the other to p-type through nonvolatile ferroelectric polarization. In this way, opposite synaptic weight-update behaviors are realized with multilevel (>6 bit) conductance states, ultralow nonlinearity (0.56/−1.23), and a large G_max/G_min ratio of 30. By applying positive/negative reward to the (anti-)STDP component of the 2T cell, R-STDP learning rules are realized for training a spiking neural network and are demonstrated to solve the classical cart–pole problem, pointing to a route toward low-power (32 pJ per forward process) and highly area-efficient (100 µm²) hardware for reinforcement learning.
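The reward-modulated update described in the abstract can be illustrated with a minimal software sketch. This is not the authors' device model or training code. It assumes a textbook exponential STDP window, a linear conductance ladder, and arbitrary conductance units. Only the G_max/G_min ratio of 30 and the roughly 6-bit number of levels come from the abstract. All function names and parameters (`stdp_window`, `r_stdp_update`, `tau`, `lr`) are hypothetical.

```python
import math

# Values taken from the abstract: G_max/G_min = 30 and >6-bit states.
# The absolute scale (arbitrary units) is an assumption.
G_MIN = 1.0
G_MAX = 30.0
LEVELS = 64  # 6-bit ladder, assumed evenly spaced (ideal linear device)


def stdp_window(dt, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Assumed exponential STDP window; dt = t_post - t_pre in ms."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau)   # pre before post: potentiate
    if dt < 0:
        return -a_minus * math.exp(dt / tau)  # post before pre: depress
    return 0.0


def r_stdp_update(g, dt, reward, lr=1.0):
    """Return the new conductance after one reward-modulated STDP event.

    A positive reward applies the STDP branch; a negative reward flips
    the sign of the update, which acts as anti-STDP.
    """
    step = (G_MAX - G_MIN) / (LEVELS - 1)
    dg = lr * reward * stdp_window(dt)
    n_steps = round(dg / step)          # quantize to whole conductance levels
    g_new = g + n_steps * step
    return min(G_MAX, max(G_MIN, g_new))  # clip to the device range


if __name__ == "__main__":
    g0 = 10.0
    print(r_stdp_update(g0, dt=5.0, reward=+1.0))  # conductance increases
    print(r_stdp_update(g0, dt=5.0, reward=-1.0))  # conductance decreases
```

In this sketch the sign of the reward selects between the two opposite update directions, loosely mirroring how the n-type and p-type channels of the 2T cell provide opposite weight-update behaviors. The measured nonlinearity (0.56/−1.23) is not modeled here.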
Keywords
synaptic cell,reinforcement learning