Academic Paper
65 GOPS/neuron Photonic Tensor Core with Thin-film Lithium Niobate Photonics
Document Type
Working Paper
Author
Lin, Zhongjin; Shastri, Bhavin J.; Yu, Shangxuan; Song, Jingxiang; Zhu, Yuntao; Safarnejadian, Arman; Cai, Wangning; Lin, Yanmei; Ke, Wei; Hammood, Mustafa; Wang, Tianye; Xu, Mengyue; Zheng, Zibo; Al-Qadasi, Mohammed; Esmaeeli, Omid; Rahim, Mohamed; Pakulski, Grzegorz; Schmid, Jens; Barrios, Pedro; Jiang, Weihong; Morison, Hugh; Mitchell, Matthew; Qiang, Xiaogang; Guan, Xun; Jaeger, Nicolas A. F.; Rusch, Leslie A.; Shekhar, Sudip; Shi, Wei; Yu, Siyuan; Cai, Xinlun; Chrostowski, Lukas
Abstract
Photonics offers a transformative approach to artificial intelligence (AI) and neuromorphic computing by providing low-latency, high-bandwidth, and energy-efficient computation. Here, we introduce a photonic tensor core processor enabled by time-multiplexed inputs and charge-integrated outputs. This fully integrated processor, comprising only two thin-film lithium niobate (TFLN) modulators, a III-V laser, and a charge-integration photoreceiver, can implement an entire layer of a neural network. It can execute 65 billion operations per second (GOPS) per neuron, including simultaneous weight updates, a hitherto unachieved speed. Unlike conventional photonic processors, whose weights are static once set during training, our processor supports fast "hardware-in-the-loop" training and can dynamically adjust the number of inputs (fan-in) and outputs (fan-out) within a layer, which enhances its versatility. Our processor can perform large-scale dot-product operations with vector dimensions up to 131,072. Furthermore, it successfully classifies (supervised learning) and clusters (unsupervised learning) 112 × 112-pixel images after "hardware-in-the-loop" training. To handle "hardware-in-the-loop" training for clustering AI tasks, we provide a solution, based on our processor, for multiplications involving two negative numbers.
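The dot product described in the abstract, with time-multiplexed inputs and a charge-integrated output, can be illustrated numerically. The Python sketch below is only a toy model under stated assumptions: each of the two modulators is treated as an ideal transmission in [0, 1], the photoreceiver is treated as a plain running sum, and signed values are handled with a simple offset encoding. The offset encoding is an assumption chosen for illustration and is not claimed to be the authors' solution for products of two negative numbers. All function names are hypothetical.

```python
def to_transmission(value):
    """Map a signed value in [-1, 1] to an idealized modulator transmission in [0, 1].

    Assumption: offset encoding, t = (value + 1) / 2.
    """
    if not -1.0 <= value <= 1.0:
        raise ValueError("value must lie in [-1, 1]")
    return (value + 1.0) / 2.0


def integrated_charge(inputs, weights):
    """Accumulate one product of two transmissions per time slot.

    Models two cascaded modulators followed by a charge integrator:
    the detected power in each slot is proportional to t_x * t_w.
    """
    charge = 0.0
    for x, w in zip(inputs, weights):
        charge += to_transmission(x) * to_transmission(w)
    return charge


def signed_dot(inputs, weights):
    """Recover the signed dot product from the non-negative integrated charge.

    Since sum((x + 1)(w + 1)) / 4 = (sum(x*w) + sum(x) + sum(w) + n) / 4,
    it follows that sum(x*w) = 4 * charge - sum(x) - sum(w) - n.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    n = len(inputs)
    charge = integrated_charge(inputs, weights)
    return 4.0 * charge - sum(inputs) - sum(weights) - n


if __name__ == "__main__":
    x = [0.5, -0.25, -1.0]
    w = [-0.5, 0.75, -1.0]
    print(signed_dot(x, w))
```

In this toy model the last pair, (-1.0, -1.0), contributes zero detected power, yet the correction terms still recover its positive product. A real device would additionally face noise, finite extinction ratio, and quantization, none of which are modeled here.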
Comment: 19 pages, 6 figures