CMOS-based area-and-power-efficient neuron and synapse circuits for time-domain analog spiking neural networks

Xiangyu Chen (xiangyuchenCN@hotmail.com), Systems Design Lab., School of Engineering, the University of Tokyo, Tokyo, Japan
Takeaki Yajima, Department of Electrical and Electronic Engineering, Kyushu University, Fukuoka, Japan
Hisashi Inoue, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan
Isao H. Inoue, National Institute of Advanced Industrial Science and Technology (AIST), Ibaraki, Japan
Zolboo Byambadorj, Systems Design Lab., School of Engineering, the University of Tokyo, Tokyo, Japan
Tetsuya Iizuka (iizuka@vdec.u-tokyo.ac.jp), Systems Design Lab., School of Engineering, the University of Tokyo, Tokyo, Japan
August 26, 2022
Abstract

Conventional neural structures tend to communicate through analog quantities such as currents or voltages. However, as CMOS devices shrink and supply voltages decrease, the dynamic range of voltage/current-domain analog circuits becomes narrower, the available margin becomes smaller, and noise immunity decreases. Moreover, the use of operational amplifiers (op-amps) and clocked or asynchronous comparators in conventional designs leads to high energy consumption and large chip area, which is detrimental to building spiking neural networks. In view of this, we propose a neural structure for generating and transmitting time-domain signals, including a neuron module, a synapse module, and two weight modules. The proposed neural structure is driven by leakage currents in the transistor triode region and uses neither op-amps nor comparators, thus providing higher energy and area efficiency than conventional designs. In addition, the structure provides greater noise immunity due to internal communication via time-domain signals, which simplifies the wiring between the modules. The proposed neural structure is fabricated using TSMC 65 nm CMOS technology. The proposed neuron and synapse occupy an area of 127 µm² and 231 µm², respectively, while achieving millisecond time constants. Actual chip measurements show that the proposed structure successfully implements the temporal signal communication function with millisecond time constants, which is a critical step toward hardware reservoir computing for human-computer interaction.


Improvements in computing power are now limited by the von Neumann bottleneck, while technology scaling is slowing down with the end of Moore’s law. As a result, researchers face unprecedented challenges, such as the growing cost and difficulty of meeting the ever-increasing demand for higher-performance computing (Merolla et al., 2014; Patrick et al., 2020). Although deep neural networks (DNNs), the second generation of artificial neural networks (ANNs), have developed rapidly in recent years, their huge energy consumption has forced researchers to look for alternatives (Zhang et al., 2020; Shin and Yoo, 2020; LeCun et al., 2015; Kohno and Aihara, 2008; Chicca and Indiveri, 2020; Bo et al., 2020). Spiking neural networks (SNNs) are favored as the third generation of ANNs because they mimic biological neurons more realistically. SNNs consist of neurons and synapses and are usually built using a bottom-up approach, which means that each component of the SNN needs to be designed first (Bo et al., 2020; Yang and Sengupta, 2020; Chen et al., 2022; Maass, 1997; Radhakrishnan et al., 2021; Chen et al., 2021; Seok, 2018).

Many hardware implementations of spiking neurons or synapses have been reported (Indiveri et al., 2006; Wu et al., 2015; Joubert et al., 2012; Aamir et al., 2018a; Basu and Hasler, 2010; Dutta et al., 2017; Rubino et al., 2019; Aamir et al., 2018b), and most of these conventional spiking neurons or synapses use analog quantities such as voltage and current to communicate with each other. However, as CMOS devices scale down and their supply voltage decreases, the dynamic range of voltage/current-domain analog circuits becomes narrower, the available margin becomes smaller, and their noise immunity degrades (Asada et al., 2018). On the other hand, because scaled transistors operate faster and produce sharp signal transitions, analog information can be represented more efficiently in the time domain, i.e., as the time interval between two signal transitions. Such time-domain circuits have another advantage in power efficiency, as they often consist of inverters or logic gates that ideally consume no DC power (Staszewski et al., 2004; Asada et al., 2018). Thus, time-domain circuits are well suited to future implementations of low-power SNNs. To implement the leaky integrate function of neurons, conventional designs usually build integrators with operational amplifiers (op-amps) (Wu et al., 2015) and often use large on-chip capacitors and resistors to mimic the millisecond time constants of biological neurons (Basu and Hasler, 2010; Aamir et al., 2018a). Moreover, to implement the neuron “fire” function, a clocked or asynchronous comparator is usually used to set the threshold for neuron excitation (Aamir et al., 2018b; Wu et al., 2015; Joubert et al., 2012; Indiveri et al., 2006; Aamir et al., 2018a). The bias current of an asynchronous comparator increases the power consumption of the neuron, while a clocked comparator requires additional clock signal distribution, and the complex comparator structure occupies a large chip area.

To address these issues, in this paper we propose an original neural structure for generating and transmitting time-domain signals to compose a time-domain neural network. The integrated structure includes neuron and synapse modules that respectively generate and transmit time-domain signals, as well as weight modules for learning functions. One of our main target applications is reservoir computing that processes human-activity-related information such as speech and handwriting. Gallicchio and Micheli (2011) demonstrated that learning performance increases when the time constant of the input effect is matched between the target function and the reservoir dynamics. We therefore use a millisecond time constant as the design target, so that the structure can be applied to recurrent SNNs that handle real-time time-series information and learn dynamic systems.

Figure 1: (a) The proposed structure, (b) Micrograph of the chip.

The proposed neural structure is shown in Fig. 1(a). It is based on the proposed neuron, synapse, and weight modules, which are described in detail below. In this structure, the input of the neuron module is connected to two weight modules, one for tuning the inhibitory signal and the other for the excitatory signal. We fabricated the proposed neural structure shown in Fig. 1(a) with TSMC 65 nm standard CMOS technology. The micrograph of the chip is shown in Fig. 1(b), where the die areas of the neuron, synapse and weight modules are 127 µm², 231 µm² and 525 µm², respectively.

Figure 2: (a) Simplified LIF neuron model, (b) Behavior of LIF neurons (Chen et al., 2022).

A simplified diagram of a biological neuron based on the leaky integrate-and-fire (LIF) model is shown in Fig. 2(a). The LIF neuron model consists mainly of a membrane capacitor, a leaky resistor, and a voltage comparator. The behavior of the LIF neuron, shown in Fig. 2(b), can be described as follows: the neuron receives signals from other neurons via synapses, and the soma generates action potentials in response to these external signals. If a neuron receives a sufficient number of spikes through the synapse, its membrane potential reaches a threshold value, causing the neuron to “fire” (Abbott and Dayan, 2005; Gerstner and Kistler, 2012; Chen et al., 2022).
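The LIF behavior described above can be illustrated with a short numerical sketch. It is a generic textbook model, not the fabricated circuit: it integrates C dV/dt = -V/R + I(t) with a forward-Euler step and resets the membrane potential when it crosses the threshold. The function name `simulate_lif`, the normalized (dimensionless) parameter values and the reset-to-zero rule are assumptions made for illustration.

```python
def simulate_lif(input_current, dt=1e-3, c_mem=1.0, r_leak=1.0,
                 v_th=0.5, v_reset=0.0):
    """Forward-Euler integration of C dV/dt = -V/R + I(t).

    All quantities are normalized (assumed values, not circuit values).
    Returns the membrane-potential trace and the list of spike times.
    """
    v = v_reset
    trace, spikes = [], []
    for step, i_in in enumerate(input_current):
        # Leaky integration: the capacitor is charged by the input and
        # discharged through the leak resistor.
        v += dt * (-v / r_leak + i_in) / c_mem
        if v >= v_th:
            # "Fire": record the spike and reset the membrane potential.
            spikes.append(step * dt)
            v = v_reset
        trace.append(v)
    return trace, spikes


# A constant drive whose steady state (I * R = 1.0) exceeds the threshold
# makes the neuron fire periodically; a weaker drive (I * R = 0.4) never does.
_, strong_spikes = simulate_lif([1.0] * 2000)
_, weak_spikes = simulate_lif([0.4] * 2000)
print(len(strong_spikes), len(weak_spikes))
```

In this sketch the comparator of the model is the `v >= v_th` test; the leak term is what prevents a weak input from ever reaching the threshold.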

Figure 3: (a) Circuit diagram of the proposed neuron module, (b) Behaviors of the proposed LIF neuron and synapse modules, (c) Circuit diagram of the proposed synapse module, (d) Circuit diagram of the proposed weight module.

The use of inverters to implement the “fire” function is already known as an alternative to comparators. Yajima (2022) proposed an inverter-based neuron that is well suited to the proposed neural structure; the neuron used in this study, shown in Fig. 3(a), is therefore based on that design. It consists of an input device, a leaky integrator device, a fire device and a delay device. The original circuit in Yajima (2022) was not designed as an element of a neural network and thus has no structure for receiving excitatory and inhibitory signals. In the proposed circuit, on the other hand, the input device consists of two transistors that receive an excitatory input and an inhibitory input, respectively. The inputs to these transistors are narrow pulse signals, as shown in Fig. 3(a), which are generated by a pre-stage synapse. The activity of the pre-stage synapse is represented by the pulse frequency, and the coupling weight is represented by the pulse width. When more than one pre-stage synapse is connected to compose a network, the multiple pulses can be applied through OR logic or by adding input devices connected in parallel. With parallel input devices, the neuron circuit can accept multiple pulses even at the same time.

Once the membrane potential rises to the threshold voltage, the fire device is activated (Fig. 3(b)(ii)). In conventional designs, LIF neurons mostly use comparators to set the threshold voltage, which is ill-suited to building SNNs that are as energy efficient and compact as the brain. In this study, the fire device is implemented with an inverter, which sets the threshold voltage with only two transistors instead of a conventional clocked or asynchronous comparator. Although the threshold may vary with process, voltage and temperature fluctuations, this variation can be seen as mimicking the differences between individual biological neurons. In addition, the learning function is able to compensate for threshold differences and process variations. When an excitatory pulse arrives, the excitatory input transistor is turned on momentarily, so that more current charges the membrane capacitor and the membrane potential rises rapidly. Conversely, an inhibitory pulse turns on the inhibitory input transistor momentarily, causing the membrane capacitor to charge more slowly or even to discharge through it, which in turn slows the rise of the membrane potential or makes it fall.

Figure 4: (a) Photo of the experimental setup, (b) The effect of the weight module, (c) The effect of a neuron’s input (i.e., the output of the pre-stage synapse) on its output, (d) The effect of a neuron’s output on the synapse’s output.
Figure 5: (a) Another combined structure fabricated for measuring the synapse, (b) Synchronous transient relationship between and .

When the fire device is activated, it generates a low level of to be connected to , which increases the current charging the membrane capacitor and causes an instantaneous increase of the membrane potential, which in turn promotes the triggering of the fire device. This mimics the influx of Na⁺ into the cell membrane prompting a rapid increase in membrane voltage, i.e., a positive feedback effect. Finally, the low level generated by the fire device is converted to a high level (Fig. 3(b)(iii)) by a delay device that includes a three-stage inverter and connects the to and , resetting the membrane potential to zero. This process mimics the activation of K⁺ channels in biological neurons, resulting in the outward flow of K⁺ ions and the eventual return of the cell membrane to its resting state.

As mentioned above, achieving millisecond time constants consistent with human activity typically requires extremely large on-chip capacitors and resistors, which would severely hinder the large-scale integration of SNNs. Here, we achieve millisecond time constants with a small capacitor of only 20 fF by utilizing the sub-threshold leakage current through the transistors , and , whereas the leakage currents through and are negligible because their gate lengths are designed to be sufficiently larger than those of , and . In order for the membrane potential to rise and activate the fire device, the leakage current through the PMOS FETs must be larger than that through the NMOS FET. The difference between these leakage currents should be on the order of 1 pA to achieve the target time constant of 10 ms with the 20 fF membrane capacitor. The gate widths and lengths of these transistors are designed to meet this requirement.
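The relation between the 20 fF capacitor, the roughly 1 pA net leakage current and the 10 ms target can be checked with the constant-current charging relation ΔV = I·t/C. The sketch below is only a back-of-the-envelope check: it assumes that the net leakage current is constant and interprets the "time constant" as the time needed to charge the capacitor up to the firing threshold. Under these assumptions the three quoted values imply a voltage swing of about 0.5 V. The helper names are hypothetical.

```python
def implied_voltage_swing(i_net_amps, t_seconds, c_farads):
    """Voltage change dV = I * t / C for a constant net charging current."""
    return i_net_amps * t_seconds / c_farads


def charge_time(c_farads, delta_v, i_net_amps):
    """Time t = C * dV / I needed to change the capacitor voltage by dV."""
    return c_farads * delta_v / i_net_amps


C_MEM = 20e-15     # membrane capacitor quoted in the text (20 fF)
I_NET = 1e-12      # net leakage current quoted in the text (order of 1 pA)
T_TARGET = 10e-3   # target time constant quoted in the text (10 ms)

# Swing implied by the three quoted values (an inference, not a measured value).
swing = implied_voltage_swing(I_NET, T_TARGET, C_MEM)
print(f"implied swing: {swing:.2f} V")
print(f"charge time:   {charge_time(C_MEM, swing, I_NET) * 1e3:.1f} ms")
```

The same relation shows why a picoampere-level current is needed: with a nanoampere-level current the same capacitor would charge a thousand times faster, i.e., in microseconds rather than milliseconds.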

Synapses are essential modules in SNNs, as they interconnect the neurons. Having designed a neuron module for generating time-domain signals, we need a transmission medium, i.e., a synapse, to transmit these signals to other neurons. To compose a complete neural network, we design a synapse module based on frequency signals, as shown in Fig. 3(c). The synapse consists mainly of a voltage-controlled ring oscillator operating under a leakage current, which is built from a three-stage inverter. When the preceding neuron circuit fires, it generates a spike, which is inverted by an inverter and turns a transistor on for a short time; the current flowing through this transistor charges the synapse capacitor and raises its voltage. Once this voltage reaches the level that triggers oscillation, the ring oscillator begins to oscillate (Fig. 3(b)(iv) and Fig. 3(b)(v)). If the preceding neuron does not fire for a long time, the capacitor voltage leaks back to its initial state, at which point the synapse becomes inactive again. Since the capacitor voltage acts as the supply voltage of the ring oscillator, the current flowing out of the capacitor controls this voltage and thus the frequency of the ring oscillator.
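The qualitative behavior of the synapse (each spike raises the capacitor voltage, the voltage leaks away between spikes, and the oscillation frequency follows the voltage) can be captured by a purely behavioral sketch. Everything quantitative here is an assumption chosen for illustration and not taken from the circuit: the fixed voltage step per spike `dv`, the exponential leak with time constant `tau`, the oscillation onset voltage `v_osc`, and the linear voltage-to-frequency `gain`.

```python
import math


def synapse_frequency(spike_times, t_end, dt=1e-3, dv=0.25, tau=0.05,
                      v_max=1.0, v_osc=0.3, gain=200.0):
    """Behavioral model: returns the oscillator frequency at each time step.

    Each input spike raises the capacitor voltage by dv (clipped at v_max),
    the voltage decays exponentially between spikes, and the oscillator
    runs only while the voltage exceeds v_osc. All parameters are assumed.
    """
    pending = sorted(spike_times)
    idx = 0
    v = 0.0
    freqs = []
    for n in range(int(round(t_end / dt))):
        t = n * dt
        # Apply every spike that has arrived by time t.
        while idx < len(pending) and pending[idx] <= t:
            v = min(v_max, v + dv)
            idx += 1
        # Frequency grows with the voltage above the oscillation onset.
        freqs.append(gain * max(0.0, v - v_osc))
        # Leak toward the inactive state.
        v *= math.exp(-dt / tau)
    return freqs


fast_input = synapse_frequency([k * 0.01 for k in range(50)], t_end=0.5)
slow_input = synapse_frequency([k * 0.05 for k in range(10)], t_end=0.5)
print(sum(fast_input) / len(fast_input), sum(slow_input) / len(slow_input))
```

With these assumed parameters, a neuron that fires more often keeps the synapse voltage higher and therefore yields a higher average oscillation frequency, while a silent neuron leaves the synapse inactive.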

SNNs achieve their learning function by adjusting the weights; therefore, we propose a weight module that is compatible with the time-domain neuron and synapse modules described above, as shown in Fig. 3(d). The proposed weight module tunes the time-domain information, which is the width of the output pulses. This module consists of a delay line, a multiplexer, and an AND gate. The input is the square wave signal from the synapse, which passes through the delay line. The control input is the digital code that represents the weight; it is determined after learning and is used to control the multiplexer. The width of the output pulse, which corresponds to the time-domain weight, is set by the tap of the inverter chain that the multiplexer selects. As mentioned earlier, if the excitatory or inhibitory pulse is wide, the voltage in the subsequent neuron is charged or discharged faster, respectively, which corresponds to a large weight. In this study, we chose a multiplexer with 16 inputs, i.e., 4-bit weights (0000 to 1111). The output of the weight module is connected to the input device of the subsequent neuron circuit. The frequency of the pulses (pulse spacing) and the width of the pulses act simultaneously on the neuron to change its activity. The pulse frequency is determined by the output frequency of the previous synapse, while the coupling strength depends on the pulse width determined by the weight module.
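How a delay line, a multiplexer and an AND gate can turn a square wave into pulses of programmable width can be sketched on a sampled signal. The exact gate-level wiring is not spelled out in the text, so this sketch assumes a common edge-to-pulse arrangement: the output is the input ANDed with the inverted, delayed copy of itself, and the 4-bit code selects how many delay taps are used. The names `weight_module`, `pulse_widths` and `samples_per_tap` are hypothetical.

```python
def weight_module(signal, code, samples_per_tap=1):
    """Turn each rising edge of a 0/1 sample list into a pulse.

    Assumed wiring: out = in AND NOT(delayed in), where the delay is
    `code` taps of the delay line (code is the 4-bit weight, 0..15).
    """
    if not 0 <= code <= 15:
        raise ValueError("code must be a 4-bit value (0..15)")
    delay = code * samples_per_tap
    # Multiplexer: pick the copy of the input delayed by the selected tap.
    delayed = [0] * delay + signal[:len(signal) - delay]
    # AND gate: high only between the rising edge and its delayed copy.
    return [s & (1 - d) for s, d in zip(signal, delayed)]


def pulse_widths(samples):
    """Lengths (in samples) of the consecutive runs of 1s."""
    widths, run = [], 0
    for s in samples:
        if s:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


square = ([1] * 20 + [0] * 20) * 2      # two periods of a square wave
for code in (0b0001, 0b0010, 0b0100, 0b1000):
    print(f"{code:04b}", pulse_widths(weight_module(square, code)))
```

In this model the pulse rate is inherited from the input square wave while the pulse width grows with the code, which mirrors the split between synapse activity (frequency) and coupling weight (width) described above. Under the assumed wiring, code 0000 produces no pulse at all.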

Figure 4(a) shows the experimental setup used to test the fabricated neural structure chip (Fig. 1(b)), where the chip was placed on a Summit11000 probe station and tested with probes in direct contact with it. In the experiments, arbitrary function generators act as the pre-stage synapses that drive the inputs of the two weight modules. The output of the neuron is connected to the synapse module, whose output varies in response to changes in the neuron output. We used a Tektronix AFG31252 arbitrary function generator as a pre-stage synapse to provide square wave signals to the fabricated neural circuits, and we observed the output waveforms using oscilloscopes (Keysight MSOX6004A and DSOX93304Q).

The experimental results are shown in Figs. 4(b), 4(c) and 4(d). Figure 4(b) demonstrates the effect of changing the weight when a 100 Hz square wave signal is fed to the neuron by the pre-stage synapse (function generator). The insets (i), (ii), (iii) and (iv) of Fig. 4(b) compare the neuron fire intervals for the cases where the weight is set to 0001, 0010, 0100 and 1000, respectively. The proposed neuron intrinsically fires at a rate determined by the balance of the leakage currents into and out of the membrane capacitor, and the input from the previous stage modulates this rate. We can see that the neuron fires faster as the weight becomes larger.

Figure 4(c) compares the variation of neuron fire times depending on the signal from the pre-stage synapse. The insets (i), (ii) and (iii) of Fig. 4(c) show the cases with a 100 Hz inhibitory input (weight set to 1100), no input, and a 100 Hz excitatory input (weight set to 1100), respectively. The inhibitory input decreases the fire frequency of the neuron and increases the fire interval, while the excitatory input has the opposite effect. The experimental results show that the firing interval of the proposed neuron is on the order of milliseconds, which is in accordance with the millisecond time constants of biological neurons. When no signal is fed from the pre-stage synapse, the power consumption is about 800 pW, and the neuron generates about 20 spikes in a 100 ms cycle. From this, it can be roughly estimated that each spike consumes about 4 pJ of energy. Subsequently, the neuron outputs in the insets (i), (ii) and (iii) of Fig. 4(c) were used as input signals to the synapse to influence its output. The measured waveforms in these three cases are shown in Fig. 4(d). The average frequencies measured over a 5 s period are 41 Hz, 90 Hz and 98 Hz, respectively.
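The energy-per-spike estimate quoted above follows directly from the measured power, the observation window and the spike count. The short check below reproduces that arithmetic; the function name is hypothetical, and the estimate assumes that all of the measured power is attributed to spiking.

```python
def energy_per_spike(power_watts, window_seconds, n_spikes):
    """Average energy per spike: total energy in the window / spike count."""
    return power_watts * window_seconds / n_spikes


POWER = 800e-12    # about 800 pW with no pre-stage input
WINDOW = 100e-3    # 100 ms observation cycle
SPIKES = 20        # about 20 spikes in that cycle

energy = energy_per_spike(POWER, WINDOW, SPIKES)
rate = SPIKES / WINDOW
print(f"energy per spike: {energy * 1e12:.1f} pJ")
print(f"mean firing rate: {rate:.0f} Hz")
```

The total energy in the window is 800 pW × 100 ms = 80 pJ, which, divided by 20 spikes, gives about 4 pJ per spike.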

To facilitate the observation of the synchronous response of the synapse, we also fabricated the structure shown in Fig. 5(a). Figure 5(b) shows the experimental results for this structure. We used a Tektronix AFG31252 arbitrary function generator to generate a 10 Hz square wave signal; after passing through a weight module, this signal produces a spike signal. The synapse voltage is observed through an on-chip source follower acting as an analog buffer. With the arrival of the spike signal, the voltage at the synapse rises instantaneously, as shown in Fig. 5(a)(i), which in turn increases the output frequency of the synapse. If no spike arrives for a long time, the synapse voltage decreases, which in turn lowers the output frequency.

| Reference | Technology | Energy per spike (pJ) | Single neuron area (µm²) | Frequency (Hz) | Neuron model | Sim. or Meas. |
|---|---|---|---|---|---|---|
| Indiveri et al., 2006 | 350 nm CMOS | 900 | 2573 | 100 | IF | Meas. |
| Wu et al., 2015 | 180 nm CMOS | 9.3 | N/A | | LIF | Meas. |
| Joubert et al., 2012 | 65 nm CMOS | 41 | 538 | 300 | LIF | Sim. |
| Aamir et al., 2018a | 65 nm CMOS | 200 | 3363 | | AdEx-IF* | Meas. |
| Dutta et al., 2017 | 32 nm SOI MOSFET | 35 | 1.8 | | LIF | Meas. |
| Rubino et al., 2019 | 22 nm FD-SOI | 14 | 900 | 30 | AdEx-IF* | Meas. |
| This work | 65 nm CMOS | 4 | 127 | 230 | LIF | Meas. |

* AdEx-IF: adaptive-exponential integrate-and-fire.
Table 1: Performance Comparison of Stand-Alone Neuron Circuits.

Table 1 compares the performance of stand-alone neuron circuits. The proposed neuron circuit has advantages in terms of energy consumption and area. The designs in Refs. Indiveri et al., 2006; Wu et al., 2015; Joubert et al., 2012; Aamir et al., 2018a use a clocked or an asynchronous comparator and therefore occupy a large chip area and consume considerable power. The neuron proposed in Ref. Dutta et al., 2017, which is fabricated in a non-CMOS process, does not require a comparator, which gives it an advantage in area. However, its energy consumption is relatively high, and such technologies are less mature and thus more costly than standard CMOS processes.

For low-frequency analog circuit designs, flicker noise is an important concern. Since our design goal is millisecond time constants, i.e., operating frequencies in the hundreds of hertz, we had to consider the tradeoff between operating frequency and the impact of flicker noise. We decided to use thick-oxide transistors to reduce the impact of flicker noise while achieving a longer time constant. In addition, we deliberately used transistors with a larger gate area to further mitigate its impact. With these design choices, the impact of noise on the proposed modules is suppressed to a negligible level. In this design, the membrane capacitor in the neuron shown in Fig. 3(a) is designed to be 20 fF and implemented as a metal-oxide-metal (MOM) capacitor, while the capacitor in the synapse module shown in Fig. 3(c) is designed to be 70 fF and implemented as a metal-insulator-metal (MIM) capacitor in the upper metal layers on top of the transistors of the synapse module, so that it requires no additional area.

In summary, we have proposed a neural structure for generating and transmitting time-domain signals. The proposed neuron and synapse occupy an area of 127 µm² and 231 µm², respectively. This structure uses neither op-amps nor comparators, which gives it advantages in area and power consumption. Compared to conventional voltage/current-domain designs, the proposed time-domain neural structure benefits from scaled process technologies. Actual chip fabrication and measurement results successfully demonstrate the temporal signal communication function with millisecond time constants. The proposed time-domain neural structure is well suited for building spiking neural networks that process real-time time-series information for human-computer interaction.

This work is supported by Japan Science and Technology Agency (JST) CREST Grant Number JPMJCR19K2.

References