Research focus

Our research interests lie in computing hardware and VLSI circuits, with a focus on energy efficiency, artificial intelligence and machine learning, and hardware security.

Algorithm Hardware Co-Design for AI/ML, DSP, Comm

Co-design new algorithms and hardware to minimize resource use in neural networks, deep learning, digital signal processing (DSP), and wireless communication.

Deep learning based channel estimation processor for 5G millimeter-wave communication systems: TCASI21
Sub-microwatt keyword spotting hardware based on depthwise-separable convolutional neural networks: ISSCC20
Memory-efficient neural network architecture search (NAS): ArXiv19, CVPR20
KTAN: Knowledge Transfer Adversarial Network: IJCNN20
FPGA based deep learning accelerator: ISLPED19
High-capacity fingerprint recognition system based on dynamic capacity estimation of associative memory: ArXiv17, BioCAS18
Recursive synaptic bit reuse for associative memory: DATE17, TVLSI18
Recursive binary neural network training model: TCASI19
Nanowatt 96-channel brain-computer-interface processor, Neural Spike Processor, for motor-intention decoding: DATE17, ESSCIRC18
Spike sorter hardware with Bayesian unsupervised learning: VLSI16, TVLSI19
Informative screening in unsupervised learning for spike sorting: DAC15
Energy- and area-efficient FFT processor: ISSCC11, JSSC12, ESSCIRC17

In-Memory Computing (IMC) SRAM Circuits and Architecture

Create novel in-memory and near-memory computing architectures for SRAM, DRAM, and NVM, breaking the memory wall of the von Neumann architecture and bypassing row-by-row memory access to approach a single-cycle vector-matrix dot product.

DIMC, digital in-memory computing SRAM based on approximate arithmetic: ISSCC22
MBIMC, analog-mixed-signal in-memory computing SRAM supporting multi-bit weights and inputs: CICC22
C3SRAM, in-memory computing SRAM based on the capacitive coupling mechanism: ESSCIRC19, JSSC20, DATE21, DAC21, DNT21
Deep neural network accelerators featuring in-memory computing SRAM: Vesti (TVLSI19, Asilomar19), PIMCA (VLSI21)
k-nearest neighbor accelerator using in-memory-computing (IMC) SRAM: ISLPED19
XNOR-SRAM, in-memory computing SRAM based on the resistive computing mechanism, achieving over 400 TOPS/W and 5 TOPS/mm² for a vector-matrix dot product in 65-nm CMOS: VLSI18, GLSVLSI19, JSSC19
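
As a rough software illustration of the vector-matrix dot product that these macros target, the sketch below assumes binary (+1/-1) inputs and weights, where a dot product reduces to an XNOR followed by a population count. It models only the arithmetic, not the circuits of any chip listed above; the function names and the bit-packing convention are hypothetical.

```python
def pack(vec):
    """Pack a list of +1/-1 values into an integer (bit i set means vec[i] == +1)."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors given in packed form."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 wherever the two signs match
    matches = bin(agree).count("1")     # population count
    return 2 * matches - n              # matches minus mismatches


def vector_matrix_dot(x, columns):
    """Compute one dot product per weight column.

    An in-memory computing array would evaluate all columns in parallel;
    this software model (an assumption for illustration) simply loops.
    """
    n = len(x)
    x_bits = pack(x)
    return [xnor_popcount_dot(x_bits, pack(col), n) for col in columns]


x = [1, -1, 1, 1]
weights = [[1, 1, 1, 1], [1, -1, 1, 1], [-1, 1, -1, -1]]
print(vector_matrix_dot(x, weights))  # [2, 4, -4]
```

Each output equals the ordinary sum of elementwise products, so the result can be checked against a plain multiply-accumulate loop.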

Bio-Inspired Neuromorphic Hardware

Build better machine-learning hardware with insights from nature: artificial cochleae, spiking neural networks, unsupervised learning, and so on.

Sub-microwatt end-to-end keyword spotting chip with robustness to background noise: ISSCC21, JSSC21
Sub-300-nanowatt always-on spiking neural network: ASSCC20, FIN21
1-microwatt voice activity detection chip: ISSCC18, JSSC19
Energy-efficient neuromorphic classifiers on a SNN chip: NECO16
Neuromorphic processor featuring online learning: ISLPED15, VLSI-SOC15, VLSI16, TVLSI19

Ultra-Low-Power Near/Sub-Threshold Digital Processor

Create sub-milliwatt and sub-microwatt digital processors via circuit and architecture innovations.

Metastability error detection and correction (MEDAC): ISSCC20, JSSC21
CATENA, a sub-0.4-mW 16-core spatial array processor (aka coarse-grained reconfigurable array, CGRA): VLSI19, JSSC20
Accelerator architecture: homogeneous vs. heterogeneous: ISCA19
Triggered processing element design: Micro17, GitHub
Saving leakage in a multi-core processor in NTV: TVLSI16, S3S13
Feedforward leakage self-suppression (FLSL) logic family: SSCL19, MWSCAS19, ISCAS21, ISCAS21
Wide-pulsed-latch pipelining for near-threshold-voltage processors: JSSC17
Body-swapping error correction for non-von-Neumann architecture: VLSI16, TVLSI19
Sparse error detection technique: TVLSI16
Near Threshold Voltage Razor (EDAC): JSSC15
Regenerator based interconnect and clock networks: ISLPED14
Energy- and area-efficient FFT processor: ISSCC11, JSSC12, ESSCIRC17
Pipeline methodology for NTV processors: DAC11
Cubic-millimeter wireless sensor: ISSCC11
Femtowatt SRAM: ISCAS11
Clock network design for NTV and STV circuits: ISLPED10
Millimeter-scale sensor system: ISSCC10
Energy-optimal technology selection: ISLPED09
Near-threshold ROM design: CICC08
Power gating for nanowatt processors: TVLSI11, ESSCIRC08
Phoenix processor: VLSI08, JSSC09

Integrated Voltage Regulators for Next-Generation SoCs

Create smaller, power-efficient voltage regulators, DC-DC converters, and harvesting power converters.

Digital LDO for an AMS load: SSCL20
Single-inductor multiple-output (SIMO) DC-DC converters for ultra-low power SoCs: SSCL21
48V-to-0.75V DC-DC converter for data-center application: ECCE20
Multi-LDO systems with distributed event-driven control: VLSI19, JSSC21
BTWC sizing of a switched-capacitor DC-DC converter: ISLPED18
Synchronous and asynchronous comparator circuits: ISLPED17
Digital LDO based on event-driven control: ISSCC16, ISSCC17, VLSI18, SSCL18
Triple-mode energy harvesting DC-DC converter: JSSC17
Hybrid switched-capacitor and LDO DC-DC converter: VLSI09
Picowatt 2-transistor voltage reference (2T-Vref): JSSC12

Hardware Security

Hardware X Security

Detection-driven protection against physical attacks: VLSI21
Ultra-compact and robust PUF based on analog circuits: JSSC16
Technique to transform SRAM to analog PUF: JSSC18
Machine-learning based blacklisting against CLKSCREW attacks: ISLPED18
High energy-efficiency AES accelerator: VLSI19

Hybrid Analog Digital Computing for Differential Equation Solution

Revive analog computing for scientific computing.

Hybrid analog digital computer chip in a 65 nm process: ESSCIRC15, JSSC16, Spectrum18
Solving linear algebra problems in the hybrid analog digital computer: ISCA16
Solving nonlinear partial differential equations in the hybrid analog digital computer: Micro17

Thermal and Reliability Management

Explore new thermal sensors, aging sensors, and related hardware and software stacks for dynamic management.

Sub-300-µm² on-chip temperature sensor: ISSCC14, JSSC15
Sub-50-µm² on-chip temperature sensor: CICC15, JLPEA16, MWSCAS17
Technique to transform SRAM to a temperature sensor: CICC17, JSSC18
Technique to transform SRAM to a stochastic ADC: SSCL19
Technique to place miniature temperature sensors: ISLPED17
In-situ and in-field aging monitoring for a pipeline: DAC14
In-situ and in-field aging monitoring for SRAM: ISSCC15, TVLSI18, ESSCIRC16, IRPS18