Research focus

We focus on silicon-based computing hardware design. We are interested in a range of architectures, from near-threshold many-core spatial array accelerators to in-memory computing hardware, hybrid analog digital computing, and bio-inspired neuromorphic hardware. In addition to computing hardware design, we are exploring on-chip integrated power management to power these emerging architectures as efficiently as possible.

Sub-Milliwatt Spatial Array Accelerator (aka CGRA)


Create the next-generation programmable computing platform for mobile and embedded devices, eliminating the power wasted by underutilized hardware and always-on infrastructure.

  • Accelerator architecture: homogeneous vs. heterogeneous: ISCA19

  • CATENA, a sub-0.4-mW 16-core spatial array processor: VLSI19

  • Triggered processing element design: Micro17, Github

  • Reducing leakage in a near-threshold-voltage (NTV) multi-core processor: TVLSI16, S3S13

In- and Near-Memory Computing (IMC, PIM)


Create novel in- and near-memory computing architectures for SRAM, DRAM, and NVM, breaking the memory wall of the von Neumann architecture and bypassing row-by-row memory access to achieve single-cycle vector-matrix dot products.

  • C3SRAM, in-memory computing SRAM based on the capacitive coupling mechanism: ESSCIRC19

  • Vesti, a deep neural network accelerator featuring in-memory computing SRAM: TVLSI19, Asilomar19

  • k-nearest-neighbor accelerator using in-memory computing (IMC) SRAM: ISLPED19

  • XNOR-SRAM, in-memory computing SRAM based on the resistive computing mechanism, achieving over 400 TOPS/W and 5 TOPS/mm² for a vector-matrix dot product in 65-nm CMOS: VLSI18, GLSVLSI19
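
As a rough software illustration of the kind of vector-matrix dot product such XNOR-based hardware targets, the sketch below computes binary dot products with XNOR and popcount. It assumes inputs and weights restricted to +1/-1, which is how XNOR-based dot products are commonly formulated; the function names (`pack_bits`, `xnor_dot`, `xnor_matvec`) and the example data are hypothetical and are not taken from the cited papers or chips.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int, one bit per element (bit = 1 encodes +1)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
        elif v != -1:
            raise ValueError("values must be +1 or -1")
    return word


def xnor_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n using XNOR and popcount."""
    mask = (1 << n) - 1
    # XNOR marks the positions where the two vectors agree (product = +1).
    matches = bin(~(a_word ^ b_word) & mask).count("1")
    # Agreements contribute +1 each, disagreements -1 each.
    return 2 * matches - n


def xnor_matvec(x, weight_columns):
    """Multiply a +1/-1 input vector by a matrix given as a list of +1/-1 columns."""
    n = len(x)
    x_word = pack_bits(x)
    return [xnor_dot(x_word, pack_bits(col), n) for col in weight_columns]


# Hypothetical example data: a 4-element input and three weight columns.
x = [1, -1, 1, 1]
W = [[1, -1, 1, 1], [-1, 1, -1, -1], [1, 1, 1, -1]]
result = xnor_matvec(x, W)
print(result)  # [4, -4, 0]
```

In software each column is processed in turn; an in-memory computing array evaluates all columns in parallel inside the memory itself. For scale, 400 TOPS/W corresponds to 1 / (400 × 10^12) J, or 2.5 fJ per operation.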

Bio-Inspired Neuromorphic Architecture


Build better machine-learning hardware with insights from nature: artificial cochleas, spiking neural networks, unsupervised learning, and so on.

Resource-Efficient Machine Learning


Create new training and inference models that minimize resource use for deep and convolutional neural networks.

  • Memory-efficient neural network architecture search (NAS): ArXiv19

  • FPGA-based deep learning accelerator: ISLPED19

  • High-capacity fingerprint recognition system based on dynamic capacity estimation of associative memory: ArXiv17, BioCAS18

  • Recursive synaptic bit reuse for associative memory: DATE17, TVLSI18

  • Recursive binary neural network training model: TCASI19

Brain-Computer Interface Implant Hardware


Algorithm and hardware design for biomedical neural implants.

  • Nanowatt 96-channel brain-computer-interface processor, Neural Spike Processor, for motor-intention decoding: DATE17, ESSCIRC18

  • Spike sorter hardware with Bayesian unsupervised learning: VLSI16, TVLSI19

  • Informative screening in unsupervised learning for spike sorting: DAC15

Hybrid Analog Digital Computer


Revive analog computing for scientific computing.

  • Hybrid analog digital computer chip in a 65-nm process: ESSCIRC15, JSSC16, Spectrum18

  • Solving linear algebra problems in the hybrid analog digital computer: ISCA16

  • Solving nonlinear partial differential equations in the hybrid analog digital computer: Micro17

Ultra-Low-Power, Near-Threshold Voltage (NTV) SoCs


Nanowatt millimeter-scale processor design and near- and sub-threshold voltage circuit design techniques.

  • Feedforward leakage self-suppression (FLSL) logic family: SSCL19, MWSCAS19

  • Compact and energy-efficient NTV FFT processor: ESSCIRC17

  • Wide-pulsed-latch pipelining for near-threshold-voltage processors: JSSC17

  • Body-swapping error correction for non-Von-Neumann architecture: VLSI16, TVLSI19

  • Sparse error detection technique: TVLSI16

  • Near-threshold-voltage Razor (EDAC): JSSC15

  • Regenerator-based interconnect and clock networks: ISLPED14

  • Most energy-efficient FFT processor: ISSCC11, JSSC12

  • Pipeline methodology for NTV processors: DAC11

  • Cubic-millimeter wireless sensor: ISSCC11

  • Femtowatt SRAM: ISCAS11

  • Clock network design for NTV and STV circuits: ISLPED10

  • Millimeter-scale sensor system: ISSCC10

  • Energy-optimal technology selection: ISLPED09

  • Near-threshold ROM design: CICC08

  • Power gating for nanowatt processors: TVLSI11, ESSCIRC08

  • Phoenix processor: VLSI08, JSSC09

Integrated Power Management Circuits for Next-Generation SoCs


Create smaller, power-efficient voltage regulators, DC-DC converters, and harvesting power converters.

  • Multi-LDO systems with distributed event-driven control: VLSI19

  • BTWC sizing of a switched-capacitor DC-DC converter: ISLPED18

  • Synchronous and asynchronous comparator circuits: ISLPED17

  • Digital LDO based on event-driven control: ISSCC16, ISSCC17, VLSI18, SSCL18

  • Triple-mode energy harvesting DC-DC converter: JSSC17

  • Hybrid switched-capacitor and LDO DC-DC converter: VLSI09

  • Picowatt 2-transistor voltage reference: JSSC12

Thermal and Reliability Management


Explore new thermal sensors, aging sensors, and related hardware and software stacks for dynamic management.

Hardware Security


Research at the intersection of hardware and security.

  • Ultra-compact and robust PUF based on analog circuits: JSSC16

  • Technique to transform SRAM to analog PUF: JSSC18

  • Machine-learning-based blacklisting against CLKSCREW attacks: ISLPED18

  • High energy-efficiency AES accelerator: VLSI19