ECE598NS MinVAD • Brady Salz

This was my final project for ECE598NS: Machine Learning in Silicon. I was curious about how minimal a speech classifier could get, since I'm a big fan of talking to my phone for random things. Some great work had already been done at both KU Leuven¹ and MIT² on high-performance voice detectors, but I wanted to see if I could make one that went even lower in power without sacrificing much accuracy.

I simulated two subsystems:

An analog filter bank followed by an integrator
A digital “sub-precision sampling” ADC
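As a rough illustration of the first subsystem (this is not the code from the repo), here is a minimal sketch of a filter bank followed by an integrator. Everything specific in it is an assumption made for the example: the Butterworth band-pass filters, the band edges, the rectify-then-sum integrator, and the 20 ms frame length are all hypothetical choices, and it ignores analog non-idealities entirely.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def filter_bank_features(signal, fs, bands, frame_len):
    """Band-pass each band, rectify, and integrate over fixed-length frames.

    Returns an array of shape (n_frames, n_bands).
    """
    n_frames = len(signal) // frame_len
    feats = np.empty((n_frames, len(bands)))
    for j, (lo, hi) in enumerate(bands):
        # Assumed filter: 2nd-order Butterworth band-pass per channel.
        sos = butter(2, [lo, hi], btype="bandpass", fs=fs, output="sos")
        rectified = np.abs(sosfilt(sos, signal))
        frames = rectified[: n_frames * frame_len].reshape(n_frames, frame_len)
        # Discrete approximation of integrating the rectified output over a frame.
        feats[:, j] = frames.sum(axis=1) / fs
    return feats


fs = 16000                                   # assumed sample rate in Hz
t = np.arange(fs) / fs                       # one second of samples
tone = np.sin(2 * np.pi * 1000 * t)          # 1 kHz test tone
bands = [(100, 400), (400, 1600), (1600, 6400)]  # hypothetical band edges in Hz
feats = filter_bank_features(tone, fs, bands, frame_len=320)  # 20 ms frames
print(feats.shape)
```

Each row of `feats` is one frame's worth of per-band energy, which is the kind of feature vector a classifier could then be trained on.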

Both were trained using a DecisionTreeClassifier from scikit-learn in Python, then quantized to various precisions. The results were surprisingly good: accuracy stayed high with only 5-8 bits of comparison data, compared to the usual 16-bit or even floating-point values we're used to computing with. My final results didn't include many analog non-linearities, but I think they still have pretty solid merit, since those can be “learned” by the training algorithm. Final report and code below.
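To make the train-then-quantize idea concrete, here is a small sketch under stated assumptions. The data is synthetic (two Gaussian classes standing in for speech and non-speech features), the `quantize` helper is hypothetical, and the quantization is applied uniformly to the inputs of an already trained tree. The actual project may have quantized things differently, so treat this as an illustration rather than the method from the report.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def quantize(x, n_bits, lo, hi):
    """Uniformly quantize x onto 2**n_bits levels spanning [lo, hi]."""
    levels = 2 ** n_bits - 1
    scaled = np.clip((x - lo) / (hi - lo), 0.0, 1.0)
    return np.round(scaled * levels) / levels * (hi - lo) + lo


# Synthetic stand-in data: class 1 has its feature means shifted by 2.
rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, size=n)
X = rng.normal(loc=y[:, None] * 2.0, scale=1.0, size=(n, 4))
X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

# Train at full floating-point precision.
clf = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)

# Evaluate the same tree on inputs quantized to different bit widths.
lo, hi = X_train.min(), X_train.max()
accuracy = {}
for n_bits in (2, 4, 6, 8, 16):
    X_q = quantize(X_test, n_bits, lo, hi)
    accuracy[n_bits] = clf.score(X_q, y_test)
    print(f"{n_bits:2d} bits: accuracy = {accuracy[n_bits]:.3f}")
```

Sweeping the bit width like this shows where accuracy starts to fall off, which is the trade-off the 5-8 bit result above is about.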

Final Report

GitHub Repo

¹ K. M. H. Badami, S. Lauwereins, W. Meert, and M. Verhelst, “A 90 nm CMOS, 6uW Power-Proportional Acoustic Sensing Frontend for Voice Activity Detection,” IEEE Journal of Solid-State Circuits, vol. PP, no. 99, pp. 1–12, 2015.
² M. H. Price, “Energy-scalable Speech Recognition Circuits,” Ph.D. Thesis, MIT, Cambridge, MA, 2016.