This was my final project for ECE598NS: Machine Learning in Silicon. I was curious about finding minimal implementations of speech classifiers, since I’m a big fan of talking to my phone for random things. Some great work had already been done at both KU Leuven[1] and MIT[2] on high-performance voice detectors, but I wanted to see if I could make one that went even lower in power without any large sacrifice in accuracy.
I simulated two subsystems:
- An analog filter bank followed by an integrator
- A digital “sub-precision sampling” ADC
Both were trained using scikit-learn in Python, then quantized to various precisions. I had surprisingly good results, showing high accuracy with only 5-8 bits of comparison data, compared to the 16-bit fixed-point or even floating-point arithmetic we’re used to computing with. My final results didn’t include many analog non-linearities, but I think they still have pretty solid merit, since those effects can be “learned” by the training algorithm. Final report and code below.
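The quantization step can be sketched in a few lines of NumPy. This is a minimal illustration, not the project’s actual code: the `quantize` helper and the example weight vector are hypothetical stand-ins for weights learned by a scikit-learn classifier.

```python
import numpy as np

def quantize(weights, n_bits):
    """Uniformly quantize weights to a signed n-bit fixed-point grid.

    Hypothetical helper: scale to the largest absolute weight, round
    to the nearest of the 2**(n_bits-1) - 1 symmetric levels, rescale.
    """
    levels = 2 ** (n_bits - 1) - 1          # symmetric signed range
    scale = np.max(np.abs(weights)) / levels
    return np.round(weights / scale) * scale

# Example weight vector, as might come out of a trained classifier
w = np.array([0.71, -0.33, 0.05, -0.94])
w5 = quantize(w, 5)   # coarse 5-bit version
w8 = quantize(w, 8)   # finer 8-bit version
```

Sweeping `n_bits` over a range like 5-8 and re-measuring accuracy on a held-out set is the kind of experiment the report describes.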
- [1] K. M. H. Badami, S. Lauwereins, W. Meert, and M. Verhelst, “A 90 nm CMOS, 6 µW Power-Proportional Acoustic Sensing Frontend for Voice Activity Detection,” IEEE Journal of Solid-State Circuits, vol. PP, no. 99, pp. 1–12, 2015.
- [2] M. H. Price, “Energy-scalable Speech Recognition Circuits,” Ph.D. Thesis, MIT, Cambridge, MA, 2016.