Abstract

We present ClearBuds, a state-of-the-art hardware and software system for real-time speech enhancement. Our neural network runs completely on an iphone, allowing you to supress unwanted noises while taking phone calls on the go. ClearBuds bridges state-of-the-art deep learning for blind audio source separation and in-ear mobile systems by making two key technical contributions: 1) a new wireless earbud design capable of operating as a synchronized, binaural microphone array, and 2) a lightweight dual-channel speech enhancement neural network that runs on a mobile device. Results show that our wireless earbuds achieve a synchronization error less than 64 microseconds and our network has a runtime of 21.4 ms on an accompanying mobile phone.

Our goal is to isolate a person's voice in the presence of background noise (e.g. street noise) or other people talking. We perform this separation using a pair of custom synchronized wireless earbuds and a lightweight neural network that runs on an iPhone.

This is a picture of the earbuds. A quarter is provided for a size comparison.