Voice isolation in Microsoft Teams enables personalized noise suppression for calls and meetings

Summary:

Remote meetings have changed the way we collaborate, offering valuable flexibility in how and where we work. In any meeting, you are likely to see colleagues participating from a variety of environments, including offices with open floor plans, busy coffee shops, or even airports while waiting for a flight. This flexibility frequently comes with uncontrollable distractions from nearby sounds and voices that can disrupt the focus and productivity of our meetings and calls.

In a previous blog post, we detailed innovative audio enhancements for Microsoft Teams. Powered by a deep learning model, these capabilities offer exceptional noise suppression and echo cancellation, reducing interruptions and paving the way for more efficient meetings.

We recently rolled out a noise suppression mode for Teams called “voice isolation”. When enabled, it ensures only your voice is transmitted, suppressing unwanted background speech caught by your microphone.

How to enable voice isolation

Opting into this feature is easy. You will go through a quick and secure enrollment process that involves reading aloud a short paragraph in one of 25 languages to create your “voice profile”, a set of voice characteristics that are unique to you. The voice isolation model then uses this voice profile to ensure only your voice is transmitted through your microphone. In addition to suppressing unwanted background speech, voice isolation still removes all other background noises while optimizing overall audio quality.

Once you have created your voice profile and enabled voice isolation, it will take effect during your next meeting or call.

Deep learning powers personalized speech filtering

The goals for this new noise suppression model included removing unwanted background speech while enhancing the audio quality of the user’s voice. The model was developed using large training and evaluation data sets totaling over 2,000 hours of speech and noise, covering a wide variety of genders, dialects, voice characteristics, devices, background noises, echo scenarios, and other criteria. Testing and evaluation for efficacy were also performed across devices, to ensure the model performs well when one device is used to enroll but another device is used during a call or meeting.

During the iterative process of training and evaluating the voice isolation model, we had to make our model resistant to:

  1. Over-suppression, which occurs when the model filters out speech that you intended to transmit. For example, a colleague may want to join your meeting by huddling alongside you in front of the device. We developed a mechanism within the model that identifies when a speaker very close to the microphone is being suppressed. An alert then gives you the option to quickly disable voice isolation for just that meeting so that your nearby colleague can also be heard through your microphone.
  2. Leakage, which occurs when unwanted background speech is not thoroughly suppressed. There is a trade-off when tuning a model for leakage versus over-suppression. A more aggressive model has less leakage but is more prone to over-suppression, because it is less forgiving of minor variations in your voice. To avoid cutting out your voice, we have opted for a more forgiving model. As a result, if a sibling whose voice characteristics are similar to yours is speaking relatively close to you, their voice may not be filtered out.
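
To make the trade-off concrete, here is a minimal sketch in Python. It is not the Teams model: it assumes, purely for illustration, that a voice profile and each audio frame can be summarized as small embedding vectors, and that a frame is kept when its cosine similarity to the profile clears a threshold. The function names, the toy two-dimensional vectors, and the threshold values are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def keep_frame(profile, frame_embedding, threshold):
    """Hypothetical rule: transmit a frame only if it resembles the profile."""
    return cosine_similarity(profile, frame_embedding) >= threshold

def error_rates(profile, own_frames, other_frames, threshold):
    """Return (over_suppression_rate, leakage_rate) for a given threshold."""
    suppressed_own = sum(not keep_frame(profile, f, threshold) for f in own_frames)
    leaked_other = sum(keep_frame(profile, f, threshold) for f in other_frames)
    return suppressed_own / len(own_frames), leaked_other / len(other_frames)

# Toy data (assumed): the enrolled speaker's voice varies slightly between
# frames, and one background speaker (the "sibling") sounds very similar.
profile = [1.0, 0.0]
own_frames = [[1.0, 0.1], [0.9, 0.3], [0.8, 0.5]]
other_frames = [[0.0, 1.0], [0.5, 0.9], [0.9, 0.45]]

forgiving = error_rates(profile, own_frames, other_frames, threshold=0.8)
aggressive = error_rates(profile, own_frames, other_frames, threshold=0.9)
print("forgiving  (over-suppression, leakage):", forgiving)
print("aggressive (over-suppression, leakage):", aggressive)
```

With this toy data, the forgiving threshold never cuts out the enrolled speaker but lets the similar-sounding voice through, while the aggressive threshold blocks that voice at the cost of dropping one of the speaker's own frames. That is the same tension described in the list above.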

Considering these challenges, we have iteratively trained and rigorously evaluated the model to ensure it provides the best audio quality to users in a variety of scenarios. For quality assurance, we created and tracked a set of robust objective metrics alongside iterative subjective evaluation via large-scale crowdsourcing.

Confidence in the quality of your transmitted audio

In addition, we have developed a detector within the model to identify and alert you whenever any noise is suppressed from your microphone. This dynamic notification gives you confidence in the content and quality of your transmitted audio. For example, if you are on a call and concerned that a leaf blower or barking dog may be making it difficult for others to hear you, the indicator on your self-view can give you peace of mind that such noises are not being transmitted.
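
As a rough illustration of the idea behind such an indicator, the Python sketch below compares the energy of the captured audio with the energy of the audio that is actually transmitted. This is a stand-in technique chosen for simplicity, not the detector built into the Teams model; the function names and the 6 dB default are assumptions made for the example.

```python
import math

def rms(samples):
    """Root-mean-square level of a frame of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def suppression_db(captured, transmitted, floor=1e-9):
    """Level reduction in dB between captured and transmitted frames."""
    # The floor avoids division by zero and log of zero on silent frames.
    return 20.0 * math.log10(max(rms(captured), floor) / max(rms(transmitted), floor))

def noise_suppressed(captured, transmitted, min_reduction_db=6.0):
    """Hypothetical indicator: True when the frame was noticeably attenuated."""
    return suppression_db(captured, transmitted) >= min_reduction_db

# Toy frames (assumed): a loud captured frame and its attenuated output.
noisy_capture = [0.5, -0.5, 0.5, -0.5]
cleaned_output = [0.1, -0.1, 0.1, -0.1]

print(noise_suppressed(noisy_capture, cleaned_output))  # frame was attenuated
print(noise_suppressed(noisy_capture, noisy_capture))   # frame passed through unchanged
```

A production detector would need to distinguish suppressed noise from suppressed speech, which a simple energy comparison cannot do. The sketch only shows how a per-frame signal could drive an on-screen indicator.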

Be heard anywhere

Voice isolation will be available for meetings and calls on the Windows desktop Teams client in April 2024. With it, you can confidently keep your microphone unmuted for more seamless participation in meetings, regardless of who may be talking in the background. Learn how to reduce background noise in Teams.

Date: 2024-03-26 15:00:00Z
Link: https://techcommunity.microsoft.com/t5/microsoft-teams-blog/voice-isolation-in-microsoft-teams-enables-personalized-noise/ba-p/4096077