Environmental Noise Cancellation for Sound Processing

In today’s fast-paced world, clear and effective communication is more important than ever. With the widespread use of telephones, video conferencing, and other communication systems, the popularity of hands-free devices like wireless earbuds and headphones, and the increasing demand for high-quality audio, sound processing methods have become a crucial aspect of our daily lives. Advances in speech recognition technology have only added to the need for better and more advanced sound processing techniques. In this blog post, we’ll explore the growing importance of Environmental Noise Cancellation (ENC) and why it is essential in today’s world.

In this part we will discuss:

Active vs. Passive noise reduction
Single-Channel or Multi-Channel systems
Near-end / Far-end users
The difference between ANC and ENC

With audio devices now in use almost everywhere, it has become increasingly challenging to hear clearly in noisy environments. This is where environmental noise cancellation (ENC) comes in. ENC is a sound processing technique used in audio systems to reduce or eliminate unwanted sounds from the surrounding environment, allowing the listener to better hear the intended audio signal.

In recent years, the need for ENC in audio systems has increased for several reasons. First, our environments have become noisier. Traffic, construction, wind, and crowded public spaces all contribute to the overall noise level, making it more difficult to hear clearly. Additionally, the widespread use of audio devices such as smartphones, laptops, and headphones means that people are listening to audio in more varied environments than ever before. In many cases, the background noise can be louder than the audio signal itself, making it hard to hear what’s being said or played back.

Another factor contributing to the need for ENC is the increased demand for high-quality audio. With the rise of high-quality streaming services and the popularity of high-end audio equipment, people have higher expectations for the quality of the audio they consume. However, this also means that background noise is more noticeable and can detract from the overall listening experience.

Finally, the COVID-19 pandemic has forced many people to work from home and rely on online meetings, which can be disrupted by background noise from home appliances, family members, pets, or even outdoor sounds. This has made ENC a crucial tool for remote work and online meetings, helping to ensure clear communication even in noisy home environments.

Let’s set the groundwork for the basic concepts.

Active or passive?

Traditionally, the distinction between active and passive noise cancellation was based on the method used. Active Noise Cancellation (ANC) generates an “opposite noise signal” that destructively interferes with the original noise, while passive methods rely on physical design and layout to prevent noise from reaching the sensors. More recently, techniques have been developed that actively process the received signal to reduce noise without generating an “anti-noise” signal. These techniques are classified as ENC. The differences between ANC and ENC are covered in more detail later in this post.
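To make destructive interference concrete, here is a minimal Python sketch of the idealized case. It assumes the noise is a pure 200 Hz tone and the anti-noise is a perfectly aligned, sign-inverted copy of it. The sample rate, the tone frequency, and the helper names sine and rms are illustrative choices, not taken from any real ANC product.

```python
import math

SAMPLE_RATE = 16000  # Hz, assumed for illustration


def sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Return n_samples of a unit-amplitude sine tone."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


# The unwanted noise reaching the ear.
noise = sine(200.0, 1600)

# The "opposite noise signal": same shape, inverted sign.
anti_noise = [-v for v in noise]

# What remains when both sound waves add up at the ear.
residual = [a + b for a, b in zip(noise, anti_noise)]

print(f"noise RMS:    {rms(noise):.3f}")
print(f"residual RMS: {rms(residual):.3f}")
```

In this idealized setting the two signals cancel completely. Real systems have to estimate the noise and play the anti-noise back through physical hardware, so cancellation is only partial.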

Single/Multi channel

The number of sensors used is an important factor in sound processing. A single-channel system uses only one sensor, such as a microphone, to record sound. A multi-channel system, on the other hand, uses two or more sensors.

In single-channel speech enhancement, the goal is to remove background noise and/or increase the volume of the speech signal, assuming that all the necessary information is contained within the single-channel input. Through the use of classical sound processing algorithms and advanced neural network architectures, there has been significant progress in separating speech and noise signals and enhancing speech while reducing noise.
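As one example of a classical single-channel algorithm, the sketch below applies spectral subtraction to a single frame. This is an illustration chosen for this post, not a description of any particular product. It assumes the noise is a steady hum whose magnitude spectrum was measured during a speech pause, and it uses a naive DFT so that it runs with the Python standard library only. The frame size, the tone frequencies, and all function names are hypothetical.

```python
import cmath
import math

FRAME = 64  # samples per frame; kept small so the naive DFT stays fast


def dft(frame):
    """Naive discrete Fourier transform (O(n^2)), fine for a short frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(noisy_frame, noise_magnitude):
    """Subtract the noise magnitude per bin and keep the noisy phase."""
    cleaned = []
    for value, noise_mag in zip(dft(noisy_frame), noise_magnitude):
        magnitude = max(abs(value) - noise_mag, 0.0)  # never go negative
        cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
    return idft(cleaned)


def tone(bin_index, amplitude, phase=0.0):
    """A sine that completes bin_index full cycles within one frame."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / FRAME + phase)
            for t in range(FRAME)]


def rms(signal):
    return math.sqrt(sum(v * v for v in signal) / len(signal))


speech = tone(4, 1.0)                   # stand-in for the speech signal
noise_only = tone(20, 0.5, phase=1.1)   # hum observed during a speech pause
noisy = [s + h for s, h in zip(speech, tone(20, 0.5, phase=0.3))]

noise_magnitude = [abs(v) for v in dft(noise_only)]
enhanced = spectral_subtract(noisy, noise_magnitude)

error_before = rms([x - s for x, s in zip(noisy, speech)])
error_after = rms([x - s for x, s in zip(enhanced, speech)])
print(f"error before: {error_before:.4f}, after: {error_after:.4f}")
```

The toy signal is deliberately easy: the hum sits in its own frequency bin and never changes. With real speech and non-stationary noise, the noise estimate is imperfect and plain spectral subtraction leaves audible artifacts, which is one reason more advanced methods are used.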

Multi-channel systems use multiple audio channels to process sound, on the assumption that each channel contains different information that can be combined to enhance the speech signal. For example, two microphones can be placed so that one records the user’s speech and the other records everything else, which is treated as background noise. With more than one sensor, it’s also possible to determine the direction of the source and focus on it more accurately.
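The two-microphone arrangement described above can be sketched with a normalized LMS (NLMS) adaptive filter, a classical technique picked here for illustration. The sketch assumes synthetic signals: the reference microphone hears only the noise, and the noise leaks into the speech microphone through a short, made-up acoustic path. The function name nlms_noise_canceller, the step size, and the path coefficients are all hypothetical.

```python
import math
import random


def nlms_noise_canceller(primary, reference, taps=4, step=0.05, eps=1e-8):
    """Predict the noise in `primary` from `reference` and subtract it."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    output = []
    for sample, ref in zip(primary, reference):
        history = [ref] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        error = sample - noise_estimate  # also the enhanced output sample
        norm = sum(h * h for h in history) + eps
        weights = [w + step * error * h / norm
                   for w, h in zip(weights, history)]
        output.append(error)
    return output


def rms(signal):
    return math.sqrt(sum(v * v for v in signal) / len(signal))


random.seed(0)
n = 4000
speech = [0.5 * math.sin(2 * math.pi * 300 * t / 16000) for t in range(n)]
noise = [random.uniform(-1.0, 1.0) for _ in range(n)]  # reference microphone

# Hypothetical acoustic path from the noise source to the speech microphone.
leaked = [0.6 * noise[t] - 0.3 * (noise[t - 1] if t > 0 else 0.0)
          for t in range(n)]
primary = [s + v for s, v in zip(speech, leaked)]  # speech microphone

enhanced = nlms_noise_canceller(primary, noise)

# Compare on the second half, after the filter has had time to adapt.
tail = slice(n // 2, n)
noisy_error = rms([p - s for p, s in zip(primary[tail], speech[tail])])
enhanced_error = rms([e - s for e, s in zip(enhanced[tail], speech[tail])])
print(f"error before: {noisy_error:.3f}, after: {enhanced_error:.3f}")
```

The key assumption is that the reference microphone picks up no speech. If speech leaks into the reference channel, the filter starts cancelling part of the speech as well.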

Multi-channel systems are often more effective than single-channel systems, as they can use spatial information to separate the speech signal from background noise. This results in a more accurate representation of the speech signal, which can lead to better speech recognition and improved overall speech quality. However, multi-channel speech enhancement also requires more processing power and is often more complex than single-channel processing. This can result in increased latency and higher energy consumption, which can drain your device’s battery more quickly.
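One simple form of spatial information is the time difference of arrival between two microphones. The sketch below estimates that delay from the peak of the cross-correlation and converts it to an arrival angle. It assumes a far-field source, a synthetic broadband signal, a 48 kHz sample rate, and a 10 cm microphone spacing; the function names and all numeric values are illustrative assumptions.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def estimate_delay(mic_a, mic_b, max_lag):
    """Return the lag in samples by which mic_b trails mic_a."""
    n = len(mic_a)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                score += mic_a[t] * mic_b[u]  # cross-correlation at this lag
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_deg(lag, sample_rate, mic_spacing_m):
    """Angle from broadside (0 degrees means straight ahead of the pair)."""
    path_difference = lag / sample_rate * SPEED_OF_SOUND
    ratio = max(-1.0, min(1.0, path_difference / mic_spacing_m))
    return math.degrees(math.asin(ratio))


random.seed(1)
true_delay = 3  # samples; the sound reaches mic_b slightly later
source = [random.uniform(-1.0, 1.0) for _ in range(512)]
mic_a = source
mic_b = [0.0] * true_delay + source[:-true_delay]

lag = estimate_delay(mic_a, mic_b, max_lag=10)
angle = arrival_angle_deg(lag, sample_rate=48000, mic_spacing_m=0.1)
print(f"estimated delay: {lag} samples, angle: {angle:.1f} degrees")
```

Once the direction is known, a multi-channel system can favor sound arriving from it, which is the kind of focusing that a single microphone cannot do.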

At CEVA, we are dedicated to enhancing the performance of single-channel voice processing using cutting-edge deep learning techniques. Our focus is on achieving the highest noise reduction performance for low-power devices. With CEVA-ClearVox ENC, we bring high-end ENC capabilities to compact, energy-efficient, and cost-effective devices.

Near-end / Far-end

Near-end processing refers to noise reduction for a user who is physically present in the noisy environment. Far-end processing, on the other hand, serves users who want environmental noise reduced but are not located in the noisy environment themselves, such as the person on the other end of a phone call.

When targeting near-end processing, one can take advantage of the fact that the noise is physically present: different sensors can capture it spatially, and active noise reduction methods, for instance, can then reduce it in the incoming signal. However, this approach is more sensitive to sound delays and may not be effective against all types of noise.
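The sensitivity to delay can be illustrated with a small Python sketch. It assumes the noise is a single steady 200 Hz tone sampled at 16 kHz, and that the anti-noise is a sign-inverted copy that arrives a few samples late. The function name residual_rms and all numeric values are hypothetical choices made for this example.

```python
import math

SAMPLE_RATE = 16000  # Hz, assumed
NOISE_FREQ = 200.0   # Hz, assumed steady tone


def residual_rms(delay_samples, n_samples=1600):
    """RMS of noise plus an inverted copy that is delay_samples late."""
    total = 0.0
    for n in range(delay_samples, delay_samples + n_samples):
        noise = math.sin(2 * math.pi * NOISE_FREQ * n / SAMPLE_RATE)
        anti_noise = -math.sin(
            2 * math.pi * NOISE_FREQ * (n - delay_samples) / SAMPLE_RATE)
        total += (noise + anti_noise) ** 2
    return math.sqrt(total / n_samples)


noise_rms = math.sqrt(0.5)  # RMS of a unit-amplitude sine

for delay in (0, 1, 8, 40):
    ratio = residual_rms(delay) / noise_rms
    print(f"delay {delay:2d} samples -> residual is {ratio:.3f}x the noise")
```

At this sample rate, 40 samples is half a period of the 200 Hz tone. With that much delay the late anti-noise adds to the noise instead of cancelling it, so the result is louder than doing nothing.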

Far-end processing reduces local noise in a transmitted signal, for example when taking a work call in a coffee shop. Here, ENC can use different processing methods that are more tolerant of latency.

ANC vs. ENC

Now that we have clarified some basic concepts, we can refine the differences between ANC and ENC.

ANC is an active approach that is commonly targeted at the near-end user. It uses a sensor to capture the noise around the user, inverts it, and then uses a generator to play the inverted signal so that it destructively interferes with the external sound, cancelling that sound out of what the user hears through the earphones. ANC is commonly used when a user wants to listen to music and block out external noise.

ENC, on the other hand, applies sound processing techniques to the signal captured by the near-end microphone so that the far-end user does not receive background noise from the near-end side. In this case, noise reduction is performed on the signal itself, and the processed signal is then transmitted to the other end.

To summarize, Environmental Noise Cancellation (ENC) has become a crucial aspect of our daily lives, enabling clear and effective communication in a world that is increasingly dependent on audio communication systems. Depending on the use case and the system used, there are various methods for reducing unwanted background noise that the environment introduces into audio signals.

In the next part of this blog, we will delve into the challenges, recent developments, and the various technologies and techniques used in ENC, with a focus on deep learning. We will explore how deep learning methods are being used to improve the accuracy and effectiveness of speech enhancement systems, and how this is leading to better communication experiences for users. Stay tuned for Part Two!

Posted By

Tomer Badug

Tomer Badug is an Audio Deep Learning and Algorithms Researcher at CEVA. He holds a Bachelor of Science degree in Electrical Engineering and a Master of Science degree in Electrical Engineering, both obtained from Tel Aviv University. With a strong background in electrical engineering and a deep understanding of acoustics and audio processing, Tomer has dedicated his career to researching and developing cutting-edge algorithms for audio applications. His work at CEVA has focused on leveraging the latest advancements in deep learning and artificial intelligence to design and implement advanced audio processing systems for a wide range of applications. With his expertise in deep learning and his passion for innovation, Tomer is constantly pushing the boundaries of audio technology, paving the way for new advancements in the field. He is committed to staying at the forefront of his industry and continues to be a leading voice in developing cutting-edge algorithms for audio applications.