A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet

This time, we turn LPCNet into a very low-bitrate neural speech codec (see submitted paper) that’s actually usable on current hardware and even on phones. The encoder extracts information about the pitch and the shape of the vocal tract and transmits that information to the decoder, which then resynthesizes a new speech signal based on what the encoder provided. All speech codecs rely heavily on pitch, but unlike waveform coders, where pitch “just” helps reduce the amount of redundancy, vocoders have no fallback and will generate bad-sounding (or even unintelligible) speech if the pitch is wrong.
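To make the pitch dependence concrete, here is a minimal sketch of a textbook autocorrelation pitch estimator run on a synthetic tone. It is not LPCNet's pitch search; the function name `estimate_pitch_hz`, the 60 to 400 Hz search range, the 16 kHz sample rate, and the tie-breaking tolerance are all assumptions chosen for illustration.

```python
import math

def estimate_pitch_hz(frame, sample_rate, fmin=60.0, fmax=400.0, tol=1e-6):
    """Illustrative estimator: pick the lag with the highest normalized
    autocorrelation, preferring the shortest lag when scores tie."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        a = frame[:-lag]   # signal
        b = frame[lag:]    # signal delayed by `lag` samples
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b)) or 1.0
        score = num / den
        # A periodic signal correlates equally well at 2x, 3x, ... the true
        # period. Requiring a clear improvement keeps the shortest such lag
        # and avoids reporting half (or a third of) the true pitch.
        if score > best_score + tol:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic "voiced" frame: a 200 Hz sine sampled at 16 kHz (assumed rate).
sample_rate = 16000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(640)]
pitch = estimate_pitch_hz(frame, sample_rate)
print(pitch)
```

Without the tie-breaking tolerance, the multiples of the true period score essentially the same as the period itself, so the estimator could land on an octave error. A waveform coder would merely lose some coding efficiency in that case, whereas a vocoder would resynthesize speech at the wrong pitch.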

Source: people.xiph.org