Unlike VALL-E, however, VALL-E 2 performs zero-shot text-to-speech (TTS) synthesis, which generates speech from text input in voices it hasn't been explicitly trained on. It uses a vast ...