Dynamic tempered sampling in generative models inference
Abstract:
A method of sampling output audio samples includes, during a packet loss concealment event, obtaining a sequence of previous output audio samples. At each time step during the event, the method includes generating a probability distribution over possible output audio samples for the time step. Each possible output audio sample in the distribution has a respective probability indicating a likelihood that the sample represents a portion of an utterance at the time step. The method also includes determining a temperature sampling value based on a function of the number of time steps that precede the time step, an initial temperature sampling value, a minimum temperature sampling value, and a maximum temperature sampling value. The method also includes applying the temperature sampling value to the probability distribution to adjust the probability of selecting each possible sample, and randomly selecting one of the possible samples based on the adjusted probabilities. The method also includes generating synthesized speech using the randomly selected sample.
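The per-step procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the claimed implementation: the abstract does not state the form of the temperature function, so the clamped linear schedule in `temperature_at`, the `rate` parameter, the default temperature values, and all function names here are assumptions made for the example. Applying the temperature as `p ** (1 / T)` followed by renormalization is the common convention for tempered sampling and is likewise assumed.

```python
import numpy as np


def temperature_at(step, t_init, t_min, t_max, rate=0.01):
    """Hypothetical schedule: start at t_init, change linearly with the
    number of preceding time steps, and clamp to [t_min, t_max].
    The actual function used by the method is not given in the abstract."""
    return float(np.clip(t_init + rate * step, t_min, t_max))


def apply_temperature(probs, temperature):
    """Adjust a probability distribution with a temperature value.
    Equivalent to p ** (1 / T), renormalized; T < 1 sharpens the
    distribution and T > 1 flattens it."""
    probs = np.asarray(probs, dtype=np.float64)
    logits = np.log(probs + 1e-12)      # small epsilon guards against log(0)
    scaled = logits / temperature
    scaled -= scaled.max()              # subtract the max for numerical stability
    adjusted = np.exp(scaled)
    return adjusted / adjusted.sum()


def sample_step(probs, step, rng, t_init=0.8, t_min=0.5, t_max=1.2, rate=0.01):
    """One time step of a concealment event: compute the temperature for
    this step, adjust the distribution, and randomly select a sample index."""
    t = temperature_at(step, t_init, t_min, t_max, rate)
    adjusted = apply_temperature(probs, t)
    index = int(rng.choice(len(adjusted), p=adjusted))
    return index, t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy distribution over four possible output audio sample values.
    model_probs = [0.6, 0.25, 0.1, 0.05]
    for step in (0, 20, 100):
        index, t = sample_step(model_probs, step, rng)
        print(f"step={step} temperature={t:.2f} selected index={index}")
```

In this sketch the selected index stands in for one quantized output audio sample. A real system would append it to the sequence of previous samples and feed it back to the generative model before the next time step.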