SPEAKER ATTRIBUTED TRANSCRIPT GENERATION
    2.
    发明公开

    公开(公告)号:EP4345816A2

    公开(公告)日:2024-04-03

    申请号:EP24156964.9

    申请日:2020-03-19

    IPC分类号: G10L15/14

    摘要: A computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices. Operations include performing speech recognition on each audio stream by a corresponding speech recognition system to generate utterance-level posterior probabilities as hypotheses for each audio stream, aligning the hypotheses and formatting them as word confusion networks with associated word-level posteriors probabilities, performing speaker recognition on each audio stream by a speaker identification algorithm that generates a stream of speaker-attributed word hypotheses, formatting speaker hypotheses with associated speaker label posterior probabilities and speaker-attributed hypotheses for each audio stream as a speaker confusion network, aligning the word and speaker confusion networks from all audio streams to each other to merge the posterior probabilities and align word and speaker labels, and creating a best speaker-attributed word transcript by selecting the sequence of word and speaker labels with the highest posterior probabilities.