Document transcription system training
    1.
    Granted patent
    Document transcription system training (status: in force)

    Publication No.: US08335688B2

    Publication date: 2012-12-18

    Application No.: US10922513

    Filing date: 2004-08-20

    IPC classification: G10L15/26 G10L15/18

    Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
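Illustrative sketch (not taken from the patent): the abstract describes replacing written text that has several possible spoken forms with the form that was actually spoken. The Python below shows that idea on a toy scale. The `SPOKEN_FORMS` table, the function names, and the use of word-level string similarity against a rough recognition hypothesis are all assumptions made for illustration; a real system would presumably score candidates against the audio itself.

```python
from difflib import SequenceMatcher

# Hypothetical table: written form -> candidate spoken forms.
SPOKEN_FORMS = {
    "10/1/1993": [
        "october first nineteen ninety three",
        "ten one ninety three",
        "the first of october nineteen ninety three",
    ],
}


def best_spoken_form(candidates, heard):
    """Pick the candidate whose words best match the heard words.

    Word-level similarity is only a stand-in for acoustic scoring.
    """
    heard_words = heard.split()
    return max(
        candidates,
        key=lambda c: SequenceMatcher(None, c.split(), heard_words).ratio(),
    )


def revise_transcript(tokens, heard):
    """Replace each multi-form token with its most plausible spoken form."""
    revised = []
    for tok in tokens:
        forms = SPOKEN_FORMS.get(tok)
        revised.append(best_spoken_form(forms, heard) if forms else tok)
    return " ".join(revised)


if __name__ == "__main__":
    print(revise_transcript(["seen", "on", "10/1/1993"],
                            "seen on ten one ninety three"))
```

The revised string, rather than the original written form, would then be paired with the audio as training text.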

    Automated Extraction of Semantic Content and Generation of a Structured Document from Speech
    2.
    Patent application
    Automated Extraction of Semantic Content and Generation of a Structured Document from Speech (status: under examination, published)

    Publication No.: US20100299135A1

    Publication date: 2010-11-25

    Application No.: US12471167

    Filing date: 2009-05-22

    IPC classification: G10L15/26 G06F17/27

    Abstract: Techniques are disclosed for automatically generating structured documents based on speech, including identification of relevant concepts and their interpretation. In one embodiment, a structured document generator uses an integrated process to generate a structured textual document (such as a structured textual medical report) based on a spoken audio stream. The spoken audio stream may be recognized using a language model which includes a plurality of sub-models arranged in a hierarchical structure. Each of the sub-models may correspond to a concept that is expected to appear in the spoken audio stream. Different portions of the spoken audio stream may be recognized using different sub-models. The resulting structured textual document may have a hierarchical structure that corresponds to the hierarchical structure of the language sub-models that were used to generate the structured textual document.
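Illustrative sketch (not taken from the application): the abstract says the output document mirrors the hierarchy of the language sub-models used during recognition. The toy Python below shows one way such a correspondence could look. The `SubModel` class, the cue words, and the rule "a cue word switches the active sub-model" are assumptions made for illustration; they stand in for real per-concept language models.

```python
from dataclasses import dataclass, field


@dataclass
class SubModel:
    name: str
    cues: tuple = ()                      # hypothetical words that open this concept
    children: list = field(default_factory=list)


def index_cues(model, path=()):
    """Map each cue word to the path of the sub-model it opens."""
    path = path + (model.name,)
    table = {cue: path for cue in model.cues}
    for child in model.children:
        table.update(index_cues(child, path))
    return table


def skeleton(model):
    """Empty document node with the same shape as the sub-model tree."""
    return {"text": [], "children": {c.name: skeleton(c) for c in model.children}}


def build_document(model, words):
    """Route each word to the active sub-model's node in the document."""
    cues = index_cues(model)
    doc = {model.name: skeleton(model)}
    current = (model.name,)
    for word in words:
        if word in cues:
            current = cues[word]          # switch to the cued sub-model
            continue
        node = doc[current[0]]
        for name in current[1:]:
            node = node["children"][name]
        node["text"].append(word)
    return doc


REPORT = SubModel("report", children=[
    SubModel("history", cues=("history",)),
    SubModel("medications", cues=("medications",),
             children=[SubModel("dosage", cues=("dose",))]),
])

if __name__ == "__main__":
    spoken = "history asthma medications albuterol dose two puffs".split()
    print(build_document(REPORT, spoken))
```

Because the document skeleton is derived from the sub-model tree, the nesting of sections in the output follows the nesting of the sub-models by construction.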

    Automated extraction of semantic content and generation of a structured document from speech
    4.
    Granted patent
    Automated extraction of semantic content and generation of a structured document from speech (status: in force)

    Publication No.: US07584103B2

    Publication date: 2009-09-01

    Application No.: US10923517

    Filing date: 2004-08-20

    IPC classification: G10L15/18

    CPC classification: G10L15/1815 G16H15/00

    Abstract: Techniques are disclosed for automatically generating structured documents based on speech, including identification of relevant concepts and their interpretation. In one embodiment, a structured document generator uses an integrated process to generate a structured textual document (such as a structured textual medical report) based on a spoken audio stream. The spoken audio stream may be recognized using a language model which includes a plurality of sub-models arranged in a hierarchical structure. Each of the sub-models may correspond to a concept that is expected to appear in the spoken audio stream. Different portions of the spoken audio stream may be recognized using different sub-models. The resulting structured textual document may have a hierarchical structure that corresponds to the hierarchical structure of the language sub-models that were used to generate the structured textual document.

    Automated Extraction of Semantic Content and Generation of a Structured Document from Speech
    5.
    Patent application
    Automated Extraction of Semantic Content and Generation of a Structured Document from Speech (status: under examination, published)

    Publication No.: US20090048833A1

    Publication date: 2009-02-19

    Application No.: US12253241

    Filing date: 2008-10-17

    IPC classification: G10L15/26

    CPC classification: G10L15/1815 G16H15/00

    Abstract: Techniques are disclosed for automatically generating structured documents based on speech, including identification of relevant concepts and their interpretation. In one embodiment, a structured document generator uses an integrated process to generate a structured textual document (such as a structured textual medical report) based on a spoken audio stream. The spoken audio stream may be recognized using a language model which includes a plurality of sub-models arranged in a hierarchical structure. Each of the sub-models may correspond to a concept that is expected to appear in the spoken audio stream. Different portions of the spoken audio stream may be recognized using different sub-models. The resulting structured textual document may have a hierarchical structure that corresponds to the hierarchical structure of the language sub-models that were used to generate the structured textual document.

    Document transcription system training
    7.
    Patent application
    Document transcription system training (status: in force)

    Publication No.: US20060041427A1

    Publication date: 2006-02-23

    Application No.: US10922513

    Filing date: 2004-08-20

    IPC classification: G10L15/26

    Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.

    Content-Based Audio Playback Emphasis
    8.
    Patent application
    Content-Based Audio Playback Emphasis (status: in force)

    Publication No.: US20100318347A1

    Publication date: 2010-12-16

    Application No.: US12859883

    Filing date: 2010-08-20

    IPC classification: G06F17/27

    Abstract: Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
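Illustrative sketch (not taken from the application): the abstract describes slowing playback for regions that are highly relevant or likely mis-transcribed. A minimal Python version of that rule follows. The `Region` fields, the threshold values, and the two playback rates are assumptions chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class Region:
    start: float        # seconds into the audio stream
    end: float
    relevance: float    # 0..1, hypothetical importance score
    confidence: float   # 0..1, hypothetical recognition confidence


def playback_rate(region, relevance_min=0.7, confidence_min=0.8,
                  slow=0.75, normal=1.25):
    """Slow down regions that are highly relevant or likely wrong."""
    emphasized = (region.relevance >= relevance_min
                  or region.confidence < confidence_min)
    return slow if emphasized else normal


def playback_plan(regions):
    """Return (start, end, rate) for each region, in order."""
    return [(r.start, r.end, playback_rate(r)) for r in regions]


def playback_duration(regions):
    """Total listening time once per-region rates are applied."""
    return sum((r.end - r.start) / playback_rate(r) for r in regions)


if __name__ == "__main__":
    regions = [
        Region(0.0, 10.0, relevance=0.2, confidence=0.95),   # routine, likely correct
        Region(10.0, 14.0, relevance=0.9, confidence=0.95),  # highly relevant
        Region(14.0, 20.0, relevance=0.1, confidence=0.40),  # likely mis-transcribed
    ]
    for start, end, rate in playback_plan(regions):
        print(f"{start:5.1f}-{end:5.1f}s at {rate}x")
```

Only the routine, high-confidence region is played faster than real time; the other two are slowed so the proofreader has more time where errors matter most or are most likely.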

    Document extension in dictation-based document generation workflow
    9.
    Granted patent
    Document extension in dictation-based document generation workflow (status: in force)

    Publication No.: US08781829B2

    Publication date: 2014-07-15

    Application No.: US13527347

    Filing date: 2012-06-19

    IPC classification: G10L15/00

    Abstract: An automatic speech recognizer is used to produce a structured document representing the contents of human speech. A best practice is applied to the structured document to produce a conclusion, such as a conclusion that required information is missing from the structured document. Content is inserted into the structured document based on the conclusion, thereby producing a modified document. The inserted content may be obtained by prompting a human user for the content and receiving input representing the content from the human user.
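Illustrative sketch (not taken from the patent): the abstract outlines three steps: apply a best practice to the structured document, conclude that required information is missing, and insert content obtained from the user. The Python below models the best practice as a simple list of required fields, which is an assumption; the field names and the `ask` callback are likewise hypothetical.

```python
def missing_fields(document, required):
    """Apply the 'best practice': every required field must be non-empty."""
    return [name for name in required if not document.get(name)]


def extend_document(document, required, ask):
    """Return a modified copy with missing fields filled in via `ask`.

    `ask` is any callable that takes a field name and returns the
    user's answer (for example, a wrapper around input()).
    """
    modified = dict(document)
    for name in missing_fields(document, required):
        modified[name] = ask(name)
    return modified


if __name__ == "__main__":
    report = {"diagnosis": "asthma", "medication": "albuterol"}
    required = ("diagnosis", "medication", "allergies")
    # A canned answer stands in for an interactive prompt.
    print(extend_document(report, required, ask=lambda name: "none known"))
```

Returning a copy keeps the recognizer's original output available for comparison with the modified document.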

    Content-based audio playback emphasis
    10.
    Granted patent
    Content-based audio playback emphasis (status: in force)

    Publication No.: US07844464B2

    Publication date: 2010-11-30

    Application No.: US11187119

    Filing date: 2005-07-22

    IPC classification: G10L21/00

    Abstract: Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.