SYSTEM AND METHOD OF RECORDING UTTERANCES USING UNMANAGED CROWDS FOR NATURAL LANGUAGE PROCESSING
    1.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2017044369A1

    Publication date: 2017-03-16

    Application No.: PCT/US2016/049855

    Application date: 2016-09-01

    Abstract: A system and method of recording utterances for building Named Entity Recognition ("NER") models, which are used to build dialog systems in which a computer listens and responds to human voice dialog. Utterances to be uttered may be provided to users through their mobile devices, which may record the user uttering (e.g., verbalizing, speaking, etc.) the utterances and upload the recording to a computer for processing. The use of the user's mobile device, which is programmed with an utterance collection application (e.g., configured as a mobile app), facilitates the use of crowd-sourcing human intelligence tasking for widespread collection of utterances from a population of users. As such, obtaining large datasets for building NER models may be facilitated by the system and method disclosed herein.

    SYSTEM AND METHOD FOR ELICITING OPEN-ENDED NATURAL LANGUAGE RESPONSES TO QUESTIONS TO TRAIN NATURAL LANGUAGE PROCESSORS
    2.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2017044415A1

    Publication date: 2017-03-16

    Application No.: PCT/US2016/050389

    Application date: 2016-09-06

    Abstract: Systems and methods for gathering text commands in response to a command context using a first crowdsourced job are discussed herein. A command context for a natural language processing system may be identified, where the command context is associated with a command context condition to provide commands to the natural language processing system. One or more command creators associated with one or more command creation devices may be selected. A first application on the one or more command creation devices may be configured to display command creation instructions for each of the one or more command creators to provide text commands that satisfy the command context, and to display a field for capturing a user-generated text entry to satisfy the command creation condition in accordance with the command creation instructions. Systems and methods for reviewing the text commands using further crowdsourced jobs are also presented herein.

    SYSTEM AND METHOD FOR PROVIDING WORDS OR PHRASES TO BE UTTERED BY MEMBERS OF A CROWD AND PROCESSING THE UTTERANCES IN CROWD-SOURCED CAMPAIGNS TO FACILITATE SPEECH ANALYSIS
    3.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2017044370A1

    Publication date: 2017-03-16

    Application No.: PCT/US2016/049856

    Application date: 2016-09-01

    Abstract: Systems and methods of providing text related to utterances, and gathering voice data in response to the text, are provided herein. In various implementations, an identification token that identifies a first file for a voice data collection campaign, and a second file for a session script, may be received from a natural language processing training device. The first file and the second file may be used to configure the mobile application to display a sequence of screens, each of the sequence of screens containing text of at least one utterance specified in the voice data collection campaign. Voice data may be received from the natural language processing training device in response to user interaction with the text of the at least one utterance. The voice data and the text may be stored in a transcription library.

    SYSTEM AND METHOD OF ANNOTATING UTTERANCES BASED ON TAGS ASSIGNED BY UNMANAGED CROWDS
    4.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2017044409A1

    Publication date: 2017-03-16

    Application No.: PCT/US2016/050373

    Application date: 2016-09-06

    CPC classification number: G06F17/241 G06F17/218 G06F17/278 G06F17/2785

    Abstract: A system and method of tagging utterances with Named Entity Recognition ("NER") labels using unmanaged crowds is provided. The system may generate various annotation jobs in which a user, among a crowd, is asked to tag which parts of an utterance, if any, relate to various entities associated with a domain. For a given domain that is associated with a number of entities that exceeds a threshold N value, multiple batches of jobs (each batch having jobs that have a limited number of entities for tagging) may be used to tag a given utterance from that domain. This reduces the cognitive load imposed on a user, and prevents the user from having to tag more than N entities. As such, a domain with a large number of entities may be tagged efficiently by crowd participants without overloading each crowd participant with too many entities to tag.
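
The batching idea in this abstract can be illustrated with a minimal sketch. It is not taken from the application: the function name, the job dictionary layout, and the sample entities are all hypothetical, and the only behavior assumed from the abstract is that no single job asks a crowd participant to tag more than N entities.

```python
from typing import Dict, List


def build_annotation_batches(
    utterance: str, entities: List[str], max_entities: int
) -> List[Dict[str, object]]:
    """Split a domain's entities into jobs of at most max_entities each.

    Hypothetical helper: the job layout is an assumption made for
    illustration, not the format used in the application.
    """
    if max_entities < 1:
        raise ValueError("max_entities must be at least 1")
    jobs = []
    # Walk the entity list in steps of max_entities (the threshold N).
    for start in range(0, len(entities), max_entities):
        jobs.append(
            {
                "utterance": utterance,
                "entities": entities[start:start + max_entities],
            }
        )
    return jobs


# Example: a domain with five entities and N = 2 yields three jobs.
jobs = build_annotation_batches(
    "play jazz in the kitchen at 7pm",
    ["genre", "room", "time", "artist", "album"],
    max_entities=2,
)
```

Each job repeats the same utterance with a different slice of the entity list, so the tags from all batches would still need to be merged per utterance afterwards; that merge step is not shown here.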

    SYSTEM AND METHOD FOR VALIDATING NATURAL LANGUAGE CONTENT USING CROWDSOURCED VALIDATION JOBS
    5.
    Invention application
    Status: Under examination (published)

    Publication No.: WO2017044368A1

    Publication date: 2017-03-16

    Application No.: PCT/US2016/049853

    Application date: 2016-09-01

    Abstract: Systems and methods of validating transcriptions of natural language content using crowdsourced validation jobs are provided herein. In various implementations, a transcription pair comprising natural language content and text corresponding to a transcription of the natural language content may be gathered. A first group of validation devices may be selected for reviewing the transcription pair. A first crowdsourced validation job may be created for the first group of validation devices. The first crowdsourced validation job may be provided to the first group of validation devices. A vote representing whether or not the text accurately represents the natural language content may be received from each of the first group of validation devices. A validation score may be assigned to the transcription pair based, at least in part, on the votes from each of the first group of validation devices.
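
The abstract says only that the validation score is based, at least in part, on the votes. As one possible reading, the sketch below scores a transcription pair by the fraction of validators who voted that the text matches the audio. The function name and the simple-fraction rule are assumptions for illustration, not the scoring method of the application.

```python
from typing import Iterable


def validation_score(votes: Iterable[bool]) -> float:
    """Return the fraction of validators who accepted the transcription.

    Assumption: each vote is True when the validator judged that the text
    accurately represents the natural language content, False otherwise.
    """
    vote_list = list(votes)
    if not vote_list:
        raise ValueError("at least one vote is required")
    # True counts as 1 and False as 0, so the sum is the number of accepts.
    return sum(vote_list) / len(vote_list)


# Example: three of four validators accept the transcription pair.
score = validation_score([True, True, False, True])
```

A real system might weight votes by validator reliability or require a minimum number of votes before assigning a score; neither is modeled here.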
