-
Publication No.: US08484014B2
Publication Date: 2013-07-09
Application No.: US12362428
Filing Date: 2009-01-29
Applicant(s): Xiaohua Liu, Ming Zhou, Hao Wei, Jing Zhao, Matthew R. Scott, Long Jiang, Gang Chen
Inventor(s): Xiaohua Liu, Ming Zhou, Hao Wei, Jing Zhao, Matthew R. Scott, Long Jiang, Gang Chen
CPC Classification: G06F17/30684, G10L15/19
Abstract: A method and system for identifying documents relevant to a query that specifies a part of speech is provided. A retrieval system receives from a user an input query that includes a word and a part of speech. Upon receiving an input query that includes a word and a part of speech, the retrieval system identifies documents with a sentence that includes that word collocated with a word that is used as that part of speech. The retrieval system displays to the user an indication of the identified documents.
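The retrieval step described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the patented implementation: the `Document` type, the `find_documents` function, the hand-tagged toy corpus, and the choice to treat "collocated" as "within `window` tokens in the same sentence".

```python
from dataclasses import dataclass

# Hypothetical toy corpus: each sentence is pre-tagged as (word, part_of_speech) pairs.
# A real system would obtain tags from a POS tagger; these tags are hand-assigned.
@dataclass
class Document:
    doc_id: str
    sentences: list  # list of sentences, each a list of (word, pos) tuples


def _sentence_matches(sentence, word, pos, window):
    """True if `word` occurs within `window` tokens of another token tagged `pos`."""
    for i, (w, _) in enumerate(sentence):
        if w.lower() != word.lower():
            continue
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i and sentence[j][1] == pos:
                return True
    return False


def find_documents(docs, word, pos, window=1):
    """Return ids of documents with a sentence where `word` is collocated
    (assumed here to mean: within `window` tokens) with a word tagged `pos`."""
    return [
        doc.doc_id
        for doc in docs
        if any(_sentence_matches(s, word, pos, window) for s in doc.sentences)
    ]


DOCS = [
    Document("d1", [[("heavy", "ADJ"), ("rain", "NOUN"), ("fell", "VERB")]]),
    Document("d2", [[("rain", "NOUN"), ("stopped", "VERB")]]),
    Document("d3", [[("the", "DET"), ("sun", "NOUN"), ("shone", "VERB")]]),
]

# Query: the word "rain" plus the part of speech "ADJ".
print(find_documents(DOCS, "rain", "ADJ"))
```

In this sketch a query such as `("rain", "ADJ")` asks for documents in which some adjective sits next to "rain"; widening `window` loosens the notion of collocation.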
-
Publication No.: US20100114574A1
Publication Date: 2010-05-06
Application No.: US12362428
Filing Date: 2009-01-29
Applicant(s): Xiaohua Liu, Ming Zhou, Hao Wei, Jing Zhao, Matthew R. Scott, Long Jiang, Gang Chen
Inventor(s): Xiaohua Liu, Ming Zhou, Hao Wei, Jing Zhao, Matthew R. Scott, Long Jiang, Gang Chen
IPC Classification: G10L15/04
CPC Classification: G06F17/30684, G10L15/19
Abstract: A method and system for identifying documents relevant to a query that specifies a part of speech is provided. A retrieval system receives from a user an input query that includes a word and a part of speech. Upon receiving an input query that includes a word and a part of speech, the retrieval system identifies documents with a sentence that includes that word collocated with a word that is used as that part of speech. The retrieval system displays to the user an indication of the identified documents.
-
Publication No.: US08275604B2
Publication Date: 2012-09-25
Application No.: US12406722
Filing Date: 2009-03-18
Applicant(s): Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu
Inventor(s): Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu
CPC Classification: G06F17/28, G06F17/2827
Abstract: Embodiments for the adaptive learning of translation layout patterns to mine bilingual data are disclosed. In accordance with at least one embodiment, the adaptive learning of patterns to mine bilingual data includes processing a bilingual web page into a Document Object Model (DOM) tree. The embodiment further includes linking the bilingual snippet pairs of each node into a plurality of bilingual snippet pairs. The embodiment also includes determining one or more best fit candidate patterns based on the plurality of translation snippets via a Support Vector Machine classifier. The embodiment additionally includes mining one or more translation pairs from the bilingual web page using the one or more best fit candidate patterns. The translation pairs are further stored in a data storage. The one or more translation pairs include at least one of a term pair, a phrase pair, or a sentence pair.
-
Publication No.: US20100241416A1
Publication Date: 2010-09-23
Application No.: US12406722
Filing Date: 2009-03-18
Applicant(s): Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu
Inventor(s): Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu
IPC Classification: G06F17/28
CPC Classification: G06F17/28, G06F17/2827
Abstract: Embodiments for the adaptive learning of translation layout patterns to mine bilingual data are disclosed. In accordance with at least one embodiment, the adaptive learning of patterns to mine bilingual data includes processing a bilingual web page into a Document Object Model (DOM) tree. The embodiment further includes linking the bilingual snippet pairs of each node into a plurality of bilingual snippet pairs. The embodiment also includes determining one or more best fit candidate patterns based on the plurality of translation snippets via a Support Vector Machine classifier. The embodiment additionally includes mining one or more translation pairs from the bilingual web page using the one or more best fit candidate patterns. The translation pairs are further stored in a data storage. The one or more translation pairs include at least one of a term pair, a phrase pair, or a sentence pair.
-
Publication No.: US20110246173A1
Publication Date: 2011-10-06
Application No.: US12753023
Filing Date: 2010-04-01
Applicant(s): Henry Li, Matthew Robert Scott, Xiaohua Liu, Hao Wei, Ming Zhou
Inventor(s): Henry Li, Matthew Robert Scott, Xiaohua Liu, Hao Wei, Ming Zhou
IPC Classification: G06F17/28
CPC Classification: G06F17/2827, G06F17/2854
Abstract: Techniques for interactively presenting word-alignments of multilingual translations and automatically improving those translations based upon user feedback are described herein. With one or more implementations of the techniques described herein, a word-alignment user-interface (UI) concurrently displays a pair of bilingual sentences, where one is a translation of the other, and interactively highlights linked (i.e., “word-aligned”) words and phrases of the pair. Other implementations of the techniques described herein offer an option for a user to provide feedback about the existing word-alignments or realign the words or phrases. In still other described implementations, word-alignment is automatically improved based upon that user feedback.
-
Publication No.: US08930176B2
Publication Date: 2015-01-06
Application No.: US12753023
Filing Date: 2010-04-01
Applicant(s): Henry Li, Matthew Robert Scott, Xiaohua Liu, Hao Wei, Ming Zhou
Inventor(s): Henry Li, Matthew Robert Scott, Xiaohua Liu, Hao Wei, Ming Zhou
IPC Classification: G06F17/28
CPC Classification: G06F17/2827, G06F17/2854
Abstract: Techniques for interactively presenting word-alignments of multilingual translations and automatically improving those translations based upon user feedback are described herein. With one or more implementations of the techniques described herein, a word-alignment user-interface (UI) concurrently displays a pair of bilingual sentences, where one is a translation of the other, and interactively highlights linked (i.e., “word-aligned”) words and phrases of the pair. Other implementations of the techniques described herein offer an option for a user to provide feedback about the existing word-alignments or realign the words or phrases. In still other described implementations, word-alignment is automatically improved based upon that user feedback.
-
Publication No.: US20130159277A1
Publication Date: 2013-06-20
Application No.: US13326028
Filing Date: 2011-12-14
Applicant(s): Xiaohua Liu, Ming Zhou, Furu Wei
Inventor(s): Xiaohua Liu, Ming Zhou, Furu Wei
IPC Classification: G06F17/30
CPC Classification: G06F17/271, G06F16/901, G06F16/951, G06F17/278, G06F17/2785
Abstract: Target based indexing of micro-blog content may include extracting, labeling, and indexing data contained in micro-blog entries. For example, by adapting natural language processing (NLP) technologies to a micro-blog entry, data is extracted in order to create an index. In one embodiment, a search engine may access the index in order to return results of a search query. In another embodiment, a user interface may display micro-blog entries categorically, allowing the user to access micro-blog entries by event, quote, opinion, or other category.
-
Publication No.: US20140006012A1
Publication Date: 2014-01-02
Application No.: US13539674
Filing Date: 2012-07-02
Applicant(s): Ming Zhou, Furu Wei, Xiaohua Liu, Hong Sun, Yajuan Duan, Chengjie Sun, Heung-Yeung Shum
Inventor(s): Ming Zhou, Furu Wei, Xiaohua Liu, Hong Sun, Yajuan Duan, Chengjie Sun, Heung-Yeung Shum
IPC Classification: G06F17/27
CPC Classification: G06F16/3329, G06F16/3344, G06F17/278
Abstract: Techniques described enable answering a natural language question using machine learning-based methods to gather and analyze evidence from web searches. A received natural language question is analyzed to extract query units and to determine a question type, answer type, and/or lexical answer type using rules-based heuristics and/or machine learning trained classifiers. Query generation templates are employed to generate a plurality of ranked queries to be used to gather evidence to determine the answer to the natural language question. Candidate answers are extracted from the results based on the answer type and/or lexical answer type, and ranked using a ranker previously trained offline. Confidence levels are calculated for the candidate answers and top answer(s) may be provided to the user if the confidence levels of the top answer(s) surpass a threshold.
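The final step of this abstract, returning top answers only when their confidence surpasses a threshold, might look roughly like the sketch below. The function name `select_answers`, the default `threshold` and `top_k` values, and the assumption that each candidate already carries a confidence score from an upstream ranker are all hypothetical; the abstract does not specify them.

```python
def select_answers(candidates, threshold=0.5, top_k=1):
    """Pick the top-ranked candidate answers whose confidence exceeds `threshold`.

    `candidates` is a list of (answer, confidence) pairs; the confidence values
    are assumed to come from a ranker and scorer that this sketch does not model.
    """
    # Rank candidates from most to least confident.
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    # Keep only the top_k answers that strictly surpass the threshold.
    return [answer for answer, confidence in ranked[:top_k] if confidence > threshold]


# Hypothetical candidate answers with made-up confidence levels.
candidates = [("Lyon", 0.30), ("Paris", 0.90), ("Nice", 0.55)]
print(select_answers(candidates))           # only the single best answer is considered
print(select_answers(candidates, top_k=3))  # up to three answers above the threshold
```

Returning an empty list when nothing clears the threshold lets the caller decline to answer rather than present a low-confidence guess.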
-
Publication No.: US08862459B2
Publication Date: 2014-10-14
Application No.: US13087407
Filing Date: 2011-04-15
Applicant(s): Long Jiang, Ming Zhou, Su Hao
Inventor(s): Long Jiang, Ming Zhou, Su Hao
CPC Classification: G06F17/21, G06F17/24, G06F17/2863
Abstract: Embodiments are disclosed for automatically generating a banner given a first scroll sentence and a second scroll sentence of a Chinese couplet. The first and/or second scroll sentence can be generated by an automatic computer system or by a human (e.g., manually generated and then provided as input to an automated banner generation system) or obtained from any source (e.g., a book) and provided as input. In one embodiment, an information retrieval process is utilized to identify banner candidates that best match the first and second scroll sentences. In one embodiment, candidate banners are automatically generated. In one embodiment, a ranking model is applied in order to rank banner candidates derived from the banner search and generation processes. One or more banners are then selected from the ranked banner candidates.
-
Publication No.: US08000955B2
Publication Date: 2011-08-16
Application No.: US11788448
Filing Date: 2007-04-20
Applicant(s): Long Jiang, Ming Zhou, Su Hao
Inventor(s): Long Jiang, Ming Zhou, Su Hao
IPC Classification: G06F17/27
CPC Classification: G06F17/21, G06F17/24, G06F17/2863
Abstract: Embodiments are disclosed for automatically generating a banner given a first scroll sentence and a second scroll sentence of a Chinese couplet. The first and/or second scroll sentence can be generated by an automatic computer system or by a human (e.g., manually generated and then provided as input to an automated banner generation system) or obtained from any source (e.g., a book) and provided as input. In one embodiment, an information retrieval process is utilized to identify banner candidates that best match the first and second scroll sentences. In one embodiment, candidate banners are automatically generated. In one embodiment, a ranking model is applied in order to rank banner candidates derived from the banner search and generation processes. One or more banners are then selected from the ranked banner candidates.