-
Publication number: US11301633B2
Publication date: 2022-04-12
Application number: US16313337
Application date: 2018-12-25
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ying Wang, Min Li, Mengyan Lu
IPC: G06K9/00, G06F40/295, G06N20/00, G06F16/28, G06F40/284
Abstract: A technical document scanner disclosed herein determines and categorizes various common issues among a large number of documents. An implementation of the technical document scanner is implemented using various computer process instructions including scanning a technical document to extract content, applying named entity recognition on the extracted content from the technical document to extract named entities, applying relation extraction on the named entities to extract relations between the named entities, and analyzing the relations between the entities to compose lists of high relevance entities for issue checking.
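The stages named in this abstract (entity extraction, relation extraction, then ranking entities by their relations) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the regular expression stands in for a trained named entity recognizer, same-sentence co-occurrence stands in for relation extraction, and the function names are hypothetical rather than taken from the patent.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stand-in for a trained NER model: treat CamelCase words as entities.
ENTITY_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\b")


def extract_entities(sentence):
    """Return the distinct entity names found in one sentence, sorted."""
    return sorted(set(ENTITY_PATTERN.findall(sentence)))


def extract_relations(text):
    """Stand-in for relation extraction: pair entities that share a sentence."""
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        relations.extend(combinations(extract_entities(sentence), 2))
    return relations


def high_relevance_entities(text, top_n=3):
    """Rank entities by how many relations they take part in."""
    counts = Counter()
    for left, right in extract_relations(text):
        counts[left] += 1
        counts[right] += 1
    return [name for name, _ in counts.most_common(top_n)]


sample = ("DataStore calls AuthService. "
          "AuthService depends on TokenCache. "
          "LogWriter is unused.")
print(high_relevance_entities(sample, top_n=1))
```

In this toy input, the entity that appears in the most relations is ranked first, which is the kind of list the abstract describes handing to issue checking.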
-
Publication number: US20160180247A1
Publication date: 2016-06-23
Application number: US14578424
Application date: 2014-12-20
Applicant: Microsoft Technology Licensing, LLC
Inventor: Min Li, Ruofei Zhang, Muhammad Adnan Alam
CPC classification number: G06N99/005, G06F17/30864, G06Q30/0277
Abstract: Functionality is described herein for analyzing an input linguistic item, such as a query, in a series of stages. The linguistic item includes one or more candidate items. In a first stage, a brand classifier component determines whether the linguistic item specifies at least one brand, to provide a classifier output result. In a second stage, a tagging component generates a set of tags for at least some of the candidate items in the linguistic item, based, in part, on the classifier output result, to generate a tagging output result. An action-taking component then generates at least one result item based on the tagging output result. Functionality is also described herein for producing the brand classifier component and the tagging component using machine-learning training techniques. The training techniques may include provisions to address the later appearance of new brands that do not appear in a brand dictionary.
-
Publication number: US12229513B2
Publication date: 2025-02-18
Application number: US18365504
Application date: 2023-08-04
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ying Wang, Min Li, Mengyan Lu
IPC: G06F40/295, G06F16/28, G06F40/284, G06N20/00
Abstract: A technical document scanner disclosed herein determines and categorizes various common issues among a large number of documents. An implementation of the technical document scanner is implemented using various computer process instructions including scanning a technical document to extract content, applying named entity recognition on the extracted content from the technical document to extract named entities, applying relation extraction on the named entities to extract relations between the named entities, and analyzing the relations between the entities to compose lists of high relevance entities for issue checking.
-
Publication number: US11321529B2
Publication date: 2022-05-03
Application number: US16313758
Application date: 2018-12-25
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ying Wang, Min Li, Mengyan Lu
IPC: G06F40/295, G06F40/205
Abstract: A date extractor disclosed herein allows extracting dates and date ranges from documents. An implementation of the date extractor is implemented using various computer process instructions including scanning a document to generate a plurality of tokens, assigning labels to token using named entity recognition machine to generate a named entity vector, extracting dates from the named entity vector by comparing each of the named entities of the named entity vector to predetermined patterns of dates to generate a date vector, generating a plurality of date pairs from the date vector, and extracting date-ranges by comparing the plurality of date pairs to predetermined patterns of date ranges.
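The steps in this abstract (tokenize, label tokens, match date patterns, form date pairs, match range patterns) can be sketched as follows. This is a minimal sketch under stated assumptions: a regular expression for ISO-style dates stands in for the named entity recognition machine, the range pattern is simply "two dates separated by one connector word", and all names are hypothetical.

```python
import re
from itertools import combinations

# Assumed date pattern: ISO-style YYYY-MM-DD only.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")
# Assumed range pattern: one of these words sits between two dates.
RANGE_CONNECTORS = {"to", "through", "until"}


def tokenize(text):
    """Split on whitespace and strip surrounding punctuation from each token."""
    return [tok.strip(".,;()") for tok in text.split()]


def label_tokens(tokens):
    """Stand-in for NER: produce (index, token, label) triples."""
    return [(i, tok, "DATE" if DATE_PATTERN.fullmatch(tok) else "O")
            for i, tok in enumerate(tokens)]


def extract_dates(labeled):
    """Keep only the tokens labeled as dates, with their positions."""
    return [(i, tok) for i, tok, label in labeled if label == "DATE"]


def extract_ranges(tokens, dates):
    """Form every date pair and keep those that fit the range pattern."""
    ranges = []
    for (i, start), (j, end) in combinations(dates, 2):
        if j - i == 2 and tokens[i + 1].lower() in RANGE_CONNECTORS:
            ranges.append((start, end))
    return ranges


def find_dates_and_ranges(text):
    tokens = tokenize(text)
    dates = extract_dates(label_tokens(tokens))
    return [tok for _, tok in dates], extract_ranges(tokens, dates)


text = "Support runs from 2018-12-25 to 2022-05-03. Renewed 2023-08-04."
print(find_dates_and_ranges(text))
```

Generating all pairs first and filtering afterwards mirrors the order of operations in the abstract; a production system would presumably use richer patterns than the single connector rule assumed here.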
-
Publication number: US09659259B2
Publication date: 2017-05-23
Application number: US14578424
Application date: 2014-12-20
Applicant: Microsoft Technology Licensing, LLC
Inventor: Min Li, Ruofei Zhang, Muhammad Adnan Alam
CPC classification number: G06N99/005, G06F17/30864, G06Q30/0277
Abstract: Functionality is described herein for analyzing an input linguistic item, such as a query, in a series of stages. The linguistic item includes one or more candidate items. In a first stage, a brand classifier component determines whether the linguistic item specifies at least one brand, to provide a classifier output result. In a second stage, a tagging component generates a set of tags for at least some of the candidate items in the linguistic item, based, in part, on the classifier output result, to generate a tagging output result. An action-taking component then generates at least one result item based on the tagging output result. Functionality is also described herein for producing the brand classifier component and the tagging component using machine-learning training techniques. The training techniques may include provisions to address the later appearance of new brands that do not appear in a brand dictionary.
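The staged flow in this abstract (a brand classifier, then a tagger that consults the classifier result, then an action step) can be sketched roughly as below. The abstract describes machine-learned components; this sketch substitutes simple dictionary lookups purely for illustration, and the brand list, product list, and function names are all hypothetical.

```python
# Hypothetical dictionaries standing in for trained models.
KNOWN_BRANDS = {"contoso", "fabrikam"}
PRODUCT_WORDS = {"laptop", "phone", "shoes"}


def classify_brand(query):
    """Stage 1: report whether the query appears to name at least one brand."""
    return any(tok in KNOWN_BRANDS for tok in query.lower().split())


def tag_items(query, has_brand):
    """Stage 2: tag each candidate item, using the stage 1 result as an input."""
    tags = []
    for tok in query.lower().split():
        if has_brand and tok in KNOWN_BRANDS:
            tags.append((tok, "BRAND"))
        elif tok in PRODUCT_WORDS:
            tags.append((tok, "PRODUCT"))
        else:
            tags.append((tok, "OTHER"))
    return tags


def take_action(tags):
    """Action step: build one result item from the tagging output."""
    return {
        "brand": next((tok for tok, tag in tags if tag == "BRAND"), None),
        "product": next((tok for tok, tag in tags if tag == "PRODUCT"), None),
    }


def analyze(query):
    has_brand = classify_brand(query)
    return take_action(tag_items(query, has_brand))


print(analyze("Contoso laptop deals"))
print(analyze("cheap phone"))
```

A fixed dictionary cannot handle brands that appear after it was built, which is the gap the abstract says its training techniques are meant to address.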
-
Publication number: US11763088B2
Publication date: 2023-09-19
Application number: US17657405
Application date: 2022-03-31
Applicant: Microsoft Technology Licensing, LLC
Inventor: Ying Wang, Min Li, Mengyan Lu
IPC: G06N20/00, G06F40/295, G06F16/28, G06F40/284
CPC classification number: G06F40/295, G06F16/288, G06F40/284, G06N20/00, G06V2201/134
Abstract: A technical document scanner disclosed herein determines and categorizes various common issues among a large number of documents. An implementation of the technical document scanner is implemented using various computer process instructions including scanning a technical document to extract content, applying named entity recognition on the extracted content from the technical document to extract named entities, applying relation extraction on the named entities to extract relations between the named entities, and analyzing the relations between the entities to compose lists of high relevance entities for issue checking.