Invention Grant
- Patent Title: Named entity extraction from a block of text
- Application No.: US15364639
- Application Date: 2016-11-30
- Publication No.: US10002123B2
- Publication Date: 2018-06-19
- Inventor: Brian Whitman, Hui Cao
- Applicant: SPOTIFY AB
- Applicant Address: SE Stockholm
- Assignee: Spotify AB
- Current Assignee: Spotify AB
- Current Assignee Address: SE Stockholm
- Agency: Merchant & Gould P.C.
- Main IPC: G06F17/00
- IPC: G06F17/00 ; G06F17/27 ; G06F17/21 ; G06F17/30 ; G10L15/26

Abstract:
A data processing method, program, and apparatus for identifying a document within a block of text. A block of text is tokenized into a plurality of text tokens according to at least one rule parser. Each of the plurality of text tokens is sequentially compared to a plurality of document tokens to determine whether the text token matches one of the plurality of document tokens. The plurality of document tokens correspond to a plurality of documents which have been tokenized according to the at least one rule parser. Each matched text token is filtered according to predetermined filtering criteria to generate one or more candidate text tokens. It is then determined whether a sequence of candidate text tokens that occur in sequential order within the block of text matches a sequence of document tokens. If so, then it is determined that the document has been identified within the block of text. The document can correspond to an artist, a song name, or misspellings and aliases thereof.
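
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, only one plausible reading of the abstract under stated assumptions: the "rule parser" is taken to be lowercasing plus splitting on non-alphanumeric characters, the "predetermined filtering criteria" are reduced to a hypothetical `keep` predicate that drops single-character tokens, and aliases or misspellings are supplied as extra surface forms mapped to a canonical name. The function names `tokenize` and `find_documents` are illustrative, not taken from the patent.

```python
import re


def tokenize(text):
    """Hypothetical rule parser: lowercase, then split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def find_documents(text, surface_forms, keep=lambda tok: len(tok) > 1):
    """Return canonical names whose token sequences occur in `text`.

    `surface_forms` maps each name, alias, or misspelling to a canonical
    document name. `keep` stands in for the filtering criteria (assumption).
    """
    # Tokenize every document with the same rule parser as the text.
    index = {tuple(tokenize(form)): name for form, name in surface_forms.items()}
    vocabulary = {tok for seq in index for tok in seq}
    max_len = max((len(seq) for seq in index), default=0)

    tokens = tokenize(text)
    # A text token is a candidate if it matches a document token and passes the filter.
    candidate = [tok in vocabulary and keep(tok) for tok in tokens]

    found = []
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest run of consecutive candidate tokens first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if all(candidate[i:i + n]) and span in index:
                found.append(index[span])
                matched = n
                break
        i += matched or 1
    return found


forms = {
    "The Beatles": "The Beatles",
    "Beetles": "The Beatles",  # misspelling treated as an alias
    "Daft Punk": "Daft Punk",
}
print(find_documents("Daft Punk and the Beetles!", forms))
```

The sketch matches greedily, preferring the longest token sequence at each position, which is a design choice of this example rather than something the abstract specifies.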
Public/Granted literature
- US20170083505A1 NAMED ENTITY EXTRACTION FROM A BLOCK OF TEXT Public/Granted day: 2017-03-23