Invention Grant
US08838599B2 Efficient lexical trending topic detection over streams of data using a modified sequitur algorithm
In force
- Patent Title: Efficient lexical trending topic detection over streams of data using a modified sequitur algorithm
- Application No.: US12780850
- Application Date: 2010-05-14
- Publication No.: US08838599B2
- Publication Date: 2014-09-16
- Inventor: Zhichen Xu, Yun Fu, Neal Sample
- Applicant: Zhichen Xu, Yun Fu, Neal Sample
- Applicant Address: US CA Sunnyvale
- Assignee: Yahoo! Inc.
- Current Assignee: Yahoo! Inc.
- Current Assignee Address: US CA Sunnyvale
- Agency: Martine Penilla Group, LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Embodiments are directed towards a Modified Sequitur algorithm (MSA) that uses pipelining and indexed arrays to identify trending topics within a plurality of documents having user generated content (UGC). The documents are parallelized and distributed across a plurality of network devices, each of which places at least some of the received documents into a buffer. The MSA may then be applied to the documents within the buffer to identify n-grams or phrases within the documents' contents. The identified phrases are further analyzed to remove extraneous co-occurrences of phrases and/or words based on a part-of-speech analysis. A weighting of the remaining phrases is used to identify trending topic phrases. Links to content in the plurality of UGC documents that is associated with the trending topic phrases may then be displayed to a client device.
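The flow the abstract describes (buffer documents, find repeated phrases, prune extraneous ones, weight what remains) can be illustrated with a minimal Python sketch. This is not the patented Modified Sequitur algorithm: it swaps in plain n-gram counting for the grammar-based phrase discovery, a hypothetical stop-word list for the part-of-speech analysis, and an assumed "count times phrase length" score for the weighting. All function names and parameters here are illustrative assumptions.

```python
from collections import Counter

# Hypothetical stop-word list standing in for the part-of-speech filter.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "is", "and", "for", "at"}


def ngrams(tokens, n):
    """Return every contiguous n-token tuple in tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def repeated_phrases(buffer, max_n=3, min_count=2):
    """Count n-grams (n = 1..max_n) seen at least min_count times in the buffer.

    Plain counting is used here in place of Sequitur-style grammar inference.
    """
    counts = Counter()
    for doc in buffer:
        tokens = doc.lower().split()
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return {p: c for p, c in counts.items() if c >= min_count}


def contains(longer, shorter):
    """True if shorter appears as a contiguous run inside longer."""
    k = len(shorter)
    return any(longer[i:i + k] == shorter for i in range(len(longer) - k + 1))


def prune(phrases):
    """Drop phrases that start or end with a stop word, then drop phrases
    that are covered by a longer phrase with the same count."""
    kept = {p: c for p, c in phrases.items()
            if p[0] not in STOP_WORDS and p[-1] not in STOP_WORDS}

    def subsumed(p):
        return any(len(q) > len(p) and kept[q] == kept[p] and contains(q, p)
                   for q in kept)

    return {p: c for p, c in kept.items() if not subsumed(p)}


def trending_topics(buffer, top_k=5):
    """Rank the surviving phrases by an assumed weight: count * phrase length."""
    survivors = prune(repeated_phrases(buffer))
    scored = [(" ".join(p), c * len(p)) for p, c in survivors.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top_k]


if __name__ == "__main__":
    buffer = [
        "world cup final tonight",
        "watching the world cup final",
        "world cup final is on",
        "coffee is on the table",
    ]
    print(trending_topics(buffer))
```

In this sketch, shorter phrases such as "world cup" are discarded when a longer phrase containing them occurs equally often, which is one simple reading of removing extraneous co-occurrences; the patent's own criteria may differ.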
Public/Granted literature
- US20110282874A1 EFFICIENT LEXICAL TRENDING TOPIC DETECTION OVER STREAMS OF DATA USING A MODIFIED SEQUITUR ALGORITHM (Public/Granted day: 2011-11-17)