Invention Grant
US08838599B2 Efficient lexical trending topic detection over streams of data using a modified sequitur algorithm (Active)

Abstract:
Embodiments are directed towards a Modified Sequitur algorithm (MSA) that uses pipelining and indexed arrays to identify trending topics within a plurality of documents having user generated content (UGC). The documents are parallelized and distributed across a plurality of network devices, which place at least some of the received documents into a buffer. The MSA may then be applied to the documents within the buffer to identify n-grams or phrases within the documents' contents. The identified phrases are further analyzed to remove extraneous co-occurrences of phrases, and/or words based on a part-of-speech analysis. A weighting of the remaining phrases is used to identify trending topic phrases. Links to content in the plurality of UGC documents that is associated with the trending topic phrases may then be displayed to a client device.
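
The buffer, phrase-identification, filtering, and weighting stages described in the abstract can be illustrated with a minimal sketch. This is not the patented MSA: plain n-gram counting stands in for the Sequitur-style grammar inference, no part-of-speech analysis is performed, and pipelining and distribution across network devices are omitted. The function names (`repeated_phrases`, `drop_subsumed`, `trending`), the subsumption rule, and the background-ratio weighting are all assumptions chosen for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def repeated_phrases(buffer_docs, max_n=3, min_count=2):
    """Count 2..max_n-grams across buffered documents (once per document)
    and keep those appearing in at least min_count documents.
    Stand-in for the MSA phrase-identification step (assumption)."""
    counts = Counter()
    for doc in buffer_docs:
        tokens = doc.lower().split()
        for n in range(2, max_n + 1):
            counts.update(set(ngrams(tokens, n)))
    return {p: c for p, c in counts.items() if c >= min_count}


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous run inside `longer`."""
    k = len(shorter)
    return any(longer[i:i + k] == shorter for i in range(len(longer) - k + 1))


def drop_subsumed(phrases):
    """Drop a phrase when a longer phrase containing it is equally frequent.
    One hypothetical reading of 'removing extraneous co-occurrences'."""
    kept = {}
    for p, c in phrases.items():
        subsumed = any(
            len(q) > len(p) and cq == c and contains(q, p)
            for q, cq in phrases.items()
        )
        if not subsumed:
            kept[p] = c
    return kept


def trending(current, background, top_k=3):
    """Weight each phrase by its current count relative to a background
    count (hypothetical weighting) and return the top_k phrases."""
    scores = {p: c / (1 + background.get(p, 0)) for p, c in current.items()}
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_k]
```

Applied to a small buffer of posts that share the phrase "big storm hits", the sketch keeps only the three-word phrase, because the two-word phrases it contains occur in exactly the same documents. A phrase with a high background count is ranked lower than an equally frequent phrase that is new to the stream.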