Invention Application
US20160117400A1 SYSTEM, METHOD AND APPARATUS FOR AUTOMATIC TOPIC RELEVANT CONTENT FILTERING FROM SOCIAL MEDIA TEXT STREAMS USING WEAK SUPERVISION (Pending, Published)

  • Patent Title: SYSTEM, METHOD AND APPARATUS FOR AUTOMATIC TOPIC RELEVANT CONTENT FILTERING FROM SOCIAL MEDIA TEXT STREAMS USING WEAK SUPERVISION
  • Application No.: US14877970
    Application Date: 2015-10-08
  • Publication No.: US20160117400A1
    Publication Date: 2016-04-28
  • Inventor: Arvind Agarwal, Cailing Dong
  • Applicant: Xerox Corporation
  • Main IPC: G06F17/30
  • IPC: G06F17/30, G06N5/04, G06N99/00
Abstract:
Presented are a system, method, and apparatus for automatic topic relevant content filtering from social media text streams using weak supervision. A computing device utilizes heuristic rules allowing topic filtering and a data stream data chunk identifier. A plurality of messages are transmitted as streaming message data from a social media network in real time. The messages are split into a plurality of data stream data chunks according to the data stream data chunk identifier. A rule-based labeled data set L0 is built from one or more data instances in the first data stream data chunk. An initial classifier is built based upon features of L0. The initial classifier is applied to a next data stream data chunk to build a labeled data set L1. A subset of representative instances S1 is selected from labeled data set L1. A first representative classifier C1 is constructed from the representative instances S1.
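The sequence of steps in the abstract (rules, L0, initial classifier, L1, S1, C1) can be illustrated with a minimal Python sketch. This is not the claimed implementation: the abstract does not say which classifier, features, or selection criterion are used. The sketch assumes a fixed chunk size as the data chunk identifier, keyword matching as the heuristic rules, a small multinomial naive Bayes model over whitespace tokens as the classifier, and "most confident instances per class" as the way to pick representatives. All function names and parameters are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for the "features".
    return text.lower().split()


def split_into_chunks(messages, chunk_size):
    # Assumption: the data chunk identifier is a fixed number of messages.
    return [messages[i:i + chunk_size]
            for i in range(0, len(messages), chunk_size)]


def rule_label(text, relevant_terms, irrelevant_terms):
    # Heuristic rule (weak supervision): 1 = relevant, 0 = irrelevant,
    # None = no rule fired, so the message stays unlabeled.
    tokens = set(tokenize(text))
    if tokens & relevant_terms:
        return 1
    if tokens & irrelevant_terms:
        return 0
    return None


def train(labeled):
    # Fit multinomial naive Bayes counts from (text, label) pairs.
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in labeled:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def log_odds(model, text):
    # log P(relevant | text) - log P(irrelevant | text), Laplace-smoothed.
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    score = {}
    for c in (0, 1):
        n_words = sum(word_counts[c].values())
        s = math.log((class_counts[c] + 1) / (total + 2))
        for tok in tokenize(text):
            # The extra +1 in the denominator reserves mass for unseen tokens.
            s += math.log((word_counts[c][tok] + 1) / (n_words + len(vocab) + 1))
        score[c] = s
    return score[1] - score[0]


def predict(model, text):
    return 1 if log_odds(model, text) > 0 else 0


def select_representatives(model, labeled, k):
    # Assumption: keep the k most confidently classified instances per class.
    chosen = []
    for c in (0, 1):
        members = [pair for pair in labeled if pair[1] == c]
        members.sort(key=lambda pair: abs(log_odds(model, pair[0])), reverse=True)
        chosen.extend(members[:k])
    return chosen


def run_pipeline(messages, chunk_size, relevant_terms, irrelevant_terms, k):
    chunks = split_into_chunks(messages, chunk_size)
    # L0: rule-labeled instances from the first chunk.
    l0 = [(m, rule_label(m, relevant_terms, irrelevant_terms)) for m in chunks[0]]
    l0 = [(m, y) for m, y in l0 if y is not None]
    c0 = train(l0)                              # initial classifier
    l1 = [(m, predict(c0, m)) for m in chunks[1]]  # L1: next chunk, labeled by c0
    s1 = select_representatives(c0, l1, k)      # S1: representative subset
    c1 = train(s1)                              # C1: first representative classifier
    return l0, l1, s1, c1


# Hypothetical toy stream: two chunks of four messages each.
stream = [
    "flu outbreak reported downtown",
    "got my flu vaccine today",
    "great pizza for lunch",
    "football game tonight",
    "vaccine clinic open downtown",
    "pizza and football tonight",
    "outbreak reported again",
    "lunch game",
]
l0, l1, s1, c1 = run_pipeline(stream, 4, {"flu"}, {"pizza", "football"}, k=1)
for message, label in l1:
    print(label, message)
```

In this sketch the rules are only consulted for the first chunk; later chunks are labeled by the classifier trained on the previous step, which is the sense in which the supervision is weak. Repeating the last three steps of `run_pipeline` for each subsequent chunk would extend the sketch to a longer stream.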