Invention application
- Patent title: Text Segmentation and Label Assignment with User Interaction by Means of Topic Specific Language Models and Topic-Specific Label Statistics
- Patent title (Chinese): 通过主题特定语言模型和主题特定标签统计的用户交互的文本分段和标签分配
- Application number: US10595831
- Filing date: 2004-11-12
- Publication number: US20080201130A1
- Publication date: 2008-08-21
- Inventors: Jochen Peters, Evgeny Matusov, Carsten Meyer, Dietrich Klakow
- Applicants: Jochen Peters, Evgeny Matusov, Carsten Meyer, Dietrich Klakow
- Applicant address: NL EINDHOVEN
- Assignee: KONINKLIJKE PHILIPS ELECTRONIC, N.V.
- Current assignee: KONINKLIJKE PHILIPS ELECTRONIC, N.V.
- Current assignee address: NL EINDHOVEN
- Priority: EP03104316.9 20031121
- International application: PCT/IB04/52405 WO 20041112
- Main classification: G06F17/27
- IPC classification: G06F17/27
Abstract:
The invention relates to a method, a computer program product, a segmentation system and a user interface for structuring an unstructured text using statistical models trained on annotated training data. The method segments the text into sections and assigns a label to each section as its heading. The resulting segmentation and label assignment are presented to a user for review. Alternative segmentations and label assignments are also presented, so that the user can select an alternative segmentation or label, or enter a user-defined segmentation and a user-defined label. In response to the user's modifications, a number of different actions are initiated, including re-segmentation and re-labelling of successive parts of the document or of the entire document. The method further comprises a learning functionality that logs and analyzes the user's modifications in order to adapt to the user's preferences and to further train the statistical models.
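The abstract does not give the models or the search procedure, so the following is only an illustrative sketch of the general idea and not the patented method. It assumes add-alpha smoothed unigram models per topic, a fixed topic-switch penalty, and a Viterbi search over sentences. All function names, the penalty value, and the label counts in the usage note are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences, vocab, alpha=1.0):
    """Add-alpha smoothed unigram log-probabilities for one topic (assumed model)."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}


def segment(sentences, models, switch_penalty=2.0):
    """Viterbi search: one topic per sentence, with a penalty for each topic change."""
    if not sentences:
        return []
    topics = list(models)
    floor = math.log(1e-6)  # assumed score for out-of-vocabulary words

    def score(topic, sentence):
        return sum(models[topic].get(w, floor) for w in sentence.split())

    def penalty(prev, cur):
        return switch_penalty if prev != cur else 0.0

    best = {t: score(t, sentences[0]) for t in topics}
    back = []  # one back-pointer table per sentence after the first
    for s in sentences[1:]:
        new_best, pointers = {}, {}
        for t in topics:
            prev = max(topics, key=lambda p: best[p] - penalty(p, t))
            new_best[t] = best[prev] - penalty(prev, t) + score(t, s)
            pointers[t] = prev
        back.append(pointers)
        best = new_best

    # Trace the best path backwards from the best final topic.
    path = [max(topics, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def to_sections(sentences, path):
    """Merge consecutive sentences sharing a topic into (topic, sentences) sections."""
    sections = []
    for s, t in zip(sentences, path):
        if sections and sections[-1][0] == t:
            sections[-1][1].append(s)
        else:
            sections.append((t, [s]))
    return sections


def rank_labels(topic, label_counts):
    """Order candidate headings for a topic by how often annotators used them."""
    return [label for label, _ in label_counts[topic].most_common()]
```

In this sketch, the first entry returned by `rank_labels` would serve as the proposed section heading and the remaining entries as the alternatives offered to the user. A user correction could be logged by incrementing the chosen label's count, which is one simple way to picture the adaptation step the abstract mentions.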