Patent search: ap:("Josemina Marcolla Magdalon" OR "Yigal Shai Dayan" OR "Victoria Mazel" OR "Daniel Cohen") AND inv:"Josemina Marcolla Magdalon" (page 1)

1.

Granted invention patent
Learning word segmentation from non-white space languages corpora (Lapsed)

Publication number: US08165869B2

Publication date: 2012-04-24

Application number: US11953635

Filing date: 2007-12-10

Applicants: Josemina Marcolla Magdalon, Yigal Shai Dayan, Victoria Mazel, Daniel Cohen

Inventors: Josemina Marcolla Magdalon, Yigal Shai Dayan, Victoria Mazel, Daniel Cohen

IPC classification: G06F17/27, G06F17/20

CPC classification: G06F17/2863, G06F17/277

Abstract: Illustrative embodiments provide a computer implemented method, apparatus, and computer program product for learning word segmentation from non-white space language corpora. In one illustrative embodiment, the computer implemented method receives text input characters and calculates a ratio-measure for each pair of characters in the input characters. The computer implemented method further determines whether the ratio-measure of each pair of characters is equal to a predetermined threshold value. Responsive to determining the ratio-measure is less than the predetermined threshold value, and a local-minimum value, the computer method further identifies the pair as a weak pair and breaks the weak pair of characters.
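
The steps named in the abstract (score each adjacent character pair, then break pairs whose score is both below a threshold and a local minimum) can be illustrated with a minimal Python sketch. The abstract does not define the ratio-measure, so the formula used here, count(ab) / (count(a) * count(b)), is purely an assumption for illustration, as are the function names `train`, `ratio_measures`, and `segment`, the toy corpus, and the threshold value. This is not the patented method.

```python
from collections import Counter


def train(corpus):
    """Count single characters and adjacent character pairs in the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for line in corpus:
        unigrams.update(line)
        bigrams.update(a + b for a, b in zip(line, line[1:]))
    return unigrams, bigrams


def ratio_measures(text, unigrams, bigrams):
    """Score each adjacent pair in `text`.

    ASSUMPTION: ratio = count(ab) / (count(a) * count(b)).
    The source abstract does not specify the actual ratio-measure.
    """
    scores = []
    for a, b in zip(text, text[1:]):
        denom = unigrams[a] * unigrams[b]
        scores.append(bigrams[a + b] / denom if denom else 0.0)
    return scores


def segment(text, unigrams, bigrams, threshold):
    """Break `text` at weak pairs: score below threshold AND a local minimum."""
    if not text:
        return []
    scores = ratio_measures(text, unigrams, bigrams)
    words, start = [], 0
    for i, score in enumerate(scores):
        # Pairs at the ends have only one neighbour; treat the missing one as +inf.
        left = scores[i - 1] if i > 0 else float("inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("inf")
        is_local_min = score <= left and score <= right
        if score < threshold and is_local_min:
            # Pair i joins text[i] and text[i + 1]; break between them.
            words.append(text[start:i + 1])
            start = i + 1
    words.append(text[start:])
    return words


# Toy corpus and threshold, both hypothetical.
unigrams, bigrams = train(["abab", "cdcd", "abcd"])
print(segment("abcd", unigrams, bigrams, threshold=0.2))  # ['ab', 'cd']
```

In this toy corpus the pair "bc" scores 1/9 while "ab" and "cd" each score 3/9, so "bc" is the only pair that is both under the 0.2 threshold and a local minimum. The `<=` comparison means a run of equally low scores would all be broken; a stricter tie rule is an equally plausible reading of "local minimum".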

