PREDICTING RESULTS FOR INPUT DATA BASED ON A MODEL GENERATED FROM CLUSTERS
    Invention application
    Status: Under examination (published)

    Publication number: WO2007142982A1

    Publication date: 2007-12-13

    Application number: PCT/US2007/012762

    Application date: 2007-05-30

    Inventor: PENG, Fuchun

    CPC classification number: G06F17/2775 G06F17/278 G06F17/2863

    Abstract: A method is provided for predicting results for input data using a model generated from clusters of related characters, clusters of related segments, and training data. The method comprises receiving a data set that includes a plurality of words in a particular language in which words are formed from characters. Clusters of related characters are formed from the data set. A model is generated based at least on the clusters of related characters and the training data; the model may also be based on the clusters of related segments. The training data includes a plurality of entries, each of which includes a character and a designated result for that character. A set of input data is then received that includes characters not yet associated with designated results, and the model is applied to the input data to determine predicted results for characters within it.
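
    The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the method claimed in the application: the clustering rule (grouping characters by their most frequent following character), the count-based model, the "B"/"E" result labels, and the function names cluster_characters, train_model, and predict are all assumptions made for illustration. Clusters of related segments are left out.

```python
from collections import Counter, defaultdict


def majority(counter):
    """Return the most frequent label, breaking ties alphabetically."""
    if not counter:
        return None
    return min(counter.items(), key=lambda kv: (-kv[1], kv[0]))[0]


def cluster_characters(words):
    """Assumed clustering rule: characters are related when they share the
    same most frequent following symbol ("$" marks the end of a word)."""
    following = defaultdict(Counter)
    for word in words:
        for cur, nxt in zip(word, word[1:] + "$"):
            following[cur][nxt] += 1
    return {ch: majority(counts) for ch, counts in following.items()}


def train_model(clusters, training_data):
    """Count designated results per character and per character cluster.

    training_data is an iterable of (character, designated_result) entries.
    """
    by_char = defaultdict(Counter)
    by_cluster = defaultdict(Counter)
    overall = Counter()
    for ch, result in training_data:
        by_char[ch][result] += 1
        if ch in clusters:
            by_cluster[clusters[ch]][result] += 1
        overall[result] += 1
    return by_char, by_cluster, overall


def predict(model, clusters, text):
    """Predict a result for each character, falling back from the character
    itself to its cluster, then to the overall most frequent result."""
    by_char, by_cluster, overall = model
    results = []
    for ch in text:
        guess = (majority(by_char.get(ch))
                 or majority(by_cluster.get(clusters.get(ch)))
                 or majority(overall))
        results.append((ch, guess))
    return results
```

    In this sketch, a character that never appears in the training entries can still receive a prediction through its cluster. For example, with the unlabeled words "ab", "cb", "ad", "cd" and the training entries ("a", "B") and ("b", "E"), the unseen characters "c" and "d" inherit "B" and "E" from the clusters they share with "a" and "b".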
