BROWSING HISTORY LANGUAGE MODEL FOR INPUT METHOD EDITOR
    1.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20150199332A1

    Publication date: 2015-07-16

    Application number: US14423950

    Filing date: 2012-08-31

    Applicant: Mu Li, Xi Chen

    Inventor: Mu Li, Xi Chen

    Abstract: Some examples may include generating a browsing history language model based on browsing history information. Further, some implementations may include predicting and presenting a non-Latin character string based at least in part on the browsing history language model, such as in response to receiving a Latin character string via an input method editor interface.

    Personal language model for input method editor

    Publication number: US09824085B2

    Publication date: 2017-11-21

    Application number: US14423914

    Filing date: 2012-08-31

    Applicant: Mu Li, Xi Chen

    Inventor: Mu Li, Xi Chen

    CPC classification number: G06F17/2863 G06F17/2223 G06F17/276

    Abstract: Some examples include generating a personal language model based on linguistic characteristics of one or more files stored at one or more locations in a file system. Further, some implementations include predicting and presenting a non-Latin character string based at least in part on the personal language model, such as in response to receiving a Latin character string via an input method editor interface.

    PERSONAL LANGUAGE MODEL FOR INPUT METHOD EDITOR
    3.
    Type: Patent application
    Status: In force

    Publication number: US20150186362A1

    Publication date: 2015-07-02

    Application number: US14423914

    Filing date: 2012-08-31

    Applicant: Mu Li, Xi Chen

    Inventor: Mu Li, Xi Chen

    CPC classification number: G06F17/2863 G06F17/2223 G06F17/276

    Abstract: Some examples include generating a personal language model based on linguistic characteristics of one or more files stored at one or more locations in a file system. Further, some implementations include predicting and presenting a non-Latin character string based at least in part on the personal language model, such as in response to receiving a Latin character string via an input method editor interface.

    Displaying online advertisements
    4.
    Type: Granted patent
    Status: In force

    Publication number: US08239257B2

    Publication date: 2012-08-07

    Application number: US12601579

    Filing date: 2009-06-05

    Applicant: Mu Li

    Inventor: Mu Li

    CPC classification number: G06Q30/02 G06Q30/0241 G06Q30/0251

    Abstract: Disclosed is a method for displaying an advertisement. The method displays a present advertisement, determines whether the present advertisement has been displayed completely, and adds an identifier of the present advertisement to a priority advertisement list if the present advertisement has not been displayed completely. The method sends the priority advertisement list to the advertisement engine when requesting the advertisement engine for displaying a next advertisement. Using the priority advertisement list, the advertisement engine may give priority to the present advertisement in next advertisement assignment. Using an optimized advertisement display strategy, the disclosed method may increase coverage rates of advertisement contents to audiences, thereby improving advertisement effectiveness for advertisers and increasing cash flow return for website owners.

    Using source-channel models for word segmentation
    5.
    Type: Granted patent
    Status: In force

    Publication number: US07493251B2

    Publication date: 2009-02-17

    Application number: US10448644

    Filing date: 2003-05-30

    CPC classification number: G06F17/2755 G06F17/277

    Abstract: A method and apparatus for segmenting text is provided that identifies a sequence of entity types from a sequence of characters and thereby identifies a segmentation for the sequence of characters. Under the invention, the sequence of entity types is identified using probabilistic models that describe the likelihood of a sequence of entities and the likelihood of sequences of characters given particular entities. Under one aspect of the invention, organization name entities are identified from a first sequence of identified entities to form a final sequence of identified entities.

    QUERY SPELLER
    6.
    Type: Patent application

    Publication number: US20080046405A1

    Publication date: 2008-02-21

    Application number: US11465023

    Filing date: 2006-08-16

    CPC classification number: G06F17/3064

    Abstract: Candidate suggestions for correcting misspelled query terms input into a search application are automatically generated. A score for each candidate suggestion can be generated using a first decoding pass and paths through the suggestions can be ranked in a second decoding pass. Candidate suggestions can be generated based on typographical errors, phonetic mistakes and/or compounding mistakes. Furthermore, a ranking model can be developed to rank candidate suggestions to be presented to a user.

    Unsupervised training for overlapping ambiguity resolution in word segmentation
    7.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20050060150A1

    Publication date: 2005-03-17

    Application number: US10662502

    Filing date: 2003-09-15

    Applicant: Mu Li, Jianfeng Gao

    Inventor: Mu Li, Jianfeng Gao

    CPC classification number: G06F17/2863 G06F17/2775

    Abstract: A method is provided for resolving overlapping ambiguity strings in unsegmented languages such as Chinese. The methodology includes segmenting sentences into two possible segmentations and recognizing overlapping ambiguity strings in the sentences. One of the two possible segmentations is selected as a function of probability information. The probability information is derived from unsupervised training data. A method of constructing a knowledge base containing probability information needed to select one of the segmentations is also provided.

    Distributional similarity-based models for query correction
    8.
    Type: Patent application
    Status: In force

    Publication number: US20080104056A1

    Publication date: 2008-05-01

    Application number: US11589557

    Filing date: 2006-10-30

    Applicant: Mu Li, Ming Zhou

    Inventor: Mu Li, Ming Zhou

    Abstract: A distributional similarity between a word of a search query and a term of a candidate word sequence is used to determine an error model probability that describes the probability of the search query given the candidate word sequence. The error model probability is used to determine a probability of the candidate word sequence given the search query. The probability of the candidate word sequence given the search query is used to select a candidate word sequence as a corrected word sequence for the search query. Distributional similarity is also used to build features that are applied in a maximum entropy model to compute the probability of the candidate word sequence given the search query.

    Language classification with random feature clustering
    9.
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20060287848A1

    Publication date: 2006-12-21

    Application number: US11157091

    Filing date: 2005-06-20

    CPC classification number: G06F16/355

    Abstract: An ensemble of random feature clusters is built from training data using a clustering algorithm where some randomness has been introduced. For each clustered feature space, a classifier, such as a Naïve Bayesian Classifier, is trained, realizing a classifier ensemble. The final classification decision is made by the resulting classifier ensemble.

    Post-processing system and method for correcting machine recognized text
    10.
    Type: Granted patent
    Status: Lapsed

    Publication number: US07092567B2

    Publication date: 2006-08-15

    Application number: US10288645

    Filing date: 2002-11-04

    CPC classification number: G06K9/723 G06K2209/01

    Abstract: A method of post-processing character data from an optical character recognition (OCR) engine and apparatus to perform the method. This exemplary method includes segmenting the character data into a set of initial words. The set of initial words is word level processed to determine at least one candidate word corresponding to each initial word. The set of initial words is segmented into a set of sentences. Each sentence in the set of sentences includes a plurality of initial words and candidate words corresponding to the initial words. A sentence is selected from the set of sentences. The selected sentence is word disambiguity processed to determine a plurality of final words. A final word is selected from the at least one candidate word corresponding to a matching initial word. The plurality of final words is then assembled as post-processed OCR data.
