Method and system for retrieving confirming sentences
    11.
    Granted invention patent
    Legal status: in force

    Publication (announcement) No.: US07974963B2

    Publication (announcement) date: 2011-07-05

    Application No.: US11187567

    Filing date: 2005-07-22

    IPC classification: G06F17/00

    CPC classification: G06F17/3069 Y10S707/99933

    Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemma from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
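
    The retrieval flow in this abstract (define indexing units, retrieve candidates, score by weighted similarity, rank) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: lowercasing stands in for lemmatization, adjacent-word pairs stand in for the "extended indexing units", and the `weights` table of linguistic weights is hypothetical.

```python
from collections import defaultdict


def indexing_units(tokens):
    """Return lemmas plus extended units for a token list.

    Assumptions: lowercasing stands in for lemmatization, and adjacent
    word pairs stand in for the extended indexing units.
    """
    lemmas = [t.lower() for t in tokens]
    extended = [f"{a}_{b}" for a, b in zip(lemmas, lemmas[1:])]
    return lemmas + extended


def build_index(sentences):
    """Map each indexing unit to the ids of the sentences containing it."""
    index = defaultdict(set)
    for sid, sentence in enumerate(sentences):
        for unit in indexing_units(sentence.split()):
            index[unit].add(sid)
    return index


def retrieve(query, sentences, index, weights, default_weight=1.0):
    """Retrieve sentences sharing a unit with the query and rank them.

    Similarity is the sum of the (hypothetical) linguistic weights of
    the indexing units shared by the query and the sentence.
    """
    q_units = set(indexing_units(query.split()))
    candidates = set()
    for unit in q_units:
        candidates |= index.get(unit, set())

    scored = []
    for sid in candidates:
        s_units = set(indexing_units(sentences[sid].split()))
        shared = q_units & s_units
        score = sum(weights.get(u, default_weight) for u in shared)
        scored.append((score, sid))

    # Highest similarity first; sentence id breaks ties deterministically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(sentences[sid], score) for score, sid in scored]
```

    Giving content words a higher weight than function words makes a sentence that shares "decision" with the query outrank one that shares only "a".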

    BOOSTING ALGORITHM FOR RANKING MODEL ADAPTATION
    12.
    Invention patent application
    Legal status: in force

    Publication (announcement) No.: US20100153315A1

    Publication (announcement) date: 2010-06-17

    Application No.: US12337623

    Filing date: 2008-12-17

    IPC classification: G06F15/18 G06F17/30

    CPC classification: G06F17/3053

    Abstract: Model adaptation may be performed to take a general model trained with a (possibly large) set of training data and adapt it using a (possibly small) set of domain-specific training data. The parameters, structure, or configuration of a model trained in one domain (called the background domain) may be adapted to a different domain (called the adaptation domain), for which there may be a limited amount of training data. The adaptation may be performed using the Boosting Algorithm to select an optimal basis function that optimizes a measure of error of the model as it is being iteratively refined, i.e., adapted.
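
    As a rough illustration of the idea, the sketch below starts from a fixed background model and, in each round, greedily adds the single basis function that most reduces the error on the adaptation data. The specifics are assumptions not taken from the abstract: squared error as the error measure, one basis function per input feature, and a hypothetical `shrinkage` step size.

```python
def adapt_model(background_model, X, y, n_rounds=10, shrinkage=0.5):
    """Adapt a background model to a small adaptation set.

    Assumptions: the error measure is squared error and each candidate
    basis function is a single input feature. Returns the adapted
    predictor and the learned per-feature coefficients.
    """
    n_features = len(X[0])
    coefs = [0.0] * n_features

    def predict(x):
        # Background prediction plus the correction learned so far.
        return background_model(x) + sum(c * v for c, v in zip(coefs, x))

    for _ in range(n_rounds):
        residuals = [yi - predict(xi) for xi, yi in zip(X, y)]
        best = None  # (error, feature index, coefficient)
        for j in range(n_features):
            den = sum(xi[j] ** 2 for xi in X)
            if den == 0:
                continue
            # Least-squares coefficient for fitting the residuals with feature j.
            beta = sum(r * xi[j] for r, xi in zip(residuals, X)) / den
            err = sum((r - beta * xi[j]) ** 2 for r, xi in zip(residuals, X))
            if best is None or err < best[0]:
                best = (err, j, beta)
        if best is None:
            break
        _, j, beta = best
        coefs[j] += shrinkage * beta  # take a damped step on the chosen basis function

    return predict, coefs
```

    Because the background model is never retrained, only the small correction term has to be estimated from the limited adaptation data.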

    PHARMACEUTICAL COMPOSITION CONTAINING DOCETAXEL-CYCLODEXTRIN INCLUSION COMPLEX AND ITS PREPARING PROCESS
    13.
    Invention patent application
    Legal status: lapsed

    Publication (announcement) No.: US20100048685A1

    Publication (announcement) date: 2010-02-25

    Application No.: US12440942

    Filing date: 2006-10-13

    IPC classification: A61K31/337 A61P35/00

    Abstract: A docetaxel inclusion complex with improved water solubility (up to 15 mg/ml) and stability (stability constant Ka = 2056 M⁻¹ to 13051 M⁻¹) comprises docetaxel and hydroxypropyl-beta-cyclodextrin and/or sulfobutyl-beta-cyclodextrin in a ratio of 1:10-150. The preparation method includes the following steps: docetaxel dissolved in ethanol is added to an aqueous cyclodextrin solution with stirring until the docetaxel is completely dissolved; the solution is filtered through a 0.2-0.4 μm microporous membrane; then either the ethanol is removed under reduced pressure to obtain the inclusion complex in liquid form, or the ethanol and then the water are removed under reduced pressure and the product is dried to obtain the inclusion complex in solid form.

    RANKING MODEL ADAPTATION FOR SEARCHING
    14.
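    Invention patent application
    Legal status: under examination (published)

    Publication (announcement) No.: US20090276414A1

    Publication (announcement) date: 2009-11-05

    Application No.: US12112826

    Filing date: 2008-04-30

    IPC classification: G06F17/30

    CPC classification: G06F16/9535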

    Abstract: Search results provided by a search engine (e.g., for the Internet) are improved and/or made more accurate by addressing the limited availability of human-labeled training data for certain domains (e.g., languages other than English, within certain date ranges, corresponding to queries over a certain length, etc.). More particularly, a ranking model trained on in-domain data, for which a small amount of human-labeled training data (e.g., query/URL pairs) is available (e.g., languages other than English), is adjusted based upon out-domain data, for which a large amount of human-labeled training data (e.g., query/URL pairs) is available (e.g., English). Thus, even though the resulting adapted in-domain ranking model is used in the context of in-domain data (e.g., non-English) to provide search results, the search results are improved because they are influenced by an abundance of human-labeled training data, albeit out-domain data.

    Methods and systems for language translation
    15.
    Granted invention patent
    Legal status: in force

    Publication (announcement) No.: US07536293B2

    Publication (announcement) date: 2009-05-19

    Application No.: US10462459

    Filing date: 2003-06-16

    IPC classification: G06F17/28 G06F17/20

    CPC classification: G06F17/289

    Abstract: A translation service is disclosed, the service being provided to a wireless mobile device through a selective downloading of information from a server. The downloaded information includes a translation architecture having a language independent translation engine and at least one language dependent translation database. The language dependent translation database includes translation templates and a translation dictionary. A specialized database for a selected city or cities in the world can also be downloaded. Translation between languages is realized by applying the language dependent translation database, and optionally the city specific translation database, to the translation engine. The translation engine implements a user-driven term replacement scheme for simplifying the translation process.

    Automatic acquisition of a parallel corpus from a network
    16.
    Invention patent application
    Legal status: under examination (published)

    Publication (announcement) No.: US20080168049A1

    Publication (announcement) date: 2008-07-10

    Application No.: US11650660

    Filing date: 2007-01-08

    IPC classification: G06F17/30

    CPC classification: G06F16/958

    Abstract: Network pages are identified based on whether the pages include image alternative text that indicates that the network pages contain links to pages that are translations of each other. A plurality of pages and a plurality of respective uniform resource locators are downloaded from a server associated with the domain name of the identified network pages. The uniform resource locators are used to identify a set of candidate parallel page pairs and a set of features are created for each candidate parallel page pair. The sets of features are used to identify parallel page pairs, wherein the pages in a parallel page pair are translations of each other.

    Factoid-based searching
    17.
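    Invention patent application
    Legal status: in force

    Publication (announcement) No.: US20070136280A1

    Publication (announcement) date: 2007-06-14

    Application No.: US11302560

    Filing date: 2005-12-13

    IPC classification: G06F17/30

    CPC classification: G06F17/30616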

    Abstract: A query and a factoid type selection are received from a user. An index of passages, indexed based on factoids, is accessed and passages that are related to the query, and that have the selected factoid type, are retrieved. The retrieved passages are ranked and provided to the user based on a calculated score, in rank order.

    Componentized slot-filling architecture
    19.
    Invention patent application
    Legal status: in force

    Publication (announcement) No.: US20070094185A1

    Publication (announcement) date: 2007-04-26

    Application No.: US11246847

    Filing date: 2005-10-07

    IPC classification: G06N5/00

    CPC classification: G06F17/2785

    Abstract: The subject disclosure pertains to systems and methods for performing natural language processing in which tokens are mapped to task slots. The system includes a mapper component that generates a lattice representing possible interpretations of the tokens, a decoder component that creates a ranked list of paths traversing the lattice, a scorer component that generates scores used to rank paths, and post-processing components that format the paths for use by other software. Each of these components may be independent, such that the component may be modified or replaced without affecting the remaining components. This allows a variety of different mathematical models and algorithms to be tested or deployed without requiring changes to the remainder of the system.
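
    A toy version of the four components named in this abstract might look like the sketch below. Everything concrete here is a hypothetical stand-in: the `slot_lexicon`, the exhaustive path enumeration, and the scoring rule are chosen only to show how a scorer can be passed in and swapped without touching the mapper, decoder, or post-processor.

```python
from itertools import product


def mapper(tokens, slot_lexicon):
    """Build a lattice: for each token, the slots it could fill (None = no slot)."""
    return [[None] + slot_lexicon.get(tok, []) for tok in tokens]


def scorer(tokens, path):
    """Toy score: number of filled slots; paths that reuse a slot are rejected."""
    used = [slot for slot in path if slot is not None]
    if len(used) != len(set(used)):
        return float("-inf")
    return float(len(used))


def decoder(tokens, lattice, score_fn, top_k=3):
    """Enumerate every path through the lattice and return the top_k by score."""
    paths = [(score_fn(tokens, path), path) for path in product(*lattice)]
    paths.sort(key=lambda scored: -scored[0])  # stable sort keeps enumeration order on ties
    return paths[:top_k]


def post_process(tokens, path):
    """Format a path as a slot -> token mapping for downstream software."""
    return {slot: tok for tok, slot in zip(tokens, path) if slot is not None}
```

    Since `decoder` receives the scoring function as an argument, a different scoring model can be tried by passing another function with the same signature.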

    Method and apparatus for generating and managing a language model data structure
    20.
    Granted invention patent
    Legal status: lapsed

    Publication (announcement) No.: US07020587B1

    Publication (announcement) date: 2006-03-28

    Application No.: US09608526

    Filing date: 2000-06-30

    IPC classification: G06F7/60

    CPC classification: G06F17/27 G10L15/285

    Abstract: The generation and management of a language model data structure include assigning each segment of a received corpus to a node in a data structure that denotes dependencies between the respective nodes. A transitional probability between each of the nodes in the data structure is calculated. A frequency of occurrence is calculated for each item of the respective segments, and those nodes of the data structure associated with items that do not meet a minimum frequency of occurrence threshold are removed. The data structure may be managed across a system memory of a computer system and an extended memory of the computer system.
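
    The build, score, and prune steps described here can be approximated with a simple first-order transition table, as in the sketch below. It assumes each segment is a list of items and that a node depends only on the item immediately before it; the `min_count` threshold is a hypothetical parameter, and the management of the structure across system and extended memory is not modeled.

```python
from collections import Counter, defaultdict


def build_model(segments, min_count=2):
    """Build transition probabilities between items, then prune rare items.

    Assumption: a node depends only on the item immediately preceding it.
    """
    counts = Counter()            # frequency of occurrence per item
    edges = defaultdict(Counter)  # edges[prev][nxt] = times nxt followed prev

    for segment in segments:
        counts.update(segment)
        for prev, nxt in zip(segment, segment[1:]):
            edges[prev][nxt] += 1

    # Transition probability from each node to each of its successors.
    probs = {}
    for prev, successors in edges.items():
        total = sum(successors.values())
        probs[prev] = {nxt: c / total for nxt, c in successors.items()}

    # Remove nodes whose item falls below the minimum frequency threshold.
    # Probabilities are not renormalized after pruning in this sketch.
    keep = {item for item, c in counts.items() if c >= min_count}
    return {
        prev: {nxt: p for nxt, p in successors.items() if nxt in keep}
        for prev, successors in probs.items()
        if prev in keep
    }
```

    Pruning after the probabilities are computed keeps the surviving values equal to what was observed in the full corpus, at the cost of rows that no longer sum to one.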
