System for estimating a distribution of message content categories in source data
    1.
    Granted patent
    Status: in force

    Publication number: US09189538B2

    Publication date: 2015-11-17

    Application number: US13445257

    Filing date: 2012-04-12

    IPC classification: G06F17/30 G06N99/00 G06K9/62

    Abstract: A method of computerized content analysis that gives “approximately unbiased and statistically consistent estimates” of a distribution of elements of structured, unstructured, and partially structured source data among a set of categories. In one embodiment, this is done by analyzing a distribution of a small set of individually-classified elements in a plurality of categories and then using the information determined from the analysis to extrapolate a distribution in a larger population set. This extrapolation is performed without constraining the distribution of the unlabeled elements to be equal to the distribution of labeled elements, and without constraining a content distribution of elements in the labeled set (e.g., a distribution of words used by elements in the labeled set) to be equal to a content distribution of elements in the unlabeled set. Not being constrained in these ways allows the estimation techniques described herein to provide distinct advantages over conventional aggregation techniques.
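
The idea in this abstract can be illustrated with a small sketch. It is not the patented procedure; it is one simple way to estimate category proportions in an unlabeled set without assuming they match the labeled set. The sketch assumes that the rate at which each word feature appears within a category is the same in both sets, so the feature rates observed in the unlabeled set are a mixture of the per-category rates, weighted by the unknown proportions. All function names, the binary word-presence features, and the projected-gradient solver are assumptions made for illustration.

```python
from typing import List, Sequence, Set


def feature_rates_by_category(docs: Sequence[Set[int]], labels: Sequence[int],
                              n_categories: int, n_features: int) -> List[List[float]]:
    """Estimate P(feature k present | category j) from the hand-labeled set.

    Assumes every category has at least one labeled document.
    """
    counts = [[0.0] * n_categories for _ in range(n_features)]
    totals = [0] * n_categories
    for features, label in zip(docs, labels):
        totals[label] += 1
        for k in features:
            counts[k][label] += 1.0
    return [[counts[k][j] / totals[j] for j in range(n_categories)]
            for k in range(n_features)]


def feature_rates(docs: Sequence[Set[int]], n_features: int) -> List[float]:
    """Estimate P(feature k present) in the unlabeled set."""
    n = len(docs)
    return [sum(1 for d in docs if k in d) / n for k in range(n_features)]


def project_to_simplex(v: Sequence[float]) -> List[float]:
    """Euclidean projection onto {p : p >= 0, sum(p) = 1}."""
    cumulative, theta = 0.0, 0.0
    for i, value in enumerate(sorted(v, reverse=True), start=1):
        cumulative += value
        t = (cumulative - 1.0) / i
        if value - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(A: Sequence[Sequence[float]], b: Sequence[float],
                         steps: int = 2000) -> List[float]:
    """Minimize ||A p - b||^2 over the probability simplex.

    A[k][j] is P(feature k | category j); b[k] is P(feature k) in the
    unlabeled set. Uses projected gradient descent with a conservative
    step size (1 / squared Frobenius norm of A, assumed nonzero).
    """
    n_cat = len(A[0])
    p = [1.0 / n_cat] * n_cat
    step = 1.0 / sum(a * a for row in A for a in row)
    for _ in range(steps):
        residual = [sum(a * pj for a, pj in zip(row, p)) - bk
                    for row, bk in zip(A, b)]
        grad = [sum(A[k][j] * residual[k] for k in range(len(A)))
                for j in range(n_cat)]
        p = project_to_simplex([pj - step * g for pj, g in zip(p, grad)])
    return p
```

In this sketch the labeled set can be split evenly between categories while the estimate for the unlabeled set comes out very different, because only the per-category feature rates are carried over from the labeled set, not its category proportions.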

    System for estimating a distribution of message content categories in source data
    2.
    Patent application
    Status: in force

    Publication number: US20090030862A1

    Publication date: 2009-01-29

    Application number: US12077534

    Filing date: 2008-03-19

    IPC classification: G06F17/00 G06N5/00

    Abstract: A method of computerized content analysis that gives “approximately unbiased and statistically consistent estimates” of a distribution of elements of structured, unstructured, and partially structured source data among a set of categories. In one embodiment, this is done by analyzing a distribution of a small set of individually-classified elements in a plurality of categories and then using the information determined from the analysis to extrapolate a distribution in a larger population set. This extrapolation is performed without constraining the distribution of the unlabeled elements to be equal to the distribution of labeled elements, and without constraining a content distribution of elements in the labeled set (e.g., a distribution of words used by elements in the labeled set) to be equal to a content distribution of elements in the unlabeled set. Not being constrained in these ways allows the estimation techniques described herein to provide distinct advantages over conventional aggregation techniques.

    SYSTEM FOR ESTIMATING A DISTRIBUTION OF MESSAGE CONTENT CATEGORIES IN SOURCE DATA
    3.
    Patent application
    Status: in force

    Publication number: US20120215784A1

    Publication date: 2012-08-23

    Application number: US13445257

    Filing date: 2012-04-12

    IPC classification: G06F17/30

    Abstract: A method of computerized content analysis that gives “approximately unbiased and statistically consistent estimates” of a distribution of elements of structured, unstructured, and partially structured source data among a set of categories. In one embodiment, this is done by analyzing a distribution of a small set of individually-classified elements in a plurality of categories and then using the information determined from the analysis to extrapolate a distribution in a larger population set. This extrapolation is performed without constraining the distribution of the unlabeled elements to be equal to the distribution of labeled elements, and without constraining a content distribution of elements in the labeled set (e.g., a distribution of words used by elements in the labeled set) to be equal to a content distribution of elements in the unlabeled set. Not being constrained in these ways allows the estimation techniques described herein to provide distinct advantages over conventional aggregation techniques.

    System for estimating a distribution of message content categories in source data
    4.
    Granted patent
    Status: in force

    Publication number: US08180717B2

    Publication date: 2012-05-15

    Application number: US12077534

    Filing date: 2008-03-19

    IPC classification: G06F17/00 G06F17/21 G06N5/00

    Abstract: A method of computerized content analysis that gives “approximately unbiased and statistically consistent estimates” of a distribution of elements of structured, unstructured, and partially structured source data among a set of categories. In one embodiment, this is done by analyzing a distribution of a small set of individually-classified elements in a plurality of categories and then using the information determined from the analysis to extrapolate a distribution in a larger population set. This extrapolation is performed without constraining the distribution of the unlabeled elements to be equal to the distribution of labeled elements, and without constraining a content distribution of elements in the labeled set (e.g., a distribution of words used by elements in the labeled set) to be equal to a content distribution of elements in the unlabeled set. Not being constrained in these ways allows the estimation techniques described herein to provide distinct advantages over conventional aggregation techniques.

    TECHNIQUES FOR PROVIDING XQUERY ACCESS USING WEB SERVICES
    5.
    Patent application
    Status: in force

    Publication number: US20110113061A1

    Publication date: 2011-05-12

    Application number: US13009712

    Filing date: 2011-01-19

    IPC classification: G06F17/30

    CPC classification: G06F17/30923 G06F17/30861

    Abstract: An XQuery access API is described, for providing access to XML data from a data source, using the XQuery language. A requestor can request, from a server, performance of an operation on XML data, wherein request messages and response messages conform to the Simple Object Access Protocol (SOAP). Request and response messages can be transmitted using Hypertext Transfer Protocol (HTTP) or Hypertext Transfer Protocol over Secure Socket Layer (HTTPS). The format of the request and response messages is specified in a definition of a Web service, where the definition conforms to the Web Service Description Language (WSDL).
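
As a rough illustration of the message flow described in this abstract, the sketch below wraps an XQuery expression in a SOAP 1.1 envelope and prepares an HTTP(S) POST request for it. The service namespace `urn:example:xquery-service`, the `ExecuteXQuery` operation, the `query` element, and the endpoint URL are all hypothetical; in a real deployment those names would come from the service's WSDL definition. The request is only constructed here, not sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace and operation; a real service's WSDL would define these.
SERVICE_NS = "urn:example:xquery-service"
OPERATION = "ExecuteXQuery"


def build_envelope(xquery: str) -> bytes:
    """Serialize a SOAP envelope whose body carries one XQuery expression."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}{OPERATION}")
    query = ET.SubElement(operation, f"{{{SERVICE_NS}}}query")
    query.text = xquery  # ElementTree escapes <, > and & on serialization
    return ET.tostring(envelope, encoding="utf-8")


def build_request(endpoint: str, xquery: str) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP or HTTPS POST carrying the envelope."""
    return urllib.request.Request(
        endpoint,
        data=build_envelope(xquery),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{SERVICE_NS}/{OPERATION}"',
        },
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would transmit it; the response body would be another SOAP envelope to parse with `ET.fromstring`.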

    Processing queries against one or more markup language sources
    6.
    Granted patent
    Status: in force

    Publication number: US07668806B2

    Publication date: 2010-02-23

    Application number: US10948536

    Filing date: 2004-09-22

    IPC classification: G06F17/30

    Abstract: Techniques are provided for processing a query, including receiving the query, where the query specifies certain operations to be performed, including (a) a first set of one or more operations that are to be performed on a markup language data source and (b) a second set of one or more operations that are to be performed on a second data source. Then it is determined that a first server that manages the markup language data source is capable of performing the first set of operations. A request is sent to the first server to perform the first set of operations. A response is received, where the response contains results of performing the first set of operations on the markup language data source. Finally, results are generated for the query based at least in part on the results of performing the first set of operations.
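
The steps in this abstract can be sketched as a small in-memory simulation. Everything below is an illustrative assumption rather than the patented design: the `MarkupServer` class stands in for the first server, operations are modeled as named functions over lists of rows, and the final combination step is an inner join on an `id` field, which the abstract does not specify.

```python
from typing import Callable, Dict, List, Set, Tuple

Row = Dict[str, object]
# An operation is a (name, function) pair; the name is used for capability checks.
Operation = Tuple[str, Callable[[List[Row]], List[Row]]]


class MarkupServer:
    """Hypothetical stand-in for the server managing the markup language source."""

    def __init__(self, rows: List[Row], supported: Set[str]) -> None:
        self.rows = rows
        self.supported = supported

    def can_perform(self, ops: List[Operation]) -> bool:
        """Report whether every requested operation is supported."""
        return all(name in self.supported for name, _ in ops)

    def perform(self, ops: List[Operation]) -> List[Row]:
        """Apply the operations in order and return the results."""
        rows = list(self.rows)
        for _, fn in ops:
            rows = fn(rows)
        return rows


def process_query(first_ops: List[Operation], second_ops: List[Operation],
                  server: MarkupServer, second_source: List[Row]) -> List[Row]:
    # Determine that the first server can perform the first set of operations.
    if not server.can_perform(first_ops):
        raise NotImplementedError("fallback handling is outside this sketch")
    # Send the request and receive the response (a direct call in this sketch).
    first_results = server.perform(first_ops)
    # Perform the second set of operations on the second data source.
    second_results = list(second_source)
    for _, fn in second_ops:
        second_results = fn(second_results)
    # Generate query results based in part on the first results.
    # Joining on "id" is an assumption made for this example.
    by_id = {row["id"]: row for row in second_results}
    return [{**row, **by_id[row["id"]]}
            for row in first_results if row["id"] in by_id]
```

A caller would build the two operation lists from the parsed query, for example a `("filter", ...)` operation pushed to the markup server and no operations on a second source such as a relational price table.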

    Processing queries against one or more markup language sources
    7.
    Patent application
    Status: in force

    Publication number: US20060031204A1

    Publication date: 2006-02-09

    Application number: US10948536

    Filing date: 2004-09-22

    IPC classification: G06F17/30

    Abstract: Techniques are provided for processing a query, including receiving the query, where the query specifies certain operations to be performed, including (a) a first set of one or more operations that are to be performed on a markup language data source and (b) a second set of one or more operations that are to be performed on a second data source. Then it is determined that a first server that manages the markup language data source is capable of performing the first set of operations. A request is sent to the first server to perform the first set of operations. A response is received, where the response contains results of performing the first set of operations on the markup language data source. Finally, results are generated for the query based at least in part on the results of performing the first set of operations.

    Techniques for providing XQuery access using web services
    8.
    Granted patent
    Status: in force

    Publication number: US08375043B2

    Publication date: 2013-02-12

    Application number: US13009712

    Filing date: 2011-01-19

    IPC classification: G06F17/30

    CPC classification: G06F17/30923 G06F17/30861

    Abstract: An XQuery access API is described, for providing access to XML data from a data source, using the XQuery language. A requestor can request, from a server, performance of an operation on XML data, wherein request messages and response messages conform to the Simple Object Access Protocol (SOAP). Request and response messages can be transmitted using Hypertext Transfer Protocol (HTTP) or Hypertext Transfer Protocol over Secure Socket Layer (HTTPS). The format of the request and response messages is specified in a definition of a Web service, where the definition conforms to the Web Service Description Language (WSDL).
