Determining near-optimal block size for incremental-type expectation maximization (EM) algorithms
    1.
    Invention application
    Status: Granted

    Publication No.: US20050267717A1

    Publication date: 2005-12-01

    Application No.: US11177734

    Filing date: 2005-07-08

    IPC classification: G06F17/10 G06F17/18

    Abstract: Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.

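
    The selection loop in this abstract can be sketched as follows. This is a minimal illustration, not the patented procedure: it assumes a one-dimensional, two-component Gaussian mixture, and it uses the log-likelihood gained after a fixed number of passes, relative to standard EM, as a stand-in for "speed increase" (the abstract does not say how speed is measured). All function names and the candidate list are hypothetical.

```python
import math

def init_params(data):
    mean = sum(data) / len(data)
    var = sum((x - mean) ** 2 for x in data) / len(data)
    return [0.5, 0.5], [min(data), max(data)], [var, var]

def density(x, w, mu, var):
    return [w[k] * math.exp(-0.5 * (x - mu[k]) ** 2 / var[k]) / math.sqrt(2 * math.pi * var[k])
            for k in range(2)]

def log_lik(data, w, mu, var):
    return sum(math.log(sum(density(x, w, mu, var))) for x in data)

def e_step(block, w, mu, var):
    """Per-component sufficient statistics [count, sum x, sum x^2] for one block."""
    stats = [[0.0, 0.0, 0.0] for _ in range(2)]
    for x in block:
        p = density(x, w, mu, var)
        z = sum(p)
        for k in range(2):
            r = p[k] / z
            stats[k][0] += r
            stats[k][1] += r * x
            stats[k][2] += r * x * x
    return stats

def m_step(total):
    n = sum(t[0] for t in total)
    w = [t[0] / n for t in total]
    mu = [t[1] / t[0] for t in total]
    var = [max(t[2] / t[0] - m * m, 1e-6) for t, m in zip(total, mu)]
    return w, mu, var

def short_run(data, block_size, passes=2):
    """Log-likelihood after a few passes of incremental EM (not run to convergence)."""
    w, mu, var = init_params(data)
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    stats = [e_step(b, w, mu, var) for b in blocks]
    total = [[sum(s[k][j] for s in stats) for j in range(3)] for k in range(2)]
    for _ in range(passes):
        for i, block in enumerate(blocks):
            w, mu, var = m_step(total)
            new = e_step(block, w, mu, var)  # partial E-step on one block only
            for k in range(2):
                for j in range(3):
                    total[k][j] += new[k][j] - stats[i][k][j]
            stats[i] = new
    return log_lik(data, *m_step(total))

def choose_block_size(data, candidates, passes=2):
    start = log_lik(data, *init_params(data))
    # Standard EM is the special case of a single block holding all the data.
    base_gain = short_run(data, len(data), passes) - start
    best_size, best_speedup = len(data), 1.0
    for size in candidates:
        speedup = (short_run(data, size, passes) - start) / base_gain
        if speedup > best_speedup:  # greatest speed increase determined so far
            best_size, best_speedup = size, speedup
    return best_size, best_speedup
```

    Because every candidate is scored with only a couple of passes, the search costs a small fraction of a full run to convergence, which is the point the abstract makes about using an initial, non-converged version of the algorithm.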

    Automatic data perspective generation for a target variable
    2.
    Invention application
    Status: Granted

    Publication No.: US20050234960A1

    Publication date: 2005-10-20

    Application No.: US10824108

    Filing date: 2004-04-14

    Abstract: The present invention leverages machine learning techniques to provide automatic generation of conditioning variables for constructing a data perspective for a given target variable. The present invention determines and analyzes the best target variable predictors for a given target variable, employing them to facilitate the conveying of information about the target variable to a user. It automatically discretizes continuous and discrete variables utilized as target variable predictors to establish their granularity. In other instances of the present invention, a complexity and/or utility parameter can be specified to facilitate generation of the data perspective via analyzing a best target variable predictor versus the complexity of the conditioning variable(s) and/or utility. The present invention can also adjust the conditioning variables (i.e., target variable predictors) of the data perspective to provide an optimum view and/or accept control inputs from a user to guide/control the generation of the data perspective.

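
    One way to picture the predictor selection described in this abstract is the sketch here. It rests on assumptions the abstract does not state: predictors are ranked by mutual information with a categorical target, numeric columns are discretized by equal-frequency binning, and the complexity parameter is reduced to a cap on the number of conditioning variables (`max_vars`) and bins (`bins`). The function and parameter names are hypothetical.

```python
import bisect
import math
from collections import Counter

def discretize(values, bins=3):
    """Equal-frequency binning: cut points are taken at sample quantiles."""
    ordered = sorted(values)
    cuts = [ordered[len(ordered) * b // bins] for b in range(1, bins)]
    return [bisect.bisect_right(cuts, v) for v in values]

def mutual_information(xs, ys):
    """Mutual information (in nats) between two equally long label sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def conditioning_variables(table, target, max_vars=2, bins=3):
    """Return the max_vars columns that are most informative about the target."""
    scores = {}
    for name, column in table.items():
        if name == target:
            continue
        if all(isinstance(v, (int, float)) for v in column):
            column = discretize(column, bins)  # set the granularity of numeric columns
        scores[name] = mutual_information(column, table[target])
    return sorted(scores, key=scores.get, reverse=True)[:max_vars]
```

    The returned column names would then serve as the row and column fields of the generated perspective, for example a pivot table over the target.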

    Anomaly detection in data perspectives

    Publication No.: US20060106560A1

    Publication date: 2006-05-18

    Application No.: US11299539

    Filing date: 2005-12-12

    IPC classification: G06F19/00

    Abstract: The present invention leverages curve fitting data techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of data anomalies such as on-screen, drill down, and drill across data anomalies in, for example, pivot tables and/or OLAP cubes. It determines if data substantially deviates from a predicted value established by a curve fitting process such as, for example, a piece-wise linear function applied to the data tube. A threshold value can also be employed by the present invention to facilitate in determining a degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention provides an indication to a user of the type and location of a detected anomaly from a top level data perspective.
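
    The fit-then-threshold idea in this abstract can be illustrated with a short sketch. It treats a data tube as a plain list of numbers, splits it into fixed-length segments, and fits one line per segment, which gives a piece-wise linear prediction. The per-segment fit uses the Theil-Sen estimator (median of pairwise slopes) so that a single spike does not drag the line toward itself; that choice, the segment length, and the threshold value are assumptions made for illustration and are not taken from the abstract.

```python
from itertools import combinations
from statistics import median

def fit_line(points):
    """Theil-Sen fit: median pairwise slope, then median intercept."""
    slope = median((y2 - y1) / (x2 - x1)
                   for (x1, y1), (x2, y2) in combinations(points, 2))
    intercept = median(y - slope * x for x, y in points)
    return slope, intercept

def find_anomalies(tube, segment_len=6, threshold=5.0):
    """Return (index, value, predicted) for values that deviate beyond the threshold."""
    anomalies = []
    for start in range(0, len(tube), segment_len):
        points = list(enumerate(tube[start:start + segment_len], start))
        if len(points) < 3:
            continue  # too few points to fit a line and still judge deviation
        slope, intercept = fit_line(points)
        for x, y in points:
            predicted = slope * x + intercept
            if abs(y - predicted) > threshold:
                anomalies.append((x, y, predicted))
    return anomalies
```

    Each reported index gives the location of the anomaly within the tube, which a user interface could surface at the top-level view as the abstract describes.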

    ANOMALY DETECTION IN DATA PERSPECTIVES
    4.
    Invention application
    Status: Lapsed

    Publication No.: US20050288883A1

    Publication date: 2005-12-29

    Application No.: US10874956

    Filing date: 2004-06-23

    Abstract: The present invention leverages curve fitting data techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of data anomalies such as on-screen, drill down, and drill across data anomalies in, for example, pivot tables and/or OLAP cubes. It determines if data substantially deviates from a predicted value established by a curve fitting process such as, for example, a piece-wise linear function applied to the data tube. A threshold value can also be employed by the present invention to facilitate in determining a degree of deviation necessary before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention provides an indication to a user of the type and location of a detected anomaly from a top level data perspective.


    Association-based epitome design
    5.
    Invention application
    Status: Under examination (published)

    Publication No.: US20060160070A1

    Publication date: 2006-07-20

    Application No.: US11324467

    Filing date: 2005-12-30

    IPC classification: C12Q1/70 G06F19/00

    Abstract: Systems that facilitate immunogen design are described herein. An optimization component is provided to determine an immunogen according to at least one criterion. The immunogen comprises a set of overlapping sequences comprising sequences that are known to be and/or are likely to be immunogenic. At least one of the sequences that are likely to be immunogenic can be determined by analyzing associations between a host and a pathogen at a population level. Methods of determining an epitome are described herein. A plurality of sequences are received. At least one of the sequences is predicted to be an epitope based on a relationship between a diverse trait of a population and a mutation of a pathogen. A collection of the plurality of sequences is optimized according to one or more criteria to determine the epitome. Epitomes and immunogens determined by the systems and methods described herein are also contemplated.

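
    As a rough illustration of optimizing a collection of overlapping sequences, the sketch here assumes a single criterion that the abstract does not specify: produce a short string that contains every input sequence. It uses the classic greedy heuristic of repeatedly merging the pair with the largest suffix/prefix overlap. The function names are hypothetical, and a real design would also weigh criteria such as predicted immunogenicity.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_epitome(sequences):
    """Greedily merge the pair with the largest overlap until one string remains."""
    seqs = sorted(set(sequences))
    while True:
        # Drop any sequence that is already contained in another one.
        seqs = [s for s in seqs if not any(s != t and s in t for t in seqs)]
        if len(seqs) <= 1:
            return seqs[0] if seqs else ""
        k, a, b = max(((overlap(a, b), a, b) for a in seqs for b in seqs if a != b),
                      key=lambda triple: triple[0])
        seqs = [s for s in seqs if s not in (a, b)] + [a + b[k:]]
```

    The greedy merge is a heuristic and does not guarantee the shortest possible result, but it always returns a string that contains every input.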

    Verifying human interaction to a computer entity by way of a trusted component on a computing device or the like
    7.
    Invention application
    Status: Under examination (published)

    Publication No.: US20050278253A1

    Publication date: 2005-12-15

    Application No.: US10868116

    Filing date: 2004-06-15

    CPC classification: G06F21/31

    Abstract: A method describes user interaction in combination with sending a send item from an application of a computing device to a recipient. The computing device has an attestation unit thereon for attesting to trustworthiness. The application facilitates a user in constructing the send item, and pre-determined indicia are monitored that can be employed to detect that the user is in fact expending effort to construct the send item. The attestation unit authenticates the application to impart trust thereto, and upon the user commanding the application to send, a send attestation is constructed to accompany the send item. The send attestation is based on the monitored indicia and the authentication of the application and thereby describes the user interaction. The constructed send attestation is packaged with the constructed send item and the package is sent to the recipient.

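
    The flow of building, packaging, and checking a send attestation might look like the sketch here. Everything concrete in it is an assumption: an HMAC over a shared demo key stands in for whatever signature a real trusted component would produce, the indicia are reduced to a keystroke count and editing time, and the field names and the `min_keystrokes` policy are hypothetical.

```python
import hashlib
import hmac
import json

DEVICE_KEY = b"demo-key"  # hypothetical secret held by the attestation unit

def build_send_attestation(item, indicia, app_digest, key=DEVICE_KEY):
    """Bind the send item, the monitored indicia, and the authenticated app together."""
    claim = {
        "item_sha256": hashlib.sha256(item.encode()).hexdigest(),
        "indicia": indicia,        # e.g. keystrokes and seconds spent editing
        "app_sha256": app_digest,  # digest of the application that was authenticated
    }
    payload = json.dumps(claim, sort_keys=True).encode()
    return {"claim": claim, "tag": hmac.new(key, payload, hashlib.sha256).hexdigest()}

def package(item, attestation):
    return {"item": item, "attestation": attestation}

def verify(pkg, key=DEVICE_KEY, min_keystrokes=10):
    """Recipient side: check the tag, the item binding, and a simple effort policy."""
    claim = pkg["attestation"]["claim"]
    payload = json.dumps(claim, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(expected, pkg["attestation"]["tag"])
            and claim["item_sha256"] == hashlib.sha256(pkg["item"].encode()).hexdigest()
            and claim["indicia"]["keystrokes"] >= min_keystrokes)
```

    Hashing the item into the claim ties the attestation to one specific message, so an attestation earned for one send item cannot be reused for another.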

    Trees of classifiers for detecting email spam
    8.
    Invention application
    Status: Granted

    Publication No.: US20070038705A1

    Publication date: 2007-02-15

    Application No.: US11193691

    Filing date: 2005-07-29

    IPC classification: G06F15/16

    CPC classification: H04L51/12

    Abstract: Decision trees populated with classifier models are leveraged to provide enhanced spam detection utilizing separate email classifiers for each feature of an email. This provides a higher probability of spam detection through tailoring of each classifier model to facilitate in more accurately determining spam on a feature-by-feature basis. Classifiers can be constructed based on linear models such as, for example, logistic-regression models and/or support vector machines (SVM) and the like. The classifiers can also be constructed based on decision trees. “Compound features” based on internal and/or external nodes of a decision tree can be utilized to provide linear classifier models as well. Smoothing of the spam detection results can be achieved by utilizing classifier models from other nodes within the decision tree if training data is sparse. This forms a base model for branches of a decision tree that may not have received substantial training data.

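
    A tree whose nodes each hold a classifier, with a fallback for sparsely trained branches, can be sketched as follows. The sketch assumes logistic-regression nodes with fixed weights, routing on a single categorical email attribute per node, and one simple form of smoothing: score with the deepest node on the path that saw at least `min_train` training emails. The class, field, and parameter names are hypothetical, and training is out of scope.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    weights: dict      # linear classifier weights: feature name -> weight
    bias: float = 0.0
    n_train: int = 0   # number of training emails that reached this node
    split: str = ""    # email attribute used to pick a child; "" for a leaf
    children: dict = field(default_factory=dict)

def spam_probability(node, email, min_train=50):
    """Walk down the tree and score with the deepest sufficiently trained node."""
    best = node
    while node.split and email.get(node.split) in node.children:
        node = node.children[email[node.split]]
        if node.n_train >= min_train:
            best = node  # enough data here, so prefer the more specific model
    z = best.bias + sum(w * email.get(f, 0.0) for f, w in best.weights.items())
    return 1.0 / (1.0 + math.exp(-z))
```

    With this fallback the root model acts as the base model for every branch, so a branch with little training data still produces a sensible score.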