Abstract:
Determining the near-optimal block size for incremental-type expectation maximization (EM) algorithms is disclosed. Block size is determined based on the novel insight that the speed increase resulting from using an incremental-type EM algorithm as opposed to the standard EM algorithm is roughly the same for a given range of block sizes. Furthermore, this block size can be determined by an initial version of the EM algorithm that does not reach convergence. For a current block size, the speed increase is determined, and if the speed increase is the greatest determined so far, the current block size is set as the target block size. This process is repeated for new block sizes, until no new block sizes can be determined.
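The search loop described in this abstract can be sketched as follows. This is a minimal illustration, not the patented procedure. The names `doubling_sizes`, `select_block_size`, and `measure_speedup` are hypothetical, and generating candidates by doubling is an assumption; the abstract does not say how new block sizes are chosen. `measure_speedup` stands in for a short, non-converged run of the incremental EM algorithm that reports its speed increase over standard EM.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple


def doubling_sizes(n_records: int) -> Iterator[int]:
    """Yield candidate block sizes 1, 2, 4, ... up to n_records (assumed schedule)."""
    size = 1
    while size <= n_records:
        yield size
        size *= 2


def select_block_size(
    candidate_sizes: Iterable[int],
    measure_speedup: Callable[[int], float],
) -> Tuple[Optional[int], float]:
    """Return the candidate block size with the greatest measured speed increase.

    measure_speedup(size) is a caller-supplied (hypothetical) function that
    runs a few incremental-EM passes with the given block size, without
    waiting for convergence, and returns the observed speed increase.
    """
    best_size: Optional[int] = None
    best_speedup = float("-inf")
    for size in candidate_sizes:
        speedup = measure_speedup(size)
        # Keep the current size only if it beats every size tried so far.
        if speedup > best_speedup:
            best_size, best_speedup = size, speedup
    return best_size, best_speedup
```

A caller would pass something like `select_block_size(doubling_sizes(len(data)), run_partial_em)`, where `run_partial_em` is whatever timing routine is available. Because the abstract notes that the speed increase is roughly flat over a range of block sizes, a coarse candidate schedule of this kind is likely to land near the optimum.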
Abstract:
The present invention leverages machine learning techniques to provide automatic generation of conditioning variables for constructing a data perspective for a given target variable. The present invention determines and analyzes the best target variable predictors for a given target variable, employing them to facilitate the conveying of information about the target variable to a user. It automatically discretizes continuous and discrete variables utilized as target variable predictors to establish their granularity. In other instances of the present invention, a complexity and/or utility parameter can be specified to facilitate generation of the data perspective via analyzing a best target variable predictor versus the complexity of the conditioning variable(s) and/or utility. The present invention can also adjust the conditioning variables (i.e., target variable predictors) of the data perspective to provide an optimum view and/or accept control inputs from a user to guide/control the generation of the data perspective.
Abstract:
The present invention leverages curve fitting techniques to provide automatic detection of data anomalies in a “data tube” from a data perspective, allowing, for example, detection of on-screen, drill-down, and drill-across data anomalies in pivot tables and/or OLAP cubes. It determines whether data substantially deviates from a predicted value established by a curve fitting process such as, for example, a piece-wise linear function applied to the data tube. A threshold value can also be employed to determine the degree of deviation required before a data value is considered anomalous. The threshold value can be supplied dynamically and/or statically by a system and/or a user via a user interface. Additionally, the present invention provides an indication to a user of the type and location of a detected anomaly from a top-level data perspective.
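The deviation test described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions. The data tube is treated as a plain one-dimensional list of numbers, and the piece-wise linear prediction for each interior point is taken to be the straight line through its two neighbours. The function name `find_anomalies` and this particular fitting rule are hypothetical and are not taken from the source.

```python
from typing import List, Sequence, Tuple


def find_anomalies(
    values: Sequence[float], threshold: float
) -> List[Tuple[int, float, float]]:
    """Flag interior points that deviate from a piece-wise linear prediction.

    Each interior point is predicted by linear interpolation between its two
    neighbours (an assumed, very simple piece-wise linear fit). A point is
    reported as (index, actual, predicted) when the absolute deviation
    exceeds the threshold.
    """
    anomalies = []
    for i in range(1, len(values) - 1):
        predicted = (values[i - 1] + values[i + 1]) / 2.0
        if abs(values[i] - predicted) > threshold:
            anomalies.append((i, values[i], predicted))
    return anomalies
```

One limitation of this neighbour-based fit is that a large spike also distorts the predictions for the points next to it, so the threshold has to be high enough to ignore that secondary effect. A fit that excludes suspected outliers would be more robust.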
Abstract:
Systems that facilitate immunogen design are described herein. An optimization component is provided to determine an immunogen according to at least one criterion. The immunogen comprises a set of overlapping sequences comprising sequences that are known to be and/or are likely to be immunogenic. At least one of the sequences that are likely to be immunogenic can be determined by analyzing associations between a host and a pathogen at a population level. Methods of determining an epitome are described herein. A plurality of sequences are received. At least one of the sequences is predicted to be an epitope based on a relationship between a diverse trait of a population and a mutation of a pathogen. A collection of the plurality of sequences is optimized according to one or more criteria to determine the epitome. Epitomes and immunogens determined by the systems and methods described herein are also contemplated.
Abstract:
Systems and methods for determining the value of bids placed by content providers for placement positions on a page, e.g., a web page, rendered according to a given context, for instance, the search results listing for a particular query initiated on a search engine web site, are provided. Additionally, systems and methods are provided for determining placement of content items, e.g., advertisements and/or images, on a rendered page relative to other content items on the page based upon bid value.
Abstract:
A method describes user interaction in combination with sending a send item from an application of a computing device to a recipient. The computing device has an attestation unit thereon for attesting to trustworthiness. The application facilitates a user in constructing the send item, and pre-determined indicia are monitored that can be employed to detect that the user is in fact expending effort to construct the send item. The attestation unit authenticates the application to impart trust thereto, and upon the user commanding the application to send, a send attestation is constructed to accompany the send item. The send attestation is based on the monitored indicia and the authentication of the application and thereby describes the user interaction. The constructed send attestation is packaged with the constructed send item and the package is sent to the recipient.
Abstract:
Decision trees populated with classifier models are leveraged to provide enhanced spam detection utilizing separate email classifiers for each feature of an email. Tailoring each classifier model to a feature allows spam to be determined more accurately on a feature-by-feature basis, which raises the probability of detection. Classifiers can be constructed based on linear models such as, for example, logistic-regression models and/or support vector machines (SVM) and the like. The classifiers can also be constructed based on decision trees. “Compound features” based on internal and/or external nodes of a decision tree can be utilized to provide linear classifier models as well. Smoothing of the spam detection results can be achieved by utilizing classifier models from other nodes within the decision tree if training data is sparse. This forms a base model for branches of a decision tree that may not have received substantial training data.
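The idea of classifier models stored at tree nodes, with smoothing when training data is sparse, can be sketched as follows. This is an illustration only. The `Node` layout, the `min_train` cutoff, and the rule of falling back to the deepest well-trained node on the path from the root are all assumptions; the abstract says only that models from other nodes can be used. Each node holds a logistic-regression model as a weight dictionary.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    weights: Dict[str, float]            # logistic-regression weights at this node
    bias: float = 0.0
    n_train: int = 0                     # training examples that reached this node
    split_feature: Optional[str] = None  # None marks a leaf
    yes: Optional["Node"] = None         # child taken when the split feature is present
    no: Optional["Node"] = None          # child taken otherwise


def spam_probability(root: Node, features: Dict[str, float], min_train: int = 50) -> float:
    """Score an email with the deepest well-trained model on its tree path."""
    node: Optional[Node] = root
    model = root  # the root model is the fallback of last resort
    while node is not None:
        # Assumed smoothing rule: only trust nodes with enough training data.
        if node.n_train >= min_train:
            model = node
        if node.split_feature is None:
            break
        present = features.get(node.split_feature, 0.0) > 0
        node = node.yes if present else node.no
    z = model.bias + sum(w * features.get(name, 0.0) for name, w in model.weights.items())
    return 1.0 / (1.0 + math.exp(-z))
```

With this rule, an email routed to a leaf that saw only a handful of training messages is scored by an ancestor's model instead, which matches the abstract's description of a base model for branches without substantial training data.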
Abstract:
The subject invention provides systems and methods that facilitate AIDS vaccine cocktail assembly via machine learning techniques such as a cost function, a greedy algorithm, an expectation-maximization (EM) algorithm, etc. Such assembly can be utilized to generate vaccine cocktails for species of pathogens that evolve quickly under immune pressure of the host. For example, the systems and methods of the subject invention can be utilized to facilitate design of T cell vaccines for pathogens such as HIV. In addition, the systems and methods of the subject invention can be utilized in connection with other applications, such as, for example, sequence alignment, motif discovery, classification, and recombination hot spot detection. The novel techniques described herein can provide for improvements over traditional approaches to designing vaccines by constructing vaccine cocktails with higher epitope coverage, for example, in comparison with cocktails of consensi, tree nodes and random strains from data.
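The greedy approach to epitope coverage mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the patented method. It assumes that epitopes are approximated by fixed-length substrings (k-mers) of the target sequences and that the cost function is simply the number of newly covered k-mers. The names `kmers` and `greedy_cocktail` are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_cocktail(
    candidates: Sequence[str], targets: Sequence[str], k: int, size: int
) -> Tuple[List[str], float]:
    """Greedily pick up to `size` candidates that cover the most target k-mers.

    Returns the chosen sequences and the fraction of target k-mers covered.
    """
    wanted: Set[str] = set()
    for target in targets:
        wanted |= kmers(target, k)

    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = list(candidates)
    while remaining and len(chosen) < size:
        # Pick the candidate that adds the most not-yet-covered target k-mers.
        best = max(remaining, key=lambda s: len((kmers(s, k) & wanted) - covered))
        gain = (kmers(best, k) & wanted) - covered
        if not gain:
            break  # no candidate improves coverage any further
        chosen.append(best)
        covered |= gain
        remaining.remove(best)

    coverage = len(covered) / len(wanted) if wanted else 0.0
    return chosen, coverage
```

Greedy selection of this kind does not guarantee the best possible cocktail, but it is a standard heuristic for coverage problems and makes the coverage comparison against consensus or random strains easy to compute.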