Abstract:
A technique, specifically a method and an apparatus that implements the method, which, through a probabilistic classifier (370) and for a given recipient, detects electronic mail (e-mail) messages in an incoming message stream that the recipient is likely to consider "junk". Specifically, the invention discriminates message content for that recipient through a probabilistic classifier (e.g., a support vector machine) trained on prior content classifications. Through a resulting quantitative probability measure, i.e., an output confidence level, produced by the classifier for each message and subsequently compared against a predefined threshold, that message is classified as either, e.g., spam or legitimate mail and is then, e.g., stored in a corresponding folder (223, 227) for subsequent retrieval by and display to the recipient. Based on the probability measure, the message can alternatively be classified into one of a number of different folders, depicted in a predefined visually distinctive manner, or simply discarded in its entirety.
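The threshold comparison described above can be sketched as follows. This is a minimal illustration only: the function name `route_message`, the folder labels, and the default threshold of 0.9 are assumptions made for the example and do not come from the abstract.

```python
def route_message(spam_probability: float, threshold: float = 0.9) -> str:
    """Pick a folder from the classifier's output confidence level.

    `spam_probability` stands in for the probability measure produced by
    the classifier; `threshold` is the predefined cut-off. Both the
    folder names and the default threshold are hypothetical.
    """
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must lie in [0, 1]")
    # At or above the threshold the message is treated as junk,
    # otherwise it is delivered as legitimate mail.
    return "junk" if spam_probability >= threshold else "inbox"


print(route_message(0.97))  # high confidence: routed to the junk folder
print(route_message(0.12))  # low confidence: routed to the inbox
```

Lowering the threshold catches more junk at the cost of misrouting more legitimate mail, which is why a per-recipient setting may be preferable to a single global value.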
Abstract:
The subject invention provides for an advanced and robust system and method that facilitates detecting spam. The system and method include components as well as other operations which enhance or promote finding characteristics that are difficult for the spammer to avoid and finding characteristics in non-spam that are difficult for spammers to duplicate. Exemplary characteristics include examining origination features in pairs, analyzing character and/or number sequences, strings, and sub-strings, detecting various entropy levels of one or more character sequences, strings and/or sub-strings as well as analyzing message and/or feature sizes.
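One way to measure the "entropy level" of a character sequence is Shannon entropy over character frequencies. The sketch below assumes that definition; the abstract does not say which entropy measure is used, and the function name `char_entropy` is hypothetical.

```python
import math
from collections import Counter


def char_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # Sum -p * log2(p) over each distinct character's relative frequency p.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A repetitive string scores low; a random-looking token scores higher.
print(char_entropy("aaaaaaaa"))
print(char_entropy("x7Qz9Lp2"))
```

A filter could feed such a score to the classifier as one feature among many, since random-looking strings in a subject line or address tend to score higher than ordinary words.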
Abstract:
A visualization input system is provided. The system includes a visualization component that receives input gestures from a user (or users) and translates the gestures into one or more data manipulation commands. A distribution component receives the data manipulation commands and propagates data modifications across one or more databases in view of the commands. This includes a rights component that enables the data modifications to be implemented across the one or more databases.
Abstract:
The invention provides systems and methods that can be used for targeted advertising. The system determines where to present impressions, such as advertisements, to maximize an expected utility subject to one or more constraints, which can include quotas and minimum utilities for groups of one or more impressions. The traditional measure of utility in web-based advertising is click-through rates, but the present invention provides a broader definition of utility, including measures of sales, profits, or brand awareness, for example. This broader definition permits advertisements to be allocated more in accordance with the actual interests of advertisers.
Abstract:
The claimed subject matter provides systems and/or methods that determine a number of non-spurious arcs associated with a learned graphical model. The system can include devices and mechanisms that utilize learning algorithms and datasets to generate learned graphical models and graphical models associated with null permutations of the datasets, ascertaining the average number of arcs associated with the graphical models associated with null permutations of the datasets, enumerating the total number of arcs affiliated with the learned graphical model, and presenting a ratio of the average number of arcs to the total number of arcs, the ratio indicative of the number of non-spurious arcs associated with the learned graphical model.
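The ratio described above is simple arithmetic once the arc counts are known. The sketch below assumes the counts have already been obtained from some structure-learning step, which is not shown; the function names are hypothetical, and reading the ratio as the expected fraction of spurious arcs is an interpretation rather than something the abstract states.

```python
from typing import Sequence


def null_to_learned_arc_ratio(learned_arcs: int, null_arcs: Sequence[int]) -> float:
    """Average arc count over null-permutation models divided by the
    total arc count of the model learned from the real data."""
    if learned_arcs <= 0:
        raise ValueError("learned model must contain at least one arc")
    if not null_arcs:
        raise ValueError("need at least one null-permutation model")
    average_null = sum(null_arcs) / len(null_arcs)
    return average_null / learned_arcs


def estimated_non_spurious(learned_arcs: int, null_arcs: Sequence[int]) -> float:
    """Assumed interpretation: arcs that remain after discounting the
    fraction expected to appear by chance."""
    return learned_arcs * (1.0 - null_to_learned_arc_ratio(learned_arcs, null_arcs))


# Learned model has 20 arcs; three null permutations yielded 2, 4 and 6 arcs.
print(null_to_learned_arc_ratio(20, [2, 4, 6]))
print(estimated_non_spurious(20, [2, 4, 6]))
```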
Abstract:
Provided are systems and/or methods that facilitate sensing, detecting, or treatment of a condition or need of a living body using a genetically engineered symbiotic agent.
Abstract:
Architecture for detecting and removing obfuscating clutter from the subject and/or body of a message, e.g., e-mail, prior to filtering of the message, to identify junk messages commonly referred to as spam. The technique utilizes the powerful features built into an HTML rendering engine to strip the HTML instructions for all non-substantive aspects of the message. Pre-processing includes pre-rendering of the message into a final format, which final format is that which is displayed by the rendering engine to the user. The final format message is then converted to a text-only format to remove graphics, color, non-text decoration, and spacing that cannot be rendered as ASCII-style or Unicode-style characters. The result is essentially to reduce each message to its common denominator essentials so that the junk mail filter can view each message on an equal basis.
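The effect of reducing a message to text before filtering can be illustrated with Python's standard `html.parser` module. This is a stand-in only: it parses markup without rendering it, whereas the abstract relies on a full HTML rendering engine, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects character data while ignoring tags, comments, scripts and styles."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self) -> str:
        # Join fragments first so that words split by tags are reassembled,
        # then collapse runs of whitespace.
        return " ".join("".join(self._chunks).split())


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# A word broken up by a comment and a tag is reassembled for the filter.
print(html_to_text("<p>V<!-- x -->ia<b>gr</b>a</p>"))
```

Because comments and inline tags are dropped before the fragments are joined, a keyword that was split up to evade matching reaches the junk mail filter as a single word.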
Abstract:
Targeted delivery of items with inventory management using a cluster-based approach or a rule-based approach is disclosed. An example of items is advertisements. Each item is allocated to one or more clusters. The allocation is made based on a predetermined criterion accounting for at least a quota for each item and possibly a constraint for each cluster. The former can refer to the number of times an item must be shown. The latter can refer to the number of times a given group of web pages is likely to be visited by users, and hence is the number of times items can be shown in a given cluster. The invention is not limited to any particular definition of what constitutes a cluster or item.
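The interplay between item quotas and cluster constraints can be sketched with a simple greedy fill. This is not the allocation criterion of the disclosure, which the abstract leaves unspecified; the function name `allocate`, the item names, and the cluster names are all hypothetical.

```python
from typing import Dict, Tuple


def allocate(quotas: Dict[str, int],
             capacities: Dict[str, int]) -> Dict[Tuple[str, str], int]:
    """Greedily assign each item's quota of showings to clusters.

    `quotas` maps an item to the number of times it must be shown;
    `capacities` maps a cluster to the number of showings it can supply.
    Returns a mapping of (item, cluster) to the showings assigned.
    """
    remaining = dict(capacities)  # copy so the caller's dict is untouched
    plan: Dict[Tuple[str, str], int] = {}
    for item, quota in quotas.items():
        needed = quota
        for cluster in remaining:
            if needed == 0:
                break
            take = min(needed, remaining[cluster])
            if take > 0:
                plan[(item, cluster)] = take
                remaining[cluster] -= take
                needed -= take
        if needed > 0:
            raise ValueError(f"quota for {item!r} cannot be met")
    return plan


plan = allocate({"ad_a": 5, "ad_b": 3}, {"sports": 4, "news": 6})
for (item, cluster), shows in plan.items():
    print(item, cluster, shows)
```

A greedy fill ignores how well an item fits a cluster; a real allocator would presumably weigh that fit, which is where the predetermined criterion comes in.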
Abstract:
Epitope prediction models are described herein. By way of example, a system for predicting epitope information relating to an epitope can include a classification model (e.g., logistic regression model). The trained classification model can illustratively operatively execute one or more logistic functions on received protein data, and incorporate one or more of hidden binary variables and shift variables that when processed represent the identification (e.g., prediction) of one or more desired epitopes. The classification model can be configured to predict the epitope information by processing data including various features of an epitope, MHC, MHC supertype, and Boolean combinations thereof.
Abstract:
A streaming media caching mechanism and cache manager efficiently establish and maintain the contents of a streaming media cache for use in serving streaming media requests from cache rather than from an original data source when appropriate. The cost of caching is incurred only when the benefits of caching are likely to be experienced. The caching mechanism and cache manager evaluate the request count for each requested URL to determine whether the URL represents a cache candidate, and further analyze the URL request rate to determine whether the content associated with the URL will be cached. In an embodiment, the streaming media cache is maintained with a predetermined amount of reserve capacity rather than being filled to capacity whenever possible.
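The two-stage decision described above (request count first, then request rate) and the reserve-capacity rule can be sketched as follows. All thresholds, parameter names, and function names are assumptions chosen for illustration; the abstract does not give concrete values or say how the rate is measured.

```python
from typing import Sequence


def should_cache(request_times: Sequence[float], now: float,
                 min_count: int = 3, window_sec: float = 60.0,
                 min_rate: float = 0.05) -> bool:
    """Decide whether the content behind one URL should be cached.

    `request_times` holds the timestamps (in seconds) of past requests
    for that URL. The thresholds are hypothetical defaults.
    """
    # Stage 1: too few requests overall, so the URL is not a cache candidate.
    if len(request_times) < min_count:
        return False
    # Stage 2: the recent request rate must justify the cost of caching.
    recent = [t for t in request_times if now - t <= window_sec]
    return len(recent) / window_sec >= min_rate


def has_room(used_bytes: int, item_bytes: int, capacity_bytes: int,
             reserve_fraction: float = 0.1) -> bool:
    """Admit an item only if a reserve share of the cache stays free."""
    usable = capacity_bytes * (1.0 - reserve_fraction)
    return used_bytes + item_bytes <= usable


print(should_cache([0, 10, 20, 25], now=30))   # frequent and recent
print(should_cache([0, 1, 2, 3], now=1000))    # popular once, now idle
print(has_room(850, 100, 1000))                # would eat into the reserve
```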