Abstract:
The subject invention provides for a feedback loop system and method that facilitate classifying items in connection with spam prevention in server and/or client-based architectures. The invention makes use of a machine-learning approach as applied to spam filters, and in particular, randomly samples incoming email messages so that examples of both legitimate and junk/spam mail are obtained to generate sets of training data. Users who are identified as spam-fighters are asked to vote on whether each of a selection of their incoming email messages is legitimate mail or junk mail. A database stores the properties for each mail and voting transaction, such as user information, message properties and content summary, and polling results for each message, to generate training data for machine learning systems. The machine learning systems facilitate creating improved spam filter(s) that are trained to recognize both legitimate mail and spam mail and to distinguish between them.
Abstract:
The present invention involves a system and method that facilitate extracting data from messages for spam filtering. The extracted data can be in the form of features, which can be employed in connection with machine learning systems to build improved filters. Data associated with origination information as well as other information embedded in the body of the message that allows a recipient of the message to contact and/or respond to the sender of the message can be extracted as features. The features, or a subset thereof, can be normalized and/or deobfuscated prior to being employed as features of the machine learning systems. The (deobfuscated) features can be employed to populate a plurality of feature lists that facilitate spam detection and prevention. Exemplary features include an email address, an IP address, a URL, an embedded image pointing to a URL, and/or portions thereof.
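
The extraction and normalization steps described above can be sketched as follows. This is a minimal illustration, not the patented method: the regular expressions, the feature-name prefixes (`url_host:`, `email:`, `email_domain:`, `ip:`), and the helper names are all assumptions introduced here, and normalization is limited to lowercasing and splitting a host into its parent-domain portions.

```python
import re
from urllib.parse import urlsplit

# Deliberately simple patterns (assumptions for illustration only).
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def host_portions(host):
    """Return the host and its parent-domain suffixes (a.b.com -> a.b.com, b.com)."""
    labels = host.lower().strip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]


def extract_features(body):
    """Collect normalized contact features found in a message body."""
    features = set()
    for url in URL_RE.findall(body):
        # urlsplit(...).hostname is already lowercased by the standard library.
        host = urlsplit(url).hostname or ""
        for portion in host_portions(host):
            features.add("url_host:" + portion)
    for addr in EMAIL_RE.findall(body):
        _, _, domain = addr.rpartition("@")
        features.add("email:" + addr.lower())
        features.add("email_domain:" + domain.lower())
    for ip in IPV4_RE.findall(body):
        features.add("ip:" + ip)
    return features


sample = "Visit HTTP://Shop.Example.COM/buy?id=1 or mail Sales@Example.com from 192.0.2.7"
print(sorted(extract_features(sample)))
```

Each resulting string could then populate a feature list or serve as an input to a machine learning system. Handling of deliberately obfuscated text is left out of this sketch.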
Abstract:
The present invention involves a system and method that facilitate extracting data from messages for spam filtering. The extracted data can be in the form of features, which can be employed in connection with machine learning systems to build improved filters. Data associated with origination information as well as other information embedded in the body of the message that allows a recipient of the message to contact and/or respond to the sender of the message can be extracted as features. The features, or a subset thereof, can be normalized and/or deobfuscated prior to being employed as features of the machine learning systems. The (deobfuscated) features can be employed to populate a plurality of feature lists that facilitate spam detection and prevention. Exemplary features include an email address, an IP address, a URL, an embedded image pointing to a URL, and/or portions thereof.
Abstract:
The present invention involves a system and method that facilitate extracting data from messages for spam filtering. The extracted data can be in the form of features, which can be employed in connection with machine learning systems to build improved filters. Data associated with the subject line, timestamps, and the message body can be extracted and employed to generate one or more features. In particular, subject lines and message bodies can be examined for consecutive, repeating characters and blobs, and for the association or distance between such characters, blobs, and non-blob portions of the message. The values or counts obtained can be broken down into one or more ranges corresponding to a degree of spaminess. The presence and type of attachments to messages, the percentage of non-white-space and non-numeric characters of a message, and message delivery times can be used to identify spam. A time-based delta can be computed to facilitate determining the delivery time.
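
Two of the measurements mentioned above, the length of repeated-character runs broken into ranges and the share of non-white-space, non-numeric characters, can be sketched as below. The bucket boundaries and feature names in `RUN_BUCKETS` are hypothetical values chosen for illustration; the abstract does not specify any ranges.

```python
from itertools import groupby

# Hypothetical range boundaries: (inclusive upper bound, feature name).
RUN_BUCKETS = [(1, "run_0_1"), (3, "run_2_3"), (7, "run_4_7")]


def longest_repeat_run(text):
    """Length of the longest run of one consecutively repeated character."""
    return max((sum(1 for _ in group) for _, group in groupby(text)), default=0)


def bucket_run(length):
    """Map a run length onto a named range."""
    for upper, name in RUN_BUCKETS:
        if length <= upper:
            return name
    return "run_8_plus"


def non_space_non_digit_ratio(text):
    """Fraction of characters that are neither white space nor digits."""
    if not text:
        return 0.0
    kept = sum(1 for ch in text if not ch.isspace() and not ch.isdigit())
    return kept / len(text)


subject = "FREE!!!!! money"
print(longest_repeat_run(subject), bucket_run(longest_repeat_run(subject)))
```

Bucketing turns a raw count into a small set of categorical features, so a learner can weight "4 to 7 repeats" differently from "8 or more" without assuming a linear relationship.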
Abstract:
Distributed sender reputations are described. In an implementation, a method includes evaluating multiple characteristics of message delivery to establish a reputation for a sender of the message by a mail transfer agent and sharing data which describes the evaluation with another mail transfer agent.
Abstract:
Techniques are presented for assigning reputations to email senders. In one implementation, real-time statistics and heuristics are constructed, stored, analyzed, and used to formulate a sender reputation level for use in evaluating and controlling a given sender's connection to a message transfer agent or email recipient. A sender with an unfavorable reputation may be denied a connection before resources are spent receiving and processing email messages from the sender. A sender with a favorable reputation may be rewarded by having safeguards removed from the connection, which also saves system resources. The statistics and heuristics may include real-time analysis of traffic patterns and delivery characteristics used by an email sender, analysis of content, and historical or time-sliced views of all of the above.
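
One way to picture the idea of turning per-sender statistics into a reputation level that gates connections is the sketch below. Everything specific in it is an assumption made for illustration: the single statistic (share of flagged messages), the three level names, the thresholds, and the `ReputationTable` class itself. The abstract describes a richer mix of traffic, delivery, content, and time-sliced signals.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SenderStats:
    messages: int = 0  # messages seen from this sender
    flagged: int = 0   # messages judged to be spam


class ReputationTable:
    """Hypothetical per-sender reputation store keyed by sender IP."""

    def __init__(self, min_messages=10, block_ratio=0.8, trust_ratio=0.05):
        self.stats = defaultdict(SenderStats)
        self.min_messages = min_messages  # evidence needed before judging
        self.block_ratio = block_ratio    # at or above: unfavorable
        self.trust_ratio = trust_ratio    # at or below: favorable

    def record(self, sender_ip, flagged):
        entry = self.stats[sender_ip]
        entry.messages += 1
        if flagged:
            entry.flagged += 1

    def level(self, sender_ip):
        entry = self.stats.get(sender_ip)  # .get avoids creating an entry
        if entry is None or entry.messages < self.min_messages:
            return "neutral"
        ratio = entry.flagged / entry.messages
        if ratio >= self.block_ratio:
            return "unfavorable"
        if ratio <= self.trust_ratio:
            return "favorable"
        return "neutral"

    def accept_connection(self, sender_ip):
        """Refuse the connection up front for unfavorable senders."""
        return self.level(sender_ip) != "unfavorable"


table = ReputationTable(min_messages=5)
for _ in range(5):
    table.record("203.0.113.9", flagged=True)
print(table.level("203.0.113.9"), table.accept_connection("203.0.113.9"))
```

Checking `accept_connection` before any message data is read reflects the resource saving the abstract points to: a poorly rated sender is turned away before receiving and processing costs are incurred.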
Abstract:
Network domain reputation-based spam filtering is described. In an embodiment, emails are received from a network domain and a reputation of the network domain is established. Additional emails are filtered as they are received to determine a status of each email as spam email or not spam email. An email can be determined to be a spam email based on any one or more of the reputation of the network domain, an authentication status of an email, and other information that can be derived from an email.
Abstract:
Message header spam filtering is described. In an embodiment, a message is received that includes header entries arranged in an ordered sequence which indicates a path by which the message was communicated. The header entries are parsed to categorize each header entry as a header type where the header types are listed in the ordered sequence. A quantity of each different header type is determined, and a determination is made as to whether the message is likely a spam message based at least in part on the quantity corresponding to a particular header type. In another embodiment, a numeric representation of the ordered sequence is created where the numeric representation includes unique integers assigned to each different header type. A determination is made as to whether the message is likely a spam message based at least in part on the numeric representation of the ordered sequence of header types.
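
The two representations described above, a count per header type and a numeric encoding of the ordered sequence, can be sketched as follows. The parsing is simplified, and assigning integers in first-seen order is an assumption; the abstract only requires that each different header type receive a unique integer.

```python
from collections import Counter


def header_types(raw_headers):
    """Return header names in order, lowercased, skipping folded continuation lines."""
    types = []
    for line in raw_headers.splitlines():
        if not line or line[0] in " \t":
            continue  # blank line or continuation of the previous header
        name, sep, _ = line.partition(":")
        if sep:
            types.append(name.strip().lower())
    return types


def count_types(types):
    """Quantity of each different header type."""
    return Counter(types)


def numeric_sequence(types):
    """Encode the ordered sequence using one integer per distinct header type."""
    ids = {}
    sequence = []
    for header_type in types:
        if header_type not in ids:
            ids[header_type] = len(ids) + 1  # assumption: IDs in first-seen order
        sequence.append(ids[header_type])
    return sequence


raw = (
    "Received: from a\n"
    "\tby b\n"
    "Received: from c\n"
    "From: x@example.com\n"
    "Subject: hi\n"
    "Received: from d\n"
)
types = header_types(raw)
print(count_types(types)["received"], numeric_sequence(types))
```

Either output could feed a classifier: the counts capture, for example, an unusual number of `Received` entries, while the numeric sequence preserves the order in which header types appear.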
Abstract:
Embedded surface wave antenna elements incorporating different dielectric materials or other features are provided. The different dielectric materials can be arranged adjacent a feed, to absorb energy that can cause undesirable reflections in the antenna element. In addition or alternatively, different dielectric materials can be arranged to alter the velocity of energy through the antenna element, and to control or attenuate the formation of nulls in the far field at angles of interest. The control or attenuation of nulls in the far field at angles of interest can further be controlled through contouring an antenna element ground plane in a lens region of the antenna element. A buried feed arrangement is also described.