Abstract:
A system is described that can analyze a multi-dimensional input and then establish a search query based on features extracted from that input. In a particular example, an image can be used as an input to a search mechanism. Pattern recognition and image analysis can be applied to the image to establish a search query that corresponds to the features extracted from it. The system can also facilitate indexing multi-dimensional searchable items, thereby making them available for retrieval as results of a search query. More particularly, the system can employ text analysis and pattern and/or speech recognition mechanisms to extract features from searchable items. These extracted features can be employed to index the searchable items.
Abstract:
A system and method that facilitate optimizing a classifier for greater performance in a specific region of classification that is of interest, such as a low false positive rate or a low false negative rate. A two-stage classification model can be trained and employed, where the first stage classifier is optimized over the entire classification region and the second stage classifier is optimized for the specific region of interest. During training, the entire set of training data is employed by the first stage classifier. Only data that is classified by the first stage classifier, or by cross validation, as falling within the region of interest is used to train the second stage classifier. During classification, data that the first stage classifier places within the region of interest is given the second stage classifier's classification value; otherwise, the classification value from the first stage classifier is used.
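The routing between the two stages can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that instances the first stage places inside the region of interest receive the second stage's value, and the threshold of 0.8, the function names, and the toy scorers are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]
Scorer = Callable[[Features], float]


def in_region_of_interest(score: float, threshold: float = 0.8) -> bool:
    """Hypothetical region of interest: high first-stage scores."""
    return score >= threshold


def select_second_stage_training(
    examples: List[Tuple[Features, int]], first_stage: Scorer
) -> List[Tuple[Features, int]]:
    """Keep only the examples that the first stage places in the region."""
    return [(x, y) for x, y in examples if in_region_of_interest(first_stage(x))]


def two_stage_score(x: Features, first_stage: Scorer, second_stage: Scorer) -> float:
    """Use the second stage inside the region, the first stage elsewhere."""
    first = first_stage(x)
    if in_region_of_interest(first):
        return second_stage(x)
    return first


# Toy scorers standing in for trained models (assumptions for the demo).
def toy_first_stage(x: Features) -> float:
    return x[0]


def toy_second_stage(x: Features) -> float:
    return 0.5 * (x[0] + x[1])


if __name__ == "__main__":
    print(two_stage_score([0.9, 0.1], toy_first_stage, toy_second_stage))
    print(two_stage_score([0.3, 0.9], toy_first_stage, toy_second_stage))
```

In practice both scorers would be trained models; the second would be fitted only on the output of `select_second_stage_training`.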
Abstract:
The present invention involves a system and method that facilitate extracting data from messages for spam filtering. The extracted data can be in the form of features, which can be employed in connection with machine learning systems to build improved filters. Data associated with origination information as well as other information embedded in the body of the message that allows a recipient of the message to contact and/or respond to the sender of the message can be extracted as features. The features, or a subset thereof, can be normalized and/or deobfuscated prior to being employed as features of the machine learning systems. The (deobfuscated) features can be employed to populate a plurality of feature lists that facilitate spam detection and prevention. Exemplary features include an email address, an IP address, a URL, an embedded image pointing to a URL, and/or portions thereof.
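A minimal sketch of this kind of feature extraction, using only the Python standard library. The regular expressions, the feature-name prefixes, and the normalization steps (lowercasing and percent-decoding the host) are illustrative assumptions, not the filters described in the abstract.

```python
import ipaddress
import re
from urllib.parse import unquote, urlsplit

# Deliberately simple patterns; real messages need more robust parsing.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_features(body: str) -> set:
    """Return normalized contact features found in a message body."""
    features = set()

    for address in EMAIL_RE.findall(body):
        address = address.lower()
        features.add("email:" + address)
        features.add("email_domain:" + address.split("@", 1)[1])

    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname  # already lowercased by urlsplit
        except ValueError:
            continue  # malformed URL; skip it
        if host:
            # Percent-decoding undoes one simple obfuscation, e.g. "%2E" for ".".
            features.add("url_host:" + unquote(host).rstrip("."))

    for candidate in IPV4_RE.findall(body):
        try:
            features.add("ip:" + str(ipaddress.ip_address(candidate)))
        except ValueError:
            pass  # looked like an address but is not a valid one

    return features


if __name__ == "__main__":
    sample = (
        "Reply to Sales@Example.COM or visit "
        "http://WWW.Example%2Ecom/offer via 192.0.2.10"
    )
    print(sorted(extract_features(sample)))
```

The resulting strings could then populate feature lists or feed a learned filter.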
Abstract:
Various embodiments can utilize information that is displayed for a user to automatically generate a list of keywords and use that list as a means to display supplemental information that is relevant to the keywords. In at least some embodiments, the displayed information is analyzed using an extraction algorithm to identify words or, more generally, character strings of interest. If these words or character strings of interest are determined to constitute relevant search terms or “keywords”, then a special user interface portion can be used to display this supplemental information along with the information that is already displayed for the user. This supplemental information can include the search terms themselves, ads that pertain to the search terms, and/or search results that have been ascertained from a web search engine.
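One simple way to picture the extraction step is a term-frequency heuristic. The abstract does not specify the extraction algorithm, so the sketch below swaps in plain word counting; the stop-word list, the minimum word length, and the function name are assumptions.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real system would use a fuller one.
STOPWORDS = frozenset(
    {"the", "and", "for", "that", "this", "these", "those", "with", "from", "have"}
)


def extract_keywords(text: str, top_n: int = 3, min_length: int = 4) -> list:
    """Return the most frequent non-stop-words in the displayed text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOPWORDS
    )
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


if __name__ == "__main__":
    page = "Hiking boots review: these hiking boots keep feet dry on every trail."
    print(extract_keywords(page, top_n=2))
```

The returned keywords could then be submitted to a search engine or an ad service to fetch the supplemental information.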
Abstract:
Phishing detection, prevention, and notification is described. In an embodiment, a messaging application facilitates communication via a messaging user interface, and receives a communication, such as an email message, from a domain. A phishing detection module detects a phishing attack in the communication by determining that the domain is similar to a known phishing domain, or by detecting suspicious network properties of the domain. In another embodiment, a Web browsing application receives content, such as data for a Web page, from a network-based resource, such as a Web site or domain. The Web browsing application initiates a display of the content, and a phishing detection module detects a phishing attack in the content by determining that a domain of the network-based resource is similar to a known phishing domain, or that an address of the network-based resource from which the content is received has suspicious network properties.
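The domain-similarity check can be illustrated with an edit-distance comparison. The abstract does not name a similarity measure, so Levenshtein distance, the `max_distance` threshold of 2, and the example domains are assumptions made for this sketch.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def is_similar_to_known_phishing(
    domain: str, known_phishing_domains: Iterable[str], max_distance: int = 2
) -> bool:
    """True if the domain is within max_distance edits of any listed domain."""
    domain = domain.lower()
    return any(
        edit_distance(domain, known.lower()) <= max_distance
        for known in known_phishing_domains
    )


if __name__ == "__main__":
    known = ["examp1e-login.test"]
    print(is_similar_to_known_phishing("examp1e-logln.test", known))
    print(is_similar_to_known_phishing("unrelated-site.example", known))
```

A fixed edit-distance threshold is crude; a deployed detector would likely combine it with the network-property checks the abstract mentions.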
Abstract:
Providing for generation of a task oriented data structure that can correlate natural language descriptions of computer related tasks to application level commands and functions is described herein. By way of example, a system can include an activity translation component that can receive a natural language description of an application level task. Furthermore, the system can include a language modeling component that can generate the data structure based on an association between the description of the task and at least one application level command utilized in executing the computer related task. Once generated, the data structure can be utilized to automate computer related tasks by input of a human centric description of those tasks. According to further embodiments, machine learning can be employed to train classifiers and heuristic models to optimize task/description relationships and/or tailor such relationships to the needs of particular users.
Abstract:
Out-of-vocabulary (OOV) word determination corresponding to a key sequence entered by the user on a (typically numeric) keypad, and a user interface for the user to select one of the words, are disclosed. A word-determining logic determines letter sequences corresponding to the entered key sequence, and presents the sequences within the user interface in which the user can select one of the letter sequences as the intended word, or select the first letter of the intended word. When letters are selected, the word-determining logic determines new letter sequences, consistent with the key sequence and the selected letters, and presents the new letter sequences. The user again selects one of the letter sequences as the intended word, or selects the second letter of the intended word. This process is repeated until the user has selected the intended word.
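The narrowing loop can be sketched by enumerating the letter sequences consistent with a key sequence and with the letters chosen so far. The standard phone keypad layout and exhaustive enumeration are assumptions here; the abstract does not say how candidate sequences are generated or ordered.

```python
from itertools import product

# Standard letter assignment on a numeric phone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def letter_sequences(keys: str, selected: str = "") -> list:
    """List letter sequences matching the keys and the already selected letters."""
    groups = []
    for position, key in enumerate(keys):
        letters = KEYPAD[key]
        if position < len(selected):
            if selected[position] not in letters:
                return []  # selection is inconsistent with the key sequence
            groups.append(selected[position])
        else:
            groups.append(letters)
    return ["".join(combo) for combo in product(*groups)]


if __name__ == "__main__":
    print(len(letter_sequences("43")))  # all candidates for keys 4 then 3
    print(letter_sequences("43", "h"))  # after the user picks "h" as first letter
```

Each letter the user selects shrinks the candidate list, which mirrors the repeated select-and-refine process described above. Because the number of candidates grows exponentially with the key count, a real interface would presumably show only the most likely few.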
Abstract:
Content management architecture for a portable wireless device. Caching and fetching techniques are provided to improve content handling for portable devices such as cellular telephones and portable computers. A search component automatically performs searches as a background process, and potentially desired content is received and cached by a content storing component to be available in the future when and if needed, mitigating latency associated with slow download speeds, refresh rates, and other system and/or network impediments. Content from background search results can be trickled into the device as part of the background process so as not to burden system resources for other processes. As part of memory management, aged and/or low priority or low interest content can be selectively removed or archived to increase available cache or memory space, as well as to maintain relevant content within the device. A presentation component facilitates presentation of the pre-stored content.
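The memory-management step can be pictured as a small cache that drops aged entries first and then the lowest-priority ones. The class name, the numeric priority, and the eviction order are hypothetical choices for illustration; the abstract does not prescribe a policy.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedItem:
    content: bytes
    priority: int     # higher means more interesting to the user (assumption)
    stored_at: float  # time the item was cached, in seconds


class ContentCache:
    def __init__(self, capacity: int, max_age: float) -> None:
        self.capacity = capacity
        self.max_age = max_age
        self.items: Dict[str, CachedItem] = {}

    def put(self, key: str, content: bytes, priority: int, now: float) -> None:
        """Store prefetched content, then trim the cache."""
        self.items[key] = CachedItem(content, priority, now)
        self.evict(now)

    def get(self, key: str) -> Optional[bytes]:
        item = self.items.get(key)
        return item.content if item else None

    def evict(self, now: float) -> None:
        # 1. Remove content older than max_age.
        aged = [k for k, v in self.items.items() if now - v.stored_at > self.max_age]
        for key in aged:
            del self.items[key]
        # 2. While over capacity, remove the lowest-priority, then oldest, item.
        while len(self.items) > self.capacity:
            victim = min(
                self.items,
                key=lambda k: (self.items[k].priority, self.items[k].stored_at),
            )
            del self.items[victim]


if __name__ == "__main__":
    cache = ContentCache(capacity=2, max_age=100)
    cache.put("a", b"1", priority=1, now=0)
    cache.put("b", b"2", priority=5, now=10)
    cache.put("c", b"3", priority=3, now=20)
    print(sorted(cache.items))
```

Passing `now` explicitly keeps the sketch deterministic; a device implementation would read a clock and might archive evicted items instead of deleting them.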
Abstract:
Architecture for targeted advertising using offline user behavior information. Information relating to offline behavior can be collected from cell phones, geolocation systems, credit card information, restaurants, grocery stores, etc., and this information is aggregated and employed in connection with selecting and displaying targeted advertising to a user when online. Machine learning and reasoning can be employed to make inferences and dynamically tune advertisement processing. Offline user information can also be employed to enhance context-based searching when the user goes online. The ranking of search results and content for display can be modified as a function of offline behavior. A system is provided that facilitates online advertising based on at least offline activity using a profile component for aggregating offline behavior information of a user and generating a related user profile. An advertising component employs the user profile in connection with delivery of an advertisement to the user when online.
Abstract:
An architecture is provided for data mining of electronic messages to extract information relating to relevancy and popularity of websites and/or web pages for ranking of web pages or other documents. A monitor component monitors information of a message for a reference to a web page or other document, and a ranking component computes rank of the web page based in part on the reference.
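As a rough illustration of the monitor and ranking components, the sketch below counts how many messages reference each URL and orders pages by that count. Using a raw reference count as the rank signal, counting a page once per message, and the URL pattern are all assumptions; the abstract only says rank is based in part on the reference.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def find_references(message: str) -> set:
    """Monitor step: return the distinct URLs referenced in one message."""
    # Trim punctuation that commonly trails a URL at the end of a sentence.
    return {url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(message)}


def rank_by_references(messages: Iterable[str]) -> List[Tuple[str, int]]:
    """Ranking step: order pages by the number of messages referencing them."""
    counts: Counter = Counter()
    for message in messages:
        counts.update(find_references(message))  # each page counts once per message
    # Descending count, then URL, so ties are ordered deterministically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    inbox = [
        "See http://a.example/page and http://b.example/",
        "I liked http://a.example/page.",
        "http://a.example/page http://a.example/page",
    ]
    for url, count in rank_by_references(inbox):
        print(count, url)
```

A fuller ranking component would presumably blend this count with other relevancy signals rather than use it alone.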