Abstract:
A distributed collection of web crawlers gathers information over a large portion of cyberspace. The crawlers share the overall crawling workload through a cyberspace partition scheme. They also collaborate through load balancing to make maximal use of each crawler's computing resources. The invention takes advantage of the hierarchical nature of the cyberspace namespace and uses the syntactic components of the URL structure as the main vehicle for dividing and assigning crawling workload to individual crawlers. The partition scheme is completely distributed: each crawler makes its partitioning decisions based on its own crawling status and a globally replicated partition tree data structure.
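As a rough illustration of URL-based partitioning, the sketch below splits a URL into hierarchical components (reversed host labels, then path segments) and assigns it to the crawler that owns the longest matching prefix. The abstract does not describe the layout of the partition tree, so the flat prefix table, the crawler names, and the functions `url_key` and `assign_crawler` are all assumptions made for the example.

```python
from urllib.parse import urlsplit

def url_key(url):
    """Split a URL into hierarchical components: reversed host labels, then path segments."""
    parts = urlsplit(url)
    host = (parts.hostname or "").split(".")[::-1]  # "www.example.com" -> ["com", "example", "www"]
    path = [seg for seg in parts.path.split("/") if seg]
    return tuple(host + path)

# Hypothetical replicated partition table: key prefix -> crawler id.
PARTITION = {
    (): "crawler-0",  # root entry: owner of everything not claimed below
    ("com",): "crawler-1",
    ("com", "example"): "crawler-2",
    ("com", "example", "www", "docs"): "crawler-3",
}

def assign_crawler(url, partition=PARTITION):
    """Return the crawler owning the longest prefix of the URL's key."""
    key = url_key(url)
    for n in range(len(key), -1, -1):
        owner = partition.get(key[:n])
        if owner is not None:
            return owner
    raise KeyError("partition table has no root entry")
```

Because every crawler holds the same table in this sketch, each one can decide locally which crawler owns a newly discovered URL, without consulting a coordinator.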
Abstract:
A system for automatically generating user interest profiles and delivering information to users learns a user's interests by monitoring the user's outbound communication streams, i.e., the information that the user produces either by typing (e.g., while a user is composing an e-mail message or editing a word processor document) or by speaking (e.g., while a user is engaged in a phone conversation or listening to a lecture). The system uses the monitored text to build (and possibly update) a user interest profile. The profile is constructed from current text generated by the user, so that the retrieved information reflects present user interests. In addition, the profile may retain past user interests, so that it reflects a combination of past and present user interests. The system then automatically queries diverse databases for information relevant to the interest profile. The databases may include internet web pages, files stored on the user's local network, and other local or remote data repositories. The queries may use a combination of internet search engines, the specific selection of which may depend upon the nature and/or content of the queries. The information retrieved in response to the queries is then presented to the user. The retrieved information may contain, for example, answers to questions that the user might ask and/or data related to the user's current and continuing interests. Because a user's current speech or typed text is highly correlated with the user's current interests, the retrieved information will be relevant to the user's actual interests. The communication stream monitoring, interest profile building, database querying, and presentation of retrieved information are all performed automatically, in real time, and in the background of current user activities.
Abstract:
A system generates user interest profiles by monitoring and analyzing a user's access to a variety of hierarchical levels within a set of structured documents, e.g., documents available at a web site. Each information document has parts associated with it, and the documents are classified into categories using a known taxonomy. The user interest profiles are automatically generated based on the type of content viewed by the user. The type of content is determined by the text within the parts of the documents viewed and the classifications of the documents viewed. The profiles are also generated based on other factors, including the frequency and currency of visits to documents having a given classification and/or the hierarchical depth of the levels or parts of the documents viewed. User profiles include an interest category code and an interest score to indicate a level of interest in a particular category. The profiles are updated automatically to accurately reflect the current interests of an individual, as well as past interests. A time-dependent decay factor is applied to the past interests. The system presents to the user documents or references to documents that match the current profile.
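One way to picture an interest score with a time-dependent decay factor is the sketch below. The abstract does not give a formula, so the exponential half-life decay, the use of hierarchical depth as a weight, and the function name `interest_scores` are assumptions chosen only for illustration.

```python
def interest_scores(visits, now, half_life_days=30.0):
    """Accumulate a decayed interest score per category.

    visits: iterable of (category, depth, timestamp_in_days) tuples.
    Deeper visits count more; older visits are discounted by half
    every `half_life_days` (an assumed decay law).
    """
    scores = {}
    for category, depth, timestamp in visits:
        decay = 0.5 ** ((now - timestamp) / half_life_days)
        scores[category] = scores.get(category, 0.0) + depth * decay
    return scores
```

With this weighting, a category visited often and recently scores high, while a category last visited long ago fades without being dropped from the profile entirely.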
Abstract:
A method and apparatus for efficiently matching a large collection of user profiles against a large volume of data in a webcasting system. In one embodiment, the invention parallelizes profile matching in four steps. First, an initial profile set is partitioned into several subsets, also referred to as sub-partitions, using various heuristic methods. Second, each sub-partition is mapped onto one or more independent processing units. The processing units are not required to have equal processing performance. For best performance, however, in one embodiment the subset with the highest cost is mapped to the fastest processor, and the subset with the next highest cost is mapped to the next fastest processor. Where appropriate, the invention evaluates the relative subset processing speed of each processor and adjusts future subset mapping based on these evaluations. The third and fourth steps are executed for each information item I that needs to be matched with a profile predicate: the third step broadcasts I to all processing units, and the fourth step performs a sequential profile match on I.
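The mapping and matching steps can be sketched as follows. This is a simplified single-process illustration, assuming one subset per processor, numeric cost and speed estimates, and profiles represented as (user, predicate) pairs; the names `map_subsets` and `match_item` are hypothetical and the partitioning heuristics themselves are not shown.

```python
def map_subsets(subset_costs, processor_speeds):
    """Pair the costliest subset with the fastest processor, the next with the next."""
    subsets = sorted(subset_costs, key=subset_costs.get, reverse=True)
    processors = sorted(processor_speeds, key=processor_speeds.get, reverse=True)
    return dict(zip(subsets, processors))

def match_item(item, subsets):
    """Broadcast `item` to every subset; each subset scans its profiles sequentially."""
    matches = []
    for profiles in subsets.values():  # each iteration stands in for one processing unit
        for user, predicate in profiles:
            if predicate(item):
                matches.append(user)
    return matches
```

In a real deployment the outer loop of `match_item` would run concurrently on separate processing units; it is serialized here only to keep the example self-contained.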
Abstract:
An “active” calendar automatically analyzes a user's calendar entries and sends machine-readable messages to destinations appropriate to the content of each calendar entry. A group of event categories is established, each category specifying one class of anticipated calendar entry. An action rule database pre-associates each event category with one or more message formats, each having a content and a destination. The action rule database also contains data identifying sources containing the content and destination for each message format. These sources include records of the action rule database itself, subparts of calendar entries of the pre-associated event category, one or more other databases, or a combination of the foregoing. After the calendar receives a user-submitted computer calendar entry describing a planned event, it identifies the event category of the established group that classifies the planned event. For each message format pre-associated with the identified event category, the calendar determines the content and destination for the message as specified by the action rule database, and transmits the message to the destination.
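A toy version of the action rule lookup might look like the sketch below. The categories, keyword-based classification, e-mail addresses, and the names `ACTION_RULES`, `classify`, and `build_messages` are all invented for the example; the abstract does not say how entries are classified or how rules are stored.

```python
# Hypothetical action rule database: event category -> message formats.
# A "{field}" placeholder means the value comes from the calendar entry itself.
ACTION_RULES = {
    "travel": [
        {"destination": "travel-desk@example.com", "content": "Book travel for: {text}"},
    ],
    "meeting": [
        {"destination": "room-booking@example.com", "content": "Reserve a room for: {text}"},
        {"destination": "{attendee}", "content": "Invitation: {text}"},
    ],
}

# Assumed keyword lists standing in for a real classifier.
KEYWORDS = {"travel": ("flight", "trip"), "meeting": ("meeting", "review")}

def classify(entry):
    """Return the first category whose keywords appear in the entry text, else None."""
    text = entry["text"].lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return None

def build_messages(entry):
    """Return (destination, content) pairs for every rule of the entry's category."""
    category = classify(entry)
    return [
        (rule["destination"].format(**entry), rule["content"].format(**entry))
        for rule in ACTION_RULES.get(category, [])
    ]
```

Here one rule takes its destination from the rule record itself and another takes it from a subpart of the calendar entry, mirroring two of the sources the abstract lists.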
Abstract:
According to the invention, a music search system includes a music player, a music analyzer, a search engine, and a sophisticated user interface that enables users to visually build complex query profiles from the structural information of one or more musical pieces. The complex query profiles are useful for performing searches for musical pieces matching the structural information in the query profile. The system allows the user to supply an existing piece of music, or some components thereof, as query arguments, and lets the music search engine find music that is similar to the given sample according to a certain similarity measure.
Abstract:
A method and system for generating audio summaries of musical pieces receives computer readable data representing the musical piece and generates therefrom an audio summary including the main melody of the musical piece. A component builder generates a plurality of composite and primitive components representing the structural elements of the musical piece and creates a hierarchical representation of the components. The most primitive components, representing notes within the composition, are examined to determine repetitive patterns within the composite components. A melody detector examines the hierarchical representation of the components and uses algorithms to detect which of the repetitive patterns is the main melody of the composition. Once the main melody is detected, the segment of the musical data containing the main melody is provided in one or more formats. Musical knowledge rules representing specific genres of musical styles may be used to assist the component builder and melody detector in determining which primitive component patterns are the most likely candidates for the main melody.
Abstract:
An efficient method and apparatus for regulating access to information objects stored in a database in which there are a large number of users and access groups. The invention uses a representation of a hierarchical access group structure in terms of intervals over a set of integers, together with a decomposition scheme that reduces any group structure to ones that have an interval representation. This representation allows the problem of checking access rights to be reduced to an interval containment problem. An interval tree, a popular data structure in computational geometry, may be implemented to efficiently execute the access-right checking method.
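The reduction to interval containment can be illustrated with a small sketch. It assumes the access groups form a tree and numbers them by depth-first traversal so that each group's interval covers exactly its subgroups; a user's group then has access to an object when its interval lies inside the interval of the object's group. The group names, this access rule, and the functions `assign_intervals` and `contains` are assumptions, and a direct comparison is used here in place of the interval tree mentioned in the abstract.

```python
def assign_intervals(tree, root):
    """Number groups depth-first; each group's (lo, hi) spans its whole subtree.

    tree: dict mapping a group name to a list of its child groups.
    """
    intervals = {}
    counter = 0

    def visit(group):
        nonlocal counter
        lo = counter
        counter += 1
        for child in tree.get(group, []):
            visit(child)
        intervals[group] = (lo, counter - 1)

    visit(root)
    return intervals

def contains(outer, inner):
    """True if interval `inner` lies entirely within interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]
```

After the one-time numbering pass, each access check costs two integer comparisons regardless of how deep the group hierarchy is.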
Abstract:
A tactile notification device that can be embodied in, e.g., a wristwatch, communicates via wireless link with plural personal computing devices, including cellular telephones, pagers, and palm top computers, of the person wearing the notification device. When one of the personal computing devices alerts, e.g., when the telephone receives an incoming call, the pager receives a page, or the palm top computer receives an email, the personal computing device sends a signal to the notification device, which generates a discrete tactile signal against the person's skin. The notification device can generate different tactile signals, and each tactile signal can be correlated as desired by the user to one of the personal computing devices. In one embodiment, opposed pinch bars are provided on the skin-facing tactile surface of a wristwatch to gently pinch the skin and thereby establish a first tactile signal that can be correlated to, for example, an alert for an incoming phone call. Also, a rotating bar can be provided on the tactile surface of the wristwatch, and the tactile signal that corresponds to, e.g., an incoming page can be established by rotating the bar against the skin.
Abstract:
A search engine that forms a compact representation of a plurality of user queries to efficiently find desired information in an information network. The search engine comprises a profile processor having logic to receive the queries from the users and a search module. The search module is coupled to the profile processor and has logic to receive the information content, to combine the user queries into a master query, and to match the master query with the information content to determine matching content. The search engine also includes logic to analyze the matching content to determine if any of the queries has been satisfied.
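The idea of matching content once against a combined query, then checking individual queries, can be sketched as below. The abstract does not define the form of a query or of the master query, so this example assumes each query is a conjunction of lowercase keywords and the master query is simply the union of all keywords; `build_master_query` and `satisfied_queries` are hypothetical names.

```python
def build_master_query(queries):
    """Combine all users' keyword lists into a single set of terms."""
    master = set()
    for terms in queries.values():
        master |= set(terms)
    return master

def satisfied_queries(content, queries):
    """Match content against the master query once, then find satisfied user queries.

    queries: dict mapping a user name to a list of lowercase keywords,
    all of which must appear in the content (an assumed query form).
    """
    words = set(content.lower().split())
    matched = words & build_master_query(queries)
    if not matched:
        return []  # nothing in the content is of interest to any user
    return sorted(user for user, terms in queries.items() if set(terms) <= matched)
```

Content that shares no term with the master query is rejected after a single set intersection, so the per-user checks run only for content that could satisfy at least one query.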