Abstract:
A system and method associates a label and description with a search query such that the query, label, and description can be stored in a shared query repository so that queries can be retrieved by multiple users for reuse. The shared query repository can be searched, so that an appropriate query can be located, retrieved, and then submitted for execution over a document database by a search engine. Retrieved queries can be combined with other retrieved queries or modified with new search terms, and the new combined search query can be used for a new search on the database. The database search system and method efficiently permits reuse of search queries and facilitates sharing of search strategies.
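The label, description, and query association described above can be illustrated with a minimal sketch. The names `StoredQuery`, `QueryRepository`, and `combine`, the in-memory dictionary, and the Boolean query syntax are all assumptions made for illustration; the abstract does not specify a storage format or query language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredQuery:
    label: str
    description: str
    query: str  # the search expression itself


class QueryRepository:
    """Hypothetical in-memory stand-in for the shared query repository."""

    def __init__(self):
        self._by_label = {}

    def save(self, stored):
        self._by_label[stored.label] = stored

    def find(self, text):
        """Return stored queries whose label or description mentions text."""
        needle = text.lower()
        return [
            q for q in self._by_label.values()
            if needle in q.label.lower() or needle in q.description.lower()
        ]


def combine(a, b, operator="AND"):
    """Join two retrieved queries into one new query string."""
    return f"({a.query}) {operator} ({b.query})"


repo = QueryRepository()
repo.save(StoredQuery("battery-patents", "Lithium battery prior art", "lithium AND battery"))
repo.save(StoredQuery("recent-filings", "Filings since 2000", "year >= 2000"))

found = repo.find("battery")
combined = combine(found[0], repo.find("filings")[0])
print(combined)
```

The combined string would then be submitted to the search engine like any other query.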
Abstract:
A system and method of distributed metadata searching is disclosed. The present invention extends the searching and retrieval functions of existing Internet web search engines by using the computational resources of user computer systems and search browsers. By distributing the searching and scanning functions to the user level, the present invention reduces the computational and communications burden on Internet web search engines and crawlers, lowering computational resource utilization by Internet search engine providers. Given the exponential growth currently being experienced in the Internet community, the present invention provides one of the few methods by which complete searches of this vast distributed database may be performed. The present invention permits embodiments incorporating a Search Manager (1001) further comprising a Service Results Manager (1013), User Profile Database (1012), Service Manager (1013), and Service Database (1014); a Light Weight Application SCANNER (1002); and a Search Engine (1008). These components may be augmented in some preferred embodiments by a Search Browser (1003), Internet Communications (1004), Web Site(s) (1005), Web Crawler(s) (1006), and a Repository Database (1007).
Abstract:
A computer program product is provided as a session search system and associated method that provide a novel type of query referred to as a “session query”. In the context of a session query, a user issues a search query using, for example, a web-based form. This query is processed immediately by the search engine, yielding search result elements that are returned within the new context of a “dynamic search result set”. As long as the user is reviewing the “dynamic search result set” of the session query, the search result is updated automatically, in near real time, whenever new information arrives. When the user is no longer interested in continuing the search, the session query is terminated. The session search system generally includes two modules: a client module that presents the “dynamic search result set” to the user, and a server module that manages the current set of active session queries. The client module is implemented as executable code in the user's web browser.
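The server-side behavior of a session query can be sketched as follows. This is an illustration only: the class `SessionServer`, its method names, and the substring matching rule are assumptions, and the client module and network transport are omitted.

```python
class SessionServer:
    """Hypothetical server module tracking active session queries."""

    def __init__(self):
        self._documents = []
        self._sessions = {}  # session id -> (term, dynamic result list)
        self._next_id = 1

    def open_session(self, term):
        """Run the query immediately and register it as active."""
        session_id = self._next_id
        self._next_id += 1
        results = [d for d in self._documents if term.lower() in d.lower()]
        self._sessions[session_id] = (term, results)
        return session_id

    def add_document(self, text):
        """Index new information and update every matching active session."""
        self._documents.append(text)
        for term, results in self._sessions.values():
            if term.lower() in text.lower():
                results.append(text)

    def results(self, session_id):
        """Return a copy of the current dynamic search result set."""
        return list(self._sessions[session_id][1])

    def close_session(self, session_id):
        """Terminate the session query; it receives no further updates."""
        del self._sessions[session_id]


server = SessionServer()
server.add_document("Storm warning issued")
sid = server.open_session("storm")
initial = server.results(sid)

server.add_document("Storm moves north")   # arrives while the session is active
server.add_document("Market update")       # does not match the query
updated = server.results(sid)
server.close_session(sid)
```

In a real deployment the client would be notified of each update rather than polling `results`, but the bookkeeping on the server side would look similar.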
Abstract:
The present invention provides for an integrated matching service and calendaring system. Calendar events are utilized as a bridge between an electronic calendaring system and a matching service. A calendar event represents an activity (e.g., a job opening, tennis match, or bicycle race), the requirements to match the activity, the entity attributes, and any match results. An entity defines criteria and information for a matching activity, which are stored as a calendar event in the electronic calendar system. Portions of the criteria and information are stored as attachments to the calendar event. The calendar events representing a matching activity, together with their associated attachments, are provided to a matching server, which locates suitable matches for the activity based upon the criteria and information of the activity. If a suitable match is located, the matching server notifies the entities involved by listing the corresponding entities as attendees associated with the calendar event.
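A minimal sketch of the matching step follows, assuming that requirements and entity attributes are simple key/value pairs and that a match requires exact equality on every requirement. `CalendarEvent` and `match_event` are hypothetical names; the abstract does not define the matching rule.

```python
from dataclasses import dataclass, field


@dataclass
class CalendarEvent:
    activity: str
    requirements: dict  # attribute name -> required value
    attendees: list = field(default_factory=list)


def match_event(event, entities):
    """List every entity that satisfies all requirements as an attendee."""
    for name, attributes in entities.items():
        if all(attributes.get(k) == v for k, v in event.requirements.items()):
            event.attendees.append(name)
    return event.attendees


event = CalendarEvent("tennis match", {"sport": "tennis", "level": "intermediate"})
entities = {
    "alice": {"sport": "tennis", "level": "intermediate"},
    "bob": {"sport": "tennis", "level": "beginner"},
}
matched = match_event(event, entities)
```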
Abstract:
The present invention provides a system and technique for initiating, conducting, and managing real-time surveys in the context of a real-time discourse, such as Internet chat, to provide dynamic, real-time survey results. A surveyor initiates a survey by filling out an electronic form, which is processed and submitted to a sorting component of the invention. The invention imposes an additional layer of functionality upon a Live Information Selection and Analysis (LISA) tool, which gathers, summarizes, and indexes chat messages in a real-time discourse. The sorting component matches the real-time chat messages collected by the LISA tool with the corresponding submitted survey queries to produce raw real-time survey results, which are converted into a viewable format for submission to the surveyor. The present invention makes it possible to initiate, conduct, and manage multiple surveys simultaneously to provide accurate, dynamic, real-time survey results within the context of a real-time discourse.
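The sorting component's job of matching chat messages against a survey can be sketched roughly as a keyword tally. The function `tally_survey` and the whole-word matching rule are assumptions for illustration; the abstract does not describe how the LISA tool indexes messages or how matches are scored.

```python
from collections import Counter


def tally_survey(options, messages):
    """Count, per survey option, the messages that mention it as a word."""
    counts = Counter({option: 0 for option in options})
    for message in messages:
        words = set(message.lower().split())
        for option in options:
            if option.lower() in words:
                counts[option] += 1  # each message counts once per option
    return dict(counts)


messages = ["I want pizza tonight", "Sushi for me", "pizza pizza", "no idea"]
raw_results = tally_survey(["pizza", "sushi"], messages)
```

Running the tally again as new messages arrive would give the surveyor continuously updated results.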
Abstract:
A method and apparatus enable a user to reduce the number of results presented during a search session, most typically one performed over the Internet. The present invention allows the user to select specific results from a search result set that are to be excluded and are not to reappear in a subsequent result set in the search session. The present invention can also automatically exclude results from a search result set unless the user specifically flags the search results to be kept and allowed to reappear in a subsequent result set in the search session. This saves the user time during a search session by avoiding repeated results, and allows the user to focus on more relevant and related results.
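Both exclusion modes can be sketched with a small session object. The class `SearchSession`, its methods, and the toy `INDEX` are hypothetical; a real system would identify results by URL or document identifier.

```python
class SearchSession:
    """Hypothetical search session that remembers excluded results."""

    def __init__(self, search_fn):
        self._search = search_fn
        self._excluded = set()
        self._last = []

    def run(self, query):
        """Run a search and drop anything excluded earlier in the session."""
        self._last = [r for r in self._search(query) if r not in self._excluded]
        return self._last

    def exclude(self, *items):
        """Manual mode: the user picks the results to suppress."""
        self._excluded.update(items)

    def exclude_all_except(self, *keep):
        """Automatic mode: suppress every last result the user did not flag."""
        self._excluded.update(r for r in self._last if r not in keep)


# Toy index standing in for a search engine (illustrative data only).
INDEX = {"python": ["a", "b", "c"], "python tutorial": ["b", "c", "d"]}

manual = SearchSession(INDEX.__getitem__)
manual.run("python")
manual.exclude("b")
manual_second = manual.run("python tutorial")

auto = SearchSession(INDEX.__getitem__)
auto.run("python")
auto.exclude_all_except("c")
auto_second = auto.run("python tutorial")
```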
Abstract:
A system for identifying different language versions of the same structured format document (e.g., an HTML web page) detects the language of two candidate documents and, if necessary, translates one or both into a preferred language. It then parses the two documents and builds a hierarchical data structure for each. The data structures are used to compare the hierarchical structure of the two documents and also to access text portions in congruent positions in the two documents. A fuzzy measure of similarity of a set of text portions occupying congruent positions in the two documents is then obtained, to induce a measure of the similarity of the two documents, which is compared to a fuzzy threshold.
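The comparison of text in congruent positions can be sketched with the Python standard library. This sketch assumes both documents have already been translated into the preferred language, uses the tag path as the "position", and substitutes `difflib.SequenceMatcher` for the fuzzy measure; the abstract does not specify any of these choices, and the threshold value is arbitrary.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class PathTextCollector(HTMLParser):
    """Record each text fragment under its path in the tag hierarchy."""

    def __init__(self):
        super().__init__()
        self._stack = []
        self.texts = {}

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, discarding unclosed children.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.setdefault("/".join(self._stack), []).append(text)


def positioned_text(html):
    """Map (tag path, occurrence index) to the text found at that position."""
    collector = PathTextCollector()
    collector.feed(html)
    collector.close()
    return {
        (path, i): text
        for path, fragments in collector.texts.items()
        for i, text in enumerate(fragments)
    }


def similarity(html_a, html_b):
    """Average text similarity over all positions; unmatched positions score 0."""
    a, b = positioned_text(html_a), positioned_text(html_b)
    all_positions = a.keys() | b.keys()
    if not all_positions:
        return 0.0
    shared = a.keys() & b.keys()
    total = sum(
        SequenceMatcher(None, a[k].lower(), b[k].lower()).ratio() for k in shared
    )
    return total / len(all_positions)


THRESHOLD = 0.8  # arbitrary illustrative threshold
doc_a = "<html><body><h1>Welcome</h1><p>Contact us today</p></body></html>"
doc_b = "<html><body><h1>Welcome</h1><p>Contact us now</p></body></html>"
doc_c = "<html><body><ul><li>Prices</li></ul></body></html>"
same_page = similarity(doc_a, doc_b) >= THRESHOLD
different_page = similarity(doc_a, doc_c) >= THRESHOLD
```

Dividing by the union of positions means that structural differences lower the score even when the shared text agrees.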
Abstract:
An automatic method for rating data files for objectionable content in a distributed computer system includes preprocessing the file to create semantic units, comparing the semantic units with a rating repository containing entries and associated ratings, assigning content rating vectors to the semantic units, and creating a modified data file incorporating rating information derived from the content rating vectors. For text files, the semantic units are words or phrases, and the rating repository also contains words or phrases with corresponding content rating vectors. For audio files, the file is first converted to a text file using voice recognition software. For image files, image processing software is used to recognize individual objects and compare them to basic images and ratings stored in the rating repository. In one embodiment, a composite content rating vector is derived for the file from the individual content rating vectors, and the composite content rating vector is incorporated into the modified file. In an alternate embodiment, semantic units with content rating vectors exceeding preset user limit values of objectionable content are blocked out by display blocks or, for audio, audio blanking signals, for example, beeps. The user can then view or hear the remaining portions of the file. The invention can be used with any type of data file that can be divided into semantic units, and can be implemented in a server, client, search engine, or proxy server.
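For the text-file case, the rating steps can be sketched as follows. The two-component vector (violence, language), the sample `RATINGS` entries, the use of an element-wise maximum for the composite vector, and the `#` masking are all assumptions; the abstract leaves the vector layout and the combination rule open.

```python
# Hypothetical rating repository: word -> (violence, language) scores.
RATINGS = {"fight": (2, 0), "damn": (0, 2)}
NEUTRAL = (0, 0)


def rate_units(text):
    """Split text into word-level semantic units and attach rating vectors."""
    return [
        (word, RATINGS.get(word.strip(".,!?").lower(), NEUTRAL))
        for word in text.split()
    ]


def composite(rated):
    """Combine unit vectors into one file-level vector (element-wise max)."""
    vectors = [vector for _, vector in rated] or [NEUTRAL]
    return tuple(max(column) for column in zip(*vectors))


def block(rated, limits):
    """Mask every unit whose vector exceeds the user's limit in any component."""
    return " ".join(
        "#" * len(word) if any(s > l for s, l in zip(vector, limits)) else word
        for word, vector in rated
    )


rated = rate_units("They fight. Damn!")
file_rating = composite(rated)
filtered = block(rated, limits=(3, 1))
```

The first function pair corresponds to the composite-vector embodiment and `block` to the alternate embodiment that blanks individual units.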
Abstract:
A network repository service supplements the functions of a web server to enable an increase in the efficiency of web crawling. The repository service: (a) automatically maintains a file modification list that contains the names of files on the server that have been modified (i.e., added, deleted, or otherwise modified), together with the date and time of the file modification; and (b) provides a requesting crawler with the file modification list (or a portion of the list corresponding to a time period specified by the crawler). The repository service may also (c) limit or restrict access privileges of crawlers that do not request the file modification list prior to crawling, thereby protecting the server from overcrawling. The repository service enables a crawler to request the file modification list, and avoid unnecessarily recrawling files that have not been modified since its last visit, thereby preventing considerable waste of time, network bandwidth, server processing resources, and crawler processing resources. Using the file modification list, the crawler can remove all prior references to deleted files, and efficiently recrawl only those files that have been added or changed since the crawler last visited the web server.
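The way a crawler might use the file modification list can be sketched as below. `ModificationLog`, `plan_recrawl`, and the change labels "added", "modified", and "deleted" are illustrative names; the abstract does not define a wire format for the list.

```python
from datetime import datetime


class ModificationLog:
    """Hypothetical file modification list kept by the repository service."""

    def __init__(self):
        self._entries = []  # (timestamp, path, change)

    def record(self, path, change, when):
        self._entries.append((when, path, change))

    def since(self, cutoff):
        """Return the portion of the list newer than the crawler's cutoff."""
        return [entry for entry in self._entries if entry[0] > cutoff]


def plan_recrawl(log, last_visit):
    """Decide which files to fetch again and which references to drop."""
    to_fetch, to_drop = set(), set()
    for _, path, change in sorted(log.since(last_visit), key=lambda e: e[0]):
        if change == "deleted":
            to_drop.add(path)
            to_fetch.discard(path)
        else:  # "added" or "modified"
            to_fetch.add(path)
            to_drop.discard(path)
    return sorted(to_fetch), sorted(to_drop)


log = ModificationLog()
log.record("/c.html", "added", datetime(2023, 12, 30))
log.record("/a.html", "modified", datetime(2024, 1, 2))
log.record("/b.html", "deleted", datetime(2024, 1, 3))

fetch, drop = plan_recrawl(log, last_visit=datetime(2024, 1, 1))
```

Files changed before the last visit, such as `/c.html` here, are skipped entirely, which is where the bandwidth saving comes from.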
Abstract:
A master repository service maintains a directory of web servers and the most recent times that their web contents were modified, and provides this information to web crawlers to increase their efficiency. The master repository service receives web content update reports from a plurality of web servers, updates the directory to keep it current, and provides crawlers with web site modification information. The web site modification information preferably comprises identifiers for new web sites, “dead” web sites, and modified web sites. Each crawler is preferably provided only with web site modification information received since it last received information from the master repository service. The information allows web crawlers to know immediately about new web sites, and allows them to spend time visiting only those web sites that are new or that have changed their content.
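Delivering only the information received since a crawler's previous request can be sketched with a per-crawler cursor over an append-only report list. `MasterRepository` and its methods are hypothetical names, and the cursor scheme is one possible design rather than the one the abstract necessarily intends.

```python
class MasterRepository:
    """Hypothetical directory of web site update reports."""

    STATUSES = {"new", "dead", "modified"}

    def __init__(self):
        self._reports = []  # append-only list of (site, status)
        self._cursor = {}   # crawler id -> number of reports already delivered

    def report(self, site, status):
        """Accept a web content update report from a web server."""
        if status not in self.STATUSES:
            raise ValueError(f"unknown status: {status}")
        self._reports.append((site, status))

    def updates_for(self, crawler_id):
        """Return only the reports this crawler has not yet received."""
        start = self._cursor.get(crawler_id, 0)
        self._cursor[crawler_id] = len(self._reports)
        return self._reports[start:]


master = MasterRepository()
master.report("example.org", "new")
first_batch = master.updates_for("crawler-1")

master.report("example.net", "dead")
second_batch = master.updates_for("crawler-1")
late_joiner = master.updates_for("crawler-2")
```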