Labeling and describing search queries for reuse
    1.
    Granted invention patent
    Labeling and describing search queries for reuse (status: in force)

    Publication No.: US06484162B1

    Publication date: 2002-11-19

    Application No.: US09342991

    Filing date: 1999-06-29

    IPC classification: G06F17/30

    Abstract: A system and method associate a label and description with a search query so that the query, label, and description can be stored in a shared query repository and retrieved by multiple users for reuse. The shared query repository can be searched, so that an appropriate query can be located, retrieved, and then submitted for execution over a document database by a search engine. Retrieved queries can be combined with other retrieved queries or modified with new search terms, and the new combined search query can be used for a new search on the database. The database search system and method efficiently permit reuse of search queries and facilitate sharing of search strategies.
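The store, search, retrieve, and combine steps in this abstract can be sketched as follows. This is a minimal in-memory illustration, not the patented design: the `StoredQuery` and `QueryRepository` names, the substring search, and the `AND`-joined query syntax are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredQuery:
    label: str
    description: str
    query: str  # query text in whatever syntax the search engine accepts


class QueryRepository:
    """Hypothetical in-memory stand-in for a shared query repository."""

    def __init__(self):
        self._by_label = {}

    def save(self, entry):
        self._by_label[entry.label] = entry

    def find(self, keyword):
        """Return stored queries whose label or description mentions keyword."""
        kw = keyword.lower()
        return [e for e in self._by_label.values()
                if kw in e.label.lower() or kw in e.description.lower()]

    def combine(self, labels, extra_terms=(), operator="AND"):
        """Join retrieved queries and new search terms into one query string."""
        parts = [f"({self._by_label[label].query})" for label in labels]
        parts.extend(extra_terms)
        return f" {operator} ".join(parts)


repo = QueryRepository()
repo.save(StoredQuery("crawler-patents", "Patents about web crawling", "crawler OR spider"))
repo.save(StoredQuery("recent", "Filed in 1999 or later", "filed >= 1999"))

found = [e.label for e in repo.find("crawling")]
combined = repo.combine(["crawler-patents", "recent"], extra_terms=["repository"])
print(found)
print(combined)
```

The combined string would then be handed to the search engine; how it is parsed depends entirely on that engine.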

    Distributed metadata searching system and method
    2.
    Granted invention patent
    Distributed metadata searching system and method (status: in force)

    Publication No.: US06434548B1

    Publication date: 2002-08-13

    Application No.: US09456737

    Filing date: 1999-12-07

    IPC classification: G06F17/30

    Abstract: A system and method of distributed metadata searching are disclosed. The present invention permits an extension of the searching and retrieval functions of existing Internet web search engines by utilizing computational resources embodied in user computer systems and search browsers. By distributing the searching and scanning functions to the user level, the present invention reduces the computational and communications burden on Internet web search engines and crawlers, resulting in lower computational resource utilization by Internet search engine providers. Given the exponential growth rate currently being experienced in the Internet community, the present invention provides one of the few methods by which complete searches of this vast distributed database may be performed. The present invention permits embodiments incorporating a Search Manager (1001) further comprising a Service Results Manager (1013), User Profile Database (1012), Service Manager (1013), and Service Database (1014); a Light Weight Application SCANNER (1002); and a Search Engine (1008). These components may be augmented in some preferred embodiments via the use of a Search Browser (1003), Internet Communications (1004), Web Site(s) (1005), Web Crawler(s) (1006), and a Repository Database (1007).

    System and method for providing a session query within the context of a dynamic search result set
    3.
    Granted invention patent
    System and method for providing a session query within the context of a dynamic search result set (status: in force)

    Publication No.: US06633867B1

    Publication date: 2003-10-14

    Application No.: US09543383

    Filing date: 2000-04-05

    IPC classification: G06F7/00

    Abstract: A computer program product is provided as a session search system and associated method that provide a novel type of query referred to as a “session query”. In the context of a session query, a user issues a search query using, for example, a web-based form. This query is processed immediately by the search engine, yielding search result elements that are returned within the new context of a “dynamic search result set”. As long as the user is reviewing the “dynamic search result set” of the session query, the search result is updated automatically, in almost real time, when new information arrives. When the user is no longer interested in continuing the search, the session query is terminated. The session search system generally includes two modules: a client module that presents the “dynamic search result set” to the user, and a server module that manages the current set of active session queries. The client module implements executable code in the user's web browser.
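The server-side behaviour described here (answer immediately, keep the query active, push newly arriving items into the result set, stop on termination) might look roughly like the sketch below. The `SessionQueryServer` class, its method names, and the case-insensitive substring match are hypothetical; the abstract does not specify how matching or delivery to the browser is done.

```python
class SessionQueryServer:
    """Hypothetical server module tracking the active session queries."""

    def __init__(self):
        self._active = {}  # session id -> (search term, dynamic result set)
        self._next_id = 1

    def open(self, term, existing_docs):
        """Process the query immediately and register it as a session query."""
        sid = self._next_id
        self._next_id += 1
        term = term.lower()
        results = [d for d in existing_docs if term in d.lower()]
        self._active[sid] = (term, results)
        return sid

    def on_new_document(self, doc):
        """Add newly arrived information to every matching active session."""
        for term, results in self._active.values():
            if term in doc.lower():
                results.append(doc)

    def results(self, sid):
        return list(self._active[sid][1])

    def close(self, sid):
        """Terminate the session query; it receives no further updates."""
        del self._active[sid]


server = SessionQueryServer()
sid = server.open("crawler", ["Web crawler basics", "Calendar tips"])
server.on_new_document("A faster crawler design")
server.on_new_document("Survey methods")
snapshot = server.results(sid)
server.close(sid)
server.on_new_document("Crawler etiquette")  # ignored: no active sessions remain
print(snapshot)
```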

    System and method for matching entities utilizing an electronic calendaring system
    4.
    Granted invention patent
    System and method for matching entities utilizing an electronic calendaring system (status: lapsed)

    Publication No.: US06978246B1

    Publication date: 2005-12-20

    Application No.: US09556303

    Filing date: 2000-04-24

    IPC classification: G06Q10/00 G06F17/60

    CPC classification: G06Q10/109 G06Q10/1095

    Abstract: The present invention provides for an integrated matching service and calendaring system. Calendar events are utilized as a bridge between an electronic calendaring system and a matching service. A calendar event represents an activity (e.g., a job opening, tennis match, or bicycle race), the requirements to match the activity, the entity attributes, and any match results. An entity defines criteria and information for a matching activity, which is stored as a calendar event in the electronic calendar system. Portions of the criteria and information are stored as attachments to the calendar event. The calendar events representing a matching activity and associated attachments are provided to a matching server, which locates suitable matches for the activity based upon the criteria and information of the activity. If a suitable match is located, the matching server notifies the entities involved by listing the corresponding entities as attendees associated with the calendar event.

    System and method for automatically conducting and managing surveys based on real-time information analysis
    5.
    Granted invention patent
    System and method for automatically conducting and managing surveys based on real-time information analysis (status: in force)

    Publication No.: US06912521B2

    Publication date: 2005-06-28

    Application No.: US09878484

    Filing date: 2001-06-11

    Abstract: The present invention provides a system and technique for initiating, conducting, and managing real-time surveys, in the context of a real-time discourse, such as Internet chat, to provide dynamic, real-time survey results. A surveyor initiates a survey by filling out an electronic form which is processed and submitted to a sorting component of the invention. The invention imposes an additional layer of functionality upon a Live Information Selection and Analysis (LISA) tool which gathers, summarizes, and indexes chat messages in a real-time discourse. The sorting component matches the collected real-time chat messages from the LISA tool with correlating submitted survey queries to provide raw real-time survey results which are converted into a viewable format for submission to the surveyor. The present invention makes it possible to initiate, conduct, and manage multiple surveys simultaneously to provide accurate, dynamic, real-time survey results within the context of a real-time discourse.

    Method for reducing search results by manually or automatically excluding previously presented search results
    6.
    Granted invention patent
    Method for reducing search results by manually or automatically excluding previously presented search results (status: in force)

    Publication No.: US06487553B1

    Publication date: 2002-11-26

    Application No.: US09477844

    Filing date: 2000-01-05

    IPC classification: G06F17/30

    Abstract: A method and apparatus enable a user to streamline the number of results presented during a search session, most typically one performed over the Internet. The present invention allows the user to select specific results from a search result set which are to be excluded and are not to reappear in a subsequent result set in the search session. The present invention is also capable of automatically excluding results from a search result set unless the user specifically flags the search results they want to keep and have reappear in a subsequent result set in the search session. This allows a user to save time during a search session by not having to view repeated results, and allows the user to focus on more relevant and related results.
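Both modes in this abstract (manual exclusion, and automatic exclusion unless a result is flagged to keep) can be illustrated with a small per-session filter. The `SearchSession` class and its methods are hypothetical names for this sketch, and results are represented by plain string identifiers.

```python
class SearchSession:
    """Hypothetical per-session filter over raw search engine results."""

    def __init__(self, auto_exclude=False):
        self.auto_exclude = auto_exclude
        self.excluded = set()  # results that must not reappear
        self.kept = set()      # results the user flagged to keep

    def exclude(self, result_id):
        self.excluded.add(result_id)
        self.kept.discard(result_id)

    def keep(self, result_id):
        self.kept.add(result_id)

    def present(self, raw_results):
        """Return the results to show, dropping previously excluded ones."""
        shown = [r for r in raw_results
                 if r in self.kept or r not in self.excluded]
        if self.auto_exclude:
            # Everything just shown is excluded next time unless flagged.
            self.excluded.update(shown)
        return shown


manual = SearchSession()
manual.present(["a", "b", "c"])
manual.exclude("b")
manual_second = manual.present(["a", "b", "d"])

auto = SearchSession(auto_exclude=True)
auto.present(["a", "b", "c"])
auto.keep("a")
auto_second = auto.present(["a", "b", "d"])
print(manual_second, auto_second)
```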

    Software and method for recognizing similarity of documents written in different languages based on a quantitative measure of similarity
    7.
    Granted invention patent
    Software and method for recognizing similarity of documents written in different languages based on a quantitative measure of similarity (status: in force)

    Publication No.: US06519557B1

    Publication date: 2003-02-11

    Application No.: US09588250

    Filing date: 2000-06-06

    IPC classification: G06F17/20

    CPC classification: G06F17/289 G06F17/2211

    Abstract: A system for identifying different language versions of the same structured format document (e.g., an HTML web page) detects the language of the two documents and translates one or both into a preferred language if necessary, parses the two candidate documents, and builds two hierarchical data structures based on the documents. The data structures are used to compare the hierarchical structure of the two documents and also to access text portions in congruent positions in the two documents. A fuzzy measure of similarity of a set of text portions occupying congruent positions in the two documents is then obtained, to induce a measure of the similarity of the two documents, which is compared to a fuzzy threshold.
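A rough sketch of the comparison stage follows, under several stated assumptions: both pages are taken to be already in the same language (detection and translation are omitted), the "hierarchical data structure" is reduced to a list of tag paths, `difflib.SequenceMatcher` stands in for the fuzzy text measure, and the two scores are combined with `min` as a fuzzy AND. None of these choices come from the patent, and void tags written without a closing slash (such as `<br>`) would distort the paths in this simplified parser.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _Outline(HTMLParser):
    """Collect (tag path, text) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.items.append(("/".join(self.stack), text))


def outline(html):
    parser = _Outline()
    parser.feed(html)
    parser.close()
    return parser.items


def similarity(html_a, html_b):
    """Return a score in [0, 1] from structure and congruent text portions."""
    a, b = outline(html_a), outline(html_b)
    if not a or not b:
        return 0.0
    structure = SequenceMatcher(None, [p for p, _ in a], [p for p, _ in b]).ratio()
    # Text portions are "congruent" here when they share index and tag path.
    pairs = [(ta, tb) for (pa, ta), (pb, tb) in zip(a, b) if pa == pb]
    if not pairs:
        return 0.0
    text = sum(SequenceMatcher(None, x, y).ratio() for x, y in pairs) / len(pairs)
    return min(structure, text)


page_a = "<html><body><h1>Welcome</h1><p>Contact us today</p></body></html>"
page_b = "<html><body><h1>Welcome</h1><p>Contact us today!</p></body></html>"
page_c = "<html><body><ul><li>x</li></ul></body></html>"
THRESHOLD = 0.8  # hypothetical threshold
print(similarity(page_a, page_b) >= THRESHOLD, similarity(page_a, page_c) >= THRESHOLD)
```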

    Automatic rating and filtering of data files for objectionable content
    8.
    Granted invention patent
    Automatic rating and filtering of data files for objectionable content (status: lapsed)

    Publication No.: US06493744B1

    Publication date: 2002-12-10

    Application No.: US09374644

    Filing date: 1999-08-16

    IPC classification: G06F15/16

    Abstract: An automatic method for rating data files for objectionable content in a distributed computer system includes preprocessing the file to create semantic units, comparing the semantic units with a rating repository containing entries and associated ratings, assigning content rating vectors to the semantic units, and creating a modified data file incorporating rating information derived from the content rating vectors. For text files, the semantic units are words or phrases, and the rating repository also contains words or phrases with corresponding content rating vectors. For audio files, the file is first converted to a text file using voice recognition software. For image files, image processing software is used to recognize individual objects and compare them to basic images and ratings stored in the rating repository. In one embodiment, a composite content rating vector is derived for the file from the individual content rating vectors, and the composite content rating vector is incorporated into the modified file. In an alternate embodiment, semantic units with content rating vectors exceeding preset user limit values of objectionable content are blocked out by display blocks or, for audio, by audio blanking signals such as beeps. The user can then view or hear the remaining portions of the file. The invention can be used with any type of data file that can be divided into semantic units, and can be implemented in a server, client, search engine, or proxy server.
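For the text-file case, the two embodiments (a composite rating for the whole file, and blocking of units that exceed user limits) can be sketched as below. The `RATINGS` table, its two-dimensional (violence, profanity) scale, and the use of an element-wise maximum for the composite vector are assumptions for illustration; the abstract does not say how the composite is derived.

```python
import re

# Hypothetical rating repository: word -> (violence, profanity), each 0 to 3.
RATINGS = {"fight": (2, 0), "damn": (0, 2), "kill": (3, 0)}
NEUTRAL = (0, 0)
WORD = re.compile(r"[A-Za-z']+")


def rate_units(text):
    """Split text into word units and attach a content rating vector to each."""
    return [(w, RATINGS.get(w.lower(), NEUTRAL)) for w in WORD.findall(text)]


def composite(rated):
    """Combine unit vectors into one file-level vector (element-wise max)."""
    vectors = [v for _, v in rated] or [NEUTRAL]
    return tuple(max(dim) for dim in zip(*vectors))


def block(text, limits):
    """Mask every word whose vector exceeds the user's limit in any dimension."""
    def mask(match):
        word = match.group(0)
        vector = RATINGS.get(word.lower(), NEUTRAL)
        exceeds = any(v > limit for v, limit in zip(vector, limits))
        return "#" * len(word) if exceeds else word
    return WORD.sub(mask, text)


sample = "They fight, damn it."
file_rating = composite(rate_units(sample))
filtered = block(sample, limits=(1, 3))
print(file_rating, filtered)
```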

    Network repository service for efficient web crawling
    9.
    Granted invention patent
    Network repository service for efficient web crawling (status: in force)

    Publication No.: US06418453B1

    Publication date: 2002-07-09

    Application No.: US09433118

    Filing date: 1999-11-03

    IPC classification: G06F17/30

    Abstract: A network repository service supplements the functions of a web server to enable an increase in the efficiency of web crawling. The repository service: (a) automatically maintains a file modification list that contains the names of files on the server that have been modified (i.e., added, deleted, or otherwise modified), together with the date and time of the file modification; and (b) provides a requesting crawler with the file modification list (or a portion of the list corresponding to a time period specified by the crawler). The repository service may also (c) limit or restrict access privileges of crawlers that do not request the file modification list prior to crawling, thereby protecting the server from overcrawling. The repository service enables a crawler to request the file modification list, and avoid unnecessarily recrawling files that have not been modified since its last visit, thereby preventing considerable waste of time, network bandwidth, server processing resources, and crawler processing resources. Using the file modification list, the crawler can remove all prior references to deleted files, and efficiently recrawl only those files that have been added or changed since the crawler last visited the web server.
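Items (a) and (b), together with the crawler's use of the list, can be sketched as follows. The `RepositoryService` class, the `recrawl_plan` helper, and the `"added"`/`"modified"`/`"deleted"` change labels are hypothetical; the access restriction in item (c) is not modelled.

```python
from datetime import datetime


class RepositoryService:
    """Hypothetical server-side log of file modifications."""

    def __init__(self):
        self._log = []  # (timestamp, path, change) in arrival order

    def record(self, when, path, change):
        self._log.append((when, path, change))

    def modifications_since(self, since):
        """Return the portion of the list newer than the crawler's last visit."""
        return [entry for entry in self._log if entry[0] > since]


def recrawl_plan(index, changes):
    """Update the crawler's set of known paths; return the paths to fetch."""
    to_fetch = []
    for _, path, change in changes:
        if change == "deleted":
            index.discard(path)  # drop prior references to deleted files
            if path in to_fetch:
                to_fetch.remove(path)
        else:
            index.add(path)
            if path not in to_fetch:
                to_fetch.append(path)
    return to_fetch


service = RepositoryService()
service.record(datetime(1999, 11, 1, 9, 0), "/old.html", "modified")
service.record(datetime(1999, 11, 2, 9, 0), "/news.html", "added")
service.record(datetime(1999, 11, 2, 10, 0), "/gone.html", "deleted")

known = {"/old.html", "/gone.html", "/index.html"}
last_visit = datetime(1999, 11, 1, 12, 0)
plan = recrawl_plan(known, service.modifications_since(last_visit))
print(plan, sorted(known))
```

Only `/news.html` is fetched: `/old.html` changed before the last visit, and `/index.html` never appears in the list.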

    Network repository service directory for efficient web crawling
    10.
    Granted invention patent
    Network repository service directory for efficient web crawling (status: lapsed)

    Publication No.: US06418452B1

    Publication date: 2002-07-09

    Application No.: US09433116

    Filing date: 1999-11-03

    IPC classification: G06F17/30

    Abstract: A master repository service maintains a directory of web servers and the most recent times that their web contents were modified, and provides this information to web crawlers to increase their efficiency. The master repository service receives web content update reports from a plurality of web servers, updates the directory to keep it current, and provides crawlers with web site modification information. The web site modification information preferably comprises identifiers for new web sites, “dead” web sites, and modified web sites. Each crawler is preferably provided only with web site modification information received since it last received information from the master repository service. The information allows web crawlers to know immediately about new web sites, and allows them to spend time visiting only those web sites that are new or that have changed their content.
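The "only what arrived since the crawler last asked" behaviour can be sketched with a per-crawler cursor over an append-only report log. The `MasterRepositoryDirectory` class, the status strings, and the use of a sequence number instead of wall-clock time are assumptions made to keep the example small.

```python
class MasterRepositoryDirectory:
    """Hypothetical master directory fed by update reports from web servers."""

    def __init__(self):
        self._events = []     # (sequence number, site, status), append-only
        self._delivered = {}  # crawler id -> count of events already delivered

    def report(self, site, status):
        """Record a report; status is 'new', 'modified', or 'dead'."""
        self._events.append((len(self._events) + 1, site, status))

    def updates_for(self, crawler_id):
        """Return site -> latest status for reports this crawler has not seen."""
        start = self._delivered.get(crawler_id, 0)
        fresh = self._events[start:]
        self._delivered[crawler_id] = len(self._events)
        latest = {}
        for _, site, status in fresh:
            latest[site] = status  # a later report overrides an earlier one
        return latest


directory = MasterRepositoryDirectory()
directory.report("a.example", "new")
directory.report("b.example", "modified")
first = directory.updates_for("crawler-1")

directory.report("a.example", "dead")
second = directory.updates_for("crawler-1")
third = directory.updates_for("crawler-1")
print(first, second, third)
```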
