Method and system for providing a search index for an electronic messaging system based on message threads
    1.
    Type: Invention application
    Legal status: In force

    Publication number: US20060248151A1

    Publication date: 2006-11-02

    Application number: US11118969

    Application date: 2005-04-29

    IPC classification: G06F15/16

    CPC classification: G06F17/30613

    Abstract: When a message having at least one attachment is obtained for indexing, it is indexed as N+1 separate documents, where N is the number of attached documents. If the message is part of a message thread, then information regarding the last message in the thread is retrieved, and the search index attachment metadata for that last message is extracted. A unique identifier is computed for each newly obtained attachment and used to search for matches among the attachments of the last message in the thread. If there is a match, then the newly obtained attachment is not indexed, but the unique identifier of the previously indexed matching attachment is added to a body index document for the new message. A unique identifier associated with the new message body is also added to a list of parent identifiers associated with the attachment. If a search is subsequently issued that matches the contents of the attachment, all documents whose identifiers are listed as parent identifiers in the attachment document metadata will be returned as matches. If an attachment is obtained for a message that is not part of a previous message thread, or if a newly obtained attachment does not match any previously obtained attachment within the message thread to which it belongs, then the attachment is indexed into the search index, and its unique identifier is included in the index document for the newly obtained message body.
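The flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class and field names (`ThreadIndex`, `BodyDoc`, `AttachmentDoc`), the use of SHA-256 as the unique identifier, the substring-based search, and the assumption that every message carries a thread id are all hypothetical choices made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class AttachmentDoc:
    attachment_id: str
    text: str
    parent_ids: list = field(default_factory=list)  # message bodies that carry it


@dataclass
class BodyDoc:
    message_id: str
    text: str
    attachment_ids: list = field(default_factory=list)


class ThreadIndex:
    """Toy index: one body document per message plus one per unique attachment."""

    def __init__(self):
        self.bodies = {}          # message_id -> BodyDoc
        self.attachments = {}     # attachment_id -> AttachmentDoc
        self.last_in_thread = {}  # thread_id -> id of the last indexed message

    def index_message(self, message_id, thread_id, body, attachments):
        body_doc = BodyDoc(message_id, body)
        last_id = self.last_in_thread.get(thread_id)
        # Attachment ids recorded for the last message in the thread, if any.
        known = set(self.bodies[last_id].attachment_ids) if last_id else set()
        for content in attachments:
            # Assumption: a content hash serves as the unique identifier.
            att_id = hashlib.sha256(content.encode("utf-8")).hexdigest()
            if att_id in known:
                doc = self.attachments[att_id]  # match: do not index again
            else:
                # setdefault avoids overwriting an identical attachment that
                # was indexed from another thread (a choice of this sketch).
                doc = self.attachments.setdefault(
                    att_id, AttachmentDoc(att_id, content)
                )
            doc.parent_ids.append(message_id)
            body_doc.attachment_ids.append(att_id)
        self.bodies[message_id] = body_doc
        self.last_in_thread[thread_id] = message_id

    def search(self, term):
        hits = {b.message_id for b in self.bodies.values() if term in b.text}
        for doc in self.attachments.values():
            if term in doc.text:
                hits.update(doc.parent_ids)  # every parent message matches
        return sorted(hits)
```

Indexing a message and a reply that forwards the same attachment stores the attachment once, and a search on the attachment text returns both messages through the parent identifier list.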

    Sharing of full text index entries across application boundaries
    2.
    Type: Invention application
    Legal status: Lapsed

    Publication number: US20060248039A1

    Publication date: 2006-11-02

    Application number: US11118933

    Application date: 2005-04-29

    IPC classification: G06F17/30

    Abstract: A method and system for sharing full text index entries across application boundaries, in which documents are obtained by a shared, platform level indexing service, and a determination is made as to whether the received documents are duplicates of previously indexed documents. If a document is determined to be a duplicate, the index representation of the previously indexed copy of the document is modified to indicate that the document is also associated with another application or context. If a document is not a duplicate of a previously indexed document, the document is indexed to support future searches and/or other processing. The index representation of a document includes application category identifiers associating one or more applications or contexts with the document. When a document is indexed, one or more category identifiers are generated and stored in association with that document. The category identifiers for an indexed document may, for example, represent an application that received, stored, or otherwise processed that document. The application category identifiers enable category specific searching by applications sharing a common search index. A software category filter may be provided to process search results from the shared search index, so that only documents associated with certain categories are returned. Accordingly, one or more search categories may be determined for a given search query, based on the application generating the search query or on some other context information, and then used to filter the search results provided from the shared search index.
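A small sketch of the idea, under stated assumptions: the abstract does not say how duplicates are detected, so the content hash used here is an assumption, and the `SharedIndex` class, its method names, and the category strings are hypothetical.

```python
import hashlib


class SharedIndex:
    """Toy shared index: each unique document is stored once, with categories."""

    def __init__(self):
        self.docs = {}  # content hash -> {"text": str, "categories": set}

    def add(self, text, category):
        """Index text for a category; return True only if it was newly indexed."""
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        entry = self.docs.get(key)
        if entry is None:
            self.docs[key] = {"text": text, "categories": {category}}
            return True
        # Duplicate: only record the additional application or context.
        entry["categories"].add(category)
        return False

    def search(self, term, categories=None):
        """Substring search, optionally filtered to the given categories."""
        wanted = set(categories) if categories is not None else None
        results = []
        for entry in self.docs.values():
            if term.lower() not in entry["text"].lower():
                continue
            # Category filter: keep only documents tied to a wanted category.
            if wanted is not None and not (entry["categories"] & wanted):
                continue
            results.append(entry["text"])
        return results
```

A caller would derive `categories` from the application issuing the query, so a mail client, for instance, sees only documents tagged with its own category even though the index is shared.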

    Method and system for providing a shared search index in a peer to peer network
    3.
    Type: Invention application
    Legal status: Lapsed

    Publication number: US20060248067A1

    Publication date: 2006-11-02

    Application number: US11118968

    Application date: 2005-04-29

    IPC classification: G06F17/30

    CPC classification: G06F17/30091 G06F17/30209

    Abstract: A method and system for sharing search index entries across multiple computer systems organized in a peer to peer network, in which unique content is indexed only once, even though the content may be physically duplicated in multiple computer systems in the peer to peer network. Files are obtained by a shared indexing service, and a determination is made as to whether the received files are duplicates of previously indexed files. If a file is determined to be a duplicate, the index representation of the previously indexed copy of the file is modified to indicate that the file is also associated with another computer system in the peer to peer network. If a file is not a duplicate of a previously indexed file, the file is indexed to support future searches. The index representation of a file includes category identifiers associating one or more computer systems with the file. When a file is indexed, one or more category identifiers are generated and stored in association with that file. The category identifiers for an indexed file may represent host computer systems on which copies of the file are stored. The category identifiers enable location specific searching by computer systems in a peer to peer network sharing a common search index. A software category filter may be provided to process search results from the shared search index, so that only files associated with certain categories are returned.
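To illustrate location specific searching, here is a hedged sketch in which host names play the role of category identifiers. The `PeerIndex` class, the whitespace tokenizer, and the SHA-256 duplicate check are assumptions made for the example and are not taken from the application.

```python
import hashlib
from collections import defaultdict


class PeerIndex:
    """Toy shared index for peers: content is tokenized once per unique file."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> digests of files containing it
        self.hosts = {}                   # digest -> set of hosts holding a copy

    def add_file(self, host, content: bytes):
        digest = hashlib.sha256(content).hexdigest()
        if digest in self.hosts:
            # Duplicate content: record the extra host, skip re-indexing.
            self.hosts[digest].add(host)
            return digest
        self.hosts[digest] = {host}
        for term in content.decode("utf-8", errors="ignore").lower().split():
            self.postings[term].add(digest)
        return digest

    def locate(self, term, allowed_hosts=None):
        """Map each matching digest to the sorted hosts that store the file."""
        out = {}
        for digest in self.postings.get(term.lower(), ()):
            hosts = self.hosts[digest]
            if allowed_hosts is not None:
                hosts = hosts & set(allowed_hosts)  # category (host) filter
            if hosts:
                out[digest] = sorted(hosts)
        return out
```

Passing `allowed_hosts` restricts results to copies stored on particular peers, while omitting it returns every known location of each matching file.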

    Method and system for full text indexing optimization through identification of idle and active content
    4.
    Type: Invention application
    Legal status: Lapsed

    Publication number: US20070073686A1

    Publication date: 2007-03-29

    Application number: US11237087

    Application date: 2005-09-28

    IPC classification: G06F17/30

    CPC classification: G06F17/30699

    Abstract: A system for full text indexing optimization that operates based on identification of idle and active content in a content source, and by prioritizing indexing of idle content over active content. Active and idle content items are automatically identified, and idle content items are given a higher priority for indexing, while active content items are given a lower priority. Active content items are generally those that are considered relatively more likely to be located by the user without using the full text indexing function, while idle content items are those that are relatively more likely to be located through use of the full text indexing function. The specific content item attributes that are used to determine whether a given content item is active or idle may depend on the type of content source for which the full text index is being provided. Additionally, the determination of which content items are active and which are idle may be based on predetermined, static criteria, and/or on dynamically determined use patterns obtained by monitoring operations performed on content items by a user.
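One way to picture the prioritization is a priority queue in which idle items sort ahead of active ones. The criterion used below (an item is active if it was accessed within the last 30 days) is purely an assumed static rule for illustration; the abstract leaves the actual attributes open. The function name `indexing_order` and the item paths are hypothetical.

```python
import heapq
from datetime import datetime, timedelta


def indexing_order(items, now, idle_after=timedelta(days=30)):
    """Return item names in indexing order, idle items first.

    items: iterable of (name, last_accessed) pairs.
    """
    heap = []
    for seq, (name, last_accessed) in enumerate(items):
        is_active = (now - last_accessed) < idle_after
        # False sorts before True, so idle items (is_active == False) pop
        # first; seq keeps the original order within each group.
        heapq.heappush(heap, (is_active, seq, name))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


if __name__ == "__main__":
    now = datetime(2005, 9, 28)
    sample = [
        ("inbox/today.msg", now - timedelta(days=1)),
        ("archive/2003.msg", now - timedelta(days=700)),
    ]
    print(indexing_order(sample, now))
```

A dynamic variant could replace the fixed threshold with a score derived from observed user operations, keeping the same queue structure.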
