Outbound information analysis for generating user interest profiles and improving user productivity
    11.
    Granted invention patent
    Status: Expired

    Publication No.: US06654735B1

    Publication date: 2003-11-25

    Application No.: US09227225

    Filing date: 1999-01-08

    IPC classification: G06F1730

    Abstract: A system for automatically generating user interest profiles and delivering information to users learns a user's interests by monitoring the user's outbound communication streams, i.e., the information that the user produces either by typing (e.g., while a user is composing an e-mail message or editing a word processor document) or by speaking (e.g., while a user is engaged in a phone conversation or listening to a lecture). The system uses the monitored text to build (and possibly update) a user interest profile. The profile is constructed from current text generated by the user, so that the retrieved information reflects present user interests. In addition, the profile may also retain past user interests, so that the profile reflects a combination of past and present user interests. The system then automatically queries diverse databases for information relevant to the interest profile. The databases may include internet web pages, files stored on the user's local network, and other local or remote data repositories. The queries may use a combination of internet search engines, the specific selection of which may depend upon the nature and/or content of the queries. The information retrieved in response to the queries is then presented to the user. The retrieved information may contain, for example, answers to questions that the user might ask and/or data related to the user's current and continuing interests. Because a user's current speech or typed text is highly correlated with the user's current interests, the retrieved information will be relevant to the user's actual interests. The communication stream monitoring, interest profile building, database querying, and presentation of retrieved information are all performed automatically, in real time, and in the background of current user activities.
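The profile-building and querying steps in this abstract can be illustrated with a minimal sketch. The abstract does not say how terms are weighted or how past interests are retained, so the term-frequency weighting, the `STOPWORDS` list, the `retain` factor, and the function names below are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; the abstract does not specify any term filtering.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "are", "you"}

def update_profile(profile, outbound_text, retain=0.5):
    """Blend term counts from newly produced text into an existing profile.

    `retain` (an assumed parameter) scales past interests down so that
    current text dominates while older interests are still kept.
    """
    terms = [t for t in re.findall(r"[a-z]+", outbound_text.lower())
             if t not in STOPWORDS and len(t) > 2]
    updated = Counter({term: weight * retain for term, weight in profile.items()})
    updated.update(Counter(terms))  # add the counts from the current text
    return updated

def build_query(profile, k=3):
    """Form a keyword query from the k highest-weighted profile terms."""
    ranked = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(term for term, _ in ranked[:k])

# Two successive pieces of typed text, e.g. from e-mail drafts.
profile = update_profile(Counter(), "Booking flights to Kyoto; need Kyoto hotel options")
profile = update_profile(profile, "Comparing Kyoto hotel prices near the station")
query = build_query(profile)
```

In this sketch the resulting `query` string would be handed to whichever search engines the system selects; terms that recur across messages ("kyoto", "hotel") rise to the top, while terms seen only in older text fade.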


    Automatic user interest profile generation from structured document access information
    12.
    Granted invention patent
    Status: In force

    Publication No.: US06385619B1

    Publication date: 2002-05-07

    Application No.: US09227117

    Filing date: 1999-01-08

    IPC classification: G06F1730

    Abstract: A system generates user interest profiles by monitoring and analyzing a user's access to a variety of hierarchical levels within a set of structured documents, e.g., documents available at a web site. Each information document has parts associated with it, and the documents are classified into categories using a known taxonomy. The user interest profiles are automatically generated based on the type of content viewed by the user. The type of content is determined by the text within the parts of the documents viewed and the classifications of the documents viewed. In addition, the profiles are also generated based on other factors, including the frequency and currency of visits to documents having a given classification, and/or the hierarchical depth of the levels or parts of the documents viewed. User profiles include an interest category code and an interest score to indicate a level of interest in a particular category. The profiles are updated automatically to accurately reflect the current interests of an individual, as well as past interests. A time-dependent decay factor is applied to the past interests. The system presents to the user documents or references to documents that match the current profile.
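The interest score with a time-dependent decay factor can be sketched as follows. The abstract does not give the decay formula or the depth weighting, so the half-life form, the rule that a visit adds its hierarchical depth to the score, and all names below are assumptions.

```python
def decayed(score, elapsed_days, half_life_days=30.0):
    """Apply a half-life decay (an assumed form) to a past interest score."""
    return score * 0.5 ** (elapsed_days / half_life_days)

def record_visit(profile, category, depth, day):
    """Update the (score, last_day) entry for a category code after a visit.

    Deeper levels of a document add more to the score (assumed weighting),
    and repeated visits accumulate, so frequency raises the score as well.
    """
    score, last_day = profile.get(category, (0.0, day))
    profile[category] = (decayed(score, day - last_day) + float(depth), day)

def current_scores(profile, today):
    """Return each category's score decayed to `today`."""
    return {c: decayed(s, today - d) for c, (s, d) in profile.items()}

profile = {}
record_visit(profile, "SPORTS", depth=1, day=0)
record_visit(profile, "SPORTS", depth=2, day=30)   # old score halves, then +2
record_visit(profile, "FINANCE", depth=3, day=30)
scores = current_scores(profile, today=60)
```

Storing the last-visit day alongside the score lets the decay be applied lazily, only when a category is visited again or when the profile is read for matching.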


    Method and apparatus for parallel profile matching in a large scale webcasting system
    13.
    Granted invention patent
    Status: Expired

    Publication No.: US06169989A

    Publication date: 2001-01-02

    Application No.: US09082747

    Filing date: 1998-05-21

    IPC classification: G06F1700

    Abstract: A method and apparatus for efficiently matching a large collection of user profiles against a large volume of data in a webcasting system. In one embodiment, the invention generally includes four steps to parallelize the profiles. First, an initial profile set is partitioned into several subsets, also referred to as sub-partitions, using various heuristic methods. Second, each sub-partition is mapped onto one or more independent processing units. The processing units are not required to have equal processing performance. However, for best performance, in one embodiment the subset with the highest cost is mapped to the fastest processor, the subset with the next highest cost is mapped to the next fastest processor, and so on. Where appropriate, the invention evaluates the relative subset processing speed of each processor and adjusts future subset mapping based upon these evaluations. For each information item I that needs to be matched with a profile predicate, a third and a fourth step are executed. The third step broadcasts I to all processing units, and the fourth step performs a sequential profile match on I.
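The four steps in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the patented method: the round-robin partitioning, the keyword-count cost model, the keyword-subset form of a profile predicate, the unit names and speeds, and the use of threads to stand in for independent processing units are all hypothetical. The adaptive remapping based on measured processor speed is not covered.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(profiles, n):
    """Step 1: split profiles into n subsets (round-robin, an assumed heuristic)."""
    return [profiles[i::n] for i in range(n)]

def subset_cost(subset):
    """Assumed cost model: total number of keywords to test in the subset."""
    return sum(len(keywords) for _, keywords in subset)

def map_to_units(subsets, unit_speeds):
    """Step 2: pair the costliest subset with the fastest unit, and so on."""
    by_cost = sorted(subsets, key=subset_cost, reverse=True)
    by_speed = sorted(unit_speeds, key=unit_speeds.get, reverse=True)
    return dict(zip(by_speed, by_cost))

def sequential_match(subset, item_words):
    """Step 4: within one unit, test each profile predicate in turn."""
    return [user for user, keywords in subset if keywords <= item_words]

def match_item(assignment, item):
    """Step 3: broadcast the item to every unit, then gather the matches."""
    words = set(item.lower().split())
    with ThreadPoolExecutor(max_workers=len(assignment)) as pool:
        results = list(pool.map(lambda s: sequential_match(s, words),
                                assignment.values()))
    return sorted(user for matched in results for user in matched)

# Each profile is (user, set of keywords that must all appear in the item).
profiles = [
    ("ann", {"python"}),
    ("bob", {"rust", "compiler", "llvm"}),
    ("cat", {"python", "release"}),
    ("dan", {"golf"}),
]
assignment = map_to_units(partition(profiles, 2), {"slow": 1.0, "fast": 2.5})
matched = match_item(assignment, "New Python release announced")
```

Here the subset containing "bob" and "dan" has the higher assumed cost and is therefore assigned to the unit labeled "fast"; every unit receives the same item and scans only its own subset.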
