METHOD AND SYSTEM FOR CATEGORIZING TOPIC DATA WITH CHANGING SUBTOPICS
    21.
    Patent application
    METHOD AND SYSTEM FOR CATEGORIZING TOPIC DATA WITH CHANGING SUBTOPICS (status: pending, published)

    Publication No.: US20090150436A1

    Publication date: 2009-06-11

    Application No.: US11953198

    Filing date: 2007-12-10

    IPC classification: G06F17/30

    CPC classification: G06F16/355

    Abstract: The embodiments of the invention provide a method for the automatic identification of changing subtopics within topics. The method begins by receiving customer satisfaction data having unstructured data objects. Next, the data objects are automatically categorized into pre-defined topics, wherein the pre-defined topics do not change throughout the customer satisfaction analysis. The pre-defined topics can be automatically defined based on a history of customer satisfaction data. Following this, a clustering analysis is automatically performed to identify subtopics of the data objects within the pre-defined topics. The subtopics are more specific than the pre-defined topics, and the subtopics can change. Further, the clustering analysis can include extracting features from the data objects and grouping the features into the subtopics. Each of the subtopics includes features having a predetermined degree of similarity.
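
    The two stages described in the abstract (fixed topics first, then changing subtopics inside each topic) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and does not come from the patent: the topic names and keyword lists in TOPICS, the word-based feature extraction, the Jaccard similarity measure, and the 0.25 threshold standing in for the "predetermined degree of similarity".

```python
from collections import defaultdict

# Hypothetical fixed topics and keyword lists (illustrative, not from the patent).
TOPICS = {
    "billing": {"invoice", "charge", "refund", "bill"},
    "delivery": {"shipping", "late", "package", "courier"},
}


def features(text):
    """Crude feature extraction: lowercase words longer than three letters."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def categorize(text):
    """Return the pre-defined topic with the largest keyword overlap, or None."""
    feats = features(text)
    best = max(TOPICS, key=lambda t: len(TOPICS[t] & feats))
    return best if TOPICS[best] & feats else None


def jaccard(a, b):
    """Jaccard similarity of two feature sets."""
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_subtopics(texts, threshold=0.25):
    """Greedy one-pass clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # list of (seed_features, member_texts)
    for text in texts:
        feats = features(text)
        for seed, members in clusters:
            if jaccard(seed, feats) >= threshold:
                members.append(text)
                break
        else:
            clusters.append((feats, [text]))
    return [members for _, members in clusters]


def analyze(comments):
    """Stage 1: assign fixed topics. Stage 2: cluster each topic into subtopics."""
    by_topic = defaultdict(list)
    for comment in comments:
        topic = categorize(comment)
        if topic is not None:
            by_topic[topic].append(comment)
    return {topic: cluster_subtopics(docs) for topic, docs in by_topic.items()}
```

    The topics stay fixed across runs, while the subtopic clusters are recomputed from whatever data arrives, which is how this sketch lets subtopics change over time.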

    Methods, apparatus and computer programs for characterizing web resources
    22.
    Granted patent
    Methods, apparatus and computer programs for characterizing web resources (status: lapsed)

    Publication No.: US07516397B2

    Publication date: 2009-04-07

    Application No.: US10901275

    Filing date: 2004-07-28

    IPC classification: G06F17/00

    CPC classification: G06F17/30864 G06F17/30896

    Abstract: Methods, apparatus and computer programs are provided for characterizing Web-based information resources based on their interactions. A Web-based information resource is a single Web document or a collection of related Web documents. Unlike simple text documents, Web documents contain hyperlinks and other HTML tags. Different types of interactions, including inbound hyperlinks, outbound hyperlinks and internal links associated with a Web-based information resource, are used to characterize the Web-based information resource. A DOM tree representing the tag structure of a Web-based information resource is used to identify text items likely to be useful as context for a hyperlink anchor text, and the anchor text is combined with the context to generate a representation. The representation of Web-based information resources based on interactions can be used for clustering and classification, and in Web mining applications such as query disambiguation and automatic taxonomy generation.
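
    The idea of combining anchor text with nearby context can be sketched in Python using only the standard library's html.parser. This is a simplified illustration, not the patented method: instead of walking a full DOM tree it treats the text of the nearest enclosing block element as the context, and the BLOCK_TAGS set, the class name, and the helper function are all hypothetical.

```python
from html.parser import HTMLParser

# Assumption: these tags delimit the "context" of a link in this sketch.
BLOCK_TAGS = {"p", "li", "td", "div"}


class LinkContextParser(HTMLParser):
    """Collects (href, anchor text, surrounding block text) for each hyperlink."""

    def __init__(self):
        super().__init__()
        self.block_text = []  # text chunks seen in the current block
        self.pending = []     # links opened in the current block
        self.current = None   # link whose anchor text is being read
        self.links = []       # finished (href, anchor, context) tuples

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.current = [href, []]
                self.pending.append(self.current)

    def handle_endtag(self, tag):
        if tag == "a":
            self.current = None
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self.block_text.append(data)
        if self.current is not None:
            self.current[1].append(data)

    def _flush(self):
        # Close the current block: attach its full text to every link inside it.
        context = " ".join("".join(self.block_text).split())
        for href, chunks in self.pending:
            anchor = " ".join("".join(chunks).split())
            self.links.append((href, anchor, context))
        self.block_text, self.pending = [], []

    def close(self):
        super().close()
        self._flush()


def link_representations(html):
    """Return (href, anchor, context) tuples for all links in an HTML string."""
    parser = LinkContextParser()
    parser.feed(html)
    parser.close()
    return parser.links
```

    A downstream clustering or classification step could then build a bag of words from each anchor and its context; that step is left out here.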

    Method, system and computer program product for profiling entities
    23.
    Granted patent
    Method, system and computer program product for profiling entities (status: in force)

    Publication No.: US07219105B2

    Publication date: 2007-05-15

    Application No.: US10664261

    Filing date: 2003-09-17

    Abstract: The present invention provides a method, system and computer program product for profiling an entity based on information obtained from at least one information source. Various contexts associated with the entity are identified. This can be achieved by using a clustering algorithm, an ontology, a thesaurus, association rules or manually by an expert. Concepts associated with these contexts are then classified into various sets and ranked using a ranking algorithm. Thereafter, certain top ranked concepts are presented to a user as the profile of the entity.

    Web page preview without browsing to web page
    25.
    Patent application
    Web page preview without browsing to web page (status: pending, published)

    Publication No.: US20070073833A1

    Publication date: 2007-03-29

    Application No.: US11237366

    Filing date: 2005-09-28

    IPC classification: G06F15/16

    CPC classification: H04L67/02 G06F16/954

    Abstract: Web pages are previewed without actually having to browse to those web pages. A method is performed in relation to a first web page being browsed by a user and that has a hyperlink to a second web page. The second web page is acquired, and a site-specific preview, a user-specific preview, and a time-specific preview of the second web page are constructed. The site-specific preview is specific to a web site encompassing the second web page. The user-specific preview is specific to the user browsing the first web page. The time-specific preview is nominally specific to a time at which the user previews the second web page. These three previews are combined into an overall preview. In response to the user performing an action in relation to the hyperlink on the first web page, the overall preview of the second web page is displayed without browsing to that page.

    System and method for extraction of factoids from textual repositories
    26.
    Granted patent
    System and method for extraction of factoids from textual repositories (status: lapsed)

    Publication No.: US08706730B2

    Publication date: 2014-04-22

    Application No.: US11321177

    Filing date: 2005-12-29

    IPC classification: G06F17/30

    CPC classification: G06F17/30864 G06F17/30705

    Abstract: A method (400) is disclosed of extracting factoids from text repositories, with the factoids being associated with a given factoid category. The method (400) starts by training a classifier (230) to recognize factoids relevant to that given factoid category. Documents or document summaries relevant to the given factoid category are next collected (410) from the text repositories. Sentences having a predetermined association to the given factoid category are extracted (420) from the documents or said document summaries. Those sentences are classified (440), in a noisy environment, using the classifier (230) to extract snippets containing phrases relevant to the given factoid category. It is the extracted snippets that are the factoids associated with the given factoid category.

    Method for protecting audio content
    28.
    Granted patent
    Method for protecting audio content (status: lapsed)

    Publication No.: US07974411B2

    Publication date: 2011-07-05

    Application No.: US12023103

    Filing date: 2008-01-31

    IPC classification: H04N7/167

    Abstract: Techniques for protecting information in an audio file are provided. The techniques include obtaining an audio file, detecting one or more information-bearing segments in a speech signal, wherein the information comprises information sought for protection, encrypting the information sought for protection by scrambling the one or more segments using a scrambling filter, and selectively decrypting an amount of the encrypted information, wherein the amount of the encrypted information to be decrypted depends on user access privilege, and wherein selectively decrypting the amount of the encrypted information protects said amount of the encrypted information.
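
    Segment scrambling with privilege-dependent descrambling can be sketched in Python as follows. This is only an illustration under stated assumptions: a keyed permutation of samples stands in for the patent's scrambling filter, the integer privilege levels and the (start, end, level) segment tuples are hypothetical, and the scheme is not cryptographically secure.

```python
import random


def _permutation(key, n):
    """Deterministic permutation of range(n) derived from a string key."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order


def scramble(samples, segments, key):
    """Permute the samples inside each (start, end, level) segment."""
    out = list(samples)
    for start, end, _level in segments:
        order = _permutation(f"{key}:{start}", end - start)
        out[start:end] = [samples[start + i] for i in order]
    return out


def descramble(scrambled, segments, key, user_level):
    """Restore only the segments whose required level the user meets."""
    out = list(scrambled)
    for start, end, level in segments:
        if user_level < level:
            continue  # insufficient privilege: segment stays scrambled
        order = _permutation(f"{key}:{start}", end - start)
        for pos, i in enumerate(order):
            out[start + i] = scrambled[start + pos]
    return out
```

    With segments such as (2, 5, 1) and (6, 10, 2), a user at level 1 recovers only the first segment, while a user at level 2 recovers the whole signal.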

    METHOD FOR PROTECTING AUDIO CONTENT
    30.
    Patent application
    METHOD FOR PROTECTING AUDIO CONTENT (status: lapsed)

    Publication No.: US20090199015A1

    Publication date: 2009-08-06

    Application No.: US12023103

    Filing date: 2008-01-31

    IPC classification: H04K1/04 G06F12/14

    Abstract: Techniques for protecting information in an audio file are provided. The techniques include obtaining an audio file, detecting one or more information-bearing segments in a speech signal, wherein the information comprises information sought for protection, encrypting the information sought for protection by scrambling the one or more segments using a scrambling filter, and selectively decrypting an amount of the encrypted information, wherein the amount of the encrypted information to be decrypted depends on user access privilege, and wherein selectively decrypting the amount of the encrypted information protects said amount of the encrypted information.
