System and method for automatic anthology creation using document aspects
    1.
    Granted invention patent
    Legal status: In force

    Publication No.: US07840564B2

    Publication date: 2010-11-23

    Application No.: US11355546

    Filing date: 2006-02-15

    IPC classification: G06F7/00

    Abstract: A generic and expandable document aspect system and method are provided for searching, browsing, presenting, and interacting with data assembled from document contents and related external data. New varieties of document aspects can be added to existing installations and accessed by users without requiring upgrades to the server or clients, for example by using plug-in technology.
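
The plug-in idea in this abstract can be illustrated with a minimal sketch. The abstract does not describe an API, so every name here (`register_aspect`, `render_aspects`, the `word_count` aspect) is hypothetical; the only point shown is that a new aspect can be registered at run time without changing the code that serves aspects.

```python
from typing import Callable, Dict

# An aspect provider takes document text and returns a dictionary of data.
# This type, like every name in this sketch, is an illustrative assumption.
AspectProvider = Callable[[str], dict]

_ASPECTS: Dict[str, AspectProvider] = {}


def register_aspect(name: str):
    """Decorator that registers a provider under the given aspect name."""
    def decorator(provider: AspectProvider) -> AspectProvider:
        _ASPECTS[name] = provider
        return provider
    return decorator


@register_aspect("word_count")
def word_count(document_text: str) -> dict:
    # A trivial aspect derived from the document contents alone.
    return {"words": len(document_text.split())}


def render_aspects(document_text: str) -> dict:
    """Run every registered aspect; this function never needs to change."""
    return {name: provider(document_text) for name, provider in _ASPECTS.items()}
```

Registering a further provider later, for example one that looks up related external data, makes it appear in the result of `render_aspects` without any change to that function.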

    Method and apparatus for real time text analysis and text navigation
    2.
    Granted invention patent
    Legal status: In force

    Publication No.: US08280878B2

    Publication date: 2012-10-02

    Application No.: US12726296

    Filing date: 2010-03-17

    IPC classification: G06F17/30

    CPC classification: G06F17/30696

    Abstract: An end user, by way of a submission interface, instructs an engine to select particular collections of documents to process. The engine processes all the text in all the documents in the selected collections. The result of this processing is a distilled data set, which a browser accesses through APIs. Different views of the distilled data set may be presented to the end user via the browser, allowing the user to assess the utility of the presented data more effectively and to tune the presented data set to a particular research task. One or more of these views may be used to create a new document from the sentences, paragraphs, chapters, or documents in the distilled data set that correspond to those views, for presentation to the end user.

    Method and apparatus for improved information transactions
    3.
    Granted invention patent
    Legal status: In force

    Publication No.: US07536561B2

    Publication date: 2009-05-19

    Application No.: US10306806

    Filing date: 2002-11-27

    IPC classification: G06F11/30

    Abstract: A method and system for making information available over a computer system and compensating information owners or creators for access to said information. In various aspects, the invention provides a mechanism for giving users meaningful access to information via a computer system and network while protecting the interests of publishers and creators in information. The invention provides a solution for information including, but not limited to: text, graphics, photos, executable files, data tables, audio, video, and three-dimensional data. In a further aspect, the invention comprises a new method for allowing a user to review a document while connected to a network while preventing the user from downloading, printing, or copying the document unless a fee is paid. In a further aspect, the invention comprises a new method for allowing a user to review documents on a first cost basis (which may be free), while providing other access to documents, such as copying, printing, or downloading, only on a second cost basis. In a further aspect, the invention comprises a new method for allowing a user to purchase a selectable portion of a document at a price based on the amount of material selected, where that amount of material can include a portion of a document, an entire document, or an anthology of components of multiple documents.
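
The two cost bases and the amount-based price described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the abstract does not say how the amount of material is measured, so the per-page unit, the default rate, and the names `Selection`, `quote_price_cents`, and `allowed_actions` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Selection:
    """Material chosen from one document (assumed unit: pages)."""
    document_id: str
    pages: int


def quote_price_cents(selections: Iterable[Selection],
                      price_per_page_cents: int = 25) -> int:
    """Price grows with the amount selected; the rate is an assumed value."""
    total_pages = sum(selection.pages for selection in selections)
    return total_pages * price_per_page_cents


def allowed_actions(fee_paid: bool) -> Set[str]:
    """Viewing is the first cost basis; other access requires the fee."""
    actions = {"view"}
    if fee_paid:
        actions |= {"copy", "print", "download"}
    return actions
```

An anthology is then simply a list of `Selection` objects drawn from several documents, priced with the same function as a selection from a single document.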

    Method and apparatus for improved information transactions
    4.
    Granted invention patent
    Legal status: In force

    Publication No.: US08892906B2

    Publication date: 2014-11-18

    Application No.: US13198653

    Filing date: 2011-08-04

    IPC classification: G06F11/30 G06K9/00 G06Q30/06

    Abstract: Methods and systems for analyzing an image that includes text, such as a newspaper or magazine page or the like, by mapping the image to determine regions of text and analyzing portions of the image in accordance with characteristics of selected regions of the text, to develop a desired ordering of at least the selected regions in accordance with a textual relationship between the selected regions. The desired order may be related to the order in which the selected regions, and/or words therein, are to be presented in a different format appropriate for a specific use, such as by a human reader, for transferring the text over a network, or for use in a database or by a search function, word processor, or printer. Normalizing, columnizing, regionalizing, frameset building, and article tracing functions may be used to develop the desired order in related regions in an article within the image.
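
To make the idea of ordering text regions concrete, here is a minimal sketch of one simple columnizing heuristic. It is not taken from the patent, which does not spell out its functions in the abstract. The sketch assumes regions are already available as bounding boxes, groups them into columns by similar left edge, and reads columns left to right and regions top to bottom. `Region`, `reading_order`, and `column_tolerance` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """A text region in page coordinates (y grows downward)."""
    name: str
    x: float  # left edge
    y: float  # top edge


def reading_order(regions: List[Region],
                  column_tolerance: float = 20.0) -> List[Region]:
    """Order regions column by column, top to bottom within each column."""
    columns: List[List[Region]] = []
    for region in sorted(regions, key=lambda r: r.x):
        # A region joins the current column if its left edge is close to
        # the left edge of the first region placed in that column.
        if columns and abs(columns[-1][0].x - region.x) <= column_tolerance:
            columns[-1].append(region)
        else:
            columns.append([region])
    ordered: List[Region] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda r: r.y))
    return ordered
```

A real page would also need handling for headlines that span several columns and for articles that continue elsewhere, which this heuristic ignores.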

    Method and apparatus for improved information transactions
    5.
    Granted invention patent
    Legal status: In force

    Publication No.: US08015418B2

    Publication date: 2011-09-06

    Application No.: US12367346

    Filing date: 2009-02-06

    IPC classification: G06F11/30

    Abstract: Methods and systems for analyzing an image that includes text, such as a newspaper or magazine page or the like, by mapping the image to determine regions of text and analyzing portions of the image in accordance with characteristics of selected regions of the text, to develop a desired ordering of at least the selected regions in accordance with a textual relationship between the selected regions. The desired order may be related to the order in which the selected regions, and/or words therein, are to be presented in a different format appropriate for a specific use, such as by a human reader, for transferring the text over a network, or for use in a database or by a search function, word processor, or printer. Normalizing, columnizing, regionalizing, frameset building, and article tracing functions may be used to develop the desired order in related regions in an article within the image.

    Method and apparatus for document clustering and document sketching
    6.
    Granted invention patent
    Legal status: In force

    Publication No.: US07433869B2

    Publication date: 2008-10-07

    Application No.: US11427781

    Filing date: 2006-06-29

    IPC classification: G06F17/30

    Abstract: A first embodiment of the invention provides a system that automatically classifies documents in a collection into clusters based on the similarities between documents, that automatically classifies new documents into the right clusters, and that may change the number or parameters of clusters under various circumstances. A second embodiment of the invention provides a technique for comparing two documents, in which a fingerprint or sketch of each document is computed. In particular, this embodiment of the invention uses a specific algorithm to compute the document's fingerprint. One embodiment uses a sentence in the document as a logical delimiter or window from which significant words are extracted and, thereafter, a hash is computed of all pair-wise permutations. Words are extracted based on their weight in the document, which can be computed using measures such as term frequency and the inverse document frequency.
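
The fingerprinting steps named in this abstract (sentence windows, significant words chosen by weight, hashes of pair-wise permutations) can be sketched roughly as below. This is one reading of the abstract, not the patented algorithm: the tokenizer, the TF-IDF formula, the `top_k` cutoff, the use of SHA-1, and the Jaccard comparison are all assumptions, as are the names `sketch` and `similarity`.

```python
import hashlib
import math
import re
from collections import Counter
from itertools import permutations
from typing import Dict, Set


def sketch(text: str, doc_freq: Dict[str, int], n_docs: int,
           top_k: int = 3) -> Set[str]:
    """Return a set of hashes built from the top-weighted words per sentence."""
    lowered = text.lower()
    term_freq = Counter(re.findall(r"[a-z]+", lowered))

    def weight(word: str) -> float:
        # Term frequency times a smoothed inverse document frequency.
        return term_freq[word] * math.log(n_docs / (1 + doc_freq.get(word, 0)))

    hashes: Set[str] = set()
    for sentence in re.split(r"[.!?]+", lowered):
        words = set(re.findall(r"[a-z]+", sentence))
        # Keep the top_k heaviest words; ties are broken alphabetically.
        significant = sorted(words, key=lambda w: (-weight(w), w))[:top_k]
        # Hash every ordered pair of significant words in this sentence.
        for first, second in permutations(significant, 2):
            digest = hashlib.sha1(f"{first} {second}".encode("utf-8"))
            hashes.add(digest.hexdigest()[:16])
    return hashes


def similarity(sketch_a: Set[str], sketch_b: Set[str]) -> float:
    """Jaccard overlap of two sketches (an assumed comparison measure)."""
    if not sketch_a and not sketch_b:
        return 1.0
    return len(sketch_a & sketch_b) / len(sketch_a | sketch_b)
```

Here `doc_freq` maps each word to the number of documents in a reference collection that contain it, and `n_docs` is the size of that collection; both would have to come from whatever corpus the documents are compared against.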

    System and method for automatic anthology creation using document aspects
    7.
    Granted invention patent
    Legal status: In force

    Publication No.: US08799288B2

    Publication date: 2014-08-05

    Application No.: US13286075

    Filing date: 2011-10-31

    IPC classification: G06F7/00

    Abstract: A generic and expandable document aspect system and method are provided for searching, browsing, presenting, and interacting with data assembled from document contents and related external data. New varieties of document aspects can be added to existing installations and accessed by users without requiring upgrades to the server or clients, for example by using plug-in technology.

    METHOD AND APPARATUS FOR IMPROVED INFORMATION TRANSACTIONS
    8.
    Invention patent application
    Legal status: Under examination (published)

    Publication No.: US20130047221A1

    Publication date: 2013-02-21

    Application No.: US13656558

    Filing date: 2012-10-19

    Applicant: EBRARY

    IPC classification: G06F21/20

    Abstract: A mechanism gives users meaningful access to information while protecting the interests of publishers and creators of information including text, graphics, photos, executable files, data tables, audio, video, and three-dimensional data. It allows a user to review a document while connected to a network but prevents the user from downloading, printing, or copying the document unless a fee is paid. The user is allowed to review documents on a first cost basis, while other access to documents, such as copying, printing, or downloading, is provided only on a second cost basis. The user is also allowed to purchase a selectable portion of a document at a price based on the amount of material selected, where that amount of material can include a portion of a document, an entire document, or an anthology of components of multiple documents.

    Method and apparatus for improved information transactions
    9.
    Granted invention patent
    Legal status: In force

    Publication No.: US08311946B1

    Publication date: 2012-11-13

    Application No.: US09498944

    Filing date: 2000-02-04

    IPC classification: G06F21/00 G06Q20/00

    Abstract: A mechanism gives users meaningful access to information while protecting the interests of publishers and creators of information including text, graphics, photos, executable files, data tables, audio, video, and three-dimensional data. It allows a user to review a document while connected to a network but prevents the user from downloading, printing, or copying the document unless a fee is paid. The user is allowed to review documents on a first cost basis, while other access to documents, such as copying, printing, or downloading, is provided only on a second cost basis. The user is also allowed to purchase a selectable portion of a document at a price based on the amount of material selected, where that amount of material can include a portion of a document, an entire document, or an anthology of components of multiple documents.

    Method and apparatus for document clustering and document sketching
    10.
    Granted invention patent
    Legal status: In force

    Publication No.: US08255397B2

    Publication date: 2012-08-28

    Application No.: US12198841

    Filing date: 2008-08-26

    IPC classification: G06F17/30

    Abstract: A first embodiment of the invention provides a system that automatically classifies documents in a collection into clusters based on the similarities between documents, that automatically classifies new documents into the right clusters, and that may change the number or parameters of clusters under various circumstances. A second embodiment of the invention provides a technique for comparing two documents, in which a fingerprint or sketch of each document is computed. In particular, this embodiment of the invention uses a specific algorithm to compute the document's fingerprint. One embodiment uses a sentence in the document as a logical delimiter or window from which significant words are extracted and, thereafter, a hash is computed of all pair-wise permutations. Words are extracted based on their weight in the document, which can be computed using measures such as term frequency and the inverse document frequency.
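
For the clustering embodiment, the abstract says only that documents are grouped by similarity, that new documents are placed into the right cluster, and that the number of clusters may change. One simple scheme with those properties is sketched below. It is an assumed stand-in, not the patented algorithm: it uses raw word counts, cosine similarity against a running cluster centroid, and a fixed threshold, and the names `IncrementalClusterer` and `cosine` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class IncrementalClusterer:
    def __init__(self, threshold: float = 0.3) -> None:
        self.threshold = threshold          # assumed similarity cutoff
        self.centroids: List[Counter] = []  # summed term counts per cluster
        self.members: List[List[str]] = []  # document ids per cluster

    def add(self, doc_id: str, text: str) -> int:
        """Assign a document to the closest cluster, or start a new one."""
        vector = Counter(text.lower().split())
        best_index, best_similarity = None, 0.0
        for index, centroid in enumerate(self.centroids):
            score = cosine(vector, centroid)
            if score > best_similarity:
                best_index, best_similarity = index, score
        if best_index is None or best_similarity < self.threshold:
            # Nothing is similar enough, so the number of clusters grows.
            self.centroids.append(Counter(vector))
            self.members.append([doc_id])
            return len(self.centroids) - 1
        self.centroids[best_index].update(vector)
        self.members[best_index].append(doc_id)
        return best_index
```

Weighting the vectors by term frequency and inverse document frequency, as the second embodiment does for fingerprints, would be a natural refinement of the raw counts used here.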
