Patent search ap:("XEROX CORPORATION") AND inv:"Pedersen Page Jan O."

1.

Granted patent
Finite-state transduction of related word forms for text indexing and retrieval (Expired)

Publication number: EP0583083B1

Publication date: 2001-11-28

Application number: EP93305626.9

Filing date: 1993-07-19

Applicant: XEROX CORPORATION

Inventor: Cutting, Douglass R.; Halvorsen, Per-Kristian G.; Kaplan, Ronald M.; Karttunen, Lauri; Kay, Martin; Pedersen, Jan O.

IPC: G06F17/28

CPC classification number: G06F17/30616, G06F17/30663, G06F17/30666, G06F17/30672, G06F17/30684, G06F17/30985, G06F17/30988, Y10S707/99931

2.

Published application
Text genre identification (Expired)

Publication number: EP0889417A3

Publication date: 1999-11-24

Application number: EP98305231.7

Filing date: 1998-07-01

Applicant: XEROX CORPORATION

Inventor: Nunberg, Geoffrey D.; Pedersen, Jan O.; Schuetze, Hinrich; Kessler, Brett L.; Grefenstette, Gregory

IPC: G06F17/28

CPC classification number: G06F17/277, G06F17/271, G06F17/2775, G06F17/2785, G06F17/30705, G06F17/30707

Abstract: A processor-implemented method of identifying the genre of a machine-readable, untagged text. The method begins by generating a cue vector from the text; the vector represents occurrences in the text of a first set of nonstructural surface cues that are easily computable. Afterward, the processor determines whether the text is an instance of a first text genre using the cue vector and a weighting vector associated with the first text genre.
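
The two steps in this abstract (build a cue vector, then combine it with a genre-specific weighting vector) can be sketched as follows. This is only an illustration: the four cues, the weights, the dot-product scoring, and the threshold are all assumptions, since the abstract does not say which cues are used or how the two vectors are combined.

```python
import re

# Hypothetical pronoun list used by one of the illustrative cues.
PRONOUNS = {"i", "you", "we"}


def cue_vector(text):
    """Return a vector of cheap surface cues (all four cues are assumptions)."""
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(len(words), 1)  # avoid division by zero on empty text
    return [
        text.count("?") / n_words,                         # question marks per word
        text.count("!") / n_words,                         # exclamation marks per word
        sum(len(w) for w in words) / n_words,              # average word length
        sum(w.lower() in PRONOUNS for w in words) / n_words,  # pronoun rate
    ]


def is_genre(text, weights, threshold=0.0):
    """Score the cue vector against a weighting vector (assumed linear scoring)."""
    score = sum(c * w for c, w in zip(cue_vector(text), weights))
    return score > threshold


# Hypothetical weighting vector for a conversational genre.
CONVERSATIONAL_WEIGHTS = [5.0, 5.0, -0.2, 4.0]

print(is_genre("Do you agree? We do not!", CONVERSATIONAL_WEIGHTS))
print(is_genre("The committee evaluated quarterly financial statements.",
               CONVERSATIONAL_WEIGHTS))
```

In practice the weighting vector would be learned from labelled texts; here it is hand-set only to make the example run.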

3.

Published application
Method of processing a corpus of electronically stored documents (Expired)

Publication number: EP0631245A2

Publication date: 1994-12-28

Application number: EP94304471.9

Filing date: 1994-06-20

Applicant: XEROX CORPORATION

Inventor: Pedersen, Jan O.; Karger, David R.; Cutting, Douglass R.

IPC: G06F15/403, G06F15/401

CPC classification number: G06F17/3071, G06F17/30011, Y10S707/99932

Abstract: Arbitrarily large document collections are processed by expanding a focus set having at least one initial metadocument (82) into a plurality of subsequent metadocuments (83,84,85,86). The number of subsequent metadocuments is approximately equal to a predetermined maximum number. The subsequent metadocuments are then clustered into a predetermined number of new metadocuments, which are summarized and presented to a user. The focus set is redefined to include only user-selected new metadocuments.
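
The expansion step described in this abstract can be sketched as below. The `Metadocument` class, its `size` and `children` fields, and the "expand the largest metadocument first" rule are assumptions made for illustration; the abstract only states that the focus set is expanded until the count is approximately a predetermined maximum. The clustering and summarization steps are not shown.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Metadocument:
    """Hypothetical metadocument: a group of documents with optional sub-groups."""
    size: int
    children: list = field(default_factory=list)


def expand_focus_set(focus_set, max_count):
    """Replace metadocuments by their children until about max_count remain."""
    current = list(focus_set)
    while len(current) < max_count:
        expandable = [m for m in current if m.children]
        if not expandable:
            break  # nothing left to expand
        # Assumption: expand the largest metadocument first.
        target = max(expandable, key=lambda m: m.size)
        current.remove(target)
        current.extend(target.children)
    return current


# Usage: one initial metadocument with two sub-groups.
group_a = Metadocument(3, [Metadocument(1), Metadocument(1), Metadocument(1)])
group_b = Metadocument(1)
root = Metadocument(4, [group_a, group_b])
expanded = expand_focus_set([root], max_count=4)
print(len(expanded))
```

Because a whole set of children is added at once, the result can slightly exceed `max_count`, which is consistent with the "approximately equal" wording.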

4.

Granted patent
Automatic method of extracting summarization using feature probabilities (Expired)

Publication number: EP0751469B1

Publication date: 2002-08-21

Application number: EP96304777.4

Filing date: 1996-06-28

Applicant: XEROX CORPORATION

Inventor: Kupiec, Julian M.; Pedersen, Jan O.; Chen, Francine R.; Brotsky, Daniel C.; Putz, Steven B.

IPC: G06F17/30

CPC classification number: G06F17/30719

5.

Granted patent
Automatic method of generating feature probabilities for automatic extracting summarization (Expired)

Publication number: EP0751470B1

Publication date: 2001-12-19

Application number: EP96304778.2

Filing date: 1996-06-28

Applicant: XEROX CORPORATION

Inventor: Kupiec, Julian M.; Pedersen, Jan O.; Chen, Francine R.; Brotsky, Daniel C.; Putz, Steven B.

IPC: G06F17/30

CPC classification number: G06F17/30719

6.

Published application
Automatic method of generating feature probabilities for automatic extracting summarization (Expired)

Publication number: EP0751470A1

Publication date: 1997-01-02

Application number: EP96304778.2

Filing date: 1996-06-28

Applicant: XEROX CORPORATION

Inventor: Kupiec, Julian M.; Pedersen, Jan O.; Chen, Francine R.; Brotsky, Daniel C.; Putz, Steven B.

IPC: G06F17/30

CPC classification number: G06F17/30719

Abstract: A method of automatically generating feature probabilities that allow later automatic generation of document extracts. The computer system generates the probabilities by analyzing the documents of a corpus one document at a time. First, the computer system designates one of the documents as a selected document. Next, the computer system analyzes each sentence of the selected document to determine the value of the paragraph feature and the value of the uppercase feature. The computer system repeats this analysis for each document of the document corpus. Afterward, the number of occurrences of each value of each feature is calculated and is used to calculate feature value probabilities for all of the features.
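
The counting procedure in this abstract can be sketched as follows. The feature definitions are assumptions: here the paragraph feature is taken to be the sentence's position in its paragraph, and the uppercase feature is whether the sentence contains an all-uppercase word. The probabilities are plain relative frequencies; the abstract does not specify any further conditioning.

```python
from collections import Counter


def paragraph_feature(index, length):
    """Assumed definition: position of the sentence within its paragraph."""
    if index == 0:
        return "initial"
    if index == length - 1:
        return "final"
    return "medial"


def uppercase_feature(sentence):
    """Assumed definition: sentence contains an all-uppercase word of 2+ characters."""
    return any(w.isupper() and len(w) > 1 for w in sentence.split())


def feature_probabilities(corpus):
    """corpus: list of documents; a document is a list of paragraphs;
    a paragraph is a list of sentence strings."""
    counts = {"paragraph": Counter(), "uppercase": Counter()}
    for document in corpus:               # one document at a time
        for paragraph in document:
            for i, sentence in enumerate(paragraph):
                counts["paragraph"][paragraph_feature(i, len(paragraph))] += 1
                counts["uppercase"][uppercase_feature(sentence)] += 1
    # Convert occurrence counts of each feature value into relative frequencies.
    return {
        name: {value: n / sum(counter.values()) for value, n in counter.items()}
        for name, counter in counts.items()
    }


corpus = [
    [["NASA launched a probe.", "It worked.", "Data arrived."]],
    [["One sentence only."]],
]
probabilities = feature_probabilities(corpus)
print(probabilities)
```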

7.

Published application
An iterative technique for phrase query formation and an information retrieval system employing same (Expired)

Publication number: EP0530993A2

Publication date: 1993-03-10

Application number: EP92307372.0

Filing date: 1992-08-12

Applicant: XEROX CORPORATION

Inventor: Pedersen, Jan O.; Tukey, John W.; Halvorsen, Per-Kristian; Bier, Eric A.; Cutting, Douglass R.; Bobrow, Daniel G.

IPC: G06F15/403

CPC classification number: G06F17/30646, G06F17/30011, Y10S707/99934

Abstract: An information retrieval system and method are provided in which an operator inputs (110) one or more query words which are used to determine a search key (120) for searching (130) through a corpus of documents, and which returns (140) any matches between the search key and the corpus of documents as a phrase containing the word data matching the search key (the query word(s)), a non-stop (content) word next adjacent to the matching word data, and all intervening stop-words between the matching word data and the next adjacent non-stop word. The operator, after reviewing one or more of the returned phrases, can then use one or more of the next adjacent non-stop-words as new query words to reformulate the search key (150, 160, 170) and perform a subsequent search through the document corpus. This process can be conducted iteratively, until the appropriate documents of interest are located. The additional non-stop-words from each phrase are preferably aligned with each other (e.g., by columnation) to ease viewing of the "new" content words.
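
The phrase returned for each match (the matching word, any intervening stop-words, and the next adjacent non-stop word) can be sketched as below. The stop-word list, the whitespace tokenization, and the function name `phrases` are illustrative assumptions and are not taken from the patent.

```python
# Illustrative stop-word list; a real system would use a much larger one.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "for"}


def phrases(query_word, tokens):
    """For each occurrence of query_word, return the span that runs from the
    match through any stop-words to the next non-stop (content) word."""
    results = []
    for i, token in enumerate(tokens):
        if token.lower() != query_word.lower():
            continue
        j = i + 1
        while j < len(tokens) and tokens[j].lower() in STOP_WORDS:
            j += 1  # skip intervening stop-words
        if j < len(tokens):
            results.append(tokens[i:j + 1])
    return results


tokens = "retrieval of the documents and retrieval systems".split()
found = phrases("retrieval", tokens)
for phrase in found:
    print(" ".join(phrase))

# Candidate new query words: the last word of each returned phrase.
next_query_words = [phrase[-1] for phrase in found]
```

The operator would pick among `next_query_words` to reformulate the search key and repeat the search.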

8.

Granted patent
Method of processing a corpus of electronically stored documents (Expired)

Publication number: EP0631245B1

Publication date: 2000-03-01

Application number: EP94304471.9

Filing date: 1994-06-20

Applicant: XEROX CORPORATION

Inventor: Pedersen, Jan O.; Karger, David R.; Cutting, Douglass R.

IPC: G06F17/30

CPC classification number: G06F17/3071, G06F17/30011, Y10S707/99932

9.

Published application
Text genre identification (Expired)

Publication number: EP0889417A2

Publication date: 1999-01-07

Application number: EP98305231.7

Filing date: 1998-07-01

Applicant: XEROX CORPORATION

Inventor: Nunberg, Geoffrey D.; Pedersen, Jan O.; Schuetze, Hinrich; Kessler, Brett L.; Grefenstette, Gregory

IPC: G06F17/28

CPC classification number: G06F17/277, G06F17/271, G06F17/2775, G06F17/2785, G06F17/30705, G06F17/30707

Abstract: A processor-implemented method of identifying the genre of a machine-readable, untagged text. The method begins by generating a cue vector from the text; the vector represents occurrences in the text of a first set of nonstructural surface cues that are easily computable. Afterward, the processor determines whether the text is an instance of a first text genre using the cue vector and a weighting vector associated with the first text genre.

10.

Granted patent
Electronic document processing systems (Expired)

Publication number: EP0459792B1

Publication date: 1997-06-04

Application number: EP91304879.9

Filing date: 1991-05-30

Applicant: XEROX CORPORATION

Inventor: Zdybel, Frank, Jr.; Henderson, D. Austin, Jr.; Sang, Henry W., Jr.; Hecht, David L.; Pedersen, Jan O.; Bloomberg, Dan S.; Smith, Z. Erol, III

IPC: G06F17/30, G06F17/60

CPC classification number: H04N1/32133, G06F17/30011, G06Q10/10, H04N2201/3204, H04N2201/3205, H04N2201/3214, H04N2201/3226, H04N2201/3232, H04N2201/3233, H04N2201/3242, H04N2201/3269, H04N2201/3271
