Systems and methods for hybrid text summarization
    1.
    Patent application
    Systems and methods for hybrid text summarization (status: lapsed)

    Publication No.: US20050086592A1

    Publication date: 2005-04-21

    Application No.: US10684508

    Filing date: 2003-10-15

    IPC classification: G06F17/30 G06F15/00 G06F17/27

    Abstract: Techniques are provided for segmenting text into categorized discourse constituents and attaching discourse constituents into a structural representation of discourse. Techniques for determining hybrid structural and non-structural summaries of a text are also provided. A text is segmented based on a theory of discourse analysis into at least a main discourse constituent containing spatio-temporal information about a single event in a possible world view. The discourse constituents are then inserted into a structural representation of discourse. Non-structural techniques are used to determine relevance scores, and important discourse constituents are identified. Relevance scores are percolated through the structural representation of discourse to determine supporting preceding discourse constituents that preserve grammaticality. A hybrid text summary is then determined based on the structural representation of the discourse and relevance scores.

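    The percolation step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: the `Constituent` class, the single `parent` link, the fixed `threshold`, and the rule "keep every ancestor of an important constituent" are all assumptions made for the example, and the relevance scores are assumed to have been computed already by some non-structural method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Constituent:
    """One discourse constituent (hypothetical structure for illustration)."""
    index: int                                # position in the original text
    text: str
    score: float                              # relevance score, assumed precomputed
    parent: Optional["Constituent"] = None    # preceding constituent it attaches to


def hybrid_summary(constituents: List[Constituent], threshold: float) -> List[str]:
    """Keep constituents scoring at or above threshold, plus all their ancestors."""
    keep = set()
    for c in constituents:
        if c.score < threshold:
            continue
        node = c
        # Percolate upward: each ancestor is kept so the selected
        # constituent still has the context it attaches to.
        while node is not None and node.index not in keep:
            keep.add(node.index)
            node = node.parent
    ordered = sorted(constituents, key=lambda c: c.index)
    return [c.text for c in ordered if c.index in keep]


root = Constituent(0, "The board met on Monday.", 0.2)
a = Constituent(1, "It rejected the merger.", 0.9, parent=root)
b = Constituent(2, "Lunch was served.", 0.1, parent=root)
c = Constituent(3, "Shares fell as a result.", 0.8, parent=a)

print(hybrid_summary([root, a, b, c], threshold=0.5))
# ['The board met on Monday.', 'It rejected the merger.', 'Shares fell as a result.']
```

    The low-scoring opening sentence survives only because a high-scoring constituent attaches to it, which is the kind of support the abstract attributes to percolation.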

    Systems and methods for determining relevant information based on document structure
    2.
    Patent application
    Systems and methods for determining relevant information based on document structure (status: in force)

    Publication No.: US20070143098A1

    Publication date: 2007-06-21

    Application No.: US11301853

    Filing date: 2005-12-12

    IPC classification: G06F17/20

    CPC classification: G06F17/279

    Abstract: Techniques are provided for determining relevant information from a document based on document structure. A document is selected and structural elements within the document having a dominance relationship are determined. A first location within the document is selected. The structural element surrounding the first location is determined and the surrounding and non-surrounding structural elements are characterized. Additional documents are associated with the first location in the surrounding structural element based on the surrounding structural element characterization and the non-surrounding structural element characterization. Techniques for dynamically determining annotations for images based on document structure are also provided.

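    As a rough illustration of the "surrounding structural element" step in the abstract above, the sketch below treats structural elements as nested character spans and picks the innermost span that contains a location. The `Element` tuple, the offset convention, and the "smallest containing span" rule are assumptions for the example; the abstract does not say how elements or dominance are represented.

```python
from typing import List, NamedTuple, Optional, Tuple


class Element(NamedTuple):
    """A structural element as a character span (hypothetical representation)."""
    name: str
    start: int   # inclusive offset
    end: int     # exclusive offset


def split_by_location(
    elements: List[Element], location: int
) -> Tuple[Optional[Element], List[Element]]:
    """Return (innermost element containing location, elements not containing it)."""
    surrounding = [e for e in elements if e.start <= location < e.end]
    others = [e for e in elements if not (e.start <= location < e.end)]
    # The smallest containing span is taken as the surrounding element.
    innermost = min(surrounding, key=lambda e: e.end - e.start, default=None)
    return innermost, others


elements = [
    Element("document", 0, 100),
    Element("section-1", 0, 40),
    Element("section-2", 40, 100),
    Element("paragraph", 45, 60),
]

inner, rest = split_by_location(elements, 50)
print(inner.name)                 # paragraph
print([e.name for e in rest])     # ['section-1']
```

    Both outputs would then feed whatever characterization step is used to associate additional documents with the location.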

    Systems and methods for determining and using interaction models
    3.
    Patent application
    Systems and methods for determining and using interaction models (status: in force)

    Publication No.: US20050182618A1

    Publication date: 2005-08-18

    Application No.: US10807009

    Filing date: 2004-03-23

    IPC classification: G10L15/18 G06F17/27 G10L15/02

    CPC classification: G06F17/279

    Abstract: Techniques are provided for determining and using interaction models. Discourse functions, prosodic features and turn information are determined from the speech information in a training corpus. Statistics, decision trees, rules and/or various other methods are used to determine a predictive interaction model based on the discourse functions, the prosodic features and the turn information. Predictive interaction models are optionally determined for individual users, genres, languages and/or other characteristics of the speech information. The predictive interaction model is usable to predict turns in a dialogue based on the discourse functions and prosodic features identified in the speech information. Speech information is presented and/or received based on the predictive interaction model.


    Systems and methods for determining predictive models of discourse functions
    4.
    Patent application
    Systems and methods for determining predictive models of discourse functions (status: in force)

    Publication No.: US20050182625A1

    Publication date: 2005-08-18

    Application No.: US10781443

    Filing date: 2004-02-18

    CPC classification: G06F17/279

    Abstract: Techniques are provided for determining predictive models of discourse functions based on prosodic features of natural language speech. Inter- and intra-sentential discourse functions in a training corpus of natural language speech utterances are determined. The discourse functions are clustered. The exemplary prosodic features associated with each type of discourse function are determined. Machine learning, observation and the like are used to determine a subset of prosodic features associated with each type of discourse function useful in predicting the likelihood of each type of discourse function.


    Systems and methods for synthesizing speech using discourse function level prosodic features
    5.
    Patent application
    Systems and methods for synthesizing speech using discourse function level prosodic features (status: under examination, published)

    Publication No.: US20050187772A1

    Publication date: 2005-08-25

    Application No.: US10785199

    Filing date: 2004-02-25

    IPC classification: G10L13/00 G10L13/08

    CPC classification: G10L13/10

    Abstract: Techniques are provided for synthesizing speech using discourse function level prosodic features. An output text is determined. The discourse functions within the text are determined based on a theory of discourse analysis such as the Unified Linguistic Discourse Model. The salient prosodic features associated with the discourse functions are identified using a predictive model of discourse functions or some other model of salient prosodic features. The discourse functions are transformed into synthesized speech. Discourse function level prosodic feature adjustments are determined and applied, and the synthesized speech is output.


    Identification of semantic relationships within reported speech
    6.
    Granted patent
    Identification of semantic relationships within reported speech (status: in force)

    Publication No.: US08868562B2

    Publication date: 2014-10-21

    Application No.: US12201675

    Filing date: 2008-08-29

    IPC classification: G06F7/00 G06F17/30

    Abstract: Methods and computer-readable media for associating words or groups of words distilled from content, such as reported speech or an attitude report, of a document to form semantic relationships collectively used to generate a semantic representation of the content are provided. Semantic representations may include elements identified or parsed from a text portion of the content, the elements of which may be associated with other elements that share a semantic relationship, such as an agent, location, or topic relationship. Relationships may also be developed by associating one element that is in relation to, or is about, another element, thereby allowing for rapid and effective comparison of associations found in a semantic representation with associations derived from queries. The semantic relationships may be determined based on semantic information, such as potential meanings and grammatical functions of each element within the text portion of the content.


    Automatic generation of digital composite product reviews
    7.
    Granted patent
    Automatic generation of digital composite product reviews (status: in force)

    Publication No.: US08671098B2

    Publication date: 2014-03-11

    Application No.: US13232031

    Filing date: 2011-09-14

    IPC classification: G06F17/30

    CPC classification: G06Q30/0278

    Abstract: Consumers receive module-computed composite reviews that are lively, informative, coherent, and representative of a larger underlying collection of reviews. Representative phrases from reviews are extracted and aggregated into coherent sentences to create the composite review. Clear automatable criteria are provided to define coherence and other qualities, such as representativeness, liveliness, and informativity. Sentence coherence criteria involve syntax, shared vocabulary, phrase connectors, and phrase sentiment polarity, for instance. Phrase representativeness criteria involve review ratings and derived phrase ratings, for instance. Phrase liveliness criteria involve sentiment expression frequency, superlatives, comparatives, degree modifiers, affect activation scores, and affect imagery scores, for instance. Phrase informativity criteria involve product-specific words, review length, and recency, for instance. Prohibited language is filtered out. Composite reviews are automatically distributed, e.g., in response to a web search on the reviewed product. Reviews can be generated with a repeatability and rapidity not attainable by human performance alone.


    Systems and methods for generating analytic summaries
    8.
    Granted patent
    Systems and methods for generating analytic summaries (status: in force)

    Publication No.: US07092872B2

    Publication date: 2006-08-15

    Application No.: US09883345

    Filing date: 2001-06-19

    IPC classification: G06F17/27

    Abstract: A technique is provided for compressing texts such that referential integrity, sentence coherency, punctuation and readability are preserved. Sentence constituents are compressed based on the type of content, the informativity of the sentence constituent and the grammatical readability of the resultant sentence or phrase. Information content portions are parsed to generate part-of-speech tags. The informativity of the constituents in a phrase or sentence is determined and the parts of speech having lower information content and having a low effect on grammatical readability of the phrase or sentence are selectively compressed. Parts of speech having successively higher informativity and low effect on grammatical readability are selected for compression until the desired level of compression is reached. Compressed portions are indicated in the summary with a selectable placeholder which expands to display the compressed text.

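    The selection loop described in the abstract above (compress the least informative constituents first, then successively more informative ones, until the target compression is reached) can be sketched as below. This is only an illustration under stated assumptions: the `Token` class, the precomputed `informativity` values, the boolean `removable` flag standing in for "low effect on grammatical readability", and the `[...]` placeholder are all hypothetical, and the sketch works on single words rather than parsed constituents.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """A word with assumed, precomputed annotations (hypothetical)."""
    text: str
    informativity: float   # higher means more informative
    removable: bool        # True if dropping it keeps the sentence readable


def compress(tokens: List[Token], target_ratio: float, placeholder: str = "[...]") -> str:
    """Drop removable tokens, least informative first, until kept/total <= target_ratio."""
    total = len(tokens)
    dropped = set()
    candidates = sorted(
        (i for i, t in enumerate(tokens) if t.removable),
        key=lambda i: tokens[i].informativity,
    )
    for i in candidates:
        if (total - len(dropped)) / total <= target_ratio:
            break
        dropped.add(i)

    out = []
    for i, t in enumerate(tokens):
        if i in dropped:
            # Adjacent dropped tokens share one placeholder.
            if not out or out[-1] != placeholder:
                out.append(placeholder)
        else:
            out.append(t.text)
    return " ".join(out)


sentence = [
    Token("The", 0.1, False),
    Token("very", 0.2, True),
    Token("old", 0.4, True),
    Token("bridge", 0.9, False),
    Token("collapsed", 0.95, False),
    Token("yesterday", 0.5, True),
]

print(compress(sentence, target_ratio=0.67))
# The [...] bridge collapsed yesterday
```

    In an interactive summary, each placeholder would be the selectable element that expands to show the compressed text; here it is just a marker string.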

    System and method for teaching writing using microanalysis of text
    9.
    Granted patent
    System and method for teaching writing using microanalysis of text (status: in force)

    Publication No.: US07013259B1

    Publication date: 2006-03-14

    Application No.: US09609325

    Filing date: 2000-06-30

    IPC classification: G06F17/27

    Abstract: A technique for teaching expository writing using a system that provides an objective, reader-centric microanalysis of the information a writer has conveyed to a virtual reader. The technique uses a theory of discourse analysis such as the Linguistic Discourse Model. Using the technique, a text is segmented into discrete units of meaning of the selected theoretic model. Student analysis and understanding are facilitated by the assignment of types to the discrete units of meaning and by linking the discrete units of meaning into a discourse tree under the constraints imposed by the selected theory. A virtual, or objective, reader-centric summary of the information actually conveyed by the text is then compared to the writer-designated important concepts, and the results are conveyed as feedback to the writer.


    AUTOMATIC GENERATION OF DIGITAL COMPOSITE PRODUCT REVIEWS
    10.
    Patent application
    AUTOMATIC GENERATION OF DIGITAL COMPOSITE PRODUCT REVIEWS (status: in force)

    Publication No.: US20130066873A1

    Publication date: 2013-03-14

    Application No.: US13232031

    Filing date: 2011-09-14

    IPC classification: G06F17/30

    CPC classification: G06Q30/0278

    Abstract: Consumers receive module-computed composite reviews that are lively, informative, coherent, and representative of a larger underlying collection of reviews. Representative phrases from reviews are extracted and aggregated into coherent sentences to create the composite review. Clear automatable criteria are provided to define coherence and other qualities, such as representativeness, liveliness, and informativity. Sentence coherence criteria involve syntax, shared vocabulary, phrase connectors, and phrase sentiment polarity, for instance. Phrase representativeness criteria involve review ratings and derived phrase ratings, for instance. Phrase liveliness criteria involve sentiment expression frequency, superlatives, comparatives, degree modifiers, affect activation scores, and affect imagery scores, for instance. Phrase informativity criteria involve product-specific words, review length, and recency, for instance. Prohibited language is filtered out. Composite reviews are automatically distributed, e.g., in response to a web search on the reviewed product. Reviews can be generated with a repeatability and rapidity not attainable by human performance alone.
