Method and apparatus for detecting and extracting information from dynamically generated web pages
    12.
    Granted invention patent (status: in force)

    Publication No.: US07877396B1

    Publication date: 2011-01-25

    Application No.: US11546967

    Filing date: 2006-10-12

    IPC classification: G06F17/30

    CPC classification: G06F17/3089

    Abstract: A method and apparatus for automatically detecting and extracting information from dynamically generated web pages are disclosed. For example, the present method stores user-provided information that is entered into a form interface of a web page for a first query. Responsive to the first query, a first response web page is received and stored. The present method then automatically generates a second query to acquire a second response web page that is responsive to the second query. Finally, the present method compares the first response web page and the second response web page. In one embodiment, the present invention extracts information that is dissimilar between the first response web page and the second response web page. This extracted information is deemed to be the pertinent information requested by the user.
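
The comparison step in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes the two response pages are available as strings, treats each non-empty line as a unit, and uses Python's standard `difflib` as a stand-in for whatever comparison the patent actually claims. The function name `extract_dissimilar` and the sample pages are hypothetical.

```python
import difflib


def extract_dissimilar(first_page: str, second_page: str) -> list[str]:
    """Return the lines of first_page that have no counterpart in second_page.

    Assumption: the page template is identical across queries, so the lines
    that differ are the ones carrying the query-specific information.
    """
    first_lines = [ln.strip() for ln in first_page.splitlines() if ln.strip()]
    second_lines = [ln.strip() for ln in second_page.splitlines() if ln.strip()]
    matcher = difflib.SequenceMatcher(a=first_lines, b=second_lines, autojunk=False)
    dissimilar = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "replace" and "delete" mark spans of the first page absent from the second.
        if tag in ("replace", "delete"):
            dissimilar.extend(first_lines[i1:i2])
    return dissimilar


if __name__ == "__main__":
    # Hypothetical response pages for two different form queries.
    page_a = "<html>\n<h1>Flights</h1>\n<p>NYC to BOS: $120</p>\n<footer>Contact us</footer>\n</html>"
    page_b = "<html>\n<h1>Flights</h1>\n<p>NYC to LAX: $340</p>\n<footer>Contact us</footer>\n</html>"
    print(extract_dissimilar(page_a, page_b))
```

A line-based diff is the simplest choice here; a real system would more plausibly compare parsed page structure so that whitespace or attribute-order changes do not register as differences.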

    System and method of providing a spoken dialog interface to a website
    13.
    Granted invention patent (status: in force)

    Publication No.: US08249879B2

    Publication date: 2012-08-21

    Application No.: US13290501

    Filing date: 2011-11-07

    IPC classification: G10L15/18 G06F17/27

    Abstract: Disclosed are a system and method for training a spoken dialog service component from website data. Spoken dialog service components typically include an automatic speech recognition module, a language understanding module, a dialog management module, a language generation module and a text-to-speech module. The method includes converting data from a structured database associated with a website to a structured text data set and a structured task knowledge base, extracting linguistic items from the structured database, and training a spoken dialog service component using at least one of the structured text data, the structured task knowledge base, or the linguistic items. The system includes modules configured to implement the method.

    Method and Apparatus for Building Sales Tools by Mining Data from Websites
    15.
    Invention patent application (status: lapsed)

    Publication No.: US20110258531A1

    Publication date: 2011-10-20

    Application No.: US13088935

    Filing date: 2011-04-18

    IPC classification: G06F17/00

    Abstract: A website mining tool is disclosed that extracts information from, for example, a company's website and presents the extracted information in a graphical user interface (GUI). In one embodiment, web pages from a website are stored in, for example, computer memory and a structure of the web pages is identified. A plurality of blocks of information is then extracted as a function of this structure and a category is assigned to each block of information. The elements in the blocks of information are then displayed, for example to a salesperson, as a function of these categories. In another embodiment, Document Object Model (DOM) parsing is used to identify the structure of the web pages. In yet another embodiment, a support vector machine is used to categorize each block of information.
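
The pipeline in this abstract (identify page structure, extract blocks, assign a category to each block) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard `html.parser` rather than a full DOM, treats each top-level `<div>` or `<section>` as one block, and substitutes a simple keyword rule for the support vector machine mentioned in the abstract. `BlockExtractor`, `CATEGORY_KEYWORDS`, `categorize`, and `mine_page` are hypothetical names.

```python
from html.parser import HTMLParser


class BlockExtractor(HTMLParser):
    """Collect the text of each outermost <div> or <section> as one block."""

    BLOCK_TAGS = {"div", "section"}

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished blocks, in document order
        self._depth = 0    # nesting depth inside block tags
        self._parts = []   # text fragments of the block being read

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                text = " ".join(self._parts).strip()
                if text:
                    self.blocks.append(text)
                self._parts = []

    def handle_data(self, data):
        if self._depth > 0 and data.strip():
            self._parts.append(data.strip())


# Assumption: a keyword rule stands in for the trained classifier (not an SVM).
CATEGORY_KEYWORDS = {
    "contact": ("phone", "email", "address"),
    "products": ("product", "pricing", "price"),
}


def categorize(block: str) -> str:
    lowered = block.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "other"


def mine_page(html: str) -> dict[str, list[str]]:
    """Group the extracted blocks of one page by assigned category."""
    parser = BlockExtractor()
    parser.feed(html)
    parser.close()
    grouped: dict[str, list[str]] = {}
    for block in parser.blocks:
        grouped.setdefault(categorize(block), []).append(block)
    return grouped
```

Calling `mine_page` on a stored page yields a mapping from category to blocks, which is the shape a display layer would need in order to present blocks "as a function of these categories".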

    System and method for generating customized text-to-speech voices
    16.
    Granted invention patent (status: in force)

    Publication No.: US08666746B2

    Publication date: 2014-03-04

    Application No.: US10845364

    Filing date: 2004-05-13

    IPC classification: G10L13/00 G10L13/08

    Abstract: A system and method are disclosed for generating customized text-to-speech voices for a particular application. The method comprises generating a custom text-to-speech voice by selecting a voice for generating a custom text-to-speech voice associated with a domain, collecting text data associated with the domain from a pre-existing text data source and using the collected text data, generating an in-domain inventory of synthesis speech units by selecting speech units appropriate to the domain via a search of a pre-existing inventory of synthesis speech units, or by recording the minimal inventory for a selected level of synthesis quality. The text-to-speech custom voice for the domain is generated utilizing the in-domain inventory of synthesis speech units. Active learning techniques may also be employed to identify problem phrases wherein only a few minutes of recorded data is necessary to deliver a high quality TTS custom voice.

    System and method of automating a spoken dialogue service
    17.
    Granted invention patent (status: in force)

    Publication No.: US08566102B1

    Publication date: 2013-10-22

    Application No.: US10288764

    Filing date: 2002-11-06

    IPC classification: G10L21/00

    CPC classification: G10L15/22 G10L2015/228

    Abstract: A system and method of generating and operating a spoken dialog service for a web-site are disclosed. The system parses web-site data and organizes the web-site data in a task knowledge data bank. The system receives text associated with a user query; processes the received text in a spoken language understanding (SLU) module, the SLU module using the web-site data from the task knowledge data bank; generates a ranked list of relevant responses to the user query; generates a hierarchical tree using the web-site data and the ranked list of relevant responses to the user query; generates a response to the user query using the hierarchical tree; and presents the response to the user.

    SYSTEM AND METHOD OF PROVIDING A SPOKEN DIALOG INTERFACE TO A WEBSITE
    18.
    Invention patent application (status: in force)

    Publication No.: US20120316866A1

    Publication date: 2012-12-13

    Application No.: US13587554

    Filing date: 2012-08-16

    IPC classification: G06F17/27

    Abstract: Disclosed are a system and method for training a spoken dialog service component from website data. Spoken dialog service components typically include an automatic speech recognition module, a language understanding module, a dialog management module, a language generation module and a text-to-speech module. The method includes converting data from a structured database associated with a website to a structured text data set and a structured task knowledge base, extracting linguistic items from the structured database, and training a spoken dialog service component using at least one of the structured text data, the structured task knowledge base, or the linguistic items. The system includes modules configured to implement the method.

    System and method of automatically generating building dialog services by exploiting the content and structure of websites
    19.
    Granted invention patent (status: in force)

    Publication No.: US08090583B1

    Publication date: 2012-01-03

    Application No.: US11929060

    Filing date: 2007-10-30

    IPC classification: G10L11/00

    CPC classification: G10L15/22 G10L15/193

    Abstract: A method and system are disclosed for providing a dialog interface for a website. The method comprises, at each node in a website, computing a summary, a document description and an alias. A dialog manager within a spoken dialog service utilizes the summary, document description and alias for each website node to generate prompts to a user, wherein nodes in the website are matched with user requests. In this manner, a spoken dialog interface to the website content and navigation may be generated automatically.

    Method and apparatus for automatically building conversational systems
    20.
    Granted invention patent (status: in force)

    Publication No.: US07660400B2

    Publication date: 2010-02-09

    Application No.: US10742466

    Filing date: 2003-12-19

    IPC classification: H04M1/64

    Abstract: A system and method provide a natural language interface to world-wide web content. Either in advance or dynamically, webpage content is parsed using a parsing algorithm. A person using a telephone interface can provide speech information, which is converted to text and used to automatically fill in input fields on a webpage form. The form is then submitted to a database search and a response is generated. Information contained on the responsive webpage is extracted and converted to speech via a text-to-speech engine and communicated to the person.
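
The form-filling step in this abstract can be sketched roughly as below. The sketch assumes speech recognition has already produced a text utterance, and it covers only the mapping from that text to form fields and the construction of a GET submission. The field names `origin` and `destination`, the pattern `ROUTE_PATTERN`, the functions `fill_form` and `build_submission`, and the URL are all hypothetical; the patent does not specify them.

```python
import re
from urllib.parse import urlencode

# Assumption: a single toy pattern for utterances of the form "... from X to Y".
ROUTE_PATTERN = re.compile(r"\bfrom (\w+) to (\w+)")


def fill_form(utterance: str) -> dict[str, str]:
    """Map recognized text onto the input fields of a hypothetical search form."""
    match = ROUTE_PATTERN.search(utterance.lower())
    if match is None:
        return {}  # nothing recognized; a dialog system would re-prompt here
    return {"origin": match.group(1), "destination": match.group(2)}


def build_submission(action_url: str, fields: dict[str, str]) -> str:
    """Build the URL that submitting the filled form via GET would request."""
    return f"{action_url}?{urlencode(fields)}"


if __name__ == "__main__":
    fields = fill_form("I want to fly from Boston to Denver")
    print(build_submission("https://example.com/search", fields))
```

A production system would derive the field names from the parsed form itself and use a proper language-understanding component instead of one regular expression; the sketch only shows where the recognized text enters the form.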
