Methods, apparatus and computer programs for characterizing web resources
    1.
    Type: Granted invention patent
    Status: Lapsed

    Publication No.: US07516397B2

    Publication date: 2009-04-07

    Application No.: US10901275

    Filing date: 2004-07-28

    IPC classification: G06F17/00

    CPC classification: G06F17/30864 G06F17/30896

    Abstract: Methods, apparatus and computer programs are provided for characterizing Web-based information resources based on their interactions. A Web-based information resource is a single Web document or a collection of related Web documents. Unlike simple text documents, Web documents contain hyperlinks and other HTML tags. Different types of interactions, including inbound hyperlinks, outbound hyperlinks and internal links associated with a Web-based information resource, are used to characterize the Web-based information resource. A DOM tree representing the tag structure of a Web-based information resource is used to identify text items likely to be useful as context for a hyperlink anchor text, and the anchor text is combined with the context to generate a representation. The representation of Web-based information resources based on interactions can be used for clustering and classification, and in Web mining applications such as query disambiguation and automatic taxonomy generation.

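
The abstract's idea of combining hyperlink anchor text with context found through the DOM tree can be illustrated with a minimal sketch. It assumes well-formed XHTML input and uses a simple heuristic chosen here for illustration, not taken from the patent: the context of an anchor is the text of its parent element. The function name `anchor_representations` and the bag-of-words output are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Dict


def anchor_representations(xhtml: str) -> Dict[str, Counter]:
    """Map each link target to a bag of words built from anchor text plus context."""
    root = ET.fromstring(xhtml)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    reps: Dict[str, Counter] = {}
    for a in root.iter("a"):
        href = a.get("href")
        if href is None:
            continue
        anchor_text = "".join(a.itertext())
        parent = parent_of.get(a)
        # Assumption: the parent element's text serves as context. It contains
        # the anchor text too, so anchor words end up counted twice.
        context = "".join(parent.itertext()) if parent is not None else ""
        tokens = re.findall(r"[a-z0-9]+", (anchor_text + " " + context).lower())
        reps.setdefault(href, Counter()).update(tokens)
    return reps


page = (
    "<html><body><div><h2>Hiking gear</h2>"
    '<p>Read our <a href="/boots">boots review</a> before buying.</p>'
    "</div></body></html>"
)
reps = anchor_representations(page)
```

Here the heading text is not part of the representation for `/boots` because the heading is a sibling of the enclosing paragraph, not the anchor's parent. A different context rule, such as walking further up the tree, would change that.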

    Methods, apparatus and computer programs for characterizing web resources

    Publication No.: US20060026496A1

    Publication date: 2006-02-02

    Application No.: US10901275

    Filing date: 2004-07-28

    IPC classification: G06F17/21

    CPC classification: G06F17/30864 G06F17/30896

    Abstract: Methods, apparatus and computer programs are provided for characterizing Web-based information resources based on their interactions. A Web-based information resource is a single Web document or a collection of related Web documents. Unlike simple text documents, Web documents contain hyperlinks and other HTML tags. Different types of interactions, including inbound hyperlinks, outbound hyperlinks and internal links associated with a Web-based information resource, are used to characterize the Web-based information resource. A DOM tree representing the tag structure of a Web-based information resource is used to identify text items likely to be useful as context for a hyperlink anchor text, and the anchor text is combined with the context to generate a representation. The representation of Web-based information resources based on interactions can be used for clustering and classification, and in Web mining applications such as query disambiguation and automatic taxonomy generation.

    METHOD AND SYSTEM FOR CATEGORIZING TOPIC DATA WITH CHANGING SUBTOPICS
    3.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20090150436A1

    Publication date: 2009-06-11

    Application No.: US11953198

    Filing date: 2007-12-10

    IPC classification: G06F17/30

    CPC classification: G06F16/355

    Abstract: The embodiments of the invention provide a method for the automatic identification of changing subtopics within topics. The method begins by receiving customer satisfaction data having unstructured data objects. Next, the data objects are automatically categorized into pre-defined topics, wherein the pre-defined topics do not change throughout the customer satisfaction analysis. The pre-defined topics can be automatically defined based on a history of customer satisfaction data. Following this, a clustering analysis is automatically performed to identify subtopics of the data objects within the pre-defined topics. The subtopics are more specific than the pre-defined topics, and the subtopics can change. Further, the clustering analysis can include extracting features from the data objects and grouping the features into the subtopics. Each of the subtopics includes features having a predetermined degree of similarity.

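
The two stages in the abstract (fixed topics first, then clustering into subtopics that may change) can be sketched as follows. This is a simplification under stated assumptions, not the patented method: topics are hypothetical keyword sets, features are lowercase word sets, data objects rather than individual features are grouped, and the "predetermined degree of similarity" is modeled as a Jaccard threshold in a greedy single-pass clustering.

```python
import re
from typing import Dict, List, Set, Tuple


def features(text: str) -> Set[str]:
    """Extract a set of lowercase word features from unstructured text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def categorize(text: str, topics: Dict[str, Set[str]]) -> str:
    """Pick the pre-defined topic with the most keyword overlap (ties: first topic)."""
    f = features(text)
    return max(topics, key=lambda name: len(f & topics[name]))


def subtopics(texts: List[str], threshold: float) -> List[List[str]]:
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters: List[Tuple[Set[str], List[str]]] = []
    for text in texts:
        f = features(text)
        for seed, members in clusters:
            if jaccard(f, seed) >= threshold:
                members.append(text)
                break
        else:
            clusters.append((f, [text]))
    return [members for _, members in clusters]


def analyze(comments: List[str], topics: Dict[str, Set[str]],
            threshold: float) -> Dict[str, List[List[str]]]:
    by_topic: Dict[str, List[str]] = {name: [] for name in topics}
    for comment in comments:
        by_topic[categorize(comment, topics)].append(comment)
    return {name: subtopics(texts, threshold) for name, texts in by_topic.items()}


TOPICS = {  # hypothetical fixed topics
    "billing": {"invoice", "charge", "refund"},
    "delivery": {"late", "package", "courier"},
}
COMMENTS = [
    "Refund for double charge",
    "Double charge on invoice",
    "Invoice address wrong",
    "Package arrived late",
]
result = analyze(COMMENTS, TOPICS, threshold=0.3)
```

Rerunning `analyze` on a later batch of comments keeps the topic keys fixed while the subtopic clusters inside each topic are free to differ, which mirrors the fixed-topic, changing-subtopic split described above.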

    Web page preview without browsing to web page
    4.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20070073833A1

    Publication date: 2007-03-29

    Application No.: US11237366

    Filing date: 2005-09-28

    IPC classification: G06F15/16

    CPC classification: H04L67/02 G06F16/954

    Abstract: Web pages are previewed without actually having to browse to those web pages. A method is performed in relation to a first web page being browsed by a user and that has a hyperlink to a second web page. The second web page is acquired, and a site-specific preview, a user-specific preview, and a time-specific preview of the second web page are constructed. The site-specific preview is specific to a web site encompassing the second web page. The user-specific preview is specific to the user browsing the first web page. The time-specific preview is nominally specific to a time at which the user previews the second web page. These three previews are combined into an overall preview. In response to the user performing an action in relation to the hyperlink on the first web page, the overall preview of the second web page is displayed without browsing to that page.


    METHOD FOR SEGMENTING COMMUNICATION TRANSCRIPTS USING UNSUPERVSED AND SEMI-SUPERVISED TECHNIQUES
    5.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20090112588A1

    Publication date: 2009-04-30

    Application No.: US11931806

    Filing date: 2007-10-31

    IPC classification: G10L15/06

    CPC classification: G10L15/04 G06F16/355

    Abstract: A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a specified number of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence clusters within sequences of the collection.


    CROWDSOURCING TRANSLATION SERVICES
    6.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20140058718A1

    Publication date: 2014-02-27

    Application No.: US13592736

    Filing date: 2012-08-23

    IPC classification: G06F17/28

    CPC classification: G06F17/2836 G06F17/2854

    Abstract: A method, system, and computer program product for translating a text file are disclosed. A text file in a source language is received and text snippets from the text file are extracted. The text snippets are distributed to a first set of remote workers for translation. The translated text snippets are validated by a second set of remote workers and the validated text snippets are used to generate a translated text file.

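
The extract, distribute, validate and assemble steps of the abstract can be sketched as a small pipeline. Everything concrete here is an assumption made for illustration: snippets are non-empty lines, remote workers are modeled as plain callables, distribution is round-robin, and a snippet counts as validated only when every validator approves it.

```python
from itertools import cycle
from typing import Callable, Sequence

Translator = Callable[[str], str]        # source snippet -> translation
Validator = Callable[[str, str], bool]   # (source, translation) -> approved?


def translate_file(text: str, translators: Sequence[Translator],
                   validators: Sequence[Validator]) -> str:
    # Step 1: extract snippets (assumption: one snippet per non-empty line).
    snippets = [line for line in text.splitlines() if line.strip()]
    workers = cycle(translators)
    translated = []
    for snippet in snippets:
        # Step 2: hand the snippet to the next translator in round-robin order.
        for _ in range(len(translators)):
            candidate = next(workers)(snippet)
            # Step 3: the second set of workers validates the candidate.
            if all(approve(snippet, candidate) for approve in validators):
                translated.append(candidate)
                break
        else:
            raise ValueError(f"no validated translation for: {snippet!r}")
    # Step 4: assemble the validated snippets into the translated file.
    return "\n".join(translated)


# Toy stand-ins for remote workers.
glossary = {"hello": "bonjour", "thanks": "merci"}
careful = lambda s: glossary.get(s, s)
careless = lambda s: s.upper()
changed_and_lowercase = lambda src, out: out != src and out.islower()

output = translate_file("hello\n\nthanks", [careful, careless], [changed_and_lowercase])
```

In this run the second snippet first goes to the careless worker, fails validation, and is then retried with the next worker. A real system would more likely queue rejected snippets than retry synchronously.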

    Partial Access to Electronic Documents and Aggregation for Secure Document Distribution
    7.
    Type: Invention application
    Status: In force

    Publication No.: US20120216290A1

    Publication date: 2012-08-23

    Application No.: US13364745

    Filing date: 2012-02-02

    IPC classification: G06F21/24

    CPC classification: G06F21/6227 G06F21/10

    Abstract: Partial access to electronic documents and aggregation for secure document distribution is disclosed. The embodiments herein relate to providing access to electronic documents and, more particularly, to providing access to portions of electronic documents and aggregating such portions in a secure document distribution environment. Existing document distribution mechanisms do not provide a means to access partial documents based on attributes such as the roles of the agents within an organization, location of access, time of access, device ID and so on. The disclosed method allows agents to access partial contents of documents based on the attributes. Metadata tags are attached to the documents in order to control access to the documents by the defined attributes. An agent who wishes to access a document enters credentials and, based on those credentials, is given access to the content assigned to that agent.

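
Attribute-based partial access as described in the abstract can be illustrated with a minimal sketch. The data layout is an assumption: each document section carries a metadata tag mapping an attribute name (such as role or location) to the set of values allowed to read it, and an agent sees a section only if every tagged attribute matches. The names `Section` and `visible_sections` are hypothetical, and credential verification is out of scope.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Section:
    text: str
    # Metadata tag: attribute name -> allowed values. Empty means unrestricted.
    allowed: Dict[str, Set[str]] = field(default_factory=dict)


def visible_sections(sections: List[Section], agent: Dict[str, str]) -> List[str]:
    """Return the text of every section whose tag the agent's attributes satisfy."""
    return [
        s.text
        for s in sections
        if all(agent.get(attr) in values for attr, values in s.allowed.items())
    ]


document = [
    Section("Summary"),
    Section("Salary table", {"role": {"hr", "manager"}}),
    Section("Site plan", {"role": {"manager"}, "location": {"office"}}),
]

remote_manager = {"role": "manager", "location": "remote"}
partial_view = visible_sections(document, remote_manager)
```

The remote manager sees the summary and the salary table but not the site plan, because the location attribute does not match. An agent with no attributes sees only untagged sections.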

    Feedback based technique towards total completion of task in crowdsourcing

    Publication No.: US09727881B2

    Publication date: 2017-08-08

    Application No.: US13350965

    Filing date: 2012-01-16

    IPC classification: G06Q10/00 G06Q30/02

    CPC classification: G06Q30/02

    Abstract: The present disclosure provides a method for incenting potential contributors for creating content in response to a posting. The method comprises: posting a task to a first crowdsource with the task having a first expiry period of δ1; waiting for δ1 period to expire; determining whether the task is complete; reposting the task if not complete including a second expiry period of δ2; waiting for the second period of δ2 to expire; reposting the task if not yet complete including an increased reward and a third expiry period of δ3; waiting for the third period of δ3 to expire; and, reposting the task if still not complete, wherein the reposting includes a second crowdsource.
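
The escalation sequence in the abstract (post, wait δ1, repost with δ2, repost with an increased reward and δ3, then repost to a second crowdsource) can be sketched as a plan plus a driver loop. Several details are assumptions because the abstract leaves them open: the last repost keeps the first crowdsource and the increased reward, it has no expiry, and posting, waiting and completion checks are injected as hypothetical callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Posting:
    crowdsources: List[str]
    expiry: Optional[float]  # None: the abstract states no expiry for the last repost
    reward: float


def escalation_plan(first: str, second: str, d1: float, d2: float, d3: float,
                    reward: float, increased_reward: float) -> List[Posting]:
    return [
        Posting([first], d1, reward),            # initial posting, expiry δ1
        Posting([first], d2, reward),            # repost, expiry δ2
        Posting([first], d3, increased_reward),  # repost, increased reward, expiry δ3
        # Assumption: the final repost adds the second crowdsource to the first.
        Posting([first, second], None, increased_reward),
    ]


def post_until_complete(plan: List[Posting],
                        post: Callable[[Posting], None],
                        wait: Callable[[float], None],
                        is_complete: Callable[[], bool]) -> int:
    """Walk the plan, stopping once the task is complete. Returns postings made."""
    made = 0
    for posting in plan:
        post(posting)
        made += 1
        if posting.expiry is None:
            break
        wait(posting.expiry)
        if is_complete():
            break
    return made


plan = escalation_plan("crowd-A", "crowd-B", 1.0, 2.0, 3.0,
                       reward=5.0, increased_reward=8.0)
posted, waits = [], []
# Simulated run: the task completes during the second expiry period.
count = post_until_complete(plan, posted.append, waits.append,
                            lambda: len(waits) >= 2)
```

Passing `time.sleep` as `wait` and a real status check as `is_complete` would turn the simulation into a blocking driver. A scheduler-based design would suit long expiry periods better.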

    System and method to implement sharing of paper documents using virtual currency
    9.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08848242B2

    Publication date: 2014-09-30

    Application No.: US13600613

    Filing date: 2012-08-31

    IPC classification: G06K15/02

    CPC classification: G06Q10/10 G06Q50/01 H04L67/10

    Abstract: The application discloses systems and methods for physically sharing a hard copy of a document. The systems and methods include presenting to a user a graphical user interface having printing options for printing the document, where the graphical user interface has an input for receiving an indication by the user that the user is willing to share the hard copy of the document; presenting to the user options for defining characteristics of the hard copy of the document in response to receiving the indication; and publishing at least one of the defined characteristics within a profile page of the user.
