GENERATING TEXT SNIPPETS USING UNIVERSAL CONCEPT GRAPH
    1.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017143096A1

    Publication date: 2017-08-24

    Application No.: PCT/US2017/018220

    Filing date: 2017-02-16

    Abstract: In an example embodiment, a method for selecting text snippets to display on a computer display is provided. A universal concept graph for phrases relevant to a search domain is created, the universal concept graph representing each phrase as a node and relationships between the phrases as edges between the nodes. A result in the search domain is represented as a subgraph of the universal concept graph by extracting the portion of the universal concept graph that contains the phrases found in the result. A score is then produced for each node of the subgraph, the score based on a graph analysis algorithm applied to the subgraph. Finally, the text snippets to display for the result are selected based on the scores produced in the subgraph for the phrases contained in the text snippets.
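
    The steps in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the graph analysis algorithm, so a simple PageRank-style power iteration is swapped in, and the graph contents, the substring-based phrase matching, and all function names are hypothetical.

```python
# Hypothetical universal concept graph: phrase -> related phrases (undirected).
CONCEPT_GRAPH = {
    "machine learning": {"neural network", "data mining", "statistics"},
    "neural network": {"machine learning", "deep learning"},
    "deep learning": {"neural network"},
    "data mining": {"machine learning", "statistics"},
    "statistics": {"machine learning", "data mining"},
    "cooking": {"recipe"},
    "recipe": {"cooking"},
}


def extract_subgraph(graph, result_text):
    """Keep only the nodes whose phrase occurs in the result text."""
    lowered = result_text.lower()
    nodes = {phrase for phrase in graph if phrase in lowered}
    return {phrase: graph[phrase] & nodes for phrase in nodes}


def pagerank(subgraph, damping=0.85, iterations=50):
    """PageRank-style scores (an assumed choice of graph analysis algorithm)."""
    n = len(subgraph)
    if n == 0:
        return {}
    scores = {phrase: 1.0 / n for phrase in subgraph}
    for _ in range(iterations):
        new = {phrase: (1.0 - damping) / n for phrase in subgraph}
        for phrase, neighbours in subgraph.items():
            # Nodes without neighbours spread their score over every node.
            targets = neighbours if neighbours else subgraph
            share = damping * scores[phrase] / len(targets)
            for target in targets:
                new[target] += share
        scores = new
    return scores


def select_snippets(graph, result_text, snippets, k=1):
    """Rank snippets by the summed scores of the phrases they contain."""
    scores = pagerank(extract_subgraph(graph, result_text))

    def snippet_score(snippet):
        lowered = snippet.lower()
        return sum(s for phrase, s in scores.items() if phrase in lowered)

    return sorted(snippets, key=snippet_score, reverse=True)[:k]
```

    Calling select_snippets with the full result text and its candidate sentences returns the k sentences whose phrases are most central in the result's subgraph.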

    SEMANTIC MULTISENSORY EMBEDDINGS FOR VIDEO SEARCH BY TEXT
    2.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017052791A1

    Publication date: 2017-03-30

    Application No.: PCT/US2016/045353

    Filing date: 2016-08-03

    Abstract: A method of embedding video for text search includes extracting visual features from a video. The visual features may, for example, include appearance information, motion, audio, and/or like features. Term vectors are determined from textual descriptions associated with the video. The text may, for example, be included in a title for the video or within the video itself (e.g., subtitles). A feature projection is computed based on the extracted video features and a textual projection is computed based on the term vectors. A semantic embedding is computed based on the feature projection and the textual projection by jointly optimizing semantic predictability and semantic descriptiveness.

    CREATING A TRAINING DATA SET BASED ON UNLABELED TEXTUAL DATA
    3.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017040663A1

    Publication date: 2017-03-09

    Application No.: PCT/US2016/049700

    Filing date: 2016-08-31

    Applicant: SKYTREE, INC.

    CPC classification number: G06F17/30675 G06F17/30705 G06N99/005

    Abstract: A system and method are disclosed for obtaining a plurality of unlabeled text documents; obtaining an initial concept; obtaining keywords from a knowledge source based on the initial concept; scoring the plurality of unlabeled documents based at least in part on the initial keywords; determining a categorization of the documents based on the scores; performing a first feature selection and creating a first vector space representation of each document in a first category and a second category, the first and second categories based on the scores, the first vector space representation serving as one or more labels for an associated unlabeled textual document; and generating the training set including a subset of the obtained unlabeled textual documents, the subset of the obtained unlabeled documents including documents belonging to the first category and documents belonging to the second category.
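
    A minimal sketch of the keyword-scoring and categorization steps, under stated assumptions: the knowledge source is a hypothetical in-memory dictionary, the score is a plain keyword count, and the two thresholds are invented for illustration. The feature selection and vector space representation steps named in the abstract are omitted.

```python
# Hypothetical knowledge source mapping a concept to related keywords.
KNOWLEDGE_SOURCE = {"sports": ["football", "goal", "team", "match"]}


def score_document(text, keywords):
    """Count keyword occurrences in a whitespace-tokenized document."""
    tokens = text.lower().split()
    return sum(tokens.count(keyword) for keyword in keywords)


def build_training_set(documents, concept, high=2, low=0):
    """Label confident documents and leave ambiguous ones out.

    Documents scoring at least `high` go to the first category (label 1),
    documents scoring at most `low` go to the second category (label 0).
    """
    keywords = KNOWLEDGE_SOURCE[concept]
    training_set = []
    for doc in documents:
        score = score_document(doc, keywords)
        if score >= high:
            training_set.append((doc, 1))
        elif score <= low:
            training_set.append((doc, 0))
    return training_set
```

    Only a subset of the unlabeled documents ends up in the training set, which matches the abstract's wording; documents with in-between scores are skipped rather than guessed.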

    TEXT RESTRUCTURING
    4.
    Invention application
    Status: under examination (published)

    Publication No.: WO2016171709A1

    Publication date: 2016-10-27

    Application No.: PCT/US2015/027445

    Filing date: 2015-04-24

    Abstract: In example implementations, a plurality of re-structured versions of text is generated for each one of a plurality of different documents by applying a plurality of text summarization methods to each one of the plurality of different documents. An effectiveness score is calculated for each one of the plurality of text summarization methods to determine the text summarization method with the highest effectiveness score for an application. The plurality of re-structured versions of text for each one of the plurality of different documents that is generated by the text summarization method that has the highest effectiveness score is stored to be used in the application.
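
    The selection loop described here might look like the sketch below. Everything concrete in it is an assumption: the two toy summarizers, the keyword-coverage effectiveness measure, and the function names are hypothetical stand-ins, since the abstract does not specify the methods or the scoring.

```python
def first_sentence(text):
    """Toy summarizer: keep the first sentence only."""
    return text.split(". ")[0].rstrip(".") + "."


def first_words(text, n=8):
    """Toy summarizer: keep the first n whitespace-separated words."""
    return " ".join(text.split()[:n])


def keyword_coverage(summary, keywords):
    """Hypothetical effectiveness measure: fraction of keywords retained."""
    words = set(summary.lower().replace(".", "").split())
    return sum(keyword in words for keyword in keywords) / len(keywords)


def pick_best_method(documents, methods, keywords_per_doc):
    """Apply every method to every document and keep the best method's output."""
    versions = {
        name: [method(doc) for doc in documents]
        for name, method in methods.items()
    }

    def effectiveness(name):
        scores = [
            keyword_coverage(summary, keywords)
            for summary, keywords in zip(versions[name], keywords_per_doc)
        ]
        return sum(scores) / len(scores)

    best = max(methods, key=effectiveness)
    return best, versions[best]
```

    The returned versions are the ones that would be stored for the application; a different application would plug in its own effectiveness measure.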

    PROTECTED INDEXING AND QUERYING OF LARGE SETS OF TEXTUAL DATA
    5.
    Invention application
    Status: under examination (published)

    Publication No.: WO2016053714A1

    Publication date: 2016-04-07

    Application No.: PCT/US2015/051681

    Filing date: 2015-09-23

    Abstract: A protected querying technique involves creating shingles from a query and then fingerprinting the shingles. The documents to be queried are also shingled and then fingerprinted. The overlap between adjacent shingles differs between the query and the documents to be queried: the query shingles overlap less, or not at all. The query fingerprint is compared to the fingerprints of the documents to be queried to determine whether there are any matches.
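
    A small sketch of the idea, assuming word-level shingles of three tokens, a step of one token for documents (fully overlapping), a step equal to the shingle size for queries (no overlap), and a truncated SHA-256 digest as the fingerprint. None of these parameters come from the abstract.

```python
import hashlib


def shingles(text, size, step):
    """Word shingles of `size` tokens, starting every `step` tokens."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + size])
        for i in range(0, len(tokens) - size + 1, step)
    ]


def fingerprint(shingle):
    """Fingerprint a shingle (assumed here: truncated SHA-256 hex digest)."""
    return hashlib.sha256(shingle.encode("utf-8")).hexdigest()[:16]


def document_fingerprints(text, size=3):
    # Documents use overlapping shingles (step 1), so every alignment is indexed.
    return {fingerprint(s) for s in shingles(text, size, step=1)}


def query_fingerprints(text, size=3):
    # Queries use non-overlapping shingles (step == size).
    return {fingerprint(s) for s in shingles(text, size, step=size)}


def matching_documents(query, documents, size=3):
    """Names of documents sharing at least one fingerprint with the query."""
    wanted = query_fingerprints(query, size)
    return [
        name for name, text in documents.items()
        if wanted & document_fingerprints(text, size)
    ]
```

    Because the documents are shingled at every offset, a non-overlapping query shingle still lines up with some document shingle whenever the same word run occurs in the document.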

    TECHNIQUES FOR SIMILARITY ANALYSIS AND DATA ENRICHMENT USING KNOWLEDGE SOURCES
    6.
    Invention application
    Status: under examination (published)

    Publication No.: WO2016049437A1

    Publication date: 2016-03-31

    Application No.: PCT/US2015/052190

    Filing date: 2015-09-25

    Abstract: The present disclosure relates to performing similarity metric analysis and data enrichment using knowledge sources. A data enrichment service can compare an input data set to reference data sets stored in a knowledge source to identify similarly related data. A similarity metric can be calculated corresponding to the semantic similarity of two or more data sets. The similarity metric can be used to identify data sets based on their metadata attributes and data values, enabling easier indexing and high-performance retrieval of data values. An input data set can be labeled with a category based on the data set having the best match with the input data set. The similarity of an input data set with a data set provided by a knowledge source can be used to query a knowledge source to obtain additional information about the data set. The additional information can be used to provide recommendations to the user.
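
    As an illustration of labeling an input data set by its best-matching reference set, the sketch below uses Jaccard similarity over normalized values. The abstract does not say which similarity metric is used, so Jaccard is an assumption, as are the reference sets and function names.

```python
# Hypothetical reference data sets held by a knowledge source.
REFERENCE_SETS = {
    "city": {"paris", "london", "tokyo", "berlin"},
    "color": {"red", "green", "blue", "yellow"},
}


def jaccard(a, b):
    """Jaccard similarity of two collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def label_data_set(values, references=REFERENCE_SETS):
    """Return the best-matching category and its similarity score."""
    normalized = {value.strip().lower() for value in values}
    best = max(references, key=lambda name: jaccard(normalized, references[name]))
    return best, jaccard(normalized, references[best])
```

    A caller would typically also apply a minimum-score cutoff before trusting the label, since max always returns some category even when nothing matches well.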

    SPELLING CORRECTION OF EMAIL QUERIES
    7.
    Invention application
    Status: under examination (published)

    Publication No.: WO2016032866A1

    Publication date: 2016-03-03

    Application No.: PCT/US2015/046194

    Filing date: 2015-08-21

    Abstract: Techniques and constructs to facilitate spelling correction of email queries can leverage features of email data to obtain candidate corrections particular to the email data being queried. The constructs may enable accurate spelling correction of email queries across languages and domains based on, for example, one or more of a language model such as a bigram language model and/or a normalized token IDF based language model, a translation model such as an edit distance translation model and/or a fuzzy match translation model, content-based features, and/or contextual features. Content-based features can include features associated with the subject line of emails, content including identified phrases, contacts, and/or the number of candidate emails returned. Contextual features can include a time window of subject match and/or contact match, a frequency of emails received from a contact, and/or device characteristics.
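
    Two of the ingredients named above, a bigram language model and an edit distance translation model, can be combined in a toy corrector like the one below. This is a sketch under assumptions: candidates come only from the vocabulary of the user's own emails, the add-one smoothing, the scoring formula (log-probability minus edit distance), and the class name are all hypothetical, and the content-based and contextual features are left out.

```python
import math
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + (char_a != char_b)  # substitution
            ))
        previous = current
    return previous[-1]


class EmailSpeller:
    """Toy corrector whose vocabulary comes from the emails being queried."""

    def __init__(self, emails):
        tokenized = [email.lower().split() for email in emails]
        self.unigrams = Counter(t for tokens in tokenized for t in tokens)
        self.bigrams = Counter(
            pair for tokens in tokenized for pair in zip(tokens, tokens[1:])
        )

    def bigram_logprob(self, previous, word):
        """Add-one smoothed log P(word | previous)."""
        vocabulary = len(self.unigrams)
        count = self.bigrams[(previous, word)] + 1
        return math.log(count / (self.unigrams[previous] + vocabulary))

    def correct(self, query, max_distance=2):
        corrected = []
        for word in query.lower().split():
            candidates = [
                c for c in self.unigrams
                if edit_distance(word, c) <= max_distance
            ]
            if word in self.unigrams or not candidates:
                corrected.append(word)  # known word, or nothing close enough
                continue
            previous = corrected[-1] if corrected else None

            def score(candidate):
                if previous is not None:
                    lm = self.bigram_logprob(previous, candidate)
                else:
                    lm = math.log(self.unigrams[candidate])
                return lm - edit_distance(word, candidate)

            corrected.append(max(candidates, key=score))
        return " ".join(corrected)
```

    Drawing candidates from the mailbox itself is what makes the corrections particular to the email data being queried, for example names and project terms a general dictionary would not contain.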

    SECURE INFORMATION RETRIEVAL BASED ON HASH TRANSFORMS
    8.
    Invention application
    Status: under examination (published)

    Publication No.: WO2016032503A1

    Publication date: 2016-03-03

    Application No.: PCT/US2014/053333

    Filing date: 2014-08-29

    Abstract: Secure information retrieval is disclosed. One example is a system including an information retriever comprising a collection of nodes that receive a hash count from a first dataset, the first dataset including a first data term, and provide the hash count to a second dataset, the second dataset including a plurality of second data terms. A hash transformer transforms the data terms based on the hash count. A modifier modifies, for a given node, the transformed data terms. An evaluator evaluates, for each node, a similarity value between the first data term and each given second data term based on shared data elements between the modified first data term and a given modified second data term associated with the given second data term. The information retriever provides to the first dataset at least one term identifier associated with a second data term.

    METHOD AND APPARATUS FOR RECOMMENDING NETWORK SERVICE
    9.
    Invention application
    Status: under examination (published)

    Publication No.: WO2015176656A1

    Publication date: 2015-11-26

    Application No.: PCT/CN2015/079358

    Filing date: 2015-05-20

    Abstract: A method and an apparatus for recommending music are provided. The method includes: acquiring a historical browsing record of each user account on a network service (101); establishing a browsing sequence of each user account according to the historical browsing record corresponding to each user account (102); mapping the browsing sequence of each user account to a mapping value (103); aggregating all user accounts according to the mapping value corresponding to each user account, to obtain at least one user account group (104); and recommending the network service to each user account based on a user account group to which the user account belongs (105). The method improves an accuracy rate of whether a recommended network service satisfies an interest of a user in the network service.
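
    Steps 101 to 105 can be sketched as below. The abstract does not say how a browsing sequence is mapped to a mapping value, so hashing the first two items of the sequence is purely an assumption, as are the popularity-based recommendation rule and every function name.

```python
import hashlib
from collections import Counter, defaultdict


def browsing_sequence(history):
    """Steps 101-102: order (timestamp, item) records by time."""
    return tuple(item for _, item in sorted(history))


def mapping_value(sequence, prefix_len=2):
    """Step 103: map a sequence to a value (assumed: hash of its prefix)."""
    key = "|".join(sequence[:prefix_len])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def recommend(histories, top_n=1):
    """Steps 104-105: group accounts by mapping value, then recommend the
    group's most browsed items that the account has not seen yet."""
    sequences = {user: browsing_sequence(h) for user, h in histories.items()}
    groups = defaultdict(list)
    for user, sequence in sequences.items():
        groups[mapping_value(sequence)].append(user)

    recommendations = {}
    for members in groups.values():
        popularity = Counter(
            item for user in members for item in sequences[user]
        )
        for user in members:
            seen = set(sequences[user])
            unseen = [i for i, _ in popularity.most_common() if i not in seen]
            recommendations[user] = unseen[:top_n]
    return recommendations
```

    Accounts whose sequences start the same way land in the same group and receive each other's items; an account alone in its group gets an empty list.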

    POLYGON-BASED INDEXING OF PLACES
    10.
    Invention application
    Status: under examination (published)

    Publication No.: WO2015142369A1

    Publication date: 2015-09-24

    Application No.: PCT/US2014/035386

    Filing date: 2014-04-25

    Applicant: FACEBOOK, INC.

    CPC classification number: G06F17/30622 G06F17/30241 G06F17/30675

    Abstract: In one embodiment, a method includes receiving an identification of a location. The method further includes accessing an inverted index that comprises a plurality of records, where each record corresponds to a map tile and identifies one or more places corresponding to the map tile. At least one of the places identified in the inverted index is identified in multiple records corresponding to multiple map tiles, where the map tiles collectively define an area that circumscribes the place. The method also includes identifying, based on the inverted index, one or more places associated with the location.
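
    A rough sketch of such a tile-keyed inverted index, with several simplifications that are assumptions rather than part of the abstract: tiles are a fixed 0.25-degree grid, a place's covering tiles are taken from the bounding box of its polygon, and all names are hypothetical.

```python
import math
from collections import defaultdict

TILE_SIZE = 0.25  # degrees per tile; an assumed grid resolution


def tile_of(lat, lon):
    """Grid tile (row, column) containing a coordinate."""
    return (math.floor(lat / TILE_SIZE), math.floor(lon / TILE_SIZE))


def covering_tiles(polygon):
    """Tiles covering the polygon's bounding box, which circumscribes the place."""
    lats = [lat for lat, _ in polygon]
    lons = [lon for _, lon in polygon]
    row0, col0 = tile_of(min(lats), min(lons))
    row1, col1 = tile_of(max(lats), max(lons))
    return [
        (row, col)
        for row in range(row0, row1 + 1)
        for col in range(col0, col1 + 1)
    ]


def build_index(places):
    """Inverted index: one record per tile, listing the places on that tile."""
    index = defaultdict(set)
    for name, polygon in places.items():
        for tile in covering_tiles(polygon):
            index[tile].add(name)
    return index


def places_at(index, lat, lon):
    """Candidate places for a location, looked up by its tile."""
    return index.get(tile_of(lat, lon), set())
```

    A large place appears in several tile records, so a single tile lookup finds it from anywhere inside its area. The lookup returns candidates only; an exact point-in-polygon test would be a separate follow-up step.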
