Patent search ap:"Sachindra Joshi" Page 3

21.

Granted invention patent
Clustering a collection using an inverted index of features
Status: Active

Publication (announcement) number: US10083230B2

Publication (announcement) date: 2018-09-25

Application number: US12966698

Application date: 2010-12-13

Applicant: Danish Contractor , Thomas Hampp-Bahnmueller , Sachindra Joshi , Raghuram Krishnapuram , Kenney Ng

Inventor: Danish Contractor , Thomas Hampp-Bahnmueller , Sachindra Joshi , Raghuram Krishnapuram , Kenney Ng

IPC: G06F7/00 , G06F17/30

CPC classification number: G06F16/355 , G06F16/285 , G06F16/319

Abstract: Provided are techniques for creating an inverted index for features of a set of data elements, wherein each of the data elements is represented by a vector of features, wherein the inverted index, when queried with a feature, outputs one or more data elements containing the feature. The features of the set of data elements are ranked. For each feature in the ranked list, the inverted index is queried for data elements having the feature and not having any previously selected feature and a cluster of the data elements is created based on results returned in response to the query.
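The procedure in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how features are ranked, so the sketch ranks them by how many data elements contain them, and the function names and sample data are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(elements):
    """Map each feature to the set of element ids whose feature set contains it."""
    index = defaultdict(set)
    for elem_id, features in elements.items():
        for feature in features:
            index[feature].add(elem_id)
    return index

def cluster_by_ranked_features(elements):
    """Create one cluster per ranked feature from the elements not yet covered."""
    index = build_inverted_index(elements)
    # Assumption: rank by descending element count, ties broken alphabetically.
    ranked = sorted(index, key=lambda f: (-len(index[f]), f))
    clusters = {}
    covered = set()  # elements that contain some previously selected feature
    for feature in ranked:
        # Query: elements having this feature and no previously selected feature.
        members = index[feature] - covered
        if members:
            clusters[feature] = sorted(members)
            covered |= index[feature]
    return clusters

# Hypothetical data: each element is represented by a set of features.
docs = {
    "d1": {"apple", "pie"},
    "d2": {"apple", "tart"},
    "d3": {"pie", "crust"},
    "d4": {"tart"},
}
print(cluster_by_ranked_features(docs))
```

Because every element lands in the cluster of the highest-ranked feature it contains, the clusters are disjoint by construction.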

22.

Granted invention patent
Systems and methods for standardization and de-duplication of addresses using taxonomy
Status: Active

Publication (announcement) number: US09697301B2

Publication (announcement) date: 2017-07-04

Application number: US12859607

Application date: 2010-08-19

Applicant: Tanveer Afzal Faruquie , Sachindra Joshi , Hima Prasad Karanam , Mukesh Kumar Mohania , Sriram K. Padmanabhan , L. Venkata Subramaniam

Inventor: Tanveer Afzal Faruquie , Sachindra Joshi , Hima Prasad Karanam , Mukesh Kumar Mohania , Sriram K. Padmanabhan , L. Venkata Subramaniam

IPC: G06F17/30

CPC classification number: G06F17/30961

Abstract: Systems and associated methods for address standardization and applications related thereto are described. Embodiments exploit a common context in a taxonomy and a given address to detect and correct deviations in the address. Embodiments establish a possible path from a root of the taxonomy to a leaf in the taxonomy that can possibly generate a given address. Given a new address, embodiments use complete addresses, and/or segments or elements thereof, to compute the representations of the elements and find a closest matching leaf in the taxonomy. Embodiments then traverse the path to a root node to detect the agreement and disagreement between the path and the address entry. Taxonomical structure is thus used to detect, segregate and standardize the expected fields.
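A rough sketch of the leaf-matching and path-traversal idea follows. The abstract does not specify how element representations are computed, so this example swaps in plain string similarity from Python's `difflib`; the toy taxonomy, the field names, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical toy taxonomy stored as child -> parent (None marks the root).
PARENT = {
    "India": None,
    "Karnataka": "India",
    "Bangalore": "Karnataka",
    "Indiranagar": "Bangalore",
    "Koramangala": "Bangalore",
    "Maharashtra": "India",
    "Mumbai": "Maharashtra",
    "Andheri": "Mumbai",
}
LEVELS = ["country", "state", "city", "locality"]  # root-to-leaf field names

def similarity(a, b):
    """Assumed element comparison: case-insensitive difflib ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_score(name, tokens):
    """Highest similarity between a taxonomy name and any address token."""
    return max(similarity(name, t) for t in tokens)

def standardize(address, threshold=0.8):
    tokens = [t.strip(",.") for t in address.split()]
    leaves = set(PARENT) - set(PARENT.values())
    # Find the leaf that best matches some token of the address.
    leaf = max(leaves, key=lambda name: best_score(name, tokens))
    # Walk from that leaf up to the root, then reverse to root-to-leaf order.
    path, node = [], leaf
    while node is not None:
        path.append(node)
        node = PARENT[node]
    path.reverse()
    fields, disagreements = {}, []
    for level, name in zip(LEVELS, path):
        fields[level] = name  # standardized value taken from the taxonomy
        if best_score(name, tokens) < threshold:
            disagreements.append(level)  # missing or deviating in the address
    return fields, disagreements

fields, disagreements = standardize("12 Main Road, Indranagar, Bangalor")
print(fields)
print(disagreements)
```

In this sketch the misspelled locality and city still agree with the taxonomy path, while the country and state are reported as disagreements because the address omits them; the taxonomy path supplies the standardized values for all four fields.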

23.

Granted invention patent
Cross-domain clusterability evaluation for cross-guided data clustering based on alignment between data domains
Status: Lapsed

Publication (announcement) number: US08661039B2

Publication (announcement) date: 2014-02-25

Application number: US13437287

Application date: 2012-04-02

Applicant: Jeffrey M. Achtermann , Indrajit Bhattacharya , Kevin W. English, Jr. , Shantanu R. Godbole , Sachindra Joshi , Ashwin Srinivasan , Ashish Verma

Inventor: Jeffrey M. Achtermann , Indrajit Bhattacharya , Kevin W. English, Jr. , Shantanu R. Godbole , Sachindra Joshi , Ashwin Srinivasan , Ashish Verma

IPC: G06F17/30

CPC classification number: G06F17/30598 , G06F17/3071 , G06F17/30864

Abstract: A process for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.
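The arithmetic described in this abstract can be illustrated as follows. The abstract only says "linear combination" with a trade-off parameter, so weighting the two terms by `tradeoff` and `1 - tradeoff` is an assumption, as are the function and parameter names. How the individual clusterability and matchability scores are obtained is not stated, so they are taken as inputs.

```python
def pair_matchability(target_side, source_side):
    """Average of the target-side and source-side matchability scores."""
    return (target_side + source_side) / 2.0

def cross_domain_clusterability(target_clusterability, target_side,
                                source_side, tradeoff=0.5):
    """Combine target clusterability with source-target pair matchability.

    Assumption: the trade-off parameter lies in [0, 1] and weights the two
    terms as tradeoff and (1 - tradeoff).
    """
    if not 0.0 <= tradeoff <= 1.0:
        raise ValueError("tradeoff must be between 0 and 1")
    matchability = pair_matchability(target_side, source_side)
    return tradeoff * target_clusterability + (1.0 - tradeoff) * matchability

# Hypothetical scores: 0.25 * 0.8 + 0.75 * ((0.6 + 0.4) / 2)
print(cross_domain_clusterability(0.8, 0.6, 0.4, tradeoff=0.25))
```

With `tradeoff=1.0` the score depends only on the target domain, and with `tradeoff=0.0` only on how well the two domains' centroids align.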

24.

Granted invention patent
Cross-domain clusterability evaluation for cross-guided data clustering based on alignment between data domains
Status: Lapsed

Publication (announcement) number: US08655884B2

Publication (announcement) date: 2014-02-18

Application number: US13434105

Application date: 2012-03-29

Applicant: Jeffrey M. Achtermann , Indrajit Bhattacharya , Kevin W. English, Jr. , Shantanu R. Godbole , Sachindra Joshi , Ashwin Srinivasan , Ashish Verma

Inventor: Jeffrey M. Achtermann , Indrajit Bhattacharya , Kevin W. English, Jr. , Shantanu R. Godbole , Sachindra Joshi , Ashwin Srinivasan , Ashish Verma

IPC: G06F17/30

CPC classification number: G06F17/30598 , G06F17/3071 , G06F17/30864

Abstract: A computer system for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.

25.

Invention patent application
ENHANCING POSTED CONTENT IN DISCUSSION FORUMS
Status: Under examination (published)

Publication (announcement) number: US20140006524A1

Publication (announcement) date: 2014-01-02

Application number: US13600856

Application date: 2012-08-31

Applicant: Amit K. Singh , Rose Catherine Kanjirathinkal , Sachindra Joshi , Ankur Gandhe , Karthik Vesweswariah

Inventor: Amit K. Singh , Rose Catherine Kanjirathinkal , Sachindra Joshi , Ankur Gandhe , Karthik Vesweswariah

IPC: G06F15/16

CPC classification number: G09B7/02

Abstract: Methods and arrangements for enhancing content in discussion forums. Access to an online discussion is provided. A posting by an author participating in the discussion is accepted, and a recommendation is automatically produced for the author for amending the posting to increase the likelihood of response to the posting by other individuals participating in the discussion.

26.

Invention patent application
SYSTEMS AND METHODS FOR STANDARDIZATION AND DE-DUPLICATION OF ADDRESSES USING TAXONOMY
Status: Active

Publication (announcement) number: US20120047179A1

Publication (announcement) date: 2012-02-23

Application number: US12859607

Application date: 2010-08-19

Applicant: Tanveer Afzal Faruquie , Sachindra Joshi , Hima Prasad Karanam , Mukesh Kumar Mohania , Sriram K. Padmanabhan , L. Venkata Subramaniam

Inventor: Tanveer Afzal Faruquie , Sachindra Joshi , Hima Prasad Karanam , Mukesh Kumar Mohania , Sriram K. Padmanabhan , L. Venkata Subramaniam

IPC: G06F7/00 , G06F17/30

CPC classification number: G06F17/30961

Abstract: Systems and associated methods for address standardization and applications related thereto are described. Embodiments exploit a common context in a taxonomy and a given address to detect and correct deviations in the address. Embodiments establish a possible path from a root of the taxonomy to a leaf in the taxonomy that can possibly generate a given address. Given a new address, embodiments use complete addresses, and/or segments or elements thereof, to compute the representations of the elements and find a closest matching leaf in the taxonomy. Embodiments then traverse the path to a root node to detect the agreement and disagreement between the path and the address entry. Taxonomical structure is thus used to detect, segregate and standardize the expected fields.

27.

Invention patent application
DYNAMICALLY DETECTING NEAR-DUPLICATE DOCUMENTS
Status: Active

Publication (announcement) number: US20110029491A1

Publication (announcement) date: 2011-02-03

Application number: US12511175

Application date: 2009-07-29

Applicant: Sachindra Joshi , Kenney Ng , Sandeep Singh

Inventor: Sachindra Joshi , Kenney Ng , Sandeep Singh

IPC: G06F17/30

CPC classification number: G06F17/30675

Abstract: Techniques for detecting one or more documents that are duplicate or near-duplicate of a first document are provided. The techniques include obtaining a first document, obtaining one or more additional documents, retrieving a set of one or more document signatures for each document, and detecting one or more documents that are duplicate or near-duplicate of the first document by detecting each of the one or more additional documents that have at least a minimum number of signatures in common with the first document, wherein detecting each of the one or more additional documents that have at least a minimum number of signatures in common with the first document comprises dynamically using at least one of a user-configurable similarity definition and a user-configurable similarity threshold value.
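The shared-signature test in this abstract can be sketched as below. The abstract does not say what a document signature is, so the example assumes hashed word shingles; `shingle_size` and `min_shared` are hypothetical stand-ins for the user-configurable similarity definition and threshold.

```python
import hashlib

def signatures(text, shingle_size=3):
    """Assumed signature scheme: SHA-1 hashes of overlapping word shingles."""
    words = text.lower().split()
    last_start = max(1, len(words) - shingle_size + 1)
    shingles = {" ".join(words[i:i + shingle_size]) for i in range(last_start)}
    return {hashlib.sha1(s.encode("utf-8")).hexdigest() for s in shingles}

def find_near_duplicates(first_doc, other_docs, min_shared=8, shingle_size=3):
    """Return names of documents sharing at least min_shared signatures."""
    first_sigs = signatures(first_doc, shingle_size)
    matches = []
    for name, text in other_docs.items():
        shared = len(first_sigs & signatures(text, shingle_size))
        if shared >= min_shared:
            matches.append(name)
    return matches

# Hypothetical documents for illustration.
first = "the quick brown fox jumps over the lazy dog near the river bank"
others = {
    "copy": first,
    "edited": "the quick brown fox jumps over the lazy dog near the river shore",
    "unrelated": "completely different text about patent filing deadlines and fees",
}
print(find_near_duplicates(first, others, min_shared=8))
```

Lowering `min_shared` makes the check more permissive and raising it makes it stricter, which is one way the threshold could be adjusted at query time without recomputing any signatures.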

28.

Invention patent application
Methods, apparatus and computer programs for characterizing web resources
Status: Lapsed

Publication (announcement) number: US20060026496A1

Publication (announcement) date: 2006-02-02

Application number: US10901275

Application date: 2004-07-28

Applicant: Sachindra Joshi , Raghuram Krishnapuram , Shourya Roy

Inventor: Sachindra Joshi , Raghuram Krishnapuram , Shourya Roy

IPC: G06F17/21

CPC classification number: G06F17/30864 , G06F17/30896

Abstract: Methods, apparatus and computer programs are provided for characterizing Web-based information resources based on their interactions. A Web-based information resource is a single Web document or a collection of related Web documents. Unlike simple text documents, Web documents contain hyperlinks and other HTML tags. Different types of interactions, including inbound hyperlinks, outbound hyperlinks and internal links associated with a Web-based information resource, are used to characterize the Web-based information resource. A DOM tree representing the tag structure of a Web-based information resource is used to identify text items likely to be useful as context for a hyperlink anchor text, and the anchor text is combined with the context to generate a representation. The representation of Web-based information resources based on interactions can be used for clustering and classification, and in Web mining applications such as query disambiguation and automatic taxonomy generation.
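To illustrate combining anchor text with nearby context, the sketch below pairs each hyperlink with the text of its enclosing block element. It approximates a DOM walk with Python's streaming `html.parser` rather than building a tree, and the choice of `BLOCK_TAGS` as context containers is an assumption, not something the abstract specifies.

```python
from html.parser import HTMLParser

# Assumption: these tags delimit the text that serves as context for a link.
BLOCK_TAGS = {"p", "li", "td", "div", "h1", "h2", "h3"}

class AnchorContextParser(HTMLParser):
    """Collect (href, anchor text, context) triples from an HTML string."""

    def __init__(self):
        super().__init__()
        self.block_text = []     # text pieces seen in the current block
        self.anchor_text = None  # text pieces of the currently open <a>
        self.href = None
        self.pending = []        # anchors waiting for their block to end
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.anchor_text = []

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_text is not None:
            self.pending.append((self.href, " ".join(self.anchor_text)))
            self.anchor_text = None
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        self.block_text.append(text)
        if self.anchor_text is not None:
            self.anchor_text.append(text)

    def _flush(self):
        # Attach the finished block's full text to every anchor inside it.
        context = " ".join(self.block_text)
        for href, anchor in self.pending:
            self.results.append((href, anchor, context))
        self.block_text, self.pending = [], []

    def close(self):
        super().close()
        self._flush()

page = ('<p>Read the <a href="/spec">spec</a> for details.</p>'
        '<ul><li>See <a href="/faq">FAQ</a></li></ul>')
parser = AnchorContextParser()
parser.feed(page)
parser.close()
print(parser.results)
```

The resulting triples could then be turned into feature vectors for clustering or classification; that step is left out here because the abstract gives no detail on it.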

29.

Invention patent application
Determining structural similarity in semi-structured documents
Status: Active

Publication (announcement) number: US20050038785A1

Publication (announcement) date: 2005-02-17

Application number: US10629133

Application date: 2003-07-29

Applicant: Neeraj Agrawal , Sachindra Joshi , Raghuram Krishnapuram , Sumit Negi

Inventor: Neeraj Agrawal , Sachindra Joshi , Raghuram Krishnapuram , Sumit Negi

IPC: G06F17/22 , G06F17/30

CPC classification number: G06F17/30911 , G06F17/2211 , G06F17/2247 , Y10S707/99932 , Y10S707/99933 , Y10S707/99936 , Y10S707/99942

Abstract: Documents are represented based on their structure, which arises from the relationship between various elements in the document. After representing documents based on their structure in vector form, a method of measuring similarity between vectors is used to obtain the measure of structural similarity between two given documents.
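One way to picture this is the sketch below, which is not taken from the patent. It assumes the structure vector counts root-to-element tag paths in an XML document and that cosine similarity is the vector measure; the abstract names neither choice, and all function names are hypothetical.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

def structure_vector(xml_text):
    """Count root-to-element tag paths; text content is ignored."""
    counts = Counter()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        counts[path] += 1
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def structural_similarity(doc_a, doc_b):
    return cosine_similarity(structure_vector(doc_a), structure_vector(doc_b))

# Hypothetical documents: same structure, different text.
book_a = "<book><title>A</title><author>X</author></book>"
book_b = "<book><title>B</title><author>Y</author></book>"
memo = "<memo><to>Z</to></memo>"
print(structural_similarity(book_a, book_b))
print(structural_similarity(book_a, memo))
```

Two documents with identical tag layouts score close to 1 regardless of their text, while documents that share no tag paths score 0.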

30.

Granted invention patent
Dynamically detecting near-duplicate documents
Status: Active

Publication (announcement) number: US09245007B2

Publication (announcement) date: 2016-01-26

Application number: US12511175

Application date: 2009-07-29

Applicant: Sachindra Joshi , Kenney Ng , Sandeep Singh

Inventor: Sachindra Joshi , Kenney Ng , Sandeep Singh

IPC: G06F17/30

CPC classification number: G06F17/30675

Abstract: Techniques for detecting one or more documents that are duplicate or near-duplicate of a first document are provided. The techniques include obtaining a first document, obtaining one or more additional documents, retrieving a set of one or more document signatures for each document, and detecting one or more documents that are duplicate or near-duplicate of the first document by detecting each of the one or more additional documents that have at least a minimum number of signatures in common with the first document, wherein detecting each of the one or more additional documents that have at least a minimum number of signatures in common with the first document comprises dynamically using at least one of a user-configurable similarity definition and a user-configurable similarity threshold value.
