Invention Application
- Patent Title: CLUSTERING HIGH DIMENSIONAL DATA USING GAUSSIAN MIXTURE COPULA MODEL WITH LASSO BASED REGULARIZATION
- Application No.: US15093302
- Application Date: 2016-04-07
- Publication No.: US20170293856A1
- Publication Date: 2017-10-12
- Inventor: Sakyajit Bhattacharya, Vaibhav Rajan, Asim Anand
- Applicant: Xerox Corporation
- Assignee: Xerox Corporation
- Current Assignee: Xerox Corporation
- Main IPC: G06N99/00
- IPC: G06N99/00; G06F17/30

Abstract:
LASSO constraints can lead to a Gaussian mixture copula model (GMCM) that is more robust, better conditioned, and more reflective of the actual clusters in the training data. These qualities of the GMCM have been shown with data obtained from: digital images of fine needle aspirates of breast tissue for detecting cancer; email for detecting spam; two-dimensional terrain data for detecting hills and valleys; and video sequences of hand movements for detecting gestures. Using training data, a GMCM estimate can be produced and iteratively refined to maximize a penalized log likelihood estimate until sequential iterations are within a threshold value of one another. The GMCM estimate can then be used to classify further samples. The LASSO constraints help keep the analysis tractable, so that results can be obtained and used while they are still useful.
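
The refinement loop described in the abstract (iterate, evaluate a penalized log likelihood, stop when successive iterations agree to within a threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: the GMCM is replaced by a toy one-parameter model (the mean of a unit-variance Gaussian), the LASSO (L1) penalty is handled by proximal gradient ascent with soft thresholding, and the names `refine`, `lam`, `tol`, and `max_iter` are hypothetical.

```python
from typing import List, Tuple


def soft_threshold(value: float, cutoff: float) -> float:
    """Shrink value toward zero by cutoff (proximal operator of the L1 norm)."""
    if value > cutoff:
        return value - cutoff
    if value < -cutoff:
        return value + cutoff
    return 0.0


def penalized_log_likelihood(mu: float, data: List[float], lam: float) -> float:
    """Unit-variance Gaussian log likelihood (constants dropped) minus an L1 penalty."""
    fit = -0.5 * sum((x - mu) ** 2 for x in data)
    return fit - lam * abs(mu)


def refine(data: List[float], lam: float, tol: float = 1e-8,
           max_iter: int = 1000) -> Tuple[float, int]:
    """Refine the estimate until successive penalized log likelihoods differ by < tol.

    Toy stand-in for the GMCM parameter update (an assumption of this sketch).
    Returns the final estimate and the number of iterations performed.
    """
    n = len(data)
    step = 0.5 / n  # conservative step size; the smooth part has curvature n
    mu = 0.0
    previous = penalized_log_likelihood(mu, data, lam)
    for iteration in range(1, max_iter + 1):
        gradient = sum(x - mu for x in data)  # gradient of the unpenalized fit
        mu = soft_threshold(mu + step * gradient, step * lam)
        current = penalized_log_likelihood(mu, data, lam)
        if abs(current - previous) < tol:  # threshold-based stopping rule
            return mu, iteration
        previous = current
    return mu, max_iter


if __name__ == "__main__":
    estimate, iterations = refine([1.0, 2.0, 3.0], lam=1.5)
    print(f"estimate={estimate:.4f} after {iterations} iterations")
```

In this toy setting the penalty shrinks the estimate toward zero and sets it exactly to zero when the sample mean is small relative to `lam`. That sparsity-inducing behavior is the general reason L1 penalties are used to keep high-dimensional estimates well conditioned.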