Abstract:
Techniques are presented for analyzing audio-video segments, usually from multiple sources. A combined similarity measure is determined from text similarities and video similarities. The text and video similarities measure similarity between audio-video scenes for text and video, respectively. The combined similarity measure is then used to determine similar scenes in the audio-video segments. When the audio-video segments are from multiple audio-video sources, the similar scenes are common scenes in the audio-video segments. Similarities may be converted to or measured by distance. Distance matrices may be determined by using the similarity matrices. The text and video distance matrices are normalized before the combined similarity matrix is determined. Clustering is performed using distance values determined from the combined similarity matrix. Resulting clusters are examined and a cluster is considered to represent a common scene between two or more different audio-video segments when scenes in the cluster are similar.
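The pipeline described above (normalize the text and video distance matrices, combine them, cluster, then keep clusters that span more than one source) can be sketched roughly as follows. This is a minimal illustration, not the method of the abstract: the max-based normalization, the weighted-sum combination, the single-linkage threshold clustering, and all function and variable names are assumptions introduced here.

```python
def normalize(matrix):
    """Scale every entry into [0, 1] by dividing by the largest entry (assumed scheme)."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [row[:] for row in matrix]
    return [[value / peak for value in row] for row in matrix]


def combine(text_dist, video_dist, text_weight=0.5):
    """Weighted sum of the two normalized distance matrices (assumed combination rule)."""
    t, v = normalize(text_dist), normalize(video_dist)
    n = len(t)
    return [[text_weight * t[i][j] + (1 - text_weight) * v[i][j]
             for j in range(n)] for i in range(n)]


def cluster(dist, threshold):
    """Single-linkage grouping: scenes closer than `threshold` end up in one cluster."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def common_scenes(clusters, source_of):
    """Keep only clusters whose scenes come from at least two different sources."""
    return [c for c in clusters if len({source_of[i] for i in c}) >= 2]


# Hypothetical data: four scenes, the first two from source A, the last two from B.
text_dist = [[0, 8, 1, 9], [8, 0, 9, 7], [1, 9, 0, 8], [9, 7, 8, 0]]
video_dist = [[0.0, 0.9, 0.2, 0.8], [0.9, 0.0, 0.7, 0.6],
              [0.2, 0.7, 0.0, 0.9], [0.8, 0.6, 0.9, 0.0]]
clusters = cluster(combine(text_dist, video_dist), threshold=0.3)
print(common_scenes(clusters, ["A", "A", "B", "B"]))
```

Normalizing before combining keeps one modality from dominating simply because its raw distances sit on a larger scale, which is presumably why the abstract calls out that step.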
Abstract:
Systems and methods for risk factor identification include identifying a first set of risk factors from personal data. A second set of risk factors is identified from at least one of a user input and a knowledge source. The first set is combined with the second set, using a processor, by selecting a number of risk factors from the first set that augment the second set of risk factors to determine a combined list of risk factors that predict a condition of interest.
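One way to picture the combination step is the sketch below, which augments a knowledge-based list with the top-scoring data-driven factors that are not already on it. The relevance scores, the top-`k` selection rule, and every name in the code are assumptions made for illustration; the abstract does not say how the augmenting factors are chosen.

```python
def combine_risk_factors(data_driven, knowledge_based, k):
    """Append the k best-scoring data-driven factors that are not already known.

    data_driven: dict mapping factor name -> relevance score (hypothetical scores).
    knowledge_based: list of factor names from user input or a knowledge source.
    """
    known = set(knowledge_based)
    ranked = sorted(data_driven.items(), key=lambda item: item[1], reverse=True)
    extras = [name for name, _ in ranked if name not in known][:k]
    return list(knowledge_based) + extras


# Hypothetical example values.
from_data = {"age": 0.9, "smoking": 0.8, "bmi": 0.7, "sodium": 0.4}
from_knowledge = ["age", "smoking"]
print(combine_risk_factors(from_data, from_knowledge, k=1))
```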
Abstract:
Past realization profiles can be used to predict future realization profiles using a similarity rubric that emphasizes relationships between the past realization profiles. That similarity rubric might involve techniques including manifold characterization of past realization profiles; predictive modeling; and/or matrix factorization. Realization profiles might be related to business projects and track features such as ongoing resource expenditure, revenues realized, or percentage project completion. Realization profiles might relate to other applications such as effectiveness of medical treatment.
Abstract:
The present invention relates to a method, computer program product and system for compressing a probability table and reconstructing one or more probability elements from the compressed data. After determining a probability table that is to be compressed, the probability table is compressed using a first probability table compression method, which creates a first compressed probability table. The first compressed probability table contains a plurality of probability elements. Further, the probability table is compressed using a second probability table compression method, which creates a second compressed probability table. The second compressed probability table also contains a plurality of probability elements. A first probability element reconstructed using the first compressed probability table is thereafter merged with a second probability element reconstructed using the second compressed probability table in order to produce a merged probability element.
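The two-compression-then-merge idea can be illustrated with the toy sketch below. The abstract does not name the compression methods or the merge rule, so everything concrete here is an assumption: uniform quantization as the first method, "keep the top k entries plus a shared default" as the second, and a weighted average as the merge.

```python
def compress_quantized(table, levels):
    """First (assumed) method: store each probability as an integer code in 0..levels."""
    codes = {key: int(p * levels + 0.5) for key, p in table.items()}
    return codes, levels


def reconstruct_quantized(compressed, key):
    codes, levels = compressed
    return codes[key] / levels


def compress_top_k(table, k):
    """Second (assumed) method: keep the k largest entries, one mean value for the rest."""
    ranked = sorted(table.items(), key=lambda item: item[1], reverse=True)
    kept = dict(ranked[:k])
    rest = [p for _, p in ranked[k:]]
    default = sum(rest) / len(rest) if rest else 0.0
    return kept, default


def reconstruct_top_k(compressed, key):
    kept, default = compressed
    return kept.get(key, default)


def merged_element(first, second, key, weight=0.5):
    """Merge the two reconstructions with a weighted average (assumed merge rule)."""
    p1 = reconstruct_quantized(first, key)
    p2 = reconstruct_top_k(second, key)
    return weight * p1 + (1 - weight) * p2


# Hypothetical four-entry probability table.
table = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
first = compress_quantized(table, levels=4)
second = compress_top_k(table, k=2)
print(merged_element(first, second, "c"))
```

In this toy case the two methods err in different ways on the small entries, so the merged value lands between the two reconstructions.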
Abstract:
Improved techniques are disclosed for adapting signature verification systems to natural signature variations. For example, a technique for adapting a signature verification system to variations in a signature of a user includes the following steps/operations. One or more signature samples are obtained from the user. The one or more obtained signature samples are submitted by the user as part of a regular authentication procedure associated with the signature verification system. A reference set of signature samples for the user is updated through selection of one or more signature samples from the obtained signature samples, such that the updated reference set is usable by the signature verification system for verifying subsequent signature samples attributed to the user. The selection of the one or more signature samples used to update the reference set is conditioned on a false rejection rate of the user when at least one obtained signature sample of the user is authenticated and on an identification check when no obtained signature sample is authenticated.
Abstract:
A method (and structure) for end-to-end workforce management includes identifying sources of data that together reflect data of substantially the entirety of a workforce of an organization, identifying service components related to the workforce, and combining the data sources and service components into an integrated framework to support an end-to-end workforce management cycle.
Abstract:
Document type comparison and classification using layout classification is accomplished by first segmenting a document page into blocks of text and white space. A grid of rows and columns, forming bins, is created on the page to intersect the blocks. Layout information is identified using a unique fixed-length interval vector to represent each row of the segmented document. By computing the Manhattan distance between interval vectors of all rows of two document pages and performing a warping function to determine the row-to-row correspondence, two documents may be compared by their layout. Furthermore, interval vectors may be grouped into N clusters, with a cluster center, defined as the median of the interval vectors of the cluster, replacing each interval vector in its cluster. Using Hidden Markov Models, documents can be compared to document type models comprising rows represented by cluster centers and identified as belonging to one or more document types. In addition, documents stored in a database may be retrieved, deleted, or otherwise managed by type, using their corresponding vector sets, without requiring expensive OCR of the document. Furthermore, based on the classification, it is a simple matter to locate which blocks of data contain certain information. Where only that information is desired, it is not necessary to perform OCR on the entire document. Rather, OCR may be limited to those blocks where the particular information is expected based on the document type.
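The row-matching step can be sketched as a standard dynamic-time-warping recurrence over per-row Manhattan distances. This is only an illustration under assumptions: the abstract does not specify the warping function, and the way rows are encoded here (short lists of integers standing in for interval vectors) and the function names are hypothetical.

```python
def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length interval vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def layout_distance(page_a, page_b):
    """Warped layout distance between two pages.

    Each page is a list of fixed-length interval vectors, one per row.
    A classic dynamic-time-warping table lets one row of a page align with
    several consecutive rows of the other page.
    """
    n, m = len(page_a), len(page_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = manhattan(page_a[i - 1], page_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # row of page_b repeats
                                 cost[i][j - 1],      # row of page_a repeats
                                 cost[i - 1][j - 1])  # one-to-one match
    return cost[n][m]


# Hypothetical pages: page_b repeats the first row of page_a.
page_a = [[0, 2, 0], [1, 1, 1]]
page_b = [[0, 2, 0], [0, 2, 0], [1, 1, 1]]
page_c = [[2, 0, 0], [1, 1, 1]]
print(layout_distance(page_a, page_b), layout_distance(page_a, page_c))
```

The warping is what lets two pages of the same type match even when one has an extra line of the same layout, for example a longer address block.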
Abstract:
Methods and systems for event pattern mining are shown that include representing longitudinal event data in a measurable geometric space as a temporal event matrix representation (TEMR) using spatial temporal shapes, wherein event data is organized into hierarchical categories of event type, and performing temporal event pattern mining with a processor by locating visual event patterns among the spatial temporal shapes of said TEMR using a constraint sparse coding framework.
Abstract:
In a computerized social network, expert and user chat sessions are stored and rated probabilistically. Later user requests for information are met with an expert ranking based on a balance of similarity between expert profile and questions, similarity between expert profile and prior chat sessions, and dynamically updated chat session ratings. New sessions can be rated automatically based on session length and with reference to keywords distilled from past sessions responsive to user ratings.
Abstract:
A system, method and program product are provided for matching members of a population, e.g., patients, based on member similarities. Patients are mapped to a bipartite graph in which patient nodes are connected by weighted edges to factor nodes that are clustered categorically. When a new patient query is received, a similarity measure for each other patient is generated for each cluster by comparing cluster edges. The cluster similarity measures are aggregated for each patient to provide a global closeness measure to every other patient. Based on the global closeness measure, a list of the closest patients is displayed and measurement feedback may be provided.
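A rough sketch of the per-cluster comparison and its aggregation follows. The abstract does not give the similarity formula or the aggregation rule, so this example assumes a weighted-overlap score (sum of minimum edge weights over sum of maximum edge weights) per cluster and a plain mean across clusters; the data layout and all names are hypothetical.

```python
def cluster_similarity(edges_a, edges_b):
    """Weighted overlap of two patients' edges within one factor cluster (assumed measure)."""
    factors = set(edges_a) | set(edges_b)
    shared = sum(min(edges_a.get(f, 0.0), edges_b.get(f, 0.0)) for f in factors)
    total = sum(max(edges_a.get(f, 0.0), edges_b.get(f, 0.0)) for f in factors)
    return shared / total if total else 0.0


def closest_patients(query, patients, top_n):
    """Rank stored patients by mean per-cluster similarity to the query patient.

    query: dict cluster name -> {factor name: edge weight}.
    patients: dict patient id -> same structure as `query`.
    """
    scores = []
    for patient_id, clusters in patients.items():
        names = set(clusters) | set(query)
        if not names:
            continue  # nothing to compare
        total = sum(cluster_similarity(query.get(c, {}), clusters.get(c, {}))
                    for c in names)
        scores.append((patient_id, total / len(names)))
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:top_n]


# Hypothetical patients; edge weights link each patient to factor nodes,
# grouped by factor cluster ("diagnoses", "labs").
patients = {
    "p1": {"diagnoses": {"diabetes": 1.0}, "labs": {"high_glucose": 1.0}},
    "p2": {"diagnoses": {"asthma": 1.0}, "labs": {"high_glucose": 0.5}},
    "p3": {"diagnoses": {"diabetes": 1.0}, "labs": {"high_ldl": 1.0}},
}
query = {"diagnoses": {"diabetes": 1.0}, "labs": {"high_glucose": 1.0}}
print(closest_patients(query, patients, top_n=2))
```

Scoring each cluster separately before aggregating keeps a category with many factors from swamping one with few, which is one plausible reason for clustering the factor nodes categorically.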