Abstract:
Computer-implemented systems and methods are disclosed for identifying and categorizing electronic documents through machine learning. In accordance with some embodiments, a seed set of categorized electronic documents may be used to train a document categorizer based on a machine learning algorithm. The trained document categorizer may categorize electronic documents in a large corpus of electronic documents. Performance metrics associated with the trained document categorizer may be tracked, and additional seed sets of categorized electronic documents may be used to improve the performance of the document categorizer by retraining it on subsequent seed sets. Additional seed sets and categorizations may be iterated through until a desired document categorization performance is reached.
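The train, measure, and retrain loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name a learning algorithm or a metric, so the small naive Bayes learner, the accuracy metric, and every name here (`NaiveBayesCategorizer`, `train_until_target`, `target`) are hypothetical stand-ins, and "retraining" is modeled as accumulating counts from each new seed set.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesCategorizer:
    """Tiny multinomial naive Bayes text categorizer (a stand-in learner)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> documents seen
        self.vocab = set()

    def train(self, seed_set):
        """Fold a seed set of (text, category) pairs into the model."""
        for text, category in seed_set:
            words = text.lower().split()
            self.word_counts[category].update(words)
            self.doc_counts[category] += 1
            self.vocab.update(words)

    def categorize(self, text):
        """Return the most probable category (add-one smoothing)."""
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category in self.doc_counts:
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[category] / total_docs)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best, best_score = category, score
        return best


def accuracy(model, labeled_docs):
    """Performance metric assumed here: fraction categorized correctly."""
    correct = sum(model.categorize(t) == c for t, c in labeled_docs)
    return correct / len(labeled_docs)


def train_until_target(seed_sets, validation, target):
    """Retrain on successive seed sets until the tracked metric hits target."""
    model = NaiveBayesCategorizer()
    history = []  # tracked performance after each seed set
    for seed_set in seed_sets:
        model.train(seed_set)
        history.append(accuracy(model, validation))
        if history[-1] >= target:
            break
    return model, history
```

A caller would pass a list of seed sets, a held-out labeled set, and a target score; `history` then records how the metric moved after each round, which is the "tracked" performance the abstract refers to.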
Abstract:
Embodiments of the present disclosure relate to a data analysis system that may automatically generate memory-efficient clustered data structures, automatically analyze those clustered data structures, and provide results of the automated analysis in an optimized way to an analyst. The automated analysis of the clustered data structures (also referred to herein as data clusters) may include an automated application of various criteria or rules so as to generate a compact, human-readable analysis of the data clusters. The human-readable analyses (also referred to herein as “summaries” or “conclusions”) of the data clusters may be organized into an interactive user interface so as to enable an analyst to quickly navigate among information associated with various data clusters and efficiently evaluate those data clusters in the context of, for example, a fraud investigation. Embodiments of the present disclosure also relate to automated scoring of the clustered data structures.
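As a rough illustration of applying rules to a data cluster to produce short conclusions and a score, consider the sketch below. The abstract does not specify the rules, the cluster layout, or the scoring formula, so the entity dictionaries, the two example rules, and the weights are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    cluster_id: str
    entities: list = field(default_factory=list)  # each entity is a dict


def count_where(cluster, **attrs):
    """Count entities whose attributes match all given key/value pairs."""
    return sum(
        all(entity.get(k) == v for k, v in attrs.items())
        for entity in cluster.entities
    )


# Hypothetical rules: (label, measurement function, score weight per hit).
RULES = [
    ("flagged account(s)",
     lambda c: count_where(c, type="account", flagged=True), 5.0),
    ("phone number(s)",
     lambda c: count_where(c, type="phone"), 1.0),
]


def analyze(cluster, rules=RULES):
    """Return (human-readable conclusions, numeric score) for one cluster."""
    conclusions, score = [], 0.0
    for label, measure, weight in rules:
        n = measure(cluster)
        if n:
            conclusions.append(f"{n} {label}")
            score += weight * n
    return conclusions, score


def rank(clusters, rules=RULES):
    """Order clusters so the highest-scoring ones are reviewed first."""
    return sorted(clusters, key=lambda c: analyze(c, rules)[1], reverse=True)
```

An interface layer could then list clusters in `rank` order and show each cluster's conclusions next to its score, so that an analyst sees the most suspicious clusters first.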
Abstract:
Systems and methods are disclosed for active column filtering. In accordance with one implementation, a method is provided for active column filtering. The method includes providing a table having data values arranged in rows and columns, providing a first filter location indicator whose location is visually associated with a first column, and providing a first interface based on a selection of the first filter location indicator, wherein the first interface's location is visually associated with the first column. The method also includes acquiring a first filter input entered into the first interface, filtering the table based on the acquired first filter input, providing the filtered table for displaying, and providing an applied filter indicator, whose location is visually associated with the first column, the applied filter indicator including at least the first filter input.
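The data side of per-column filtering can be sketched as below, leaving out the visual placement of the indicators. The class name `FilterableTable`, its methods, and the case-insensitive substring match are assumptions for illustration; the abstract does not say how a filter input is matched against cell values.

```python
class FilterableTable:
    """Table with one optional text filter per column."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = list(rows)   # each row is a dict keyed by column name
        self.filters = {}        # column -> filter input currently applied

    def set_filter(self, column, text):
        """Record the filter input acquired from a column's interface."""
        if column not in self.columns:
            raise KeyError(f"unknown column: {column}")
        self.filters[column] = text

    def clear_filter(self, column):
        self.filters.pop(column, None)

    def filtered_rows(self):
        """Rows matching every active filter (substring match, assumed)."""
        return [
            row for row in self.rows
            if all(text.lower() in str(row[col]).lower()
                   for col, text in self.filters.items())
        ]

    def applied_filter_indicators(self):
        """Per-column indicator content: the filter input, or None."""
        return {col: self.filters.get(col) for col in self.columns}
```

A renderer could draw each non-`None` entry of `applied_filter_indicators()` next to its column header, which keeps the indicator visually tied to the column it filters.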
Abstract:
This disclosure relates to a system and method for data analysis. According to a first aspect, there is described a method, the method being performed using one or more processors, comprising: receiving one or more user inputs indicative of one or more relationships between data in a plurality of datasets; determining, based on the one or more user inputs, at least one object view for visualizing the data in the plurality of datasets; generating, based on the one or more user inputs, metadata comprising: an object graph indicative of the one or more relationships between two or more of the plurality of datasets; and information identifying the at least one object view; and in response to a query relating to the plurality of datasets, using the metadata to determine how response data responding to the query should be provided.
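One way to picture the metadata described here is a small dictionary holding an object graph and the chosen object views, consulted when a query arrives. The sketch below assumes that relationships are declared as `(dataset, dataset, join key)` triples and that a query names a single dataset; these shapes, the function names, and the `"table"` fallback view are hypothetical and not taken from the disclosure.

```python
def build_metadata(relationships, views):
    """Build metadata from user-declared relationships and view choices.

    relationships: iterable of (left_dataset, right_dataset, join_key)
    views: mapping of dataset name -> object view name
    """
    graph = {}
    for left, right, key in relationships:
        # Store each relationship in both directions for easy lookup.
        graph.setdefault(left, []).append({"dataset": right, "join_key": key})
        graph.setdefault(right, []).append({"dataset": left, "join_key": key})
    return {"object_graph": graph, "object_views": dict(views)}


def plan_response(metadata, dataset, default_view="table"):
    """Use the metadata to decide how results for a query are presented."""
    edges = metadata["object_graph"].get(dataset, [])
    return {
        "view": metadata["object_views"].get(dataset, default_view),
        "related": [edge["dataset"] for edge in edges],
    }
```

With this layout, a query against one dataset can be answered in the view the user selected for it, with linked datasets offered as related objects.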
Abstract:
Systems and methods are provided for an object platform for datasets. A definition of an object may be obtained. The object may be associated with information stored in one or more datasets. The information may be determined based at least in part on the definition of the object. The object may be stored in a cache such that the information associated with the object is also stored in the cache. One or more interfaces may be provided through which requests to perform one or more operations on the object can be submitted.
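A minimal sketch of the flow in this abstract: obtain a definition, resolve the object's information from a dataset, cache it, and expose operations on the cached object. The class `ObjectPlatform`, the shape of the definition (dataset name, key column, key value, property list), and the `get`/`update` operations are all assumptions; the abstract does not describe a concrete API.

```python
class ObjectPlatform:
    """Resolves objects from datasets and serves them from a cache."""

    def __init__(self, datasets):
        self.datasets = datasets  # dataset name -> list of row dicts
        self.cache = {}           # object name -> cached information

    def define(self, name, dataset, key_column, key, properties):
        """Resolve the object's information per its definition and cache it."""
        for row in self.datasets[dataset]:
            if row[key_column] == key:
                self.cache[name] = {p: row[p] for p in properties}
                return self.cache[name]
        raise LookupError(f"no row in {dataset!r} with {key_column}={key!r}")

    # Interfaces for submitting operations on a cached object.
    def get(self, name):
        return self.cache[name]

    def update(self, name, **changes):
        """Apply changes to the cached copy only (write-back not shown)."""
        self.cache[name].update(changes)
        return self.cache[name]
```

Reads and updates after `define` touch only the cache, so the backing dataset is scanned once per object in this sketch.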