Abstract:
In various embodiments, systems, methods, and techniques are disclosed for generating a collection of clusters of related data from a seed. Seeds may be generated based on seed generation strategies or rules. Clusters may be generated by, for example, retrieving a seed, adding the seed to a first cluster, retrieving a clustering strategy or rules, and adding related data and/or data entities to the cluster based on the clustering strategy. Various cluster scores may be generated based on attributes of data in a given cluster. Further, cluster metascores may be generated based on various cluster scores associated with a cluster. Clusters may be ranked based on cluster metascores. Various embodiments may enable an analyst to discover various insights related to data clusters, and may be applicable to various tasks including, for example, tax fraud detection, beaconing malware detection, malware user-agent detection, and/or activity trend detection, among various others.
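The seed-to-ranking flow described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the `Cluster` class, the link-following strategy, the two example scores (`size` and `total_amount`), and the weighted-sum metascore are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    seed: str
    members: set = field(default_factory=set)
    scores: dict = field(default_factory=dict)


def build_cluster(seed, links):
    """Grow a cluster from a seed by following links between data items.

    Hypothetical clustering strategy: add everything reachable from the seed.
    """
    cluster = Cluster(seed=seed, members={seed})
    frontier = [seed]
    while frontier:
        item = frontier.pop()
        for related in links.get(item, ()):
            if related not in cluster.members:
                cluster.members.add(related)
                frontier.append(related)
    return cluster


def score_cluster(cluster, amounts):
    """Compute example cluster scores from attributes of the cluster's data."""
    cluster.scores["size"] = len(cluster.members)
    cluster.scores["total_amount"] = sum(amounts.get(m, 0) for m in cluster.members)


def metascore(cluster, weights):
    """Combine the cluster scores into one metascore (assumed: weighted sum)."""
    return sum(weights[name] * value for name, value in cluster.scores.items())


def rank_clusters(clusters, weights):
    """Order clusters from highest to lowest metascore."""
    return sorted(clusters, key=lambda c: metascore(c, weights), reverse=True)
```

In this sketch an analyst would build one cluster per seed, call `score_cluster` on each, and review the output of `rank_clusters` from the top down.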
Abstract:
Aspects of the present disclosure relate to systems and methods for receiving, managing, and displaying annotations on documents in real-time. A user (e.g., an author of a document) uploads a document into a real-time annotation system, which may then generate a composite presentation based on the uploaded document. The composite presentation includes all the content of the document presented in a specially configured graphical user interface to receive and manage annotations from a plurality of user devices.
Abstract:
A computer-implemented system and method for data revision control in large-scale data analytic systems. In one embodiment, for example, a computer-implemented method comprises the operations of storing a first version of a dataset that is derived by executing a first version of a driver program associated with the dataset; and storing a first build catalog entry comprising an identifier of the first version of the dataset and comprising an identifier of the first version of the driver program.
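A build catalog entry of this kind can be pictured as a record that pairs a dataset version with the driver program version that produced it. The sketch below is an assumption-laden illustration: `BuildCatalogEntry`, `BuildCatalog`, the in-memory storage, and the identifier strings are hypothetical names, not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildCatalogEntry:
    dataset_version_id: str  # identifier of the stored dataset version
    driver_version_id: str   # identifier of the driver program version that derived it


class BuildCatalog:
    """In-memory stand-in for the dataset store and the build catalog."""

    def __init__(self):
        self._datasets = {}  # dataset version id -> stored data
        self._entries = []   # build catalog entries, in build order

    def record_build(self, dataset_version_id, data, driver_version_id):
        """Store a dataset version and the catalog entry that describes its build."""
        self._datasets[dataset_version_id] = data
        entry = BuildCatalogEntry(dataset_version_id, driver_version_id)
        self._entries.append(entry)
        return entry

    def driver_for(self, dataset_version_id):
        """Look up which driver program version produced a dataset version."""
        for entry in self._entries:
            if entry.dataset_version_id == dataset_version_id:
                return entry.driver_version_id
        return None
```

Keeping both identifiers in one entry means a later reader can trace any dataset version back to the exact driver program version that generated it.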
Abstract:
A context-sensitive viewing system is disclosed in which various data visualizations, also referred to as contextual views, of a common set of data may be viewed by a user on an electronic device. Data in the system may comprise data objects and associated properties and/or metadata, and may be stored in one or more electronic data stores. As a user of the system views and manipulates a first contextual view of a set of data objects, one or more other contextual views of the same set of data objects may be updated accordingly. Updates to the secondary contextual views may, in various embodiments, happen in real time. Further, the secondary contextual views may be visible to the user simultaneously with the primary contextual view. A user may switch from one view to another, and may manipulate data in any view, resulting in updates in the other views.
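One common way to keep several views of the same data objects in step is a shared model that notifies every attached view of a change. The following is a minimal sketch of that pattern, assuming hypothetical `SharedSelection` and `ContextualView` classes; the disclosure does not state that the system is built this way.

```python
class SharedSelection:
    """Shared state for a set of data objects, observed by every attached view."""

    def __init__(self):
        self._views = []
        self.selected = set()

    def attach(self, view):
        self._views.append(view)

    def select(self, object_ids):
        """Record a new selection and push it to all attached views."""
        self.selected = set(object_ids)
        for view in self._views:
            view.refresh(self.selected)


class ContextualView:
    """One visualization (for example a graph or a timeline) of the shared data."""

    def __init__(self, name, model):
        self.name = name
        self.model = model
        self.shown = set()
        model.attach(self)

    def user_select(self, object_ids):
        # A manipulation in this view goes through the shared model,
        # so every other view is updated as well.
        self.model.select(object_ids)

    def refresh(self, selected):
        self.shown = set(selected)
```

Because every view writes through the same model, it does not matter which view the user treats as primary: a change made in any one of them reaches the others.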
Abstract:
Systems and methods are provided for analyzing entity performance. In one implementation, a method is provided that includes receiving a request with one or more filter selections and accessing a data structure comprising a plurality of categories of information showing interactions associated with multiple entities. The method also comprises identifying a set of categories of the plurality of categories within the data structure based on the one or more filter selections. The method further comprises processing the information of the identified categories to analyze a performance of one or more entities of the multiple entities in accordance with the one or more filter selections and providing the processed information to display the performance of the one or more entities on a user interface.
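The filter-then-aggregate steps of this method can be sketched as follows. The layout of the data structure (a mapping from category name to `(entity, amount)` interaction records), the filter format, and the use of a summed amount as the performance measure are assumptions for illustration only.

```python
from collections import defaultdict


def analyze_performance(data, filter_selections):
    """Summarize entity performance over the categories chosen by the filters.

    data: maps a category name to a list of (entity, amount) interactions.
    filter_selections: iterable of category names taken from the request.
    """
    # Identify the categories in the data structure that match the filters.
    identified = [c for c in filter_selections if c in data]

    # Process the information of the identified categories per entity.
    totals = defaultdict(int)
    for category in identified:
        for entity, amount in data[category]:
            totals[entity] += amount
    return dict(totals)
```

The returned mapping is the processed information that a user interface would then display for each entity.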
Abstract:
Embodiments of the present disclosure relate to a data analysis system that may automatically generate memory-efficient clustered data structures, automatically analyze those clustered data structures, and provide results of the automated analysis in an optimized way to an analyst. The automated analysis of the clustered data structures (also referred to herein as data clusters) may include an automated application of various criteria or rules so as to generate a compact, human-readable analysis of the data clusters. The human-readable analyses (also referred to herein as “summaries” or “conclusions”) of the data clusters may be organized into an interactive user interface so as to enable an analyst to quickly navigate among information associated with various data clusters and efficiently evaluate those data clusters in the context of, for example, a fraud investigation. Embodiments of the present disclosure also relate to automated scoring of the clustered data structures.
Abstract:
Techniques are disclosed for generating a collection of clusters of related data from a seed. Doing so may generally include retrieving a seed, adding the seed to a first cluster, and retrieving a cluster strategy referencing one or more data bindings. Each data binding specifies a search protocol for retrieving data. For each of the one or more data bindings, data parameters input to the search protocol are identified, the search protocol is performed using the identified data parameters, and data returned by the search protocol is evaluated for inclusion in the first cluster.
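The per-binding loop described above can be sketched as below. The `DataBinding` class, the representation of records as dictionaries, and the `accept` callback used to evaluate returned data are hypothetical choices for illustration, not the disclosed design.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataBinding:
    parameter: str    # attribute of a cluster member that feeds the search
    search: Callable  # search protocol: parameter value -> list of records


def expand_cluster(seed, strategy, accept):
    """Build a cluster from a seed using a cluster strategy (a list of data bindings)."""
    cluster = [seed]
    for binding in strategy:
        # Identify the data parameters to input to this binding's search protocol.
        params = [m[binding.parameter] for m in cluster if binding.parameter in m]
        for value in params:
            # Perform the search, then evaluate each returned record for inclusion.
            for record in binding.search(value):
                if record not in cluster and accept(record):
                    cluster.append(record)
    return cluster
```

Each binding runs once over the members present when it starts, so later bindings can build on records that earlier bindings added.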
Abstract:
A context-sensitive viewing system is disclosed in which various data visualizations, also referred to as contextual views, of a common set of data may be viewed by a user on an electronic device. Data in the system may comprise data objects and associated properties and/or metadata, and may be stored in one or more electronic data stores. As a user of the system views and manipulates a first contextual view of a set of data objects, one or more other contextual views of the same set of data objects may be updated accordingly. Updates to the secondary contextual views may, in various embodiments, happen in real time. Further, the secondary contextual views may be visible to the user simultaneously with the primary contextual view. A user may switch from one view to another, and may manipulate data in any view, resulting in updates in the other views.
Abstract:
A resource dependency system displays two dynamically interactive interfaces in a resource dependency user interface: a hierarchical resource repository and a dependency graph user interface. User interactions on each interface can dynamically update either interface. For example, a selection of a particular resource in the dependency graph user interface causes the system to update the dependency graph user interface to indicate the selection and also to update the hierarchical resource repository to navigate to the folder corresponding to the stored location of the selected resource. In another example, a selection of a particular resource in the hierarchical resource repository causes the system to update the hierarchical resource repository to indicate the selection and also to update the dependency graph user interface to display an updated graph, indicate the selection and, in some embodiments, focus on the selected resource by zooming into a portion of the graph.
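The two-way synchronization in these examples can be sketched as a small state holder with one handler per interface. Everything named here (`ResourceDependencyUI`, its attributes, and the choice to show a resource together with its direct dependencies as the "updated graph") is an assumption for illustration.

```python
class ResourceDependencyUI:
    """Tracks the state of the repository panel and the dependency graph panel."""

    def __init__(self, locations, depends_on):
        self.locations = locations    # resource -> folder where it is stored
        self.depends_on = depends_on  # resource -> list of direct dependencies
        self.graph_selection = None
        self.graph_nodes = set()
        self.repo_selection = None
        self.repo_folder = None

    def select_in_graph(self, resource):
        """Selection made in the dependency graph."""
        self.graph_selection = resource               # graph indicates the selection
        self.repo_folder = self.locations[resource]   # repository navigates to the folder

    def select_in_repository(self, resource):
        """Selection made in the hierarchical resource repository."""
        self.repo_selection = resource                # repository indicates the selection
        # Graph displays an updated graph and indicates the selection.
        self.graph_nodes = {resource, *self.depends_on.get(resource, [])}
        self.graph_selection = resource
```

Zooming the graph to the selected resource is left out of the sketch because it depends on the rendering layer.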