Abstract:
A false alarm reduction system is provided that includes a processor cropping each input image at randomly chosen positions to form cropped images of a same size at different scales in different contexts. The system further includes a CONDA-GMM, having a first and a second conditional deep autoencoder for respectively (i) taking each cropped image without a respective center block as input for measuring a discrepancy between a reconstructed and a target center block, and (ii) taking an entirety of cropped images with the target center block. The CONDA-GMM constructs density estimates based on reconstruction error features and low-dimensional embedding representations derived from image encodings. The processor determines an anomaly existence based on a prediction of a likelihood of the anomaly existing in a framework of a CGMM, given the context being a representation of the cropped image with the center block removed and having a discrepancy above a threshold.
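The cropping step described above can be sketched as follows. This is a minimal illustration only, not the claimed implementation: the function name `crop_with_center_removed`, the nearest-neighbour resize, the zero-valued mask, and the toy 16x16 image are all assumptions introduced here. It shows how crops taken at different scales can be brought to one size and split into a context (center block removed) and a target center block.

```python
import random


def crop_with_center_removed(image, crop_size, out_size, block, rng):
    """Crop a random square window, resize it, and split off the center block.

    All names and parameters are hypothetical; the abstract does not give them.
    """
    h, w = len(image), len(image[0])
    # Randomly chosen crop position inside the image.
    top = rng.randrange(h - crop_size + 1)
    left = rng.randrange(w - crop_size + 1)
    # Nearest-neighbour resize so crops at every scale end up the same size.
    resized = [
        [image[top + (r * crop_size) // out_size][left + (c * crop_size) // out_size]
         for c in range(out_size)]
        for r in range(out_size)
    ]
    start = (out_size - block) // 2
    # Target center block that a model would try to reconstruct.
    target = [row[start:start + block] for row in resized[start:start + block]]
    # Context: the same crop with the center block masked out (zeros assumed).
    context = [row[:] for row in resized]
    for r in range(start, start + block):
        for c in range(start, start + block):
            context[r][c] = 0
    return context, target


rng = random.Random(0)
image = [[r * 16 + c for c in range(16)] for r in range(16)]  # toy image
# Three scales (8, 12, 16 pixels), all resized to 8x8 with a 4x4 center block.
crops = [crop_with_center_removed(image, s, 8, 4, rng) for s in (8, 12, 16)]
```

A reconstruction model would then receive `context` as input and be scored on how far its output is from `target`; that model is outside the scope of this sketch.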
Abstract:
Systems and methods for solving queries on image data are provided. The system includes a processor device coupled to a memory device. The system includes a detector manager with a detector application programming interface (API) to allow external detectors to be inserted into the system by exposing capabilities of the external detectors and providing a predetermined way to execute the external detectors. An ontology manager exposes knowledge bases regarding ontologies to a reasoning engine. A query parser transforms a natural query into query directed acyclic graph (DAG). The system includes a reasoning engine that uses the query DAG, the ontology manager and the detector API to plan an execution list of detectors. The reasoning engine uses the query DAG, a scene representation DAG produced by the external detectors and the ontology manager to answer the natural query.
Abstract:
A video device for predicting driving situations while a person drives a car is presented. The video device includes multi-modal sensors and knowledge data for extracting feature maps, a deep neural network trained with training data to recognize real-time traffic scenes (TSs) from a viewpoint of the car, and a user interface (UI) for displaying the real-time TSs. The real-time TSs are compared to predetermined TSs to predict the driving situations. The video device can be a video camera. The video camera can be mounted to a windshield of the car. Alternatively, the video camera can be incorporated into the dashboard or console area of the car. The video camera can calculate speed, velocity, type, and/or position information related to other cars within the real-time TS. The video camera can also include warning indicators, such as light emitting diodes (LEDs) that emit different colors for the different driving situations.
Abstract:
Systems and methods are disclosed for computer vision and object detection by extracting tracks of moving objects on a set of video sequences; selecting a subset of tracks for training; rendering a composite of each selected track into a single image; labeling tracks using the rendered images; training a track classifier by supervised machine learning using the labeled tracks; applying the trained track classifier to the remainder of the tracks; and selecting tracks classified with a low confidence by the classifier.
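The final step, selecting tracks the classifier is unsure about, might look like the sketch below. It assumes the classifier returns per-class probabilities for each track and treats the largest probability as the confidence; the function name `select_low_confidence`, the threshold of 0.6, and the example scores are hypothetical and not taken from the abstract.

```python
def select_low_confidence(track_scores, threshold=0.6):
    """Return ids of tracks whose top class probability is below threshold.

    track_scores maps a track id to a list of class probabilities
    (an assumed output format for the trained track classifier).
    """
    selected = []
    for track_id, probs in track_scores.items():
        confidence = max(probs)  # confidence = probability of the predicted class
        if confidence < threshold:
            selected.append(track_id)
    return sorted(selected)


# Made-up classifier outputs for four tracks.
scores = {
    "t1": [0.95, 0.05],
    "t2": [0.55, 0.45],
    "t3": [0.30, 0.70],
    "t4": [0.52, 0.48],
}
uncertain = select_low_confidence(scores)
```

Here `uncertain` holds the tracks whose best class falls under the threshold; what is done with them afterward is not specified in the abstract.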
Abstract:
A method, a computer program product, and a system are provided for video based action recognition. The system includes a processor. One or more frames from one or more video sequences are received. A feature vector for each patch of the one or more frames is generated using a deep convolutional neural network. An attention factor for the feature vectors is generated based on a within-frame attention and a between-frame attention. A target action is identified using a multi-layer deep long short-term memory process applied to the attention factor, said target action representing at least one of the one or more video sequences. An operation of a processor-based machine is controlled to change a state of the processor-based machine, responsive to the at least one of the one or more video sequences including the identified target action.
Abstract:
A video camera is provided for video-based anomaly detection that includes at least one imaging sensor configured to capture video sequences in a workplace environment having a plurality of machines therein. The video camera further includes a processor. The processor is configured to generate one or more predictions of an impending anomaly affecting at least one item selected from the group consisting of (i) at least one of the plurality of machines and (ii) at least one operator of the at least one of the plurality of machines, using a Deep High-Order Convolutional Neural Network (DHOCNN)-based model applied to the video sequences. The DHOCNN-based model has a one-class SVM as a loss layer of the model. The processor is further configured to generate a signal for initiating an action to the at least one of the plurality of machines to mitigate expected harm to the at least one item.
Abstract:
Systems and methods are disclosed to assist a driver with a dangerous condition by creating a graph representation where traffic participants and static elements are the vertices and the edges are relations between pairs of vertices; adding attributes to the vertices and edges of the graph based on information obtained on the driving vehicle, the traffic participants and additional information; creating a codebook of dangerous driving situations, each represented as a graph; performing subgraph matching between the graphs in the codebook and the graph representing a current driving situation to select a set of matching graphs from the codebook; determining a distance metric between each selected codebook graph and the matching subgraph of the current driving situation; from codebook graphs with a low distance, determining potential dangers; and generating an alert if one or more of the codebook dangers are imminent.
Abstract:
Systems and methods are disclosed for classifying histological tissues or specimens with two phases. In a first phase, the method includes providing off-line training using a processor during which one or more classifiers are trained based on examples, including: finding a split of features into sets of increasing computational cost, assigning a computational cost to each set; training for each set of features a classifier using training examples; training for each classifier, a utility function that scores a usefulness of extracting the next feature set for a given tissue unit using the training examples. In a second phase, the method includes applying the classifiers to an unknown tissue sample by extracting the first set of features for all tissue units; deciding for which tissue unit to extract the next set of features by finding the tissue unit for which a score S = U − h*C is maximized, where U is a utility function, C is a cost of acquiring the feature and h is a weighting parameter; iterating until a stopping criterion is met or no more features can be computed; and issuing a tissue-level decision based on a current state.
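The second-phase selection rule, S = U − h*C, can be illustrated with a small greedy loop. This is a sketch under stated assumptions: the utility function below is a hand-written placeholder standing in for the trained one, the costs and the value of h are made up, and stopping when the best score is no longer positive is only one possible stopping criterion.

```python
def pick_next_unit(units, utility, costs, h):
    """Return (unit, score) maximizing S = U - h*C, or None if nothing is left.

    units maps each tissue unit to the number of feature sets already
    extracted for it, which is also the index of its next feature set.
    """
    best = None
    for unit, level in units.items():
        if level >= len(costs):
            continue  # every feature set has already been extracted
        score = utility(unit, level) - h * costs[level]
        if best is None or score > best[1]:
            best = (unit, score)
    return best


costs = [1.0, 4.0, 10.0]            # hypothetical costs, increasing per set
units = {"a": 1, "b": 1, "c": 1}    # first feature set extracted for all units
base = {"a": 0.9, "b": 0.1, "c": 0.6}


def utility(unit, level):
    # Placeholder for the trained utility function: usefulness shrinks
    # as more feature sets have been extracted for the unit.
    return base[unit] / level


h = 0.05
order = []
while True:
    choice = pick_next_unit(units, utility, costs, h)
    # Assumed stopping criterion: no unit left, or no positive score.
    if choice is None or choice[1] <= 0:
        break
    units[choice[0]] += 1
    order.append(choice[0])
```

With these made-up numbers the loop extracts the second feature set for unit `a`, then for unit `c`, and stops because every remaining score is negative. A larger h makes expensive feature sets less attractive and ends the loop sooner.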
Abstract:
Methods and systems for processing a scanned tissue section include locating cells within a scanned tissue. Cells in the scanned tissue are classified using a classifier model. A tumor-cell ratio (TCR) map is generated based on classified normal cells and tumor cells. A TCR isoline is generated for a target TCR value using the TCR map, marking areas of the tissue section where a TCR is at or above the target TCR value. Dissection is performed on the tissue sample to isolate an area identified by the isoline.
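A TCR map and the region bounded by an isoline could be computed along the following lines. The tiling scheme, the function names `tcr_map` and `region_at_or_above`, and the sample cell coordinates are assumptions for illustration; the abstract does not say how the map is discretized. The sketch computes the ratio of tumor cells to all classified cells per tile and marks tiles at or above a target value; the isoline would be the boundary of that marked region.

```python
def tcr_map(cells, tile, rows, cols):
    """Per-tile tumor-cell ratio; None where a tile contains no cells.

    cells is an iterable of (x, y, is_tumor) tuples produced by an
    assumed upstream cell detector and classifier.
    """
    tumor = [[0] * cols for _ in range(rows)]
    total = [[0] * cols for _ in range(rows)]
    for x, y, is_tumor in cells:
        r, c = int(y // tile), int(x // tile)
        total[r][c] += 1
        if is_tumor:
            tumor[r][c] += 1
    return [
        [tumor[r][c] / total[r][c] if total[r][c] else None for c in range(cols)]
        for r in range(rows)
    ]


def region_at_or_above(tcr, target):
    """Boolean mask of tiles whose TCR is at or above the target value."""
    return [[v is not None and v >= target for v in row] for row in tcr]


# Made-up classified cells on a 20x20 area split into 10x10 tiles.
cells = [
    (5, 5, True), (8, 2, True), (3, 7, False),   # tile (0, 0): 2 of 3 tumor
    (15, 5, True), (12, 8, False),               # tile (0, 1): 1 of 2 tumor
    (5, 15, False),                              # tile (1, 0): 0 of 1 tumor
]
tcr = tcr_map(cells, tile=10, rows=2, cols=2)
mask = region_at_or_above(tcr, target=0.6)
```

Only the top-left tile reaches the target of 0.6 here, so it alone would fall inside the isoline.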
Abstract:
Methods and systems for training a machine learning model include generating pairs of training pixel patches from a dataset of training images, each pair including a first patch representing a part of a respective training image, and a second patch, centered at the same location as the first, representing a larger part of the training image and resized to the same size as the first patch. A detection model is trained using the first pixel patches, to detect and locate cells in the images. A classification model is trained using the first pixel patches, to classify cells according to whether the detected cells are cancerous, based on cell location information generated by the detection model. A segmentation model is trained using the second pixel patches, to locate and classify cancerous arrangements of cells in the images.
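The patch-pair generation can be sketched as below. The helper names, the scale factor of 2, the nearest-neighbour resize, and the toy image are assumptions made for illustration, and the sketch assumes both patches lie fully inside the image. It shows a first patch and a larger second patch sharing one center, with the second resized down to the first patch's size.

```python
def extract_patch(image, cy, cx, size):
    """Square patch of the given size centered at (cy, cx).

    Assumes the patch lies fully inside the image (no padding is done).
    """
    half = size // 2
    return [row[cx - half:cx - half + size]
            for row in image[cy - half:cy - half + size]]


def resize_nearest(patch, out_size):
    """Nearest-neighbour resize of a square patch to out_size x out_size."""
    n = len(patch)
    return [
        [patch[(r * n) // out_size][(c * n) // out_size] for c in range(out_size)]
        for r in range(out_size)
    ]


def patch_pair(image, cy, cx, size, scale=2):
    """First patch plus a larger, same-center patch resized to the same size."""
    first = extract_patch(image, cy, cx, size)
    second = resize_nearest(extract_patch(image, cy, cx, size * scale), size)
    return first, second


image = [[r * 16 + c for c in range(16)] for r in range(16)]  # toy image
first, second = patch_pair(image, cy=8, cx=8, size=4)
```

Both patches come out 4x4, but `second` covers an 8x8 neighbourhood, so it carries wider context at lower resolution, which matches the abstract's use of the second patches for locating arrangements of cells.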