Communication efficient sparse-reduce in distributed machine learning

    公开(公告)号:US11356334B2

    公开(公告)日:2022-06-07

    申请号:US15980243

    申请日:2018-05-15

    Abstract: A method is provided for sparse communication in a parallel machine learning environment. The method includes determining a fixed communication cost for a sparse graph to be computed. The sparse graph is (i) determined from a communication graph that includes all the machines in a target cluster of the environment, and (ii) represents a communication network for the target cluster having (a) an overall spectral gap greater than or equal to a minimum threshold, and (b) certain information dispersal properties such that an intermediate output from a given node disperses to all other nodes of the sparse graph in lowest number of time steps given other possible node connections. The method further includes computing the sparse graph, based on the communication graph and the fixed communication cost. The method also includes initiating a propagation of the intermediate output in the parallel machine learning environment using a topology of the sparse graph.

    SELF-SUPERVISED MULTIMODAL REPRESENTATION LEARNING WITH CASCADE POSITIVE EXAMPLE MINING

    公开(公告)号:US20230086023A1

    公开(公告)日:2023-03-23

    申请号:US17940599

    申请日:2022-09-08

    Abstract: A method for model training and deployment includes training, by a processor, a model to learn video representations with a self-supervised contrastive loss by performing progressive training in phases with an incremental number of positive instances from one or more video sequences, resetting the learning rate schedule in each of the phases, and inheriting model weights from a checkpoint from a previous training phase. The method further includes updating the trained model with the self-supervised contrastive loss given multiple positive instances obtained from Cascade K-Nearest Neighbor mining of the one or more video sequences by extracting features in different modalities to compute similarities between the one or more video sequences and selecting a top-k similar instances with features in different modalities. The method also includes fine-tuning the trained model for a downstream task. The method additionally includes deploying the trained model for a target application inference for the downstream task.

Patent Agency Ranking