3D Object Detection Using Random Forests

    Publication No.: US20220292717A1

    Publication date: 2022-09-15

    Application No.: US17636250

    Filing date: 2019-09-13

    Applicant: Google LLC

    Abstract: Example embodiments allow for fast, efficient detection and pose estimation of objects based on point clouds, depth images/maps, or other depth information about a scene that may contain the objects. Embodiments include translating and rotating the depth image to bring individual points of the depth image to a standard orientation and location so as to improve performance when an object is near the periphery of the field of view. Some disclosed embodiments include applying a random forest to perform pose estimation. By using the decision trees or other fast methods, it can be advantageous to perform pose estimation a plurality of times prior to identifying whether a particular object is actually present in a scene. Prospective pose estimates can be combined with models of the objects in order to evaluate whether the object is present in the scene.
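The translate-and-rotate step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a camera-centered point cloud with +z along the optical axis, and it assumes that "standard orientation and location" means rotating a chosen anchor point onto the optical axis and then shifting it to the origin. The function names `rotation_to_z_axis` and `canonicalize` are hypothetical and are not taken from the patent.

```python
import math


def rotation_to_z_axis(direction):
    """Return a 3x3 rotation matrix that maps the unit vector `direction` onto +z."""
    ax, ay, az = direction
    if az <= -1.0 + 1e-9:
        # A point exactly behind the camera has no unique rotation axis.
        raise ValueError("direction is opposite to +z; rotation is ambiguous")
    k = 1.0 / (1.0 + az)
    # Closed form of Rodrigues' rotation formula for the target axis (0, 0, 1).
    return [
        [1.0 - k * ax * ax, -k * ax * ay, -ax],
        [-k * ax * ay, 1.0 - k * ay * ay, -ay],
        [ax, ay, az],
    ]


def canonicalize(points, anchor):
    """Rotate the cloud so `anchor` lies on the optical axis, then move it to the origin."""
    norm = math.sqrt(sum(c * c for c in anchor))
    direction = [c / norm for c in anchor]
    rot = rotation_to_z_axis(direction)
    rotated = [[sum(row[j] * p[j] for j in range(3)) for row in rot] for p in points]
    # After the rotation the anchor sits at (0, 0, norm); subtract that offset.
    return [[x, y, z - norm] for x, y, z in rotated]


cloud = [(1.0, 0.0, 2.0), (0.5, 0.5, 2.5)]
normalized = canonicalize(cloud, anchor=cloud[0])
```

Because the transform is rigid, distances between points are unchanged. A downstream estimator would then see every anchor point in the same frame regardless of where it appeared in the field of view.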

    Image Transformation Using Interpretable Transformation Parameters

    Publication No.: US20210358095A1

    Publication date: 2021-11-18

    Application No.: US17052049

    Filing date: 2020-02-05

    Applicant: Google LLC

    Abstract: A computer-implemented method to perform image-to-image translation. The method can include obtaining one or more machine-learned generator models. The one or more machine-learned generator models can be configured to receive an input image and a user-specified conditioning vector that parameterizes one or more desired values for one or more defined characteristics of an output image. The one or more machine-learned generator models can be configured to perform, based at least in part on the user-specified conditioning vector, one or more transformations on the input image to generate the output image with the one or more desired values for the one or more defined characteristics. The method can include receiving the input image and the user-specified conditioning vector. The method can include generating, using the machine-learned generator model, an output image having the one or more desired values for the one or more characteristics.

    3D object detection using random forests

    Publication No.: US12236639B2

    Publication date: 2025-02-25

    Application No.: US17636250

    Filing date: 2019-09-13

    Applicant: Google LLC

    Abstract: Example embodiments allow for fast, efficient detection and pose estimation of objects based on point clouds, depth images/maps, or other depth information about a scene that may contain the objects. Embodiments include translating and rotating the depth image to bring individual points of the depth image to a standard orientation and location so as to improve performance when an object is near the periphery of the field of view. Some disclosed embodiments include applying a random forest to perform pose estimation. By using the decision trees or other fast methods, it can be advantageous to perform pose estimation a plurality of times prior to identifying whether a particular object is actually present in a scene. Prospective pose estimates can be combined with models of the objects in order to evaluate whether the object is present in the scene.

    Domain Generalization via Batch Normalization Statistics

    Publication No.: US20230122207A1

    Publication date: 2023-04-20

    Application No.: US17909545

    Filing date: 2021-03-05

    Applicant: Google LLC

    Abstract: Generally, the present disclosure is directed to systems and methods that leverage batch normalization statistics as a way to generalize across domains. In particular, example implementations of the present disclosure can generate different representations for different domains by collecting independent batch normalization statistics, which can then be used to map between domains in a shared latent space. At test or inference time, samples from an unknown test or target domain can be projected into the same shared latent space. The domain of the target sample can therefore be expressed as a linear combination of the known ones, with the combination being weighted based on respective distances between batch normalization statistics in the latent space. This same mapping strategy can be applied at both training and test time to learn both a latent representation and a powerful but light-weight ensemble model that operates within such latent space.
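The distance-based weighting described in the abstract can be sketched as follows. The abstract does not say how distances become weights, so this sketch assumes a softmax over negative Euclidean distances. The function names `domain_weights` and `combine_predictions` and the `temperature` parameter are hypothetical and are not taken from the patent.

```python
import math


def domain_weights(target_stats, domain_stats, temperature=1.0):
    """Weight each known domain by how close its statistics are to the target's.

    Assumption: a softmax over negative Euclidean distances.
    """
    dists = [math.dist(target_stats, stats) for stats in domain_stats]
    nearest = min(dists)  # subtracting the minimum keeps exp() numerically stable
    scores = [math.exp(-(d - nearest) / temperature) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]


def combine_predictions(weights, per_domain_predictions):
    """Blend per-domain model outputs as a weighted linear combination."""
    size = len(per_domain_predictions[0])
    return [
        sum(w * pred[i] for w, pred in zip(weights, per_domain_predictions))
        for i in range(size)
    ]


# Toy statistics (for example, concatenated batch-norm means and variances) for three known domains.
known = [[0.0, 1.0], [2.0, 1.5], [5.0, 0.5]]
target = [0.2, 1.1]
weights = domain_weights(target, known)
blended = combine_predictions(weights, [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]])
```

The weights are non-negative and sum to one, so the blended output stays a convex combination of the per-domain outputs, and the domain whose statistics lie nearest the target contributes most.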

    Image transformation using interpretable transformation parameters

    Publication No.: US11599980B2

    Publication date: 2023-03-07

    Application No.: US17052049

    Filing date: 2020-02-05

    Applicant: Google LLC

    Abstract: A computer-implemented method to perform image-to-image translation. The method can include obtaining one or more machine-learned generator models. The one or more machine-learned generator models can be configured to receive an input image and a user-specified conditioning vector that parameterizes one or more desired values for one or more defined characteristics of an output image. The one or more machine-learned generator models can be configured to perform, based at least in part on the user-specified conditioning vector, one or more transformations on the input image to generate the output image with the one or more desired values for the one or more defined characteristics. The method can include receiving the input image and the user-specified conditioning vector. The method can include generating, using the machine-learned generator model, an output image having the one or more desired values for the one or more characteristics.

    Image Transformation Using Interpretable Transformation Parameters

    Publication No.: US20240202878A1

    Publication date: 2024-06-20

    Application No.: US18390566

    Filing date: 2023-12-20

    Applicant: Google LLC

    Abstract: A computer-implemented method to perform image-to-image translation. The method can include obtaining one or more machine-learned generator models. The one or more machine-learned generator models can be configured to receive an input image and a user-specified conditioning vector that parameterizes one or more desired values for one or more defined characteristics of an output image. The one or more machine-learned generator models can be configured to perform, based at least in part on the user-specified conditioning vector, one or more transformations on the input image to generate the output image with the one or more desired values for the one or more defined characteristics. The method can include receiving the input image and the user-specified conditioning vector. The method can include generating, using the machine-learned generator model, an output image having the one or more desired values for the one or more characteristics.

    Processing Diagrams as Search Input
    Type: Invention publication (published application)

    Publication No.: US20240152546A1

    Publication date: 2024-05-09

    Application No.: US18502688

    Filing date: 2023-11-06

    Applicant: Google LLC

    CPC classification number: G06F16/532 G06F16/953

    Abstract: Methods and systems for returning search results based on diagrams as search inputs are disclosed herein. One method can include receiving a search request from a user, the search request including an image that depicts a diagram with at least one associated question, and processing the search request using a diagram parsing model to obtain a formal language representation of the diagram. The method can also include providing the formal language representation of the diagram to a search engine as a search query, and receiving, as a search result to the search query, at least one solution to the at least one associated question of the diagram.

    TEXT CONDITIONED VIDEO RESAMPLER FOR VIDEO UNDERSTANDING

    Publication No.: US20250166379A1

    Publication date: 2025-05-22

    Application No.: US18949777

    Filing date: 2024-11-15

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus for video understanding. In one aspect, a conditioned resampler model receives video features of multiple video frames of a video processed by a visual encoder and token embeddings for a specified task. The conditioned resampler model generates conditioned resampler embeddings according to the specified task in response to the video features and token embeddings provided as input. The conditioned resampler embeddings are provided to a large language model as input. The large language model generates, in response to the input conditioned resampler embeddings, a text response to the specified task.

    Image transformation using interpretable transformation parameters

    Publication No.: US11908115B2

    Publication date: 2024-02-20

    Application No.: US18161415

    Filing date: 2023-01-30

    Applicant: Google LLC

    Abstract: A computer-implemented method to perform image-to-image translation. The method can include obtaining one or more machine-learned generator models. The one or more machine-learned generator models can be configured to receive an input image and a user-specified conditioning vector that parameterizes one or more desired values for one or more defined characteristics of an output image. The one or more machine-learned generator models can be configured to perform, based at least in part on the user-specified conditioning vector, one or more transformations on the input image to generate the output image with the one or more desired values for the one or more defined characteristics. The method can include receiving the input image and the user-specified conditioning vector. The method can include generating, using the machine-learned generator model, an output image having the one or more desired values for the one or more characteristics.
