Deep content tagging
    Granted invention patent

    Publication (announcement) number: US11589120B2

    Publication (announcement) date: 2023-02-21

    Application number: US16799232

    Filing date: 2020-02-24

    Abstract: A method and apparatus for deep content tagging. A media device receives one or more first frames of a content item, where the one or more first frames span a duration of a scene in the content item. The media device detects one or more objects or features in each of the first frames using a neural network model and identifies one or more first genres associated with the first frames based at least in part on the detected objects or features in each of the first frames. The media device further controls playback of the content item based at least in part on the identified first genres.
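The flow in this abstract (per-frame detections, then scene-level genres, then a playback decision) can be illustrated with a minimal Python sketch. It is not the patented method: the neural network is assumed to have already produced a list of object labels per frame, and the `OBJECT_TO_GENRES` table, the `min_fraction` voting threshold, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from detected object labels to genres (not from the patent).
OBJECT_TO_GENRES = {
    "gun": {"action", "violence"},
    "explosion": {"action"},
    "ball": {"sports"},
    "stadium": {"sports"},
}


def identify_genres(frame_detections, min_fraction=0.5):
    """Return the genres supported by at least `min_fraction` of the frames.

    `frame_detections` holds one list of object labels per frame of a scene,
    as a detector might emit them.
    """
    votes = Counter()
    for labels in frame_detections:
        frame_genres = set()
        for label in labels:
            frame_genres |= OBJECT_TO_GENRES.get(label, set())
        # Each frame votes at most once per genre.
        votes.update(frame_genres)
    total = len(frame_detections)
    return {genre for genre, count in votes.items() if count / total >= min_fraction}


def playback_action(genres, blocked_genres):
    """Assumed policy: skip the scene if any identified genre is blocked."""
    return "skip" if genres & blocked_genres else "play"


scene = [["gun", "explosion"], ["gun"], ["ball"]]
scene_genres = identify_genres(scene)
action = playback_action(scene_genres, blocked_genres={"violence"})
```

Voting across all frames of the scene, rather than acting on a single frame, keeps one spurious detection from changing the playback decision.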

    Head pose estimation
    Granted invention patent

    Publication (announcement) number: US11120569B2

    Publication (announcement) date: 2021-09-14

    Application number: US16450832

    Filing date: 2019-06-24

    Abstract: A method and apparatus for estimating a user's head pose relative to a sensing device. The sensing device detects a face of the user in an image. The sensing device further identifies a plurality of points in the image corresponding to respective features of the detected face. The plurality of points includes at least a first point corresponding to a location of a first facial feature. The sensing device determines a position of the face relative to the sensing device based at least in part on a distance between the first point in the image and one or more of the remaining points. For example, the sensing device may determine a pitch, yaw, distance, or location of the user's face relative to the sensing device.
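As a rough illustration of deriving pose from distances between facial landmark points, the Python sketch below estimates distance with a pinhole-camera model and yaw with a simple asymmetry ratio. Both formulas, the assumed average interpupillary distance of 0.063 m, the sign convention, and all function names are assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def estimate_distance_m(left_eye, right_eye, focal_px, ipd_m=0.063):
    """Pinhole-model distance from the pixel distance between the eyes.

    `ipd_m` is an assumed average interpupillary distance in meters.
    The estimate ignores foreshortening when the head is turned.
    """
    pixel_ipd = math.dist(left_eye, right_eye)
    if pixel_ipd == 0:
        raise ValueError("eye landmarks coincide")
    return focal_px * ipd_m / pixel_ipd


def estimate_yaw_deg(nose, left_eye, right_eye):
    """Heuristic yaw from how unevenly the nose sits between the eyes.

    Returns 0 for a frontal face; the sign convention is arbitrary.
    """
    d_left = math.dist(nose, left_eye)
    d_right = math.dist(nose, right_eye)
    if d_left + d_right == 0:
        raise ValueError("landmarks coincide")
    ratio = (d_left - d_right) / (d_left + d_right)
    return math.degrees(math.asin(max(-1.0, min(1.0, ratio))))


# Pixel coordinates of three landmarks in a frontal image (made-up values).
left_eye, right_eye, nose = (100.0, 100.0), (163.0, 100.0), (131.5, 130.0)
distance = estimate_distance_m(left_eye, right_eye, focal_px=1000.0)
yaw = estimate_yaw_deg(nose, left_eye, right_eye)
```

Here the nose plays the role of the "first point" and the eyes the "remaining points"; a production system would calibrate the focal length and correct the distance estimate for head rotation.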

    Deep content tagging
    Invention patent application

    Publication (announcement) number: US20200275158A1

    Publication (announcement) date: 2020-08-27

    Application number: US16799232

    Filing date: 2020-02-24

    Abstract: A method and apparatus for deep content tagging. A media device receives one or more first frames of a content item, where the one or more first frames span a duration of a scene in the content item. The media device detects one or more objects or features in each of the first frames using a neural network model and identifies one or more first genres associated with the first frames based at least in part on the detected objects or features in each of the first frames. The media device further controls playback of the content item based at least in part on the identified first genres.

    Audio source enhancement facilitated using video data

    Publication (announcement) number: US11082460B2

    Publication (announcement) date: 2021-08-03

    Application number: US16455668

    Filing date: 2019-06-27

    Abstract: Systems and methods for audio signal enhancement facilitated using video data are provided. In one example, a method includes receiving a multi-channel audio signal including audio inputs detected by a plurality of audio input devices. The method further includes receiving an image captured by a video input device. The method further includes determining a first signal based at least in part on the image. The first signal is indicative of a likelihood associated with a target audio source. The method further includes determining a second signal based at least in part on the multi-channel audio signal and the first signal. The second signal is indicative of a likelihood associated with an audio component attributed to the target audio source. The method further includes processing the multi-channel audio signal based at least in part on the second signal to generate an output audio signal.
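The two-stage structure described here (a video-derived first signal, an audio-plus-video second signal, then processing of the multi-channel audio) can be sketched in plain Python as follows. This is only a toy under stated assumptions: the first signal is taken as a given per-frame likelihood, the second signal is the product of that likelihood and a crude energy-based activity score, and the "processing" is a channel average scaled by the second signal. None of these choices, nor the names `audio_activity`, `enhance`, or `noise_floor`, come from the patent.

```python
def audio_activity(frame_channels, noise_floor=1e-3):
    """Crude activity score in [0, 1) from the mean sample energy of a frame."""
    samples = [s for channel in frame_channels for s in channel]
    energy = sum(s * s for s in samples) / len(samples)
    return energy / (energy + noise_floor)


def enhance(frames, video_likelihoods):
    """Scale a channel-averaged signal by the per-frame target likelihood.

    `frames` is a list of frames; each frame is a list of channels, and each
    channel is a list of samples. `video_likelihoods` holds the assumed first
    signal (one value in [0, 1] per frame), e.g. from a face detector.
    """
    output = []
    for frame_channels, p_video in zip(frames, video_likelihoods):
        # Second signal: combines the audio evidence with the first signal.
        p_target = p_video * audio_activity(frame_channels)
        n_channels = len(frame_channels)
        mono = [sum(samples) / n_channels for samples in zip(*frame_channels)]
        output.append([p_target * s for s in mono])
    return output


# Two microphones, two samples per frame, two frames (made-up values).
frames = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
enhanced = enhance(frames, video_likelihoods=[0.0, 1.0])
```

In the first frame the video indicates no target source, so the output is fully attenuated; in the second the audio passes almost unchanged. A real system would more likely use the second signal to steer a beamformer or a time-frequency mask than to apply a scalar gain.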

    Enrollment-free offline device personalization

    Publication (announcement) number: US11079911B2

    Publication (announcement) date: 2021-08-03

    Application number: US16553998

    Filing date: 2019-08-28

    Abstract: A method and apparatus for device personalization. A device is configured to receive first sensor data from one or more sensors, detect biometric information in the first sensor data, encode the biometric information as a first vector using one or more neural network models stored on the device, and configure a user interface of the device based at least in part on the first vector. For example, the profile information may include configurations, settings, preferences, or content to be displayed or rendered via the user interface. In some implementations, the first sensor data may comprise an image of a scene and the biometric information may comprise one or more facial features of a user in the scene.
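One way to picture configuring a user interface from a biometric vector is nearest-profile lookup by cosine similarity, sketched below in Python. The encoder is assumed to exist already and to output a list of floats; the `ProfileStore` class, the 0.8 similarity threshold, and the choice to create a new profile silently when no stored vector matches (one reading of "enrollment-free") are assumptions for illustration, not details from the abstract.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


class ProfileStore:
    """On-device store of (vector, settings) pairs; hypothetical design."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.profiles = []

    def settings_for(self, vector):
        """Return settings of the closest stored profile, or a new empty one."""
        best = max(
            self.profiles,
            key=lambda profile: cosine_similarity(profile[0], vector),
            default=None,
        )
        if best is not None and cosine_similarity(best[0], vector) >= self.threshold:
            return best[1]
        # No match: start a fresh profile without an explicit enrollment step.
        settings = {}
        self.profiles.append((list(vector), settings))
        return settings


store = ProfileStore()
first = store.settings_for([1.0, 0.0])      # new user, empty settings
first["theme"] = "dark"                      # preference learned over time
again = store.settings_for([0.99, 0.05])    # same user, slightly different vector
other = store.settings_for([0.0, 1.0])      # a different user
```

Because matching and storage both happen in the local `ProfileStore`, nothing in this sketch requires a network connection, which fits the "offline" framing of the title.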
