OBJECT RE-IDENTIFICATION USING POSE PART BASED MODELS

    Publication No.: US20220343639A1

    Publication Date: 2022-10-27

    Application No.: US17764093

    Filing Date: 2019-12-06

    Abstract: An example apparatus for re-identifying objects includes an image receiver to receive a first image and a second image of an object with an identity. The apparatus also includes a fused model generator to fuse a global representation of the object with local representations of pose parts of the object to generate a fused representation of the object based on the first image. The apparatus further includes an object re-identifier to re-identify the object with the identity in the second image based on the fused representation.
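The fusion and matching steps named in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the method claimed in the application: the feature vectors are taken as already extracted, fusion is done by concatenating L2-normalized vectors, matching uses cosine similarity, and the names `fuse` and `re_identify` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse(global_feat: Sequence[float],
         part_feats: Sequence[Sequence[float]]) -> List[float]:
    """Assumed fusion rule: concatenate the normalized global feature with
    the normalized feature of each pose part, then renormalize."""
    fused = l2_normalize(global_feat)
    for part in part_feats:
        fused.extend(l2_normalize(part))
    return l2_normalize(fused)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def re_identify(query: Sequence[float],
                known: Dict[str, List[float]]) -> str:
    """Return the known identity whose fused representation is closest."""
    return max(known, key=lambda identity: cosine(query, known[identity]))


# Fused representations built from first images (toy 2-D features).
known = {
    "person_a": fuse([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
    "person_b": fuse([0.0, 1.0], [[0.0, 1.0], [1.0, 0.0]]),
}
# Fused representation of a detection in the second image.
query = fuse([0.9, 0.1], [[0.8, 0.2], [0.1, 0.9]])
match = re_identify(query, known)
```

Concatenation keeps the global and per-part cues in separate dimensions, so a mismatch in one pose part lowers the similarity without erasing agreement elsewhere. Other fusion rules, such as weighted sums or learned fusion layers, would fit the abstract's wording equally well.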

    TECHNIQUES FOR DENSE VIDEO DESCRIPTIONS

    Publication No.: US20210142115A1

    Publication Date: 2021-05-13

    Application No.: US16616533

    Filing Date: 2017-06-29

    Abstract: Techniques and apparatus for generating dense natural language descriptions for video content are described. In one embodiment, for example, an apparatus may include at least one memory and logic, at least a portion of the logic comprised in hardware coupled to the at least one memory, the logic to receive a source video comprising a plurality of frames, determine a plurality of regions for each of the plurality of frames, generate at least one region-sequence connecting the determined plurality of regions, and apply a language model to the at least one region-sequence to generate description information comprising a description of at least a portion of content of the source video. Other embodiments are described and claimed.
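The region-sequence step in this abstract can be sketched as follows. This is only an illustration under stated assumptions: each frame's regions are taken as already detected and given as (label, feature vector) pairs, regions are linked greedily by dot-product similarity to the previously chosen region, and `build_region_sequence` is a hypothetical name. The language model that would consume the sequence is left out.

```python
from typing import List, Sequence, Tuple

# Hypothetical region record: (label, feature vector).
Region = Tuple[str, List[float]]


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product used as a simple stand-in similarity measure."""
    return sum(x * y for x, y in zip(a, b))


def build_region_sequence(frames: List[List[Region]],
                          start: int = 0) -> List[Region]:
    """Greedily connect one region per frame into a sequence.

    Starts from region `start` of the first frame and, in every later
    frame, picks the region most similar to the previously chosen one.
    """
    sequence = [frames[0][start]]
    for regions in frames[1:]:
        previous_feature = sequence[-1][1]
        best = max(regions, key=lambda r: similarity(previous_feature, r[1]))
        sequence.append(best)
    return sequence


# Three frames with two candidate regions each (toy 2-D features).
frames = [
    [("dog", [1.0, 0.0]), ("ball", [0.0, 1.0])],
    [("ball", [0.1, 0.9]), ("dog", [0.9, 0.1])],
    [("dog", [0.8, 0.2]), ("ball", [0.2, 0.8])],
]
dog_track = [label for label, _ in build_region_sequence(frames, start=0)]
ball_track = [label for label, _ in build_region_sequence(frames, start=1)]
```

Running the sketch once per starting region gives one sequence per region of the first frame, and each sequence could then be passed to a language model to produce its own description.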

    TECHNIQUES FOR DENSE VIDEO DESCRIPTIONS

    Publication No.: US20220180127A1

    Publication Date: 2022-06-09

    Application No.: US17569725

    Filing Date: 2022-01-06

    Abstract: Techniques and apparatus for generating dense natural language descriptions for video content are described. In one embodiment, for example, an apparatus may include at least one memory and logic, at least a portion of the logic comprised in hardware coupled to the at least one memory, the logic to receive a source video comprising a plurality of frames, determine a plurality of regions for each of the plurality of frames, generate at least one region-sequence connecting the determined plurality of regions, and apply a language model to the at least one region-sequence to generate description information comprising a description of at least a portion of content of the source video. Other embodiments are described and claimed.
