SYSTEMS AND METHODS FOR PROCESSING AUDIO AND VIDEO USING A VOICE PRINT

    Publication No.: US20210350823A1

    Publication date: 2021-11-11

    Application No.: US17315780

    Filing date: 2021-05-10

    Abstract: A wearable device for processing audio signals may include a microphone configured to capture sounds from an environment of a user and at least one processor. The processor may be programmed to receive first audio signals captured by the microphone during a first time period during which the user is in a location, and obtain an audio segment from the first audio signals. The audio segment may include a portion of the first audio signals in which an individual is speaking. The processor may also be programmed to generate a voice print of the individual using at least the audio segment, and receive second audio signals representative of additional sounds captured by the microphone. The additional sounds may include sounds made by the individual. The second audio signals may be at least one of audio signals captured by the microphone within a predetermined time period after the first time period, or audio signals captured by the microphone while the user is in the location. The at least one processor may also be programmed to process the second audio signals using the generated voice print.
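
The condition in the abstract above for when the second audio signals qualify (captured within a predetermined time after the first period, or while the user is still in the same location) can be illustrated with a minimal Python sketch. The `AudioCapture` type, the `voice_print_applies` function, the string location label, and the 600-second default are all hypothetical names and values chosen for illustration; the abstract does not specify how time or location is represented.

```python
from dataclasses import dataclass


@dataclass
class AudioCapture:
    """Hypothetical record of one stretch of captured audio."""
    start: float    # capture start, in seconds
    end: float      # capture end, in seconds
    location: str   # assumed location label for the user during capture


def voice_print_applies(first: AudioCapture,
                        second: AudioCapture,
                        max_gap_s: float = 600.0) -> bool:
    """Return True if a voice print built from `first` may be used on `second`.

    Either condition is sufficient: the second capture starts within
    `max_gap_s` seconds after the first one ends, or both captures were
    made in the same location.
    """
    gap = second.start - first.end
    within_window = 0.0 <= gap <= max_gap_s
    same_location = second.location == first.location
    return within_window or same_location
```

For example, with a first capture ending at 60 s in a location labelled "cafe", a second capture starting at 120 s elsewhere qualifies through the time window, while one starting at 5000 s qualifies only if it is also labelled "cafe".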

    SYSTEMS AND METHODS FOR MATCHING AUDIO AND IMAGE INFORMATION

    Publication No.: US20210209362A1

    Publication date: 2021-07-08

    Application No.: US17141985

    Filing date: 2021-01-05

    IPC classification: G06K9/00 G06F1/16 G06K9/20

    Abstract: Systems and methods for processing audio signals are disclosed. In one implementation, a system may comprise a wearable camera configured to capture a plurality of images from an environment of a user; a microphone configured to capture sounds from the environment of the user; and a processor. The processor may be configured to receive at least one image of the plurality of images, the at least one image comprising a plurality of image portions associated with corresponding image portion timestamps; receive at least one audio signal representative of the sounds captured by the at least one microphone; identify an audio timestamp associated with a portion of the audio signal; identify an image portion from among the plurality of image portions, the image portion having an image portion timestamp associated with the audio timestamp; and analyze the image portion to identify a voice originating from an object represented in the image.
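
The step of identifying an image portion whose timestamp is associated with the audio timestamp can be sketched as a nearest-timestamp lookup. This is only one possible reading: the abstract does not say how the association is made, so the nearest-neighbour rule, the `tolerance` parameter, and the function name `match_image_portion` are assumptions for illustration. The sketch also assumes the image portion timestamps are already sorted in ascending order.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def match_image_portion(portion_timestamps: Sequence[float],
                        audio_ts: float,
                        tolerance: float = 0.5) -> Optional[int]:
    """Return the index of the image portion closest in time to `audio_ts`.

    `portion_timestamps` must be sorted ascending. Returns None when there
    are no portions or the closest one is more than `tolerance` seconds away.
    """
    if not portion_timestamps:
        return None
    # Position where audio_ts would be inserted; the nearest timestamp is
    # either just before or exactly at that position.
    i = bisect_left(portion_timestamps, audio_ts)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(portion_timestamps)]
    best = min(candidates, key=lambda j: abs(portion_timestamps[j] - audio_ts))
    if abs(portion_timestamps[best] - audio_ts) <= tolerance:
        return best
    return None
```

The returned index would then select the image portion passed on to whatever analysis identifies the voice source; that analysis is outside the scope of this sketch.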

    RETRIEVING AND DISPLAYING KEY WORDS FROM PRIOR CONVERSATIONS

    Publication No.: US20200050862A1

    Publication date: 2020-02-13

    Application No.: US16659163

    Filing date: 2019-10-21

    Abstract: A wearable apparatus is provided. The wearable apparatus may include: a wearable image sensor configured to capture a plurality of images from an environment of a user; and at least one processor programmed to: receive, from the wearable image sensor, a facial image of an individual with whom a user interacted in a first interaction during a time window; receive sound data captured in a vicinity of the image sensor during a part of the time window; process the sound data to identify a key word; store an association between the key word and the facial image; receive another facial image of the individual during a second interaction; determine that the individual from the first interaction is the individual in the second interaction; access a memory to locate the key word from the first interaction; and cause a display of the key word on a display visible to the user.
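
The store-then-recall flow in the abstract above can be sketched as a small in-memory association between a recognised face and the key words heard during an interaction. Face recognition and speech processing are not shown: the `face_id` string stands in for whatever identifier a face matcher would produce, and the `KeywordMemory` class and its method names are hypothetical, not taken from the patent.

```python
from collections import defaultdict
from typing import Dict, List


class KeywordMemory:
    """Hypothetical in-memory store linking a face identifier to key words."""

    def __init__(self) -> None:
        self._by_face: Dict[str, List[str]] = defaultdict(list)

    def store(self, face_id: str, keyword: str) -> None:
        """Record a key word heard while interacting with `face_id`."""
        if keyword not in self._by_face[face_id]:
            self._by_face[face_id].append(keyword)

    def recall(self, face_id: str) -> List[str]:
        """Return key words from earlier interactions, oldest first."""
        return list(self._by_face.get(face_id, []))
```

In a second interaction, once the face matcher reports the same `face_id`, `recall` would supply the key words to show on the user's display.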

    Wearable Apparatus for Name Tagging

    Publication No.: US20200050861A1

    Publication date: 2020-02-13

    Application No.: US16658438

    Filing date: 2019-10-21

    Abstract: A wearable apparatus and methods may analyze images. In one implementation, the wearable apparatus may comprise a wearable image sensor and at least one processor. The at least one processor may be programmed to: receive, from the wearable image sensor, a facial image of an individual with whom a user of the wearable apparatus is interacting; receive sound data captured during the interacting; process at least a portion of the sound data to determine a spoken name of the individual; convert the spoken name to text; store, in memory, text associated with the spoken name in a manner associating the text with the facial image; after a subsequent encounter with the individual, receive, from the wearable image sensor, a subsequent facial image of the individual; perform a look-up of an identity of the individual based on the subsequent facial image; receive, from the memory, the text of the spoken name of the individual; and cause a display in text of the name of the individual on a device paired with the wearable apparatus.