Audio analysis learning with video data

    Publication No.: US10573313B2

    Publication Date: 2020-02-25

    Application No.: US16272054

    Filing Date: 2019-02-11

    Applicant: Affectiva, Inc.

    Abstract: Audio analysis learning is performed using video data. Video data is obtained, on a first computing device, wherein the video data includes images of one or more people. Audio data is obtained, on a second computing device, which corresponds to the video data. A face within the video data is identified. A first voice, from the audio data, is associated with the face within the video data. The face within the video data is analyzed for cognitive content. Audio features corresponding to the cognitive content of the video data are extracted. The audio data is segmented to correspond to an analyzed cognitive state. An audio classifier is learned, on a third computing device, based on the analyzing of the face within the video data. Further audio data is analyzed using the audio classifier.
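The abstract's core idea is cross-modal supervision: facial analysis supplies cognitive-state labels for audio segments aligned with the video, and an audio classifier is then trained on those labeled segments. A minimal sketch follows; the toy features (mean amplitude, zero-crossing rate) and the nearest-centroid classifier are illustrative assumptions, not the patented implementation.

```python
# Sketch: train an audio classifier from (audio segment, cognitive state)
# pairs, where the state labels would come from facial analysis of the
# corresponding video. Features and classifier are illustrative only.

def extract_audio_features(segment):
    """Toy features: mean absolute amplitude and zero-crossing rate."""
    mean_amp = sum(abs(s) for s in segment) / len(segment)
    zero_crossings = sum(
        1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0)
    )
    return (mean_amp, zero_crossings / len(segment))

def train_audio_classifier(labeled_segments):
    """Nearest-centroid classifier over (segment, cognitive_state) pairs."""
    sums, counts = {}, {}
    for segment, state in labeled_segments:
        feats = extract_audio_features(segment)
        acc = sums.setdefault(state, [0.0] * len(feats))
        for i, v in enumerate(feats):
            acc[i] += v
        counts[state] = counts.get(state, 0) + 1
    centroids = {
        state: tuple(v / counts[state] for v in acc)
        for state, acc in sums.items()
    }

    def classify(segment):
        feats = extract_audio_features(segment)
        return min(
            centroids,
            key=lambda s: sum((a - b) ** 2 for a, b in zip(feats, centroids[s])),
        )

    return classify
```

Once trained, the returned `classify` function can label further audio data on its own, which is the "further audio data is analyzed using the audio classifier" step of the abstract.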

    NEURAL NETWORK TRAINING WITH BIAS MITIGATION

    Publication No.: US20220101146A1

    Publication Date: 2022-03-31

    Application No.: US17482501

    Filing Date: 2021-09-23

    Applicant: Affectiva, Inc.

    IPC Classification: G06N3/08 G06N3/04

    Abstract: Techniques for machine learning based on neural network training with bias mitigation are disclosed. Facial images for a neural network configuration and a neural network training dataset are obtained. The training dataset is associated with the neural network configuration. The facial images are partitioned into multiple subgroups, wherein the subgroups represent demographics with potential for biased training. A multifactor key performance indicator (KPI) is calculated per image. The calculating is based on analyzing performance of two or more image classifier models. The neural network configuration and the training dataset are promoted to a production neural network, wherein the promoting is based on the KPI. The KPI identifies bias in the training dataset. Promotion of the neural network configuration and the neural network training dataset is based on identified bias. Identified bias precludes promotion to the production neural network, while identified non-bias allows promotion to the production neural network.
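The promotion gate in the abstract can be sketched as a small pipeline: measure per-subgroup performance for two or more classifier models, collapse those measurements into a multifactor KPI, and promote only when the KPI shows no bias. The disparity formula (max accuracy spread across subgroups) and the threshold are assumptions for illustration, not the patented KPI.

```python
# Hedged sketch of KPI-gated promotion: identified bias (a large
# per-subgroup accuracy spread in any model) precludes promotion.

def subgroup_accuracy(model, dataset):
    """dataset: list of (features, subgroup, label) tuples."""
    correct, total = {}, {}
    for features, subgroup, label in dataset:
        total[subgroup] = total.get(subgroup, 0) + 1
        if model(features) == label:
            correct[subgroup] = correct.get(subgroup, 0) + 1
    return {g: correct.get(g, 0) / n for g, n in total.items()}

def multifactor_kpi(models, dataset):
    """Assumed KPI: worst accuracy spread across subgroups, over all models."""
    spreads = []
    for model in models:
        acc = subgroup_accuracy(model, dataset)
        spreads.append(max(acc.values()) - min(acc.values()))
    return max(spreads)

def promote(models, dataset, max_disparity=0.1):
    """True only when no model shows subgroup disparity above threshold."""
    return multifactor_kpi(models, dataset) <= max_disparity
```

A configuration that performs equally across subgroups passes the gate; one whose errors concentrate in a single demographic subgroup is held back from production.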

    NEURAL NETWORK SYNTHESIS ARCHITECTURE USING ENCODER-DECODER MODELS

    Publication No.: US20220067519A1

    Publication Date: 2022-03-03

    Application No.: US17458639

    Filing Date: 2021-08-27

    Applicant: Affectiva, Inc.

    Abstract: Disclosed techniques include neural network architecture using encoder-decoder models. A facial image is obtained for processing on a neural network. The facial image includes unpaired facial image attributes. The facial image is processed through a first encoder-decoder pair and a second encoder-decoder pair. The first encoder-decoder pair decomposes a first image attribute subspace. The second encoder-decoder pair decomposes a second image attribute subspace. The first encoder-decoder pair outputs a transformation mask based on the first image attribute subspace. The second encoder-decoder pair outputs a second image transformation mask based on the second image attribute subspace. The first image transformation mask and the second image transformation mask are concatenated to enable downstream processing. The concatenated transformation masks are processed on a third encoder-decoder pair and a resulting image is output. The resulting image eliminates a paired training data requirement.
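The data flow in the abstract (two parallel encoder-decoder pairs producing masks, concatenation, a third pair producing the output image) can be traced with placeholder callables. The "encoders" and "decoders" here are toy vector operations standing in for trained networks; shapes and scaling factors are illustrative assumptions.

```python
# Data-flow sketch of the three-stage encoder-decoder synthesis
# architecture. Real pairs would be trained networks; these are stubs.

def make_pair(weight):
    """A toy encoder-decoder pair: scale into a subspace, decode to a mask."""
    def pair(vec):
        encoded = [v * weight for v in vec]   # encode into attribute subspace
        return [v / 2 for v in encoded]       # decode to a transformation mask
    return pair

def synthesize(image_vec, pair_a, pair_b, pair_c):
    mask_a = pair_a(image_vec)        # first attribute-subspace mask
    mask_b = pair_b(image_vec)        # second attribute-subspace mask
    concatenated = mask_a + mask_b    # concatenate for downstream processing
    return pair_c(concatenated)       # third pair outputs the resulting image
```

Because each attribute subspace is handled by its own pair and recombined only at the third stage, the architecture avoids needing paired before/after training images, which is the "unpaired" property the abstract emphasizes.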

    MULTIMODAL MACHINE LEARNING FOR VEHICLE MANIPULATION

    Publication No.: US20200242383A1

    Publication Date: 2020-07-30

    Application No.: US16852627

    Filing Date: 2020-04-20

    Applicant: Affectiva, Inc.

    IPC Classification: G06K9/00 B60W40/08 G06N3/08

    Abstract: Techniques for machine-trained analysis for multimodal machine learning vehicle manipulation are described. A computing device captures a plurality of information channels, wherein the plurality of information channels includes contemporaneous audio information and video information from an individual. A multilayered convolutional computing system learns trained weights using the audio information and the video information from the plurality of information channels. The trained weights cover both the audio information and the video information and are trained simultaneously. The learning facilitates cognitive state analysis of the audio information and the video information. A computing device within a vehicle captures further information and analyzes the further information using trained weights. The further information that is analyzed enables vehicle manipulation. The further information can include only video data or only audio data. The further information can include a cognitive state metric.
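The key phrase in the abstract is that one set of weights "covers both the audio information and the video information and are trained simultaneously." A minimal way to sketch that property is to fuse the two modalities into a single feature vector before a joint training loop; the perceptron update and toy features below are assumptions standing in for the multilayered convolutional system.

```python
# Sketch of simultaneous multimodal weight training: audio and video
# features are concatenated so one weight vector spans both modalities.

def train_joint(samples, epochs=20, lr=0.1):
    """samples: list of ((audio_feats, video_feats), label in {-1, +1})."""
    dim = len(samples[0][0][0]) + len(samples[0][0][1])
    w = [0.0] * dim
    for _ in range(epochs):
        for (audio, video), label in samples:
            x = list(audio) + list(video)        # one fused feature vector
            score = sum(wi * xi for wi, xi in zip(w, x))
            if score * label <= 0:               # misclassified: update
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
    return w

def predict(w, audio, video):
    x = list(audio) + list(video)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1
```

Because the audio and video weight components live in one vector and are updated in the same pass, neither modality is trained in isolation, mirroring the simultaneous-training claim.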

    IMAGE ANALYSIS USING A SEMICONDUCTOR PROCESSOR FOR FACIAL EVALUATION IN VEHICLES

    Publication No.: US20200074154A1

    Publication Date: 2020-03-05

    Application No.: US16678180

    Filing Date: 2019-11-08

    Applicant: Affectiva, Inc.

    Abstract: Analysis for convolutional processing is performed using logic encoded in a semiconductor processor. The semiconductor chip evaluates pixels within an image of a person in a vehicle, where the analysis identifies a facial portion of the person. The facial portion of the person can include facial landmarks or regions. The semiconductor chip identifies one or more facial expressions based on the facial portion. The facial expressions can include a smile, frown, smirk, or grimace. The semiconductor chip classifies the one or more facial expressions for cognitive response content. The semiconductor chip evaluates the cognitive response content to produce cognitive state information for the person. The semiconductor chip enables manipulation of the vehicle based on communication of the cognitive state information to a component of the vehicle.
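The abstract describes a fixed on-chip pipeline: pixels → facial portion → expressions → cognitive response → cognitive state → vehicle manipulation. The sketch below traces that staging in software; every function body is a placeholder for a hardware-accelerated stage, and the expression-to-state rules and vehicle responses are illustrative assumptions.

```python
# Pipeline sketch of the per-frame stages named in the abstract.
# Each stage is a stub; real logic runs in semiconductor logic.

def identify_facial_portion(pixels):
    return pixels  # placeholder: would crop to facial landmarks/regions

def identify_expressions(face):
    # Toy rule standing in for smile/frown/smirk/grimace detection.
    return ["smile"] if sum(face) > 0 else ["frown"]

def classify_cognitive_response(expressions):
    return "positive" if "smile" in expressions else "negative"

def evaluate_cognitive_state(response):
    return {"state": response}

def manipulate_vehicle(state_info):
    """Communicate state to a vehicle component; here, pick a cabin mode."""
    return "comfort_mode" if state_info["state"] == "positive" else "alert_mode"

def process_frame(pixels):
    face = identify_facial_portion(pixels)
    expressions = identify_expressions(face)
    response = classify_cognitive_response(expressions)
    state = evaluate_cognitive_state(response)
    return manipulate_vehicle(state)
```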

    MULTIDEVICE MULTIMODAL EMOTION SERVICES MONITORING

    Publication No.: US20200026347A1

    Publication Date: 2020-01-23

    Application No.: US16587579

    Filing Date: 2019-09-30

    Applicant: Affectiva, Inc.

    Abstract: Techniques for multidevice, multimodal emotion services monitoring are disclosed. An expression to be detected is determined. The expression relates to a cognitive state of an individual. Input on the cognitive state of the individual is obtained using a device local to the individual. Monitoring for the expression is performed. The monitoring uses a background process on a device remote from the individual. An occurrence of the expression is identified. The identification is performed by the background process. Notification that the expression was identified is provided. The notification is provided from the background process to a device distinct from the device running the background process. The expression is defined as a multimodal expression. The multimodal expression includes image data and audio data from the individual. The notification enables emotion services to be provided. The emotion services augment messaging, social media, and automated help applications.
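The monitoring flow can be sketched as a matcher plus a notification callback: a multimodal expression is defined as required cues in both image and audio channels, the background process checks incoming samples against it, and a match pushes a notification toward a third device. The matcher, callback wiring, and cue names below are assumptions (no real inter-device IPC is performed).

```python
# Sketch of background-process expression monitoring with notification.
# `notify` stands in for delivery to a device distinct from the monitor.

def make_monitor(expression, notify):
    """expression: required multimodal cues, e.g. {'image': ..., 'audio': ...}."""
    def check(sample):
        matched = all(sample.get(k) == v for k, v in expression.items())
        if matched:
            notify("expression detected: %r" % (expression,))
        return matched
    return check

notifications = []
monitor = make_monitor(
    {"image": "smile", "audio": "laugh"}, notifications.append
)
```

Because the expression is multimodal, a sample matching on image data alone does not trigger a notification; both channels must agree.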

    VEHICLE MANIPULATION USING COGNITIVE STATE ENGINEERING

    Publication No.: US20190283762A1

    Publication Date: 2019-09-19

    Application No.: US16429022

    Filing Date: 2019-06-02

    Applicant: Affectiva, Inc.

    Abstract: Vehicle manipulation uses cognitive state engineering. Images of a vehicle occupant are obtained using imaging devices within a vehicle. The one or more images include facial data of the vehicle occupant. A computing device is used to analyze the images to determine a cognitive state. Audio information from the occupant is obtained and the analyzing is augmented based on the audio information. The cognitive state is mapped to a loading curve, where the loading curve represents a continuous spectrum of cognitive state loading variation. The vehicle is manipulated, based on the mapping to the loading curve, where the manipulating uses cognitive state alteration engineering. The manipulating includes changing vehicle occupant sensory stimulation. Additional images of additional occupants of the vehicle are obtained and analyzed to determine additional cognitive states. Additional cognitive states are used to adjust the mapping. A cognitive load is estimated based on eye gaze tracking.
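The loading-curve mapping in the abstract can be sketched as a continuous function from a cognitive-state score to a load value, with sensory-stimulation changes chosen from the mapped load. The convex curve shape, the thresholds, and the specific interventions below are assumptions for illustration.

```python
# Sketch: map a cognitive-state score onto a continuous loading curve,
# then pick a sensory-stimulation change from the mapped load.

def loading_curve(state_score):
    """Map a state score in [0, 1] to a continuous cognitive load."""
    clamped = min(1.0, max(0.0, state_score))
    return clamped ** 2  # assumed convex curve; real curve is learned/engineered

def manipulate(state_score):
    load = loading_curve(state_score)
    if load > 0.64:
        return "reduce_stimulation"    # e.g. dim displays, soften audio
    if load < 0.09:
        return "increase_stimulation"  # e.g. alerting chime for low engagement
    return "no_change"
```

Per the abstract, the curve itself would then be adjusted using additional occupants' cognitive states, and eye-gaze tracking would feed the load estimate.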

    Audio analysis learning using video data

    Publication No.: US10204625B2

    Publication Date: 2019-02-12

    Application No.: US15861855

    Filing Date: 2018-01-04

    Applicant: Affectiva, Inc.

    Abstract: Audio analysis learning is performed using video data. Video data is obtained, on a first computing device, wherein the video data includes images of one or more people. Audio data is obtained, on a second computing device, which corresponds to the video data. A face is identified within the video data. A first voice, from the audio data, is associated with the face within the video data. The face within the video data is analyzed for cognitive content. Audio features are extracted corresponding to the cognitive content of the video data. The audio data is segmented to correspond to an analyzed cognitive state. An audio classifier is learned, on a third computing device, based on the analyzing of the face within the video data. Further audio data is analyzed using the audio classifier.