DETERMINING WHEN A SUBJECT IS SPEAKING BY ANALYZING A RESPIRATORY SIGNAL OBTAINED FROM A VIDEO

    Publication No.: US20170294193A1

    Publication date: 2017-10-12

    Application No.: US15092287

    Filing date: 2016-04-06

    Applicant: Xerox Corporation

    Abstract: What is disclosed is a system and method for determining when a subject is speaking from a respiratory signal obtained from a video of that subject. A video of a subject is received, and a respiratory signal is extracted from a time-series signal obtained by processing pixels in image frames of the video. The respiratory signal comprises an inspiratory signal and an expiratory signal. Cycle-level features are extracted from the respiratory signal and used to identify expiratory signals during which speech is likely to have occurred. The identified expiratory signals are divided into time intervals. Frame-level features are determined for each time interval, and the amount of distortion in the expiratory signal for that time interval is quantified. The amount of distortion is compared to a threshold. In response to the comparison, a determination is made that speech occurred during that interval. The process repeats for all time intervals.
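
The interval-level step of the abstract (divide an expiratory signal into time intervals, quantify distortion in each, compare to a threshold) can be illustrated with a minimal Python sketch. The abstract does not say how distortion is measured or which side of the threshold indicates speech, so everything specific here is an assumption made for illustration: the function names `split_intervals`, `distortion`, and `speech_intervals` are hypothetical, distortion is taken to be the RMS residual from a straight-line fit, and speech is flagged when that value exceeds the threshold.

```python
from math import sqrt


def split_intervals(signal, interval_len):
    """Divide an expiratory segment into consecutive fixed-length intervals.

    A trailing remainder shorter than interval_len is dropped.
    """
    if interval_len < 2:
        raise ValueError("interval_len must be at least 2")
    last_start = len(signal) - interval_len
    return [signal[i:i + interval_len]
            for i in range(0, last_start + 1, interval_len)]


def distortion(interval):
    """ASSUMED distortion measure: RMS residual from a least-squares line.

    A smooth, undisturbed exhalation is modeled here as roughly linear
    over a short interval; this model is not taken from the abstract.
    """
    n = len(interval)
    t_mean = (n - 1) / 2
    y_mean = sum(interval) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(interval))
    slope = sxy / sxx
    residuals = [y - (y_mean + slope * (t - t_mean))
                 for t, y in enumerate(interval)]
    return sqrt(sum(r * r for r in residuals) / n)


def speech_intervals(expiratory, interval_len, threshold):
    """Return one boolean per interval: True where distortion > threshold.

    The direction of the comparison is an assumption.
    """
    return [distortion(iv) > threshold
            for iv in split_intervals(expiratory, interval_len)]


# Toy expiratory segment: the first half falls smoothly, the second is irregular.
flags = speech_intervals([10, 9, 8, 7, 6, 8, 4, 7], interval_len=4, threshold=0.5)
```

With the toy data, only the second interval is flagged. The interval length and threshold would in practice depend on the video frame rate and on how the respiratory signal is scaled, neither of which is given in the abstract.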