MULTIPLE SIMULTANEOUS FRAMING ALTERNATIVES USING SPEAKER TRACKING

    Publication number: US20190356883A1

    Publication date: 2019-11-21

    Application number: US15981299

    Filing date: 2018-05-16

    Abstract: In one embodiment, a video conference endpoint may detect one or more participants within a field of view of a camera of the video conference endpoint. The video conference endpoint may determine one or more alternative framings of an output of the camera of the video conference endpoint based on the detected one or more participants. The video conference endpoint may send the output of the camera of the video conference endpoint to one or more far-end video conference endpoints participating in a video conference with the video conference endpoint. The video conference endpoint may send data descriptive of the one or more alternative framings of the output of the camera to the far-end video conference endpoints. The far-end video conference endpoints may utilize the data to display one of the one or more alternative framings.
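
    The idea of sending the full camera output together with data describing alternative framings can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `Framing` record, the JSON encoding, and the rule that the far end picks the framing whose aspect ratio best matches its display are not taken from the patent.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Framing:
    """Hypothetical description of one alternative framing (a crop rectangle)."""
    label: str
    x: int
    y: int
    width: int
    height: int


def encode_framings(framings: List[Framing]) -> str:
    """Near end: serialize the framing descriptions sent alongside the video."""
    return json.dumps([asdict(f) for f in framings])


def decode_framings(payload: str) -> List[Framing]:
    """Far end: rebuild the framing descriptions from the received data."""
    return [Framing(**item) for item in json.loads(payload)]


def pick_framing(framings: List[Framing], display_aspect: float) -> Framing:
    """Far end: assumed selection rule, closest aspect ratio to the display."""
    return min(framings, key=lambda f: abs(f.width / f.height - display_aspect))


framings = [
    Framing("overview", 0, 0, 1920, 1080),
    Framing("speaker", 600, 200, 640, 720),
]
received = decode_framings(encode_framings(framings))
chosen = pick_framing(received, display_aspect=16 / 9)
```

    Because the far end receives the uncropped output plus the descriptions, each receiving endpoint can apply a different framing without the sender re-encoding the video.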

    Group and conversational framing for speaker tracking in a video conference system

    Publication number: US10257465B2

    Publication date: 2019-04-09

    Application number: US15908984

    Filing date: 2018-03-01

    Abstract: In one embodiment, a method is provided to intelligently frame groups of participants in a meeting. This gives a more pleasing experience with fewer switches, better contextual understanding, and more natural framing, as would be seen in a video production made by a human director. Furthermore, in accordance with another embodiment, conversational framing techniques are provided. During speaker tracking, when two local participants are addressing each other, a method is provided to show a close-up framing showing both participants. By evaluating the direction participants are looking and a speaker history, it is determined if there is a local discussion going on, and an appropriate framing is selected to give far-end participants the most contextually rich experience.
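
    The conversational-framing decision described above (gaze direction plus speaker history) might be sketched as follows. The function name, the window length, and the representation of gaze as a participant-to-target mapping are all hypothetical; the abstract does not specify how either signal is evaluated.

```python
from typing import Dict, List, Optional, Tuple


def detect_local_conversation(
    speaker_history: List[str],
    gaze_targets: Dict[str, str],
    window: int = 4,
) -> Optional[Tuple[str, str]]:
    """Return the pair to frame together, or None if no local discussion is seen.

    Assumed rule: the last `window` speaking turns involve exactly two local
    participants, and each of them is looking at the other.
    """
    recent = speaker_history[-window:]
    speakers = set(recent)
    if len(recent) < window or len(speakers) != 2:
        return None
    a, b = sorted(speakers)
    if gaze_targets.get(a) == b and gaze_targets.get(b) == a:
        return (a, b)
    return None


pair = detect_local_conversation(
    ["ann", "bob", "ann", "bob"],
    {"ann": "bob", "bob": "ann"},
)
```

    When a pair is returned, the endpoint would select a close-up containing both participants; otherwise it would fall back to its normal speaker or group framing.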

    Automatic switching between dynamic and preset camera views in a video conference endpoint

    Publication number: US09883143B2

    Publication date: 2018-01-30

    Application number: US15383231

    Filing date: 2016-12-19

    CPC classification number: H04N7/152 H04N5/23219 H04N7/142 H04N7/147

    Abstract: A video conference endpoint includes a camera to capture video and a microphone array to sense audio. One or more preset views are defined. Images in the captured video are processed with a face detection algorithm to detect faces. Active talkers are detected from the sensed audio. The camera is controlled to capture video from the preset views, and from dynamic views created without user input and which include a dynamic overview and a dynamic close-up view. The camera is controlled to dynamically adjust each of the dynamic views to track changing positions of detected faces over time, and dynamically switch the camera between the preset views, the dynamic overview, and the dynamic close-up view over time based on positions of the detected faces and the detected active talkers relative to the preset views and the dynamic views.
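
    One way to picture a dynamic overview that tracks detected faces is as the bounding box of all face rectangles plus a margin, recomputed as the faces move. This is a sketch under that assumption; the box representation, the pixel margin, and the fallback to the full frame are illustrative choices, not details from the patent.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) of a detected face


def dynamic_overview(
    faces: List[Box], frame_w: int, frame_h: int, margin_px: int = 20
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a view enclosing every face.

    With no detected faces, the full frame is returned (an assumed fallback).
    """
    if not faces:
        return (0, 0, frame_w, frame_h)
    left = min(x for x, _, _, _ in faces) - margin_px
    top = min(y for _, y, _, _ in faces) - margin_px
    right = max(x + w for x, _, w, _ in faces) + margin_px
    bottom = max(y + h for _, y, _, h in faces) + margin_px
    # Clamp the padded box to the frame boundaries.
    return (max(0, left), max(0, top), min(frame_w, right), min(frame_h, bottom))


view = dynamic_overview([(100, 100, 50, 50), (400, 200, 50, 50)], 640, 480)
```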

    Automatic switching between dynamic and preset camera views in a video conference endpoint
    Type: Granted invention patent
    Legal status: In force

    Publication number: US09584763B2

    Publication date: 2017-02-28

    Application number: US14534557

    Filing date: 2014-11-06

    CPC classification number: H04N7/152 H04N5/23219 H04N7/142 H04N7/147

    Abstract: A video conference endpoint includes one or more cameras to capture video of different views and a microphone array to sense audio. One or more preset views are defined. The endpoint detects faces in the captured video and active audio sources from the sensed audio. The endpoint identifies as an active talker any detected face that coincides positionally with a detected active audio source, and also detects whether any active talker is in one of the preset views. Based on whether an active talker is detected in any of the preset views, the endpoint switches between capturing video of one of the preset views and capturing video of a dynamic view.
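
    The matching and switching logic in this abstract can be sketched roughly as below. The sketch assumes positions are reduced to a single horizontal angle, that "coincide positionally" means falling within a fixed tolerance, and that a talker inside a preset selects that preset while anything else selects the dynamic view. None of these specifics come from the patent, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Face:
    azimuth_deg: float  # horizontal angle of a detected face


@dataclass
class PresetView:
    name: str
    min_azimuth_deg: float
    max_azimuth_deg: float


def find_active_talkers(
    faces: List[Face], audio_azimuths_deg: List[float], tolerance_deg: float = 5.0
) -> List[Face]:
    """Faces whose angle lies within tolerance of some active audio source."""
    return [
        face
        for face in faces
        if any(abs(face.azimuth_deg - a) <= tolerance_deg for a in audio_azimuths_deg)
    ]


def choose_view(
    faces: List[Face], audio_azimuths_deg: List[float], presets: List[PresetView]
) -> str:
    """Return the name of a preset containing an active talker, else 'dynamic'."""
    for talker in find_active_talkers(faces, audio_azimuths_deg):
        for preset in presets:
            if preset.min_azimuth_deg <= talker.azimuth_deg <= preset.max_azimuth_deg:
                return preset.name
    return "dynamic"


presets = [PresetView("whiteboard", -40.0, -20.0)]
faces = [Face(-30.0), Face(10.0)]
view = choose_view(faces, [-29.0], presets)
```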

    Method and Apparatus for Enhanced Caller ID
    Type: Invention patent application
    Legal status: Under examination (published)

    Publication number: US20160037129A1

    Publication date: 2016-02-04

    Application number: US14449589

    Filing date: 2014-08-01

    CPC classification number: G06K9/00221 G06K9/00362 H04N7/147 H04N7/15

    Abstract: In one embodiment, a method is provided for handling a call from a conferencing endpoint configured to handle a conference between multiple participants. A request to call a participant is received from the conferencing endpoint. Information is inferred about a presence of one or more participants in the call, based on a detection of the one or more participants by presence detection equipment associated with the conferencing endpoint. Additional call context information is determined based on the inferred information. The additional call context information is provided to the participant in addition to the call, wherein the additional call context information is accessible to the participant.

    DYNAMIC VIDEO LAYOUT DESIGN DURING ONLINE MEETINGS

    Publication number: US20240314263A1

    Publication date: 2024-09-19

    Application number: US18674346

    Filing date: 2024-05-24

    Abstract: Presented herein are techniques for cropping video streams to create an optimized layout in which participants of a meeting are a similar size. A user device receives a plurality of video streams, each video stream including at least one face of a participant participating in a video communication session. Faces in one or more of the plurality of video streams are cropped so that faces in the plurality of video streams are approximately equal in size, to produce a plurality of processed video streams. The plurality of processed video streams are sorted according to video stream widths to produce sorted video streams and the plurality of sorted video streams are distributed for display across a smallest number of rows possible on a display of the user device.
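
    The crop, sort, and distribute steps can be sketched as follows. Two assumptions are made here that the abstract does not state: faces are equalized by choosing a crop height in which the face occupies a fixed fraction, and rows are filled with a first-fit-decreasing heuristic, which approximates rather than guarantees the smallest possible number of rows. All function names and parameters are hypothetical.

```python
from typing import List


def crop_height_for_face(
    frame_height: float, face_height: float, face_fraction: float = 0.4
) -> float:
    """Crop height at which the face fills `face_fraction` of the cropped frame.

    The crop can never exceed the original frame height.
    """
    return min(frame_height, face_height / face_fraction)


def layout_rows(widths: List[float], display_width: float) -> List[List[float]]:
    """Sort stream widths (widest first) and place each in the first row it fits."""
    rows: List[List[float]] = []
    for width in sorted(widths, reverse=True):
        if width > display_width:
            raise ValueError("stream is wider than the display")
        for row in rows:
            if sum(row) + width <= display_width:
                row.append(width)
                break
        else:
            # No existing row has room, so start a new one.
            rows.append([width])
    return rows


rows = layout_rows([400, 300, 300, 200, 600], display_width=1000)
```

    In this example the five streams fit into two rows, `[600, 400]` and `[300, 300, 200]`.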
