-
Publication No.: US20170132528A1
Publication Date: 2017-05-11
Application No.: US15195894
Filing Date: 2016-06-28
Inventors: Ozlem Aslan, Rich Caruana, Matthew R. Richardson, Abdelrahman Mohamed, Matthai Philipose, Krzysztof Geras, Gregor Urban, Shengjie Wang
CPC Classification: G06N20/00
Abstract: Multiple machine learning models can be jointly trained in parallel. An example process for jointly training multiple machine learning models includes providing a set of machine learning models that are to learn a respective task, the set including a first machine learning model and a second machine learning model. The process can initiate training of the first machine learning model to learn a task using training data. During the training of the first machine learning model, information can be passed between the first machine learning model and the second machine learning model. Such passing of information (or “transfer of knowledge”) between the machine learning models can be accomplished via the formulation and optimization of an objective function that comprises model parameters based on the multiple machine learning models in the set.
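The joint-objective idea can be sketched with two toy linear models trained in parallel, where a coupling term in the shared objective passes information between their parameters. The quadratic coupling penalty, learning rate, and synthetic data below are illustrative assumptions, not the patent's actual formulation.

```python
import numpy as np

# Two linear models learn closely related tasks; the coupling term
# lam * ||w1 - w2||^2 / 2 in the shared objective transfers knowledge
# between them while both are trained in parallel.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y1 = X @ w_true              # task for the first model
y2 = X @ (w_true + 0.1)      # a closely related task for the second model

w1 = np.zeros(3)
w2 = np.zeros(3)
lam = 0.1                    # strength of the coupling penalty
lr = 0.05

for _ in range(500):
    # Gradient of 0.5*||X w - y||^2 / n for each task, plus the
    # gradient of the coupling term linking the two parameter vectors.
    g1 = X.T @ (X @ w1 - y1) / len(X) + lam * (w1 - w2)
    g2 = X.T @ (X @ w2 - y2) / len(X) + lam * (w2 - w1)
    w1 -= lr * g1
    w2 -= lr * g2
```

After training, each model lands near its own task's solution, nudged slightly toward the other model by the coupling term.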
-
Publication No.: US10264346B2
Publication Date: 2019-04-16
Application No.: US15782185
Filing Date: 2017-10-12
Abstract: Wearable audio accessories for computing devices are described. In one embodiment, the wearable audio accessory provides a speech-based interface between the user and a nearby computing device for performing user-initiated or computing-device-initiated microtasks. Information is provided to the user via a loudspeaker, and the user can provide input via a microphone. An audio sensing channel within the accessory continuously monitors the audio signal detected by the microphone and, in various embodiments, triggers more complex audio processing based on this monitoring. A wireless communication link is provided between the accessory and the nearby computing device. To mitigate any delay caused by switching between audio processing techniques, the audio accessory may include a rolling buffer that continuously stores the audio signal and outputs a delayed audio signal to the audio processing engines.
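The rolling-buffer behavior can be sketched as a fixed-size ring buffer: it continuously absorbs the newest samples so that when heavier processing is triggered, a delayed copy of the signal is still available. The class name and capacity are illustrative assumptions, not the accessory's actual firmware.

```python
from collections import deque

class RollingAudioBuffer:
    """Fixed-capacity rolling buffer: always holds the most recent
    samples, so downstream processing can read a delayed signal."""

    def __init__(self, capacity):
        # deque with maxlen silently evicts the oldest sample on overflow
        self._buf = deque(maxlen=capacity)

    def push(self, sample):
        self._buf.append(sample)

    def delayed(self):
        """Return the buffered (delayed) signal, oldest sample first."""
        return list(self._buf)

buf = RollingAudioBuffer(capacity=4)
for s in range(10):       # samples 0..9 arrive continuously
    buf.push(s)

print(buf.delayed())      # only the most recent 4 samples survive: [6, 7, 8, 9]
```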
-
Publication No.: US10223604B2
Publication Date: 2019-03-05
Application No.: US15373301
Filing Date: 2016-12-08
IPC Classification: G06K9/00, G06F9/48, G06F9/50, H04N7/18, H04N21/234
Abstract: Various technologies described herein pertain to performing video analytics. The approaches set forth herein support live video analytics at scale with approximate and delay-tolerant processing. Video streams captured by multiple cameras are continuously streamed to, and received at, a video analytics computing system. Multiple video analytics queries can be executed concurrently on the video streams as they are streamed, utilizing resources of the video analytics computing system allocated between the queries. Execution of the queries returns respective results, which can then be outputted.
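One way to picture allocating a fixed resource budget across concurrent queries, in the spirit of the approximate, delay-tolerant processing described above, is a greedy quality-per-cost upgrade rule. The rule and all numbers below are illustrative assumptions, not the patent's actual scheduler.

```python
def allocate(configs, budget):
    """configs: {query: [(cost, quality), ...]} sorted by rising cost.
    Start every query at its cheapest configuration, then repeatedly
    upgrade whichever query gains the most quality per extra unit of
    cost, until no affordable upgrade remains."""
    chosen = {q: 0 for q in configs}
    spent = sum(opts[0][0] for opts in configs.values())
    while True:
        best_q, best_gain = None, 0.0
        for q, i in chosen.items():
            opts = configs[q]
            if i + 1 < len(opts):
                dc = opts[i + 1][0] - opts[i][0]   # extra cost of upgrade
                dq = opts[i + 1][1] - opts[i][1]   # quality gained
                if spent + dc <= budget and dq / dc > best_gain:
                    best_q, best_gain = q, dq / dc
        if best_q is None:
            return {q: configs[q][i] for q, i in chosen.items()}
        i = chosen[best_q]
        spent += configs[best_q][i + 1][0] - configs[best_q][i][0]
        chosen[best_q] = i + 1

configs = {
    "license_plates": [(1, 0.5), (3, 0.9)],   # (resource cost, quality)
    "face_detection": [(1, 0.4), (2, 0.7)],
}
print(allocate(configs, budget=4))
```

With a budget of 4, the face-detection query is upgraded first (best quality per unit cost), after which the license-plate upgrade no longer fits; each query keeps an approximate but affordable configuration.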
-
Publication No.: US09984314B2
Publication Date: 2018-05-29
Application No.: US15148900
Filing Date: 2016-05-06
CPC Classification: G06K9/6285, G06K9/00718, G06K9/6227, G06K9/6256, G06K9/628, G06N3/08
Abstract: A classification system classifies different aspects of content of an input image stream, such as faces, landmarks, events, and so forth. The classification system includes a general classifier and at least one specialized classifier template. The general classifier is trained to classify a large number of different aspects of content, and a specialized classifier can be trained based on a specialized classifier template during operation of the classification system to classify a particular subset of the multiple different aspects of content. The classification system determines when to use the general classifier and when to use a specialized classifier based on class skew, which refers to the temporal locality of a subset of aspects of content in the image stream.
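The class-skew test can be sketched as follows: if a small subset of classes dominates the recently observed frames, a specialized classifier for that subset is worth switching to. The window size, subset size, and threshold are illustrative assumptions, not the patent's actual parameters.

```python
from collections import Counter, deque

class SkewDetector:
    """Detects class skew: temporal locality of a small subset of
    classes within a sliding window of recent frame labels."""

    def __init__(self, window=100, top=5, threshold=0.9):
        self.recent = deque(maxlen=window)
        self.top = top                 # size of the candidate subset
        self.threshold = threshold     # fraction of frames it must cover

    def observe(self, label):
        self.recent.append(label)

    def skewed_subset(self):
        """Return the dominant class subset if the stream is skewed
        (signal to use a specialized classifier), else None."""
        counts = Counter(self.recent)
        top = counts.most_common(self.top)
        if sum(c for _, c in top) >= self.threshold * len(self.recent):
            return {label for label, _ in top}
        return None

det = SkewDetector(window=10, top=2, threshold=0.8)
for label in ["cat"] * 5 + ["dog"] * 4 + ["bird"]:
    det.observe(label)
print(det.skewed_subset())   # {"cat", "dog"} dominate the recent window
```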
-
Publication No.: US20170323184A1
Publication Date: 2017-11-09
Application No.: US15148900
Filing Date: 2016-05-06
CPC Classification: G06K9/6285, G06K9/00718, G06K9/6227, G06K9/6256, G06K9/628, G06N3/08
Abstract: A classification system classifies different aspects of content of an input image stream, such as faces, landmarks, events, and so forth. The classification system includes a general classifier and at least one specialized classifier template. The general classifier is trained to classify a large number of different aspects of content, and a specialized classifier can be trained based on a specialized classifier template during operation of the classification system to classify a particular subset of the multiple different aspects of content. The classification system determines when to use the general classifier and when to use a specialized classifier based on class skew, which refers to the temporal locality of a subset of aspects of content in the image stream.
-
Publication No.: US11354902B2
Publication Date: 2022-06-07
Application No.: US16875080
Filing Date: 2020-05-15
Abstract: A method can include: classifying, using a compressed and specialized convolutional neural network (CNN), an object of a video frame into classes; clustering the object based on the distance of its feature vector to the feature vector of the cluster's centroid object; storing the top-k classes, a centroid identification, and a cluster identification; in response to receiving a query for objects of class X from a specific video stream, retrieving image data for each centroid of each cluster that includes class X as one of its top-k classes; classifying, using a ground truth CNN (GT-CNN), the retrieved image data for each centroid; and, for each centroid determined to be classified as a member of class X, providing image data for each object in each cluster associated with the centroid.
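The indexing pipeline can be sketched as follows: each detected object is clustered by feature-vector distance to existing centroids, the cluster is indexed under the object's top-k classes, and a query for class X later touches only matching centroids. The distance threshold, features, and labels below are illustrative assumptions, and the cheap-CNN classification and GT-CNN verification steps are stubbed out.

```python
import math

clusters = []   # each: {"centroid": feature, "members": [...], "top_k": set}

def index_object(feature, top_k, threshold=1.0):
    """Assign the object to the first cluster whose centroid is within
    the distance threshold; otherwise start a new cluster around it."""
    for c in clusters:
        if math.dist(feature, c["centroid"]) <= threshold:
            c["members"].append(feature)
            c["top_k"].update(top_k)   # index cluster under these classes
            return
    clusters.append({"centroid": feature, "members": [feature],
                     "top_k": set(top_k)})

def query(cls):
    """Return members of every cluster indexed under class `cls`.
    (A real system would re-verify each centroid with the GT-CNN.)"""
    hits = []
    for c in clusters:
        if cls in c["top_k"]:
            hits.extend(c["members"])
    return hits

index_object((0.0, 0.0), ["car", "truck"])   # top-k from the cheap CNN
index_object((0.1, 0.1), ["car", "bus"])     # close feature: same cluster
index_object((5.0, 5.0), ["person"])         # far feature: new cluster
print(len(query("car")))   # both objects in the car-tagged cluster
```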
-
Publication No.: US20190205649A1
Publication Date: 2019-07-04
Application No.: US15971563
Filing Date: 2018-05-04
CPC Classification: G06K9/00718, G06K9/00711, G06K9/6218, G06K9/6227, G06K9/6256, G06K9/6269
Abstract: A method can include: classifying, using a compressed and specialized convolutional neural network (CNN), an object of a video frame into classes; clustering the object based on the distance of its feature vector to the feature vector of the cluster's centroid object; storing the top-k classes, a centroid identification, and a cluster identification; in response to receiving a query for objects of class X from a specific video stream, retrieving image data for each centroid of each cluster that includes class X as one of its top-k classes; classifying, using a ground truth CNN (GT-CNN), the retrieved image data for each centroid; and, for each centroid determined to be classified as a member of class X, providing image data for each object in each cluster associated with the centroid.
-
Publication No.: US20170235828A1
Publication Date: 2017-08-17
Application No.: US15043219
Filing Date: 2016-02-12
IPC Classification: G06F17/30, H04N21/84, H04N21/2665, G06K9/00, H04N21/234
CPC Classification: G06F16/783, G06F16/7837, G06K9/00718, H04N21/23418, H04N21/2665, H04N21/84
Abstract: A digest generation system obtains video streams and includes an admission control module that selects, for each video stream, a subset of the frames to analyze. A frame-to-text classifier generates a digest for each selected frame, and the generated digests are stored in a digest store such that each digest is associated with the video stream from which it was generated. The digest for a frame is text that describes the frame, such as objects identified in the frame. A viewer desiring a video stream with particular characteristics inputs a text search query to a search system. Based on the digests, the search system generates search results indicating video streams that satisfy the search criteria. The results are presented to the user, allowing the user to select and view one of the video streams.
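The digest-store-and-search path can be sketched as a per-stream store of short text digests queried by term matching. The digest texts and the every-term matching rule below are illustrative assumptions, not the patent's actual classifier or ranking.

```python
digest_store = {}   # stream id -> list of frame digests (plain text)

def add_digest(stream, digest):
    """Store a frame-to-text digest, associated with its source stream."""
    digest_store.setdefault(stream, []).append(digest)

def search(query):
    """Return streams where at least one frame digest contains every
    term of the query (case-insensitive), in sorted order."""
    terms = query.lower().split()
    return sorted(
        stream for stream, digests in digest_store.items()
        if any(all(t in d.lower() for t in terms) for d in digests)
    )

add_digest("cam1", "red car at intersection")
add_digest("cam1", "pedestrian crossing street")
add_digest("cam2", "blue car parked")
print(search("red car"))   # only cam1 has a digest with both terms
```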
-
Publication No.: US20170060850A1
Publication Date: 2017-03-02
Application No.: US14834197
Filing Date: 2015-08-24
Inventors: William Lewis, Arul Menezes, Matthai Philipose, Vishal Chowdhary, John Franciscus Marie Helmes, Stephen Hodges, Stuart Alastair Taylor
CPC Classification: G06F17/289, G06F17/2836, G10L13/00, G10L15/24, G10L15/30, G10L15/32
Abstract: The personal translator implementations described herein provide a speech translation device that pairs with a computing device to translate in-person conversations. The speech translation device can be wearable. In one implementation, the personal translator comprises: at least one microphone that captures input signals representing nearby speech of a first user/wearer of the device and at least one other nearby person in a two-language conversation; a wireless communication unit that sends the captured input signals to a nearby computing device and receives, for each language in the conversation, language translations from the computing device; and at least one loudspeaker that outputs the language translations to the first user/wearer and the at least one other nearby person. The language translations can be displayed as text at the same time they are output to the loudspeaker(s).
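The per-utterance flow can be sketched as a single conversation turn: captured speech is translated by the paired computing device, and the result is routed to both the loudspeaker and the display. The `translate` stub and its phrasebook are placeholder assumptions, not a real translation API.

```python
def translate(text, src, dst):
    # Placeholder: a real system would send the audio to the paired
    # computing device and call a translation service there.
    phrasebook = {("hello", "en", "es"): "hola"}
    return phrasebook.get((text, src, dst), text)

def conversation_turn(utterance, speaker_lang, listener_lang):
    """Translate one captured utterance and route the result to both
    output channels: spoken playback and on-screen text."""
    translated = translate(utterance, speaker_lang, listener_lang)
    return {"loudspeaker": translated, "display": translated}

out = conversation_turn("hello", "en", "es")
print(out["display"])   # the same translation is spoken and displayed
```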
-
Publication No.: US11170819B2
Publication Date: 2021-11-09
Application No.: US16411611
Filing Date: 2019-05-14
Inventors: Donald Frank Brinkman, Jr., Suvamsh Shivaprasad, Max Artemov, Lenin Ravindranath Sivalingam, Matthai Philipose, Peter Bodik
IPC Classification: G11B27/10, H04N21/472, H04N21/845, H04N21/8549
Abstract: Described herein is a mechanism for creating a dynamic video highlight from a plurality of video segments. A metadata collection agent collects metadata comprising attributes about a video, segments within the video where one or more events occur, attributes about the creator of the video, and so forth. The metadata is used to create highlight video definitions comprising a set of metadata attribute-value pairs. The definitions can be created interactively by presenting a user interface that allows selection of a combination of attribute-value pairs to include/exclude segments and/or manual selection of custom segments. Highlight video definitions can be stored and/or shared among users. The definitions are then used to instantiate one or more video players that play the video segments in an identified order without assembling the segments into a separate video.
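A highlight definition can be sketched as a set of attribute-value pairs that selects matching segments in order, leaving the underlying video untouched. The metadata fields and segment data below are illustrative assumptions, not the patent's actual schema.

```python
# Segment metadata as collected by the agent (illustrative values).
segments = [
    {"start": 0,  "end": 10, "event": "goal", "player": "A"},
    {"start": 30, "end": 40, "event": "save", "player": "B"},
    {"start": 55, "end": 65, "event": "goal", "player": "B"},
]

def build_highlight(definition):
    """Select segments whose metadata matches every attribute-value
    pair in the definition, preserving order. The resulting playlist
    can drive video players directly, without re-encoding a new video."""
    return [s for s in segments
            if all(s.get(k) == v for k, v in definition.items())]

playlist = build_highlight({"event": "goal"})
print([(s["start"], s["end"]) for s in playlist])   # [(0, 10), (55, 65)]
```

Because the definition is just attribute-value pairs, it can be stored and shared, and re-evaluated later against updated segment metadata.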
-