-
Publication No.: US20160267179A1
Publication date: 2016-09-15
Application No.: US15030815
Filing date: 2013-10-21
Inventors: Tao Mei, Shipeng Li, Wu Liu
IPC classification: G06F17/30
Abstract: A facility for using a mobile device to search video content takes advantage of computing capacity on the mobile device to capture input through a camera and/or a microphone, extract an audio-video signature of the input in real time, and perform progressive search. By extracting a joint audio-video signature from the input in real time as the input is received and sending the signature to the cloud to search for similar video content through layered audio-video indexing, the facility can provide progressive results of candidate videos for progressive signature captures.
-
Publication No.: US09754188B2
Publication date: 2017-09-05
Application No.: US14522194
Filing date: 2014-10-23
Inventors: Tao Mei, Jianlong Fu, Kuiyuan Yang, Yong Rui
CPC classification: G06K9/6256, G06F17/30247, G06F17/30256, G06F17/3028, G06K9/627
Abstract: Techniques and constructs to facilitate automatic tagging can provide improvements in image storage and searching. The constructs may enable training a deep network using tagged source images and target images. The constructs may also train a top layer of the deep network using a personal photo ontology. The constructs also may select one or more concepts from the ontology for tagging personal digital images.
-
Publication No.: US20170109584A1
Publication date: 2017-04-20
Application No.: US14887629
Filing date: 2015-10-20
IPC classification: G06K9/00, G11B27/30, G11B27/031
CPC classification: G06K9/00718, G06K9/00751, G11B27/031, G11B27/3081, H04N21/45457, H04N21/4666, H04N21/8549
Abstract: Video highlight detection using pairwise deep ranking neural network training is described. In some examples, highlights in a video are discovered, then used for generating summarization of videos, such as first-person videos. A pairwise deep ranking model is employed to learn the relationship between previously identified highlight and non-highlight video segments. This relationship is encapsulated in a neural network. An example two-stream process generates highlight scores for each segment of a user's video. The obtained highlight scores are used to summarize highlights of the user's video.
-
Publication No.: US11538244B2
Publication date: 2022-12-27
Application No.: US16642660
Filing date: 2018-06-22
Abstract: Implementations of the subject matter described herein provide a solution for extracting spatial-temporal feature representation. In this solution, an input comprising a plurality of images is received at a first layer of a learning network. First features that characterize spatial presentation of the images are extracted from the input in a spatial dimension using a first unit of the first layer. Based on a type of a connection between the first unit and a second unit of the first layer, second features at least characterizing temporal changes across the images are extracted from the first features and/or the input in a temporal dimension using the second unit. A spatial-temporal feature representation of the images is generated partially based on the second features. Through this solution, it is possible to reduce learning network sizes, improve training and use efficiency of learning networks, and obtain accurate spatial-temporal feature representations.
-
Publication No.: US10459964B2
Publication date: 2019-10-29
Application No.: US15323247
Filing date: 2014-07-04
Inventors: Tao Mei, Yan-Feng Sun, Yong Rui, Chun-Che Wu
IPC classification: G06F16/50, G06F16/9038, G06F16/9535, G06F16/9032, G06F16/532, G06F16/58
Abstract: Techniques and constructs to facilitate suggestion of image-based search queries can provide personalized trending image search queries. The constructs may enable identification of trending image searches and further personalize those trending image search queries for an identified user based on information about the user's search history and the search histories of other users. The constructs also may select a representative image for display to the user, such that selection of the representative image will execute the search query. The representative image may be selected from a plurality of candidate images based on its burstiness.
-
Publication No.: US10452712B2
Publication date: 2019-10-22
Application No.: US15030815
Filing date: 2013-10-21
Inventors: Tao Mei, Shipeng Li, Wu Liu
IPC classification: G06F16/00, G06F16/732, G06F16/71, G06F16/738, G06F16/783, G06K9/00, G06K9/46
Abstract: A facility for using a mobile device to search video content takes advantage of computing capacity on the mobile device to capture input through a camera and/or a microphone, extract an audio-video signature of the input in real time, and perform progressive search. By extracting a joint audio-video signature from the input in real time as the input is received and sending the signature to the cloud to search for similar video content through layered audio-video indexing, the facility can provide progressive results of candidate videos for progressive signature captures.
-
Publication No.: US11670071B2
Publication date: 2023-06-06
Application No.: US16631923
Filing date: 2018-05-29
Inventors: Jianlong Fu, Tao Mei
IPC classification: G06V30/194, G06T7/11, G06T1/20, G06T1/60, G06V10/94, G06V10/44, G06V10/764, G06V10/82
CPC classification: G06V10/454, G06T1/20, G06T1/60, G06T7/11, G06V10/764, G06V10/82, G06V10/95
Abstract: In accordance with implementations of the subject matter described herein, a solution for fine-grained image recognition is proposed. This solution includes extracting a global feature of an image using a first sub-network of a first learning network; determining a first attention region of the image based on the global feature using a second sub-network of the first learning network, the first attention region including a discriminative portion of an object in the image; extracting a first local feature of the first attention region using a first sub-network of a second learning network; and determining a category of the object in the image based at least in part on the first local feature. Through this solution, it is possible to localize an image region at a finer scale accurately such that a local feature at a fine scale can be obtained for object recognition.
-
Publication No.: US09807473B2
Publication date: 2017-10-31
Application No.: US14946988
Filing date: 2015-11-20
IPC classification: H04N5/445, H04N21/8405, G06F17/27, G06K9/00, G06N3/08
CPC classification: H04N21/8405, G06F17/274, G06F17/2785, G06K9/00718, G06K9/6273, G06N3/08, H04N21/26603
Abstract: Video description generation using neural network training based on relevance and coherence is described. In some examples, long short-term memory with visual-semantic embedding (LSTM-E) can maximize the probability of generating the next word given previous words and visual content and can create a visual-semantic embedding space for enforcing the relationship between the semantics of an entire sentence and visual content. LSTM-E can include 2-D and/or 3-D deep convolutional neural networks for learning powerful video representation, a deep recurrent neural network for generating sentences, and a joint embedding model for exploring the relationships between visual content and sentence semantics.
-
Publication No.: US20170150235A1
Publication date: 2017-05-25
Application No.: US14946988
Filing date: 2015-11-20
IPC classification: H04N21/8405, G06K9/00, G06N3/08, G06F17/27
CPC classification: H04N21/8405, G06F17/274, G06F17/2785, G06K9/00718, G06K9/6273, G06N3/08, H04N21/26603
Abstract: Video description generation using neural network training based on relevance and coherence is described. In some examples, long short-term memory with visual-semantic embedding (LSTM-E) can maximize the probability of generating the next word given previous words and visual content and can create a visual-semantic embedding space for enforcing the relationship between the semantics of an entire sentence and visual content. LSTM-E can include 2-D and/or 3-D deep convolutional neural networks for learning powerful video representation, a deep recurrent neural network for generating sentences, and a joint embedding model for exploring the relationships between visual content and sentence semantics.
-
Publication No.: US20170139954A1
Publication date: 2017-05-18
Application No.: US15323247
Filing date: 2014-07-04
Inventors: Tao Mei, Yan-Feng Sun, Yong Rui, Chun-Che Wu
IPC classification: G06F17/30
CPC classification: G06F16/50, G06F16/532, G06F16/58, G06F16/90324, G06F16/9038, G06F16/9535
Abstract: Techniques and constructs to facilitate suggestion of image-based search queries can provide personalized trending image search queries. The constructs may enable identification of trending image searches and further personalize those trending image search queries for an identified user based on information about the user's search history and the search histories of other users. The constructs also may select a representative image for display to the user, such that selection of the representative image will execute the search query. The representative image may be selected from a plurality of candidate images based on its burstiness.