-
Publication number: US20240249138A1
Publication date: 2024-07-25
Application number: US18395282
Application date: 2023-12-22
Applicant: Google LLC
Inventor: Sergey Ioffe , Corinna Cortes
CPC classification number: G06N3/08 , G06F18/10 , G06F18/2415 , G06N3/04 , G06N3/084 , G06V10/70 , G06V10/82 , G06T2207/20081
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images or features of images using an image classification system that includes a batch normalization layer. One of the systems includes a convolutional neural network configured to receive an input comprising an image or image features of the image and to generate a network output that includes respective scores for each object category in a set of object categories, the score for each object category representing a likelihood that the image contains an image of an object belonging to the category, and the convolutional neural network comprising: a plurality of neural network layers, the plurality of neural network layers comprising a first convolutional neural network layer and a second neural network layer; and a batch normalization layer between the first convolutional neural network layer and the second neural network layer.
-
Publication number: US20230093469A1
Publication date: 2023-03-23
Application number: US18071806
Application date: 2022-11-30
Applicant: Google LLC
Inventor: Sergey Ioffe
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for training a neural network, wherein the neural network is configured to receive an input data item and to process the input data item to generate a respective score for each label in a predetermined set of multiple labels. The method includes actions of obtaining a set of training data that includes a plurality of training items, wherein each training item is associated with a respective label from the predetermined set of multiple labels; and modifying the training data to generate regularizing training data, comprising: for each training item, determining whether to modify the label associated with the training item, and changing the label associated with the training item to a different label from the predetermined set of labels, and training the neural network on the regularizing data.
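The label-modification step in this abstract can be illustrated with a minimal sketch. The abstract does not say how the decision to modify a label is made, so the per-item probability `flip_prob`, the uniform choice among the other labels, and the function name `regularize_labels` are all assumptions made here for illustration only.

```python
import random


def regularize_labels(labels, label_set, flip_prob, rng):
    """Return a new label list in which some labels are swapped.

    Assumption (not from the abstract): each label is changed independently
    with probability flip_prob, and the replacement is drawn uniformly from
    the other labels in label_set.
    """
    regularized = []
    for label in labels:
        if rng.random() < flip_prob:
            # Change the label to a *different* label from the same set.
            alternatives = [c for c in label_set if c != label]
            regularized.append(rng.choice(alternatives))
        else:
            regularized.append(label)
    return regularized


label_set = ["cat", "dog", "bird"]
labels = ["cat", "dog", "dog", "bird", "cat"]
rng = random.Random(0)
noisy_labels = regularize_labels(labels, label_set, flip_prob=0.1, rng=rng)
```

The network would then be trained on the items paired with `noisy_labels` rather than the original labels.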
-
Publication number: US10956749B2
Publication date: 2021-03-23
Application number: US16298327
Application date: 2019-03-11
Applicant: Google LLC
Inventor: Matthias Grundmann , Alexandra Ivanna Hawkins , Sergey Ioffe
Abstract: Methods, systems, and media for summarizing a video with video thumbnails are provided. In some embodiments, the method comprises: receiving a plurality of video frames corresponding to the video and associated information associated with each of the plurality of video frames; extracting, for each of the plurality of video frames, a plurality of features; generating candidate clips that each includes at least a portion of the received video frames based on the extracted plurality of features and the associated information; calculating, for each candidate clip, a clip score based on the extracted plurality of features from the video frames associated with the candidate clip; calculating, between adjacent candidate clips, a transition score based at least in part on a comparison of video frame features between frames from the adjacent candidate clips; selecting a subset of the candidate clips based at least in part on the clip score and the transition score associated with each of the candidate clips; and automatically generating an animated video thumbnail corresponding to the video that includes a plurality of video frames selected from each of the subset of candidate clips.
-
Publication number: US10514818B2
Publication date: 2019-12-24
Application number: US15092102
Application date: 2016-04-06
Applicant: Google LLC
Inventor: Sergey Ioffe , Vivek Kwatra , Matthias Grundmann
IPC: G06F3/0481 , G06F16/58 , G06F16/438
Abstract: A computer-implemented method, computer program product, and computing system are provided for interacting with images having similar content. In an embodiment, a method may include identifying a plurality of photographs as including a common characteristic. The method may also include generating a flipbook media item including the plurality of photographs. The method may further include associating one or more interactive control features with the flipbook media item.
-
Publication number: US10460211B2
Publication date: 2019-10-29
Application number: US15395530
Application date: 2016-12-30
Applicant: Google LLC
Inventor: Vincent O. Vanhoucke , Christian Szegedy , Sergey Ioffe
Abstract: A neural network system that includes: multiple subnetworks that include: a first subnetwork including multiple first modules, each first module including: a pass-through convolutional layer configured to process the subnetwork input for the first subnetwork to generate a pass-through output; an average pooling stack of neural network layers that collectively processes the subnetwork input for the first subnetwork to generate an average pooling output; a first stack of convolutional neural network layers configured to collectively process the subnetwork input for the first subnetwork to generate a first stack output; a second stack of convolutional neural network layers that are configured to collectively process the subnetwork input for the first subnetwork to generate a second stack output; and a concatenation layer configured to concatenate the pass-through output, the average pooling output, the first stack output, and the second stack output to generate a first module output for the first module.
-
Publication number: US09940552B1
Publication date: 2018-04-10
Application number: US15069697
Application date: 2016-03-14
Applicant: Google LLC
Inventor: Sergey Ioffe , Alexander Toshkov Toshev
IPC: G06K9/62
CPC classification number: G06K9/6276 , G06K9/6215 , G06K9/6267 , G06K9/628
Abstract: A linear function describing a framework for identifying an object of class k in an image sample x may be described by: wk*x+bk, where bk is the bias term. The higher the value obtained for a particular classifier, the better the match or strength of identity. A method is disclosed for classifier and/or content padding to convert dot-products to distances, applying a hashing and/or nearest neighbor technique on the resulting padded vectors, and preprocessing that may improve the hash entropy. A vector for an image, an audio, and/or a video may be received. One or more classifier vectors may be obtained. A padded image, video, and/or audio vector and classifier vector may be generated. A dot product may be approximated and a hashing and/or nearest neighbor technique may be performed on the approximated dot product to identify at least one class (or object) present in the image, video, and/or audio.
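One way to see how padding can convert dot-products to distances is the sketch below. It uses a commonly known reduction from maximum dot-product search to nearest-neighbor search and is not taken from the patent text: the bias folding, the extra padding coordinate, and the names `pad_classifiers`, `pad_sample`, and `nearest_class` are assumptions for illustration. A brute-force distance scan stands in for the hashing or nearest-neighbor technique the abstract mentions.

```python
import math


def pad_classifiers(classifiers):
    """Pad each (w_k, b_k) so that all padded vectors share the same norm.

    Assumed construction: fold the bias into the vector as [w_k, b_k], then
    append sqrt(M^2 - |[w_k, b_k]|^2), where M^2 is the largest squared norm.
    """
    augmented = [list(w) + [b] for w, b in classifiers]
    squared_norms = [sum(v * v for v in a) for a in augmented]
    max_sq = max(squared_norms)
    return [
        a + [math.sqrt(max(0.0, max_sq - sq))]
        for a, sq in zip(augmented, squared_norms)
    ]


def pad_sample(x):
    """Pad the sample with 1 (to pick up the bias) and 0 (to ignore the pad)."""
    return list(x) + [1.0, 0.0]


def nearest_class(x, classifiers):
    """Index of the padded classifier closest to the padded sample.

    Since |q - p_k|^2 = |q|^2 + M^2 - 2 * (w_k . x + b_k), the nearest padded
    classifier is the one with the highest linear score.
    """
    query = pad_sample(x)
    distances = [
        sum((qi - pi) ** 2 for qi, pi in zip(query, p))
        for p in pad_classifiers(classifiers)
    ]
    return min(range(len(distances)), key=distances.__getitem__)


classifiers = [([1.0, 0.0], 0.0), ([0.0, 1.0], 0.5), ([2.0, 2.0], -3.0)]
best = nearest_class([1.0, 0.2], classifiers)
```

In this sketch the nearest padded vector is the classifier with the largest value of wk*x+bk, which is why a distance-based index can be used in place of evaluating every dot product.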
-
Publication number: US20250013864A1
Publication date: 2025-01-09
Application number: US18740393
Application date: 2024-06-11
Applicant: Google LLC
Inventor: Sergey Ioffe , Corinna Cortes
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing inputs using a neural network system that includes a batch normalization layer. One of the methods includes receiving a respective first layer output for each training example in the batch; computing a plurality of normalization statistics for the batch from the first layer outputs; normalizing each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch; generating a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and providing the batch normalization layer output as an input to the second neural network layer.
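The sequence of steps in this abstract can be sketched in plain Python as follows. The abstract does not name the normalization statistics or how the final layer output is derived from the normalized values, so the per-component mean and variance, the small constant `eps`, and the scale and shift parameters `gamma` and `beta` are assumptions here, following the commonly used formulation of batch normalization.

```python
import math


def batch_norm(first_layer_outputs, gamma=None, beta=None, eps=1e-5):
    """Normalize each component across the batch, then scale and shift.

    Assumptions (not stated in the abstract): the statistics are the
    per-component batch mean and variance, and gamma, beta, and eps are
    hypothetical parameters of this sketch.
    """
    n = len(first_layer_outputs)
    dim = len(first_layer_outputs[0])
    gamma = gamma if gamma is not None else [1.0] * dim
    beta = beta if beta is not None else [0.0] * dim

    # Normalization statistics computed over the batch, one pair per component.
    means = [sum(x[j] for x in first_layer_outputs) / n for j in range(dim)]
    variances = [
        sum((x[j] - means[j]) ** 2 for x in first_layer_outputs) / n
        for j in range(dim)
    ]

    # Normalize every component of every example, then apply scale and shift.
    return [
        [
            gamma[j] * (x[j] - means[j]) / math.sqrt(variances[j] + eps) + beta[j]
            for j in range(dim)
        ]
        for x in first_layer_outputs
    ]


batch = [[1.0, 10.0], [3.0, 30.0]]
outputs = batch_norm(batch)  # would be passed on to the second layer
```

With the default `gamma` and `beta`, each component of `outputs` has approximately zero mean and unit variance across the batch, regardless of the original scale of that component.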
-
Publication number: US11062181B2
Publication date: 2021-07-13
Application number: US16550731
Application date: 2019-08-26
Applicant: Google LLC
Inventor: Vincent O. Vanhoucke , Christian Szegedy , Sergey Ioffe
Abstract: A neural network system that includes: multiple subnetworks that include: a first subnetwork including multiple first modules, each first module including: a pass-through convolutional layer configured to process the subnetwork input for the first subnetwork to generate a pass-through output; an average pooling stack of neural network layers that collectively processes the subnetwork input for the first subnetwork to generate an average pooling output; a first stack of convolutional neural network layers configured to collectively process the subnetwork input for the first subnetwork to generate a first stack output; a second stack of convolutional neural network layers that are configured to collectively process the subnetwork input for the first subnetwork to generate a second stack output; and a concatenation layer configured to concatenate the pass-through output, the average pooling output, the first stack output, and the second stack output to generate a first module output for the first module.
-
Publication number: US20200234127A1
Publication date: 2020-07-23
Application number: US16837959
Application date: 2020-04-01
Applicant: Google LLC
Inventor: Sergey Ioffe , Corinna Cortes
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images or features of images using an image classification system that includes a batch normalization layer. One of the systems includes a convolutional neural network configured to receive an input comprising an image or image features of the image and to generate a network output that includes respective scores for each object category in a set of object categories, the score for each object category representing a likelihood that the image contains an image of an object belonging to the category, and the convolutional neural network comprising: a plurality of neural network layers, the plurality of neural network layers comprising a first convolutional neural network layer and a second neural network layer; and a batch normalization layer between the first convolutional neural network layer and the second neural network layer.
-
Publication number: US20200057924A1
Publication date: 2020-02-20
Application number: US16226483
Application date: 2018-12-19
Applicant: Google LLC
Inventor: Sergey Ioffe , Corinna Cortes
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images or features of images using an image classification system that includes a batch normalization layer. One of the systems includes a convolutional neural network configured to receive an input comprising an image or image features of the image and to generate a network output that includes respective scores for each object category in a set of object categories, the score for each object category representing a likelihood that the image contains an image of an object belonging to the category, and the convolutional neural network comprising: a plurality of neural network layers, the plurality of neural network layers comprising a first convolutional neural network layer and a second neural network layer; and a batch normalization layer between the first convolutional neural network layer and the second neural network layer.