-
Publication No.: US20230186082A1
Publication Date: 2023-06-15
Application No.: US17978026
Filing Date: 2022-10-31
Applicant: Google LLC
Inventor: Michele Covell , David Marwood , Shumeet Baluja , Nicholas Johnston
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing a network input using a neural network to generate a network output for the network input. One of the methods includes maintaining, for each of the plurality of neural network layers, a respective look-up table that maps each possible combination of a quantized input index and a quantized weight index to a multiplication result; and generating a network output from a network input, comprising, for each of the neural network layers: receiving data specifying a quantized input to the neural network layer, the quantized input comprising a plurality of quantized input values; and generating a layer output for the neural network layer from the quantized input to the neural network layer using the respective look-up table for the neural network layer.
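The table-lookup idea in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names (`build_lut`, `lut_layer`), the dense-layer shape, and the example quantization levels are all assumptions made for the sketch.

```python
from typing import List


def build_lut(input_levels: List[float], weight_levels: List[float]) -> List[List[float]]:
    """Precompute the product for every (input index, weight index) pair."""
    return [[x * w for w in weight_levels] for x in input_levels]


def lut_layer(input_idx: List[int], weight_idx: List[List[int]],
              lut: List[List[float]]) -> List[float]:
    """Hypothetical dense layer: each output sums table entries
    instead of performing multiplications at inference time."""
    return [sum(lut[i][w] for i, w in zip(input_idx, row)) for row in weight_idx]


# Assumed quantization levels; indices into these lists stand in for real values.
input_levels = [0.0, 0.5, 1.0, 1.5]
weight_levels = [-1.0, 0.0, 1.0]
lut = build_lut(input_levels, weight_levels)

# Quantized input indices and, per output unit, quantized weight indices.
out = lut_layer([1, 3, 2], [[2, 0, 1], [2, 2, 0]], lut)
print(out)
```

The table has one entry per index pair, so its size depends only on the number of quantization levels, not on the layer width.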
-
Publication No.: US20220237882A1
Publication Date: 2022-07-28
Application No.: US17614929
Filing Date: 2019-05-28
Applicant: Google LLC
Inventor: Shumeet Baluja , Rahul Sukthankar
IPC: G06V10/20 , G06V10/774 , G06V10/82 , G06V10/94
Abstract: The present disclosure is directed to encoding images. In particular, one or more computing devices can receive data representing one or more machine learning (ML) models configured, at least in part, to encode images comprising objects of a particular type. The computing device(s) can receive data representing an image comprising one or more objects of the particular type. The computing device(s) can generate, based at least in part on the data representing the image and the data representing the ML model(s), data representing an encoded version of the image that alters at least a portion of the image comprising the object(s) such that when the encoded version of the image is decoded, the object(s) are unrecognizable as being of the particular type by one or more object-recognition ML models based at least in part upon which the ML model(s) configured to encode the images were trained.
-
Publication No.: US11092455B2
Publication Date: 2021-08-17
Application No.: US15950645
Filing Date: 2018-04-11
Applicant: Google LLC
Inventor: Henry Allan Rowley , Shumeet Baluja
IPC: G01C21/26 , G01C21/36 , G01C21/34 , G08G1/0968 , G06Q30/02
Abstract: A computer-implemented method of providing personalized route information involves gathering a plurality of past location indicators over time for a wireless client device, determining a future driving objective using the plurality of previously-gathered location indicators, obtaining real-time traffic data for an area proximate to the determined driving objective, and generating a suggested route for the driving objective using the near real-time traffic data.
-
Publication No.: US11019366B2
Publication Date: 2021-05-25
Application No.: US16413992
Filing Date: 2019-05-16
Applicant: Google LLC
Inventor: David Marwood , Michele Covell , Shumeet Baluja , Nicholas Milo Johnston , Pascal Massimino
IPC: G06T9/00 , H04N19/90 , H04N19/186 , H04N19/176
Abstract: An encoder system can include a pixel grid generator to receive an image having a first dimension, generate a grid having a second dimension, add a plurality of points to positions on the grid, and map a plurality of pixels of the image to the plurality of points. The encoder system can include a color module to assign a color to each of the plurality of points using a color table, a triangulation module to generate a plurality of vertices based on the plurality of points and triangulate the grid using the vertices, and a compression module to compress the vertices as a set of compressed vertex positions and a set of vertex colors.
-
Publication No.: US10681388B2
Publication Date: 2020-06-09
Application No.: US15883639
Filing Date: 2018-01-30
Applicant: Google LLC
Inventor: Michele Covell , David Marwood , Shumeet Baluja , Rahul Sukthankar
IPC: H04N19/44 , H04N19/463 , H04N19/91 , H04N19/176 , H04N19/14 , H04N19/13
Abstract: Encoding and decoding occupancy information is disclosed. A method includes determining row sums for the region, determining column sums for the region, encoding, in a compressed bitstream, at least one of the row sums and the column sums, and encoding, in the compressed bitstream and based on a coding order, at least one of the rows and the columns of the region. The coding order is based on the encoded at least one of the row sums and the column sums. The row sums include, for each row of the region, a respective count of a number of locations in the row having a specified value. The column sums include, for each column of the region, a respective count of a number of locations in the column having the specified value. A location having the specified value is indicative of the occupancy information at the location.
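The row sums and column sums described in this abstract can be computed as in the sketch below. The abstract does not state how the coding order is derived from the sums, so `row_coding_order` uses an assumed rule (densest rows first) purely for illustration; all function names are hypothetical.

```python
from typing import List, Tuple


def occupancy_sums(region: List[List[int]], value: int = 1) -> Tuple[List[int], List[int]]:
    """Count, per row and per column, the locations holding `value`."""
    row_sums = [sum(1 for cell in row if cell == value) for row in region]
    col_sums = [sum(1 for row in region if row[j] == value)
                for j in range(len(region[0]))]
    return row_sums, col_sums


def row_coding_order(row_sums: List[int]) -> List[int]:
    """Assumed ordering rule (not from the source): rows with the
    largest sums first, ties broken by row index."""
    return sorted(range(len(row_sums)), key=lambda r: (-row_sums[r], r))


region = [
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
]
row_sums, col_sums = occupancy_sums(region)
order = row_coding_order(row_sums)
print(row_sums, col_sums, order)
```

A decoder that receives the same sums can derive the same order, which is why the order itself would not need to be transmitted under this assumed rule.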
-
Publication No.: US09953636B2
Publication Date: 2018-04-24
Application No.: US14879755
Filing Date: 2015-10-09
Applicant: Google LLC
Inventor: Michael H. Cohen , Shumeet Baluja , Pedro J. Moreno Mengibar
IPC: G10L15/00 , G10L17/00 , G10L15/065 , G10L15/187 , G10L15/26 , G10L15/06
CPC classification number: G10L15/065 , G10L15/06 , G10L15/063 , G10L15/187 , G10L15/26 , G10L2015/0635
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
-
Publication No.: US20240370706A1
Publication Date: 2024-11-07
Application No.: US18690176
Filing Date: 2021-10-01
Applicant: Google LLC
Inventor: David Marwood , Shumeet Baluja
IPC: G06N3/0464
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing an input through each of a plurality of layers of a neural network to generate an output, wherein the plurality of layers comprise a convolutional layer. One of the methods includes: receiving a layer input for the convolutional layer; processing the layer input to generate a layer output for the convolutional layer, comprising determining a convolution between the layer input and a filter associated with the convolutional layer; generating a spatial weight mask for the convolutional layer by using a contextual convolution block in accordance with a set of one or more spatially sensitive mask functions defined in the contextual convolution block; and determining a weighted layer output for the convolutional layer, comprising determining a product between the spatial weight mask and the layer output of the convolutional layer.
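The final step in this abstract, an element-wise product between a spatial weight mask and the convolution output, can be sketched as below. The abstract does not define the mask functions, so `ramp_mask` (a left-to-right ramp) is a stand-in assumption; `conv2d_valid` and `weighted_output` are likewise hypothetical names, and the convolution is the unflipped, "valid" form common in neural network code.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(x: Matrix, k: Matrix) -> Matrix:
    """Single-channel 2D convolution without padding (kernel not flipped)."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]


def ramp_mask(h: int, w: int) -> Matrix:
    """Assumed mask function: weight rises linearly from 0 (left) to 1 (right)."""
    return [[j / (w - 1) if w > 1 else 1.0 for j in range(w)] for _ in range(h)]


def weighted_output(x: Matrix, k: Matrix) -> Matrix:
    """Multiply the layer output element-wise by the spatial mask."""
    y = conv2d_valid(x, k)
    m = ramp_mask(len(y), len(y[0]))
    return [[m[i][j] * y[i][j] for j in range(len(y[0]))] for i in range(len(y))]


x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
result = weighted_output(x, k)
print(result)
```

Because the mask varies with position, the same filter contributes differently at different locations in the output, which is the property the abstract calls spatially sensitive.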
-
Publication No.: US20230307003A1
Publication Date: 2023-09-28
Application No.: US18204720
Filing Date: 2023-06-01
Applicant: Google LLC
Inventor: Michele Covell , Shumeet Baluja
Abstract: In accordance with some embodiments of the disclosed subject matter, mechanisms for seamless audio melding between audio items in a playlist are provided. In some embodiments, a method for transitioning between audio items in playlists is provided, comprising: identifying a sequence of audio items in a playlist of audio items, wherein the sequence of audio items includes a first audio item and a second audio item that is to be played subsequent to the first audio item; and modifying an end portion of the first audio item and a beginning portion of the second audio item, where the end portion of the first audio item and the beginning portion of the second audio item are to be played concurrently to transition between the first audio item and the second audio item, wherein the end portion of the first audio item and the beginning portion of the second audio item have an overlap duration, and wherein modifying the end portion of the first audio item and the beginning portion of the second audio item comprises: generating a first spectrogram corresponding to the end portion of the first audio item and a second spectrogram corresponding to the beginning portion of the second audio item; identifying, for each frequency band in a series of frequency bands, a window over which the first spectrogram within the end portion of the first audio item and the second spectrogram within the beginning portion of the second audio item have a particular cross-correlation; modifying, for each frequency band in the series of frequency bands, the end portion of the first spectrogram and the beginning portion of the second spectrogram such that amplitudes of frequencies within the frequency band decrease within the first spectrogram over the end portion of the first spectrogram and that amplitudes of frequencies within the frequency band increase within the second spectrogram over the beginning portion of the second spectrogram; and generating a modified version of the first audio item that includes the modified end portion of the first audio item based on the modified end portion of the first spectrogram and generating a modified version of the second audio item that includes the modified beginning portion of the second audio item based on the modified beginning portion of the second spectrogram.
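The per-band amplitude modification in this abstract can be sketched as a band-specific crossfade. This is only an illustration under stated assumptions: the fade is linear, the per-band windows are supplied by hand rather than found by cross-correlation as the abstract describes, and `fade_gains` and `apply_band_fades` are hypothetical names.

```python
from typing import List, Tuple


def fade_gains(n_frames: int, start: int, end: int, fade_out: bool) -> List[float]:
    """Linear gain per frame: flat before `start`, ramping over [start, end), flat after."""
    gains = []
    for t in range(n_frames):
        if t < start:
            g = 0.0
        elif t >= end:
            g = 1.0
        else:
            g = (t - start) / (end - start)
        gains.append(1.0 - g if fade_out else g)
    return gains


def apply_band_fades(spec: List[List[float]], windows: List[Tuple[int, int]],
                     fade_out: bool) -> List[List[float]]:
    """spec[band][frame] holds magnitudes; windows[band] is that band's (start, end)."""
    return [[a * g for a, g in zip(band, fade_gains(len(band), s, e, fade_out))]
            for band, (s, e) in zip(spec, windows)]


# Two bands, four overlapping frames, unit magnitudes for readability.
first_end = [[1.0] * 4, [1.0] * 4]
second_start = [[1.0] * 4, [1.0] * 4]
windows = [(0, 4), (2, 4)]  # assumed windows; one per frequency band

faded_out = apply_band_fades(first_end, windows, fade_out=True)
faded_in = apply_band_fades(second_start, windows, fade_out=False)
print(faded_out, faded_in)
```

Giving each band its own window lets, for example, low frequencies hand over later than high frequencies, while the two gains in every band still sum to one at each frame.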
-
Publication No.: US11488016B2
Publication Date: 2022-11-01
Application No.: US16751175
Filing Date: 2020-01-23
Applicant: Google LLC
Inventor: Michele Covell , David Marwood , Shumeet Baluja , Nicholas Johnston
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing a network input using a neural network to generate a network output for the network input. One of the methods includes maintaining, for each of the plurality of neural network layers, a respective look-up table that maps each possible combination of a quantized input index and a quantized weight index to a multiplication result; and generating a network output from a network input, comprising, for each of the neural network layers: receiving data specifying a quantized input to the neural network layer, the quantized input comprising a plurality of quantized input values; and generating a layer output for the neural network layer from the quantized input to the neural network layer using the respective look-up table for the neural network layer.
-
Publication No.: US11195553B2
Publication Date: 2021-12-07
Application No.: US17009001
Filing Date: 2020-09-01
Applicant: Google LLC
Inventor: Michele Covell , Shumeet Baluja
Abstract: In accordance with some embodiments of the disclosed subject matter, mechanisms for seamless audio melding between audio items in a playlist are provided. In some embodiments, a method for transitioning between audio items in playlists is provided, comprising: identifying a sequence of audio items in a playlist of audio items, wherein the sequence of audio items includes a first audio item and a second audio item that is to be played subsequent to the first audio item; and modifying an end portion of the first audio item and a beginning portion of the second audio item, where the end portion of the first audio item and the beginning portion of the second audio item are to be played concurrently to transition between the first audio item and the second audio item, wherein the end portion of the first audio item and the beginning portion of the second audio item have an overlap duration, and wherein modifying the end portion of the first audio item and the beginning portion of the second audio item comprises: generating a first spectrogram corresponding to the end portion of the first audio item and a second spectrogram corresponding to the beginning portion of the second audio item; identifying, for each frequency band in a series of frequency bands, a window over which the first spectrogram within the end portion of the first audio item and the second spectrogram within the beginning portion of the second audio item have a particular cross-correlation; modifying, for each frequency band in the series of frequency bands, the end portion of the first spectrogram and the beginning portion of the second spectrogram such that amplitudes of frequencies within the frequency band decrease within the first spectrogram over the end portion of the first spectrogram and that amplitudes of frequencies within the frequency band increase within the second spectrogram over the beginning portion of the second spectrogram; and generating a modified version of the first audio item that includes the modified end portion of the first audio item based on the modified end portion of the first spectrogram and generating a modified version of the second audio item that includes the modified beginning portion of the second audio item based on the modified beginning portion of the second spectrogram.