-
Publication No.: US20240362269A1
Publication date: 2024-10-31
Application No.: US18308970
Filing date: 2023-04-28
Applicant: ADOBE INC.
IPC classification: G06F16/632, G06F16/638, G06F16/68
CPC classification: G06F16/632, G06F16/638, G06F16/686
Abstract: Systems and methods for cross-modal retrieval are provided. According to one aspect, a method for cross-modal retrieval includes obtaining a query describing a sound using a query modality other than a sound modality; encoding the query to obtain a query embedding using a query encoder network for the query modality and a query projection network, wherein the query projection network includes a self-attention layer, and wherein the query embedding is in a joint embedding space for the query modality and the sound modality; and providing a response including an audio sample based on the query embedding, wherein the audio sample includes the sound.
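The retrieval step in this abstract can be illustrated with a minimal sketch. It assumes the query has already been encoded into the joint embedding space; the encoder and projection networks (including the self-attention layer) are not modeled. The function names, the file names, and the three-dimensional toy embeddings are hypothetical, and cosine similarity is an assumed ranking measure that the abstract does not specify.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_audio(query_embedding, audio_index, top_k=1):
    """Return the ids of the top_k audio samples nearest to the query embedding."""
    ranked = sorted(
        audio_index.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [audio_id for audio_id, _ in ranked[:top_k]]


# Hypothetical audio embeddings, assumed to live in the joint embedding space.
AUDIO_INDEX = {
    "dog_bark.wav": [0.9, 0.1, 0.0],
    "rain.wav": [0.0, 0.2, 0.9],
    "door_slam.wav": [0.1, 0.9, 0.1],
}

# Stands in for the embedding of a text query such as "a dog barking".
QUERY_EMBEDDING = [0.8, 0.2, 0.1]

print(retrieve_audio(QUERY_EMBEDDING, AUDIO_INDEX, top_k=2))
```

A linear scan is enough for a toy index; a real system would more likely use an approximate nearest-neighbor index.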
-
Publication No.: US20240340505A1
Publication date: 2024-10-10
Application No.: US18747207
Filing date: 2024-06-18
IPC classification: H04N21/8358, G06F16/11, G06F16/245, G06F16/632, G06F16/683, G06F16/783, G06V10/40, G06V10/74, G06V10/94, G06V10/96
CPC classification: H04N21/8358, G06F16/683, G06F16/783, G06V10/40, G06V10/761, G06V10/955, G06V10/96, G06F16/122, G06F16/125, G06F16/245, G06F16/634
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for scalable architectures for reference signature matching and updating. An example method includes accessing site signatures to be compared to reference signatures from a first group of media sources, and determining whether a first reference node is an owner of a first one of the site signatures. When the first reference node is the owner of the first site signature, the method includes comparing a neighborhood of site signatures including the first site signature to reference signatures in a first subset of reference signatures, the first subset of reference signatures being stored in a first memory partition associated with the first reference node. When the first reference node is not the owner of the first site signature, the site signature is not compared to the reference signatures.
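The ownership check described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: signatures are modeled as plain integers, ownership is decided by a modulo over the number of nodes (the abstract does not say how ownership is determined), the neighborhood is a fixed index radius, and "comparing" is reduced to an exact membership test. All names are hypothetical.

```python
def owner_of(signature, num_nodes):
    """Assumed ownership rule: signature value modulo the number of reference nodes."""
    return signature % num_nodes


def match_on_node(node_id, num_nodes, site_signatures, partition, radius=1):
    """Compare neighborhoods of owned site signatures against this node's partition.

    Returns {index of owned signature: [neighborhood signatures found in partition]}.
    Signatures owned by another node are skipped without any comparison.
    """
    matches = {}
    for i, signature in enumerate(site_signatures):
        if owner_of(signature, num_nodes) != node_id:
            continue  # not the owner: no comparison is performed
        neighborhood = site_signatures[max(0, i - radius): i + radius + 1]
        hits = [s for s in neighborhood if s in partition]
        if hits:
            matches[i] = hits
    return matches


# Hypothetical data: four site signatures and two reference nodes.
SITE_SIGNATURES = [10, 11, 12, 13]
NODE0_PARTITION = {11, 12}  # reference signatures stored with node 0
NODE1_PARTITION = {99}      # reference signatures stored with node 1

print(match_on_node(0, 2, SITE_SIGNATURES, NODE0_PARTITION))
print(match_on_node(1, 2, SITE_SIGNATURES, NODE1_PARTITION))
```

Because each node examines only the signatures it owns, the work can be spread across nodes without two nodes repeating the same comparison.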
-
Publication No.: US12067051B1
Publication date: 2024-08-20
Application No.: US17207458
Filing date: 2021-03-19
IPC classification: G06F16/683, G06F16/632, G06F16/638, G06F16/68
CPC classification: G06F16/685, G06F16/634, G06F16/639, G06F16/686
Abstract: Systems, devices and methods for analyzing music and other content are provided. In some embodiments, music libraries may be searched by using one or more songs, portions of songs or other segments of music as the search key. Other types of audio and video files may also be searched using similar devices and methods. In other embodiments, a musician or vocalist who sounds similar to another musician or vocalist may be identified. Similarity scores may be generated for music and/or other content that indicate the likelihood that they will be perceived as similar or dissimilar.
-
Publication No.: US20240193205A1
Publication date: 2024-06-13
Application No.: US18386876
Filing date: 2023-11-03
Inventors: Eiichi MAEDA
IPC classification: G06F16/632, G10L15/22
CPC classification: G06F16/632, G10L15/22, G10L2015/223
Abstract: An information processing device includes a controller. The controller is configured to execute acquiring an utterance of a user, searching for a first music piece on a plurality of sound sources in response to the utterance of the user including a request to replay the first music piece, and replaying the first music piece that has been searched for and is provided in a first sound source.
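The search across several sound sources can be sketched as a simple ordered lookup. This sketch assumes the requested title has already been extracted from the utterance (speech recognition is not modeled), that each source exposes a list of track titles, and that sources are tried in a fixed priority order. The source names and catalog contents are hypothetical.

```python
def find_source(title, sources):
    """Return the name of the first sound source that provides the title, or None."""
    wanted = title.casefold()
    for name, catalog in sources.items():
        if any(wanted == track.casefold() for track in catalog):
            return name
    return None


# Hypothetical sound sources, listed in priority order.
SOURCES = {
    "usb_drive": ["Blue in Green"],
    "streaming_service": ["So What", "Blue in Green"],
}

for requested in ("so what", "Blue in Green", "Unknown Piece"):
    print(requested, "->", find_source(requested, SOURCES))
```

The lookup relies on dictionaries preserving insertion order, which Python guarantees from version 3.7 onward.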
-
Publication No.: US20240137720A1
Publication date: 2024-04-25
Application No.: US18048387
Filing date: 2022-10-19
CPC classification: H04S7/30, G06F16/632, H04R3/005, H04R5/027, H04R5/033, H04R29/005, H04S1/007, H04R2499/11, H04S2400/11, H04S2400/15
Abstract: Systems and techniques are provided for performing spatial audio recording. For instance, a process can include detecting an occlusion for at least one audio frame of one or more audio frames associated with a spatial audio recording. During the spatial audio recording, the process can further include selecting, based on detection of the occlusion, at least one of an occluded spatial filter for the one or more audio frames or a non-occluded spatial filter for the one or more audio frames.
-
Publication No.: US11960534B2
Publication date: 2024-04-16
Application No.: US16341763
Filing date: 2019-04-08
Applicant: Google LLC
Inventors: Bo Wang, Smita Rai, Max Ohlendorf, Venkat Kotla, Chad Yoshikawa, Abhinav Taneja, Amit Agarwal, Chris Ramsdale, Chris Turkstra
IPC classification: G06F16/63, G06F16/632, G06F16/638, G06F21/62
CPC classification: G06F16/634, G06F16/638, G06F21/6218
Abstract: Coordinating processing of audio queries is provided. A system receives a query. The system provides the query to a first digital assistant component and a second digital assistant component for processing. The system receives a first response to the query from the first digital assistant component, and a second response to the query from the second digital assistant component. The first digital assistant component can be authorized to access a database that the second digital assistant component is prohibited from accessing. The system determines, based on a ranking decision function, to select the second response to the query from the second digital assistant component. The system provides, responsive to the selection, the second response from the second digital assistant to a computing device.
-
Publication No.: US20240119088A1
Publication date: 2024-04-11
Application No.: US17938455
Filing date: 2022-10-06
Applicant: Google LLC
Inventors: Matthew Sharifi, Victor Carbune
IPC classification: G06F16/632, G06F16/638, G10L17/02, G10L17/06, G10L17/22
CPC classification: G06F16/632, G06F16/639, G10L17/02, G10L17/06, G10L17/22
Abstract: A method for handling contradictory queries on a shared device includes receiving a first query issued by a first user, the first query specifying a first long-standing operation for a digital assistant to perform, and while the digital assistant is performing the first long-standing operation, receiving a second query, the second query specifying a second long-standing operation for the digital assistant to perform. The method also includes determining that the second query was issued by another user different than the first user and determining, using a query resolver, that performing the second long-standing operation would conflict with the first long-standing operation. The method further includes identifying one or more compromise operations for the digital assistant to perform, and instructing the digital assistant to perform a selected compromise operation among the identified one or more compromise operations.
-
Publication No.: US11947593B2
Publication date: 2024-04-02
Application No.: US16147331
Filing date: 2018-09-28
Inventors: Arindam Jati, Naveen Kumar, Ruxin Chen
IPC classification: G06F16/65, G06F16/632, G06N3/08
CPC classification: G06F16/65, G06F16/634, G06N3/08
Abstract: A system, method, and computer program product for hierarchical categorization of sound comprising one or more neural networks implemented on one or more processors. The one or more neural networks are configured to categorize a sound into a hierarchical coarse categorization of two or more tiers and a finest-level categorization in the hierarchy. The categorized sound may be used to search a database for similar or contextually related sounds.
-
Publication No.: US11947592B2
Publication date: 2024-04-02
Application No.: US17339918
Filing date: 2021-06-04
IPC classification: G06F16/00, G06F7/00, G06F9/54, G06F16/61, G06F16/632, G06F16/638, G06F40/56, G16Y10/75
CPC classification: G06F16/61, G06F9/542, G06F16/632, G06F16/638, G06F40/56, G16Y10/75
Abstract: Systems and methods are disclosed for generating messages in a cloud platform. One method comprises storing a collection of audio files, destination information identifying location information corresponding to a plurality of different geographic locations where a plurality of edge devices are located, and grammar information including language-specific rules; receiving messages from one or more of a plurality of requesting devices, the messages including a first type of message and a second type of message; generating an action list; determining an available time slot at a first geographic location of a first edge device when the first edge device is available to render an announcement; retrieving, using the grammar information associated with the second type of destination information included in the action list, an audio file from the stored collection of audio files; and transmitting the audio file and the action list to the first edge device to render an announcement.
-
Publication No.: US11880407B2
Publication date: 2024-01-23
Application No.: US15856810
Filing date: 2017-12-28
IPC classification: G06F16/61, G06F16/40, G06F16/683, G10L21/0208, G06F16/60, G10L21/0232, G06F16/632
CPC classification: G06F16/61, G06F16/40, G06F16/60, G06F16/634, G06F16/683, G10L21/0208, G10L21/0232, G06F2218/08, G06F2218/12
Abstract: A method for generating a database, comprising “receiving environmental noises” (e.g., disturbing noise) and “buffering environmental noises for a migrating time window” (such as 30 or 60 seconds), or alternatively “deriving a set of parameters relative to the environmental noises” and “buffering the set of parameters for the migrating time window”, the buffered environmental noises or the buffered set of parameters being generally referred to as a recording. The method further comprises “obtaining a signal”, which identifies a signal class (such as disturbing noise) of a plurality of signal classes (disturbing noise and non-disturbing noise) in the environmental noises, and “storing the buffered recordings responsive to the signal” in a memory (e.g., internal or external memory). Obtaining and storing are repeated in order to set up the database, which has a plurality of buffered recordings for the same signal class.
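The buffering scheme in this abstract can be illustrated with a small sliding-window sketch. It assumes noise arrives as timestamped samples and that the signal naming the class comes from outside (for example, a user pressing a button). The class name `NoiseRecorder`, its methods, and the in-memory dictionary that stands in for the database are all hypothetical.

```python
from collections import deque


class NoiseRecorder:
    """Keeps the most recent window of samples and stores it when a signal arrives."""

    def __init__(self, window_seconds=30):
        self.window_seconds = window_seconds
        self._buffer = deque()  # (timestamp, sample) pairs, oldest first
        self.database = {}      # signal class -> list of stored recordings

    def receive(self, timestamp, sample):
        """Buffer a new sample and drop samples that fell out of the time window."""
        self._buffer.append((timestamp, sample))
        while timestamp - self._buffer[0][0] > self.window_seconds:
            self._buffer.popleft()

    def on_signal(self, signal_class):
        """Store a copy of the currently buffered recording under the given class."""
        recording = [sample for _, sample in self._buffer]
        self.database.setdefault(signal_class, []).append(recording)


recorder = NoiseRecorder(window_seconds=30)
for t in range(0, 100, 10):
    recorder.receive(t, f"s{t}")    # one placeholder sample every 10 seconds
recorder.on_signal("disturbing")    # the signal identifies the class of the noise
print(recorder.database)
```

Repeating `receive` and `on_signal` over time accumulates several recordings per signal class, which corresponds to the repeated obtaining and storing that the abstract describes.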