EPHEMERAL LEARNING AND/OR FEDERATED LEARNING OF AUDIO-BASED MACHINE LEARNING MODEL(S) FROM STREAM(S) OF AUDIO DATA GENERATED VIA RADIO STATION(S)

    Publication No.: US20240071406A1

    Publication date: 2024-02-29

    Application No.: US18074739

    Filing date: 2022-12-05

    Applicant: GOOGLE LLC

    IPC classification: G10L25/51 G10L15/00 G10L15/18

    Abstract: Implementations disclosed herein are directed to utilizing ephemeral learning techniques and/or federated learning techniques to update audio-based machine learning (ML) model(s) based on processing streams of audio data generated via radio station(s) across the world. This enables the audio-based ML model(s) to learn representations and/or understand languages across the world, including tail languages for which there is little or no audio data. In various implementations, one or more deduping techniques may be utilized to ensure the same stream of audio data is not overutilized in updating the audio-based ML model(s). In various implementations, a given client device may determine whether to employ an ephemeral learning technique or a federated learning technique based on, for instance, a connection status with a remote system. Generally, the streams of audio data are received at client devices, but the ephemeral learning techniques may be implemented at the client device and/or at the remote system.
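The two client-side decisions named in this abstract (picking a learning technique from the connection status, and deduping streams) can be illustrated with a minimal sketch. Everything here is hypothetical: the names `Technique`, `choose_technique`, and `StreamDeduper` do not come from the patent, the mapping from connection status to technique is an assumed policy (the abstract does not say which status selects which technique), and exact-byte hashing stands in for whatever deduping technique an implementation would actually use.

```python
import hashlib
from enum import Enum


class Technique(Enum):
    EPHEMERAL = "ephemeral"
    FEDERATED = "federated"


def choose_technique(connected_to_remote: bool) -> Technique:
    # Assumed policy, not stated in the abstract: use ephemeral learning when
    # the remote system is reachable, otherwise fall back to federated learning.
    return Technique.EPHEMERAL if connected_to_remote else Technique.FEDERATED


class StreamDeduper:
    """Tracks which audio chunks have already been used for an update.

    Assumption: chunks are compared by a SHA-256 digest of their raw bytes.
    Two recordings of the same broadcast would rarely be byte-identical, so a
    practical system would likely need an acoustic fingerprint instead.
    """

    def __init__(self) -> None:
        self._seen = set()

    def is_new(self, audio_chunk: bytes) -> bool:
        digest = hashlib.sha256(audio_chunk).hexdigest()
        if digest in self._seen:
            return False  # already used; skip to avoid overutilizing it
        self._seen.add(digest)
        return True
```

A caller would check `is_new(chunk)` before generating an update from a chunk, and call `choose_technique(...)` once per stream to decide where that update is computed.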

    Ephemeral learning of machine learning model(s)

    Publication No.: US12126845B2

    Publication date: 2024-10-22

    Application No.: US17533779

    Filing date: 2021-11-23

    Applicant: GOOGLE LLC

    Abstract: Implementations disclosed herein are directed to ephemeral learning of machine learning (“ML”) model(s) based on gradient(s) generated at a remote system (e.g., remote server(s)). Processor(s) of the remote system can receive stream(s) of audio data capturing spoken utterance(s) from a client device of a user. A fulfillment pipeline can process the stream(s) of audio data to cause certain fulfillment(s) of the spoken utterance(s) to be performed. Meanwhile, a training pipeline can process the stream(s) of audio data to generate gradient(s) using unsupervised learning techniques. Subsequent to the processing by the fulfillment pipeline and/or the training pipeline, the stream(s) of audio data are discarded by the remote system. Accordingly, the ML model(s) can be trained at the remote system without storing or logging of the stream(s) of audio data by non-transient memory thereof, thereby providing more efficient training mechanisms for training the ML model(s) and also increasing security of user data.
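The flow described in this abstract (fulfill the utterance, generate a gradient without labels, then discard the audio) can be sketched as follows. This is a toy illustration under stated assumptions, not the patented method: `handle_stream` and `reconstruction_gradient` are hypothetical names, the "model" is a single scalar weight, and a reconstruction loss stands in for whichever unsupervised learning technique an implementation would use.

```python
from typing import Callable, List, Tuple


def reconstruction_gradient(weight: float, samples: List[float]) -> float:
    """Gradient of mean((weight * x - x) ** 2) with respect to weight.

    The loss needs no labels, so it serves as a stand-in for an
    unsupervised objective.
    """
    return sum(2.0 * (weight * x - x) * x for x in samples) / len(samples)


def handle_stream(
    samples: List[float],
    weight: float,
    fulfill: Callable[[List[float]], object],
    learning_rate: float = 0.1,
) -> Tuple[object, float]:
    # Fulfillment pipeline: act on the utterance (hypothetical callback).
    result = fulfill(samples)
    # Training pipeline: derive a gradient from the same audio, then update.
    grad = reconstruction_gradient(weight, samples)
    new_weight = weight - learning_rate * grad
    # Discard the audio: the buffer is emptied in memory and is never
    # written to persistent storage or a log.
    samples.clear()
    return result, new_weight
```

Only the fulfillment result and the updated weight leave `handle_stream`; the caller's audio buffer is empty once the function returns.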

    EPHEMERAL LEARNING OF MACHINE LEARNING MODEL(S)

    Publication No.: US20230156248A1

    Publication date: 2023-05-18

    Application No.: US17533779

    Filing date: 2021-11-23

    Applicant: GOOGLE LLC

    Abstract: Implementations disclosed herein are directed to ephemeral learning of machine learning (“ML”) model(s) based on gradient(s) generated at a remote system (e.g., remote server(s)). Processor(s) of the remote system can receive stream(s) of audio data capturing spoken utterance(s) from a client device of a user. A fulfillment pipeline can process the stream(s) of audio data to cause certain fulfillment(s) of the spoken utterance(s) to be performed. Meanwhile, a training pipeline can process the stream(s) of audio data to generate gradient(s) using unsupervised learning techniques. Subsequent to the processing by the fulfillment pipeline and/or the training pipeline, the stream(s) of audio data are discarded by the remote system. Accordingly, the ML model(s) can be trained at the remote system without storing or logging of the stream(s) of audio data by non-transient memory thereof, thereby providing more efficient training mechanisms for training the ML model(s) and also increasing security of user data.