ADAPTING EXISTING SOURCE CODE SNIPPETS TO NEW CONTEXTS

    Publication No.: US20230004366A1

    Publication Date: 2023-01-05

    Application No.: US17901128

    Filing Date: 2022-09-01

    Abstract: Implementations are described herein for adapting existing source code snippets to new contexts. In various implementations, a command may be detected to incorporate an existing source code snippet into destination source code. An embedding may be generated based on the existing source code snippet, e.g., by processing the existing source code snippet using an encoder. The destination source code may be processed to identify one or more decoder constraints. Subject to the one or more decoder constraints, the embedding may be processed using a decoder to generate a new version of the existing source code snippet that is adapted to the destination source code.
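
The constraint step in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the decoder constraints are the identifiers already present in the destination code and that decoding is a single greedy pick over candidate tokens. The function names `identifier_constraints` and `constrained_pick` and the score values are hypothetical and are not taken from the patent; the encoder and decoder models themselves are not shown.

```python
import ast
from typing import Dict, Set


def identifier_constraints(destination_source: str) -> Set[str]:
    """Collect identifier names that appear in the destination code.

    Assumption: these names stand in for the 'decoder constraints'.
    """
    tree = ast.parse(destination_source)
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


def constrained_pick(scores: Dict[str, float], allowed: Set[str]) -> str:
    """One greedy decoding step restricted to tokens that satisfy the constraints."""
    candidates = {tok: score for tok, score in scores.items() if tok in allowed}
    if not candidates:
        raise ValueError("no candidate token satisfies the constraints")
    return max(candidates, key=candidates.get)


destination = "total_price = 0\nfor item in cart:\n    total_price += item\n"
allowed = identifier_constraints(destination)

# Hypothetical decoder scores for the next identifier in the adapted snippet.
scores = {"sum_value": 0.6, "total_price": 0.3, "acc": 0.1}
chosen = constrained_pick(scores, allowed)
```

Without the constraint, the highest-scoring token would be the snippet's original name `sum_value`; with it, the pick falls on a name that already exists in the destination code.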

    Conditioning autoregressive language model to improve code migration

    Publication No.: US11481210B2

    Publication Date: 2022-10-25

    Application No.: US17136968

    Filing Date: 2020-12-29

    Abstract: Implementations are described herein for using machine learning to perform various tasks related to migrating source code based on relatively few (“few shots”) demonstrations. In various implementations, an autoregressive language model may be conditioned based on demonstration tuple(s). In some implementations, a demonstration tuple may include a pre-migration version of a first source code snippet and a post-migration version of the first source code snippet. In other implementations, demonstration tuples may include other data, such as intermediate forms (e.g., natural language descriptions or pseudocode), input-output pairs demonstrating intended behavior, etc. The autoregressive language model may be trained on corpora of source code and natural language documentation on the subject of computer programming. A pre-migration version of a source code file may be processed based on the conditioned autoregressive language model, and a post-migration version may be generated based on output generated based on the conditioned autoregressive model.
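
One common way to condition an autoregressive model on demonstration tuples is to concatenate them into a prompt that ends where the model should continue. The sketch below assumes that approach; the `# Before:` / `# After:` layout and the function name `build_conditioning_prompt` are illustrative choices, not details from the patent, and the model call itself is omitted.

```python
from typing import List, Tuple


def build_conditioning_prompt(
    demonstrations: List[Tuple[str, str]], source_to_migrate: str
) -> str:
    """Concatenate (pre-migration, post-migration) pairs, then the new input.

    The prompt ends with an open 'After' slot for the model to complete.
    """
    parts = []
    for pre, post in demonstrations:
        parts.append(f"# Before:\n{pre}\n# After:\n{post}\n")
    parts.append(f"# Before:\n{source_to_migrate}\n# After:\n")
    return "\n".join(parts)


# Two hypothetical demonstrations of a Python 2 to Python 3 print migration.
demos = [
    ("print 'hello'", "print('hello')"),
    ("print x", "print(x)"),
]
prompt = build_conditioning_prompt(demos, "print total")
```

The resulting `prompt` would be passed to the language model, whose continuation is read as the post-migration version of `print total`.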

    CONDITIONING AUTOREGRESSIVE LANGUAGE MODEL TO IMPROVE CODE MIGRATION

    Publication No.: US20220206785A1

    Publication Date: 2022-06-30

    Application No.: US17136968

    Filing Date: 2020-12-29

    Abstract: Implementations are described herein for using machine learning to perform various tasks related to migrating source code based on relatively few (“few shots”) demonstrations. In various implementations, an autoregressive language model may be conditioned based on demonstration tuple(s). In some implementations, a demonstration tuple may include a pre-migration version of a first source code snippet and a post-migration version of the first source code snippet. In other implementations, demonstration tuples may include other data, such as intermediate forms (e.g., natural language descriptions or pseudocode), input-output pairs demonstrating intended behavior, etc. The autoregressive language model may be trained on corpora of source code and natural language documentation on the subject of computer programming. A pre-migration version of a source code file may be processed based on the conditioned autoregressive language model, and a post-migration version may be generated based on output generated based on the conditioned autoregressive model.

    ADAPTING EXISTING SOURCE CODE SNIPPETS TO NEW CONTEXTS

    Publication No.: US20220236971A1

    Publication Date: 2022-07-28

    Application No.: US17159524

    Filing Date: 2021-01-27

    Abstract: Implementations are described herein for adapting existing source code snippets to new contexts. In various implementations, a command may be detected to incorporate an existing source code snippet into destination source code. An embedding may be generated based on the existing source code snippet, e.g., by processing the existing source code snippet using an encoder. The destination source code may be processed to identify one or more decoder constraints. Subject to the one or more decoder constraints, the embedding may be processed using a decoder to generate a new version of the existing source code snippet that is adapted to the destination source code.

    TRAINING AND APPLICATION OF BOTTLENECK MODELS AND EMBEDDINGS

    Publication No.: US20250028995A1

    Publication Date: 2025-01-23

    Application No.: US18224889

    Filing Date: 2023-07-21

    Abstract: Disclosed implementations relate to adding “bottleneck” models to machine learning pipelines that already apply domain models to translate and/or transfer representations of high-level semantic concepts between domains. In various implementations, an initial representation in a first domain of a transition from an initial state of an environment to a goal state of the environment may be processed based on a pre-trained first domain encoder to generate a first embedding that semantically represents the transition. The first embedding may be processed based on one or more bottleneck models to generate a second embedding with fewer dimensions than the first embedding. In various implementations, the second embedding may be processed in various ways to train one or more of the bottleneck model(s) based on various different auxiliary loss functions.
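
The dimensionality reduction described here can be sketched with a single linear projection. Everything specific below is an assumption made for illustration: the 4-to-2 dimension sizes, the random untrained weights, and the choice of a reconstruction error as the auxiliary loss (the abstract only says that various auxiliary loss functions may be used).

```python
import random
from typing import List


def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def make_matrix(rows: int, cols: int, rng: random.Random) -> List[List[float]]:
    """Create a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Stand-in for the output of a pre-trained first-domain encoder.
first_embedding = [0.2, -1.0, 0.7, 0.4]

down = make_matrix(2, 4, rng)  # hypothetical bottleneck: 4 dimensions -> 2
up = make_matrix(4, 2, rng)    # hypothetical auxiliary head: 2 dimensions -> 4

second_embedding = matvec(down, first_embedding)
reconstruction = matvec(up, second_embedding)

# Mean squared reconstruction error, used here as one possible auxiliary loss.
aux_loss = sum(
    (a - b) ** 2 for a, b in zip(first_embedding, reconstruction)
) / len(first_embedding)
```

In training, `aux_loss` would be minimized with respect to the bottleneck weights so that the smaller `second_embedding` retains as much of the first embedding's information as possible.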

    CONDITIONING AUTOREGRESSIVE LANGUAGE MODEL TO IMPROVE CODE MIGRATION

    Publication No.: US20230018088A1

    Publication Date: 2023-01-19

    Application No.: US17945376

    Filing Date: 2022-09-15

    Abstract: Implementations are described herein for using machine learning to perform various tasks related to migrating source code based on relatively few (“few shots”) demonstrations. In various implementations, an autoregressive language model may be conditioned based on demonstration tuple(s). In some implementations, a demonstration tuple may include a pre-migration version of a first source code snippet and a post-migration version of the first source code snippet. In other implementations, demonstration tuples may include other data, such as intermediate forms (e.g., natural language descriptions or pseudocode), input-output pairs demonstrating intended behavior, etc. The autoregressive language model may be trained on corpora of source code and natural language documentation on the subject of computer programming. A pre-migration version of a source code file may be processed based on the conditioned autoregressive language model, and a post-migration version may be generated based on output generated based on the conditioned autoregressive model.
