Syntactically coherent code segmentation

    Publication number: US12265805B2

    Publication date: 2025-04-01

    Application number: US18102039

    Application date: 2023-01-26

    Applicant: GOOGLE LLC

    Abstract: Techniques are described herein for segmenting source code into syntactically coherent sequences of tokens that satisfy constraints inherent in sequence-to-sequence networks. In various implementations, source code may be processed to generate one or more graphs representing the source code. One or more of the graphs may then be traversed to identify one or more sequences of tokens within the source code that satisfy an input constraint of a sequence-to-sequence network. The source code may be segmented into the identified one or more sequences of tokens. The one or more sequences of tokens may then be processed using the sequence-to-sequence network.
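The segmentation idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it assumes the "graph" is Python's own abstract syntax tree, that the "input constraint" is a maximum token count, and that tokens are counted with the standard `tokenize` module. The names `segment_source` and `max_tokens` are hypothetical. The traversal keeps any statement that fits the budget as one segment and descends into the nested statements of any statement that does not.

```python
import ast
import bisect
import io
import tokenize

# Layout-only token types that are not counted against the budget.
_SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
         tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT}


def _child_statements(node):
    """Return the statements nested directly inside a compound statement."""
    children = []
    for child in ast.iter_child_nodes(node):
        if isinstance(child, ast.stmt):
            children.append(child)
        elif isinstance(child, ast.ExceptHandler):
            children.extend(child.body)
    return children


def segment_source(source, max_tokens):
    """Split source into whole-statement segments of at most max_tokens tokens."""
    tree = ast.parse(source)
    lines = source.splitlines()
    # Line number of every counted token, in source order (so already sorted).
    token_lines = [tok.start[0]
                   for tok in tokenize.generate_tokens(io.StringIO(source).readline)
                   if tok.type not in _SKIP]

    def size(node):
        # Number of tokens whose line falls inside the statement's line span.
        lo = bisect.bisect_left(token_lines, node.lineno)
        hi = bisect.bisect_right(token_lines, node.end_lineno)
        return hi - lo

    segments = []

    def visit(statements):
        for node in statements:
            if size(node) <= max_tokens:
                segments.append("\n".join(lines[node.lineno - 1:node.end_lineno]))
                continue
            children = _child_statements(node)
            if not children:
                raise ValueError(
                    f"line {node.lineno}: statement exceeds {max_tokens} tokens")
            # Simplification: the header line of the oversized statement
            # (for example "def report(values):") is not emitted.
            visit(children)

    visit(tree.body)
    return segments


SOURCE = '''\
import math

def area(r):
    return math.pi * r * r

def report(values):
    total = 0
    for v in values:
        total += area(v)
    print("total", total)
    return total
'''

for segment in segment_source(SOURCE, max_tokens=16):
    print(segment)
    print("---")
```

With a budget of 16 tokens, `area` fits as a single segment, while `report` is too large and is split into its four body statements. Every segment is a complete statement subtree, which is the sense of "syntactically coherent" assumed here. A fuller version would also pack adjacent small statements together and carry the enclosing header along as context.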

    REFACTORING AND/OR REARCHITECTING SOURCE CODE USING MACHINE LEARNING

    Publication number: US20230251856A1

    Publication date: 2023-08-10

    Application number: US17668974

    Application date: 2022-02-10

    Applicant: Google LLC

    CPC classification number: G06F8/72 G06N3/10

    Abstract: Implementations are described herein for leveraging machine learning to automate source code refactoring and/or rearchitecting. In various implementations, one or more ground truth boundaries may be removed from one or more boundaried source code files to produce one or more boundary-less source code files. One or more of the boundary-less source code files may be processed using a machine learning model to predict one or more candidate boundaries for reintroduction into the one or more boundary-less source code files. The one or more ground truth boundaries may be compared with the one or more predicted candidate boundaries. The machine learning model may be trained based on the comparing.
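The data-preparation and comparison steps in this abstract can be sketched in Python under strong assumptions. Here a "boundary" is taken to be a file boundary, a codebase is a dictionary of file name to lines, and the machine learning model is replaced by a hypothetical stand-in (`blank_line_predictor`) that guesses a boundary after every blank line. None of these names or choices come from the patent. The sketch only shows how ground truth boundaries can be stripped, recorded, and compared with predictions to yield a training signal.

```python
from typing import Dict, List, Set, Tuple


def remove_boundaries(files: Dict[str, List[str]]) -> Tuple[List[str], Set[int]]:
    """Merge files into one boundary-less line list.

    Returns the merged lines and the ground truth boundaries, i.e. the
    indices in the merged list at which a new file originally began.
    """
    merged: List[str] = []
    boundaries: Set[int] = set()
    for name in sorted(files):
        if merged:
            boundaries.add(len(merged))
        merged.extend(files[name])
    return merged, boundaries


def blank_line_predictor(lines: List[str]) -> Set[int]:
    """Hypothetical stand-in for the model: a boundary after each blank line."""
    return {i + 1 for i, line in enumerate(lines[:-1]) if not line.strip()}


def compare(truth: Set[int], predicted: Set[int]) -> Dict[str, float]:
    """Compare predicted candidate boundaries with the ground truth."""
    hits = len(truth & predicted)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


files = {
    "a.py": ["def a():", "    return 1", ""],
    "b.py": ["def b():", "    return 2", "", "def c():", "    return 3"],
}

merged, truth = remove_boundaries(files)
predicted = blank_line_predictor(merged)
scores = compare(truth, predicted)

# Assumption: 1 - F1 is used as a scalar training signal. A real system
# would more likely use a differentiable loss over per-position scores.
loss = 1.0 - scores["f1"]
print(truth, predicted, scores, loss)
```

In this toy codebase the stand-in recovers the one true boundary (before `def b():`) but also proposes a spurious one before `def c():`, so recall is perfect and precision is one half. A trainable model would be updated to reduce that gap.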

    SYNTACTICALLY COHERENT CODE SEGMENTATION
    Document type: Invention publication (published application)

    Publication number: US20240256235A1

    Publication date: 2024-08-01

    Application number: US18102039

    Application date: 2023-01-26

    Applicant: GOOGLE LLC

    CPC classification number: G06F8/433 G06F8/425 G06F8/427

    Abstract: Techniques are described herein for segmenting source code into syntactically coherent sequences of tokens that satisfy constraints inherent in sequence-to-sequence networks. In various implementations, source code may be processed to generate one or more graphs representing the source code. One or more of the graphs may then be traversed to identify one or more sequences of tokens within the source code that satisfy an input constraint of a sequence-to-sequence network. The source code may be segmented into the identified one or more sequences of tokens. The one or more sequences of tokens may then be processed using the sequence-to-sequence network.

    Refactoring and/or rearchitecting source code using machine learning

    Publication number: US11893384B2

    Publication date: 2024-02-06

    Application number: US17668974

    Application date: 2022-02-10

    Applicant: Google LLC

    CPC classification number: G06F8/72 G06N3/10

    Abstract: Implementations are described herein for leveraging machine learning to automate source code refactoring and/or rearchitecting. In various implementations, one or more ground truth boundaries may be removed from one or more boundaried source code files to produce one or more boundary-less source code files. One or more of the boundary-less source code files may be processed using a machine learning model to predict one or more candidate boundaries for reintroduction into the one or more boundary-less source code files. The one or more ground truth boundaries may be compared with the one or more predicted candidate boundaries. The machine learning model may be trained based on the comparing.
