SYSTEMS AND METHODS FOR ITERATIVE CODE GENERATION WITH LARGE LANGUAGE MODELS AND REPRESENTATIVE SUB-MODULES

    Publication No.: US20250103300A1

    Publication Date: 2025-03-27

    Application No.: US18424372

    Filing Date: 2024-01-26

    Abstract: The embodiments are directed to generating source code for a program from a problem description. One or more pre-trained code large language models (LLMs) generate sub-modules from a problem description written in natural language. The sub-modules are filtered based on testing criteria and encoded into sub-module encodings in an embedding space. The sub-module encodings are clustered into multiple clusters. A subset of the sub-module encodings that are close to the centroids of the clusters is selected. The subset of sub-module encodings is decoded into representative sub-modules. The problem description is augmented with the representative sub-modules and fed into the one or more pre-trained code LLMs, which generate new sub-modules. The iterations continue until a program is generated from the representative sub-modules.
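
    The filter, encode, cluster, select, and augment loop described in the abstract can be illustrated with a minimal sketch. This is not the claimed implementation. The function names, the farthest-point initialization, the plain k-means routine, the prompt format, and the fixed round count are all assumptions made for illustration. The callables `generate`, `passes_tests`, and `encode` are hypothetical stand-ins for the code LLM, the testing criteria, and the encoder. Instead of decoding centroid-adjacent encodings, the sketch simply returns the original sub-module whose encoding lies nearest to each centroid.

```python
import math


def init_centroids(points, k):
    # Assumption: deterministic farthest-point initialization (not from the source).
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def assign(points, centroids):
    # Put each point index into the cluster of its nearest centroid.
    clusters = [[] for _ in centroids]
    for idx, p in enumerate(points):
        nearest = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        clusters[nearest].append(idx)
    return clusters


def kmeans(points, k, max_iters=50):
    # Plain k-means: alternate assignment and mean update until stable.
    centroids = init_centroids(points, k)
    dim = len(points[0])
    for _ in range(max_iters):
        clusters = assign(points, centroids)
        updated = []
        for c, members in enumerate(clusters):
            if members:
                updated.append(
                    [sum(points[i][d] for i in members) / len(members) for d in range(dim)]
                )
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def select_representatives(sub_modules, encodings, k):
    # For each non-empty cluster, keep the sub-module closest to its centroid.
    centroids, clusters = kmeans(encodings, k)
    chosen = []
    for centroid, members in zip(centroids, clusters):
        if members:
            best = min(members, key=lambda i: math.dist(encodings[i], centroid))
            chosen.append(sub_modules[best])
    return chosen


def refine(problem, generate, passes_tests, encode, k, rounds):
    # Hypothetical outer loop: generate, filter, select, then augment the prompt.
    # Assumption: a fixed number of rounds replaces the source's stopping condition.
    prompt, representatives = problem, []
    for _ in range(rounds):
        candidates = [m for m in generate(prompt) if passes_tests(m)]
        if not candidates:
            break
        encodings = [encode(m) for m in candidates]
        representatives = select_representatives(candidates, encodings, k)
        prompt = problem + "\n\nRepresentative sub-modules:\n" + "\n".join(representatives)
    return representatives
```

    In practice `encode` would be a learned code-embedding model and `generate` a call to one or more code LLMs. The sketch injects both as parameters so that the selection logic can be exercised on its own with toy vectors.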
