-
Publication number: US20230176829A1
Publication date: 2023-06-08
Application number: US17544502
Filing date: 2021-12-07
Applicant: Microsoft Technology Licensing, LLC
Inventor: Kiarash RAHMANI , Mohammad RAZA , Sumit GULWANI , Vu Minh LE , Daniel James MORRIS , Arjun RADHAKRISHNA , Gustavo ARAUJO SOARES , Ashish TIWARI
Abstract: Embodiments use a multi-modal approach to generate software programs that match a solution program description. The solution program description may include natural language, input-output examples, partial source code, desired operators, or other hints. Some embodiments use optimized prompts to a pre-trained language model to obtain initial candidate programs. Maximal program components are extracted and then recombined variously using component-based synthesis. Beam search reduces a solution program search space by discarding some candidates from a given synthesis iteration. Relevance metrics, string similarity metrics, operator frequency distributions, token rareness scores, and other optimizations may be employed. By virtue of optimizations and the multi-modal approach, a solution program may be obtained after fewer iterations than by use of a language model alone. The multi-modal approach is domain agnostic, as illustrated by examples using regular expression and cascading style sheet selector domain specific languages.
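The recombination-with-pruning idea in this abstract can be illustrated with a minimal sketch, assuming a regular expression domain. The component list stands in for fragments extracted from language-model candidates; the function names, the prefix-coverage ranking, and the parameter defaults are hypothetical and are not taken from the application.

```python
import re


def coverage(pattern, positives):
    """Average fraction of each positive string consumed by a prefix match."""
    total = 0.0
    for text in positives:
        match = re.match(pattern, text)
        total += len(match.group(0)) / len(text) if match else 0.0
    return total / len(positives)


def is_solution(pattern, positives, negatives):
    """True if the pattern fully matches every positive and no negative example."""
    return (all(re.fullmatch(pattern, t) for t in positives)
            and not any(re.fullmatch(pattern, t) for t in negatives))


def beam_synthesize(components, positives, negatives, beam_width=2, max_iterations=4):
    """Concatenate components iteratively, keeping only the best candidates."""
    def rank(pattern):
        # Higher coverage first, then shorter patterns, then a stable tie-break.
        return (-coverage(pattern, positives), len(pattern), pattern)

    beam = sorted(components, key=rank)[:beam_width]
    for _ in range(max_iterations):
        for candidate in beam:
            if is_solution(candidate, positives, negatives):
                return candidate
        # Component-based expansion: extend every survivor with every component.
        expanded = set(beam) | {left + right for left in beam for right in components}
        # Beam step: discard all but the top-ranked candidates of this iteration.
        beam = sorted(expanded, key=rank)[:beam_width]
    return next((c for c in beam if is_solution(c, positives, negatives)), None)


result = beam_synthesize([r"\d+", r"[a-z]+", "-"],
                         positives=["abc-123", "x-9"],
                         negatives=["abc123", "123-abc"])
print(result)
```

A narrow beam keeps the search small at the risk of discarding a component that only becomes useful later, which is why the ranking function needs to give partial credit rather than a pass/fail score.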
-
Publication number: US20220317979A1
Publication date: 2022-10-06
Application number: US17220156
Filing date: 2021-04-01
Applicant: Microsoft Technology Licensing, LLC
Inventor: Gustavo ARAUJO SOARES , Piyush ARORA , Titus BARIK , Peter GROENEWEGEN , Sumit GULWANI , Ameya Sanjay KETKAR , Vu Minh LE , Wode NI , David Ellis PUGH , Arjun RADHAKRISHNA , Ivan RADICEK , Ashish TIWARI , Mark Alistair WILSON-THOMAS
IPC: G06F8/33 , G06F16/901
Abstract: Edit automation functionality generalizes edits performed by a user in a document, locates similar text, and recommends or applies transforms while staying within a current workflow. Source code edits such as refactoring are automated. The functionality uses or provides anchor target lists, temporal edit patterns, edit graphs, automatable edit sequence libraries, and other data structures and computational techniques for identifying locations appropriate for particular edits, for getting transforms, for selecting optimal transforms, for leveraging transforms in an editing session or later, and for displaying transform recommendations and results. The edit automation functionality enhances automation subtool generation, discoverability, and flexibility, for refactoring, snippet insertion, quick actions in an integrated development environment, and other automatable edit sequences.
-
Publication number: US20200334054A1
Publication date: 2020-10-22
Application number: US16592470
Filing date: 2019-10-03
Applicant: Microsoft Technology Licensing, LLC
Inventor: Sumit GULWANI , Arjun RADHAKRISHNA , Abhishek UDUPA , Gustavo ARAUJO SOARES , Vu Minh LE , Anders MILTNER , Mark A. WILSON-THOMAS
Abstract: Context-specific repeated transformations (such as repeated edit tasks) are automatically identified based on observation of a developer drafting or modifying code. As the developer modifies the code, the code passes through a series of states, one after the other. The computing system observes this series of states and, based on that observation, identifies repeated transformations of the code so that it can offer to continue performing them for the user. This relieves the developer of having to manually perform the remainder of the repeated transformations.
-
Publication number: US20190034437A1
Publication date: 2019-01-31
Application number: US15663575
Filing date: 2017-07-28
Applicant: Microsoft Technology Licensing, LLC
Inventor: Sumit GULWANI , Prateek JAIN , Daniel Adam PERELMAN , Saswat PADHI , Oleksandr POLOZOV
CPC classification number: G06F16/355 , G06F17/2264 , G06F17/271
Abstract: A computing device includes a storage machine holding instructions executable by a logic machine to generate multi-string clusters, each containing alphanumeric strings of a dataset. Further multi-string clusters are generated via iterative performance of a combination operation in which a hierarchically-superior cluster is generated from a set of multi-string clusters. The combination operation includes, for candidate pairs of multi-string clusters, generating syntactic profiles describing an alphanumeric string from each multi-string cluster of the candidate pair. For each of the candidate pairs, a cost factor is determined for at least one of its syntactic profiles. Based on the cost factors determined for the syntactic profiles, one of the candidate pairs is selected. The multi-string clusters from the selected candidate pair are combined to generate the hierarchically-superior cluster including all of the alphanumeric strings from the selected candidate pair of multi-string clusters.
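The iterative combination operation described here resembles agglomerative clustering driven by a profile cost. The following is a rough sketch under several assumptions: the "syntactic profile" is reduced to a coarse character-class pattern, the cost factor is a `difflib` dissimilarity between the profiles of each cluster's first string, and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def shape(text):
    """Coarse profile: D for digit runs, U/L for letter runs, other characters kept."""
    tokens = []
    for ch in text:
        token = "D" if ch.isdigit() else "U" if ch.isupper() else "L" if ch.islower() else ch
        if not tokens or tokens[-1] != token:  # collapse runs of the same class
            tokens.append(token)
    return "".join(tokens)


def pair_cost(cluster_a, cluster_b):
    """Assumed cost factor: dissimilarity of the profiles of one string per cluster."""
    ratio = SequenceMatcher(None, shape(cluster_a[0]), shape(cluster_b[0])).ratio()
    return 1.0 - ratio


def cluster(strings, target_count):
    """Merge the cheapest candidate pair repeatedly until target_count clusters remain."""
    clusters = [[s] for s in strings]
    while len(clusters) > max(target_count, 1):
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: pair_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]  # the hierarchically superior cluster
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


groups = cluster(["2021-12-07", "2023-06-08", "US17544502", "US17220156"], 2)
print(groups)
```

Using a single representative string per cluster keeps the sketch short; a fuller version would compare profiles that describe every member of each cluster.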
-
Publication number: US20240184979A1
Publication date: 2024-06-06
Application number: US18075497
Filing date: 2022-12-06
Applicant: Microsoft Technology Licensing, LLC
Inventor: Mukul SINGH , José Pablo CAMBRONERO SÁNCHEZ , Sumit GULWANI , Vu Minh LE , Carina Suzana NEGREANU , Mohammad RAZA , Daniel Galen SIMMONS , Gust Ben Anneloes VERBRUGGEN
CPC classification number: G06F16/355 , G06F40/18
Abstract: Some embodiments automatically generate data processing rules based on positive examples of processed data, e.g., formatting rules based on formatted data, filtering rules based on filtered data, or validating rules based on valid data. Some embodiments also use negative examples, e.g., unformatted data. A machine learning rule generation architecture includes a predicate generator, a cell cluster creator, a rule enumerator, and in some versions a rule ranker. Formatting rules written by a user are replaced by simpler autogenerated rules. Spreadsheet formatting rule functionality is enhanced, and surfaced in a user interface.
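As a loose illustration of learning a rule from positive and negative examples, the sketch below enumerates a small fixed set of predicates and keeps those that hold for every formatted cell and for no unformatted cell. The predicate set and all names are hypothetical; the clustering and ranking stages mentioned in the abstract are omitted.

```python
def to_number(cell):
    """Parse a cell as a float, or return None if it is not numeric."""
    try:
        return float(cell)
    except ValueError:
        return None


# Hypothetical predicate vocabulary standing in for a predicate generator.
PREDICATES = {
    "is_number": lambda c: to_number(c) is not None,
    "is_negative": lambda c: to_number(c) is not None and to_number(c) < 0,
    "is_blank": lambda c: c.strip() == "",
    "is_text": lambda c: c.strip() != "" and to_number(c) is None,
}


def enumerate_rules(positives, negatives):
    """Return predicate names true for all positives and false for all negatives."""
    return [name for name, pred in PREDICATES.items()
            if all(pred(c) for c in positives) and not any(pred(c) for c in negatives)]


formatted = ["-5", "-12.5"]          # cells the user highlighted
unformatted = ["3", "7.25", "abc"]   # cells left alone
print(enumerate_rules(formatted, unformatted))
```

Here the negative examples are what rule out the broader `is_number` predicate, which also holds for the formatted cells.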
-
Publication number: US20230289523A1
Publication date: 2023-09-14
Application number: US17693285
Filing date: 2022-03-11
Applicant: Microsoft Technology Licensing, LLC
Inventor: Rohan Jayesh BAVISHI , José Pablo CAMBRONERO SÁNCHEZ , Anna FARIHA , Sumit GULWANI , Vu Minh LE , Ivan RADICEK , Daniel Galen SIMMONS , Ashish TIWARI
IPC: G06F40/211 , G06F40/284 , G06F8/30 , G06F16/332
CPC classification number: G06F40/211 , G06F40/284 , G06F8/31 , G06F16/3329
Abstract: Techniques are described herein that are capable of creating a language-agnostic computer program repair engine generator. A context-free grammar is annotated to identify token(s) that are likely to be included in or excluded from a computer program in a manner that violates the context-free grammar. A language-agnostic computer program repair engine generator is created that is configured to generate a parser. The repair engine generator is configured to create a repair engine that: converts the candidate string into repaired strings that neither violate the context-free grammar nor violate a criterion for a valid computer program; calculates differences between the candidate string and the respective repaired strings; and replaces the candidate string with a designated repaired string based at least in part on the difference between the designated repaired string and the candidate string being less than or equal to a difference threshold.
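The final selection step (compare each repaired string with the candidate and accept a repair only within a difference threshold) might look like the sketch below. It assumes the repaired strings have already been produced and validated elsewhere, uses Levenshtein distance as the difference measure, and picks the closest repair; those choices and the function names are illustrative assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]


def choose_repair(candidate, repaired_strings, threshold):
    """Return the closest repair if it is within the threshold, else the candidate."""
    best = min(repaired_strings, key=lambda r: edit_distance(candidate, r), default=None)
    if best is not None and edit_distance(candidate, best) <= threshold:
        return best
    return candidate


print(choose_repair("print('hi'", ["print('hi')", "print()"], threshold=2))
```

The threshold guards against "repairs" that are valid programs but have drifted far from what the user wrote.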
-
Publication number: US20190311004A1
Publication date: 2019-10-10
Application number: US16448805
Filing date: 2019-06-21
Applicant: Microsoft Technology Licensing, LLC
Inventor: Sumit GULWANI , Prateek JAIN , Daniel Adam PERELMAN , Saswat PADHI , Oleksandr POLOZOV
Abstract: A computing device includes a storage machine holding instructions executable by a logic machine to generate multi-string clusters, each containing alphanumeric strings of a dataset. Further multi-string clusters are generated via iterative performance of a combination operation in which a hierarchically-superior cluster is generated from a set of multi-string clusters. The combination operation includes, for candidate pairs of multi-string clusters, generating syntactic profiles describing an alphanumeric string from each multi-string cluster of the candidate pair. For each of the candidate pairs, a cost factor is determined for at least one of its syntactic profiles. Based on the cost factors determined for the syntactic profiles, one of the candidate pairs is selected. The multi-string clusters from the selected candidate pair are combined to generate the hierarchically-superior cluster including all of the alphanumeric strings from the selected candidate pair of multi-string clusters.
-
Publication number: US20180246915A1
Publication date: 2018-08-30
Application number: US15443531
Filing date: 2017-02-27
Applicant: Microsoft Technology Licensing, LLC
Inventor: Rishabh SINGH , Sumit GULWANI , Dana DRACHSLER COHEN
IPC: G06F17/30
CPC classification number: G06F16/221 , G06F16/25
Abstract: Techniques are disclosed which provide for transforming a hierarchical table to a relational table. A hierarchical table may be received, in which a headline row is identified. A candidate row may be determined in the hierarchical table. The process may include systematically classifying headlines as data headlines or descriptors. For each data headline, a new column may be generated, while for each descriptor headline, the table may be split to produce a resultant table. The resultant table may be stored and the process may be repeated until there are no headlines left to be classified. The steps performed by the system to transform the table can then be displayed on a user device as a program in a domain-specific language, which can then be further inspected or modified to perform the desired table transformation.
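A simplified sketch of the "new column per data headline" case follows. It assumes a headline row is any body row whose only non-empty cell is the first one, and it does not attempt the table-splitting case for descriptor headlines; the function name and the `Group` column label are hypothetical.

```python
def flatten(rows, new_column="Group"):
    """Turn headline rows into values of a new leading column on the rows below them."""
    header, *body = rows
    result = [[new_column] + header]
    current = ""
    for row in body:
        filled = [cell for cell in row if cell != ""]
        if len(filled) == 1 and row[0] != "":
            current = row[0]               # headline row: remember its label
        else:
            result.append([current] + row)  # data row: tag with the latest headline
    return result


table = [
    ["Item", "Qty"],
    ["Fruit", ""],
    ["Apple", "3"],
    ["Pear", "5"],
    ["Dairy", ""],
    ["Milk", "2"],
]
for line in flatten(table):
    print(line)
```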
-
Publication number: US20240256423A1
Publication date: 2024-08-01
Application number: US18159712
Filing date: 2023-01-26
Applicant: Microsoft Technology Licensing, LLC
Inventor: Jialu ZHANG , José Pablo CAMBRONERO SÁNCHEZ , Gustavo ARAUJO SOARES , Vu Minh LE , Sumit GULWANI , Gust Ben Anneloes VERBRUGGEN
CPC classification number: G06F11/3608 , G06F8/42 , G06F8/71
Abstract: Some embodiments generate prompts and submit them in queries to a language model trained on code to perform automated program repair. Some embodiments fix syntactic mistakes and semantic mistakes by combining multimodal prompts, iterative querying, test-case-based selection of few-shots, and program chunking. In some cases, edit distance is minimized between an initial flawed program and the automatically created improved version of that program. The initial flawed program is obtained from a programming student, or from a source code generator.
-
Publication number: US20230280989A1
Publication date: 2023-09-07
Application number: US17687577
Filing date: 2022-03-04
Applicant: Microsoft Technology Licensing, LLC
Inventor: José Pablo CAMBRONERO SÁNCHEZ , Sumit GULWANI , Vu Minh LE , Daniel PERELMAN , Arjun RADHAKRISHNA , Daniel Galen SIMMONS , Clint Michael SIMON , Ashish TIWARI
IPC: G06F8/41 , G06F40/211 , G06F40/30
CPC classification number: G06F8/436 , G06F8/427 , G06F40/211 , G06F40/30
Abstract: Techniques are described herein that are capable of synthesizing a computer program to include idiomatic function(s) and semantically-meaningful variable(s) using programming by example. For instance, an intent of a user to synthesize a computer program to include functionality configured to generate sample output(s) from respective input(s) is determined based at least in part on receipt of the sample input(s) and the respective sample output(s) from the user. Based at least in part on the determined intent, the computer program is synthesized to include the idiomatic function(s) by configuring the idiomatic function(s) to have the target functionality and to conform to a convention of the target domain-specific language associated with a textual representation of the computer program to be displayed to the user. Non-semantically-meaningful variable(s) included among the idiomatic function(s) are replaced with the respective semantically-meaningful variable(s). The textual representation of the computer program is caused to be displayed to the user.