-
Publication No.: US10809983B1
Publication Date: 2020-10-20
Application No.: US16198980
Application Date: 2018-11-23
Applicant: Amazon Technologies, Inc.
Inventor: Russell Reas, Neela Sawant, Srinivasan Sengamedu Hanumantha Rao
IPC: G06F8/41
Abstract: Techniques for suggesting a name from one or more code files are described. An exemplary method includes receiving a request to suggest one or more names for a name in a code file; determining one or more names based on existing names in one or more code files using one or more abstract syntax trees (ASTs) for the one or more code files; and outputting the determined one or more names as a name suggestion that comprises novel sequences of sub-tokens of existing names of the one or more code files.
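The idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how candidate names are scored, so the frequency-based ranking below is an assumption, as are the helper names `existing_names`, `sub_tokens`, and `suggest_names`. The sketch uses Python's standard `ast` module to collect existing names from a code file, splits them into sub-tokens, and proposes sub-token sequences that do not already occur as names.

```python
import ast
import re
from collections import Counter
from itertools import permutations


def existing_names(source):
    """Collect identifier names from the AST of one code file."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
    return names


def sub_tokens(name):
    """Split snake_case and camelCase names into lowercase sub-tokens."""
    parts = []
    for chunk in name.split("_"):
        parts.extend(re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", chunk))
    return [p.lower() for p in parts if p]


def suggest_names(source, limit=5):
    """Propose two-sub-token names that do not already exist in the file.

    Assumption: candidates are ranked by how often their sub-tokens
    appear in existing names; the source does not specify a ranking.
    """
    names = existing_names(source)
    counts = Counter(tok for n in names for tok in sub_tokens(n))
    candidates = []
    for a, b in permutations(counts, 2):
        candidate = f"{a}_{b}"
        if candidate not in names:  # keep only novel sequences
            candidates.append((counts[a] + counts[b], candidate))
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in candidates[:limit]]


example = "user_count = 1\ntotal_count = user_count + 1\n"
suggestions = suggest_names(example)
```

For the two-line example, the existing names are `user_count` and `total_count`, so every suggestion is a recombination of `user`, `total`, and `count` that does not already appear in the file.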
-
Publication No.: US11514054B1
Publication Date: 2022-11-29
Application No.: US16145104
Application Date: 2018-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Andrew Borthwick, Robert Anthony Barton, Jr., Stephen Michael Ash, Russell Reas
IPC: G06F16/2455, G06N20/00, G06F16/28, G06F16/22, G06F16/901
Abstract: Supervised partitioning is used to perform record matching. A request to identify matches between records is received. A graph representation that indicates similarities between the records is partitioned and an evaluation of the partitioning is performed according to a supervised machine learning technique to generate a confidence value in the partitioning. An indication of equivalent records according to the partitioning and the confidence value of the partitioning may be provided.
-
Publication No.: US10901708B1
Publication Date: 2021-01-26
Application No.: US16198969
Application Date: 2018-11-23
Applicant: Amazon Technologies, Inc.
Inventor: Russell Reas, Neela Sawant, Srinivasan Sengamedu Hanumantha Rao, Yinglong Wang, Anton Emelyanov, Shishir Sethiya
Abstract: Techniques for unsupervised learning of embeddings on source code from non-local contexts are described. Code can be processed to generate an abstract syntax tree (AST) which represents syntactic paths between tokens in the code. Once the AST(s) have been generated, the paths in the AST(s) can be crawled to identify terminals (e.g., leaf nodes in the AST) and paths between terminals can be identified. The pairs of tokens identified at the ends of each path can then be used to generate a cooccurrence matrix. For example, if X number of unique terminals are identified, a matrix of size X by X can be generated to indicate a frequency at which pairs of terminals cooccur. This cooccurrence matrix can then be used as input to existing techniques for learning vector-space embeddings, such as word2vec, GloVe, Swivel, etc.
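The pipeline in this abstract (parse to an AST, find terminals, pair terminals joined by a path, count the pairs) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: treating `Name`, `arg`, and `Constant` nodes as the terminals, capping path length with a `max_path_len` parameter, and the function names `terminals` and `cooccurrence` are all choices made here for the example.

```python
import ast
from itertools import combinations


def terminals(source):
    """Return (token, path) pairs for terminal nodes of the AST.

    A path is the tuple of child indices leading from the root to the node.
    Assumption: Name, arg, and Constant nodes count as terminals.
    """
    found = []

    def visit(node, path):
        if isinstance(node, ast.Name):
            found.append((node.id, path))
        elif isinstance(node, ast.arg):
            found.append((node.arg, path))
        elif isinstance(node, ast.Constant):
            found.append((repr(node.value), path))
        else:
            for i, child in enumerate(ast.iter_child_nodes(node)):
                visit(child, path + (i,))

    visit(ast.parse(source), ())
    return found


def cooccurrence(source, max_path_len=6):
    """Build an X-by-X matrix counting terminal pairs joined by a short AST path."""
    terms = terminals(source)
    vocab = sorted({tok for tok, _ in terms})
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (tok_a, path_a), (tok_b, path_b) in combinations(terms, 2):
        # Length of the shared prefix locates the lowest common ancestor.
        k = 0
        while k < min(len(path_a), len(path_b)) and path_a[k] == path_b[k]:
            k += 1
        # Edges up from one terminal to the ancestor, then down to the other.
        distance = (len(path_a) - k) + (len(path_b) - k)
        if distance <= max_path_len:
            i, j = index[tok_a], index[tok_b]
            matrix[i][j] += 1
            if i != j:
                matrix[j][i] += 1  # keep the matrix symmetric
    return vocab, matrix


example = "def add(x, y):\n    return x + y\n"
vocab, matrix = cooccurrence(example)
```

Because pairs are counted along syntactic paths rather than within a window of adjacent tokens, the parameter `x` and the `x` used in the return expression are still paired even though they are not neighbors in the text. The resulting matrix could then be passed to an embedding learner such as GloVe or Swivel, which is outside the scope of this sketch.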
-
-