Text data representation learning using random document embedding

    Publication number: US11823013B2

    Publication date: 2023-11-21

    Application number: US15689799

    Filing date: 2017-08-29

    CPC classification number: G06N20/00 G06F16/3331

    Abstract: Embodiments of the present invention provide a computer-implemented method for performing unsupervised feature representation learning for text data. The method generates reference text data having a set of random text sequences, in which each text sequence of the set of random text sequences is of a random length and comprises a number of random words, and in which each random length is sampled from a minimum length to a maximum length. The random words of each text sequence in the set are drawn from a distribution. The method generates a feature matrix for raw text data based at least in part on a set of computed distances between the set of random text sequences and the raw text data. The method provides the feature matrix as an input to one or more machine learning models.
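
The steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not name the word distribution or the distance measure, so the sketch assumes a uniform distribution over a toy vocabulary and swaps in Jaccard distance over word sets. All function names, documents, and parameter values are hypothetical.

```python
import random


def make_random_sequences(vocab, n_sequences, min_len, max_len, rng):
    """Build random word sequences, each with a length sampled from [min_len, max_len]."""
    sequences = []
    for _ in range(n_sequences):
        length = rng.randint(min_len, max_len)  # inclusive on both ends
        # Assumption: words are drawn uniformly from the vocabulary.
        sequences.append([rng.choice(vocab) for _ in range(length)])
    return sequences


def jaccard_distance(a, b):
    """Stand-in distance: 1 minus intersection-over-union of the two word sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def feature_matrix(documents, random_sequences):
    """Return one row per document and one column per random sequence."""
    return [
        [jaccard_distance(doc.split(), seq) for seq in random_sequences]
        for doc in documents
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
documents = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stock prices fell today",
]
vocab = sorted({word for doc in documents for word in doc.split()})
reference = make_random_sequences(vocab, n_sequences=4, min_len=1, max_len=5, rng=rng)
features = feature_matrix(documents, reference)
```

Each row of `features` is a fixed-length vector for one document, which is the form a downstream machine learning model would take as input.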

    Abstract Meaning Representation Parsing with Graph Translation

    Publication number: US20220171923A1

    Publication date: 2022-06-02

    Application number: US17109008

    Filing date: 2020-12-01

    Abstract: A computer-implemented method for generating an abstract meaning representation (“AMR”) of a sentence comprises receiving, by a computing device, an input sentence and parsing the input sentence into one or more syntactic and/or semantic graphs. An input graph including a node set and an edge set is formed from the one or more syntactic and/or semantic graphs. Node representations are generated by natural language processing. The input graph is provided to a first neural network to provide an output graph having learned node representations aligned with the node representations in the input graph. The method further includes predicting, via a second neural network, node labels and predicting, via a third neural network, edge labels in the output graph. The AMR is generated based on the predicted node labels and predicted edge labels. A system and a non-transitory computer readable storage medium are also disclosed.

    LEARNING-BASED AUTOMATION MACHINE LEARNING CODE ANNOTATION IN COMPUTATIONAL NOTEBOOKS

    Publication number: US20220113964A1

    Publication date: 2022-04-14

    Application number: US17069402

    Filing date: 2020-10-13

    Abstract: One embodiment of the invention provides a method for automated code annotation in machine learning (ML) and data science. The method comprises receiving, as input, a section of executable code. The method further comprises classifying, via a ML model, the section of executable code with a stage classification label indicative of a stage within a workflow for automated ML that the executable code applies to. The method further comprises categorizing, based on the stage classification label, the section of executable code with a category of annotation that is most appropriate for the section of executable code. The method further comprises generating a suggested annotation for the section of executable code based on the category of annotation. The method further comprises providing, as output, the suggested annotation to a display of an electronic device for user review. The suggested annotation is user interactable via the electronic device.

    Semantic parsing using encoded structured representation

    Publication number: US11157705B2

    Publication date: 2021-10-26

    Application number: US16518120

    Filing date: 2019-07-22

    Abstract: Aspects described herein include a method of semantic parsing, and related system and computer program product. The method comprises receiving an input comprising a plurality of words, generating a structured representation of the plurality of words, encoding the structured representation into a latent embedding space, and decoding the encoded structured representation from the latent embedding space into a logical representation of the plurality of words.
