Adaptive semi-supervised learning for cross-domain sentiment classification

    Publication number: US10817668B2

    Publication date: 2020-10-27

    Application number: US16199422

    Application date: 2018-11-26

    Applicant: SAP SE

    Inventor: Ruidan He

    Abstract: Methods, systems, and computer-readable storage media for receiving a source domain data set including a set of source document and source label pairs, each source label corresponding to a source domain and indicating a sentiment attributed to a respective source document, receiving a target domain data set including a set of target documents absent target labels, processing documents of the source and target domains using a feature encoder of a DAS platform, to map the documents of the source and target domains to a shared feature space through feature representations, the processing including minimizing a distance between the feature representations of the source domain, and feature representations of the target domain based on a set of loss functions, providing an ensemble prediction from the processing, and providing predicted labels based on the ensemble prediction, the predicted labels being used by the sentiment classifier to classify documents from the target domain.
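The abstract above does not specify which distance or which loss functions are used. As a rough illustration only, the sketch below assumes the distance is the squared Euclidean distance between the mean source and mean target feature vectors, and that the ensemble prediction is a plain average of class probabilities over several prediction passes. The names `mean_feature_distance` and `ensemble_labels` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(features: Sequence[Sequence[float]]) -> Vector:
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(features)
    dim = len(features[0])
    return [sum(row[j] for row in features) / n for j in range(dim)]


def mean_feature_distance(source: Sequence[Sequence[float]],
                          target: Sequence[Sequence[float]]) -> float:
    """Assumed distance: squared Euclidean gap between the domain means.

    A training loop would add this term to the supervised loss so that the
    encoder is pushed to map both domains into a shared feature space.
    """
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def ensemble_labels(prediction_runs: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities over runs, then take the argmax per document.

    prediction_runs[r][d][c] is the probability of class c for document d in run r.
    """
    n_runs = len(prediction_runs)
    n_docs = len(prediction_runs[0])
    labels = []
    for d in range(n_docs):
        n_classes = len(prediction_runs[0][d])
        avg = [sum(run[d][c] for run in prediction_runs) / n_runs
               for c in range(n_classes)]
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels
```

In this sketch the labels returned by `ensemble_labels` would play the role of the predicted labels for the unlabeled target documents; how the patented method weights or schedules them is not stated in the abstract.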

    ADAPTIVE SEMI-SUPERVISED LEARNING FOR CROSS-DOMAIN SENTIMENT CLASSIFICATION

    Publication number: US20200167418A1

    Publication date: 2020-05-28

    Application number: US16199422

    Application date: 2018-11-26

    Applicant: SAP SE

    Inventor: Ruidan He

    Abstract: Methods, systems, and computer-readable storage media for receiving a source domain data set including a set of source document and source label pairs, each source label corresponding to a source domain and indicating a sentiment attributed to a respective source document, receiving a target domain data set including a set of target documents absent target labels, processing documents of the source and target domains using a feature encoder of a DAS platform, to map the documents of the source and target domains to a shared feature space through feature representations, the processing including minimizing a distance between the feature representations of the source domain, and feature representations of the target domain based on a set of loss functions, providing an ensemble prediction from the processing, and providing predicted labels based on the ensemble prediction, the predicted labels being used by the sentiment classifier to classify documents from the target domain.

    Unsupervised neural attention model for aspect extraction

    Publication number: US10755174B2

    Publication date: 2020-08-25

    Application number: US15484577

    Application date: 2017-04-11

    Applicant: SAP SE

    Abstract: Methods, systems, and computer-readable storage media for receiving a vocabulary, the vocabulary including text data that is provided as at least a portion of raw data, the raw data being provided in a computer-readable file, associating each word in the vocabulary with a feature vector, providing a sentence embedding for each sentence of the vocabulary based on a plurality of feature vectors to provide a plurality of sentence embeddings, providing a reconstructed sentence embedding for each sentence embedding based on a weighted parameter matrix to provide a plurality of reconstructed sentence embeddings, and training the unsupervised neural attention model based on the sentence embeddings and the reconstructed sentence embeddings to provide a trained neural attention model, the trained neural attention model being used to automatically determine aspects from the vocabulary.
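To make the sentence-embedding and reconstruction steps above more concrete, here is a minimal sketch under simplifying assumptions. It assumes attention scores are dot products between each word vector and the sentence average, and that the reconstruction is a softmax-weighted mix of the rows of an aspect matrix. The abstract only says the reconstruction is "based on a weighted parameter matrix", so `aspect_matrix` and both function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def sentence_embedding(word_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Attention-weighted sum of word vectors (assumed scoring: dot with the average)."""
    n = len(word_vectors)
    dim = len(word_vectors[0])
    avg = [sum(v[j] for v in word_vectors) / n for j in range(dim)]
    weights = softmax([dot(v, avg) for v in word_vectors])
    return [sum(w * v[j] for w, v in zip(weights, word_vectors)) for j in range(dim)]


def reconstruct(sentence_vec: Sequence[float],
                aspect_matrix: Sequence[Sequence[float]]) -> List[float]:
    """Rebuild the sentence vector as a convex combination of aspect rows."""
    aspect_weights = softmax([dot(row, sentence_vec) for row in aspect_matrix])
    dim = len(sentence_vec)
    return [sum(p * row[j] for p, row in zip(aspect_weights, aspect_matrix))
            for j in range(dim)]
```

Training would then adjust the parameters so that `reconstruct(z, aspect_matrix)` stays close to `z` for real sentences; the exact objective is not given in the abstract.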

    Unsupervised aspect extraction from raw data using word embeddings

    Publication number: US10223354B2

    Publication date: 2019-03-05

    Application number: US15478363

    Application date: 2017-04-04

    Applicant: SAP SE

    Abstract: Methods, systems, and computer-readable storage media for receiving a vocabulary that includes text data that is provided as at least a portion of raw data, the raw data being provided in a computer-readable file, providing word embeddings based on the vocabulary, the word embeddings including word vectors for words included in the vocabulary, clustering word embeddings to provide a plurality of clusters, each cluster representing an aspect inferred from the vocabulary, determining a respective association score between each word in the vocabulary and a respective aspect, and providing a word ranking for each aspect based on the respective association scores.
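The cluster-then-rank pipeline described above can be sketched as follows. The abstract names neither the clustering algorithm nor the association score, so this example assumes a basic k-means (seeded with the first `k` vectors for determinism) and cosine similarity to each cluster centroid. All function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def kmeans(vectors: Sequence[Sequence[float]], k: int,
           iterations: int = 10) -> List[List[float]]:
    """Plain k-means; the first k vectors serve as the initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups: List[List[Sequence[float]]] = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(v, centroids[c])),
            )
            groups[nearest].append(v)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def rank_words(embeddings: Dict[str, List[float]], k: int) -> List[List[str]]:
    """For each cluster (aspect), rank all words by cosine association score."""
    words = list(embeddings)
    centroids = kmeans([embeddings[w] for w in words], k)
    return [
        sorted(words, key=lambda w: cosine(embeddings[w], c), reverse=True)
        for c in centroids
    ]
```

With toy two-dimensional vectors, words pointing along the same axis end up at the top of the same aspect ranking; real word embeddings would come from a trained embedding model, which is outside this sketch.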

    Exploiting document knowledge for aspect-level sentiment classification

    Publication number: US10726207B2

    Publication date: 2020-07-28

    Application number: US16200829

    Application date: 2018-11-27

    Applicant: SAP SE

    Inventor: Ruidan He

    Abstract: Methods, systems, and computer-readable storage media for receiving a set of document-level training data including a plurality of documents, each document having a sentiment label associated therewith, receiving a set of aspect-level training data including a plurality of aspects, each aspect having a sentiment label associated therewith, training the aspect-level sentiment classifier including a long short-term memory (LSTM) network, and an output layer using one or more of pretraining, and multi-task learning based on the document-level training data and the aspect-level training data, pretraining including initializing parameters based on pretrained weights that are fine-tuned during training, and multi-task learning including simultaneous training of document-level classification and aspect-level classification, and providing the aspect-level sentiment classifier for classifying one or more aspects in one or more sentences of one or more input documents based on sentiment classes.
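The two transfer strategies named above, pretraining and multi-task learning, can be illustrated without a full LSTM. The sketch below assumes that pretraining amounts to copying selected document-level parameters into the aspect-level model before fine-tuning, and that multi-task learning minimizes a weighted sum of the two cross-entropy losses. The weight `doc_weight`, the parameter names, and the functions themselves are hypothetical; the abstract does not state how the two objectives are combined.

```python
import math
from typing import Dict, List, Sequence


def cross_entropy(probs: Sequence[float], gold: int) -> float:
    """Negative log-likelihood of the gold class."""
    return -math.log(probs[gold])


def init_from_pretrained(aspect_params: Dict[str, List[float]],
                         doc_params: Dict[str, List[float]],
                         shared_keys: Sequence[str]) -> Dict[str, List[float]]:
    """Pretraining step (assumed form): copy shared weights from the document model.

    The copied values are only a starting point and would be fine-tuned
    during aspect-level training.
    """
    merged = dict(aspect_params)
    for key in shared_keys:
        merged[key] = list(doc_params[key])
    return merged


def joint_loss(aspect_probs: Sequence[float], aspect_gold: int,
               doc_probs: Sequence[float], doc_gold: int,
               doc_weight: float = 0.1) -> float:
    """Multi-task objective (assumed form): aspect loss plus weighted document loss."""
    return (cross_entropy(aspect_probs, aspect_gold)
            + doc_weight * cross_entropy(doc_probs, doc_gold))
```

In a real setup both losses would be computed from networks that share an encoder, so one gradient step on `joint_loss` updates the shared layers using both tasks at once.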

    UNSUPERVISED NEURAL ATTENTION MODEL FOR ASPECT EXTRACTION

    Publication number: US20180293499A1

    Publication date: 2018-10-11

    Application number: US15484577

    Application date: 2017-04-11

    Applicant: SAP SE

    Abstract: Methods, systems, and computer-readable storage media for receiving a vocabulary, the vocabulary including text data that is provided as at least a portion of raw data, the raw data being provided in a computer-readable file, associating each word in the vocabulary with a feature vector, providing a sentence embedding for each sentence of the vocabulary based on a plurality of feature vectors to provide a plurality of sentence embeddings, providing a reconstructed sentence embedding for each sentence embedding based on a weighted parameter matrix to provide a plurality of reconstructed sentence embeddings, and training the unsupervised neural attention model based on the sentence embeddings and the reconstructed sentence embeddings to provide a trained neural attention model, the trained neural attention model being used to automatically determine aspects from the vocabulary.

    EXPLOITING DOCUMENT KNOWLEDGE FOR ASPECT-LEVEL SENTIMENT CLASSIFICATION

    Publication number: US20200167419A1

    Publication date: 2020-05-28

    Application number: US16200829

    Application date: 2018-11-27

    Applicant: SAP SE

    Inventor: Ruidan He

    Abstract: Methods, systems, and computer-readable storage media for receiving a set of document-level training data including a plurality of documents, each document having a sentiment label associated therewith, receiving a set of aspect-level training data including a plurality of aspects, each aspect having a sentiment label associated therewith, training the aspect-level sentiment classifier including a long short-term memory (LSTM) network, and an output layer using one or more of pretraining, and multi-task learning based on the document-level training data and the aspect-level training data, pretraining including initializing parameters based on pretrained weights that are fine-tuned during training, and multi-task learning including simultaneous training of document-level classification and aspect-level classification, and providing the aspect-level sentiment classifier for classifying one or more aspects in one or more sentences of one or more input documents based on sentiment classes.

    UNSUPERVISED ASPECT EXTRACTION FROM RAW DATA USING WORD EMBEDDINGS

    Publication number: US20180285344A1

    Publication date: 2018-10-04

    Application number: US15478363

    Application date: 2017-04-04

    Applicant: SAP SE

    CPC classification number: G06F17/2785

    Abstract: Methods, systems, and computer-readable storage media for receiving a vocabulary that includes text data that is provided as at least a portion of raw data, the raw data being provided in a computer-readable file, providing word embeddings based on the vocabulary, the word embeddings including word vectors for words included in the vocabulary, clustering word embeddings to provide a plurality of clusters, each cluster representing an aspect inferred from the vocabulary, determining a respective association score between each word in the vocabulary and a respective aspect, and providing a word ranking for each aspect based on the respective association scores.
