EFFICIENT USE OF WORD EMBEDDINGS FOR TEXT CLASSIFICATION

    Publication No.: US20200167429A1

    Publication date: 2020-05-28

    Application No.: US16199659

    Filing date: 2018-11-26

    Applicant: SAP France

    IPC classification: G06F17/30 G06F17/10

    Abstract: Disclosed are systems, methods, and non-transitory computer-readable media for efficient use of word embeddings for text classification. A text classification system receives a message including a keyword and determines an embedding value for the keyword. The text classification system uses the embedding value as input into each mathematical function in a set of mathematical functions, yielding a first set of coefficient values for the keyword. Each respective mathematical function corresponds to a respective intent and defines a continuous surface determined from a subset of coefficient values and embedding values for a set of known keywords. For each intent, the text classification system calculates a probability score based on the respective coefficient value from the set of coefficient values that corresponds to the respective intent, yielding a set of probability scores for the message, and then assigns an intent to the message based on the set of probability scores for the message.
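
The flow in this abstract (embedding in, one coefficient per intent out, probability scores, then an assigned intent) can be illustrated with a minimal sketch. Everything concrete here is an assumption and not taken from the patent: the two-dimensional embeddings, the `KNOWN` keyword table, the use of inverse-distance weighting as the continuous surface, and the softmax used to turn coefficients into probability scores.

```python
import math

# Hypothetical known keywords: embedding vector plus one coefficient per intent.
KNOWN = {
    "refund": ((1.0, 0.0), {"billing": 0.9, "greeting": 0.1}),
    "invoice": ((0.8, 0.2), {"billing": 0.8, "greeting": 0.2}),
    "hello": ((0.0, 1.0), {"billing": 0.1, "greeting": 0.9}),
}
INTENTS = ("billing", "greeting")


def surface(intent, embedding, power=2.0):
    """Evaluate one intent's surface at an embedding.

    Assumption: the surface is an inverse-distance-weighted interpolation
    of the known keywords' coefficients for that intent.
    """
    num = 0.0
    den = 0.0
    for known_embedding, coeffs in KNOWN.values():
        d = math.dist(embedding, known_embedding)
        if d == 0.0:
            # Exactly on a known keyword: return its coefficient directly.
            return coeffs[intent]
        w = 1.0 / d ** power
        num += w * coeffs[intent]
        den += w
    return num / den


def classify(embedding):
    """Return (assigned intent, probability score per intent)."""
    coeffs = {intent: surface(intent, embedding) for intent in INTENTS}
    # Assumption: softmax converts coefficient values into probability scores.
    exps = {intent: math.exp(c) for intent, c in coeffs.items()}
    total = sum(exps.values())
    probs = {intent: e / total for intent, e in exps.items()}
    return max(probs, key=probs.get), probs


if __name__ == "__main__":
    intent, probs = classify((0.9, 0.1))
    print(intent, probs)
```

Evaluating a small interpolated function per intent avoids running a full classifier over the embedding, which appears to be the efficiency the title refers to; the interpolation scheme itself is only one possible choice.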

    QUICK LANGUAGE DETECTION WITH LANGUAGE NEUTRAL FUNCTIONALITY

    Publication No.: US20200097539A1

    Publication date: 2020-03-26

    Application No.: US16137737

    Filing date: 2018-09-21

    Applicant: SAP FRANCE

    Inventor: Gilles Katz

    IPC classification: G06F17/27 G06N99/00

    Abstract: Implementations are directed to receiving text data including a string of characters; processing the text data to determine a set of two or more reference scores, each reference score being associated with a respective language and determined based on the text data and a dictionary document provided for the respective language, each dictionary document including a compression of a language document provided in the respective language; selectively determining a language of the text data based on the set of reference scores; and providing language data representative of the language as output.
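
A compression-based reference score of the kind described in this abstract can be sketched with Python's standard `zlib` module. This is an illustration under stated assumptions, not the patented method: the tiny `LANGUAGE_DOCS` samples are hypothetical, the raw sample is passed as a preset dictionary (`zdict`) instead of a precompressed dictionary document, the score is simply the compressed size, and the `min_margin` threshold stands in for the "selective" decision.

```python
import zlib

# Hypothetical per-language sample texts standing in for the language documents.
LANGUAGE_DOCS = {
    "en": b"the quick brown fox jumps over the lazy dog "
          b"and then the dog runs after the fox in the garden",
    "fr": b"le renard brun saute par-dessus le chien paresseux "
          b"et puis le chien court dans le jardin",
}


def reference_score(text: bytes, language_doc: bytes) -> int:
    """Compressed size of text when language_doc primes the compressor.

    A lower score means the text shares more substrings with the language.
    """
    compressor = zlib.compressobj(level=9, zdict=language_doc)
    return len(compressor.compress(text) + compressor.flush())


def detect_language(text: str, min_margin: int = 2):
    """Return (language or None, reference scores).

    Assumption: the language is reported only when the best score beats
    the runner-up by at least min_margin bytes.
    """
    data = text.encode("utf-8")
    scores = {lang: reference_score(data, doc) for lang, doc in LANGUAGE_DOCS.items()}
    ranked = sorted(scores, key=scores.get)
    best, runner_up = ranked[0], ranked[1]
    if scores[runner_up] - scores[best] < min_margin:
        return None, scores
    return best, scores


if __name__ == "__main__":
    print(detect_language("the dog runs after the fox"))
    print(detect_language("le chien court dans le jardin"))
```

Because the scoring function only compares byte sequences, the same code handles any language for which a sample document is supplied, which is one way to read "language neutral" in the title.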

    MULTIPLE ELEMENT JOB CLASSIFICATION
    Document type: Patent application

    Publication No.: US20190188647A1

    Publication date: 2019-06-20

    Application No.: US15842443

    Filing date: 2017-12-14

    Applicant: SAP France

    IPC classification: G06Q10/10 G06Q10/06 G06F17/30

    Abstract: Multiple element job classification data objects include values for multiple elements related to a job. The multiple element job classification data object may be generated automatically from a job listing or search query. A database of multiple element job classification data objects may be created using scraping. Scraping job listing data from multiple job listing sites allows for the creation of a centralized database that includes all job listings from the multiple sites. Converting the job listings from a typical title-and-description format into multiple element job classification data objects permits more accurate searching of the data. The database of multiple element job classification data objects may be searched for relevant job listings by a user who provides a text string. The text string is converted into a multiple element job classification data object and used to find job listings that correspond to the user's search.
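
The idea of converting both listings and search strings into the same multi-element object, then matching element by element, can be sketched as follows. The element names (`role`, `seniority`, `location`), the `VOCAB` term lists, and the keyword-lookup conversion are all hypothetical choices made for illustration; the abstract does not specify which elements are used or how they are extracted.

```python
import re
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class JobClassification:
    """Hypothetical multiple element job classification data object."""
    role: Optional[str] = None
    seniority: Optional[str] = None
    location: Optional[str] = None


# Hypothetical vocabularies, one per element.
VOCAB = {
    "role": ("engineer", "designer", "analyst"),
    "seniority": ("junior", "senior", "lead"),
    "location": ("paris", "berlin", "remote"),
}


def to_classification(text: str) -> JobClassification:
    """Convert a listing title or a search string into a data object."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    values = {
        element: next((term for term in terms if term in tokens), None)
        for element, terms in VOCAB.items()
    }
    return JobClassification(**values)


def matches(query: JobClassification, listing: JobClassification) -> bool:
    """A listing matches when it agrees on every element the query specifies."""
    return all(
        getattr(query, f.name) is None
        or getattr(query, f.name) == getattr(listing, f.name)
        for f in fields(query)
    )


def search(text: str, listings: list) -> list:
    """Return the raw listings whose data objects match the query's."""
    query = to_classification(text)
    return [raw for raw in listings if matches(query, to_classification(raw))]


if __name__ == "__main__":
    scraped = ["Senior Engineer, Paris", "Junior Designer, Berlin", "Lead engineer (remote)"]
    print(search("senior engineer", scraped))
```

Matching on structured elements instead of free text is what lets the query "engineer" find listings whose titles are worded differently, as long as they normalize to the same element value.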

    SHARED NETWORK LEARNING FOR MACHINE LEARNING ENABLED TEXT CLASSIFICATION

    Publication No.: US20230169362A1

    Publication date: 2023-06-01

    Application No.: US17538120

    Filing date: 2021-11-30

    Applicant: SAP France

    Inventor: Thomas Beucher

    IPC classification: G06N5/04 G06N20/00

    CPC classification: G06N5/04 G06N20/00

    Abstract: A method may include training a first machine learning model to perform a question generation task and a second machine learning model to perform a question answering task. The first machine learning model and the second machine learning model may be subjected to a collaborative training in which a first plurality of weights applied by the first machine learning model generating one or more questions are adjusted to minimize an error in an output of the second machine learning model answering the one or more questions. The first machine learning model and the second machine learning model may be deployed to perform a natural language processing task that requires the first machine learning model to generate a question and/or the second machine learning model to answer a question. Related methods and articles of manufacture are also disclosed.

    Quick language detection with language neutral functionality

    Publication No.: US10796090B2

    Publication date: 2020-10-06

    Application No.: US16137737

    Filing date: 2018-09-21

    Applicant: SAP FRANCE

    Inventor: Gilles Katz

    Abstract: Implementations are directed to receiving text data including a string of characters; processing the text data to determine a set of two or more reference scores, each reference score being associated with a respective language and determined based on the text data and a dictionary document provided for the respective language, each dictionary document including a compression of a language document provided in the respective language; selectively determining a language of the text data based on the set of reference scores; and providing language data representative of the language as output.