-
Publication No.: US20200167429A1
Publication date: 2020-05-28
Application No.: US16199659
Filing date: 2018-11-26
Applicant: SAP France
Inventors: Gil Katz, Thomas Beucher
Abstract: Disclosed are systems, methods, and non-transitory computer-readable media for efficient use of word embeddings for text classification. A text classification system receives a message including a keyword and determines an embedding value for the keyword. The text classification system uses the embedding value as input into each mathematical function in a set of mathematical functions, yielding a first set of coefficient values for the keyword. Each respective mathematical function corresponds to a respective intent and defines a continuous surface determined from a subset of coefficient values and embedding values for a set of known keywords. For each intent, the text classification system calculates a probability score based on the respective coefficient value from the set of coefficient values that corresponds to the respective intent, yielding a set of probability scores for the message, and then assigns an intent to the message based on the set of probability scores for the message.
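The flow described in this abstract (embedding value, one function per intent, coefficient values, probability scores, assigned intent) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the 2-D embeddings, the coefficient table, the Gaussian-weighted average used as the "continuous surface", and the softmax used to turn coefficient values into probability scores.

```python
import math

# Hypothetical 2-D embeddings for a set of known keywords (illustrative values).
KNOWN_EMBEDDINGS = {
    "refund": (0.9, 0.1),
    "invoice": (0.8, 0.3),
    "hello": (0.1, 0.9),
}

# Hypothetical coefficient value of each known keyword for each intent.
KNOWN_COEFFS = {
    "billing": {"refund": 1.0, "invoice": 0.9, "hello": 0.0},
    "greeting": {"refund": 0.0, "invoice": 0.1, "hello": 1.0},
}


def surface(intent, embedding, width=0.5):
    """Assumed surface form: Gaussian-weighted average of known coefficients."""
    num = den = 0.0
    for word, vec in KNOWN_EMBEDDINGS.items():
        weight = math.exp(-math.dist(vec, embedding) ** 2 / (2 * width ** 2))
        num += weight * KNOWN_COEFFS[intent][word]
        den += weight
    return num / den


def classify(embedding):
    """Return (assigned intent, probability score per intent) for one keyword."""
    coeffs = {intent: surface(intent, embedding) for intent in KNOWN_COEFFS}
    total = sum(math.exp(c) for c in coeffs.values())
    probs = {intent: math.exp(c) / total for intent, c in coeffs.items()}
    return max(probs, key=probs.get), probs
```

Calling `classify((0.85, 0.2))` evaluates both intent surfaces at an embedding close to the billing-related keywords and returns the highest-scoring intent together with the full score dictionary.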
-
Publication No.: US20200097539A1
Publication date: 2020-03-26
Application No.: US16137737
Filing date: 2018-09-21
Applicant: SAP FRANCE
Inventor: Gilles Katz
Abstract: Implementations are directed to receiving text data including a string of characters, processing the text data to determine a set of reference scores including two or more reference scores, each reference score being associated with a respective language, and being determined based on the text data and a dictionary document provided for the respective language, each dictionary document including a compression of a language document provided in the respective language, selectively determining a language of the text data based on the set of reference scores, and providing language data representative of the language as output.
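One way to picture a per-language reference score is to compress the input text against a sample of each language and compare the resulting sizes. The sketch below is not the patented method: it swaps in zlib preset dictionaries (`zdict`) as a stand-in for the abstract's dictionary documents, and the tiny language samples, the function names, and the "smallest compressed size wins" rule are all assumptions made for illustration.

```python
import zlib

# Hypothetical, very small language samples standing in for language documents.
LANGUAGE_SAMPLES = {
    "en": b"the quick brown fox jumps over the lazy dog and this is where "
          b"the people have been with that",
    "fr": b"le renard brun saute par dessus le chien paresseux et c'est "
          b"dans les gens qui ont ete avec cette",
}


def reference_score(text: bytes, sample: bytes) -> int:
    """Compressed size of `text` when `sample` is used as a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=sample)
    return len(compressor.compress(text) + compressor.flush())


def detect_language(text: str) -> str:
    """Pick the language whose sample compresses the text best (assumed rule)."""
    data = text.encode("utf-8")
    scores = {
        lang: reference_score(data, sample)
        for lang, sample in LANGUAGE_SAMPLES.items()
    }
    return min(scores, key=scores.get)
```

Text that shares many substrings with a sample compresses to fewer bytes against it, so a lower score suggests a closer match; with samples this small the result is only meaningful for inputs that overlap them heavily.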
-
Publication No.: US20190188647A1
Publication date: 2019-06-20
Application No.: US15842443
Filing date: 2017-12-14
Applicant: SAP France
Inventors: Jean-Baptiste Gariel, Marc Vu, Marwan Ajala
CPC classification: G06Q10/1053, G06F16/951, G06F16/955, G06Q10/063112
Abstract: Multiple element job classification data objects include values for multiple elements related to a job. The multiple element job classification data object may be generated automatically from a job listing or search query. A database of multiple element job classification data objects may be created using scraping. Scraping job listing data from multiple job listing sites allows for the creation of a centralized database that includes all job listings from the multiple sites. Converting the job listings from a typical title-and-description format into multiple element job classification data objects permits more accurate searching of the data. The database of multiple element job classification data objects may be searched for relevant job listings by a user who provides a text string. The text string is converted into a multiple element job classification data object and used to find job listings that correspond to the user's search.
-
Publication No.: US20230169362A1
Publication date: 2023-06-01
Application No.: US17538120
Filing date: 2021-11-30
Applicant: SAP France
Inventor: Thomas Beucher
Abstract: A method may include training a first machine learning model to perform a question generation task and a second machine learning model to perform a question answering task. The first machine learning model and the second machine learning model may be subjected to a collaborative training in which a first plurality of weights applied by the first machine learning model generating one or more questions are adjusted to minimize an error in an output of the second machine learning model answering the one or more questions. The first machine learning model and the second machine learning model may be deployed to perform a natural language processing task that requires the first machine learning model to generate a question and/or the second machine learning model to answer a question. Related methods and articles of manufacture are also disclosed.
-
Publication No.: US10796090B2
Publication date: 2020-10-06
Application No.: US16137737
Filing date: 2018-09-21
Applicant: SAP FRANCE
Inventor: Gilles Katz
IPC classification: G06F40/263, G06N20/00, G06F40/242
Abstract: Implementations are directed to receiving text data including a string of characters, processing the text data to determine a set of reference scores including two or more reference scores, each reference score being associated with a respective language, and being determined based on the text data and a dictionary document provided for the respective language, each dictionary document including a compression of a language document provided in the respective language, selectively determining a language of the text data based on the set of reference scores, and providing language data representative of the language as output.