Detection of a topic
    Granted invention patent

    Publication No.: US11755831B2

    Publication date: 2023-09-12

    Application No.: US17061989

    Filing date: 2020-10-02

    Applicant: Telia Company AB

    Abstract: The present invention relates to a method for detecting the topic of a message introduced in a real-time customer service messaging platform. In the method, a message comprising at least one word from which the topic is definable is received; a topic is extracted from the received message; a database is queried to determine whether the topic is determinable from a number of messages received chronologically earlier than the received message; and an indication is generated to an operator of the real-time customer service messaging platform in accordance with the detection result obtained through the database inquiry. Some aspects of the present invention relate to a network node, to a computer program product and to a system.
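A minimal sketch of the flow described in this abstract, under stated assumptions: the keyword table `TOPIC_KEYWORDS`, the in-memory `TopicStore` standing in for the database, and the `threshold` parameter are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field

# Hypothetical keyword-to-topic table; the abstract does not say how topics are extracted.
TOPIC_KEYWORDS = {
    "router": "broadband outage",
    "broadband": "broadband outage",
    "invoice": "billing",
}


def extract_topic(message):
    """Return the topic of the first known keyword in the message, or None."""
    for word in message.lower().split():
        topic = TOPIC_KEYWORDS.get(word.strip(".,!?"))
        if topic is not None:
            return topic
    return None


@dataclass
class TopicStore:
    """In-memory stand-in for the database of earlier messages."""
    rows: list = field(default_factory=list)  # (timestamp, topic) pairs

    def count_earlier(self, topic, before):
        return sum(1 for ts, t in self.rows if t == topic and ts < before)

    def add(self, timestamp, topic):
        self.rows.append((timestamp, topic))


def handle_message(store, timestamp, message, threshold=2):
    """Extract the topic, query earlier messages, and return an operator indication or None."""
    topic = extract_topic(message)
    if topic is None:
        return None
    seen = store.count_earlier(topic, timestamp)
    store.add(timestamp, topic)
    if seen >= threshold:
        return f"Topic '{topic}' already seen in {seen} earlier messages"
    return None
```

Calling `handle_message` three times with router or broadband complaints returns `None` twice and an indication string on the third call, because the assumed threshold of two earlier messages is then met.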

    Mapping data set(s) to canonical phrases using natural language processing model(s)

    Publication No.: US11734511B1

    Publication date: 2023-08-22

    Application No.: US16946840

    Filing date: 2020-07-08

    IPC classification: G06F40/289 G06N3/044

    CPC classification: G06F40/289 G06N3/044

    Abstract: Techniques are disclosed that enable generating a unified data set by mapping a set of item description phrases, describing entries in a data set, to a set of canonical phrases. Various implementations include generating a similarity measure between each item description phrase and each canonical phrase by processing the corresponding item description phrase and the corresponding canonical phrase using a natural language processing model. Additional or alternative implementations include generating a bipartite graph based on the set of item description phrases, the set of canonical phrases, and the similarity measures. The mapping can be generated based on the bipartite graph.
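The mapping idea can be sketched as follows. This is an illustration only: `difflib.SequenceMatcher` is swapped in for the natural language processing model, the graph is a plain dictionary of weighted edges, and picking the heaviest edge per item with a `min_similarity` cutoff is an assumed strategy, since the abstract does not say how the mapping is derived from the graph.

```python
from difflib import SequenceMatcher


def similarity(item, canonical):
    """Character-level similarity in [0, 1]; a stand-in for an NLP model score."""
    return SequenceMatcher(None, item.lower(), canonical.lower()).ratio()


def build_bipartite_graph(items, canonicals):
    """Weighted edges between every item phrase and every canonical phrase."""
    return {(i, c): similarity(i, c) for i in items for c in canonicals}


def map_items(items, canonicals, min_similarity=0.5):
    """Map each item to its highest-weight canonical phrase, or None if too dissimilar."""
    graph = build_bipartite_graph(items, canonicals)
    mapping = {}
    for item in items:
        best = max(canonicals, key=lambda c: graph[(item, c)])
        mapping[item] = best if graph[(item, best)] >= min_similarity else None
    return mapping
```

A one-to-one assignment (for example a maximum-weight matching) would be another reasonable reading of "based on the bipartite graph"; the greedy per-item choice is used here only because it is short.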

    TOPIC LABELING BY SENTIMENT POLARITY IN TOPIC MODELING

    Publication No.: US20230259711A1

    Publication date: 2023-08-17

    Application No.: US17669484

    Filing date: 2022-02-11

    Abstract: Described are techniques for topic modeling including a computer-implemented method of generating a plurality of topic labels corresponding to a plurality of documents clustered into a plurality of topics, where the plurality of topic labels include a sentiment-oriented topic label and a sentiment-neutral topic label. The method further comprises calculating term frequency-inverse document frequency (TF-IDF) values for respective topic labels and corresponding pluralities of documents. The method further comprises receiving a selected sentiment polarity from a user device. The method further comprises identifying a subset of the plurality of topic labels that satisfy the selected sentiment polarity. The method further comprises transmitting at least one topic label of the subset of the plurality of topic labels to the user device, where the at least one topic label has a higher TF-IDF value than other topic labels in the subset of the plurality of topic labels.
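A rough sketch of the selection step, assuming single-word labels, whitespace tokenization, and a smoothed IDF computed over topics rather than individual documents. The data layout (`labels` as dictionaries, `topics` as a mapping from topic name to documents) is hypothetical.

```python
import math


def tokens(docs):
    """Lowercased whitespace tokens of all documents in one topic."""
    return [w for doc in docs for w in doc.lower().split()]


def tf_idf(term, topic, topics):
    """TF of the term within its topic's documents times a smoothed IDF across topics."""
    words = tokens(topics[topic])
    tf = words.count(term) / len(words) if words else 0.0
    containing = sum(1 for docs in topics.values() if term in tokens(docs))
    idf = math.log((1 + len(topics)) / (1 + containing)) + 1.0
    return tf * idf


def best_label(labels, topics, polarity):
    """Return the label text with the highest TF-IDF among labels of the selected polarity."""
    subset = [l for l in labels if l["polarity"] == polarity]
    if not subset:
        return None
    return max(subset, key=lambda l: tf_idf(l["text"], l["topic"], topics))["text"]
```

Here `polarity` plays the role of the sentiment polarity received from the user device, and the returned label is the one that would be transmitted back.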

    Method of abbreviated typing and compression of texts written in languages using alphabetic scripts

    Publication No.: US11720760B2

    Publication date: 2023-08-08

    Application No.: US16652490

    Filing date: 2019-05-02

    Applicant: AbbType Ltd

    Abstract: The invention provides a computer-implemented method of drafting abbreviations for the statistically most frequent word forms and phrases for the purposes of computer typing and compression of texts written in languages using alphabetic scripts with full vowel representation. The drafted abbreviations do not constitute meaningful word forms of the language for which they are drafted. Every abbreviated word form or phrase is assigned exactly one unique and exclusive abbreviation, which is based on the letters contained in that word form or phrase and follows the order in which these letters appear in it. For a given word form, one-letter, two-letter, three-letter and four-letter abbreviations are chosen according to the statistical frequency of the word forms in a way that allows the mathematically most efficient process of abbreviation of the text.
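The constraints in this abstract (letters taken in order from the word, one unique abbreviation per word, no collision with real words, shorter codes for more frequent words) can be illustrated with a greedy sketch. The greedy order and the `lexicon` and `freqs` inputs are assumptions; the patent's actual optimization procedure is not given in the abstract.

```python
from itertools import combinations


def candidates(word, max_len=4):
    """Yield ordered letter subsequences of the word, shortest first, always shorter than the word."""
    for n in range(1, min(max_len, len(word) - 1) + 1):
        for idx in combinations(range(len(word)), n):
            yield "".join(word[i] for i in idx)


def draft_abbreviations(freqs, lexicon):
    """Assign each word, most frequent first, the first free candidate that is not a real word."""
    used = set()
    result = {}
    for word in sorted(freqs, key=freqs.get, reverse=True):
        for cand in candidates(word):
            if cand not in lexicon and cand not in used:
                used.add(cand)
                result[word] = cand
                break
    return result
```

Because `combinations` preserves index order, every candidate keeps the letters in the order they appear in the word, which is the property the abstract requires.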

    LANGUAGE REPRESENTATION MODEL SYSTEM, PRE-TRAINING METHOD AND APPARATUS, DEVICE, AND MEDIUM

    Publication No.: US20230244879A1

    Publication date: 2023-08-03

    Application No.: US17923316

    Filing date: 2021-07-23

    Abstract: Disclosed are a language representation model system, a language representation model pre-training method, a natural language processing method, an electronic device, and a storage medium. The language representation model system includes: a word granularity language representation sub-model based on segmentation in units of words, and a phrase granularity language representation sub-model based on segmentation in units of phrases. The word granularity language representation sub-model is configured to output, based on a sentence segmented in units of words, a first semantic vector corresponding to a semantic expressed by each segmented word in the sentence. The phrase granularity language representation sub-model is configured to output, based on the sentence segmented in units of phrases, a second semantic vector corresponding to a semantic expressed by each segmented phrase in the sentence.
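To make the two granularities concrete, here is a toy sketch that segments one sentence both ways and emits one vector per segment. Everything in it is hypothetical: the `PHRASES` lexicon, the greedy phrase matcher, and especially `toy_vector`, which is a hash-based placeholder and not a trained language representation model.

```python
import hashlib

# Hypothetical phrase lexicon; a real system would learn or look up phrases differently.
PHRASES = {("machine", "learning"), ("new", "york")}
MAX_PHRASE_WORDS = 2


def segment_words(sentence):
    """Word-granularity segmentation: lowercase whitespace split."""
    return sentence.lower().split()


def segment_phrases(sentence):
    """Phrase-granularity segmentation: greedy longest match against PHRASES."""
    words = segment_words(sentence)
    segments, i = [], 0
    while i < len(words):
        for n in range(MAX_PHRASE_WORDS, 1, -1):
            if tuple(words[i:i + n]) in PHRASES:
                segments.append(" ".join(words[i:i + n]))
                i += n
                break
        else:
            segments.append(words[i])
            i += 1
    return segments


def toy_vector(segment, dim=4):
    """Deterministic placeholder vector derived from a hash of the segment."""
    digest = hashlib.sha256(segment.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dim]]


def represent(sentence):
    """Return per-segment vectors at word granularity and at phrase granularity."""
    return {
        "word": [(s, toy_vector(s)) for s in segment_words(sentence)],
        "phrase": [(s, toy_vector(s)) for s in segment_phrases(sentence)],
    }
```

The point of the sketch is only the shape of the output: the same sentence yields a first list of vectors aligned with words and a second list aligned with phrases.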

    AUTOMATICALLY AUGMENTING AND LABELING CONVERSATIONAL DATA FOR TRAINING MACHINE LEARNING MODELS

    Publication No.: US20230244871A1

    Publication date: 2023-08-03

    Application No.: US17589860

    Filing date: 2022-01-31

    IPC classification: G06F40/289 G06K9/62

    Abstract: A method implemented via execution of computing instructions configured to run at one or more processors and stored at one or more non-transitory computer-readable media. The method can include generating training data for an intent classification machine learning model by: (a) determining, via a text-to-text machine learning model, one or more respective paraphrases for each sample phrase of training phrases; (b) generating, via a label generating machine learning model, labeled data based on unlabeled live logs by: (i) determining live-log samples from the unlabeled live logs based at least in part on: a respective timestamp of each live log of the unlabeled live logs, or random sampling; and (ii) generating, via the label generating machine learning model, the labeled data based on the live-log samples and one or more labeling functions; and (c) adding the one or more respective paraphrases for each sample phrase of the training phrases and the labeled data to the training data. In certain embodiments, a respective quantity of the one or more respective paraphrases can vary for each sample phrase of the training phrases. In some embodiments, the method further can include transmitting the training data, as generated, to the intent classification machine learning model for training. Other embodiments are described.
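Steps (a) to (c) can be sketched in a few lines. The rule-based `paraphrase` function stands in for the text-to-text model, a majority vote over hand-written labeling functions stands in for the label generating model, and all names and sample data are hypothetical.

```python
import random
from collections import Counter


def paraphrase(phrase):
    """Stand-in for a text-to-text model; the number of paraphrases varies per phrase."""
    rules = [("i want to ", "i would like to "), ("where is ", "can you find ")]
    return [new + phrase[len(old):] for old, new in rules if phrase.startswith(old)]


def lf_refund(text):
    return "refund" if "refund" in text or "money back" in text else None


def lf_tracking(text):
    return "track_order" if "where is" in text or "track" in text else None


def sample_logs(logs, since=None, k=1, seed=0):
    """Pick live-log texts by timestamp if `since` is given, otherwise by random sampling."""
    if since is not None:
        return [text for ts, text in logs if ts >= since]
    return random.Random(seed).sample([text for _, text in logs], k)


def label_logs(samples, labeling_functions):
    """Label each sample by majority vote; samples with no votes are dropped."""
    labeled = []
    for text in samples:
        labels = [lf(text) for lf in labeling_functions]
        votes = Counter(l for l in labels if l is not None)
        if votes:
            labeled.append((text, votes.most_common(1)[0][0]))
    return labeled


def build_training_data(training_phrases, logs, since):
    """Combine original phrases, their paraphrases, and labeled live-log samples."""
    data = list(training_phrases)
    for phrase, intent in training_phrases:
        data.extend((p, intent) for p in paraphrase(phrase))
    data.extend(label_logs(sample_logs(logs, since=since), [lf_refund, lf_tracking]))
    return data
```

The resulting list of `(text, intent)` pairs is what would be handed to the intent classification model for training.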