Phrase based unstructured content parsing

    Publication No.: US12061627B2

    Publication date: 2024-08-13

    Application No.: US17179614

    Filing date: 2021-02-19

    Abstract: From an unstructured content using an ontology, a forward materialization graph is generated. The forward materialization graph is converted to a set of vector representations comprising multidimensional numbers representing elements of the forward materialization graph. A set of inference paths is computed for the set of vector representations, an inference path in the set of inference paths connecting a first vector representation with a second vector representation. Based on a set of features, the set of vector representations is formed into clusters, a feature in the set of features comprising a relevance probability, the relevance probability corresponding to a relevance of a portion of the unstructured content according to a relevance metric. A structured representation of the unstructured content is placed at an edge location of a content delivery network determined using the set of clusters.

    AUTOMATED REQUEST PROCESSING USING ENSEMBLE MACHINE LEARNING FRAMEWORK

    Publication No.: US20240143915A1

    Publication date: 2024-05-02

    Application No.: US17974215

    Filing date: 2022-10-26

    IPC classification: G06F40/20 G06F40/44 G06Q30/00

    Abstract: Methods, apparatus, and processor-readable storage media for automated request processing using an ensemble machine learning framework are provided herein. An example computer-implemented method includes aggregating interaction data associated with a request; computing a weighted score for the request, wherein the weighted score comprises a first component that is based at least in part on a comparison of the aggregated interaction data to a set of keywords and a second component corresponding to a sentiment predicted by a first machine learning model for at least a portion of the aggregated interaction data; using a second machine learning model to determine whether the request is anomalous based at least in part on the weighted score; and in response to determining that the request is anomalous, initiating one or more automated actions for the request.
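The scoring step in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the keyword list, the two weights, the way the sentiment value is supplied (as a number in [0, 1] from some upstream model), and the z-score check that stands in for the second machine learning model. None of these details come from the patent.

```python
from statistics import mean, pstdev

# Hypothetical keyword list and weights; the abstract does not specify them.
KEYWORDS = {"urgent", "escalate", "refund", "outage"}
W_KEYWORD, W_SENTIMENT = 0.6, 0.4


def keyword_component(messages):
    """Fraction of the keywords that appear in the aggregated messages."""
    tokens = {
        tok.strip(".,!?:;").lower()
        for msg in messages
        for tok in msg.split()
    }
    return len(tokens & KEYWORDS) / len(KEYWORDS)


def weighted_score(messages, negative_sentiment):
    """Combine the keyword component with a sentiment component in [0, 1]."""
    return (W_KEYWORD * keyword_component(messages)
            + W_SENTIMENT * negative_sentiment)


def is_anomalous(score, history, z_threshold=2.0):
    """Stand-in for the second model: flag scores far above past scores."""
    spread = pstdev(history)
    if spread == 0:
        return score > mean(history)
    return (score - mean(history)) / spread > z_threshold


messages = ["Urgent: outage again!", "Please escalate"]
score = weighted_score(messages, negative_sentiment=0.9)
flagged = is_anomalous(score, history=[0.10, 0.20, 0.15, 0.12, 0.18])
```

In a fuller pipeline, a flagged request would trigger the automated actions the abstract mentions; the sketch stops at the decision.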

    Dynamic contraction and expansion of heuristic beam width based on predicted probabilities

    Publication No.: US11966708B2

    Publication date: 2024-04-23

    Application No.: US17491671

    Filing date: 2021-10-01

    IPC classification: G06F40/44 G06F40/58 G06N3/047

    CPC classification: G06F40/44 G06F40/58 G06N3/047

    Abstract: A method, computer program product, and computer system for translating, using a beam search, a source sentence in a source language into a target sentence in a target language by an iterative process. Each iteration of the iterative process includes: generating, using a sequence-to-sequence model, probability vectors of conditional probabilities of respective vocabulary words in the target language being translations of a source word in the source sentence; sorting the probabilities in the probability vectors; generating probability difference vectors containing numerical differences between adjacent elements in respective sorted probability vectors; determining, by a fully connected neural network (FCNN), a best beam width B using the probability difference vectors as input to the FCNN; selecting B vocabulary words and B target vectors corresponding to the B highest conditional probabilities; and after all words in the source sentence have been translated, outputting the B target vectors generated in the last iteration.
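The sorting and differencing steps of one iteration can be sketched as follows. The abstract chooses the beam width B with a fully connected neural network; this sketch does not implement that network. It substitutes a simple largest-gap heuristic (`choose_beam_width`, a hypothetical helper) so that the data flow from probabilities to difference vector to B can be followed. The `max_width` cap is also an assumption.

```python
def probability_differences(probs):
    """Sort probabilities in descending order and return adjacent gaps."""
    ordered = sorted(probs, reverse=True)
    return [ordered[i] - ordered[i + 1] for i in range(len(ordered) - 1)]


def choose_beam_width(diffs, max_width=5):
    """Stand-in for the FCNN: cut the beam just before the largest gap."""
    window = diffs[:max_width]
    if not window:  # a single candidate leaves nothing to compare
        return 1
    return window.index(max(window)) + 1


def select_candidates(vocab_probs, max_width=5):
    """Return the B most probable words, with B derived from the gaps."""
    diffs = probability_differences(list(vocab_probs.values()))
    width = choose_beam_width(diffs, max_width)
    ranked = sorted(vocab_probs.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:width]


# Hypothetical conditional probabilities for one source word.
step_probs = {"cat": 0.45, "kitten": 0.40, "dog": 0.05, "car": 0.04, "tree": 0.06}
kept = [word for word, _ in select_candidates(step_probs)]
```

With two near-equal front-runners the beam keeps both; with one dominant candidate it contracts to a width of one, which is the expansion and contraction behavior the title describes.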

    SIMULTANEOUS TRANSLATION DEVICE AND COMPUTER PROGRAM

    Publication No.: US20240111967A1

    Publication date: 2024-04-04

    Application No.: US18264595

    Filing date: 2021-12-27

    IPC classification: G06F40/44 G06F40/289

    CPC classification: G06F40/44 G06F40/289

    Abstract: A simultaneous translation system includes: an encoder encoding an input word sequence to an intermediate language representation; a chunk-end detecting device detecting an end of a chunk in the word sequence; a word vector reading unit inputting a partial word sequence up to the chunk-end detected by the chunk-end detecting device to the encoder; a decoder and a translated word searching unit receiving the intermediate language representation from the encoder as an input, for outputting a translation word sequence corresponding to the partial word sequence; and a translated word sequence storage unit storing the translation word sequences output by the decoder and the translated word searching unit.
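The control flow among the units in this abstract can be sketched as a loop that translates the partial word sequence each time a chunk end is detected. The detector and the translator are passed in as plain functions; the punctuation-based detector and the upper-casing "translator" in the usage lines are toy placeholders, not the patent's encoder, decoder, or detection method.

```python
from typing import Callable, Iterable, List


def simultaneous_translate(
    words: Iterable[str],
    is_chunk_end: Callable[[List[str]], bool],
    translate_prefix: Callable[[List[str]], List[str]],
) -> List[List[str]]:
    """Translate the word sequence read so far at every detected chunk end."""
    prefix: List[str] = []        # partial word sequence read so far
    stored: List[List[str]] = []  # plays the role of the storage unit
    for word in words:
        prefix.append(word)
        if is_chunk_end(prefix):
            # Encoder and decoder are collapsed into a single callable here.
            stored.append(translate_prefix(list(prefix)))
    return stored


# Toy placeholders: a chunk ends at punctuation; "translation" upper-cases.
outputs = simultaneous_translate(
    "when it rains, we stay home.".split(),
    is_chunk_end=lambda prefix: prefix[-1].endswith((",", ".")),
    translate_prefix=lambda prefix: [w.upper() for w in prefix],
)
```

Words arriving after the last detected chunk end are not translated in this sketch, since the abstract only describes translation up to a detected chunk end.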

    APPARATUS AND METHOD FOR AUTOMATICALLY GENERATING IMAGE CAPTION

    Publication No.: US20230177854A1

    Publication date: 2023-06-08

    Application No.: US17925354

    Filing date: 2020-06-16

    Abstract: An apparatus and method for automatically generating an image caption are provided, capable of giving an explanation by using Bayesian inference and an image area-word mapping module on the basis of a deep learning algorithm. An apparatus for automatically generating an image caption, according to one embodiment of the present invention, includes: an automatic caption generation module for creating a caption by applying a deep learning algorithm to an image received from a client; a caption basis generation module for creating a basis for the caption by mapping a partial area in the image received from the client with respect to important words in the caption received from the automatic caption generation module; and a visualization module for visualizing the caption received from the automatic caption generation module and the basis for the caption received from the caption basis generation module, and returning them to the client.

    Systems and methods for query autocompletion

    Publication No.: US11625436B2

    Publication date: 2023-04-11

    Application No.: US17119941

    Filing date: 2020-12-11

    Abstract: Embodiments described herein provide a query autocompletion (QAC) framework at subword level. Specifically, the QAC framework employs a subword encoder that encodes or converts the sequence of input alphabet letters into a sequence of output subwords. The subword candidate sequences generated by the subword encoder are then passed to the n-gram language model, which performs beam search on them. Because user queries for search engines are generally short, e.g., ranging from 10 to 30 characters, the n-gram language model at subword level may be used for modeling such short contexts and outperforms the traditional language model in both completion accuracy and runtime speed. Furthermore, key computations are performed prior to the runtime to prepare segmentation candidates in support of the subword encoder to generate subword candidate sequences, thus eliminating significant computational overhead.
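Beam search over a subword-level n-gram model can be sketched with a bigram model, the simplest case. This sketch assumes queries are already segmented into subwords, so it does not model the subword encoder or the precomputed segmentation candidates from the abstract. The training queries, the `END` marker, and all function names are hypothetical.

```python
import math
from collections import defaultdict

END = "</s>"  # hypothetical end-of-query marker


def train_bigram(sequences):
    """Count subword bigrams, including the transition to END."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:] + [END]):
            counts[prev][nxt] += 1
    return counts


def beam_search(counts, prefix, beam_width=2, max_steps=5):
    """Complete a non-empty subword prefix, most probable completion first."""
    beams = [(0.0, list(prefix))]
    finished = []
    for _ in range(max_steps):
        expanded = []
        for score, seq in beams:
            followers = counts.get(seq[-1], {})
            total = sum(followers.values())
            for nxt, n in followers.items():
                cand = (score + math.log(n / total), seq + [nxt])
                if nxt == END:
                    finished.append(cand)
                else:
                    expanded.append(cand)
        if not expanded:
            break
        # Keep only the highest-scoring partial completions.
        beams = sorted(expanded, key=lambda c: c[0], reverse=True)[:beam_width]
    finished.sort(key=lambda c: c[0], reverse=True)
    return ["".join(seq[:-1]) for _, seq in finished]


# Hypothetical pre-segmented query log.
log = [["wea", "ther"]] * 3 + [["wea", "pon"]] * 2 + [["wea", "ther", "man"]]
completions = beam_search(train_bigram(log), ["wea"])
```

Scores are summed log-probabilities, so a longer completion is only preferred when its extra subwords are likely enough to offset the added terms.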