Systems and methods for online adaptation for cross-domain streaming data

    Publication number: US12235850B2

    Publication date: 2025-02-25

    Application number: US17588022

    Filing date: 2022-01-28

    Abstract: Embodiments described herein provide a framework for online domain adaptation based on cross-domain bootstrapping, in which the target domain streaming data is deleted immediately after it has been used for adaptation. At each online query, the data diversity is increased across domains by bootstrapping the source domain to form diverse combinations with the current target query. To take full advantage of the valuable discrepancies among the diverse combinations, a set of independent learners is trained to preserve the differences. The knowledge of the learners is then integrated by exchanging their predicted pseudo-labels on the current target query to co-supervise the learning on the target domain, but without sharing weights, so that the learners' divergence is maintained.
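
The data flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function names, the ring-style exchange of pseudo-labels, and the use of plain lists instead of model inputs are all assumptions made for illustration, and no model training is shown.

```python
import random

def bootstrap_source(source, n_learners, seed=0):
    """Draw one bootstrap resample (with replacement) of the source set per learner."""
    rng = random.Random(seed)
    return [rng.choices(source, k=len(source)) for _ in range(n_learners)]

def build_batches(source, target_query, n_learners, seed=0):
    """Pair each learner's resampled source data with the same target query."""
    resamples = bootstrap_source(source, n_learners, seed)
    return [(resample, target_query) for resample in resamples]

def exchange_pseudo_labels(predictions):
    """Assumed exchange scheme: learner i is supervised by learner i+1's pseudo-labels.

    Only labels are exchanged; no weights are shared between learners.
    """
    n = len(predictions)
    return [predictions[(i + 1) % n] for i in range(n)]
```

Each learner sees a different resample of the source data but the same target query, and the target query can be discarded once the exchanged pseudo-labels have been used for the update.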

    Multi-context stateful rule execution

    Publication number: US12235849B2

    Publication date: 2025-02-25

    Application number: US17993795

    Filing date: 2022-11-23

    Abstract: A rules engine (RE) may operate in conjunction with a database providing functionality, such as transactional support in data access environments on behalf of tenants. The database may have a data repository accessible by multiple tenants, and tenants may have a private context. RE rules may be defined in the data repository having an extension point corresponding to an extension defined in the private context. Execution of database functionality may affect data defined in the database. Execution of RE rules corresponding to database functionality affects corresponding data associated with the RE. Various techniques, e.g., fact handles, event listeners, etc. may be used to coordinate tracking and synchronizing changes between RE data and/or the database. A flag or other indicator may signify state preservation between multiple calls to the database and/or the RE, e.g., to support analysis involving transactions having overlapping intermediary results such as results from performing data lookups.

    AUTOMATED DATA EXTRACTION PIPELINE FOR LARGE LANGUAGE MODEL TRAINING

    Publication number: US20250060944A1

    Publication date: 2025-02-20

    Application number: US18449498

    Filing date: 2023-08-14

    Abstract: An automated data extraction pipeline for large language model (LLM) training may include extracting a set of code segments from a set of natural language question-answer (Q&A) combinations that each include a provided input, a provided output, and a provided code segment formatted to transform the provided input into the provided output. The data extraction pipeline may then generate a predicted output from a question portion of a first natural language Q&A combination using a first LLM. A first extracted code segment from the extracted set of code segments may then be executed to generate a first actual output of the first extracted code segment. One or more data samples may then be generated for training a second LLM based on a comparison of the first actual output to the predicted output. The second LLM may then be trained using the one or more data samples.
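
One way to picture the execute-and-compare step is the Python sketch below. Everything in it is assumed for illustration: the dictionary layout of a Q&A combination, the convention that each code segment defines a function named `transform`, and the `predict_output` callable standing in for the first LLM. It is not the pipeline claimed in the application.

```python
def run_segment(code, provided_input):
    """Execute an extracted code segment and return its actual output.

    Assumes the segment defines `transform(x)`. Sketch only: real pipelines
    should run untrusted code in a sandbox, not with a bare exec().
    """
    namespace = {}
    exec(code, namespace)
    return namespace["transform"](provided_input)

def build_sample(qa, predict_output):
    """Compare the executed output with the model's prediction and emit a sample."""
    actual = run_segment(qa["code"], qa["input"])
    predicted = predict_output(qa["question"])  # stand-in for the first LLM
    return {
        "question": qa["question"],
        "code": qa["code"],
        "actual_output": actual,
        "predicted_output": predicted,
        "prediction_correct": predicted == actual,
    }
```

The resulting records, labelled by whether the prediction matched the executed result, are the kind of data samples that could then be used to train a second model.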

    Semantic alignment of text and visual cards to present time series metrics

    Publication number: US12229856B2

    Publication date: 2025-02-18

    Application number: US17956746

    Filing date: 2022-09-29

    Abstract: A computing device displays, in a graphical user interface, a canvas region that includes a first scene. The first scene includes a first visualization card having a first data visualization and a first text card, adjacent to the first visualization card. The device receives, via the first text card, (i) text input from a user and (ii) user selection of a first user interface element for linking the first text card to the first visualization card. In accordance with the receiving, the device determines whether the text input includes a first expression having a first time span that intersects with a temporal domain of the first data visualization. In accordance with a determination that the text input includes the first expression, and in response to a first user interaction with a first region of the first text card that includes the first expression, the device visually emphasizes a first portion of the first data visualization, corresponding to the first time span.
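
The intersection test between a time span in the text and the temporal domain of the visualization can be sketched as a closed-interval overlap check. The Python below is a minimal illustration; representing spans as `(start, end)` date tuples and the helper names `intersects` and `clip` are assumptions, and parsing the time expression out of the text is out of scope.

```python
from datetime import date

def intersects(span, domain):
    """True if two closed date intervals (start, end) share at least one day."""
    (span_start, span_end), (dom_start, dom_end) = span, domain
    return span_start <= dom_end and dom_start <= span_end

def clip(span, domain):
    """Portion of the span that falls inside the domain, or None if disjoint."""
    if not intersects(span, domain):
        return None
    return max(span[0], domain[0]), min(span[1], domain[1])
```

The clipped interval is the part of the chart that would be a candidate for visual emphasis when the user interacts with the matching expression.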

    Techniques for communication process flow and data platform integration

    Publication number: US12229701B2

    Publication date: 2025-02-18

    Application number: US17856508

    Filing date: 2022-07-01

    Abstract: Methods, systems, apparatuses, devices, and computer program products are described. A communication process flow management service that manages a communication process flow may receive an indication of a segment of entities from a second service that manages a data model for multiple entities. Based on an action of the communication process flow, the communication process flow management service may request schema of the data model or additional attribute data associated with the segment from the second service. The communication process flow management service may receive the schema or the additional attribute data and use it to determine a set of communications to be transmitted to one or more entities of the segment. The communication process flow management service may transmit the set of communications in accordance with the communication process flow.

    Systems and methods for code understanding and generation

    Publication number: US12217033B2

    Publication date: 2025-02-04

    Application number: US18475103

    Filing date: 2023-09-26

    Abstract: Embodiments described herein provide a code generation and understanding model that builds on a Transformer-based encoder-decoder framework. The code generation and understanding model is configured to derive generic representations for programming language (PL) and natural language (NL) in the code domain via pre-training on an unlabeled code corpus, and then to benefit many code-related downstream tasks through fine-tuning. Apart from the denoising sequence-to-sequence objectives widely adopted for pre-training on natural language, an identifier tagging and prediction pre-training objective is adopted to enable the model to better leverage the crucial token type information from PL, specifically the identifiers assigned by developers.
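
To make the identifier tagging idea concrete, the following Python sketch produces the kind of per-token binary labels such an objective could be trained on. It is an illustration only: it uses the standard-library `tokenize` and `keyword` modules on Python source, whereas the tokenizer and label construction in the patent are not specified here, and the function name `tag_identifiers` is hypothetical.

```python
import io
import keyword
import tokenize

def tag_identifiers(source):
    """Return (token, tag) pairs; tag is 1 for identifiers, 0 otherwise."""
    tagged = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if not tok.string.strip():
            continue  # skip NEWLINE/INDENT/DEDENT/ENDMARKER tokens
        # A NAME token that is not a reserved keyword is treated as an identifier.
        is_identifier = tok.type == tokenize.NAME and not keyword.iskeyword(tok.string)
        tagged.append((tok.string, int(is_identifier)))
    return tagged
```

For a snippet such as `def add(a, b): return a + b`, the names `add`, `a`, and `b` are tagged 1, while `def`, `return`, and punctuation are tagged 0.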

    QUORUM-BASED SCALABLE DATABASE SYSTEM

    Publication number: US20250036654A1

    Publication date: 2025-01-30

    Application number: US18779287

    Filing date: 2024-07-22

    Abstract: Techniques are disclosed relating to a database system. The database system includes multiple coordinator nodes storing replicas of a partition. Each partition describes the state of locks and transactions for keys covered by that partition of keys. Each partition is, in turn, replicated. The multiple coordinator nodes receive, from multiple worker nodes, requests to grant a lock for a key to permit a worker node to write a record for the key as part of executing a transaction. A given coordinator node of the multiple coordinator nodes sends an approval response for the lock to at most one of the worker nodes. A single worker node acquires the lock in response to receiving approval responses from a majority of the multiple coordinator nodes, and none of the multiple worker nodes acquire the lock in response to none of them receiving approval responses from a majority of the multiple coordinator nodes.
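
The majority rule described above can be sketched in a few lines of Python. This is a simplified, single-process illustration under stated assumptions: the `Coordinator` class and `acquire` function are hypothetical names, and networking, replication, lock release, and transaction state are all omitted.

```python
class Coordinator:
    """Hypothetical coordinator replica: approves at most one worker per key."""

    def __init__(self):
        self.granted = {}  # key -> id of the worker this replica approved

    def request_lock(self, key, worker):
        # The first requester wins on this replica; later workers are refused.
        holder = self.granted.setdefault(key, worker)
        return holder == worker

def acquire(key, worker, coordinators):
    """A worker holds the lock only with approvals from a strict majority."""
    approvals = sum(c.request_lock(key, worker) for c in coordinators)
    return approvals > len(coordinators) // 2
```

Because each coordinator approves at most one worker per key, two workers can never both collect a majority; if requests split evenly enough, no worker acquires the lock at all.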

    Elastic data partitioning of a database

    Publication number: US12204948B2

    Publication date: 2025-01-21

    Application number: US18464078

    Filing date: 2023-09-08

    Abstract: A database entry may be stored in a container in a database table corresponding with a partition key. The partition key may be determined by applying one or more partition rules to one or more data values associated with the database entry. The database entry may be an instance of one of a plurality of data object definitions associated with database entries in the database. Each of the data object definitions may identify a respective one or more data fields included within an instance of the data object definition.
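
A rough Python sketch of rule-based partition key selection follows. The rule format (an ordered list of predicate and key pairs), the field names `region` and `object_type`, and the fallback key are all assumptions chosen for illustration; the abstract does not specify how partition rules are expressed.

```python
from collections import defaultdict

# Hypothetical rules: (predicate over the entry's data values, partition key).
PARTITION_RULES = [
    (lambda entry: entry.get("region") == "EU", "eu"),
    (lambda entry: entry.get("object_type") == "Invoice", "invoices"),
]
DEFAULT_KEY = "default"  # assumed fallback when no rule matches

def partition_key(entry):
    """Return the key of the first rule whose predicate matches the entry."""
    for predicate, key in PARTITION_RULES:
        if predicate(entry):
            return key
    return DEFAULT_KEY

def store(table, entry):
    """Place the entry in the container for its partition key and return the key."""
    key = partition_key(entry)
    table[key].append(entry)
    return key
```

Here `table` would be something like `defaultdict(list)`, with each list playing the role of a container for one partition key.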

    Systems and methods for text classification using label modular prompts

    Publication number: US12204857B2

    Publication date: 2025-01-21

    Application number: US18059234

    Filing date: 2022-11-28

    Abstract: Embodiments described herein provide training of a prompt generator for text classification. A first training dataset associated with a first plurality of class labels is received for a first training process. For a first instance of the first training dataset, a set of labels of interest is generated by sampling from a set of possible class labels including the first plurality of class labels. The prompt generator generates a first prompt based on the set of labels of interest. A pretrained language model generates a task output in response to an input of the first instance prepended with the first prompt. A loss objective is computed based on the task output and the set of labels of interest. Parameters of the prompt generator are updated based on the computed loss objective via backpropagation while the pretrained language model is frozen.
