-
Publication No.: US20240281455A1
Publication date: 2024-08-22
Application No.: US18444454
Filing date: 2024-02-16
Applicant: Oracle International Corporation
Inventor: Youssef Mohamed Saied, Mohamed Ridha Chahed, Anatoly Yakovlev, Sandeep R. Agrawal, Sanjay Jinturkar, Nipun Agarwal
CPC classification number: G06F16/285, G06F16/2282
Abstract: Disclosed is an improved approach to implement anomaly detection, where an ensemble detection mechanism is provided. An improvement is provided for the KNN algorithm where scaling is applied to permit efficient detection of multiple categories of anomalies. Further extensions are used to optimize local anomaly detection.
-
Publication No.: US20250094777A1
Publication date: 2025-03-20
Application No.: US18821539
Filing date: 2024-08-30
Applicant: Oracle International Corporation
Inventor: Anatoly Yakovlev, Sandeep R. Agrawal, Karoon Rashedi Nia, Ridha Chahed, Sanjay Jinturkar, Nipun Agarwal
IPC: G06N3/0455
Abstract: The present disclosure relates to LLM orchestration with vector store generation. An embeddings model may be selected to generate an embedding for a digital artifact. Metadata for the digital artifact may also be generated and stored in a vector store in association with the embedding. A user query may be received and categorized. One of a plurality of machine learning models may be selected based on the categorization of the user query. A prompt may be generated based at least in part on the user query, and the selected machine learning model may generate a response to the user query based at least in part on the prompt.
-
Publication No.: US20240086763A1
Publication date: 2024-03-14
Application No.: US17944949
Filing date: 2022-09-14
Applicant: Oracle International Corporation
Inventor: Jeremy Plassmann, Anatoly Yakovlev, Sandeep R. Agrawal, Ali Moharrer, Sanjay Jinturkar, Nipun Agarwal
Abstract: Techniques for computing global feature explanations using adaptive sampling are provided. In one technique, first and second samples from a dataset are identified. A first set of feature importance values (FIVs) is generated based on the first sample and a machine-learned model. A second set of FIVs is generated based on the second sample and the model. If a result of a comparison between the first and second FIV sets does not satisfy criteria, then: (i) an aggregated set is generated based on the last two FIV sets; (ii) a new sample that is double the size of a previous sample is identified from the dataset; (iii) a current FIV set is generated based on the new sample and the model; (iv) it is determined whether a result of a comparison between the current and aggregated FIV sets satisfies the criteria; and (i)-(iv) are repeated until the result of the last comparison satisfies the criteria.
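Note: The loop in steps (i)-(iv) can be illustrated with a minimal sketch. It is not the claimed implementation. The abstract does not say how FIVs are computed, aggregated, or compared, so the following are assumptions made only for illustration: `compute_fivs` is a hypothetical caller-supplied function, the two initial samples have equal size, aggregation is an element-wise mean of the last two FIV sets, and the criterion is a maximum absolute difference of at most `tol`. The sketch also stops once the sample covers the whole dataset.

```python
import random
from typing import Callable, List, Sequence, Tuple

FIVs = List[float]


def max_abs_diff(a: FIVs, b: FIVs) -> float:
    """Largest element-wise gap between two FIV sets (assumed comparison)."""
    return max(abs(x - y) for x, y in zip(a, b))


def adaptive_fivs(
    dataset: Sequence[Sequence[float]],
    compute_fivs: Callable[[Sequence[Sequence[float]]], FIVs],  # hypothetical
    initial_size: int = 32,
    tol: float = 0.05,
    seed: int = 0,
) -> Tuple[FIVs, int]:
    """Return the last FIV set and the sample size at which the loop stopped."""
    rng = random.Random(seed)
    size = min(initial_size, len(dataset))
    # First and second samples, each yielding one FIV set.
    previous = compute_fivs(rng.sample(dataset, size))
    current = compute_fivs(rng.sample(dataset, size))
    reference = previous  # the first comparison is first set vs. second set
    while max_abs_diff(current, reference) > tol and size < len(dataset):
        # (i) aggregate the last two FIV sets (assumed: element-wise mean)
        reference = [(p + c) / 2 for p, c in zip(previous, current)]
        # (ii) draw a new sample twice as large, capped at the dataset size
        size = min(2 * size, len(dataset))
        # (iii) compute the current FIV set on the new sample
        previous, current = current, compute_fivs(rng.sample(dataset, size))
        # (iv) the loop condition compares current against the aggregate
    return current, size
```

Because the sample size doubles each round, the total number of rows scored is at most about twice the size of the final sample (plus the second initial sample), which is where the savings over scoring the full dataset would come from.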
-
Publication No.: US10521225B2
Publication date: 2019-12-31
Application No.: US15638168
Filing date: 2017-06-29
Applicant: Oracle International Corporation
Inventor: Arun Raghavan, Sandeep R. Agrawal, Sam Idicula, Nipun Agarwal
Abstract: Techniques related to matrix multiplication at memory bandwidth are disclosed. Computing device(s) perform multiplication of a first matrix with a second matrix to generate a third matrix. A first register stores contiguous element values of the first matrix. Furthermore, a second register stores a first set of contiguous element values of the second matrix, and a third register stores a second set of contiguous element values of the second matrix. The first set and the second set correspond to a first row and a second row, respectively, of the second matrix. The first row and the second row are contiguous rows. A single instruction is executed to cause at least a partial computation of contiguous element values of the third matrix. The single instruction causes multiplication of element values stored in the first register with element values stored in the second and third registers and grouped accumulation of the products.
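Note: The access pattern described above can be approximated in scalar code. The sketch below is an interpretation of the abstract, not the patented instruction: it takes two contiguous elements of a row of the first matrix, pairs them with two contiguous rows of the second matrix, and accumulates both products into the same output elements in one step. The function name `matmul_row_pairs` and the zero padding for an odd inner dimension are assumptions, and real hardware would perform the inner loop as one vector operation on registers.

```python
from typing import List

Matrix = List[List[float]]


def matmul_row_pairs(a: Matrix, b: Matrix) -> Matrix:
    """Compute c = a @ b, consuming two contiguous rows of b per step."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(0, m, 2):
            has_second = k + 1 < m
            # "First register": contiguous elements a[i][k], a[i][k+1].
            a0 = a[i][k]
            a1 = a[i][k + 1] if has_second else 0.0
            # "Second and third registers": contiguous rows k and k+1 of b.
            row0 = b[k]
            row1 = b[k + 1] if has_second else [0.0] * p
            # Stand-in for the single instruction: multiply and accumulate
            # both products into contiguous elements of the output row.
            for j in range(p):
                c[i][j] += a0 * row0[j] + a1 * row1[j]
    return c
```

For example, `matmul_row_pairs([[1, 2, 3], [4, 5, 6]], [[1, 2], [3, 4], [5, 6]])` gives `[[22.0, 28.0], [49.0, 64.0]]`.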
-
Publication No.: US10452744B2
Publication date: 2019-10-22
Application No.: US15470377
Filing date: 2017-03-27
Applicant: Oracle International Corporation
Inventor: Sandeep R. Agrawal, Sam Idicula, Nipun Agarwal
Abstract: Techniques related to memory management for sparse matrix multiplication are disclosed. Computing device(s) may perform a method for multiplying a row of a first sparse matrix with a second sparse matrix to generate a product matrix row. A compressed representation of the second sparse matrix is stored in main memory. The compressed representation comprises a values array that stores non-zero value(s). Tile(s) corresponding to row(s) of the second sparse matrix are loaded into scratchpad memory. The tile(s) comprise set(s) of non-zero value(s) of the values array. A particular partition of an uncompressed representation of the product matrix row is generated in the scratchpad memory. The particular partition corresponds to a partition of the second sparse matrix comprising non-zero value(s) included in the tile(s). When a particular tile is determined to comprise non-zero value(s) that are required to generate the particular partition, the particular tile is loaded into the scratchpad memory.
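Note: A software analogy may help picture the partitioned product row. The sketch below assumes the compressed representation is CSR (`values`, `col_idx`, `row_ptr`), which the abstract does not state, and it models the scratchpad only as a small dense buffer covering one column partition at a time. Tile loading and the decision of which tiles are required are not modeled. All names are hypothetical.

```python
from typing import List, Tuple

SparseRow = List[Tuple[int, float]]  # (column index, non-zero value) pairs


def sparse_row_times_csr(
    a_row: SparseRow,
    values: List[float],
    col_idx: List[int],
    row_ptr: List[int],
    n_cols: int,
    part_width: int,
) -> SparseRow:
    """Multiply one sparse row by a CSR matrix, one column partition at a time."""
    result: SparseRow = []
    for lo in range(0, n_cols, part_width):
        hi = min(lo + part_width, n_cols)
        # Uncompressed (dense) partition of the product row; this buffer
        # stands in for the scratchpad-resident partition.
        dense = [0.0] * (hi - lo)
        for k, a_val in a_row:
            # Non-zeros of row k of the second matrix.
            for pos in range(row_ptr[k], row_ptr[k + 1]):
                c = col_idx[pos]
                if lo <= c < hi:  # only values that fall in this partition
                    dense[c - lo] += a_val * values[pos]
        # Compress the finished partition back into (column, value) pairs.
        result.extend((lo + j, v) for j, v in enumerate(dense) if v != 0.0)
    return result
```

With the second matrix `[[1, 0, 2], [0, 3, 0], [4, 0, 5]]` stored as `values=[1, 2, 3, 4, 5]`, `col_idx=[0, 2, 1, 0, 2]`, `row_ptr=[0, 2, 3, 5]`, the row `[(0, 2.0), (2, 1.0)]` and `part_width=2` yield `[(0, 6.0), (2, 9.0)]`.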
-
Publication No.: US20250094787A1
Publication date: 2025-03-20
Application No.: US18808300
Filing date: 2024-08-19
Applicant: Oracle International Corporation
Inventor: Karoon Rashedi Nia, Anatoly Yakovlev, Sandeep R. Agrawal, Ridha Chahed, Sanjay Jinturkar, Nipun Agarwal
IPC: G06N3/0475, G06F21/62, G06N3/092
Abstract: Disclosed herein are various approaches for sharing knowledge within and between organizations while protecting sensitive data. A machine learning model may be trained using training prompts querying a vector store to prevent unauthorized user disclosure of data derived from the vector store. A prompt may be received and a response to the prompt may be generated using the machine learning model based at least in part on the vector store.
-
Publication No.: US20230153394A1
Publication date: 2023-05-18
Application No.: US17528305
Filing date: 2021-11-17
Applicant: Oracle International Corporation
Inventor: Ritesh Ahuja, Anatoly Yakovlev, Venkatanathan Varadarajan, Sandeep R. Agrawal, Hesam Fathi Moghadam, Sanjay Jinturkar, Nipun Agarwal
CPC classification number: G06K9/6227, G06K9/6257, G06K9/6265, G06K9/6298, G06N20/00
Abstract: Herein are timeseries preprocessing, model selection, and hyperparameter tuning techniques for forecasting development based on temporal statistics of a timeseries and a single feed-forward pass through a machine learning (ML) pipeline. In an embodiment, a computer hosts and operates the ML pipeline that automatically measures temporal statistic(s) of a timeseries. ML algorithm selection, cross validation, and hyperparameter tuning are based on the temporal statistics of the timeseries. The result from the ML pipeline is a rigorously trained and production-ready ML model that is validated to have increased accuracy for multiple prediction horizons. Based on the temporal statistics, efficiency is achieved by asymmetry of investment of computer resources in the tuning and training of the most promising ML algorithm(s). Compared to other approaches, this ML pipeline produces a more accurate ML model for a given amount of computer resources and consumes fewer computer resources to achieve a given accuracy.
-
Publication No.: US20210390466A1
Publication date: 2021-12-16
Application No.: US17086204
Filing date: 2020-10-30
Applicant: Oracle International Corporation
Inventor: Venkatanathan Varadarajan, Sandeep R. Agrawal, Hesam Fathi Moghadam, Anatoly Yakovlev, Ali Moharrer, Jingxiao Cai, Sanjay Jinturkar, Nipun Agarwal, Sam Idicula, Nikan Chavoshi
Abstract: A proxy-based automatic non-iterative machine learning (PANI-ML) pipeline is described, which predicts machine learning model configuration performance and outputs an automatically-configured machine learning model for a target training dataset. Techniques described herein use one or more proxy models—which implement a variety of machine learning algorithms and are pre-configured with tuned hyperparameters—to estimate relative performance of machine learning model configuration parameters at various stages of the PANI-ML pipeline. The PANI-ML pipeline implements a radically new approach of rapidly narrowing the search space for machine learning model configuration parameters by performing algorithm selection followed by algorithm-specific adaptive data reduction (i.e., row- and/or feature-wise dataset sampling), and then hyperparameter tuning. Furthermore, because of the one-pass nature of the PANI-ML pipeline and because each stage of the pipeline has convergence criteria by design, the whole PANI-ML pipeline has a novel convergence property that stops the configuration search after one pass.
-
Publication No.: US20190004794A1
Publication date: 2019-01-03
Application No.: US15638168
Filing date: 2017-06-29
Applicant: Oracle International Corporation
Inventor: Arun Raghavan, Sandeep R. Agrawal, Sam Idicula, Nipun Agarwal
Abstract: Techniques related to matrix multiplication at memory bandwidth are disclosed. Computing device(s) perform multiplication of a first matrix with a second matrix to generate a third matrix. A first register stores contiguous element values of the first matrix. Furthermore, a second register stores a first set of contiguous element values of the second matrix, and a third register stores a second set of contiguous element values of the second matrix. The first set and the second set correspond to a first row and a second row, respectively, of the second matrix. The first row and the second row are contiguous rows. A single instruction is executed to cause at least a partial computation of contiguous element values of the third matrix. The single instruction causes multiplication of element values stored in the first register with element values stored in the second and third registers and grouped accumulation of the products.
-
Publication No.: US20180275909A1
Publication date: 2018-09-27
Application No.: US15470377
Filing date: 2017-03-27
Applicant: Oracle International Corporation
Inventor: Sandeep R. Agrawal, Sam Idicula, Nipun Agarwal
Abstract: Techniques related to memory management for sparse matrix multiplication are disclosed. Computing device(s) may perform a method for multiplying a row of a first sparse matrix with a second sparse matrix to generate a product matrix row. A compressed representation of the second sparse matrix is stored in main memory. The compressed representation comprises a values array that stores non-zero value(s). Tile(s) corresponding to row(s) of the second sparse matrix are loaded into scratchpad memory. The tile(s) comprise set(s) of non-zero value(s) of the values array. A particular partition of an uncompressed representation of the product matrix row is generated in the scratchpad memory. The particular partition corresponds to a partition of the second sparse matrix comprising non-zero value(s) included in the tile(s). When a particular tile is determined to comprise non-zero value(s) that are required to generate the particular partition, the particular tile is loaded into the scratchpad memory.