-
Publication No.: US20250021759A1
Publication Date: 2025-01-16
Application No.: US18219763
Application Date: 2023-07-10
Applicant: Oracle International Corporation
Inventor: Samuele Meta, Aneesh Dahiya, Felix Schmidt, Marija Nikolic, Matteo Casserini, Milos Vasic
IPC: G06F40/284, G06F11/34
Abstract: Described herein is natural language processing (NLP) that detects an anomalous log entry using a language model that infers an encoding of the log entry from newly generated numeric lexical tokens. In an embodiment, a computer extracts an original numeric lexical token from a variable-sized log entry. One or more substitute numeric lexical tokens that represent the original numeric lexical token are generated, such as with a numeric exponent or by trigonometry. The log entry does not contain the substitute numeric lexical tokens. A novel sequence of lexical tokens is then generated that represents the log entry and contains the substitute numeric lexical tokens but not the original numeric lexical token. The computer hosts and operates a machine learning model that generates, based on this novel sequence of lexical tokens, an inference that characterizes the log entry with unprecedented accuracy.
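The substitution step described in this abstract can be illustrated with a minimal Python sketch. This is not the patented method: the token names (`<MANT_…>`, `<EXP_…>`), the whitespace tokenizer, and the functions `substitute_tokens` and `tokenize` are hypothetical, chosen only to show a numeric token being replaced by substitutes derived from its base-10 exponent. The trigonometric variant mentioned in the abstract is not shown.

```python
import re

# Assumption: a numeric lexical token is a plain unsigned decimal number.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def substitute_tokens(number_text):
    """Return substitute tokens for one numeric token (hypothetical scheme).

    The value is written in scientific notation and split into a rounded
    mantissa token and an exponent token.
    """
    mantissa, exponent = format(float(number_text), ".1e").split("e")
    return [f"<MANT_{mantissa}>", f"<EXP_{int(exponent)}>"]


def tokenize(log_entry):
    """Split a log entry on whitespace and replace each numeric token."""
    tokens = []
    for word in log_entry.split():
        if NUMBER.fullmatch(word):
            # The original numeric token is dropped; only substitutes remain.
            tokens.extend(substitute_tokens(word))
        else:
            tokens.append(word)
    return tokens


if __name__ == "__main__":
    print(tokenize("Request took 1532 ms"))
```

Under this scheme, numbers of similar magnitude share an exponent token, so the model's vocabulary stays small however many distinct numbers appear in the logs. The resulting token sequence would then be passed to a language model, which is outside the scope of this sketch.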
-
Publication No.: US20240370429A1
Publication Date: 2024-11-07
Application No.: US18143776
Application Date: 2023-05-05
Applicant: Oracle International Corporation
Inventor: Aneesh Dahiya, Matteo Casserini, Marija Nikolic, Milos Vasic, Samuele Meta, Nikola Milojkovic, Felix Schmidt
IPC: G06F16/2452, G06N3/0455, G06N3/08
Abstract: In an embodiment, a computer generates sentence fingerprints that represent respective pluralities of similar database statements. Based on the sentence fingerprints, an artificial neural network is trained. After training the artificial neural network on a large corpus of fingerprinted database statements, the artificial neural network is ready to be used for zero-shot transfer learning to a downstream task in training. Database statement fingerprinting also anonymizes literal values in raw SQL statements. The trained artificial neural network can be safely reused without risk of disclosing sensitive data in the artificial neural network's vocabulary. After training, the artificial neural network infers a fixed-size encoded database statement from a new database statement. Based on the fixed-size encoded database statement, the new database statement is detected as anomalous, which increases database security and preserves database throughput by not executing the anomalous database statement.
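The idea of statement fingerprinting in this abstract can be illustrated with a minimal Python sketch. It is an assumption-laden simplification, not the patented method: the function `fingerprint`, the `?` placeholder, and the regex-based literal detection are hypothetical, and a real system would more likely use a SQL parser. The sketch shows only how replacing literal values makes similar statements share one fingerprint while removing potentially sensitive values.

```python
import re

# Assumption: string literals are single-quoted, with '' as an escaped quote.
STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
# Assumption: numeric literals are unsigned decimals not embedded in identifiers.
NUMERIC_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def fingerprint(sql):
    """Return a normalized form of a SQL statement with literals masked."""
    text = STRING_LITERAL.sub("?", sql)    # mask string literals first
    text = NUMERIC_LITERAL.sub("?", text)  # then mask numeric literals
    # Collapse whitespace and fold case so formatting differences disappear.
    # Folding case is a simplification: it ignores case-sensitive identifiers.
    return " ".join(text.split()).upper()


if __name__ == "__main__":
    print(fingerprint("SELECT name FROM users WHERE id = 42"))
    print(fingerprint("select name from users  where email = 'a@b.com'"))
```

Statements that differ only in literal values or formatting map to the same fingerprint, so a model trained on fingerprints never sees the literal values themselves. Encoding the fingerprints with a neural network and detecting anomalies are outside the scope of this sketch.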
-