Invention Grant
- Patent Title: Machine learning-based DNS request string representation with hash replacement
- Application No.: US17197375
- Application Date: 2021-03-10
- Publication No.: US11784964B2
- Publication Date: 2023-10-10
- Inventor: Renata Khasanova, Felix Schmidt, Stuart Wray, Craig Schelp, Nipun Agarwal, Matteo Casserini
- Applicant: Oracle International Corporation
- Applicant Address: US CA Redwood Shores
- Assignee: Oracle International Corporation
- Current Assignee: Oracle International Corporation
- Current Assignee Address: US CA Redwood Shores
- Agency: Hickman Becker Bingham Ledesma LLP
- Main IPC: H04L61/4511
- IPC: H04L61/4511 ; G06N20/00 ; H04L41/16 ; G06F40/30

Abstract:
Techniques are described herein for using machine learning to learn vector representations of DNS requests such that the resulting embeddings represent the semantics of the DNS requests as a whole. Techniques described herein perform pre-processing of tokenized DNS request strings in which hashes, which are long and relatively random strings of characters, are detected in DNS request strings and each detected hash token is replaced with a placeholder token. A vectorizing ML model is trained using the pre-processed training dataset in which hash tokens have been replaced. Embeddings for the DNS tokens are derived from an intermediate layer of the vectorizing ML model. The encoding application creates final vector representations for each DNS request string by generating a weighted summation of the embeddings of all of the tokens in the DNS request string. Because of hash replacement, the resulting DNS request embeddings reflect semantics of the hashes as a group.
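The pre-processing and summation steps described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The hash detector (a length threshold plus a Shannon-entropy threshold), the `<HASH>` placeholder name, the toy embedding table, and the per-token weights are all assumptions made for illustration. In the described technique the embeddings would come from an intermediate layer of the trained vectorizing ML model; here they are hard-coded.

```python
import math
from collections import Counter

HASH_PLACEHOLDER = "<HASH>"  # hypothetical placeholder token name


def shannon_entropy(token: str) -> float:
    """Entropy in bits per character of the token's character distribution."""
    counts = Counter(token)
    n = len(token)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_hash(token: str, min_len: int = 16, min_entropy: float = 3.0) -> bool:
    """Assumed heuristic: a 'long and relatively random' token is a hash.

    The thresholds are illustrative, not taken from the patent.
    """
    return len(token) >= min_len and shannon_entropy(token) >= min_entropy


def preprocess(request: str) -> list[str]:
    """Tokenize a DNS request string on dots and replace hash-like tokens."""
    tokens = request.lower().split(".")
    return [HASH_PLACEHOLDER if looks_like_hash(t) else t for t in tokens]


def embed_request(tokens, embeddings, weights):
    """Weighted sum of token embeddings; tokens without an embedding are skipped.

    Tokens missing from `weights` default to a weight of 1.0 (an assumption).
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for token in tokens:
        vec = embeddings.get(token)
        if vec is None:
            continue
        w = weights.get(token, 1.0)
        for i, x in enumerate(vec):
            total[i] += w * x
    return total


# Toy embedding table and weights, standing in for learned values.
toy_embeddings = {
    HASH_PLACEHOLDER: [1.0, 0.0],
    "cdn": [0.0, 1.0],
    "example": [1.0, 1.0],
    "com": [0.5, 0.5],
}
toy_weights = {HASH_PLACEHOLDER: 2.0, "com": 0.0}

tokens = preprocess("a1b2c3d4e5f60718293a4b5c6d7e8f90.cdn.example.com")
request_vector = embed_request(tokens, toy_embeddings, toy_weights)
```

Because every hash-like token maps to the same placeholder, two requests that differ only in their hash label yield the same token sequence and therefore the same request vector, which is one way to read the abstract's statement that the embeddings reflect the semantics of hashes as a group.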
Public/Granted literature
- US20220294757A1 MACHINE LEARNING-BASED DNS REQUEST STRING REPRESENTATION WITH HASH REPLACEMENT Public/Granted day: 2022-09-15