Publication No.: US20250094398A1
Publication Date: 2025-03-20
Application No.: US18885531
Filing Date: 2024-09-13
Applicant: Oracle International Corporation
Inventors: Aleksandra Czarlinska, Saurabh Naresh Netravalkar, Denis B. Mukhin, Harichandan Roy, Zhen Hua Liu, Sebastian de la Hoz Luna, Beda Christoph Hammerschmidt, George R. Krupka, Bo Xia, David Chih-Wei Jiang
Abstract: Techniques for a unified relational database framework for hybrid vector search are provided. In one technique, multiple documents are accessed and a vector table and a text table are generated. For each accessed document: data within the document is converted to plaintext; multiple chunks are generated based on the plaintext; an embedding model generates a vector for each of the chunks; the vectors are stored in the vector table along with a document identifier that identifies the accessed document; tokens are generated based on the plaintext; and the tokens are stored in the text table along with the document identifier. Such processing may be performed in a database system in response to a single database statement to create a hybrid index. In response to receiving a hybrid query, a vector query and a text query are generated and executed, and the respective results may be combined.
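The pipeline described in the abstract can be illustrated with a minimal in-memory sketch. This is not the patented implementation or any Oracle API. Every name here (`build_hybrid_index`, `hybrid_query`, `embed`, and so on) is hypothetical, the "embedding model" is a toy hashed bag-of-words stand-in, chunking is a fixed word window, and the two result lists are merged with reciprocal rank fusion, which is one possible combination strategy chosen for illustration since the abstract only says the results "may be combined".

```python
import math
import re
from collections import Counter

DIM = 16  # toy embedding size (assumption, not from the source)

def tokenize(text):
    """Lowercase alphanumeric tokens; a stand-in for a real text tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())

def chunk(plaintext, size=8):
    """Split plaintext into fixed windows of `size` words (assumed strategy)."""
    words = plaintext.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy embedding: hashed bag-of-words, L2-normalised. Not a real model."""
    vec = [0.0] * DIM
    for tok in tokenize(text):
        vec[sum(ord(c) for c in tok) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def build_hybrid_index(documents):
    """Build (vector_table, text_table) rows keyed by document identifier."""
    vector_table, text_table = [], []
    for doc_id, raw in documents.items():
        # Step 1: convert document data to plaintext.
        plaintext = raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)
        # Steps 2-4: chunk, embed each chunk, store vector with the doc id.
        for piece in chunk(plaintext):
            vector_table.append((doc_id, piece, embed(piece)))
        # Steps 5-6: tokenize, store tokens with the doc id.
        for token, count in Counter(tokenize(plaintext)).items():
            text_table.append((doc_id, token, count))
    return vector_table, text_table

def vector_query(vector_table, query, k=5):
    """Rank documents by their best chunk's cosine similarity to the query."""
    q = embed(query)
    best = {}
    for doc_id, _, vec in vector_table:
        score = sum(a * b for a, b in zip(q, vec))
        best[doc_id] = max(score, best.get(doc_id, -1.0))
    return sorted(best, key=lambda d: best[d], reverse=True)[:k]

def text_query(text_table, query, k=5):
    """Rank documents by the total count of matching query tokens."""
    wanted = set(tokenize(query))
    scores = Counter()
    for doc_id, token, count in text_table:
        if token in wanted:
            scores[doc_id] += count
    return [d for d, _ in scores.most_common(k)]

def hybrid_query(vector_table, text_table, query, k=5, rrf_k=60):
    """Run both queries and fuse the rankings with reciprocal rank fusion."""
    fused = Counter()
    rankings = (vector_query(vector_table, query, k),
                text_query(text_table, query, k))
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (rrf_k + rank)
    return [d for d, _ in fused.most_common(k)]

if __name__ == "__main__":
    docs = {
        "d1": "Oracle database supports vector search over embeddings",
        "d2": b"Cooking pasta requires boiling water and salt",
    }
    vt, tt = build_hybrid_index(docs)
    print(hybrid_query(vt, tt, "vector search"))
```

Both tables carry the document identifier, so the vector ranking and the text ranking can be merged at the document level even though one is computed per chunk and the other per token. In a database system the abstract describes all of this as triggered by a single statement that creates the hybrid index; the function call here only mimics that single entry point.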