Electronic mail data modeling for efficient indexing
Abstract:
Techniques are described herein for creating a scalable, IMAP4-compliant email system using a NoSQL database and a distributed full-text search engine. Data for each email message is stored across multiple tables to avoid redundant storage. The full-text search index, however, is created as if it referred to a single table. In the embodiments described herein, the single index is created on the fields of a message metadata table, augmented with virtual fields derived from the message content. During index creation, data is pulled from a message table in "blob" format and broken down into its corresponding fields and data items, which are then converted and placed in the proper virtual fields. Each converted blob section is cached, so the same section need not be converted more than once. After index creation, the index may be used to search for emails based on both metadata and data within the body of the email.
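The indexing flow summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: in-memory dictionaries stand in for the NoSQL tables and for the distributed search engine, all names (MESSAGE_TABLE, METADATA_TABLE, virtual_fields, build_index_documents, search) are hypothetical, and the cache is keyed per whole blob rather than per blob section for brevity.

```python
from email import message_from_bytes
from email.policy import default

# Hypothetical stand-in for the message table: blob_id -> raw message bytes.
MESSAGE_TABLE = {
    "blob-1": (
        b"From: alice@example.com\r\n"
        b"To: bob@example.com\r\n"
        b"Subject: Quarterly report\r\n"
        b"\r\n"
        b"The numbers are attached.\r\n"
    ),
}

# Hypothetical stand-in for the message metadata table. Two rows point at
# the same blob, so the message content itself is stored only once.
METADATA_TABLE = [
    {"mailbox": "INBOX", "uid": 1, "flags": ["\\Seen"], "blob_id": "blob-1"},
    {"mailbox": "Archive", "uid": 7, "flags": [], "blob_id": "blob-1"},
]

_converted_cache = {}      # blob_id -> converted virtual fields
STATS = {"conversions": 0}  # counts how often a blob is actually parsed


def virtual_fields(blob_id):
    """Convert a blob into index fields, converting each blob only once."""
    if blob_id not in _converted_cache:
        STATS["conversions"] += 1
        msg = message_from_bytes(MESSAGE_TABLE[blob_id], policy=default)
        body = msg.get_body(preferencelist=("plain",))
        _converted_cache[blob_id] = {
            "from": str(msg["From"]),
            "to": str(msg["To"]),
            "subject": str(msg["Subject"]),
            "body": body.get_content() if body is not None else "",
        }
    return _converted_cache[blob_id]


def build_index_documents():
    """Present metadata rows plus virtual fields as one flat 'table'."""
    return [{**row, **virtual_fields(row["blob_id"])} for row in METADATA_TABLE]


def search(docs, term):
    """Naive substring match standing in for a full-text search engine."""
    term = term.lower()
    return [
        (doc["mailbox"], doc["uid"])
        for doc in docs
        if term in doc["subject"].lower() or term in doc["body"].lower()
    ]


docs = build_index_documents()
hits = search(docs, "attached")
```

Here both metadata rows become searchable by subject and body text, while the shared blob is parsed a single time; a real deployment would hand the flattened documents to the search engine instead of scanning them in Python.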