Abstract:
A system and method for curation of document versions with significantly reduced storage requirements. In some embodiments, all or substantially all versions of a document are at least initially retained. Based on various criteria, versions of the document are selectively deleted while preserving the versions that are likely to provide the highest value. Advantageously, the teachings of embodiments as described can be used in conjunction with various systems, including document versioning, deduplication, and retention systems.
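As an illustration only, the selective deletion described above could be sketched as a thinning pass over a version history. The abstract does not name its criteria, so the rules here (keep the first and latest versions, keep versions flagged as major, and otherwise keep a version only if it is at least `min_gap_s` seconds newer than the last kept one) and the names `curate` and `min_gap_s` are assumptions.

```python
def curate(versions, min_gap_s):
    """Return the versions to retain.

    versions: list of (number, timestamp_s, is_major) tuples sorted by timestamp.
    The retention rules below are hypothetical, not taken from the abstract.
    """
    kept = []
    for i, (number, ts, is_major) in enumerate(versions):
        is_endpoint = i == 0 or i == len(versions) - 1
        # Endpoints and major versions are always kept; others must be
        # far enough in time from the most recently kept version.
        if is_endpoint or is_major or ts - kept[-1][1] >= min_gap_s:
            kept.append((number, ts, is_major))
    return kept


history = [(1, 0, False), (2, 10, False), (3, 20, True),
           (4, 100, False), (5, 110, False), (6, 120, False)]
retained = curate(history, min_gap_s=60)
```

Versions not returned by `curate` would be the candidates for deletion.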
Abstract:
Systems, methods, and computer program products for enabling assessment of the quality of a search index. In one embodiment, objects are processed to produce corresponding text that is stored and indexed. The objects are also processed to identify and store corresponding metadata values for indexing. Error conditions that are detected during the processing of objects to generate corresponding text are tracked and compared to determine the most severe of the error conditions. An indication of the most severe error condition is stored in a first consolidated error field. Errors that are encountered in the identification and storage of metadata values are counted and this count is stored in a second consolidated error field. Both of the consolidated error fields are indexed in the same manner as the text and metadata for the objects, so that the stored error information can be used in queries of the search index.
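A minimal sketch of the two consolidated error fields follows. The error names, the severity ranking, and the field names `text_error_severity` and `metadata_error_count` are hypothetical, since the abstract does not specify them.

```python
# Hypothetical severity ranking; higher numbers are more severe.
SEVERITY = {
    "none": 0,
    "truncated_text": 1,
    "unsupported_format": 2,
    "extraction_failed": 3,
}


def consolidate_errors(text_errors, metadata_errors):
    """Build the two consolidated error fields for one indexed object."""
    # First field: the single most severe text-extraction error condition.
    worst = max(text_errors, key=SEVERITY.__getitem__, default="none")
    # Second field: a plain count of metadata errors.
    return {
        "text_error_severity": worst,
        "metadata_error_count": len(metadata_errors),
    }


fields = consolidate_errors(
    ["truncated_text", "extraction_failed"],
    ["unparseable date"],
)
```

Because both fields would be indexed alongside the text and metadata, a query could then filter on them, for example to find every object whose `metadata_error_count` is greater than zero.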
Abstract:
Search engines today are capable of incorporating numeric scoring modifiers from controlling applications into their relevance computations. Challenges arise in keeping these modifiers current, given that they may change over time. Embodiments provide a new way to compute numeric value decay for efficient relevance computation without having to rely on a controlling application. The controlling application can set a value for a modifier of an object that it manages and can perform operations on the modifier. However, the controlling application does not need to keep track of the modifier or compute the modifier value independently. Rather, a search engine is configured to perform the decay computations and adjust the modifier value on a regular basis or on demand. The search engine ensures that modifier values for all the objects it indexes are always valid, that is, within acceptable ranges and with acceptable adjustments.
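To make the idea concrete, here is a small sketch of a decay computation performed on the search engine side. The abstract does not state the decay function, so exponential half-life decay, the clamping bounds, and the names `decayed_value`, `half_life_s`, `lo`, and `hi` are all assumptions.

```python
def decayed_value(value, set_at, now, half_life_s, lo=0.0, hi=100.0):
    """Return the modifier value after decay, clamped to [lo, hi].

    value:       modifier value as last set by the controlling application
    set_at, now: timestamps in seconds
    half_life_s: assumed half-life of the modifier, in seconds
    """
    elapsed = max(0.0, now - set_at)
    decayed = value * 0.5 ** (elapsed / half_life_s)
    # Clamping keeps the stored modifier within an acceptable range.
    return min(hi, max(lo, decayed))


after_one_hour = decayed_value(80.0, set_at=0, now=3600, half_life_s=3600)
```

In this sketch the controlling application only supplies `value` and the time it was set; the adjustment itself can run on a schedule or when a query needs the current value.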
Abstract:
Responsive to a request from a user device, a content server may perform an electronic discovery function. The request may include information on a quantity of data objects desired from a collection of data objects stored in a repository. Objects stored in the repository may be managed by the content server. The content server may determine a number of batches and process the collection of data objects into batches, each having a batch size. An efficient selection process may be determined and utilized in selecting data objects from each of the batches such that the total number of data objects selected from the collection is not less than the quantity of data objects desired. The content server may make a disk image of the selected data objects and communicate the disk image to the user device over a network.
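One way to picture the batch-wise selection is the sketch below. The abstract does not describe the selection process itself, so the approach here (a random sample from each batch, sized in proportion to the batch and rounded up) and the names `select_in_batches`, `batch_size`, and `seed` are assumptions.

```python
import random


def select_in_batches(objects, quantity, batch_size, seed=None):
    """Select at least `quantity` objects by sampling from each batch."""
    total = len(objects)
    if batch_size <= 0 or not 0 <= quantity <= total:
        raise ValueError("need batch_size > 0 and 0 <= quantity <= len(objects)")
    rng = random.Random(seed)
    selected = []
    for start in range(0, total, batch_size):
        batch = objects[start:start + batch_size]
        # Ceiling of len(batch) * quantity / total. Rounding up in every
        # batch guarantees the overall total is not less than `quantity`.
        k = -(-len(batch) * quantity // total)
        selected.extend(rng.sample(batch, k))
    return selected


picked = select_in_batches(list(range(10)), quantity=9, batch_size=4, seed=0)
```

Rounding up per batch can overshoot the requested quantity slightly, which is consistent with the "not less than" condition in the abstract.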