Abstract:
Systems, methods, devices, and computer-readable media for managing duplicate media items. The system first analyzes a first file from a first source, wherein the first file is a duplicate of a second file. Next, the system deduplicates the first file and the second file to yield a deduplicated file. The system then selects metadata associated with at least one of the first file or the second file to be assigned as metadata for the deduplicated file, the metadata being selected based on a priority preference.
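The metadata-selection step can be illustrated with a minimal sketch. It assumes duplicates are recognized by an equal content hash and that the priority preference is an ordered list of sources; the `MediaFile` class, the `SOURCE_PRIORITY` list, and the source names are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class MediaFile:
    content_hash: str
    source: str
    metadata: dict = field(default_factory=dict)


# Hypothetical priority preference: sources listed earlier win.
SOURCE_PRIORITY = ["local_library", "cloud_backup", "import"]


def deduplicate(first, second, priority=SOURCE_PRIORITY):
    """Merge two duplicate files, keeping metadata from the preferred source."""
    if first.content_hash != second.content_hash:
        raise ValueError("files are not duplicates")

    def rank(f):
        # Sources missing from the priority list rank last.
        return priority.index(f.source) if f.source in priority else len(priority)

    preferred = min((first, second), key=rank)
    return MediaFile(first.content_hash, preferred.source, dict(preferred.metadata))
```

Other priority preferences, such as the most recently edited or the most complete metadata, would only change the `rank` function.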
Abstract:
Systems and methods for accelerating relational database applications are disclosed whereby the retrieval of objects can be 100,000 times faster than state-of-the-art methods. According to embodiments of the present invention, an application may directly obtain digital objects from an in-memory store rather than querying a possibly remote data source. In some embodiments, several in-memory nodes are deployed simultaneously, for example, in clusters. Changes in the underlying data store(s) can be propagated to the in-memory cache with SQL triggers. Potential queries may be predicted with automatically generated code. Advanced read/write locking mechanisms further improve the performance of data access.
Abstract:
A storage system is characterized in that the storage system includes duplication-determination-unit determining means for determining a duplication determination unit, which is a unit to be used in determining duplications of data, on the basis of a duplication generation rate computed for each of a plurality of data division units obtained as a result of division of data stored in a storage device, and duplication eliminating means for carrying out processing to eliminate duplications of the data stored in the storage device on the basis of the duplication determination unit determined by the duplication-determination-unit determining means.
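One possible reading of the unit-selection step is sketched below. It assumes the data division units are fixed chunk sizes, that the duplication generation rate is the fraction of chunks repeating an earlier chunk, and that the unit with the highest rate is chosen. None of these choices is stated in the abstract, and the function names are hypothetical.

```python
import hashlib


def duplication_rate(data: bytes, unit: int) -> float:
    """Fraction of fixed-size chunks that duplicate another chunk."""
    chunks = [data[i:i + unit] for i in range(0, len(data), unit)]
    if not chunks:
        return 0.0
    unique = {hashlib.sha256(c).digest() for c in chunks}
    return 1 - len(unique) / len(chunks)


def choose_unit(data: bytes, candidates):
    """Pick the candidate unit size with the highest duplication rate (assumed policy)."""
    return max(candidates, key=lambda u: duplication_rate(data, u))
```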
Abstract:
In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.
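A minimal sketch of the fallback to a secondary attribute set follows. It assumes attributes are lexicon terms, that the projection is measured as the number of distinct document terms found in the lexicon, and that the signature is a hash of the sorted projected terms; these details and the function name are illustrative assumptions, not taken from the abstract.

```python
import hashlib


def signature(text, primary, secondary, threshold):
    """Single signature over projected terms, widened by the secondary set if needed."""
    tokens = set(text.lower().split())
    projected = tokens & primary
    if len(projected) < threshold:
        # Projection too small: supplement with the secondary lexicon.
        projected |= tokens & secondary
    joined = " ".join(sorted(projected))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()
```

With only the primary lexicon, two short documents that share a single primary term would collide on the same signature; the secondary terms separate them, which is the precision gain the abstract describes.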
Abstract:
The present invention relates to a method for uploading a file from a user client device to an on-line storage system through a gateway that is connected to the client device through a local area network and to the on-line storage system through a wide area network. The on-line storage system comprises a storage server coupled to a target storage device in which the file is to be stored. The gateway receives an uploaded file from the user client device. If the file is not present in the target storage device, the gateway uploads the file to the target storage device, and if the file is present in the target storage device, the gateway creates a link to the file stored in the target storage device.
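The gateway's upload-or-link decision can be sketched as follows. The sketch assumes presence is checked by a SHA-256 content hash and that a link is a name-to-hash entry. The `TargetStorage` and `Gateway` classes are hypothetical, and the network hops are omitted.

```python
import hashlib


class TargetStorage:
    """Stand-in for the target storage device."""

    def __init__(self):
        self.blobs = {}  # content hash -> file bytes
        self.links = {}  # (user, file name) -> content hash


class Gateway:
    def __init__(self, storage):
        self.storage = storage

    def upload(self, user, name, data):
        """Upload the file if absent; otherwise only link to the stored copy."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.storage.blobs:
            action = "linked"  # no transfer over the wide area network
        else:
            self.storage.blobs[digest] = data
            action = "uploaded"
        # In this sketch every user-visible name is recorded as a link entry.
        self.storage.links[(user, name)] = digest
        return action
```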
Abstract:
In one embodiment, a system includes logic adapted for: receiving data identifiers (IDs), each associated with a file, from multiple data providers, storing the data IDs to a database, identifying any duplicate data IDs in the database to determine if any of the files associated with the data IDs are non-confidential, querying the data providers which provided the file having the duplicate data ID to determine if the data provider wants to store the file to a storage network, such as a cloud storage network, receiving a response from the data provider indicating whether or not to store the file to the storage network, receiving the file from the data provider, storing the file to a storage network, and causing deletion of the file from a system of the data provider. In other embodiments, computer program products are presented for storing data to a storage network.
Abstract:
A data virtualization storage appliance performs data deduplication transformations on the data. The original or non-deduplicated file system is used as a shell to hold the directory/file hierarchy and file metadata. The data of the file system is stored by a separate data storage in a transformed and deduplicated form. The deduplicated data store may be implemented as one or more hidden files. The shell file system preserves the hierarchy structure and potentially the file metadata of the original, non-deduplicated file system in its original format, allowing clients to access file metadata and hierarchy information easily. The data of a file may be removed from the shell file system and replaced with a data layout that specifies the arrangement of deduplicated data segments needed to reconstruct the file data. The data layout associated with a file may be stored in a separate data stream in the shell file system.
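The relationship between a file's data layout and the deduplicated segment store can be sketched in a few lines. The sketch assumes fixed-size segments keyed by SHA-256 and an in-memory dictionary in place of the hidden-file store; `SEGMENT_SIZE`, `store_file`, and `read_file` are hypothetical names.

```python
import hashlib

SEGMENT_SIZE = 4  # deliberately tiny so the example is easy to follow


def store_file(data: bytes, segment_store: dict) -> list:
    """Store unique segments once and return the layout (ordered segment keys)."""
    layout = []
    for i in range(0, len(data), SEGMENT_SIZE):
        segment = data[i:i + SEGMENT_SIZE]
        key = hashlib.sha256(segment).hexdigest()
        segment_store.setdefault(key, segment)
        layout.append(key)
    return layout


def read_file(layout: list, segment_store: dict) -> bytes:
    """Reconstruct the file data from its layout."""
    return b"".join(segment_store[key] for key in layout)
```

In the arrangement the abstract describes, the layout would live alongside the file entry in the shell file system while the segments live in the separate store.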
Abstract:
Provided is a computer system, including: a computer; and a storage system coupled to the computer via a network. The computer includes: an interface coupled to the network, a processor coupled to the interface, and a memory coupled to the processor. The storage system includes a plurality of volumes in which files are stored. The processor is configured to: identify duplicate files from among the files stored in the plurality of volumes as files to be consolidated; identify a plurality of volumes in which the files to be consolidated are stored; select at least one volume from among the identified plurality of volumes as a consolidation volume based on loads imposed on the identified plurality of volumes; and delete the files to be consolidated stored in the volumes that are not selected. Accordingly, in data de-duplication, it is possible to avoid concentrating extra load on a volume that is already heavily loaded.
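The load-based choice of a consolidation volume might look like the sketch below. It assumes each duplicate group is identified by a content hash, that load is a single number per volume, and that the least-loaded volume is kept. The abstract says only that selection is based on loads, so this policy and the function name are assumptions.

```python
def plan_consolidation(file_locations, volume_load):
    """Return {content_hash: (volume_to_keep, [volumes_to_delete_from])}.

    file_locations maps a content hash to the volumes holding a copy;
    volume_load maps a volume name to its current load.
    """
    plan = {}
    for content_hash, volumes in file_locations.items():
        if len(volumes) < 2:
            continue  # a single copy is not a duplicate
        keep = min(volumes, key=lambda v: volume_load[v])
        plan[content_hash] = (keep, [v for v in volumes if v != keep])
    return plan
```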
Abstract:
A method of managing data fragments on computer readable storage media includes identifying an identical data segment within both of first and second data files, establishing a single instance of the identical data segment as a shared data fragment, modifying file headers associated with the first and second data files so that each file header associates with the shared data fragment, and reclaiming storage space that contains a redundant instance of the identical data segment. A data file or data fragment may be divided or further divided into data fragments if the file or fragment is identified as having a data segment that is identical to a data segment in a different data file or fragment. The method should require that the amount of identical data reclaimed is greater than the amount of new header information stored with each fragment.
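The final condition, that reclaimed data must outweigh the added header information, reduces to a simple comparison. The sketch below assumes a fixed header cost per newly created fragment; `HEADER_BYTES_PER_FRAGMENT` and `worth_sharing` are hypothetical names, and the 64-byte figure is only illustrative.

```python
HEADER_BYTES_PER_FRAGMENT = 64  # hypothetical header cost, not from the abstract


def worth_sharing(segment_len: int, redundant_copies: int, new_fragments: int) -> bool:
    """True if sharing the segment reclaims more bytes than the new headers cost."""
    reclaimed = segment_len * redundant_copies
    overhead = HEADER_BYTES_PER_FRAGMENT * new_fragments
    return reclaimed > overhead
```

For example, splitting two files around a shared 4096-byte segment might create four new fragments (256 bytes of headers under the assumption above) while reclaiming one 4096-byte copy, so sharing pays off; a shared 100-byte segment with the same split would not.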
Abstract:
A client device operates via at least one application to: generate feature detection data for a plurality of photographs by performing a computer vision function on the plurality of photographs; facilitate, based on the selection data generated via a graphical user interface, deletion of the at least one of the plurality of photographs corresponding to images of the particular person to be deleted from the memory; facilitate, based on selection data generated via the graphical user interface, deletion of the at least one of the plurality of photographs corresponding to images from a particular location to be deleted from the memory; and facilitate, based on selection data generated via the graphical user interface, deletion of the at least one of the plurality of photographs corresponding to images of a group of particular people to be deleted from the memory.