Abstract:
An Internet Protocol driver executed by a network interface card, or a network address translation module, includes a mechanism that enables a server to bypass packets associated with certain destinations, sources, or a combination of the two based upon their IP address. When a packet arrives at the network interface card, the driver extracts a source IP address and a destination IP address from the packet. The driver searches a table to locate a rule matching one of the addresses. If a match is found, the packet is bypassed. If no match is found, the packet is sent on to an indexing and caching server for further processing. The bypass rules may be adaptively and dynamically generated when a message causes a remote server to respond with an error code. The dynamically generated bypass rules prevent the first server from sending subsequent requests to the remote server, thereby insulating the indexing and caching server from unnecessary network traffic.
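The bypass decision described above can be sketched as a small rule table. This is a minimal illustration, not the driver's actual implementation: the names `BypassRule`, `BypassTable`, and `route` are hypothetical, and treating any status code of 500 or above as the triggering "error code" is an assumption, since the abstract does not say which codes qualify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BypassRule:
    # None means "match any address" for that field (an assumption of this sketch).
    src: Optional[str] = None
    dst: Optional[str] = None

    def matches(self, src_ip: str, dst_ip: str) -> bool:
        return (self.src is None or self.src == src_ip) and (
            self.dst is None or self.dst == dst_ip
        )


class BypassTable:
    def __init__(self) -> None:
        self.rules: set = set()

    def add_rule(self, rule: BypassRule) -> None:
        self.rules.add(rule)

    def should_bypass(self, src_ip: str, dst_ip: str) -> bool:
        # Linear scan for clarity; a real driver would use a faster lookup.
        return any(rule.matches(src_ip, dst_ip) for rule in self.rules)

    def on_response(self, dst_ip: str, status_code: int) -> None:
        # Dynamically generated rule: the remote server answered with an error,
        # so later packets to it skip the caching server.
        # The >= 500 threshold is hypothetical.
        if status_code >= 500:
            self.add_rule(BypassRule(dst=dst_ip))


def route(table: BypassTable, src_ip: str, dst_ip: str) -> str:
    """Return "bypass" if a rule matches, else "cache" (sent on for processing)."""
    return "bypass" if table.should_bypass(src_ip, dst_ip) else "cache"
```

A static rule would be added with `table.add_rule(BypassRule(src="10.0.0.5"))`, while `table.on_response("192.0.2.1", 503)` adds a destination rule after an error response.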
Abstract:
A method for garbage collection in a cache of information objects is provided. A non-volatile storage device is segmented into storage areas called pools. Each pool has a pool header and a plurality of arenas. Each arena stores one or more fragments of an object. Header information for each arena is stored in the pool header in which that arena is stored. Each fragment comprises a fragment header and fragment data. The garbage collection periodically selects a pool that is storing an amount of data greater than a minimum storage value or high water mark. Each arena in the pool is examined and selected for garbage collection according to selection criteria. Each fragment within a selected arena is examined based upon a second set of selection criteria that determine whether the fragment is retained or deleted. If the fragment is deleted, all other fragments in the storage device that relate to that fragment's object are also deleted. When a fragment arena is retained, it is moved into contiguous storage in another available arena. A computer program product, computer apparatus, and data signal embodied in a carrier wave, configured to carry out the steps of the method, are also disclosed.
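A heavily simplified sketch of one collection pass over a pool follows. It assumes that every arena in the pool is selected, that the fragment criterion is a last-access threshold, and that all survivors fit in a single target arena. None of these choices come from the abstract, and the names `Fragment`, `Arena`, and `collect_pool` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    object_id: str    # the object this fragment belongs to
    size: int         # bytes of fragment data
    last_access: int  # hypothetical selection criterion


@dataclass
class Arena:
    fragments: list = field(default_factory=list)


def collect_pool(arenas, high_water_mark, min_last_access):
    """Return the pool's arenas after one garbage-collection pass."""
    used = sum(f.size for a in arenas for f in a.fragments)
    if used <= high_water_mark:
        return arenas  # pool is below the threshold, nothing to do

    # A fragment that fails the criterion dooms its whole object, so every
    # other fragment of that object is deleted as well.
    doomed = {
        f.object_id
        for a in arenas
        for f in a.fragments
        if f.last_access < min_last_access
    }

    # Retained fragments are moved, in order, into one fresh arena so that
    # the surviving data is contiguous.
    survivors = [f for a in arenas for f in a.fragments if f.object_id not in doomed]
    return [Arena(fragments=survivors)]
```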
Abstract:
A high-performance cache is disclosed. The cache is designed for time- and space-efficiency for a diverse range of information objects. Information objects are stored in portions of a non-volatile storage device called arenas, which are contiguous regions from which space is allocated in parallel. Objects are substantially contiguously allocated within an arena and are mapped by name keys and content-based object keys to a tag table, an open directory, and a directory table. The tag table is indexed by the name keys, and stores references to sets in the directory table. The tag table is compact and therefore can be stored in fast main memory, facilitating rapid lookups. The directory table is organized so that at least a frequently-accessed portion of it also usually resides in fast main memory, which further speeds lookups. The tag and directory tables are organized to quickly determine non-presence of objects. Large objects may be chunked into fragments, which are chained using a forward functional-iteration mechanism, to prevent the need for mutating existing on-disk data structures. Garbage collection periodically moves objects within an arena or to other arenas so that inactive objects are deleted and free space becomes contiguous. Because the objects are substantially contiguously allocated, reading and writing a typical object requires only one or two disk head actuator movements; thus, the cache can efficiently and smoothly stream data off of the storage device, providing optimal delivery of multimedia objects. The disclosure also encompasses a computer apparatus, computer program product, and computer data signal embodied in a carrier wave that are similarly configured.
Abstract:
Social media networking applications, web sites, and services create implicit relationships between users based on their interest or participation in real-world and optionally virtual or online activities, in addition to explicitly defined peer relationships. User profiles, activity entities, and expressions may be associated with metadata to assist in searching and navigation. Metadata is implicitly associated with user profiles, activity entities, expressions, or other data entities based on user behavior using a metadata collector. A metadata collector is a poll, survey, list, questionnaire, census, test, game, or other type of presentation adapted to solicit user interaction. A metadata collector is associated with metadata elements. When users interact with a metadata collector, their user profiles and the data entities included in their interactions become associated with the metadata elements of the metadata collector. These metadata element associations may then be used for any purpose.
Abstract:
A high-performance cache is disclosed. The cache is designed for time- and space-efficiency for a diverse range of information objects. Information objects are stored in portions of a non-volatile storage device called arenas, which are contiguous regions from which space is allocated in parallel. Objects are substantially contiguously allocated within an arena and are mapped by name keys and content-based object keys to a tag table, an open directory, and a directory table. The tag table is indexed by the name keys, and stores references to sets in the directory table. The tag table is compact and therefore can be stored in fast main memory, facilitating rapid lookups. The directory table is organized so that at least a frequently-accessed portion of it also usually resides in fast main memory, which further speeds lookups. The tag and directory tables are organized to quickly determine non-presence of objects. Large objects are chunked into fragments, which are chained using a forward functional-iteration mechanism, to prevent the need for mutating existing on-disk data structures. Garbage collection periodically moves objects within an arena or to other arenas. Additionally, for a plurality of counters, the following is computed: (1) the sum of values stored in the counters, and (2) the maximum value that can be represented by the counters. Each of the counters is decremented when the sum is greater than half of the maximum value. Each of the counters is associated with an information object, which is deleted when a counter is decremented to zero.
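The counter-aging rule at the end of this abstract can be illustrated with a short sketch. It reads "the maximum value that can be represented by the counters" as the combined maximum of all counters, which is one possible interpretation; the function name `age_counters` and the 4-bit default width are hypothetical.

```python
def age_counters(counters, bits=4):
    """Decrement every counter when their sum exceeds half the combined maximum.

    `counters` maps an object name to its counter value and is modified in
    place. Returns the names of objects whose counter reached zero and were
    therefore deleted.
    """
    max_total = ((1 << bits) - 1) * len(counters)  # assumed: combined maximum
    if sum(counters.values()) * 2 <= max_total:
        return []  # sum is not greater than half the maximum: no aging

    evicted = []
    for name in list(counters):
        counters[name] -= 1
        if counters[name] <= 0:
            del counters[name]  # the associated object is deleted
            evicted.append(name)
    return evicted
```

With 4-bit counters `{"a": 15, "b": 1, "c": 9}`, the sum is 25 and half the combined maximum is 22.5, so every counter is decremented and `"b"` is removed.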
Abstract:
A method is provided for caching and delivering an alternate version from among a plurality of alternate versions of information objects. One or more alternate versions of an information object, for example, versions of the information object that are prepared in different languages or compatible with different systems, are stored in an object cache database. In the cache, a vector of alternates is associated with a key value that identifies the information object. The vector of alternates stores information that describes the alternate, the context and constraints of the object's use, and a reference to the location of the alternate's object content. When a subsequent client request for the information object is received, the cache extracts information from the client request, and attempts to select an acceptable and optimal alternate from the vector by matching the request information to the cached contextual information in the vector of alternates. This selection is performed in a time- and space-efficient manner. Accordingly, the cache can deliver different versions of an information object based on the metadata and criteria specified in a request to the cache. As a result, the information delivered by the cache is customized for the requesting client, without requiring access to the original object server.
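The selection step can be sketched as a scan over the vector of alternates. This is an illustration under stated assumptions: the only contextual fields modeled are language and encoding, the client's preferences arrive as ordered lists, and "optimal" means the lowest combined preference rank. The names `Alternate` and `select_alternate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alternate:
    language: str
    encoding: str
    location: str  # reference to where the alternate's content is stored


def select_alternate(vector, accept_languages, accept_encodings):
    """Pick the acceptable alternate the client prefers most, or None.

    Preference lists are ordered from most to least preferred. An alternate
    is acceptable only if both its language and its encoding are listed.
    """
    best, best_score = None, None
    for alt in vector:
        if alt.language not in accept_languages:
            continue
        if alt.encoding not in accept_encodings:
            continue
        # Lower rank sum means closer to the client's first choices.
        score = accept_languages.index(alt.language) + accept_encodings.index(alt.encoding)
        if best_score is None or score < best_score:
            best, best_score = alt, score
    return best
```

A `None` result would correspond to the case where no cached alternate is acceptable and the cache has to fall back to the original object server.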
Abstract:
A method for caching information objects is provided. Information objects are stored in portions of a non-volatile storage device called arenas, which are contiguous regions from which space is allocated in parallel. Objects are contiguously allocated within an arena and are mapped to directory tables that provide an efficient search mechanism. Each object is identified by a name key and a content key. The name key is constructed by applying a hash function to the composition of the name or URL of the object along with implicit or explicit context about the request. The content key is constructed by applying a hash function to the entire contents of the object data. Buckets and blocks in the directory tables store tags and subkeys derived from the keys. Since duplicate objects that have different names will hash to the same content key, the cache can detect duplicate objects even though they have different names, and store only one copy of the object. As a result, cache storage usage is dramatically reduced, and tracking object aliases is not required. The disclosure also encompasses a computer apparatus, computer program product, and computer data signal embodied in a carrier wave that are configured similarly.
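The two-key scheme can be sketched with in-memory dictionaries standing in for the directory tables. SHA-256 is an assumption here, since the abstract names no particular hash function, and the class `DedupCache` is hypothetical.

```python
import hashlib


class DedupCache:
    def __init__(self):
        self.by_name = {}     # name key -> content key
        self.by_content = {}  # content key -> object data (stored once)

    @staticmethod
    def name_key(url, context=""):
        # Hash of the object's name combined with request context.
        return hashlib.sha256((url + "\0" + context).encode("utf-8")).hexdigest()

    @staticmethod
    def content_key(data):
        # Hash of the entire object contents.
        return hashlib.sha256(data).hexdigest()

    def put(self, url, data, context=""):
        ck = self.content_key(data)
        # Identical contents map to the same content key, so a duplicate
        # stored under a different name adds no second copy.
        self.by_content.setdefault(ck, data)
        self.by_name[self.name_key(url, context)] = ck

    def get(self, url, context=""):
        ck = self.by_name.get(self.name_key(url, context))
        return None if ck is None else self.by_content[ck]
```

Storing the same bytes under two different URLs leaves two entries in `by_name` but only one in `by_content`.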
Abstract:
A method for consistently storing cached objects in the presence of failures is provided. This method ensures atomic object consistency--in the event of failure and restart, an object will either be completely present or completely absent from the cache, never truncated or corrupted. Furthermore, this consistency comes without any time-consuming data structure reconstruction on restart. In this scheme, objects are indexed by a directory table that is stored in main memory and mapped to non-volatile storage, and changes to the directory table are buffered into an open directory that is stored in main memory. Cache objects are either stored in volatile aggregation buffers or in segments of non-volatile disk storage called arenas. Objects are first coalesced into memory-based aggregation buffers, and later committed to disk. Locking is used to control parallel storage to aggregation buffers. Directory entries pointing to objects are only permitted to be written to persistent disk storage after the target objects are themselves committed to disk, preventing dangling pointers. Periodically, when the contents of open directory entries point to objects that are stably stored on disk, the open directory entries are copied into the directory table and committed to non-volatile storage. The disclosure also encompasses a computer program product, computer apparatus, and computer data signal configured similarly.
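The write ordering can be modeled with plain dictionaries. In this sketch `disk_arena` and `directory_table` stand in for non-volatile storage, locking is omitted, and every class and method name is hypothetical. It illustrates the ordering rule only, not the disclosed on-disk layout.

```python
class ConsistentCache:
    def __init__(self):
        self.aggregation_buffer = {}  # volatile: key -> data awaiting commit
        self.open_directory = {}      # volatile: key -> "buffer" or "disk"
        self.disk_arena = {}          # stands in for non-volatile arenas
        self.directory_table = {}     # stands in for the persistent directory

    def write(self, key, data):
        # New objects are first coalesced into a memory buffer.
        self.aggregation_buffer[key] = data
        self.open_directory[key] = "buffer"

    def flush_buffer(self):
        # Commit buffered objects to disk, then mark them as stable.
        for key, data in self.aggregation_buffer.items():
            self.disk_arena[key] = data
            self.open_directory[key] = "disk"
        self.aggregation_buffer.clear()

    def sync_directory(self):
        # Only entries whose objects are already on disk may be persisted,
        # so the persistent directory never holds a dangling pointer.
        stable = [k for k, loc in self.open_directory.items() if loc == "disk"]
        for key in stable:
            self.directory_table[key] = "disk"
            del self.open_directory[key]

    def crash_and_restart(self):
        # Volatile state is lost; persistent state is used as-is,
        # with no reconstruction step.
        self.aggregation_buffer.clear()
        self.open_directory.clear()

    def lookup(self, key):
        loc = self.open_directory.get(key)
        if loc == "buffer":
            return self.aggregation_buffer[key]
        if loc == "disk" or key in self.directory_table:
            return self.disk_arena[key]
        return None
```

After a simulated crash, an object is either fully readable through `directory_table` or absent. Data flushed to `disk_arena` without a persisted directory entry is simply unreachable.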
Abstract:
The foregoing needs and other needs are addressed by the present invention, which provides, in one aspect, a mechanism for locating a data object. According to an aspect of the present invention, key values for data objects are generated; each key value may contain a first subkey value and a second subkey value. A mapping associates the first subkey values with storage locations. A particular first subkey value is mapped to a storage location that contains the second subkeys of a set of key values that correspond to the first subkey value. The storage location also includes additional information that may be used to locate objects associated with the set of key values.
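The two-level lookup can be sketched as follows. The sketch assumes a 64-bit key taken from a SHA-256 digest, with the top bits used as the first subkey and the remainder as the second. The abstract specifies neither the hash nor the split, and `SubkeyIndex` is a hypothetical name.

```python
import hashlib


class SubkeyIndex:
    def __init__(self, first_bits=8):
        self.first_bits = first_bits
        # first subkey -> list of (second subkey, location) pairs; each list
        # plays the role of the storage location holding the second subkeys.
        self.buckets = {}

    def split(self, name):
        """Derive a 64-bit key from the name and split it into two subkeys."""
        key = int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest()[:8], "big")
        rest_bits = 64 - self.first_bits
        first = key >> rest_bits
        second = key & ((1 << rest_bits) - 1)
        return first, second

    def insert(self, name, location):
        first, second = self.split(name)
        self.buckets.setdefault(first, []).append((second, location))

    def find(self, name):
        first, second = self.split(name)
        # A missing first subkey shows immediately that the object is absent.
        for stored_second, location in self.buckets.get(first, []):
            if stored_second == second:
                return location
        return None
```

Here `location` stands for the additional information that locates the object, for example a byte offset such as `idx.insert("obj-1", 4096)`.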
Abstract:
A data-driven, hierarchical information navigation system and method enable search of sets of documents or other materials by certain common attributes that characterize the materials. The invention includes several aspects of a data-driven, hierarchical navigation system that employs this navigation mode. The navigation system of the present invention includes features of an interface, a knowledge base and a taxonomy definition process and a classification process for generating the knowledge base, a graph-based navigable data structure and method for generating the data structure, World Wide Web-based applications of the system, and methods of implementing the system. Users are able to search or browse a particular collection of documents by selecting desired values for the attributes. A data-driven, hierarchical information navigation system and method enable this navigation mode by associating terms with the materials, defining a set of hierarchical relationships among the terms, and providing a guided search mechanism based on the relationship between the terms.