Abstract:
A database apparatus has an element appearance information storage portion in which element appearance information is stored using element name IDs as keys, an ancestral path appearance information storage portion in which element appearance information is stored using ancestral path name IDs of the elements as keys, an attribute appearance information storage portion in which attribute appearance information is stored using attribute name IDs as keys, and a text appearance information storage portion in which appearance information about text character strings of element entities and the values of attributes possessed by the elements is stored using the partial character strings as keys.
Abstract:
A document retrieval system searches for documents that match a retrieval request input by the user and ranks them according to the degree of coincidence between each document and the retrieval request. In the document retrieval system, a word frequency calculating section finds the number of documents in which a word appears and the frequency of occurrence of the word in a document, and obtains a weighting parameter for the word; a frequency score calculating section obtains a frequency score on the basis of the output of the word frequency calculating section. In addition, a word cooccurrence relation checking section checks word cooccurrence relations of the retrieval request and the document, and a cooccurrence score calculating section calculates a cooccurrence score from the degree of coincidence therebetween. A document score calculating section calculates a document score on the basis of the frequency score and the cooccurrence score. The documents are ranked in order of document score and displayed to the user.
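The scoring pipeline described in this abstract can be sketched roughly as follows. The abstract does not give formulas, so everything specific here is an assumption made for illustration: the weighting parameter is taken to be a logarithmic inverse document frequency, a cooccurrence relation is taken to be an unordered word pair within a small window, and the document score is a weighted sum with hypothetical weights `alpha` and `beta`. All function names are hypothetical.

```python
import math
from collections import Counter


def frequency_score(query_words, doc_words, all_docs):
    """Sum of (in-document frequency * weighting parameter) over query words.

    The weighting parameter is assumed to be log(1 + N / df), where df is
    the number of documents in which the word appears.
    """
    tf = Counter(doc_words)
    n_docs = len(all_docs)
    score = 0.0
    for w in set(query_words):
        df = sum(1 for d in all_docs if w in d)
        if df == 0:
            continue  # word appears nowhere, contributes nothing
        score += tf[w] * math.log(1 + n_docs / df)
    return score


def cooccurrence_pairs(words, window=2):
    """Unordered pairs of distinct words at most `window` positions apart."""
    pairs = set()
    for i, w in enumerate(words):
        for v in words[i + 1 : i + 1 + window]:
            if v != w:
                pairs.add(frozenset((w, v)))
    return pairs


def cooccurrence_score(query_words, doc_words, window=2):
    """Fraction of the query's cooccurrence pairs also found in the document."""
    q = cooccurrence_pairs(query_words, window)
    if not q:
        return 0.0
    d = cooccurrence_pairs(doc_words, window)
    return len(q & d) / len(q)


def rank_documents(query_words, docs, alpha=1.0, beta=1.0):
    """Return document indices ordered by descending document score."""
    scored = []
    for idx, doc in enumerate(docs):
        s = (alpha * frequency_score(query_words, doc, docs)
             + beta * cooccurrence_score(query_words, doc))
        scored.append((s, idx))
    scored.sort(key=lambda t: (-t[0], t[1]))  # ties broken by index
    return [idx for _, idx in scored]


docs = [
    ["text", "search", "engine"],
    ["search", "text", "index", "text"],
    ["database", "update"],
]
ranking = rank_documents(["text", "search"], docs)
```

In this toy corpus the second document ranks first because it contains "text" twice and also matches the query's word pair; the third document matches nothing and ranks last.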
Abstract:
Provided is a database management server apparatus that can maintain the consistency of updates and avoid blocking other update requests during an update process. A server apparatus 3 of a database management system 1 has a function of nondestructively updating databases in response to an update request from a client apparatus 2 to manage generation-management databases. A main storage unit 4 stores entities of a plurality of databases for each version of the databases, and a version creating unit 5 creates a new version of the databases in response to an update request from the client apparatus 2. A request accepting unit 11 accepts an update request for a next version regardless of whether the new version is being created. An acceptance management unit 13 starts a period for accepting the update request for the next version in response to the update request and ends the accepting period after a predetermined time. The version creating unit 5 creates the next version based on the update requests accepted in the accepting period.
Abstract:
A system for providing keywords to facilitate a search in a text retrieval system. For each of the texts constituting a text base, the system creates a word ID for each of the words used in the text and a word occurrence count for the corresponding word. The word occurrence count indicates the number of occurrences of a word in each text. For each of the words used in any of the texts constituting the text base, the system creates a total word occurrence count and a containing text count indicative of the number of texts containing the word. For each of the words contained in the selected texts, a degree of importance is calculated by using the word occurrence count, the total word occurrence count and the containing text count. The words contained in the selected texts are sorted in order of the degree of importance. At least a part of the sorted words is displayed as related keywords.
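A minimal sketch of the three counts and the importance ranking described above is given here. The abstract names the inputs to the degree of importance but not the formula, so the formula below (the share of a word's total occurrences that fall in the selected texts, multiplied by a logarithmic rarity factor) is purely an assumption for illustration. The function names `build_index` and `related_keywords` are hypothetical, and words stand in for word IDs.

```python
import math
from collections import Counter


def build_index(texts):
    """Build the three counts named in the abstract."""
    per_text = [Counter(t) for t in texts]  # word occurrence count per text
    total = Counter()                       # total word occurrence count
    containing = Counter()                  # containing text count
    for counts in per_text:
        total.update(counts)
        containing.update(counts.keys())    # each word counted once per text
    return per_text, total, containing


def related_keywords(texts, selected, top_k=3):
    """Rank words of the selected texts by an assumed degree of importance."""
    per_text, total, containing = build_index(texts)
    n_texts = len(texts)
    selected_counts = Counter()
    for i in selected:
        selected_counts.update(per_text[i])
    # Assumed formula, not taken from the abstract.
    importance = {
        w: (c / total[w]) * math.log(1 + n_texts / containing[w])
        for w, c in selected_counts.items()
    }
    ranked = sorted(importance, key=lambda w: (-importance[w], w))
    return ranked[:top_k]


texts = [
    ["index", "search", "index"],
    ["search", "query"],
    ["index", "tree", "search"],
]
keywords = related_keywords(texts, selected=[0])
```

With the first text selected, "index" outranks "search" because a larger share of its occurrences is in the selection and it appears in fewer texts overall.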
Abstract:
A searching apparatus includes an index generation portion for generating an index that provides data on the number of documents including a key word and the number of appearances of the key word. Matching degrees between the key word and documents are calculated from the number of documents including the key word and the number of appearances of the key word. A portion of the documents is arranged in a buffer in order of matching degree and output as the search result. Lower-ranked documents are searched for by comparison with the lowest matching degree of the adjacent higher-ranked documents arranged in the buffer. During the first search, data on the latest edition of the documents stored in a memory is detected and stored, and is used in the second search operation to eliminate inconsistency in the search result between the editions at the first and second search operations. The index is generated for every field of each document. The matching degree of a combined field is calculated by a logical operation between the two fields. Moreover, an index of the combined field may be generated and one of the fields of the combined field may be omitted. The matching degree of the other field is also obtained by another logical operation.
Abstract:
A new type of text search apparatus, capable of finding all occurrence positions of a search string that is an arbitrary character string, within a text which is written as a continuous sequence of characters, utilizes for text position reference purposes in an index file, words which each occur (at least once within the text) as the maximum length word, referred to as an extension word, among a set of arbitrarily predefined dictionary words extending from a specific character position. Each such occurrence of a word as an extension word defines one of a set of text position elements, with that set covering all of the character positions of the text. The index file also includes a table which relates each of the extension words to the respective positions at which each of the partial character strings of the word occur within the word. Each occurrence of an arbitrary search string within the text can thereby be expressed as either a partial character string within a single text position element, or as a sequence of partial character strings within a set of sequentially occurring text position elements, so that all such occurrences can be found by utilizing the index file.
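The notion of an extension word can be illustrated with a short sketch that finds, at every character position, the longest dictionary word starting there and records its positions in an index. This covers only that first step, not the partial-string table or the search procedure. It assumes that an extension word is determined at each character position and that a single character serves as the fallback when no dictionary word matches; neither detail is stated in the abstract. The function names are hypothetical.

```python
def extension_words(text, dictionary):
    """For each position, the longest dictionary word starting at it.

    Falls back to the single character at that position (an assumption)
    so that every character position is covered.
    """
    max_len = max((len(w) for w in dictionary), default=1)
    result = []
    for i in range(len(text)):
        best = text[i]
        # Try candidate lengths from longest to shortest (down to 2).
        for n in range(min(max_len, len(text) - i), 1, -1):
            if text[i:i + n] in dictionary:
                best = text[i:i + n]
                break
        result.append(best)
    return result


def build_position_index(text, dictionary):
    """Map each extension word to the text positions where it occurs."""
    index = {}
    for pos, word in enumerate(extension_words(text, dictionary)):
        index.setdefault(word, []).append(pos)
    return index


dictionary = {"data", "date", "at", "da"}
words = extension_words("datadate", dictionary)
index = build_position_index("datadate", dictionary)
```

Note that "da" never becomes an extension word in this example, because at both positions where it matches a longer dictionary word ("data" or "date") also matches.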
Abstract:
A document storing and managing system for storing plural electronic documents in folders according to classifications and managing the stored electronic documents in units of folders has a folder managing means for managing attributes of the electronic documents included in each of the folders, a document version managing means for managing information as to the versions of the electronic documents included in each of the folders, and a folder version managing means for managing a correspondence relation between a version of the folder and a version of each of the electronic documents included in the folder. The document storing and managing system of this invention may set and manage a version of a folder while keeping it consistent with the version of each document.
Abstract:
A concept dictionary management device includes a fundamental concept dictionary data holding portion for holding fundamental concept network connection information which represents fundamental concept network connections among words stored in a concept dictionary, a first supplemental concept dictionary data holding portion for holding first supplemental concept network connection information to be used for adding words to and deleting words from the concept network connections represented by the fundamental concept network connection information, a second supplemental concept dictionary data holding portion for holding second supplemental concept network connection information to be used only for personal use, to add one or more words to and delete one or more words from the concept network connections obtained as a result of the addition or deletion of words by using the first supplemental concept network connection information, a concept dictionary retrieval portion for retrieving a concept network connection including an input word from the fundamental concept dictionary data holding portion, from the first supplemental concept dictionary data holding portion and from the second supplemental concept dictionary data holding portion, and an operation control portion for receiving concept network connection information representing the concept network connection retrieved by the concept dictionary retrieval portion and for extracting a word from the received network connection information and outputting data indicating the extracted word to the concept dictionary retrieval portion as data indicating an input word.
Abstract:
In the present invention, a similar vector is searched for at high speed in a vector database of several hundred dimensions, using a single vector index, in accordance with either an inner product measure or a distance measure, by designating a similarity search range and a maximum number of results to obtain. Vector index preparation is performed by decomposing each vector into a plurality of partial vectors and characterizing the vector by a norm division, belonging region and declination division to prepare an index. Similarity search is performed by obtaining a partial query vector and partial search range from a query vector and search range, performing similarity search in each partial space to accumulate a difference from the search range and to obtain an upper limit value, and obtaining the correct measure starting from the higher upper limit values to obtain a final similarity search result.
Abstract:
A method of constructing a finite state machine with failure transitions (FFM) is disclosed. The machine FFM is constructed from a nondeterministic finite-state machine and a string of external inputs. States in the machine FFM are formed of a state set q included in the nondeterministic finite-state machine and a set p defined as a subset of the state set q, and the number of states is finite. Also, an external input c takes the machine FFM from a current state s to a next state g(s,c) and an output μ(s) is output from the next state g(s,c) in cases where a value g(s,c) of a success function g is defined, and an external input c takes the machine FFM from the current state s to a state g(f(f...f(s)...),c) determined by repeatedly calculating a value f(s) of a failure function f until a defined value g(f(f...f(s)...),c) is found in cases where the value g(s,c) of the success function g is not defined. Because the success function g need not define transitions from the current state s for every external input c, the storage capacity required for the machine FFM is considerably reduced.
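The transition rule described above (follow the success function g when it is defined, otherwise apply the failure function f repeatedly until it is) can be sketched as below. This illustrates only how such a machine is run, not the disclosed construction from a nondeterministic machine. The tables `G`, `F` and `MU` are hand-written for the single pattern "aba" and are hypothetical, as is the convention that the root state absorbs any input for which no transition is defined.

```python
def step(state, c, g, f, root=0):
    """Advance one input symbol using success function g and failure function f."""
    while (state, c) not in g:
        if state == root:
            return root  # assumed convention: undefined input at the root stays there
        state = f[state]  # g(state, c) undefined, so fall back via f
    return g[(state, c)]


def run(text, g, f, mu, root=0):
    """Feed text through the machine, collecting (position, output) pairs."""
    state = root
    outputs = []
    for pos, c in enumerate(text):
        state = step(state, c, g, f, root)
        if state in mu:
            outputs.append((pos, mu[state]))
    return outputs


# Hypothetical tables recognizing the pattern "aba".
G = {(0, "a"): 1, (1, "b"): 2, (2, "a"): 3}  # success function: only 3 entries
F = {1: 0, 2: 0, 3: 1}                        # failure function
MU = {3: "aba"}                               # output attached to state 3

matches = run("ababa", G, F, MU)
```

Only three success transitions are stored for four states, whereas a fully specified deterministic table would need one entry per state and input symbol; this is the storage saving the abstract refers to.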