Abstract:
In the learning apparatus, a memory stores a dictionary in an updatable manner, and an inputting part inputs data when a user enters an instruction. An outputting part processes the data inputted through the inputting part by using the dictionary stored in the memory, and outputs the result of the processing. An identifier receiver obtains an identifier of the user or of a group to which the user belongs. An updating means updates the dictionary only when the identifier obtained by the identifier receiver is pre-registered in the memory.
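The gated update described above can be sketched as follows. This is a minimal illustration only: the class name, the method names, and the use of a plain key/value mapping as the "dictionary" are assumptions, not details taken from the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class LearningApparatus:
    """Hypothetical model of the apparatus; every name here is illustrative."""

    dictionary: Dict[str, str] = field(default_factory=dict)
    registered_ids: Set[str] = field(default_factory=set)

    def process(self, data: str) -> str:
        # Outputting part: process the input with the dictionary.
        # Input that has no entry is returned unchanged.
        return self.dictionary.get(data, data)

    def update(self, identifier: str, key: str, value: str) -> bool:
        # Updating means: modify the dictionary only when the identifier
        # (of a user or a group) is pre-registered.
        if identifier not in self.registered_ids:
            return False
        self.dictionary[key] = value
        return True
```

For example, with `registered_ids={"group-a"}`, a call to `update("guest", ...)` leaves the dictionary untouched and returns `False`, while the same call with `"group-a"` stores the entry.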
Abstract:
A translation device has a dictionary that stores a set of words and their corresponding meanings in plural languages; an input unit that inputs a document; a recognizing unit that recognizes text in the inputted document; an analyzing unit that divides the text recognized by the recognizing unit into words; a translating unit that translates each of the words obtained by the analyzing unit into a translated term by using the dictionary; and an output unit that outputs an output image containing the translated term for a key word.
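The analyzing and translating steps can be sketched roughly as below. Text recognition and image output are left out, and the dictionary contents, the language codes, and the regular-expression word splitter are all assumptions made for illustration.

```python
import re

# Hypothetical dictionary: word -> {language code: meaning}.
DICTIONARY = {
    "cat": {"fr": "chat", "de": "Katze"},
    "house": {"fr": "maison", "de": "Haus"},
}


def divide_into_words(text):
    # Analyzing unit (assumed): split recognized text into lowercase words.
    return re.findall(r"[A-Za-z]+", text.lower())


def translate_words(text, target, dictionary=DICTIONARY):
    # Translating unit (assumed): look up each word and keep only those
    # that have an entry for the target language.
    pairs = []
    for word in divide_into_words(text):
        meanings = dictionary.get(word)
        if meanings and target in meanings:
            pairs.append((word, meanings[target]))
    return pairs
```

Words without a dictionary entry are simply skipped here; how the device selects a "key word" is not stated in the abstract, so this sketch treats every word found in the dictionary as one.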
Abstract:
The present invention provides a document processing device including: a general feature vector memory that stores feature vectors of a shape for each of plural characters; an input unit that optically reads in a document; an extracting unit that extracts feature vectors from the shapes of characters in a document read in by the input unit; a general shape recognition unit that estimates the character corresponding to a shape whose feature vectors were extracted by the extracting unit, based on those feature vectors and the content stored in the general feature vector memory; and a specific feature vector memory that stores the feature vectors extracted by the extracting unit in association with an estimation result of the general shape recognition unit.
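One possible reading of the recognition and storage steps is sketched below. The abstract does not say how the estimate is computed; nearest-neighbor matching by Euclidean distance is an assumption, as are the two-dimensional example vectors and all function names.

```python
import math

# Hypothetical general feature vector memory: character -> feature vector.
GENERAL_MEMORY = {"I": (1.0, 0.0), "O": (0.0, 1.0)}


def estimate_character(features, general=GENERAL_MEMORY):
    # General shape recognition (assumed): choose the character whose
    # stored vector is closest to the extracted one.
    return min(general, key=lambda ch: math.dist(features, general[ch]))


def recognize_and_store(features, specific_memory, general=GENERAL_MEMORY):
    # Store the extracted vector in the specific memory, keyed by the
    # estimation result, so document-specific shapes accumulate over time.
    estimate = estimate_character(features, general)
    specific_memory.setdefault(estimate, []).append(tuple(features))
    return estimate
```

Keeping the extracted vectors alongside the estimate is what would let a later stage adapt to the particular font or handwriting in the document being read.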
Abstract:
The present invention provides a document processing device including: a specifying unit that specifies character strings which have a common property across documents, from among character strings included in plural documents which are represented by plural corresponding document data; and a rewriting unit that rewrites, among the character strings specified by the specifying unit, character strings expressed in formats different from a defined format to character strings expressed in the defined format.
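As a rough illustration of the specifying and rewriting units, the sketch below treats calendar dates as the character strings with a common property and ISO 8601 (`YYYY-MM-DD`) as the defined format. Both choices, along with the two recognized input formats, are assumptions; the abstract names no particular property or format.

```python
import re
from datetime import datetime

DEFINED_FORMAT = "%Y-%m-%d"  # assumed defined format

# Assumed alternative formats: (pattern that finds the string, its strptime format).
OTHER_FORMATS = [
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "%d/%m/%Y"),
    (re.compile(r"\b\d{2}\.\d{2}\.\d{4}\b"), "%d.%m.%Y"),
]


def rewrite_dates(document):
    # Rewrite every date found in a non-defined format into the defined one.
    for pattern, fmt in OTHER_FORMATS:
        def to_defined(match, fmt=fmt):
            try:
                parsed = datetime.strptime(match.group(0), fmt)
            except ValueError:
                return match.group(0)  # not a valid date: leave as-is
            return parsed.strftime(DEFINED_FORMAT)
        document = pattern.sub(to_defined, document)
    return document


def rewrite_documents(documents):
    # Apply the same rewriting across plural documents.
    return [rewrite_dates(doc) for doc in documents]
```

Strings already in the defined format pass through untouched, and strings that match a pattern but are not valid dates are left as they were.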
Abstract:
An image processing device has a reading unit, a graphics area extraction unit, a writing area extraction unit, a character string extraction unit and an association unit. The reading unit reads a document. The graphics area extraction unit extracts a graphics area from the document read by the reading unit. The writing area extraction unit extracts a writing area from the document read by the reading unit. The character string extraction unit extracts a character string presented in the graphics area. The association unit associates information of the writing area with the graphics area based on the character string extracted by the character string extraction unit.
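The association step might look like the sketch below, which assumes that area extraction has already been done and that a writing area is linked to a graphics area whenever its text contains the string extracted from that graphics area. The substring rule and the data layout are assumptions, not the method stated in the abstract.

```python
def associate(graphics_areas, writing_areas):
    """Link writing areas to graphics areas (hypothetical data layout).

    graphics_areas: {graphics area id: string extracted from that area}
    writing_areas:  {writing area id: text of that area}
    Returns {graphics area id: [ids of writing areas mentioning the string]}.
    """
    return {
        g_id: [w_id for w_id, text in writing_areas.items() if label in text]
        for g_id, label in graphics_areas.items()
    }
```

Plain substring matching is deliberately naive: a label such as `Fig. 1` would also match text that mentions `Fig. 10`, so a practical version would need stricter matching.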
Abstract:
The invention provides a document processing device including: a memory that stores syntax data expressing syntax of character strings whose probability of being a title of a document is high or character strings whose probability of being a title of a document is low; an input unit that inputs document data obtained by digitizing a document; an extraction unit that analyzes the input document data and extracts character string data expressing character strings; a syntax analyzing unit that analyzes the extracted character string data and specifies the syntax of each character string contained in the document corresponding to the document data; and a specifying unit that specifies, from among the extracted character string data, character string data expressing a title of the document corresponding to the document data, based on results of specification by the syntax analyzing unit and content stored in the memory.
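A very reduced sketch of the title-specifying idea follows. Real syntax analysis would need a parser; here regular expressions stand in for the stored syntax data, with a positive weight for title-like forms and a negative weight for forms unlikely to be titles. The patterns, the weights, and the function names are all assumptions.

```python
import re

# Hypothetical stand-ins for the stored syntax data: (pattern, weight).
SYNTAX_MEMORY = [
    (re.compile(r"^[A-Z][^.!?]*$"), 1.0),        # no sentence punctuation: title-like
    (re.compile(r"[.!?]$"), -1.0),               # full sentence: unlikely to be a title
    (re.compile(r"^(Dear|Page|Date)\b"), -1.0),  # salutation or header-like opening
]


def title_score(string):
    # Sum the weights of every stored pattern that the string matches.
    return sum(weight for pattern, weight in SYNTAX_MEMORY if pattern.search(string))


def specify_title(strings):
    # Specifying unit (assumed): pick the highest-scoring extracted string.
    return max(strings, key=title_score)
```

`specify_title` expects at least one extracted string, and it returns the first of any strings tied for the highest score.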