Abstract:
The present invention is a text display system with speech output that uses a method of text segmentation in which segments of text are presented one after another so that the text can be read sequentially. To indicate the location of the text a user is currently reading, the current sentence is emphasized by presenting the surrounding text in faded colors. The current sentence is segmented into phrases, where the points of segmentation are chosen by a series of grammatical rules and the desired number of words in each segment. When the text is presented sequentially, each segment is highlighted within the current sentence. With the use of a text-to-speech output system, each segment is spoken aloud, with a pause before the next segment is presented. In a non-linear/selective reading scenario, a user can select a text segment, the span of which can be automatically generated or manually selected by the user.
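The segmentation step can be illustrated with a minimal sketch. It assumes two simple stand-ins for the "series of grammatical rules": break before a function word once the segment has reached the desired word count, and break after clause punctuation. The word list `BREAK_BEFORE`, the function name, and the `target_words` parameter are hypothetical and are not taken from the abstract.

```python
# Hypothetical list of words before which a phrase break is allowed.
BREAK_BEFORE = {
    "and", "but", "or", "because", "which", "that", "when", "where",
    "with", "for", "from", "over", "in", "on", "to",
}


def segment_sentence(sentence, target_words=4):
    """Split a sentence into phrase-like segments of roughly target_words words."""
    segments, current = [], []
    for word in sentence.split():
        # Rule 1: once the segment is long enough, break before a function word.
        if len(current) >= target_words and word.lower() in BREAK_BEFORE:
            segments.append(" ".join(current))
            current = []
        current.append(word)
        # Rule 2: break after clause punctuation, unless the segment is a single word.
        if word[-1] in ",;:" and len(current) >= 2:
            segments.append(" ".join(current))
            current = []
    if current:
        segments.append(" ".join(current))
    return segments


sentence = "The quick brown fox jumped over the lazy dog, and the cat watched from the window."
for segment in segment_sentence(sentence):
    print(segment)
```

Each printed segment corresponds to one unit that would be highlighted, and spoken with a pause after it, in the sequential reading mode.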
Abstract:
Various embodiments provide a method that comprises receiving a set of segments from a text field, analyzing the set of segments to determine at least one of a target subtext or a target meaning associated with the set of segments, and identifying a set of candidate emoticons where each candidate emoticon in the set of candidate emoticons has an association between the candidate emoticon and at least one of the target subtext or the target meaning. The method may further comprise presenting the set of candidate emoticons for entry selection at a current position of an input cursor, receiving an entry selection for a set of selected emoticons from the set of candidate emoticons, and inserting the set of selected emoticons into the text field at the current position of the input cursor.
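A minimal sketch of the suggest-and-insert flow, assuming a plain keyword lookup in place of the subtext and meaning analysis. The table `EMOTICON_TABLE` and both function names are hypothetical; the abstract does not say how the association between meaning and emoticon is stored.

```python
# Hypothetical association between a target meaning (keyword) and emoticons.
EMOTICON_TABLE = {
    "happy": [":)", ":D"],
    "sad": [":("],
    "love": ["<3"],
}


def candidate_emoticons(segments):
    """Collect emoticons associated with any word in the segments, in first-seen order."""
    candidates = []
    for segment in segments:
        for word in segment.lower().split():
            for emoticon in EMOTICON_TABLE.get(word.strip(".,!?"), []):
                if emoticon not in candidates:
                    candidates.append(emoticon)
    return candidates


def insert_at_cursor(text, cursor, selected):
    """Insert the selected emoticons into the text at the cursor position."""
    return text[:cursor] + " ".join(selected) + text[cursor:]


candidates = candidate_emoticons(["so happy today!", "love it"])
print(candidates)
print(insert_at_cursor("so happy today", 8, candidates[:1]))
```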
Abstract:
The invention relates to a method for detecting and indexing physical measured quantities, and for creating an index data structure for them, in a text corpus comprising a number of text documents (T) made available in advance. The text documents (T) are searched for number strings (Z, Z1, Z2) that correspond to or contain numbers, and each detected string is assigned the recognized number and the attribute "ZAHL" (number). The text documents are also searched, by comparison with predefined strings, for unit strings (U, U1, U2) that correspond to a unit, in particular a physical unit, and each detected string is assigned a code for the respective unit and the attribute "EINHEIT" (unit). A search is then made for unit and number strings (Z, Z1, Z2, U, U1, U2) that occur adjacent to one another or within a predefined range and are arranged in a predefined manner, at least one of which carries the attribute "ZAHL" and at least one of which carries the attribute "EINHEIT". The measurement string (M) found in this way, comprising at least the adjacent number and unit strings (Z, Z1, Z2, U, U1, U2), is assigned the attribute "MESSWERT" (measured value) and a measured value, where the numerical value of the measured value corresponds to the recognized number of the string carrying the attribute "ZAHL" and the unit of the measured value corresponds to the recognized unit of the string carrying the attribute "EINHEIT". The measured value found is converted to a predefined system of units, in particular the SI, and the converted measured value, and optionally also the converted unit, is added to the measurement string as an attribute.
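A minimal sketch of the detection and conversion steps, assuming that a number string directly followed by a unit string (with at most one space between them) counts as a measurement. The unit table `UNIT_TO_SI`, the regular expressions, and the function name are hypothetical; the method itself allows other arrangements and a wider search range.

```python
import re

# Hypothetical table of known unit strings: unit -> (SI unit, conversion factor).
UNIT_TO_SI = {
    "km": ("m", 1000.0),
    "cm": ("m", 0.01),
    "m": ("m", 1.0),
    "g": ("kg", 0.001),
    "kg": ("kg", 1.0),
    "min": ("s", 60.0),
    "s": ("s", 1.0),
}

# Number string: optional sign, digits, optional decimal part with "." or ",".
NUMBER = r"(?P<number>[-+]?\d+(?:[.,]\d+)?)"
# Unit string: longest alternatives first so that "km" is tried before "m".
UNIT = "(?P<unit>" + "|".join(
    sorted(map(re.escape, UNIT_TO_SI), key=len, reverse=True)
) + r")\b"
MEASUREMENT = re.compile(NUMBER + r"\s?" + UNIT)


def find_measurements(text):
    """Return each detected measurement with its SI-converted value as attributes."""
    results = []
    for match in MEASUREMENT.finditer(text):
        value = float(match.group("number").replace(",", "."))
        si_unit, factor = UNIT_TO_SI[match.group("unit")]
        results.append({
            "span": match.span(),        # position of the measurement string
            "text": match.group(0),
            "value": value,              # recognized number
            "unit": match.group("unit"), # recognized unit
            "si_value": value * factor,  # converted measured value
            "si_unit": si_unit,          # converted unit
        })
    return results


for item in find_measurements("The rod is 12,5 cm long and weighs 300 g."):
    print(item["text"], "->", item["si_value"], item["si_unit"])
```

Storing the converted value next to the original string is what makes the index searchable by magnitude regardless of the unit used in the source document.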
Abstract:
The present invention relates to an apparatus and method for recognizing an idiomatic expression using phrase alignment of a parallel corpus, and more particularly, to an apparatus and method extracting an idiom candidate expression using phrase alignment information of a parallel corpus and measuring an idiomatic expression index for each candidate idiomatic expression in order to recognize an idiomatic expression, thereby correcting errors in the measurement of translation entropy and in the extraction of a representative target word, as well as enhancing the accuracy of recognizing an idiomatic expression.
Abstract:
Voice stream augmented note taking may be provided. An audio stream associated with at least one speaker may be recorded and converted into text chunks. A text entry may be received from a user, such as in an electronic document. The text entry may be compared to the text chunks to identify matches, and the matching text chunks may be displayed to the user for selection.
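A minimal sketch of the comparison step, assuming the audio has already been converted into text chunks and that a match is decided by word overlap. The function names and the `min_overlap` threshold are hypothetical; the abstract does not specify a matching criterion.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def matching_chunks(entry, chunks, min_overlap=0.5):
    """Return chunks sharing at least min_overlap of the entry's words, best first."""
    entry_tokens = set(tokenize(entry))
    if not entry_tokens:
        return []
    scored = []
    for chunk in chunks:
        overlap = len(entry_tokens & set(tokenize(chunk))) / len(entry_tokens)
        if overlap >= min_overlap:
            scored.append((overlap, chunk))
    # Stable sort: chunks with equal overlap keep their transcript order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored]


chunks = [
    "the budget review is due on Friday",
    "we should hire two engineers",
    "Friday works for the demo",
]
for chunk in matching_chunks("budget Friday", chunks):
    print(chunk)
```

The returned list is what would be displayed to the user for selection.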
Abstract:
The present disclosure discloses a method and apparatus for selecting a word sequence for a text written in a language without word boundaries, in order to solve the problem of an excessively large computation load when selecting an optimal word sequence in existing technologies. The disclosed method includes: segmenting a segment of the text to obtain different word sequences; determining a common word boundary for the word sequences; and performing optimal word sequence selection for portions of the word sequences prior to the common word boundary. Because optimal word sequence selection is performed for portions of word sequences prior to a common word boundary, shorter independent units can be obtained, thus reducing the computation load of word segmentation.
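A minimal sketch of the common-boundary idea, assuming the candidate word sequences are already available as lists of words covering the same text. The function names and the scoring function are hypothetical; a real system would score candidates with a language model.

```python
from itertools import accumulate


def boundaries(words):
    """Character offsets at which the words of one segmentation end."""
    return set(accumulate(len(word) for word in words))


def common_boundaries(segmentations):
    """Offsets that are word ends in every candidate segmentation, sorted."""
    return sorted(set.intersection(*(boundaries(s) for s in segmentations)))


def slice_words(words, start, end):
    """Words of one segmentation that start within [start, end)."""
    out, pos = [], 0
    for word in words:
        if start <= pos < end:
            out.append(word)
        pos += len(word)
    return out


def select_by_units(segmentations, score):
    """Choose the best-scoring word sequence separately for each independent unit."""
    result, start = [], 0
    for end in common_boundaries(segmentations):
        # Distinct candidate portions lying before this common boundary.
        candidates = list(dict.fromkeys(
            tuple(slice_words(s, start, end)) for s in segmentations
        ))
        result.extend(max(candidates, key=score))
        start = end
    return result


segmentations = [
    ["ab", "cd", "ef"],
    ["a", "bcd", "ef"],
    ["ab", "c", "d", "ef"],
]
# Toy score that favors longer words (an assumption for illustration only).
toy_score = lambda words: sum(len(word) ** 2 for word in words)
print(common_boundaries(segmentations))
print(select_by_units(segmentations, toy_score))
```

Because each unit between two common boundaries is scored on its own, the number of candidates compared at once stays small even when the full text admits many segmentations.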
Abstract:
A method for real-time exploitation of documents in non-English languages includes processing an input document into a processed input document, extracting ontology elements from the processed input document to obtain a document digest (DD), statistically scoring each DD to obtain a DD with category scores, and refining the DD and the category scores to obtain a summary of each document in the form of a refined DD with refined category scores. The summary allows a user to estimate in real time whether the input document warrants added attention.
Abstract:
Training using tree transducers is described. Given sample input/output pairs as training (100, 110), and given a set of tree transducer rules (120), the information is combined to yield locally optimal weights for those rules (140). This combination is carried out by building a weighted derivation forest for each input/output pair and applying counting methods to those forests (130).
Abstract:
An improved method of learning character segments from received text facilitates text input on an improved handheld electronic device. When text is received on the handheld electronic device, the characters of the text are converted into the inputs with which the characters correspond. Segments and other objects are analyzed to generate a proposed character interpretation of the inputs. If at least a portion of the character interpretation differs from a corresponding portion of the received text, a character learning string comprising the differing characters is stored as a candidate. In response to receiving additional text on the handheld electronic device, the characters of the additional text are converted into the inputs with which the characters correspond. Segments and other objects are then analyzed to generate another proposed character interpretation of the series of additional inputs. If at least a portion of this other character interpretation differs from a corresponding portion of the additional received text, another character learning string comprising the differing characters of the additional received text is compared with the candidate. If a set of characters in this other character learning string matches characters in the candidate, the set of characters is stored as a segment.
Abstract:
The present invention provides an easy-to-use system and method for assisting job seekers in locating job opportunities and applying for them using an online connectivity protocol that is simple to use and highly time-efficient. The system identifies and extracts keywords from the job postings in an accessible job database to create a keyword targeted list that excludes common words and phrases. The keyword targeted list is then processed to form a keyword targeted prefix list, which in turn is inserted into a search engine. Upon an appropriate query by a potential job seeker, the search engine returns its results while giving prominent placement to one or more job postings sponsored by a recruiter. An interested job seeker who clicks on the sponsored job posting is directed to the job details through a website-mediated application programming interface.