Abstract:
Methods and arrangements for enhancing search quality. Query search results are displayed, and the search query provenance related to those results is depicted graphically. An investigative function is also provided graphically, allowing at least one aspect of the search query provenance to be investigated.
Abstract:
A data mashup system having information extraction capabilities for receiving multiple streams of textual data, at least one of which contains unstructured textual data. A repository stores annotators that describe how to analyze the streams of textual data for specified unstructured data components. The annotators are applied to the data streams to identify and extract the specified data components according to the annotators. The extracted data components are tagged to generate structured data components and the specified unstructured data components in the input data streams are replaced with the tagged data components. The system then combines the tagged data from the multiple streams to form a mashup output data stream.
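The annotate, tag, replace, and combine steps described above can be illustrated with a minimal sketch. It assumes that an annotator is nothing more than a named regular expression and that tagging means wrapping the match in XML-style markers. The names `ANNOTATORS`, `annotate`, and `mashup` are hypothetical and do not come from the abstract.

```python
import re

# Hypothetical annotator repository: tag name -> pattern for the component.
# Real annotators would be far richer; regexes are an assumption of this sketch.
ANNOTATORS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
}


def annotate(text, annotators=ANNOTATORS):
    """Replace each matched unstructured component with its tagged form."""
    for tag, pattern in annotators.items():
        text = pattern.sub(
            lambda m, t=tag: f"<{t}>{m.group(0)}</{t}>", text
        )
    return text


def mashup(streams, annotators=ANNOTATORS):
    """Annotate every record of every stream and merge them into one output."""
    output = []
    for source, records in streams.items():
        for record in records:
            output.append({"source": source, "text": annotate(record, annotators)})
    return output


if __name__ == "__main__":
    streams = {
        "feed_a": ["Mail bob@example.com or call 555-1234"],
        "feed_b": ["No contact details here"],
    }
    for item in mashup(streams):
        print(item)
```

Keeping the annotators in a separate mapping mirrors the repository idea in the abstract: new component types can be added without changing the code that processes the streams.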
Abstract:
A method, device, and computer program product are provided for regular expression learning. An initial regular expression may be received from a user. The initial regular expression is executed over a database, and the resulting matches are labeled as positive or negative. The initial regular expression and the labeled positive and negative matches are input to a transformation process. The transformation process may iteratively apply character class restrictions, quantifier restrictions, and negative lookaheads to the initial regular expression to transform it into a pool of candidate regular expressions. The transformation process may apply the character class restrictions, quantifier restrictions, and negative lookaheads one at a time. A candidate regular expression is selected from the pool, namely the candidate with the best F-Measure in the pool.
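The loop described above can be sketched as follows. This is an illustration under simplifying assumptions, not the patented procedure: the "database" is a single string, each transformation is a plain string rewrite, and the function names (`character_class_restrictions`, `quantifier_restrictions`, `negative_lookaheads`, `learn`) are hypothetical. The F-Measure is computed as the usual harmonic mean of precision and recall against the labeled positive matches.

```python
import re


def matches(regex, text):
    """Set of full-match strings the regex finds in the text."""
    return {m.group(0) for m in re.finditer(regex, text)}


def f_measure(regex, text, positives):
    """Harmonic mean of precision and recall against the labeled positives."""
    found = matches(regex, text)
    tp = len(found & positives)
    if tp == 0:
        return 0.0
    precision = tp / len(found)
    recall = tp / len(positives)
    return 2 * precision * recall / (precision + recall)


def character_class_restrictions(regex, negatives):
    # Narrow \w to a smaller class (hypothetical choice of classes).
    return [regex.replace(r"\w", cls) for cls in (r"\d", "[a-z]")]


def quantifier_restrictions(regex, negatives):
    # Tighten the open-ended '+' to a minimum repeat count.
    return [regex.replace("+", "{%d,}" % n) for n in (2, 3, 4)]


def negative_lookaheads(regex, negatives):
    # Forbid one labeled negative match at the start of the pattern.
    return ["(?!" + re.escape(neg) + r"\b)" + regex for neg in sorted(negatives)]


TRANSFORMATIONS = (
    character_class_restrictions,
    quantifier_restrictions,
    negative_lookaheads,
)


def learn(initial, text, positives, negatives, rounds=2):
    """Grow a candidate pool one transformation at a time, then pick the best."""
    pool = {initial}
    for _ in range(rounds):
        for regex in list(pool):
            for transform in TRANSFORMATIONS:
                pool.update(transform(regex, negatives))
    # sorted() makes the choice among equally scored candidates deterministic.
    return max(sorted(pool), key=lambda r: f_measure(r, text, positives))


if __name__ == "__main__":
    text = "call 555-1234, a well-known fax 555-9876, order 12-9"
    positives = {"555-1234", "555-9876"}
    negatives = {"well-known", "12-9"}
    best = learn(r"\b\w+-\w+\b", text, positives, negatives)
    print(best, f_measure(best, text, positives))
```

In this toy input the initial expression also matches the two negative strings, while the selected candidate matches only the labeled positives. Several candidates may tie on F-Measure, so which one is returned depends on the tie-breaking order chosen here.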
Abstract:
A method for translating a natural language query into a structured query for a database is provided. The method generally includes: generating a parse tree which represents a natural language query for a database; mapping terms in the parse tree to components of a structured query language for the database; and grouping the components of the structured query language.
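The three steps named above (parse tree, term mapping, grouping) can be sketched in a toy form. Everything concrete here is an assumption made for illustration: the parse tree is written by hand rather than produced by a parser, and the lexicon, table, and column names are hypothetical.

```python
from collections import defaultdict

# Hypothetical parse tree for "show names of employees with salary over 50000".
# Each node is (term, [children]); a real system would obtain this from a parser.
PARSE_TREE = ("show", [("names", [("employees", [("salary over 50000", [])])])])

# Hypothetical lexicon: term -> (SQL clause, fragment for that clause).
LEXICON = {
    "names": ("SELECT", "name"),
    "employees": ("FROM", "employees"),
    "salary over 50000": ("WHERE", "salary > 50000"),
}


def map_terms(node):
    """Walk the tree and yield the SQL component for every known term."""
    term, children = node
    if term in LEXICON:
        yield LEXICON[term]
    for child in children:
        yield from map_terms(child)


def group_components(components):
    """Group the mapped fragments by the clause they belong to."""
    grouped = defaultdict(list)
    for clause, fragment in components:
        grouped[clause].append(fragment)
    return grouped


def to_sql(tree):
    """Assemble a query string from the grouped components."""
    grouped = group_components(map_terms(tree))
    sql = f"SELECT {', '.join(grouped['SELECT'])} FROM {', '.join(grouped['FROM'])}"
    if grouped["WHERE"]:
        sql += f" WHERE {' AND '.join(grouped['WHERE'])}"
    return sql


if __name__ == "__main__":
    print(to_sql(PARSE_TREE))
```

Terms with no lexicon entry, such as "show", are simply skipped in this sketch; how unmapped terms should be handled is not specified in the abstract.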