Abstract:
A system and method for enterprise search include one or more computer-readable media storing computer-executable instructions that, when executed on one or more processors, perform acts including: extracting one or more of term data, personal data, and metadata from one or more predetermined resources; retrieving, responsive to a query, a set of information derived from the extracted term data, personal data, and metadata; and receiving feedback responsive to the set of information, the feedback augmenting at least one of the one or more predetermined resources.
Abstract:
Various technologies and techniques are disclosed for calculating authorship dates for a document. A portion of the document in which to look for possible authorship dates is determined. The possible authorship dates are extracted from that portion of the document. A revised authorship date of the document is generated using a neural network. The revised authorship date is returned to the application or process that requested the date.
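The first two steps (selecting a portion of the document and extracting candidate dates from it) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the portion is chosen or which date formats are recognized, so the leading-characters window, the `max_chars` parameter, and the ISO-style `YYYY-MM-DD` pattern are all hypothetical. The neural network that produces the revised date is not shown.

```python
import re
from datetime import date
from typing import List

# Assumption: only ISO-style dates (YYYY-MM-DD) are recognized in this sketch.
DATE_PATTERN = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def select_portion(text: str, max_chars: int = 500) -> str:
    """Hypothetical selection rule: look only at the start of the document."""
    return text[:max_chars]


def extract_candidate_dates(portion: str) -> List[date]:
    """Return every valid calendar date found in the selected portion."""
    candidates = []
    for year, month, day in DATE_PATTERN.findall(portion):
        try:
            candidates.append(date(int(year), int(month), int(day)))
        except ValueError:
            # Matches such as 2009-13-40 look like dates but are not valid.
            continue
    return candidates


DOCUMENT = (
    "Quarterly report\n"
    "Prepared 2009-03-14, revised 2009-04-02.\n"
    "Invalid 2009-13-40.\n" + "x" * 600 + " 2011-01-01"
)
CANDIDATES = extract_candidate_dates(select_portion(DOCUMENT))
```

Here `CANDIDATES` holds the two valid dates from the header; the invalid date is skipped and the trailing date falls outside the selected portion. In the described technique, candidates like these would then be passed to the neural network that generates the revised authorship date.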
Abstract:
An architecture extracts author information from general documents and uses it for search results ranking. The architecture performs automatic author value extraction and makes the extracted value available at index time for subsequent use in query processing and results ranking. Machine learning (e.g., a perceptron algorithm) is employed, and a set of input features for the perceptron algorithm is utilized for author value extraction. The extracted author value is converted into a feature for input to a ranking function that generates a ranking score for each document. The input features can also be weighted according to weighting criteria.
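As a rough illustration of the perceptron step, the sketch below trains a classic perceptron to decide whether a candidate string is an author value. The feature names (`follows_by`, `capitalized`, `in_title`, `near_top`) and the training examples are hypothetical; the abstract does not list the actual input features or how they are weighted.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def predict(weights: Dict[str, float], bias: float, x: Features) -> int:
    """Return 1 if the candidate is classified as an author value, else 0."""
    score = bias + sum(weights.get(name, 0.0) * value for name, value in x.items())
    return 1 if score > 0 else 0


def train_perceptron(
    data: List[Tuple[Features, int]], epochs: int = 10, lr: float = 1.0
) -> Tuple[Dict[str, float], float]:
    """Standard perceptron rule: adjust weights only on misclassified examples."""
    weights: Dict[str, float] = {}
    bias = 0.0
    for _ in range(epochs):
        for x, label in data:
            error = label - predict(weights, bias, x)
            if error != 0:
                for name, value in x.items():
                    weights[name] = weights.get(name, 0.0) + lr * error * value
                bias += lr * error
    return weights, bias


# Hypothetical labeled candidates: 1 = author value, 0 = not an author value.
TRAINING = [
    ({"follows_by": 1.0, "capitalized": 1.0}, 1),
    ({"capitalized": 1.0, "in_title": 1.0}, 0),
    ({"follows_by": 1.0, "capitalized": 1.0, "near_top": 1.0}, 1),
    ({"near_top": 1.0}, 0),
]

weights, bias = train_perceptron(TRAINING)
```

A candidate accepted by `predict` would be stored at index time; converting that stored value into a ranking feature is a separate step not covered here.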
Abstract:
Search results obtained from a ranking model are re-ranked based on user-configured ranking rules. For example, a user may wish to place certain search results at the top or bottom of a ranking of search results, remove some search results, and/or adjust the ranking of some of the search results. A Graphical User Interface (GUI) allows a user to configure the ranking rules (e.g., to enter key/value restrictions and set a boost value) and to preview the application of one or more of the ranking rules. Query language operators that follow a standard operator syntax are created based on the inputs (e.g., a ranking query operator is created that may include multiple user-supplied parameters). The user may also specify a portion of the results from which statistics (e.g., standard deviation, average score) are calculated. For example, a user may specify that statistics be calculated for the top N results.
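The re-ranking behaviour described above might look like the following sketch. The `RankingRule` shape, the action names, and the choice to express a boost in units of the standard deviation of the top-N scores are assumptions made for illustration; the abstract does not specify how the statistics feed into the boost or what the query operator syntax is.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import List, Optional


@dataclass
class Result:
    doc_id: str
    score: float  # score from the underlying ranking model
    props: dict   # key/value properties that rules can match on


@dataclass
class RankingRule:
    key: str
    value: str
    action: str         # hypothetical actions: "boost", "top", "bottom", "remove"
    boost: float = 0.0  # assumption: measured in standard deviations


def rerank(results: List[Result], rules: List[RankingRule],
           top_n: Optional[int] = None) -> List[str]:
    """Apply user-configured rules on top of the model's ranking."""
    ranked = sorted(results, key=lambda r: r.score, reverse=True)
    window = ranked[:top_n] if top_n else ranked  # statistics over the top N only
    std = pstdev([r.score for r in window]) if window else 0.0

    adjusted = []
    for r in ranked:
        tier, score, keep = 0, r.score, True
        for rule in rules:
            if r.props.get(rule.key) != rule.value:
                continue
            if rule.action == "remove":
                keep = False
            elif rule.action == "top":
                tier = 1
            elif rule.action == "bottom":
                tier = -1
            elif rule.action == "boost":
                score += rule.boost * std
        if keep:
            adjusted.append((tier, score, r.doc_id))
    # Pinned tiers dominate; within a tier, higher adjusted score wins.
    adjusted.sort(key=lambda t: (t[0], t[1]), reverse=True)
    return [doc_id for _, _, doc_id in adjusted]


RESULTS = [
    Result("a", 3.0, {"type": "pdf"}),
    Result("b", 2.0, {"type": "wiki"}),
    Result("c", 1.0, {"type": "draft"}),
    Result("d", 0.5, {"type": "policy"}),
]
RULES = [
    RankingRule("type", "draft", "remove"),
    RankingRule("type", "policy", "top"),
    RankingRule("type", "wiki", "boost", boost=2.0),
]
ORDER = rerank(RESULTS, RULES, top_n=3)  # ["d", "b", "a"]
```

With these example rules, the policy document is pinned to the top, the wiki page overtakes the PDF after its boost, and the draft is removed.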
Abstract:
A query received from a user is directed to a particular search application (e.g., an enterprise search portal) that is associated with a result source from which to retrieve results. The received query may be federated to additional result sources when it is determined to be a popular query in one of those result sources. Query logs associated with the additional result sources are analyzed to determine when a query is popular as compared to the original result source. The query may be altered before being executed against one or more of the additional result sources. When the query (altered or unaltered) is determined to be popular for any of the additional result sources as compared to the original result source, the query is executed using that additional result source.
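One way to picture the popularity comparison is the sketch below, which measures a query's share of each result source's query log and federates to any source where that share exceeds the share in the original source. The popularity measure, the `ratio` threshold, and the source names are assumptions; the abstract does not define how popularity is computed or compared.

```python
from collections import Counter
from typing import Dict, List


def relative_popularity(query: str, log: List[str]) -> float:
    """Share of log entries that match the query (case-insensitive)."""
    if not log:
        return 0.0
    counts = Counter(q.lower() for q in log)
    return counts[query.lower()] / len(log)


def sources_to_query(query: str, original: str,
                     logs: Dict[str, List[str]], ratio: float = 1.0) -> List[str]:
    """Return the original source plus any source where the query is more popular."""
    base = relative_popularity(query, logs.get(original, []))
    targets = [original]
    for source, log in logs.items():
        if source == original:
            continue
        if relative_popularity(query, log) > ratio * base:
            targets.append(source)
    return targets


# Hypothetical query logs keyed by result source.
LOGS = {
    "intranet": ["vacation policy", "expense report", "vpn", "vpn setup"],
    "it_helpdesk": ["vpn", "vpn", "password reset", "vpn"],
    "hr": ["vacation policy", "benefits"],
}

VPN_TARGETS = sources_to_query("vpn", "intranet", LOGS)
# ["intranet", "it_helpdesk"]
```

Query alteration before federation is omitted here; an altered query could be passed through the same comparison.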
Abstract:
Embodiments are configured to provide information relevant to individuals of interest to a searching user. In an embodiment, a method includes identifying relevant individuals of a defined network using a relevance model that employs a number of managed properties and ranking features. The relevance model of one embodiment is defined by a schema that includes a textual matching ranking feature, a social distance ranking feature, a levels-to-top ranking feature, and a proximity ranking feature.
Abstract:
Embodiments are directed to ranking search results using a junk profile. For a given corpus of documents, one or more junk profiles may be created and maintained. The junk profile provides reference metrics to represent known junk documents. For example, a junk profile may comprise a dictionary of document data that is automatically inserted into documents created using a particular system or template. A junk profile may also comprise one or more representations (e.g., histograms) of a distribution of a particular junk variable for known junk documents. The junk profile provides a usable representation of known junk documents, and the present systems and methods employ the junk profile to predict the likelihood that documents in the corpus are junk. In embodiments, junk scores are calculated and used to rank such documents higher or lower in response to a search query.
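A junk profile of the kind described could be sketched as below, combining a dictionary of template-inserted text with a histogram of one junk variable over known junk documents. The choice of junk variable (the fraction of tokens found in the dictionary), the four equal-width bins, and the way the junk score demotes the relevance score are all assumptions for illustration.

```python
from typing import List, Set

BINS = 4  # assumption: equal-width bins over [0, 1]


def boilerplate_fraction(tokens: List[str], junk_dictionary: Set[str]) -> float:
    """Junk variable: share of tokens that appear in the junk dictionary."""
    if not tokens:
        return 1.0  # an empty document is treated as pure boilerplate
    hits = sum(1 for t in tokens if t.lower() in junk_dictionary)
    return hits / len(tokens)


def bin_index(value: float, bins: int = BINS) -> int:
    return min(int(value * bins), bins - 1)


def build_histogram(values: List[float], bins: int = BINS) -> List[float]:
    """Normalized distribution of the junk variable over known junk documents."""
    counts = [0] * bins
    for v in values:
        counts[bin_index(v, bins)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def junk_score(tokens: List[str], junk_dictionary: Set[str],
               junk_histogram: List[float]) -> float:
    """How typical the document's junk variable is of known junk documents."""
    value = boilerplate_fraction(tokens, junk_dictionary)
    return junk_histogram[bin_index(value, len(junk_histogram))]


def adjusted_rank_score(relevance: float, junk: float) -> float:
    """Hypothetical combination: likely junk is ranked lower."""
    return relevance * (1.0 - junk)


# Hypothetical profile: template text plus junk-variable values of known junk.
JUNK_DICTIONARY = {"untitled", "page", "click", "here", "to", "edit"}
JUNK_HISTOGRAM = build_histogram([0.9, 1.0, 0.8, 0.6])  # [0.0, 0.0, 0.25, 0.75]

TEMPLATE_DOC = "Untitled page click here to edit".split()
REAL_DOC = "Quarterly revenue grew due to enterprise search adoption".split()

TEMPLATE_JUNK = junk_score(TEMPLATE_DOC, JUNK_DICTIONARY, JUNK_HISTOGRAM)  # 0.75
REAL_JUNK = junk_score(REAL_DOC, JUNK_DICTIONARY, JUNK_HISTOGRAM)          # 0.0
```

With equal relevance scores, the template-only page ends up ranked below the document with real content once `adjusted_rank_score` is applied.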
Abstract:
A query pipeline for an enterprise search system is configurable by a user of the system. A user may create rules for custom query transformation, parallel query generation, federation of queries, mixing of results, and application of display layouts to the received search results. A user interface (UI) assists the user in configuring the search pipeline. For example, a user may enter condition-action rules for queries that affect how a query is transformed, how parallel queries are generated, how queries are federated, how search results are ranked and displayed, how rules are ordered, and the like.
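Condition-action rules for query transformation and parallel query generation could be modelled as in the sketch below. The `QueryRule` structure, the example rules, and the `scope:hr` suffix are hypothetical; the abstract does not define a rule format or query syntax, and federation, result mixing, and display layouts are not covered.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QueryRule:
    name: str
    condition: Callable[[str], bool]      # does the rule fire for this query?
    action: Callable[[str], List[str]]    # one query (transform) or several (parallel)


def apply_rules(query: str, rules: List[QueryRule]) -> List[str]:
    """Run the rules in their configured order over every query produced so far."""
    queries = [query]
    for rule in rules:
        next_queries: List[str] = []
        for q in queries:
            if rule.condition(q):
                next_queries.extend(rule.action(q))
            else:
                next_queries.append(q)
        queries = next_queries
    return queries


# Hypothetical rules a user might configure through the UI.
RULES = [
    QueryRule(
        name="expand-pto",
        condition=lambda q: "pto" in q.split(),
        action=lambda q: [q.replace("pto", "paid time off")],
    ),
    QueryRule(
        name="policy-parallel",
        condition=lambda q: "policy" in q.split(),
        # "scope:hr" is made-up syntax standing in for a second, scoped query.
        action=lambda q: [q, q + " scope:hr"],
    ),
]

QUERIES = apply_rules("pto policy", RULES)
# ["paid time off policy", "paid time off policy scope:hr"]
```

Because rules run in list order, reordering them changes the outcome, which is why rule ordering is itself something the user configures.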