Abstract:
A method and an apparatus for identifying synonyms and using the identified synonyms to conduct a search are disclosed. The disclosed method includes: obtaining any two words to be identified; determining whether a shortest edit distance between the two words is less than or equal to an edit distance threshold; determining whether the two words to be identified exist in a preset knowledge database and, if so, searching the knowledge database for the smallest-granularity type with the highest weight value for each word; and, if the two words have the same smallest-granularity type with the highest weight value, determining that the two words are synonyms, and otherwise determining that they are not synonyms. The disclosed techniques improve the accuracy of synonym identification and help ensure its effectiveness.
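The steps in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. The knowledge base layout (`word -> [(type, granularity, weight)]`), the function names, and the default threshold of 2 are all hypothetical. It also assumes that "smallest granularity type with highest weight" means the highest-weighted type among those at the finest granularity, and that words failing the edit distance check or missing from the knowledge base are treated as non-synonyms.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical knowledge base: word -> list of (type, granularity, weight).
# A smaller granularity number means a finer, more specific type.
KnowledgeBase = Dict[str, List[Tuple[str, int, float]]]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit cost for insert, delete, substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def best_type(word: str, kb: KnowledgeBase) -> Optional[str]:
    """Return the highest-weighted type at the finest granularity, if any."""
    entries = kb.get(word)
    if not entries:
        return None
    finest = min(granularity for _, granularity, _ in entries)
    candidates = [e for e in entries if e[1] == finest]
    return max(candidates, key=lambda e: e[2])[0]


def are_synonyms(w1: str, w2: str, kb: KnowledgeBase, threshold: int = 2) -> bool:
    """Apply the edit distance check, then compare knowledge base types."""
    if edit_distance(w1, w2) > threshold:
        return False  # assumption: too far apart means non-synonym
    t1, t2 = best_type(w1, kb), best_type(w2, kb)
    return t1 is not None and t1 == t2


# Toy data, invented for illustration only.
kb = {
    "colour": [("visual-attribute", 1, 0.9), ("concept", 3, 0.5)],
    "color": [("visual-attribute", 1, 0.8)],
    "collar": [("clothing-part", 1, 0.7)],
}
print(are_synonyms("colour", "color", kb))
print(are_synonyms("color", "collar", kb))
```

In this toy data, "color" and "collar" pass the edit distance check but are rejected because their knowledge base types differ, which is the case the second check is meant to catch.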
Abstract:
A search system includes a data rewriting system that obtains, from a database, one or more search term candidates that are relevant to a present search query. The data rewriting system retrieves properties of the present search query and the one or more search term candidates, where the properties describe respective matching results of the present search query and the one or more search term candidates. Based at least in part on the matching results, the data rewriting system determines whether the present search query needs to be rewritten and, if so, rewrites the present search query based at least in part on the matching results to provide a rewritten present search query. A search engine performs a search based at least in part on the rewritten present search query.
Abstract:
The present disclosure describes a search method, a search apparatus and a search system. In the method, a data rewriting system obtains, from a database, one or more search term candidates that are relevant to a present search term. The data rewriting system retrieves properties of the present search term and the one or more search term candidates, where the properties describe respective matching results of the present search term and the one or more search term candidates. Based on the matching results, the data rewriting system determines whether the present search term needs to be rewritten and, if so, rewrites the present search term based on the matching results to provide a rewritten present search term. A search engine performs a search based on the rewritten present search term. The disclosed method, apparatus and system avoid conducting a search based on fixed rules after the present search term is rewritten, thus reducing the probability of ambiguity in the search process and improving search accuracy.
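The rewrite decision described here could look something like the sketch below. The abstract does not say which properties describe the matching results, so this example assumes a single hypothetical property, `match_count`, and a hypothetical threshold `min_matches`. The class name, function name, and decision rule are illustrative assumptions, not the disclosed method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TermProperties:
    term: str
    match_count: int  # hypothetical property: how many results the term matches


def rewrite_term(
    present: TermProperties,
    candidates: List[TermProperties],
    min_matches: int = 10,
) -> str:
    """Return the term to search with, rewriting only when it seems needed.

    Assumed rule: keep the present term if it already matches enough results;
    otherwise switch to the candidate with the most matches, provided that
    candidate actually matches more than the present term.
    """
    if present.match_count >= min_matches or not candidates:
        return present.term
    best = max(candidates, key=lambda c: c.match_count)
    return best.term if best.match_count > present.match_count else present.term


# Toy data, invented for illustration only.
present = TermProperties("mobil phone", 2)
candidates = [TermProperties("mobile phone", 150), TermProperties("mobile", 90)]
print(rewrite_term(present, candidates))
```

Because the decision depends on the observed matching results rather than on a fixed substitution table, a term that already performs well is left unchanged.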
Abstract:
Ranking search results comprises: retrieving search results that include target strings that relate to a query string; segmenting the query string and each of the target strings; pairing segments in the query string with respective segments in the target strings to form combinations; retrieving weights that correspond to the combinations; determining a weighted word length based on the weights corresponding to each of the target strings; and ranking the target strings based on their respective weighted word lengths. Alternatively, ranking search results includes: determining a minimum weight of each inserted word with respect to segments in the query string; determining a minimum weight of each deleted word with respect to segments in the target strings; determining a total edit distance for each target string; and ranking the target strings based on the total edit distances.
Abstract:
The present disclosure discloses a method for generating a search result and an information search system. The method for generating a search result includes: receiving, by an information search system, a search request; obtaining, by searching, a plurality of pieces of matching information that match the search request; obtaining a respective amount of user response associated with each of the plurality of pieces of matching information and further obtaining a total amount of user response associated with the respective category to which each of the plurality of pieces of matching information belongs; and ranking the plurality of pieces of matching information to generate a search result based on the total amount of user response associated with the respective category to which each of the plurality of pieces of matching information belongs. With this technical scheme, matching information can be displayed to a user in a more rational ranking when the user performs a search, thus improving the user's experience.
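A minimal sketch of ranking by category-level user response follows. The `Item` fields, the use of a single integer as the "amount of user response", and the tie-break on each item's own response are assumptions made for illustration; the abstract only states that ranking is based on the category totals.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Item:
    title: str
    category: str
    responses: int  # hypothetical measure of user response, e.g. clicks


def rank_by_category_response(items: List[Item]) -> List[Item]:
    """Sort items by their category's total response, highest first."""
    totals: Dict[str, int] = defaultdict(int)
    for item in items:
        totals[item.category] += item.responses
    # Assumption: ties within a category are broken by the item's own response.
    return sorted(
        items,
        key=lambda item: (totals[item.category], item.responses),
        reverse=True,
    )


# Toy data, invented for illustration only.
items = [Item("A", "x", 5), Item("B", "y", 8), Item("C", "x", 6)]
print([item.title for item in rank_by_category_response(items)])
```

Here item B has the highest individual response, yet it ranks last because its category total (8) is below that of category x (11).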
Abstract:
Ranking search results comprises: receiving a query string; retrieving a plurality of search results that include a corresponding plurality of target strings that relate to the query string; segmenting the query string and each of the plurality of target strings; pairing segments in the query string with respective segments in the target strings to form a plurality of combinations; retrieving a plurality of weights that correspond to the plurality of combinations based on a mapping of word combinations and their respective weights, wherein a weight measures semantic correlation between words in a word combination; determining a weighted word length based on the weights corresponding to each of the plurality of target strings; and ranking the plurality of target strings based on their respective weighted word lengths. Alternatively, ranking search results includes: determining a minimum weight of each inserted word with respect to segmented words in the query string; determining a minimum weight of each deleted word with respect to segmented words in the target strings; determining a total edit distance based at least in part on the minimum weight of each inserted word and the minimum weight of each deleted word; and ranking the target strings based on the total edit distances.
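The alternative, edit-distance-based ranking might be sketched as below. Several points are assumptions rather than details from the abstract: segmentation is done by whitespace, an inserted word is one that appears in the target but not the query, a deleted word is one that appears in the query but not the target, a weight acts as a cost where a lower value means a closer semantic relation, unknown pairs cost 1.0, and the total edit distance is the plain sum of the minimum weights.

```python
from typing import Dict, FrozenSet, List

# Hypothetical mapping of unordered word pairs to weights (lower = closer).
Weights = Dict[FrozenSet[str], float]


def pair_weight(a: str, b: str, weights: Weights, default: float = 1.0) -> float:
    """Look up the weight of a word pair, falling back to a default cost."""
    return weights.get(frozenset((a, b)), default)


def total_edit_distance(query: str, target: str, weights: Weights) -> float:
    """Sum the minimum weights of inserted and deleted words."""
    q_words, t_words = query.split(), target.split()
    inserted = [w for w in t_words if w not in q_words]
    deleted = [w for w in q_words if w not in t_words]
    cost = 0.0
    for w in inserted:
        # Minimum weight of the inserted word against the query's words.
        cost += min((pair_weight(w, q, weights) for q in q_words), default=1.0)
    for w in deleted:
        # Minimum weight of the deleted word against the target's words.
        cost += min((pair_weight(w, t, weights) for t in t_words), default=1.0)
    return cost


def rank_targets(query: str, targets: List[str], weights: Weights) -> List[str]:
    """Order target strings by ascending total edit distance to the query."""
    return sorted(targets, key=lambda t: total_edit_distance(query, t, weights))


# Toy data, invented for illustration only.
weights = {frozenset(("phone", "mobile")): 0.2}
targets = ["blue car", "red mobile", "red phone"]
print(rank_targets("red phone", targets, weights))
```

Under these assumptions, "red mobile" ranks just behind the exact match because swapping "phone" for the closely related "mobile" is cheap, while "blue car" shares nothing with the query and falls to the end.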
Abstract:
Generating ranked search results includes receiving a plurality of matching information items that match a search request, ranking at least some of the plurality of matching information items using a linear ranking model that linearly combines a first plurality of feature values to obtain a first set of ranked results, ranking at least some of the first set of ranked results using a nonlinear ranking model that nonlinearly combines a second plurality of feature values to obtain a second set of ranked results, and providing a search response based on the second set of ranked results.
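The two-stage ranking can be illustrated with the sketch below. The abstract does not specify either model, so the weighted sum, the `tanh`-based product used as the nonlinear scorer, the dictionary keys, and the `keep` cutoff are all hypothetical stand-ins chosen only to show the cascade structure.

```python
import math
from typing import Dict, List, Sequence


def linear_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """First stage: a linear combination of feature values."""
    return sum(w * f for w, f in zip(weights, features))


def nonlinear_score(features: Sequence[float]) -> float:
    """Second stage: a stand-in nonlinear combination (product of transforms)."""
    return math.prod(math.tanh(f) + 1.0 for f in features)


def two_stage_rank(
    items: List[Dict], weights: Sequence[float], keep: int
) -> List[Dict]:
    """Rank all items linearly, then re-rank the top `keep` nonlinearly."""
    first = sorted(
        items,
        key=lambda it: linear_score(it["linear_features"], weights),
        reverse=True,
    )[:keep]
    return sorted(
        first,
        key=lambda it: nonlinear_score(it["nonlinear_features"]),
        reverse=True,
    )


# Toy data, invented for illustration only.
items = [
    {"id": "a", "linear_features": [1.0, 0.5], "nonlinear_features": [0.1, 0.1]},
    {"id": "b", "linear_features": [0.5, 0.5], "nonlinear_features": [2.0, 2.0]},
    {"id": "c", "linear_features": [0.0, 0.1], "nonlinear_features": [5.0, 5.0]},
]
result = two_stage_rank(items, weights=[1.0, 1.0], keep=2)
print([it["id"] for it in result])
```

A cascade like this is typically chosen so that the cheaper linear stage prunes the candidate set before the more expensive nonlinear stage runs. In the toy data, item c never reaches the second stage even though it would score highest there.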