Query expansion for web search
    41.
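    Granted invention patent
    Query expansion for web search (in force)

    Publication number: US08898156B2

    Publication date: 2014-11-25

    Application number: US13040192

    Filing date: 2011-03-03

    Applicants: Jun Xu, Hang Li

    Inventors: Jun Xu, Hang Li

    IPC classification: G06F17/30

    CPC classification: G06F17/30864, G06F17/30672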

    Abstract: Systems, methods, and devices are described for retrieving query results based at least in part on a query and one or more similar queries. Upon receiving a query, one or more similar queries may be identified and/or calculated. In one embodiment, the similar queries may be determined based at least in part on click-through data corresponding to previously submitted queries. Information associated with the query and each of the similar queries may be retrieved, ranked, and/or combined. The combined query results may then be re-ranked based at least in part on a responsiveness and/or relevance to the previously submitted query. The re-ranked query results may then be output to a user that submitted the original query.
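
The retrieve, combine, and re-rank flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: `expand_and_rerank`, `toy_search`, `toy_score`, and the tiny index are hypothetical stand-ins, and the similar queries are passed in directly rather than mined from click-through data.

```python
def expand_and_rerank(query, similar_queries, search, score):
    """Retrieve results for the query and each similar query, combine them,
    and re-rank the combined list against the original query."""
    combined = []
    seen = set()
    for q in [query, *similar_queries]:
        for doc in search(q):
            if doc not in seen:          # drop duplicates across queries
                seen.add(doc)
                combined.append(doc)
    # Re-rank by relevance to the ORIGINAL query only.
    return sorted(combined, key=lambda doc: score(query, doc), reverse=True)


# Toy stand-ins (hypothetical): a tiny index and a token-overlap scorer.
TOY_INDEX = {
    "ny times": ["nytimes.com", "en.wikipedia.org/wiki/The_New_York_Times"],
    "new york times": ["nytimes.com", "nytimes.com/section/world"],
}

def toy_search(q):
    return TOY_INDEX.get(q, [])

def toy_score(q, doc):
    # Count how many query tokens appear as substrings of the document id.
    return sum(token in doc.lower() for token in q.split())

results = expand_and_rerank("ny times", ["new york times"], toy_search, toy_score)
print(results)
```

A real system would replace the scorer with a learned ranking function; the point here is only that documents reached through a similar query compete in the same final ranking.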

    Search results ranking using editing distance and document information
    42.
    Granted invention patent
    Search results ranking using editing distance and document information (in force)

    Publication number: US08812493B2

    Publication date: 2014-08-19

    Application number: US12101951

    Filing date: 2008-04-11

    IPC classification: G06F7/00

    CPC classification: G06F17/2211, G06F17/30864

    Abstract: Architecture for extracting document information from documents received as search results based on a query string, and computing an edit distance between the data string and the query string. The edit distance is employed in determining relevance of the document as part of result ranking by detecting near-matches of a whole query or part of the query. The edit distance evaluates how close the query string is to a given data stream that includes document information such as TAUC (title, anchor text, URL, clicks) information, etc. The architecture includes the index-time splitting of compound terms in the URL to allow the more effective discovery of query terms. Additionally, index-time filtering of anchor text is utilized to find the top N anchors of one or more of the document results. The TAUC information can be input to a neural network (e.g., 2-layer) to improve relevance metrics for ranking the search results.
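
As a rough illustration of the edit-distance idea, the sketch below computes a plain Levenshtein distance over terms between a query and a document title. The patent may use a modified distance with different costs; the unit costs, the term-level granularity, and the function name `edit_distance` are assumptions made here.

```python
def edit_distance(query_terms, data_terms):
    """Classic Levenshtein distance over two sequences (unit costs)."""
    prev = list(range(len(data_terms) + 1))
    for i, q in enumerate(query_terms, start=1):
        curr = [i]
        for j, d in enumerate(data_terms, start=1):
            cost = 0 if q == d else 1
            curr.append(min(prev[j] + 1,          # delete a query term
                            curr[j - 1] + 1,      # insert a data term
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


query = "new york times".split()
title = "the new york times".split()
print(edit_distance(query, title))       # one inserted term separates them
print(edit_distance("kitten", "sitting"))  # also works on characters
```

A small distance signals a near-match of the whole query against the title, which is the kind of signal the abstract describes feeding into ranking.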

    Predicting users' attributes based on users' behaviors
    43.
    Granted invention patent
    Predicting users' attributes based on users' behaviors (in force)

    Publication number: US08756184B2

    Publication date: 2014-06-17

    Application number: US12957649

    Filing date: 2010-12-01

    Abstract: A method, apparatus, system, article of manufacture, and computer readable storage medium provide the ability to predict and utilize a user's attributes. A sample user behavior and a sample user attribute are collected. A model is trained based on the sample user behavior and sample user attribute. Using the model, a probability of a predicted user attribute based on the sample user behavior is predicted. Using the model and the probability, the predicted user attribute is fuzzily determined based on a real user behavior. The predicted user attribute is used to improve a user's experience.
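
The train-then-predict loop in this abstract can be illustrated with a very simple frequency model. This is only a sketch under assumptions: the abstract does not name a model type, so the conditional-frequency table below, the function names `train` and `predict`, and the sample behavior and attribute labels are all hypothetical.

```python
from collections import Counter, defaultdict

def train(samples):
    """Estimate P(attribute | behavior) from (behavior, attribute) pairs."""
    counts = defaultdict(Counter)
    for behavior, attribute in samples:
        counts[behavior][attribute] += 1
    return {
        behavior: {attr: n / sum(cnt.values()) for attr, n in cnt.items()}
        for behavior, cnt in counts.items()
    }

def predict(model, behavior):
    """Return the most probable attribute and its probability."""
    dist = model.get(behavior)
    if not dist:
        return None, 0.0                 # behavior never seen in training
    best = max(dist, key=dist.get)
    return best, dist[best]


samples = [("uses_cad_tool", "engineer")] * 3 + [("uses_cad_tool", "student")]
model = train(samples)
print(predict(model, "uses_cad_tool"))
```

Returning the probability alongside the label lets a caller treat the prediction as soft rather than certain, which is one reading of "fuzzily determined".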

    Learning similarity function for rare queries
    44.
    Granted invention patent
    Learning similarity function for rare queries (in force)

    Publication number: US08612367B2

    Publication date: 2013-12-17

    Application number: US13021446

    Filing date: 2011-02-04

    IPC classification: G06F15/18

    CPC classification: G06N99/005, H04L9/3236

    Abstract: Techniques are described for determining queries that are similar to rare queries. An n-gram space is defined to represent queries and a similarity function is defined to measure the similarities between queries. The similarity function is learned by leveraging training data derived from user behavior data and formalized as an optimization problem using a metric learning approach. Furthermore, the similarity function can be defined in the n-gram space, which is equivalent to a cosine similarity in a transformed n-gram space. Locality sensitive hashing can be exploited for efficient retrieval of similar queries from a large query repository. This technique can be used to enhance the accuracy of query similarity calculation for rare queries, facilitate the retrieval of similar queries and significantly improve search relevance.
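
To make the n-gram representation concrete, here is a small sketch of cosine similarity between queries in an untransformed n-gram space. It deliberately omits the learned transformation and the locality sensitive hashing step; the choice of word unigrams and bigrams and the helper names `ngrams` and `cosine` are assumptions for illustration.

```python
import math
from collections import Counter

def ngrams(query, n_max=2):
    """Represent a query as counts of word n-grams for n = 1..n_max."""
    tokens = query.lower().split()
    grams = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


q = ngrams("cheap flights paris")
print(cosine(q, ngrams("cheap flights london")))
print(cosine(q, ngrams("python tutorial")))
```

In the approach the abstract describes, a learned metric would reweight this space so that queries with little surface overlap but similar user behavior also score highly.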

    Regularized latent semantic indexing for topic modeling
    45.
    Granted invention patent
    Regularized latent semantic indexing for topic modeling (in force)

    Publication number: US08533195B2

    Publication date: 2013-09-10

    Application number: US13169808

    Filing date: 2011-06-27

    Abstract: Electronic documents are retrieved from a database and/or from a network of servers. The documents are topic modeled in accordance with a Regularized Latent Semantic Indexing approach. The Regularized Latent Semantic Indexing approach may allow an equation involving an approximation of a term-document matrix to be solved in parallel by multiple calculating units. The equation may include terms that are regularized via an l1 norm and/or an l2 norm. The Regularized Latent Semantic Indexing approach may be applied to a set, or a fixed number, of documents such that the set of documents is topic modeled. Alternatively, the Regularized Latent Semantic Indexing approach may be applied to a variable number of documents such that, over time, the variable number of documents is topic modeled.
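
The abstract does not state the equation. One plausible form, given here as an assumption, approximates the term-document matrix $D = [d_1, \dots, d_N]$ by a term-topic matrix $U = [u_1, \dots, u_K]$ and a topic-document matrix $V = [v_1, \dots, v_N]$, with an l1 penalty on the topics and an l2 penalty on the document representations ($\lambda_1$ and $\lambda_2$ are hypothetical regularization weights):

```latex
\min_{U, V} \; \sum_{n=1}^{N} \lVert d_n - U v_n \rVert_2^2
  \; + \; \lambda_1 \sum_{k=1}^{K} \lVert u_k \rVert_1
  \; + \; \lambda_2 \sum_{n=1}^{N} \lVert v_n \rVert_2^2

% With U held fixed, each document decouples into its own ridge problem:
v_n = \left( U^{\top} U + \lambda_2 I \right)^{-1} U^{\top} d_n ,
\qquad n = 1, \dots, N
```

Under this form, the per-document updates are independent of one another, which is what would let multiple calculating units solve them in parallel.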

    Time-frequency code spreading method and apparatus in OFDMA system
    47.
    Granted invention patent
    Time-frequency code spreading method and apparatus in OFDMA system (in force)

    Publication number: US08351485B2

    Publication date: 2013-01-08

    Application number: US12680394

    Filing date: 2008-09-27

    Applicant: Hang Li

    Inventor: Hang Li

    IPC classification: H04B1/00

    Abstract: The present invention provides a time-frequency code spreading method in an OFDMA system. The method includes: converting a transmission message into one or more modulating signal vectors, wherein each bit of the transmission message is spread onto all vector elements of a modulating signal vector; and mapping the one or more modulating signal vectors to a set of time-frequency grids, wherein in an OFDMA time-frequency plane, two time-frequency grids to which any two vector elements in each modulating signal vector are mapped do not have the same frequency location or time location. In addition, the present invention also provides a time-frequency code spreading apparatus in an OFDMA system.
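
The two steps in this abstract, spreading each bit over every vector element and mapping elements so that no two share a time or frequency location, can be sketched as below. The abstract names neither the spreading code nor the mapping rule, so the Walsh-Hadamard codes and the diagonal mapping used here are assumptions, as are the function names.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def spread(bits, n):
    """Spread each bit across all n elements using one Hadamard row per bit."""
    if len(bits) > n:
        raise ValueError("at most n bits fit into an n-element vector")
    codes = hadamard(n)
    symbols = [1 if b else -1 for b in bits]
    return [sum(s * codes[k][i] for k, s in enumerate(symbols)) for i in range(n)]

def diagonal_map(n, num_slots, num_subcarriers):
    """Place element i at (time i, frequency i): all times and frequencies differ."""
    if n > min(num_slots, num_subcarriers):
        raise ValueError("grid too small for a collision-free diagonal")
    return [(i % num_slots, i % num_subcarriers) for i in range(n)]


vector = spread([1, 0, 1, 1], 4)
grid = diagonal_map(len(vector), num_slots=8, num_subcarriers=8)
print(vector, grid)
```

Because every Hadamard entry is nonzero, each bit contributes to every element, so a fade that wipes out one time slot or one subcarrier damages only a fraction of any bit's energy.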

    Method and device for transmitting voice in wireless system
    48.
    Granted invention patent
    Method and device for transmitting voice in wireless system (in force)

    Publication number: US08331269B2

    Publication date: 2012-12-11

    Application number: US12682518

    Filing date: 2008-10-09

    IPC classification: H04L12/66, H04W4/00, G06F15/173

    Abstract: Embodiments of the present invention provide a method and device for transmitting voice in a wireless system. The method includes: identifying, by a transmitter, each original voice encoding packet that needs to be sent out with a number indicating playback order, and performing channel encoding on each identified original voice encoding packet to construct a voice session packet; establishing a voice session or voice data mixed session between the transmitter and a receiver; allocating a channel dynamically for the voice session or the voice data mixed session; sending, by the transmitter, newly-arrived voice session packets, delayed voice session packets, voice session packets that need to be re-transmitted, data session packets and control command packets according to pre-configured priority; receiving and detecting, by the receiver, the voice session packets, and sending a NACK packet comprising the number of a lost voice session packet to the transmitter to inform the transmitter to re-transmit the voice session packet, if it is confirmed that the voice session packet is lost; and putting properly received voice session packets into a jitter buffer controller at the receiver if the receiver is a terminal. In embodiments of the present invention, spectral efficiency and reliability of real-time voice services in a wireless multi-service transmission system may be improved while satisfying the Quality of Service (QoS) requirements of real-time services, such as voice service.
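
The receiver-side loss detection described here can be illustrated with a short sketch: packets carry playback-order numbers, gaps in the received numbers identify lost packets to report in NACKs, and properly received packets are ordered for playback. The function names and the rule of declaring every gap lost immediately are assumptions; a real receiver would also apply timers before deciding a packet is lost.

```python
def find_lost(received_numbers):
    """Return the playback numbers missing between the lowest and highest seen."""
    seen = set(received_numbers)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]

def playback_order(received_numbers):
    """Order properly received packets by playback number, dropping duplicates."""
    return sorted(set(received_numbers))


arrivals = [1, 2, 4, 7, 5, 4]      # out of order, with one duplicate
print(find_lost(arrivals))         # numbers to request in NACK packets
print(playback_order(arrivals))    # what a jitter buffer would hold
```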

    Signal transmission method and apparatus used in OFDMA wireless communication system
    49.
    Granted invention patent
    Signal transmission method and apparatus used in OFDMA wireless communication system (in force)

    Publication number: US08289914B2

    Publication date: 2012-10-16

    Application number: US12680667

    Filing date: 2008-09-27

    Applicants: Hang Li, Guanghan Xu

    Inventors: Hang Li, Guanghan Xu

    IPC classification: H04W4/00

    Abstract: Embodiments of the present invention provide a signal transmission method and apparatus used in an Orthogonal Frequency Division Multiple Access (OFDMA) wireless communication system, to enhance stability of signal transmission and resist time-frequency dispersion. The signal transmission method used in the OFDMA wireless communication system provided by an embodiment of the invention includes: converting an L×1 symbol vector into an N×1 modulating signal vector according to a loading factor fed back by a receiving party, in which the value of N is known, both L and N are natural numbers larger than one, N is larger than or equal to L, and the loading factor is the ratio of L to N; mapping the N×1 modulating signal vector into N time-frequency grids; and converting the N time-frequency grids into a signal waveform and sending the signal waveform to the receiving party.
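
The role of the loading factor can be shown with a small sketch: given N and a fed-back loading factor L/N, the transmitter picks L and expands an L×1 symbol vector into an N×1 vector. The abstract does not say which transform performs the conversion, so the use of the first L columns of a Hadamard matrix, and all function names, are assumptions.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def symbols_per_vector(loading_factor, n):
    """Derive L from the fed-back loading factor L/N, enforcing 1 < L <= N."""
    l = round(loading_factor * n)
    if not 1 < l <= n:
        raise ValueError("loading factor must give 1 < L <= N")
    return l

def modulate(symbols, n):
    """Convert an L x 1 symbol vector into an N x 1 modulating signal vector."""
    h = hadamard(n)
    return [sum(h[i][k] * s for k, s in enumerate(symbols)) for i in range(n)]

def demodulate(vector, l):
    """Recover the L symbols by correlating with each spreading column."""
    n = len(vector)
    h = hadamard(n)
    return [sum(h[i][k] * vector[i] for i in range(n)) / n for k in range(l)]


L = symbols_per_vector(0.5, 8)
sent = [1, -1, 1, 1][:L]
signal = modulate(sent, 8)
print(signal, demodulate(signal, L))
```

A lower loading factor spends more grid points per symbol, trading throughput for robustness, which is presumably why the receiver feeds it back.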

    TOPICS IN RELEVANCE RANKING MODEL FOR WEB SEARCH
    50.
    Invention application
    TOPICS IN RELEVANCE RANKING MODEL FOR WEB SEARCH (in force)

    Publication number: US20120030200A1

    Publication date: 2012-02-02

    Application number: US13271638

    Filing date: 2011-10-12

    Applicants: Qing Yu, Jun Xu, Hang Li

    Inventors: Qing Yu, Jun Xu, Hang Li

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: Described is a technology by which topics corresponding to web pages are used in relevance ranking of those pages. Topics are extracted from each web page of a set of web pages that are found via a query. For example, text such as nouns may be extracted from the title, anchor texts and URL of a page, and used as the topics. The extracted topics from a page are used to compute a relevance score for that page based on an evaluation of that page's topics against the query. The pages are then ranked relative to one another based at least in part on the relevance score computed for each page, such as by determining a matching level for each page, ranking pages by each level, and ranking pages within each level. Also described is training a model to perform the relevance scoring and/or ranking.
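
The level-then-score ranking described above might look roughly like this. It is a sketch under assumptions: each field (title, URL, each anchor text) is treated as one topic instead of extracting nouns, matching uses Jaccard overlap, and the three matching levels and the page dictionaries are hypothetical.

```python
import re

def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def extract_topics(page):
    """Treat the title, the URL, and each anchor text as one topic each."""
    fields = [page["title"], page["url"], *page["anchors"]]
    return [set(tokens(f)) for f in fields]

def relevance(query, page):
    """Return (matching level, score): level 2 exact, 1 partial, 0 none."""
    q = set(tokens(query))
    best = 0.0
    for topic in extract_topics(page):
        if topic:
            best = max(best, len(q & topic) / len(q | topic))  # Jaccard overlap
    level = 2 if best == 1.0 else 1 if best > 0 else 0
    return level, best

def rank(query, pages):
    # Sort by level first, then by score within each level.
    return sorted(pages, key=lambda p: relevance(query, p), reverse=True)


pages = [
    {"title": "Cooking Recipes", "url": "https://example.com/c", "anchors": []},
    {"title": "Learn Python Programming", "url": "https://example.com/b", "anchors": []},
    {"title": "Python Tutorial", "url": "https://example.com/a", "anchors": ["python tutorial"]},
]
ranked = rank("python tutorial", pages)
print([p["title"] for p in ranked])
```

The abstract also mentions training a model for the scoring step; the hand-written overlap score here simply stands in for that learned component.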
