Granted invention patent
- Patent title: Hybrid comparison for unicode text strings consisting primarily of ASCII characters
- Application number: US15719479
- Filing date: 2017-09-28
- Publication number: US10089281B1
- Publication date: 2018-10-02
- Inventors: Thomas Neumann, Viktor Leis, Alfons Kemper
- Applicant: Tableau Software, Inc.
- Applicant address: US WA Seattle
- Assignee: Tableau Software, Inc.
- Current assignee: Tableau Software, Inc.
- Current assignee address: US WA Seattle
- Agent: Morgan, Lewis & Bockius LLP
- Main classification number: H03M5/00
- IPC classification numbers: H03M5/00; G06F17/22; H03M7/02; H03M7/30; G06F17/27
Abstract:
Comparing text strings with Unicode encoding includes receiving two text strings S1 and S2. The process computes, for the first text string S1, a first weight according to a weight function ƒ that computes an ASCII prefix ƒA(S1), computes a Unicode weight suffix ƒU(S1), and concatenates the weights to form the first weight ƒ(S1)=ƒA(S1)+ƒU(S1). Computing the ASCII prefix for the first string applies bitwise operations to n-byte contiguous blocks of the first string to determine whether each block contains only ASCII characters, and replaces accented Unicode characters with equivalent unaccented ASCII characters when comparison is designated as accent-insensitive. When there is a first block containing a non-replaceable non-ASCII character, the Unicode weight suffix is computed by performing a character-by-character Unicode weight lookup beginning with the first block. The same process is applied to the second string. The text strings are compared by comparing their computed weights.
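
The process described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made for illustration only: the strings are taken to be UTF-8 encoded, the block size n is fixed at 8 bytes, and the names `is_ascii_block`, `strip_accent`, `unicode_suffix`, `weight`, and `compare` are hypothetical. In particular, the per-character "weight lookup" is replaced by a placeholder that emits the character's UTF-8 bytes, where a real system would presumably consult a collation weight table.

```python
import unicodedata
from typing import Optional

BLOCK = 8  # block size n in bytes (assumed; the abstract only says "n-byte")
HIGH_BITS = int.from_bytes(b"\x80" * BLOCK, "little")  # 0x8080808080808080


def is_ascii_block(block: bytes) -> bool:
    """Bitwise test: a block is pure ASCII if no byte has its high bit set."""
    return (int.from_bytes(block, "little") & HIGH_BITS) == 0


def strip_accent(ch: str) -> Optional[str]:
    """Return the unaccented ASCII equivalent of ch, or None if there is none."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return base if base and base.isascii() else None


def unicode_suffix(text: str, accent_insensitive: bool) -> bytes:
    """Slow path: character-by-character lookup.

    Placeholder weights (an assumption): the UTF-8 bytes of each character,
    after optional accent stripping. A real table would go here.
    """
    out = bytearray()
    for ch in text:
        base = strip_accent(ch) if accent_insensitive else None
        out += (base or ch).encode("utf-8")
    return bytes(out)


def weight(s: str, accent_insensitive: bool = False) -> bytes:
    """Compute f(S) = fA(S) + fU(S) as a byte string."""
    data = s.encode("utf-8")
    prefix = bytearray()  # fA(S): ASCII prefix
    pos = 0               # always sits on a character boundary
    while pos < len(data):
        block = data[pos:pos + BLOCK]
        if is_ascii_block(block):
            prefix += block  # fast path: copy the whole block
            pos += len(block)
            continue
        # The block holds a non-ASCII byte. Extend the end past any UTF-8
        # continuation bytes so the slice decodes cleanly.
        end = pos + BLOCK
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end += 1
        replaced = []
        for ch in data[pos:end].decode("utf-8"):
            base = ch if ch.isascii() else (
                strip_accent(ch) if accent_insensitive else None)
            if base is None:
                # Non-replaceable character: switch to the slow path,
                # beginning with this block.
                rest = data[pos:].decode("utf-8")
                return bytes(prefix) + unicode_suffix(rest, accent_insensitive)
            replaced.append(base)
        prefix += "".join(replaced).encode("ascii")
        pos = end
    return bytes(prefix)  # fU(S) is empty: the string was all ASCII


def compare(s1: str, s2: str, accent_insensitive: bool = False) -> int:
    """Return -1, 0, or 1 by comparing the computed weights."""
    w1 = weight(s1, accent_insensitive)
    w2 = weight(s2, accent_insensitive)
    return (w1 > w2) - (w1 < w2)
```

In this sketch, `compare("cafe", "café", accent_insensitive=True)` treats the two strings as equal because the accented character is replaced in the ASCII prefix, while the accent-sensitive default falls back to the character-by-character path at the first block containing `é`. Strings that are entirely ASCII never leave the block-wise fast path.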