Hybrid comparison for unicode text strings consisting primarily of ASCII characters

Granted invention patent

US10089281B1 Hybrid comparison for unicode text strings consisting primarily of ASCII characters (in force)


Patent title: Hybrid comparison for unicode text strings consisting primarily of ASCII characters
Application number: US15719479
Filing date: 2017-09-28
Publication number: US10089281B1
Publication date: 2018-10-02
Inventors: Thomas Neumann, Viktor Leis, Alfons Kemper
Applicant: Tableau Software, Inc.
Applicant address: Seattle, WA, US
Patentee: Tableau Software, Inc.
Current patentee: Tableau Software, Inc.
Current patentee address: Seattle, WA, US
Agent: Morgan, Lewis & Bockius LLP
Main classification: H03M5/00
IPC classifications: H03M5/00; G06F17/22; H03M7/02; H03M7/30; G06F17/27


Abstract:

Comparing text strings with Unicode encoding includes receiving two text strings S1 and S2. For the first text string S1, the process computes a first weight according to a weight function ƒ that computes an ASCII prefix ƒA(S1), computes a Unicode weight suffix ƒU(S1), and concatenates the two to form the first weight ƒ(S1)=ƒA(S1)+ƒU(S1). Computing the ASCII prefix applies bitwise operations to n-byte contiguous blocks of the string to determine whether each block contains only ASCII characters and, when the comparison is designated as accent-insensitive, replaces accented Unicode characters with equivalent unaccented ASCII characters. When the string has a first block that contains a non-replaceable non-ASCII character, the Unicode weight suffix is computed by a character-by-character Unicode weight lookup beginning with that block. The same process is applied to the second string S2. The text strings are then compared by comparing their computed weights.
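
The steps in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the block size n = 8, the names `block_is_ascii`, `unaccent`, `unicode_weight`, `char_len`, `weight` and `compare`, the use of NFD decomposition as the accent-folding rule, and the use of raw code points in place of a real collation weight table are all hypothetical choices made for this example.

```python
import unicodedata

BLOCK = 8                        # n-byte block size (assumed; the abstract leaves n open)
HIGH_BITS = 0x8080808080808080   # top bit of each of the 8 bytes in a block


def block_is_ascii(block: bytes) -> bool:
    """Test a whole block with one bitwise AND: ASCII bytes never set the top bit."""
    word = int.from_bytes(block.ljust(BLOCK, b"\0"), "little")
    return word & HIGH_BITS == 0


def unaccent(ch: str):
    """Hypothetical accent folding: the ASCII base letter of ch, or None if it has none."""
    decomposed = unicodedata.normalize("NFD", ch)
    base, marks = decomposed[0], decomposed[1:]
    if base.isascii() and all(unicodedata.combining(m) for m in marks):
        return base
    return None


def unicode_weight(ch: str, accent_insensitive: bool) -> int:
    """Stand-in for a collation-table lookup; the code point serves as the weight."""
    folded = unaccent(ch) if accent_insensitive else None
    return ord(folded if folded is not None else ch)


def char_len(lead: int) -> int:
    """Byte length of the UTF-8 sequence that starts with the lead byte."""
    if lead < 0x80:
        return 1
    if lead < 0xE0:
        return 2
    if lead < 0xF0:
        return 3
    return 4


def weight(s: str, accent_insensitive: bool = False) -> tuple:
    """Return the ASCII weight prefix followed by the Unicode weight suffix."""
    data = s.encode("utf-8")
    weights = []
    pos = 0                      # always sits on a character boundary
    while pos < len(data):
        block = data[pos:pos + BLOCK]
        if block_is_ascii(block):
            # Fast path: every byte in the block is its own weight.
            weights.extend(block)
            pos += len(block)
            continue
        # Decode every character that starts inside this block.
        end, chars = pos, []
        while end < len(data) and end < pos + BLOCK:
            n = char_len(data[end])
            chars.append(data[end:end + n].decode("utf-8"))
            end += n
        if accent_insensitive:
            folded = [unaccent(c) for c in chars]
        else:
            folded = [c if c.isascii() else None for c in chars]
        if None not in folded:
            # Every non-ASCII character was replaceable: stay in the ASCII prefix.
            weights.extend(ord(c) for c in folded)
            pos = end
            continue
        # First block with a non-replaceable character: per-character lookup
        # from the start of this block to the end of the string.
        rest = data[pos:].decode("utf-8")
        weights.extend(unicode_weight(c, accent_insensitive) for c in rest)
        break
    return tuple(weights)


def compare(s1: str, s2: str, accent_insensitive: bool = False) -> int:
    """Return -1, 0 or 1 by comparing the two computed weights."""
    w1 = weight(s1, accent_insensitive)
    w2 = weight(s2, accent_insensitive)
    return (w1 > w2) - (w1 < w2)
```

In this sketch, `compare("résumé", "resume", accent_insensitive=True)` returns 0 because both strings stay entirely on the ASCII path, while a string such as `"abcdefgh中"` takes the fast path for its first block and falls back to the per-character lookup only from the second block onward. A production system would replace `unicode_weight` with a lookup into actual collation weights.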


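IPC classification:

H	Electricity
H03	Basic electronic circuitry
H03M	Coding, decoding or code conversion, in general (using fluidic means, see F15C4/00; optical analogue/digital converters, see G02F7/00; coding, decoding or code conversion specially adapted for particular applications, see the relevant subclasses, e.g. G01D, G01R, G06F, G06T, G09G, G10L, G11B, G11C, H04B, H04L, H04M, H04N; ciphering or deciphering for cryptography or other purposes involving the need for secrecy, see G09C)
H03M5/00	Conversion of the form of the representation of individual digits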