GENERATING REPRESENTATIVE SAMPLING DATA FOR BIG DATA ANALYTICS

    Publication number: US20240184636A1

    Publication date: 2024-06-06

    Application number: US18060618

    Filing date: 2022-12-01

    IPC classification: G06F9/50 G06F3/06

    Abstract: In an approach, a processor divides a set of data into at least two smaller data blocks. For each of the at least two smaller data blocks, a processor calculates an original value for a data distribution of a respective smaller data block, runs at least two different sampling methods against the respective smaller data block to produce at least two different sets of sample data for the respective smaller data block, calculates respective sampling values for the data distribution of each set of sample data, and selects a set of sample data of the at least two different sets of sample data that has the respective sampling value that is closest to the original value for the respective smaller data block. A processor merges each selected set of sample data for each smaller data block to form a final set of sample data.
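
The steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which statistic serves as the "value for a data distribution" or which sampling methods are used, so the sketch assumes the mean as that value and uses simple random sampling and systematic sampling as the two methods. All function names and parameters (`split_into_blocks`, `representative_sample`, `fraction`, `seed`) are hypothetical.

```python
import random
from statistics import mean


def split_into_blocks(data, n_blocks):
    """Divide data into roughly equal consecutive blocks (assumes non-empty data)."""
    size = -(-len(data) // n_blocks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def random_sample(block, k, rng):
    """Simple random sampling without replacement."""
    return rng.sample(block, k)


def systematic_sample(block, k, rng):
    """Systematic sampling: one element per evenly spaced interval."""
    step = len(block) / k
    start = rng.random() * step
    last = len(block) - 1
    return [block[min(int(start + i * step), last)] for i in range(k)]


def representative_sample(data, n_blocks=2, fraction=0.1, seed=0):
    """Per block, keep the sample whose mean is closest to the block's mean.

    Assumption: the mean stands in for the abstract's distribution value.
    """
    rng = random.Random(seed)
    methods = (random_sample, systematic_sample)
    final = []
    for block in split_into_blocks(data, n_blocks):
        k = max(1, round(len(block) * fraction))
        original = mean(block)  # original value for this block
        candidates = [method(block, k, rng) for method in methods]
        # Select the candidate whose sampling value is closest to the original.
        best = min(candidates, key=lambda s: abs(mean(s) - original))
        final.extend(best)  # merge the selected samples
    return final


if __name__ == "__main__":
    data = [float(x) for x in range(1000)]
    sample = representative_sample(data, n_blocks=4, fraction=0.1)
    print(len(sample), mean(sample), mean(data))
```

Choosing the sample per block, rather than once for the whole data set, lets a different sampling method win in each block. Any other statistic (variance, a histogram distance) could replace `mean` without changing the structure.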

    AUTOMATICALLY INTEGRATING USER TRANSLATION FEEDBACK

    Publication number: US20230196034A1

    Publication date: 2023-06-22

    Application number: US17557737

    Filing date: 2021-12-21

    Abstract: A method includes: receiving, by a computing device, user input indicating an incorrect translation of a word appearing in an interface of an application; identifying, by the computing device, other instances of the word in other interfaces of the application, wherein the identifying is performed using a glossary relationship set that is based on association analysis, wherein the other instances of the word constitute less than all instances of the word appearing in all interfaces of the application; and generating, by the computing device, a new version of the application having a revised translation of the word in the interface and the other instances of the word in the other interfaces.
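
The selective update described in this abstract can be sketched in Python under strong assumptions. The abstract does not describe how the glossary relationship set is built, so the sketch treats it as a precomputed mapping from a (word, interface) pair to the interfaces judged related by association analysis; the analysis itself is not shown. The names `apply_feedback`, `translations`, and `glossary_relations` are hypothetical.

```python
def apply_feedback(translations, glossary_relations, word, source_interface, revised):
    """Return a new version of the translations with the word revised only
    in the reported interface and in the related interfaces.

    translations: {interface: {word: translation}}
    glossary_relations: {(word, interface): set of related interfaces}
        (assumed to be produced by association analysis elsewhere)
    """
    related = glossary_relations.get((word, source_interface), set())
    targets = {source_interface} | related
    # Copy so the current version of the application data stays untouched.
    new_version = {iface: dict(entries) for iface, entries in translations.items()}
    for iface in targets:
        if word in new_version.get(iface, {}):
            new_version[iface][word] = revised
    return new_version


if __name__ == "__main__":
    translations = {
        "login": {"Save": "old"},
        "settings": {"Save": "old"},
        "game": {"Save": "old"},
    }
    relations = {("Save", "login"): {"settings"}}
    updated = apply_feedback(translations, relations, "Save", "login", "new")
    print(updated)
```

In this example the word is revised in `login` and `settings` but left alone in `game`, which reflects the abstract's condition that the other instances are fewer than all instances of the word in the application.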