-
Publication No.: US11893817B2
Publication date: 2024-02-06
Application No.: US17386386
Filing date: 2021-07-27
Inventors: Paulo Abelha Ferreira, Pablo Nascimento da Silva, Rômulo Teixeira de Abreu Pinho, Tiago Salviano Calmon, Vinicius Michel Gottin
IPC classification: G06F30/00, G06V30/412, G06F16/35, G06V30/413, G06V30/414, G06F18/214
CPC classification: G06V30/412, G06F16/35, G06F18/214, G06V30/413, G06V30/414
Abstract: Techniques described herein relate to a method for predicting field values of documents. The method may include identifying a field prediction model generation request; obtaining training documents from a document manager; selecting a first training document; making a first determination that the first training document is a text-based document; performing text-based data extraction to identify first words and first boxes included in the first training document; identifying first keywords and first candidate words included in the first training document based on the first words and the first boxes; and generating a first annotated training document using the first keywords and the first candidate words, wherein the first annotated training document comprises color-based representation masks for the first keywords, the first candidate words, and first general words included in the first training document.
-
Publication No.: US11893422B2
Publication date: 2024-02-06
Application No.: US17215586
Filing date: 2021-03-29
Inventors: Philip Shilane, Abhinav Duggal, George Mathew
IPC classification: G06F9/50, G06F16/174, H04L67/1023, G06F16/16, G06F16/13
CPC classification: G06F9/505, G06F16/134, G06F16/162, G06F16/1748, H04L67/1023
Abstract: A deduplicated file system includes a set of microservices including front-ends and back-ends. Assignments of files are balanced across front-ends. The files are represented by segment trees including multiple segment levels. Assignments of similarity groups are balanced across back-ends. Similarity groups are associated with segments at a lower level of the segment trees that form the files. Front-ends are responsible for operations involving an upper level of the trees. Back-ends are responsible for operations involving the lower level of the trees. A mapping of file assignments to front-ends and of similarity group assignments to back-ends is stored. A request to perform a file system operation is received. The mapping is consulted to identify the particular front-ends and back-ends that should be responsible for handling and processing the request.
-
Publication No.: US11893064B2
Publication date: 2024-02-06
Application No.: US16782426
Filing date: 2020-02-05
IPC classification: G06F16/906, G06F16/907, G06F16/901, G06F16/182, H04L67/141
CPC classification: G06F16/906, G06F16/182, G06F16/907, G06F16/9014, H04L67/141
Abstract: Different logical partitions representing parts of a distributed file system global namespace are hosted on some cluster nodes, e.g., metadata nodes. File content and shadow logical partitions corresponding to the different logical partitions are hosted on other nodes, e.g., data nodes. Each file is associated with a metadata and data node. TCP links are established between nodes. Upon opening files, a file manager server session is generated between each pair of nodes associated with the open files to track open states and is recorded in a mapping table. The mapping table identifies each open file and associated nodes. When a metadata or data node of a particular pair of nodes associated with an open file becomes unavailable, the mapping table is consulted to identify another of the metadata or data node associated with the open file. Crash recovery protocols are performed on the other of the metadata or data node.
-
Publication No.: US11892980B2
Publication date: 2024-02-06
Application No.: US17447901
Filing date: 2021-09-16
IPC classification: G06F16/174
CPC classification: G06F16/1748
Abstract: One example method includes performing a hash of data to generate a hash value, checking a binary trie to determine if the hash value has previously been entered into the binary trie, if the hash value has previously been entered in the binary trie, declaring the data as a duplicate of other data, and if the hash value has not been previously entered in the binary trie, updating the binary trie to include the hash value.
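The check-then-insert flow in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the choice of SHA-256, the dict-based node layout, and the names `BinaryTrie`, `insert_if_absent`, and `is_duplicate` are all assumptions made for the example.

```python
import hashlib


class BinaryTrie:
    """Binary trie keyed on the bits of a fixed-length hash value (hypothetical layout)."""

    def __init__(self):
        self.root = {}

    def insert_if_absent(self, digest: bytes) -> bool:
        """Return True if the digest was already present; otherwise add it and return False."""
        node = self.root
        created = False
        for byte in digest:
            # Walk the bits from most significant to least significant.
            for shift in range(7, -1, -1):
                bit = (byte >> shift) & 1
                if bit not in node:
                    node[bit] = {}
                    created = True
                node = node[bit]
        # All digests have the same length, so a full path that already
        # existed means this exact digest was entered before.
        return not created


def is_duplicate(trie: BinaryTrie, data: bytes) -> bool:
    """Hash the data (SHA-256 is an assumption) and check or update the trie."""
    digest = hashlib.sha256(data).digest()
    return trie.insert_if_absent(digest)


trie = BinaryTrie()
first = is_duplicate(trie, b"chunk-1")   # not seen before, gets inserted
second = is_duplicate(trie, b"chunk-1")  # same hash path already exists
```

The lookup and the update share one traversal here, so a new hash value costs a single pass over its bits.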
-
Publication No.: US11868890B2
Publication date: 2024-01-09
Application No.: US17714247
Filing date: 2022-04-06
Inventors: Chandra Yeleshwarapu, Jonas F. Dias, Angelo Ciarlini, Romulo D. Pinho, Vinicius Gottin, Andre Maximo, Edward Pacheco, David Holmes, Keshava Rangarajan, Scott David Senften, Joseph Blake Winston, Xi Wang, Clifton Brent Walker, Ashwani Dev, Nagaraj Sirinivasan
CPC classification: G06N3/08, G06F9/48, G06F9/4843, G06F9/4881, G06F9/50, G06F9/5061, G06F9/5066, G06F9/5077, G06F9/5083, G06N3/02, G06N3/04, G06N3/086, G06F2209/501, G06F2209/5011, G06F2209/5019
Abstract: A computer implemented method, computer program product, and system for managing execution of a workflow comprising a set of subworkflows, comprising optimizing the set of subworkflows using a deep neural network, wherein each subworkflow of the set of subworkflows has a set of tasks, wherein each task of the sets of tasks has a requirement of resources of a set of resources; wherein each task of the sets of tasks is enabled to be dependent on another task of the sets of tasks, training the deep neural network by: executing the set of subworkflows, collecting provenance data from the execution, and collecting monitoring data that represents the state of said set of resources, wherein the training causes the neural network to learn relationships between the states of said set of resources, the said sets of tasks, their parameters and the obtained performance, optimizing an allocation of resources of the set of resources to each task of the sets of tasks to ensure compliance with a user-defined quality metric based on the deep neural network output.
-
Publication No.: US11868641B2
Publication date: 2024-01-09
Application No.: US18068926
Filing date: 2022-12-20
Inventors: Itay Azaria, Kfir Wolfson, Jehuda Shemer, Saar Cohen
CPC classification: G06F3/065, G06F3/0611, G06F3/0619, G06F3/0679, G06F11/1471, G06F2201/82
Abstract: One example method includes intercepting an IO issued by an application, writing the IO and IO metadata to a splitter journal in NVM, forwarding the IO to storage, and asynchronous with operations occurring along an IO path between the application and storage, evacuating the splitter journal by sending the IO and IO metadata from the splitter journal to a replication site. In this example, sending the IO and IO metadata from the journal to the replication site does not increase a latency associated with the operations on the IO path.
-
Publication No.: US11853417B2
Publication date: 2023-12-26
Application No.: US17132001
Filing date: 2020-12-23
Inventors: Maxim Balin, Tomer Shachar, Yevgeni Gehtman
CPC classification: G06F21/554, G06F21/54, G06F21/602
Abstract: Techniques are provided for hardware device integrity validation using platform configuration values. One method comprises obtaining platform configuration values associated with software of a hardware device; comparing the obtained platform configuration values for the hardware device to one or more platform configuration values stored in a platform configuration table; and performing one or more automated remedial actions (e.g., initiating a reboot of the hardware device) based on a result of the comparison. The platform configuration values for the hardware device may be obtained from a local platform configuration value table of the hardware device. The platform configuration values for the hardware device may be obtained by an integrity validation monitor associated with the hardware device, and the integrity validation monitor may send the obtained platform configuration values for the hardware device to an integrity validation server that securely stores the platform configuration table and performs the comparison.
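The compare-then-remediate step might look like the sketch below. Everything concrete in it is an assumption for illustration: the table layout (device ID mapped to named hex strings), the component names `boot_loader` and `kernel`, the action strings, and the function names. The abstract does not specify how values are encoded or compared.

```python
import hmac


def find_mismatches(reported: dict, expected: dict) -> list:
    """Return the names of expected values that the device did not report correctly."""
    return [
        name
        for name, value in expected.items()
        # compare_digest gives a constant-time comparison of ASCII hex strings.
        if not hmac.compare_digest(reported.get(name, ""), value)
    ]


def validate_device(device_id: str, reported: dict, table: dict) -> str:
    """Compare reported values against the stored table and pick a remedial action.

    Raises KeyError if the device has no entry in the table.
    """
    expected = table[device_id]
    return "reboot" if find_mismatches(reported, expected) else "none"


# Hypothetical reference table held by the validation server.
table = {"dev-1": {"boot_loader": "ab12", "kernel": "ff00"}}
action = validate_device("dev-1", {"boot_loader": "ab12", "kernel": "0000"}, table)
```

In the server-side variant the abstract describes, the monitor on the device would send `reported` to the server, which would hold `table` and run the comparison.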
-
Publication No.: US11853305B2
Publication date: 2023-12-26
Application No.: US17364814
Filing date: 2021-06-30
Inventors: Min Gong, Qicheng Qiu, Jiacheng Ni
IPC classification: G06F16/2457, G06N20/00, G06F16/35
CPC classification: G06F16/24573, G06F16/35, G06N20/00
Abstract: File annotation is described. An example method includes: processing files to be annotated by using an annotation model to determine a first performance of the annotation model, the first performance being associated with the confidence of a model annotation result generated by the annotation model; if the first performance is lower than a predetermined threshold, determining a group of target files from the files based at least on the confidence of the model annotation result; acquiring truth-value annotation information of the group of target files for retraining the annotation model; and if a second performance of the retrained annotation model is higher than or equal to the predetermined threshold, determining annotation information for at least some of the files by using the retrained annotation model. Based on this approach, automatic annotation of files can be realized with less truth-value annotation information, thereby reducing annotation costs.
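The selection step can be sketched as below, under two assumptions that the abstract leaves open: the model's "performance" is taken to be the mean confidence over all files, and the target files are the ones with the lowest confidence. The names `select_for_labeling` and `budget` are hypothetical.

```python
def select_for_labeling(confidences: dict, threshold: float, budget: int) -> list:
    """Pick files for truth-value annotation when the model is not confident enough.

    confidences maps a file name to the model's confidence in its annotation.
    Returns an empty list when mean confidence already meets the threshold.
    """
    if not confidences:
        return []
    performance = sum(confidences.values()) / len(confidences)
    if performance >= threshold:
        return []
    # Least confident files first: these are assumed to be the most
    # informative ones to annotate by hand before retraining.
    ranked = sorted(confidences, key=confidences.get)
    return ranked[:budget]


scores = {"a.txt": 0.9, "b.txt": 0.4, "c.txt": 0.6}
targets = select_for_labeling(scores, threshold=0.8, budget=2)
```

After retraining on the hand-annotated targets, the same performance check would be repeated, and the retrained model would annotate the remaining files only once it meets the threshold.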
-
Publication No.: US11847333B2
Publication date: 2023-12-19
Application No.: US16528071
Filing date: 2019-07-31
Inventors: Istvan Gonczi, Sorin Faibish, Ivan Basov
IPC classification: G06F3/06
CPC classification: G06F3/0641, G06F3/0608, G06F3/0631, G06F3/0652, G06F3/0659, G06F3/0673
Abstract: A method, computer program product, and computer system for identifying duplicate sectors in a block of a plurality of blocks. The duplicate sectors in the block may be zeroed out. A data reduction operation may be performed on the block after the duplicate sectors are zeroed out.
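A rough sketch of the zero-then-reduce sequence follows. The 512-byte sector size, the use of zlib as the data reduction operation, and the side table recording which sector each zeroed sector duplicates are assumptions added for the example; the abstract does not say how duplicates are tracked or which reduction is applied.

```python
import zlib

SECTOR_SIZE = 512  # assumed sector size


def zero_duplicate_sectors(block: bytes, sector_size: int = SECTOR_SIZE):
    """Zero every sector that repeats an earlier sector in the same block.

    Returns the modified block and a map from each zeroed sector index to
    the index of the first sector holding the same content.
    """
    first_seen = {}
    refs = {}
    out = bytearray(block)
    for offset in range(0, len(block), sector_size):
        sector = block[offset:offset + sector_size]
        index = offset // sector_size
        if sector in first_seen:
            refs[index] = first_seen[sector]
            out[offset:offset + len(sector)] = bytes(len(sector))
        else:
            first_seen[sector] = index
    return bytes(out), refs


def reduce_block(block: bytes):
    """Zero duplicate sectors, then compress (zlib stands in for data reduction)."""
    zeroed, refs = zero_duplicate_sectors(block)
    return zlib.compress(zeroed), refs


block = b"A" * 512 + b"B" * 512 + b"A" * 512
compressed, refs = reduce_block(block)
```

Restoring the original block would mean decompressing and then copying each referenced sector back into its zeroed position using `refs`.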
-
Publication No.: US11842782B2
Publication date: 2023-12-12
Application No.: US17491080
Filing date: 2021-09-30
Inventors: Matthew Bryan
CPC classification: G11C29/10, G06F9/4881
Abstract: Phased parameterized combinatoric testing for a data storage system is disclosed. A testing recipe can be performed according to different input arguments. Combinatoric testing of the data storage system can be based on different combinations of operations and arguments. The disclosed testing can employ a consistent integer index for arguments passed into the sequenced operations of the recipe. The recipe can be employed to generate a phased test tree that can enable testing based on a phase rather than loading an entire test suite into memory. The consistent integer index can be used to identify failed test cases such that the entire test can be reconstituted from stored failed test information. Distribution of test cases to worker processes can be based on the phased test tree to facilitate interning an operation. Stored failed test information can include human-readable failure information.
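One way a single integer can stand for a whole combination of arguments is a mixed-radix encoding, sketched below. This is an assumed scheme chosen for illustration; the abstract does not describe how its integer index is constructed. The names `case_count` and `decode_case` are hypothetical.

```python
from math import prod


def case_count(arg_choices) -> int:
    """Total number of argument combinations."""
    return prod(len(choices) for choices in arg_choices)


def decode_case(index: int, arg_choices) -> tuple:
    """Rebuild one argument combination from its integer index.

    The last argument varies fastest, matching itertools.product order,
    so a failed case can be stored as one integer and regenerated later
    without materializing the full suite.
    """
    if not 0 <= index < case_count(arg_choices):
        raise IndexError("test case index out of range")
    args = []
    for choices in reversed(arg_choices):
        index, position = divmod(index, len(choices))
        args.append(choices[position])
    args.reverse()
    return tuple(args)


# Hypothetical recipe arguments: an operation name and a replica count.
choices = [["create", "delete"], [1, 2, 3]]
failed_case = decode_case(4, choices)
```

Because each index maps to exactly one combination, a worker could be handed a range of indices rather than a list of fully expanded test cases.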
-