Genealogical entity resolution system and method

    Publication No.: US12111850B2

    Publication date: 2024-10-08

    Application No.: US17715649

    Filing date: 2022-04-07

    Abstract: Systems and methods are disclosed for determining whether two tree persons in a genealogical database correspond to the same real-life individual. Embodiments include obtaining, from a tree database, a first tree person from a first genealogical tree and a second tree person from a second genealogical tree. Embodiments also include identifying a plurality of familial categories. Embodiments further include, for each familial category of the plurality of familial categories, extracting a first quantity of features for each of the tree persons in the familial category, generating a first similarity score for each possible pairing of tree persons, identifying a representative pairing based on a maximum first similarity score, and extracting a second quantity of features for each of the tree persons in the representative pairing. Embodiments may also include generating a second similarity score based on the second quantity of features.
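The two-stage scoring in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say which features or scoring functions are used, so the name-only first stage (via `difflib.SequenceMatcher`), the three second-stage features, the simple averaging, and all names (`person`, `first_similarity`, `second_similarity`, the sample trees) are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import product


def person(name, birth_year, birth_place):
    """Hypothetical tree-person record used only for this sketch."""
    return {"name": name, "birth_year": birth_year, "birth_place": birth_place}


def first_similarity(a, b):
    """First-stage score from a small feature set (assumption: the name only)."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def second_features(a, b):
    """Larger feature set, extracted only for a representative pairing."""
    year_gap = abs(a["birth_year"] - b["birth_year"])
    return [
        first_similarity(a, b),
        1.0 if year_gap <= 2 else 0.0,
        1.0 if a["birth_place"] == b["birth_place"] else 0.0,
    ]


def representative_pairing(members_1, members_2):
    """Score every possible pairing and keep the one with the maximum first score."""
    pairs = list(product(members_1, members_2))
    if not pairs:
        return None
    return max(pairs, key=lambda pair: first_similarity(*pair))


def second_similarity(tree_1, tree_2, categories):
    """Assumption: average the second-stage features over all familial categories."""
    feats = []
    for category in categories:
        pair = representative_pairing(tree_1.get(category, []), tree_2.get(category, []))
        if pair is not None:
            feats.extend(second_features(*pair))
    return sum(feats) / len(feats) if feats else 0.0


# Sample data: each tree maps a familial category to the tree persons in it.
tree_a = {
    "self": [person("John Smith", 1880, "Ohio")],
    "children": [person("Mary Smith", 1905, "Ohio"), person("Carl Smith", 1908, "Ohio")],
}
tree_b = {
    "self": [person("Jon Smith", 1881, "Ohio")],
    "children": [person("Karl Smith", 1908, "Ohio")],
}
tree_c = {
    "self": [person("Anna Jones", 1850, "Texas")],
    "children": [person("Pete Jones", 1875, "Texas")],
}

score_ab = second_similarity(tree_a, tree_b, ["self", "children"])
score_ac = second_similarity(tree_a, tree_c, ["self", "children"])
```

With this sample data, `score_ab` comes out higher than `score_ac`, and the representative pairing for the `children` category of `tree_a` and `tree_b` is Carl Smith with Karl Smith. The point of the two stages is that the cheap first score is computed for every pairing, while the larger feature set is extracted for only one pairing per category.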

    FAMILY TREE INTERFACE
    Document type: Invention publication

    Publication No.: US20230161796A1

    Publication date: 2023-05-25

    Application No.: US18057904

    Filing date: 2022-11-22

    Abstract: A family tree interface may include a default number of family members in addition to a target node, which are expandable upon selection by a user. The default tree interface is expandable by a user vertically, to include more generations, and laterally. The tree interface includes labels showing a relationship of a tree node to the target node. In some embodiments, one or more family members that have not been rendered may be cached to speed up the visual rendering process. A graphical user interface, in a viewing session, may display an initial view of the family tree associated with the target individual. Upon receipt of an expand request, the viewing session may add the one or more additional family members to generate an expanded view of the family tree. The expanded view may partially adjust the initial view without refreshing the viewing session.
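The expand-and-cache behaviour in this abstract could look something like the sketch below. It is a toy model under stated assumptions, not the interface from the application: it covers only vertical expansion (parents), and the `ViewingSession` class, the `fetch_parents` callback, the `fetch_count` counter, and the label wording are all hypothetical.

```python
class ViewingSession:
    """Toy tree view that expands in place and caches relatives not yet rendered."""

    def __init__(self, fetch_parents, target):
        self.fetch_parents = fetch_parents  # hypothetical backend call: person -> list of parents
        self.fetch_count = 0                # how many backend calls were made
        self.cache = {}                     # person -> parents fetched but not yet rendered
        self.rendered = {target: 0}         # person -> generations above the target node
        self._prefetch(target)

    def _prefetch(self, person):
        """Fetch a rendered person's parents ahead of time so a later expand is instant."""
        if person not in self.cache:
            self.cache[person] = self.fetch_parents(person)
            self.fetch_count += 1

    def label(self, person):
        """Relationship label of a rendered node relative to the target node."""
        depth = self.rendered[person]
        if depth == 0:
            return "self"
        if depth == 1:
            return "parent"
        return "great-" * (depth - 2) + "grandparent"

    def expand(self, person):
        """Handle an expand request: add cached parents without touching existing nodes."""
        if person not in self.rendered:
            raise KeyError(f"{person!r} is not in the current view")
        added = []
        for parent in self.cache.pop(person, []):
            if parent not in self.rendered:
                self.rendered[parent] = self.rendered[person] + 1
                self._prefetch(parent)  # warm the cache for the next expand request
                added.append(parent)
        return added


# Sample data standing in for a tree database.
PARENTS = {"Ann": ["Bob", "Cat"], "Bob": ["Dan"]}
session = ViewingSession(lambda p: PARENTS.get(p, []), "Ann")
first_added = session.expand("Ann")
second_added = session.expand("Bob")
```

Here `expand` only adds entries to `rendered`, so nodes from the initial view keep their state, which mirrors the idea of adjusting the view partially rather than refreshing the whole session.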

    SYSTEMS AND METHODS FOR DETECTION AND CORRECTION OF OCR TEXT

    Publication No.: US20230083000A1

    Publication date: 2023-03-16

    Application No.: US17895818

    Filing date: 2022-08-25

    IPC classification: G06V30/12 G06V30/19 G06V30/26

    Abstract: OCR-text correction system and method embodiments are described. The OCR-text correction embodiments comprise or cooperate with a transformer-based sequence-to-sequence language model. The model is pretrained to denoise corrupted text and is fine-tuned using OCR-correction-specific examples. Text obtained at least in part through OCR is applied to the fine-tuned pretrained transformer model to detect at least one error in a subset of the text. Responsive to detecting the at least one error, the fine-tuned pretrained transformer model outputs an updated subset of the text to correct the at least one error.
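The control flow around the model in this abstract can be sketched as below. This is only the surrounding pipeline, under stated assumptions: the fine-tuned transformer is replaced by a lookup-table stand-in (`toy_model`), and the chunking by word count, the rule that "output differs from input" means "error detected", and all function names are hypothetical.

```python
from typing import Callable, List, Tuple


def split_into_subsets(text: str, max_words: int = 8) -> List[str]:
    """Split text into word-bounded subsets (assumption: fixed-size word chunks)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def correct_ocr_text(
    text: str, model: Callable[[str], str], max_words: int = 8
) -> Tuple[str, List[Tuple[str, str]]]:
    """Apply the model to each subset; a changed output is treated as a detected error.

    Note: whitespace is normalized to single spaces by the split/join round trip.
    """
    corrected, edits = [], []
    for chunk in split_into_subsets(text, max_words):
        updated = model(chunk)
        if updated != chunk:
            edits.append((chunk, updated))
        corrected.append(updated)
    return " ".join(corrected), edits


def toy_model(chunk: str) -> str:
    """Stand-in for the fine-tuned sequence-to-sequence model (NOT a transformer)."""
    fixes = {"tbe": "the", "0f": "of", "ccnsus": "census"}
    return " ".join(fixes.get(word, word) for word in chunk.split())


fixed_text, detected = correct_ocr_text("Records 0f tbe 1900 ccnsus", toy_model)
```

In a real system, `model` would wrap a call to the fine-tuned transformer; because `correct_ocr_text` only needs a string-to-string callable, the stand-in can be swapped out without changing the pipeline.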

    CONTEXT-BASED KEYPHRASE EXTRACTION FROM INPUT TEXT

    Publication No.: US20220253604A1

    Publication date: 2022-08-11

    Application No.: US17667320

    Filing date: 2022-02-08

    Abstract: Described herein are systems, methods, and other techniques for extracting one or more keyphrases from an input text. The input text may include a plurality of words. A plurality of token-level attention matrices may be generated using a transformer-based machine learning model. The plurality of token-level attention matrices may be converted into a plurality of word-level attention matrices. A set of candidate phrases may be identified from the plurality of words based on the plurality of word-level attention matrices. The one or more keyphrases may be selected from the set of candidate phrases.
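One way to convert a token-level attention matrix into a word-level one is sketched below. The abstract does not state the aggregation rule, so this uses a common convention as an assumption: attention directed to a multi-token word is summed over its tokens, and attention coming from a multi-token word is averaged over its tokens. The function name, the `word_ids` mapping, and the sample matrix are hypothetical.

```python
def token_to_word_attention(token_attn, word_ids):
    """Collapse a token-level attention matrix to word level.

    token_attn[i][j] is the attention token i pays to token j.
    word_ids[k] is the index of the word that token k belongs to; every word
    index from 0 to max(word_ids) must own at least one token.
    """
    n_tokens = len(word_ids)
    n_words = max(word_ids) + 1

    # Step 1: sum the columns that belong to the same target word.
    col_summed = [[0.0] * n_words for _ in range(n_tokens)]
    for i in range(n_tokens):
        for j in range(n_tokens):
            col_summed[i][word_ids[j]] += token_attn[i][j]

    # Step 2: average the rows that belong to the same source word.
    counts = [0] * n_words
    word_attn = [[0.0] * n_words for _ in range(n_words)]
    for i in range(n_tokens):
        w = word_ids[i]
        counts[w] += 1
        for j in range(n_words):
            word_attn[w][j] += col_summed[i][j]
    return [[value / counts[w] for value in row] for w, row in enumerate(word_attn)]


# Tokens "family", "tree", "##s" form the two words "family" and "trees".
sample_word_ids = [0, 1, 1]
sample_token_attn = [
    [0.5, 0.25, 0.25],
    [0.2, 0.4, 0.4],
    [0.4, 0.3, 0.3],
]
sample_word_attn = token_to_word_attention(sample_token_attn, sample_word_ids)
```

With this rule, each word-level row still sums to 1 whenever the token-level rows do, so the result can be read as an attention distribution over words.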

    HANDWRITING RECOGNITION
    Document type: Invention application

    Publication No.: US20220189188A1

    Publication date: 2022-06-16

    Application No.: US17643545

    Filing date: 2021-12-09

    IPC classification: G06V30/226 G06K9/62 G06N3/04

    Abstract: A simplified handwriting recognition approach includes a first network comprising a convolutional neural network that includes one or more convolutional layers and one or more max-pooling layers. The first network receives an input image of handwriting and outputs an embedding based thereon. A second network comprises a network of cascaded convolutional layers including one or more subnetworks configured to receive an embedding of a handwriting image and output one or more character predictions. The subnetworks are configured to downsample and flatten the embedding to a feature map and then a vector before passing the vector to a dense neural network for character prediction. Certain subnetworks are configured to concatenate an input embedding with an upsampled version of the feature map.
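The tensor operations named in this abstract (downsample, flatten, upsample, concatenate) can be illustrated at the shape level with plain Python lists. This sketch has no learned weights and is not the network from the application: the 2x2 max-pooling, the nearest-neighbour upsampling, the channel-stacking form of concatenation, and all function names are assumptions chosen for illustration.

```python
def max_pool_2x2(fmap):
    """Downsample a 2-D map (list of rows) by taking the max of each 2x2 block."""
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]) - 1, 2)
        ]
        for r in range(0, len(fmap) - 1, 2)
    ]


def flatten(fmap):
    """Flatten a 2-D feature map into the vector a dense layer would receive."""
    return [value for row in fmap for value in row]


def upsample_2x(fmap):
    """Nearest-neighbour upsampling: repeat every value twice along both axes."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def concat_channels(embedding, upsampled):
    """Stack two same-sized maps as a two-channel input for the next subnetwork."""
    if len(embedding) != len(upsampled) or len(embedding[0]) != len(upsampled[0]):
        raise ValueError("maps must have the same height and width")
    return [embedding, upsampled]


# A 4x4 single-channel embedding standing in for the first network's output.
embedding = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
feature_map = max_pool_2x2(embedding)        # 2x2 feature map
vector = flatten(feature_map)                # length-4 vector for the dense layer
restored = upsample_2x(feature_map)          # back to 4x4
next_input = concat_channels(embedding, restored)
```

The upsampling step exists so that the downsampled feature map regains the spatial size of the input embedding; only then can the two be concatenated.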

    Genealogical entity resolution system and method

    Publication No.: US11321361B2

    Publication date: 2022-05-03

    Application No.: US16758757

    Filing date: 2018-10-19

    Abstract: Systems and methods are disclosed for determining whether two tree persons in a genealogical database correspond to the same real-life individual. Embodiments include obtaining, from a tree database, a first tree person from a first genealogical tree and a second tree person from a second genealogical tree. Embodiments also include identifying a plurality of familial categories. Embodiments further include, for each familial category of the plurality of familial categories, extracting a first quantity of features for each of the tree persons in the familial category, generating a first similarity score for each possible pairing of tree persons, identifying a representative pairing based on a maximum first similarity score, and extracting a second quantity of features for each of the tree persons in the representative pairing. Embodiments may also include generating a second similarity score based on the second quantity of features.

    SYSTEMS AND METHODS FOR IDENTIFYING AND SEGMENTING OBJECTS FROM IMAGES

    Publication No.: US20210390704A1

    Publication date: 2021-12-16

    Application No.: US17343626

    Filing date: 2021-06-09

    Abstract: Systems and methods for identifying and segmenting objects from images include a preprocessing module configured to adjust a size of a source image; a region-proposal module configured to propose one or more regions of interest in the size-adjusted source image; and a prediction module configured to predict a classification, bounding box coordinates, and mask. Such systems and methods may utilize end-to-end training of the modules using adversarial loss, facilitating the use of a small training set, and can be configured to process historical documents, such as large images comprising text. The preprocessing module within said systems and methods can utilize a conventional image scaler in tandem with a custom image scaler to provide a resized image suitable for GPU processing, and the region-proposal module can utilize a region-proposal network from a single-stage detection model in tandem with a two-stage detection model paradigm to capture substantially all particles in an image.