SAMPLE PROCESSING BASED ON LABEL MAPPING
    Invention publication

    Publication number: US20230409678A1

    Publication date: 2023-12-21

    Application number: US17845956

    Filing date: 2022-06-21

    Applicant: LEMON INC.

    CPC classification number: G06K9/6227 G06K9/6256 G06K9/6267

    Abstract: A method is proposed for sample processing. A first label for a training sample in a plurality of training samples is mapped into a second label, the first label being represented in a first label space and the second label being represented in a second label space smaller than the first label space. A plurality of classification models are obtained based on the second label and the training sample, a classification model describing an association relationship between a sample and a classification of a label, represented in the second label space, for the sample. A prediction model is generated based on the plurality of classification models, the prediction model describing an association relationship between a sample and a label, represented in the first label space, for the sample. The long tail effect in the original label space may be alleviated in building the prediction model.
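
The abstract does not say how labels are mapped or how the classification models are combined. As a minimal sketch under assumed choices, the Python below maps each of 12 original labels into two smaller label spaces (sizes 3 and 4, via a modulo mapping) and scores every original label by summing the probabilities that hypothetical per-space classifiers assign to its mapped labels. The names `NUM_LABELS`, `MODULI`, `map_label`, `score_original_labels`, and `predict` are illustrative and do not come from the patent.

```python
from typing import List, Sequence

# Assumed setup: 12 labels in the first (original) label space and two
# smaller second label spaces of sizes 3 and 4. These sizes are illustrative.
NUM_LABELS = 12
MODULI = [3, 4]


def map_label(label: int, modulus: int) -> int:
    """Map a first-space label into a smaller second label space (assumed: modulo)."""
    return label % modulus


def score_original_labels(bucket_probs: Sequence[Sequence[float]]) -> List[float]:
    """Score each original label from the outputs of the per-space classifiers.

    bucket_probs[k][b] is the probability that classifier k assigns to
    second-space label b for one sample.
    """
    scores = []
    for label in range(NUM_LABELS):
        total = 0.0
        for probs, modulus in zip(bucket_probs, MODULI):
            total += probs[map_label(label, modulus)]
        scores.append(total)
    return scores


def predict(bucket_probs: Sequence[Sequence[float]]) -> int:
    """Return the original label with the highest combined score."""
    scores = score_original_labels(bucket_probs)
    return max(range(NUM_LABELS), key=scores.__getitem__)
```

Because 3 and 4 are coprime, each of the 12 original labels corresponds to a unique pair of mapped labels, so classifiers that are confident in the small spaces can single out one label in the original space.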

    METHOD FOR FEATURE CONSTRUCTION, METHOD FOR CONTENT DISPLAY AND RELATED APPARATUS

    Publication number: US20240289406A1

    Publication date: 2024-08-29

    Application number: US18563312

    Filing date: 2022-04-28

    Applicant: Lemon Inc.

    CPC classification number: G06F16/9574

    Abstract: The present disclosure relates to a method for feature construction, a method for content display and a related apparatus. The method for feature construction comprises: acquiring interaction data on a content page and loading performance data of the content page; constructing a user interaction feature according to the interaction data on the content page, and constructing a page performance feature of the content page according to the loading performance data of the content page. The user interaction feature and the page performance feature are used for training a content display model, and the content display model is used for determining target content displayed to a target user.

    METHOD AND APPARATUS FOR KNOWLEDGE GRAPH CONSTRUCTION, STORAGE MEDIUM, AND ELECTRONIC DEVICE

    Publication number: US20240330373A1

    Publication date: 2024-10-03

    Application number: US18573944

    Filing date: 2022-08-12

    Applicant: Lemon Inc.

    CPC classification number: G06F16/90344

    Abstract: The disclosure relates to a method and apparatus for knowledge graph construction, a storage medium, and an electronic device. The method comprises: obtaining a target entity identifier and determining an industry type label corresponding to the target entity identifier; determining a target industry attribute table based on a predetermined correspondence among the industry type label, an industry type, and an industry attribute table; obtaining target attribute values of the target entity identifier from a public database based on respective target attribute names in the target industry attribute table, to obtain a target attribute of the target entity identifier, wherein the target attribute characterizes a key-value pair consisting of the target attribute name and the target attribute value; and constructing a knowledge graph based on an entity characterized by the target entity identifier, the industry type label, and the target attribute.
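
The steps in this abstract can be sketched as a chain of lookups that ends in knowledge-graph triples. The Python below is an illustration only: the in-memory dictionaries stand in for the predetermined correspondence and the public database, and every identifier, label, and attribute value in it is hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for the predetermined correspondence among
# industry type label, industry type, and industry attribute table.
ENTITY_TO_LABEL: Dict[str, str] = {"entity-001": "restaurant"}
LABEL_TO_TYPE: Dict[str, str] = {"restaurant": "catering", "hotel": "lodging"}
TYPE_TO_ATTRIBUTE_TABLE: Dict[str, List[str]] = {
    "catering": ["cuisine", "opening_hours"],
    "lodging": ["star_rating"],
}

# Hypothetical stand-in for the public database queried by attribute name.
PUBLIC_DB: Dict[str, Dict[str, str]] = {
    "entity-001": {
        "cuisine": "Sichuan",
        "opening_hours": "10:00-22:00",
        "phone": "unlisted",
    },
}

Triple = Tuple[str, str, str]


def build_triples(entity_id: str) -> List[Triple]:
    """Build (entity, relation, value) triples for one entity identifier."""
    label = ENTITY_TO_LABEL[entity_id]
    attribute_names = TYPE_TO_ATTRIBUTE_TABLE[LABEL_TO_TYPE[label]]
    record = PUBLIC_DB.get(entity_id, {})

    triples: List[Triple] = [(entity_id, "industry_type_label", label)]
    for name in attribute_names:
        # Keep only attributes named in the target industry attribute table.
        if name in record:
            triples.append((entity_id, name, record[name]))
    return triples
```

Restricting the lookup to the names in the selected attribute table is what keeps unrelated fields, such as `phone` in this made-up record, out of the graph.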

    FEATURE EXTRACTION
    Invention publication
    Status: Under examination (published)

    Publication number: US20230334839A1

    Publication date: 2023-10-19

    Application number: US17724140

    Filing date: 2022-04-19

    Applicant: Lemon Inc.

    Abstract: Implementations of the present disclosure relate to methods, devices, and computer program products for extracting a feature for multimedia data that comprises a plurality of medium types. In a method, a first feature is determined for a first medium type in the plurality of medium types by masking a portion in a first medium object with the first medium type. A second feature is determined for a second medium type other than the first medium type in the plurality of medium types. The feature is generated for the multimedia data based on the first and second features. With these implementations, multiple medium types are considered in the feature extraction, and thus the feature may fully reflect various aspects of the multimedia data in an accurate way.

    WEB PAGE CLASSIFICATION METHOD, APPARATUS, STORAGE MEDIUM AND ELECTRONIC DEVICE

    Publication number: US20240289394A1

    Publication date: 2024-08-29

    Application number: US18572097

    Filing date: 2022-06-02

    Applicant: Lemon Inc.

    CPC classification number: G06F16/906 G06F16/9535 G06F16/958

    Abstract: The present disclosure relates to a web page classification method, apparatus, storage medium, and electronic device. The method comprises: acquiring feature information of a web page to be classified; respectively predicting, according to each piece of the feature information, a candidate web page category of the web page to be classified; and determining, from all the candidate web page categories, a target web page category to which the web page to be classified belongs. The candidate web page category of the web page to be classified is predicted by using various feature information of the web page to be classified, and the target web page category of the web page to be classified is further determined from the candidate web page categories, thereby improving the accuracy of web page classification.
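
The abstract does not state how the target category is chosen from the candidates. The Python sketch below assumes a simple majority vote and uses toy keyword rules as per-feature predictors. The function `classify_page`, the predictor rules, and the category names are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List

Predictor = Callable[[str], str]

# Hypothetical per-feature predictors: each maps one piece of feature
# information to a candidate web page category.
PREDICTORS: Dict[str, Predictor] = {
    "title": lambda text: "shopping" if "buy" in text.lower() else "other",
    "url": lambda text: "shopping" if "shop" in text.lower() else "other",
    "body": lambda text: "news" if "news" in text.lower() else "other",
}


def candidate_categories(features: Dict[str, str]) -> List[str]:
    """Predict one candidate category per piece of feature information."""
    return [PREDICTORS[name](value) for name, value in features.items()]


def classify_page(features: Dict[str, str]) -> str:
    """Pick the target category from the candidates (assumed: majority vote)."""
    candidates = candidate_categories(features)
    return Counter(candidates).most_common(1)[0][0]
```

With a title of "Buy cheap shoes online", a URL of "https://shop.example.com/shoes", and a body of "Latest news about shoes", the candidates are shopping, shopping, and news, so the vote selects shopping.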

    SPEECH TENDENCY CLASSIFICATION
    Invention publication

    Publication number: US20230377560A1

    Publication date: 2023-11-23

    Application number: US17747704

    Filing date: 2022-05-18

    Applicant: Lemon Inc.

    CPC classification number: G10L15/02 G10L15/04 G10L17/00 G10L25/90

    Abstract: Embodiments of the present disclosure relate to speech tendency classification. According to embodiments of the present disclosure, a method comprises extracting, from a speech segment, voiceprint information and at least one of volume information or speaking rate information; determining, based on the voiceprint information, first probability information indicating respective first probabilities of a plurality of tendency categories into which the speech segment is classified; determining, based on the at least one of the volume information or the speaking rate information, second probability information indicating respective second probabilities of the plurality of tendency categories into which the speech segment is classified; and determining, based at least in part on the first probability information and the second probability information, target probability information for the speech segment, the target probability information indicating respective target probabilities of the plurality of tendency categories into which the speech segment is classified.
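
The abstract says only that the target probabilities are based at least in part on the first and second probability information. One simple possibility, shown here as an assumption rather than the claimed method, is a weighted average of the two distributions followed by renormalization. The function name `fuse_probabilities` and the `voiceprint_weight` parameter are hypothetical.

```python
from typing import List, Sequence


def fuse_probabilities(
    first: Sequence[float],
    second: Sequence[float],
    voiceprint_weight: float = 0.5,
) -> List[float]:
    """Fuse two per-category probability lists into target probabilities.

    first:  probabilities derived from voiceprint information.
    second: probabilities derived from volume and/or speaking rate information.
    The weighted average used here is an assumed fusion rule.
    """
    if len(first) != len(second):
        raise ValueError("both inputs must cover the same tendency categories")
    if not 0.0 <= voiceprint_weight <= 1.0:
        raise ValueError("voiceprint_weight must be between 0 and 1")

    fused = [
        voiceprint_weight * p1 + (1.0 - voiceprint_weight) * p2
        for p1, p2 in zip(first, second)
    ]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused probabilities must have a positive sum")
    # Renormalize so the target probabilities sum to 1.
    return [p / total for p in fused]
```

With equal weights, inputs of `[0.7, 0.2, 0.1]` and `[0.5, 0.3, 0.2]` fuse to approximately `[0.6, 0.25, 0.15]`, so the first tendency category remains the most likely.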

    METHOD AND APPARATUS FOR KNOWLEDGE GRAPH CONSTRUCTION, STORAGE MEDIUM, AND ELECTRONIC DEVICE

    Publication number: US20240135196A1

    Publication date: 2024-04-25

    Application number: US18397227

    Filing date: 2023-12-27

    Applicant: Lemon Inc.

    CPC classification number: G06N5/02

    Abstract: The present disclosure relates to a method and apparatus for knowledge graph construction, a storage medium, and an electronic device. The method for knowledge graph construction comprises: identifying an entity concept from a title text of a target web page and at least one entity corresponding to the entity concept from a body text of the target web page; constructing a syntax parse tree of the title text based on syntax parse rules of a language to which the title text belongs, and determining, from the syntax parse tree, a modifier for modifying the entity concept; and generating a knowledge graph based on the entity concept, the modifier, and the at least one entity. Through the solution of the present disclosure, knowledge graphs with high accuracy and high recall rates are constructed without structured processing on target web pages.

    ATTRIBUTE AND RATING CO-EXTRACTION
    Invention publication

    Publication number: US20230342553A1

    Publication date: 2023-10-26

    Application number: US17727015

    Filing date: 2022-04-22

    Applicant: LEMON INC.

    CPC classification number: G06F40/30 G06F40/279 G06N3/0454

    Abstract: Embodiments of the present disclosure relate to attribute and rating co-extraction. According to embodiments of the present disclosure, a method is proposed. The method comprises: determining, by a first sub-network of a model, a first feature representation based on a first token contained in a text, the first feature representation indicating semantic information of the first token in the text; determining, by a second sub-network of the model, first attribute information associated with the first token based on the first feature representation, the first attribute information indicating a first attribute involved in the text; and determining, by a third sub-network of the model, first rating information associated with the first token based on the first feature representation, the first rating information indicating a rating related to the first attribute.
