Training method and training system for machine learning system

    Publication No.: US11257005B2

    Publication Date: 2022-02-22

    Application No.: US16119585

    Application Date: 2018-08-31

    Inventor: Jun Zhou

    Abstract: A training method and a training system for a machine learning system are provided. The method includes allocating training data to a plurality of working machines; dividing the training data allocated to each working machine into a plurality of data pieces; obtaining a local weight and a local loss function value calculated by each working machine based on each data piece; aggregating the local weights and the local loss function values calculated by each working machine based on each data piece to obtain a current weight and a current loss function value; performing model abnormality detection using the current weight and/or the current loss function value; inputting the weight and the loss function value of the previous aggregation to the machine learning system for training in response to the result of the model abnormality detection being a first type of abnormality; and, in response to the result of the model abnormality detection being a second type of abnormality, modifying the current weight and/or the current loss function value to fall within a first threshold and inputting the modified value or values to the machine learning system for training.
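
    The aggregation and abnormality handling described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the scalar weight, the averaging rule, the names `aggregate` and `check_and_repair`, and the concrete abnormality tests (non-finite value as the first type, out-of-range value as the second type) are all assumptions made for the example.

```python
import math

def aggregate(local_results):
    """Average the (weight, loss) pairs reported for each data piece."""
    n = len(local_results)
    weight = sum(w for w, _ in local_results) / n
    loss = sum(l for _, l in local_results) / n
    return weight, loss

def clip(value, threshold):
    """Pull a value back into the interval [-threshold, threshold]."""
    return max(-threshold, min(threshold, value))

def check_and_repair(current, previous, threshold=10.0):
    """Return the (weight, loss) pair to feed into the next training round.

    Assumed rules (not taken from the source): a non-finite value counts as
    the first type of abnormality, a finite value outside the threshold
    counts as the second type.
    """
    weight, loss = current
    if not (math.isfinite(weight) and math.isfinite(loss)):
        return previous  # first type: reuse the previous aggregation
    return clip(weight, threshold), clip(loss, threshold)  # second type, or no-op

previous = aggregate([(1.0, 0.5), (3.0, 1.5)])
print(check_and_repair((float("nan"), 1.0), previous))
print(check_and_repair((25.0, 3.0), previous))
```

    Values that are already within the threshold pass through unchanged, so the same function covers the normal case.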

    Cluster-based word vector processing method, device, and apparatus

    Publication No.: US10769383B2

    Publication Date: 2020-09-08

    Application No.: US16743224

    Application Date: 2020-01-15

    Abstract: Embodiments of the present application disclose a cluster-based word vector processing method, apparatus, and device. The solution operates in a cluster that includes a server cluster and a worker computer cluster. Each worker computer in the worker computer cluster separately reads some corpuses in parallel, extracts a word and the context words of the word from the read corpuses, obtains the corresponding word vectors from a server in the server cluster, and trains the corresponding word vectors. The server cluster then updates the word vectors of the same words, as stored before the training, according to the training results of one or more respective worker computers for those words.
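
    A toy, single-process sketch of the worker/server flow in this abstract is given below. Everything concrete in it is an assumption for illustration: the class `Server`, the functions `extract_pairs` and `worker_train`, the window size, the placeholder update that nudges a word vector toward its context vector, and the choice to average the results of several workers for the same word.

```python
def extract_pairs(tokens, window=1):
    """Extract (word, context word) pairs from a tokenized corpus shard."""
    pairs = []
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs += [(word, tokens[j]) for j in range(lo, hi) if j != i]
    return pairs

class Server:
    """Stands in for the server cluster that stores the word vectors."""

    def __init__(self, vectors):
        self.vectors = {w: list(v) for w, v in vectors.items()}

    def pull(self, words):
        """Hand out copies of the stored vectors for the requested words."""
        return {w: list(self.vectors[w]) for w in words}

    def push(self, worker_results):
        """Overwrite stored vectors with the average of the workers' results
        for the same word (averaging is an assumed update rule)."""
        for word in {w for result in worker_results for w in result}:
            trained = [r[word] for r in worker_results if word in r]
            self.vectors[word] = [sum(c) / len(trained) for c in zip(*trained)]

def worker_train(server, tokens, lr=0.5):
    """Pull the needed vectors, train them locally, return the results."""
    pairs = extract_pairs(tokens)
    local = server.pull({w for pair in pairs for w in pair})
    for word, ctx in pairs:
        # Placeholder update (assumption): move the word toward its context.
        local[word] = [a + lr * (b - a) for a, b in zip(local[word], local[ctx])]
    return local

server = Server({"a": [0.0, 0.0], "b": [2.0, 2.0], "c": [4.0, 4.0]})
results = [worker_train(server, ["a", "b"]), worker_train(server, ["b", "c"])]
server.push(results)
```

    A real deployment would replace the placeholder update with an actual word-embedding objective and run the workers in parallel on separate machines.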

    SECRET SHARING WITH A TRUSTED INITIALIZER
    Invention application

    Publication No.: US20200126143A1

    Publication Date: 2020-04-23

    Application No.: US16390057

    Application Date: 2019-04-22

    Abstract: An item rating and recommendation platform identifies rating data comprising respective ratings of multiple items with respect to multiple users, identifies user-feature data comprising multiple user features contributing to the respective ratings of the multiple items with respect to the multiple users, and receives, from a social network platform via a secret sharing scheme with a trusted initializer, manipulated social network data computed based on social network data and first input data from the trusted initializer. The social network data indicate social relationships between any two of the multiple users. In the secret sharing scheme with the trusted initializer, the social network platform shares with the item rating and recommendation platform the manipulated social network data without disclosing the social network data. The item rating and recommendation platform updates the user-feature data based on the rating data and the manipulated social network data.
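
    The abstract does not spell out the secret sharing protocol. As a hedged illustration of the general idea, the sketch below implements a well-known commodity-based dot-product protocol in which a trusted initializer deals correlated random values and each party only ever sends masked data. The modulus `P`, the function names, and the mapping of the vectors `x` and `y` to the two platforms are assumptions, and this is not claimed to be the scheme in the application.

```python
import random

P = 2**61 - 1  # prime modulus for all arithmetic (assumed)

def dot(a, b):
    """Dot product modulo P."""
    return sum(x * y for x, y in zip(a, b)) % P

def trusted_initializer(n, rng):
    """Deal masks Ra, Rb and scalars ra, rb with ra + rb = Ra . Rb (mod P)."""
    Ra = [rng.randrange(P) for _ in range(n)]
    Rb = [rng.randrange(P) for _ in range(n)]
    ra = rng.randrange(P)
    rb = (dot(Ra, Rb) - ra) % P
    return (Ra, ra), (Rb, rb)

def shared_dot(x, y, rng):
    """Return additive shares (u, v) with u + v = x . y (mod P).

    x is held by party A (say, the rating platform) and y by party B (say,
    the social network platform); neither raw vector is sent to the other.
    """
    (Ra, ra), (Rb, rb) = trusted_initializer(len(x), rng)
    x_masked = [(xi + r) % P for xi, r in zip(x, Ra)]  # A sends this to B
    y_masked = [(yi + r) % P for yi, r in zip(y, Rb)]  # B sends this to A
    v = rng.randrange(P)                               # B keeps v as its share
    t = (dot(x_masked, y) + rb - v) % P                # B sends t to A
    u = (t - dot(Ra, y_masked) + ra) % P               # A keeps u as its share
    return u, v

u, v = shared_dot([1, 2, 3], [4, 5, 6], random.Random(0))
print((u + v) % P)  # the shares recombine to the true dot product
```

    Expanding `u` shows that the mask terms cancel because `ra + rb` equals the dot product of `Ra` and `Rb`, which is exactly the correlation the trusted initializer provides.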

    RECOMMENDATION SYSTEM CONSTRUCTION METHOD AND APPARATUS

    Publication No.: US20190272472A1

    Publication Date: 2019-09-05

    Application No.: US16290208

    Application Date: 2019-03-01

    Abstract: A client device determines a local user gradient value based on a current user preference vector and a local item gradient value based on a current item feature vector. The client device updates a user preference vector by using the local user gradient value and updates an item feature vector by using the local item gradient value. The client device determines a neighboring client device based on a predetermined adjacency relationship. The local item gradient value is sent by the client device to the neighboring client device. The client device receives a neighboring item gradient value sent by the neighboring client device. The client device updates the item feature vector by using the neighboring item gradient value. In response to the client device determining that a predetermined iteration stop condition is satisfied, the client device outputs the user preference vector and the item feature vector.
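
    The client-side loop in this abstract resembles decentralized matrix factorization. A minimal sketch under stated assumptions follows: the squared-error loss, the learning rate, the fixed number of rounds used as the stop condition, and the names `Client` and `train` are all illustrative choices that do not come from the source.

```python
class Client:
    def __init__(self, user_vec, item_vecs, ratings, lr=0.1):
        self.user_vec = list(user_vec)  # user preference vector, never shared
        self.item_vecs = {i: list(v) for i, v in item_vecs.items()}
        self.ratings = ratings          # {item_id: rating}, stays on the device
        self.lr = lr

    def local_step(self):
        """One gradient step on local ratings; returns the item gradients."""
        user_grad = [0.0] * len(self.user_vec)
        item_grads = {}
        for item, rating in self.ratings.items():
            vec = self.item_vecs[item]
            err = sum(a * b for a, b in zip(self.user_vec, vec)) - rating
            user_grad = [g + err * b for g, b in zip(user_grad, vec)]
            item_grads[item] = [err * a for a in self.user_vec]
        self.user_vec = [u - self.lr * g for u, g in zip(self.user_vec, user_grad)]
        self.apply_item_grads(item_grads)
        return item_grads

    def apply_item_grads(self, grads):
        """Update item feature vectors with local or neighboring gradients."""
        for item, grad in grads.items():
            if item in self.item_vecs:
                self.item_vecs[item] = [
                    v - self.lr * g for v, g in zip(self.item_vecs[item], grad)
                ]

def train(clients, neighbors, rounds=50):
    """neighbors maps a client id to the ids it sends item gradients to."""
    for _ in range(rounds):  # assumed stop condition: fixed number of rounds
        sent = {cid: c.local_step() for cid, c in clients.items()}
        for cid, grads in sent.items():
            for nid in neighbors[cid]:
                clients[nid].apply_item_grads(grads)
    return {cid: (c.user_vec, c.item_vecs) for cid, c in clients.items()}
```

    Only item gradients cross device boundaries in this sketch; the ratings and the user preference vector stay local.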

    DETERMINING COMPUTER-EXECUTED ENSEMBLE MODEL
    Invention application

    Publication No.: US20200349416A1

    Publication Date: 2020-11-05

    Application No.: US16812105

    Application Date: 2020-03-06

    Abstract: Implementations of the present specification provide a method for determining a computer-executed ensemble model. The method includes: obtaining a current ensemble model and a plurality of untrained candidate submodels; integrating each of the plurality of candidate submodels into the current ensemble model to obtain a plurality of first candidate ensemble models; training at least the plurality of first candidate ensemble models to obtain a plurality of second candidate ensemble models after this training; performing performance evaluation on each of the plurality of second candidate ensemble models to obtain corresponding performance evaluation results; determining, based on the performance evaluation results, an optimal candidate ensemble model with optimal performance from the plurality of second candidate ensemble models; and updating the current ensemble model with the optimal candidate ensemble model if the performance of the optimal candidate ensemble model satisfies a predetermined condition.
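
    One round of the selection procedure in this abstract can be sketched as a greedy step. The function name `grow_ensemble`, the list representation of an ensemble, the caller-supplied `train` and `evaluate` callables, and the "improves on the current score by more than `min_gain`" condition are assumptions for illustration only.

```python
def grow_ensemble(current, candidates, train, evaluate, min_gain=0.0):
    """Try adding each candidate submodel and keep the best result if it helps.

    current    -- list of submodels forming the current ensemble
    candidates -- untrained candidate submodels
    train      -- callable: ensemble -> trained ensemble
    evaluate   -- callable: ensemble -> score (higher is better)
    """
    first = [current + [c] for c in candidates]  # first candidate ensembles
    second = [train(e) for e in first]           # second candidate ensembles
    scored = [(evaluate(e), e) for e in second]
    best_score, best = max(scored, key=lambda pair: pair[0])
    if best_score - evaluate(current) > min_gain:  # assumed condition
        return best
    return current

# Toy usage: "submodels" are constants, the ensemble predicts their mean,
# and training is a no-op placeholder.
target = 5.0
score = lambda e: -((sum(e) / len(e)) - target) ** 2
print(grow_ensemble([4.0], [6.0, 10.0, 1.0], train=lambda e: e, evaluate=score))
```

    Calling the function repeatedly until it returns the ensemble unchanged gives a simple stopping rule for the overall search.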

    GBDT MODEL FEATURE INTERPRETATION METHOD AND APPARATUS

    Publication No.: US20200293924A1

    Publication Date: 2020-09-17

    Application No.: US16889695

    Application Date: 2020-06-01

    Abstract: Implementations of the present specification disclose methods, devices, and apparatuses for determining a feature interpretation of a predicted label value of a user generated by a GBDT model. In one aspect, the method includes separately obtaining, from each of a predetermined quantity of decision trees ranked among top decision trees, a leaf node and a score of the leaf node; determining a respective prediction path of each leaf node; obtaining, for each parent node on each prediction path, a split feature and a score of the parent node; determining, for each child node on each prediction path, a feature corresponding to the child node and a local increment of the feature on the child node; obtaining a collection of features respectively corresponding to the child nodes; and obtaining a respective measure of relevance between the feature corresponding to the at least one child node and the predicted label value.
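
    A rough sketch of path-based feature attribution in the spirit of this abstract is shown below. It assumes that the feature corresponding to a child node is the split feature of its parent and that the local increment is the child's score minus the parent's score; both rules, the path representation, and the name `feature_contributions` are assumptions rather than details confirmed by the source.

```python
from collections import defaultdict

def feature_contributions(paths):
    """Sum local increments per feature over the prediction paths.

    paths -- one prediction path per decision tree, each a list of
             (split_feature, score) nodes from root to leaf; the leaf's
             split_feature is None.
    """
    totals = defaultdict(float)
    for path in paths:
        for parent, child in zip(path, path[1:]):
            feature, parent_score = parent
            _, child_score = child
            # Assumed rule: credit the parent's split feature with the change.
            totals[feature] += child_score - parent_score
    return dict(totals)

paths = [
    [("age", 0.0), ("income", 0.5), (None, 0.75)],
    [("income", 0.0), (None, -0.25)],
]
print(feature_contributions(paths))
```

    The per-feature totals can then be ranked to indicate which features pushed the predicted label value up or down for a given user.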

    SECRET SHARING WITH NO TRUSTED INITIALIZER
    Invention application

    Publication No.: US20200125745A1

    Publication Date: 2020-04-23

    Application No.: US16390147

    Application Date: 2019-04-22

    Abstract: An item rating and recommendation platform identifies rating data including respective ratings of multiple items with respect to multiple users; identifies user-feature data including user features contributing to the respective ratings of the multiple items with respect to the multiple users; and receives, from a social network platform via a secret sharing scheme without a trusted initializer, manipulated social network data computed based on social network data and a first number of random variables. The social network data indicate social relationships between any two of the multiple users. In the secret sharing scheme without the trusted initializer, the social network platform shares the manipulated social network data with the item rating and recommendation platform without disclosing the social network data. The item rating and recommendation platform updates the user-feature data based on the rating data and the manipulated social network data.

    METHOD AND DEVICE FOR PROCESSING SHORT LINK, AND SHORT LINK SERVER

    Publication No.: US20180307774A1

    Publication Date: 2018-10-25

    Application No.: US16019897

    Application Date: 2018-06-27

    Inventor: Jun Zhou

    Abstract: Techniques for navigating webpages requested through short links are provided. In some implementations, a short link uniform resource locator (URL) is received, the short link URL is processed to extract a simplified short link and an address code, and a determination is made as to whether the simplified short link is associated with a long link URL representing an address of a webpage. In response to determining that the simplified short link is associated with a long link URL, the associated long link URL is provided. In response to determining that the simplified short link is not associated with a long link URL, a common long link URL associated with the address code is provided.
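
    The lookup-with-fallback logic in this abstract can be sketched as follows. The URL layout (an address code stored in the last two characters of the path), the constant `ADDRESS_CODE_LEN`, the function names, and the example hosts are hypothetical; the abstract does not say how the address code is encoded.

```python
from urllib.parse import urlsplit

ADDRESS_CODE_LEN = 2  # assumed: the address code is the last 2 path characters

def split_short_link(url):
    """Split a short link URL into (simplified short link, address code)."""
    parts = urlsplit(url)
    token = parts.path.lstrip("/")
    simplified = f"{parts.scheme}://{parts.netloc}/{token[:-ADDRESS_CODE_LEN]}"
    return simplified, token[-ADDRESS_CODE_LEN:]

def resolve(url, long_links, common_links):
    """Return the mapped long link, or the common long link for the code."""
    simplified, code = split_short_link(url)
    if simplified in long_links:
        return long_links[simplified]
    return common_links[code]

long_links = {"https://s.example/abc12": "https://shop.example/item/42"}
common_links = {"cn": "https://shop.example/cn/home"}
print(resolve("https://s.example/abc12cn", long_links, common_links))
print(resolve("https://s.example/zzzzzcn", long_links, common_links))
```

    The second call has no matching long link, so it falls back to the common page associated with the address code.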
