-
Publication No.: US20200319927A1
Publication date: 2020-10-08
Application No.: US16907637
Filing date: 2020-06-22
Applicant: Alibaba Group Holding Limited
Inventor: Jun Zhou , Xiaolong Li
Abstract: Evaluation results of a plurality of users are received from a plurality of data providers. The evaluation results are obtained by the plurality of data providers evaluating the plurality of users based on evaluation models of the plurality of data providers. A plurality of training samples is constructed by using the evaluation results. Each training sample includes a respective subset of the evaluation results corresponding to a same user of the plurality of users. A label for each training sample is generated based on an actual service execution status of the same user. A model is trained based on the plurality of training samples and the plurality of labels, including setting a plurality of variable coefficients, each variable coefficient specifying a contribution level of a corresponding data provider. Virtual resources are allocated to each data provider based on the plurality of variable coefficients.
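A minimal sketch, not taken from the patent, of the idea in the abstract above: a logistic-regression-style model gives each data provider's evaluation result its own variable coefficient, the coefficients are fit against labels derived from actual service execution, and virtual resources are then split in proportion to the learned coefficients. The function names and the proportional-allocation rule are assumptions for illustration.

```python
import numpy as np

def fit_provider_coefficients(scores, labels, lr=0.1, epochs=500):
    """Fit one coefficient per data provider with plain logistic regression.

    scores: (n_users, n_providers) evaluation results, one column per provider
    labels: (n_users,) 1 if the user's actual service execution was good, else 0
    """
    n_users, n_providers = scores.shape
    w = np.zeros(n_providers)          # one variable coefficient per provider
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(scores @ w + b)))   # predicted probability
        w -= lr * (scores.T @ (p - labels)) / n_users
        b -= lr * np.mean(p - labels)
    return w, b

def allocate_resources(coefficients, total_resources):
    """Split a pool of virtual resources in proportion to each provider's
    non-negative contribution coefficient (an assumed allocation rule)."""
    contrib = np.clip(coefficients, 0.0, None)
    if contrib.sum() == 0:
        return np.zeros_like(contrib)
    return total_resources * contrib / contrib.sum()

# toy usage: 3 providers score 5 users; labels come from actual execution status
scores = np.array([[0.9, 0.4, 0.7],
                   [0.2, 0.3, 0.1],
                   [0.8, 0.6, 0.9],
                   [0.1, 0.5, 0.2],
                   [0.7, 0.8, 0.6]])
labels = np.array([1, 0, 1, 0, 1])
w, b = fit_provider_coefficients(scores, labels)
print(allocate_resources(w, total_resources=100.0))
```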
-
Publication No.: US10691494B2
Publication date: 2020-06-23
Application No.: US16697913
Filing date: 2019-11-27
Applicant: Alibaba Group Holding Limited
Inventor: Jun Zhou , Xiaolong Li
Abstract: Evaluation results of a plurality of users are received from a plurality of data providers. The evaluation results are obtained by the plurality of data providers evaluating the plurality of users based on evaluation models of the plurality of data providers. A plurality of training samples is constructed by using the evaluation results. Each training sample includes a respective subset of the evaluation results corresponding to a same user of the plurality of users. A label for each training sample is generated based on an actual service execution status of the same user. A model is trained based on the plurality of training samples and the plurality of labels, including setting a plurality of variable coefficients, each variable coefficient specifying a contribution level of a corresponding data provider. Virtual resources are allocated to each data provider based on the plurality of variable coefficients.
-
Publication No.: US10430518B2
Publication date: 2019-10-01
Application No.: US15874725
Filing date: 2018-01-18
Applicant: Alibaba Group Holding Limited
Inventor: Shaosheng Cao , Xiaolong Li
Abstract: A word vector processing method is provided. Word segmentation is performed on a corpus to obtain words, and n-gram strokes corresponding to the words are determined. Each n-gram stroke represents n successive strokes of a corresponding word. Word vectors of the words and stroke vectors of the n-gram strokes corresponding to the words are initialized. After the word segmentation is performed and the n-gram strokes, the word vectors, and the stroke vectors are determined, the word vectors and the stroke vectors are trained.
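A rough sketch, under assumptions not stated in the patent, of how stroke n-grams could back word vectors: every run of n successive strokes of a word gets its own vector, and a word is represented by its word vector plus the vectors of its stroke n-grams. The integer stroke encoding, the vector dimensions, and names such as `StrokeWordVectors` are illustrative, and the (untrained) representation below stands in for the training step the abstract describes.

```python
import numpy as np

def stroke_ngrams(strokes, n_min=3, n_max=5):
    """Enumerate n-gram strokes: every run of n successive strokes of a word.
    `strokes` is the word's stroke sequence, encoded here as integer codes."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(strokes) - n + 1):
            grams.append(tuple(strokes[i:i + n]))
    return grams

class StrokeWordVectors:
    """Word vectors plus stroke-n-gram vectors, both randomly initialized."""

    def __init__(self, dim=50, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self.word_vecs = {}
        self.gram_vecs = {}

    def _vec(self, table, key):
        if key not in table:
            table[key] = (self.rng.random(self.dim) - 0.5) / self.dim
        return table[key]

    def word_vector(self, word, strokes):
        # a word's representation: its own vector plus its stroke-n-gram vectors
        v = self._vec(self.word_vecs, word).copy()
        for g in stroke_ngrams(strokes):
            v += self._vec(self.gram_vecs, g)
        return v

# toy usage: strokes encoded as small integers (1=horizontal, 2=vertical, ...)
model = StrokeWordVectors(dim=8)
print(model.word_vector("大人", strokes=[1, 3, 4, 3, 4]))
```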
-
Publication No.: US20190034658A1
Publication date: 2019-01-31
Application No.: US16047399
Filing date: 2018-07-27
Applicant: Alibaba Group Holding Limited
Inventor: Ling Xie , Xiaolong Li
Abstract: Encrypted user data is received at a service device from at least one user equipment; the user data is encrypted in a trusted zone of the at least one user equipment. The encrypted user data is then decrypted in a trusted zone of the service device by a first central processing unit (CPU) to obtain decrypted user data. A model is trained by using the decrypted user data to determine a training intermediate value and a training effective representative value, and a determination is made as to whether the training effective representative value satisfies a specified condition. If so, the trained model is generated based on a model parameter. Otherwise, the model parameter is iteratively adjusted and the model is iteratively trained based on the adjusted model parameter until the training effective representative value satisfies the specified condition.
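A toy sketch, assuming details the abstract leaves open: decryption inside the service device's trusted zone is stubbed out, the "training intermediate value" is read as the model's predictions, the "training effective representative value" is read as the drop in loss between rounds, and the model parameter is adjusted iteratively until that value satisfies the specified condition. All names and the choice of a squared-loss linear model are assumptions.

```python
import numpy as np

def decrypt_in_trusted_zone(encrypted_blob):
    """Placeholder for decryption inside the service device's trusted zone.
    In the real system this would run inside a TEE; here the blob is assumed
    to already be a NumPy array."""
    return encrypted_blob

def train_until_condition(encrypted_data, labels, tol=1e-4, lr=0.1, max_rounds=1000):
    """Iteratively adjust the model parameter until the 'effective
    representative value' (assumed here to be the drop in squared loss
    between rounds) satisfies the specified condition."""
    x = decrypt_in_trusted_zone(encrypted_data)
    w = np.zeros(x.shape[1])                   # model parameter
    prev_loss = np.inf
    for _ in range(max_rounds):
        pred = x @ w                           # training intermediate value
        loss = np.mean((pred - labels) ** 2)   # effective representative value
        if prev_loss - loss < tol:             # specified condition met
            break
        w -= lr * (x.T @ (pred - labels)) / len(labels)  # adjust the parameter
        prev_loss = loss
    return w

# toy usage on synthetic data standing in for decrypted user data
x = np.random.default_rng(0).random((20, 3))
y = x @ np.array([1.0, -2.0, 0.5])
print(train_until_condition(x, y))
```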
-
Publication No.: US10769383B2
Publication date: 2020-09-08
Application No.: US16743224
Filing date: 2020-01-15
Applicant: ALIBABA GROUP HOLDING LIMITED
Inventor: Shaosheng Cao , Xinxing Yang , Jun Zhou , Xiaolong Li
Abstract: Embodiments of the present application disclose a cluster-based word vector processing method, apparatus, and device. The solution includes: in a cluster having a server cluster and a worker computer cluster, each worker computer in the worker computer cluster separately reads some corpuses in parallel, extracts a word and context words of the word from the read corpuses, obtains corresponding word vectors from a server in the server cluster, and trains the corresponding word vectors; the server cluster then updates the word vectors of the same words that were stored before the training according to the training results of the respective worker computers with respect to the word vectors of those words.
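A single-process sketch, not the patent's implementation, of the server/worker split described above: workers read corpus shards, pull the word vectors they need from the server, train them locally, and the server merges the workers' trained copies of the same word. The merge-by-averaging rule and the trivial context-averaging "training" step are assumptions standing in for real skip-gram updates.

```python
import numpy as np

class VectorServer:
    """Holds the authoritative word vectors, hands copies to workers, and
    merges the workers' trained copies of the same word by averaging."""

    def __init__(self, dim=16, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self.vectors = {}

    def pull(self, words):
        for w in words:
            if w not in self.vectors:
                self.vectors[w] = (self.rng.random(self.dim) - 0.5) / self.dim
        return {w: self.vectors[w].copy() for w in words}

    def push(self, worker_results):
        # worker_results: list of {word: trained_vector} dicts, one per worker
        for word in self.vectors:
            trained = [r[word] for r in worker_results if word in r]
            if trained:
                self.vectors[word] = np.mean(trained, axis=0)

def worker_train(corpus_shard, server, window=2, lr=0.025):
    """Each worker extracts (word, context) pairs from its shard and nudges
    word vectors toward their context vectors (a stand-in for real training)."""
    words = [w for sent in corpus_shard for w in sent]
    local = server.pull(set(words))
    for sent in corpus_shard:
        for i, w in enumerate(sent):
            for c in sent[max(0, i - window): i + window + 1]:
                if c != w:
                    local[w] += lr * (local[c] - local[w])
    return local

# toy usage: two shards trained "in parallel", then merged on the server
server = VectorServer()
shards = [[["machine", "learning", "model"]], [["machine", "translation", "model"]]]
server.push([worker_train(s, server) for s in shards])
print(server.vectors["machine"][:4])
```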
-
Publication No.: US20190042763A1
Publication date: 2019-02-07
Application No.: US16053606
Filing date: 2018-08-02
Applicant: Alibaba Group Holding Limited
Inventor: Peilin Zhao , Jun Zhou , Xiaolong Li , Longfei Li
Abstract: Techniques for data sharing between a data miner and a data provider are provided. A set of public parameters is downloaded from the data miner. The public parameters are data miner parameters associated with a feature set of training sample data. A set of private parameters in the data provider can be replaced with the set of public parameters. The private parameters are data provider parameters associated with the feature set of the training sample data. The private parameters are updated, based on a model parameter update algorithm associated with the data provider, to provide a set of update results. The update results are uploaded to the data miner.
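A hedged sketch of one round of the exchange described above: the provider downloads the miner's public parameters, overwrites its private parameters with them, runs its own parameter-update rule on its local training samples, and uploads the update results. The gradient-descent update and the averaging aggregation on the miner's side are assumptions; the abstract does not specify either.

```python
import numpy as np

class DataMiner:
    """Holds the public parameters and aggregates uploaded update results."""

    def __init__(self, n_features):
        self.public_params = np.zeros(n_features)

    def aggregate(self, uploaded_results):
        # simple average of the providers' update results (an assumption)
        self.public_params = np.mean(uploaded_results, axis=0)

class DataProvider:
    """Keeps its training samples private and only exchanges parameters."""

    def __init__(self, features, labels):
        self.x, self.y = features, labels
        self.private_params = np.zeros(features.shape[1])

    def local_round(self, public_params, lr=0.1, steps=10):
        # 1. replace the private parameters with the downloaded public ones
        self.private_params = public_params.copy()
        # 2. update them with the provider's own parameter-update rule
        for _ in range(steps):
            pred = self.x @ self.private_params
            grad = self.x.T @ (pred - self.y) / len(self.y)
            self.private_params -= lr * grad
        # 3. return the update results for upload to the miner
        return self.private_params

# toy usage: three providers share the same underlying relationship
rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])
providers = []
for _ in range(3):
    x = rng.random((30, 2))
    providers.append(DataProvider(x, x @ true_w))
miner = DataMiner(n_features=2)
for _ in range(5):
    miner.aggregate([p.local_round(miner.public_params) for p in providers])
print(miner.public_params)
```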
-
Publication No.: US10748524B2
Publication date: 2020-08-18
Application No.: US16774422
Filing date: 2020-01-28
Applicant: Alibaba Group Holding Limited
Inventor: Zhiming Wang , Jun Zhou , Xiaolong Li
Abstract: A speech wakeup method, apparatus, and electronic device are disclosed in embodiments of this specification. The method includes: inputting speech data to a speech wakeup model trained with general speech data; and outputting, by the speech wakeup model, a result for determining whether to execute speech wakeup, wherein the speech wakeup model includes a Deep Neural Network (DNN) and a Connectionist Temporal Classifier (CTC).
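A minimal PyTorch sketch, assuming an architecture the abstract does not spell out: a small feed-forward DNN maps per-frame acoustic features to label scores (keyword sub-word units plus a CTC blank), and the frame sequence is scored with CTC loss against the wakeup word's label sequence. Feature and label sizes are made up for illustration.

```python
import torch
import torch.nn as nn

class WakeupDNN(nn.Module):
    """Toy per-frame DNN: acoustic features in, per-frame label scores out.
    Output units: CTC blank plus keyword sub-word labels (sizes assumed)."""

    def __init__(self, feat_dim=40, hidden=128, n_labels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_labels + 1),   # +1 for the CTC blank
        )

    def forward(self, frames):                  # frames: (T, N, feat_dim)
        return self.net(frames).log_softmax(dim=-1)

model = WakeupDNN()
ctc = nn.CTCLoss(blank=0)

frames = torch.randn(100, 1, 40)                # 100 frames, batch of 1
targets = torch.tensor([1, 2, 3, 4])            # label ids of the wakeup word
log_probs = model(frames)                       # (T, N, n_labels + 1)
loss = ctc(log_probs, targets.unsqueeze(0),
           input_lengths=torch.tensor([100]),
           target_lengths=torch.tensor([4]))
loss.backward()

# at inference, one assumed rule: wake up if a greedy decode of the per-frame
# argmax (with blanks and repeats collapsed) matches the keyword's labels
print(float(loss))
```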
-
Publication No.: US10776334B2
Publication date: 2020-09-15
Application No.: US16736673
Filing date: 2020-01-07
Applicant: ALIBABA GROUP HOLDING LIMITED
Inventor: Shaosheng Cao , Xinxing Yang , Jun Zhou , Xiaolong Li
IPC: G06F16/00 , G06F16/22 , G06F16/27 , G06F16/28 , G06F16/906
Abstract: Embodiments of the present specification disclose a random walking method and a cluster-based random walking method, apparatus, and device. A solution includes: obtaining information about each node included in graph data; generating, according to the information about each node, an index vector reflecting a degree value of each node; then generating an element vector reflecting identifiers of adjacent nodes of each node; and generating a random sequence according to the index vector and the element vector to implement random walks in the graph data. The solution is applicable to both clusters and individual machines.
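Reading the "index vector" as degree-based offsets per node and the "element vector" as the concatenated identifiers of each node's adjacent nodes (essentially CSR adjacency, which is an assumption), here is a single-machine sketch of generating random walk sequences from those two vectors.

```python
import numpy as np

def build_index_and_elements(edges, n_nodes):
    """Index vector: degree-based offsets per node (CSR-style).
    Element vector: identifiers of each node's adjacent nodes, concatenated."""
    degree = np.zeros(n_nodes, dtype=int)
    for u, _ in edges:
        degree[u] += 1
    index = np.concatenate(([0], np.cumsum(degree)))   # offsets from degrees
    element = np.empty(index[-1], dtype=int)
    cursor = index[:-1].copy()
    for u, v in edges:
        element[cursor[u]] = v
        cursor[u] += 1
    return index, element

def random_walk(index, element, start, length, rng):
    """Generate one random sequence by repeatedly jumping to a uniformly
    chosen adjacent node, using only the index and element vectors."""
    seq = [start]
    node = start
    for _ in range(length - 1):
        lo, hi = index[node], index[node + 1]
        if lo == hi:                      # node without neighbors: stop
            break
        node = element[rng.integers(lo, hi)]
        seq.append(int(node))
    return seq

# toy usage on a 4-node path graph (each undirected edge listed both ways)
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
index, element = build_index_and_elements(edges, n_nodes=4)
print(random_walk(index, element, start=0, length=6, rng=np.random.default_rng(0)))
```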
-
Publication No.: US10747959B2
Publication date: 2020-08-18
Application No.: US16778995
Filing date: 2020-01-31
Applicant: Alibaba Group Holding Limited
Inventor: Xiaofu Chang , Linlin Chao , Peng Xu , Xiaolong Li
Abstract: A dialog generation method includes: training a sequence-to-sequence (seq2seq) based dialog model using a loss function that includes topic range constraint information; and generating a dialog using the trained dialog model. By introducing topic range constraint information into the loss function during model training, the method helps prevent the trained model from producing low-quality, meaningless replies.
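The abstract only says the loss includes topic range constraint information; one plausible reading, sketched below in PyTorch, adds to the usual cross-entropy a penalty on the probability mass the decoder assigns to tokens outside an allowed topic vocabulary. The penalty form and the weight `alpha` are assumptions, not the patent's formula.

```python
import torch
import torch.nn.functional as F

def topic_constrained_loss(logits, targets, topic_token_ids, alpha=0.5):
    """Cross-entropy on the reply tokens plus a penalty on probability mass
    assigned to tokens outside the allowed topic range.

    logits:  (batch, seq_len, vocab) decoder outputs of a seq2seq model
    targets: (batch, seq_len) reference reply token ids
    topic_token_ids: 1-D tensor of vocabulary ids considered on-topic
    """
    vocab = logits.size(-1)
    ce = F.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))

    probs = logits.softmax(dim=-1)                          # (B, T, V)
    on_topic = torch.zeros(vocab, dtype=torch.bool)
    on_topic[topic_token_ids] = True
    off_topic_mass = (probs * (~on_topic).float()).sum(dim=-1)   # (B, T)
    penalty = off_topic_mass.mean()

    return ce + alpha * penalty

# toy usage with a vocabulary of 10 tokens, ids 0-4 considered on-topic
logits = torch.randn(2, 5, 10, requires_grad=True)
targets = torch.randint(0, 5, (2, 5))
loss = topic_constrained_loss(logits, targets, topic_token_ids=torch.arange(5))
loss.backward()
print(float(loss))
```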
-
Publication No.: US20200097329A1
Publication date: 2020-03-26
Application No.: US16697913
Filing date: 2019-11-27
Applicant: Alibaba Group Holding Limited
Inventor: Jun Zhou , Xiaolong Li
Abstract: Evaluation results of a plurality of users are received from a plurality of data providers. The evaluation results are obtained by the plurality of data providers evaluating the plurality of users based on evaluation models of the plurality of data providers. A plurality of training samples is constructed by using the evaluation results. Each training sample includes a respective subset of the evaluation results corresponding to a same user of the plurality of users. A label for each training sample is generated based on an actual service execution status of the same user. A model is trained based on the plurality of training samples and the plurality of labels, including setting a plurality of variable coefficients, each variable coefficient specifying a contribution level of a corresponding data provider. Virtual resources are allocated to each data provider based on the plurality of variable coefficients.