Adaptive sampling scheme for imbalanced large scale data

    Publication No.: US10346861B2

    Publication Date: 2019-07-09

    Application No.: US14933254

    Application Date: 2015-11-05

    Applicant: ADOBE INC.

    Abstract: Embodiments of the present invention relate to providing business customers with predictive capabilities, such as identifying valuable customers or estimating the likelihood that a product will be purchased. An adaptive sampling scheme generates sample data points from large-scale data that is imbalanced (for example, digital website traffic with hundreds of millions of visitors, of whom only a small portion are of interest). In embodiments, a stream of sample data points is received. Positive samples are added to a positive list until the desired number of positive samples is reached, and negative samples are added to a negative list until the desired number of negative samples is reached. The positive list and the negative list can then be combined, shuffled, and fed into a prediction model.
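
    Illustrative sketch: The list-filling procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `adaptive_sample`, the `(features, label)` tuple layout, the convention that label 1 marks a positive sample, and the early exit once both quotas are met are all assumptions made for the example.

```python
import random
from typing import Any, Iterable, List, Optional, Tuple

# Assumed sample layout: (features, label), where label 1 = positive, 0 = negative.
Sample = Tuple[Any, int]


def adaptive_sample(
    stream: Iterable[Sample],
    n_pos: int,
    n_neg: int,
    seed: Optional[int] = None,
) -> List[Sample]:
    """Collect up to n_pos positives and n_neg negatives from a stream,
    then combine and shuffle them (hypothetical helper)."""
    positives: List[Sample] = []
    negatives: List[Sample] = []

    for features, label in stream:
        if label == 1:
            # Keep the positive only while the positive quota is unfilled.
            if len(positives) < n_pos:
                positives.append((features, label))
        elif len(negatives) < n_neg:
            # Keep the negative only while the negative quota is unfilled.
            negatives.append((features, label))

        # Assumption: stop reading the stream once both quotas are met.
        if len(positives) >= n_pos and len(negatives) >= n_neg:
            break

    combined = positives + negatives
    random.Random(seed).shuffle(combined)  # seeded for reproducibility
    return combined


if __name__ == "__main__":
    # Synthetic imbalanced stream: roughly 1 positive per 100 visitors.
    visitors = (({"visitor_id": i}, 1 if i % 100 == 99 else 0) for i in range(10_000))
    batch = adaptive_sample(visitors, n_pos=3, n_neg=5, seed=0)
    print(len(batch), sum(label for _, label in batch))
```

    In this sketch the returned list would be passed to whatever prediction model is being trained. Because negatives dominate the stream, the negative list fills almost immediately, and the loop keeps reading only to find the remaining positives.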
