Abstract:
A video demographics analysis system selects a training set of videos to use to correlate viewer demographics with video content data. The system extracts demographic data from viewer profiles related to videos in the training set and creates a set of demographic distributions; it also extracts video data from the videos in the training set. A machine learning process then correlates each viewer's demographics with the video data of the videos that viewer watched. Using the prediction model produced by the machine learning process, a new video about which there is no a priori knowledge can be associated with a predicted demographic distribution specifying probabilities of the video appealing to different types of people within a given demographic category, such as people of different ages within an age demographic category.
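The flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not name the machine learning process, so a k-nearest-neighbor average stands in for it here, and the age buckets, the two-element feature vectors, and the function names `distribution` and `predict` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical age buckets for the "age" demographic category.
AGE_BUCKETS = ["13-17", "18-24", "25-34", "35+"]

def distribution(viewer_buckets):
    """Turn the age buckets of a video's viewers into a probability distribution."""
    counts = Counter(viewer_buckets)
    total = sum(counts.values())
    return [counts[bucket] / total for bucket in AGE_BUCKETS]

def predict(features, training, k=2):
    """Average the distributions of the k training videos nearest in feature space.

    Assumption: k-nearest neighbors is a stand-in for the unspecified
    machine learning process in the abstract.
    """
    nearest = sorted(training, key=lambda item: math.dist(features, item[0]))[:k]
    return [
        sum(dist[i] for _, dist in nearest) / len(nearest)
        for i in range(len(AGE_BUCKETS))
    ]

# Training set: (hypothetical content features, observed demographic distribution).
training = [
    ([0.9, 0.1], distribution(["13-17", "13-17", "18-24", "13-17"])),
    ([0.8, 0.2], distribution(["18-24", "13-17"])),
    ([0.1, 0.9], distribution(["35+", "25-34", "35+", "35+"])),
]

# A new video with no viewing history gets a predicted distribution.
predicted = predict([0.85, 0.15], training, k=2)
```

Here `predicted` is a list of probabilities aligned with `AGE_BUCKETS`, so it can be read directly as the predicted appeal of the new video to each age group.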
Abstract:
In one implementation, a computer-implemented method includes receiving, at a server system, a request for an advertisement to provide to a first user of a social network, and determining, for each of a plurality of advertisements, a probability that the first user will select the advertisement based, at least in part, on previous propagations of the advertisement by one or more second users of the social network. The method can further include scoring, by the server system, the plurality of advertisements based upon the determined probabilities of selection by the first user and bids associated with the plurality of advertisements, and providing one or more of the plurality of advertisements for presentation to the first user based upon the scoring of the plurality of advertisements.
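The scoring step can be illustrated with a short sketch. The abstract says only that scoring is based on the selection probabilities and the bids; multiplying the two (an expected-value ranking) is an assumption made here for illustration, and the `Ad` class and `rank_ads` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Ad:
    ad_id: str
    p_select: float  # estimated probability that the first user selects the ad
    bid: float       # bid associated with the ad

def rank_ads(ads, top_k=1):
    """Return the top_k ads ordered by p_select * bid.

    Assumption: the product is the scoring rule; the abstract does not
    specify how probability and bid are combined.
    """
    scored = sorted(ads, key=lambda ad: ad.p_select * ad.bid, reverse=True)
    return scored[:top_k]

ads = [Ad("a", 0.02, 1.00), Ad("b", 0.05, 0.50), Ad("c", 0.01, 3.00)]
best = rank_ads(ads, top_k=2)
```

In this example a high bid can outweigh a low selection probability: ad "c" ranks first even though it is the least likely to be selected.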
Abstract:
Apparatus and method for summarizing an original large data set with a representative data set. The data elements in both the original data set and the representative data set have the same variables, but there are significantly fewer data elements in the representative data set. Each data element in the representative data set has an associated weight, representing the degree of compression. There are three steps for constructing the representative data set. First, the original data elements are partitioned into separate bins. Second, moments of the data elements partitioned in each bin are calculated. Finally, the representative data set is generated by finding data elements and associated weights having substantially the same moments as the original data set.
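The three steps can be sketched for one-dimensional data as follows. This simplified version keeps a single representative element per bin and matches only the count and the mean of each bin, which is an assumption; the abstract allows matching further moments. The function name `squash` and the fixed-width binning are hypothetical choices.

```python
from collections import defaultdict

def squash(points, bin_width):
    """Summarize 1-D points as (representative value, weight) pairs.

    Assumption: one representative per bin, matching only the count and
    the mean of that bin.
    """
    bins = defaultdict(list)
    for x in points:
        bins[int(x // bin_width)].append(x)   # step 1: partition into bins
    summary = []
    for key in sorted(bins):
        members = bins[key]
        mean = sum(members) / len(members)    # step 2: first moment of the bin
        summary.append((mean, len(members)))  # step 3: element plus its weight
    return summary

data = [0.1, 0.2, 0.3, 1.5, 1.7, 4.0]
summary = squash(data, bin_width=1.0)
```

The weights add up to the original number of elements, and the weighted mean of the summary equals the mean of the original data, so first-moment statistics computed on the smaller set are preserved.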
Abstract:
A method provides for mining information from large volumes of data regarding transactions. The method provides for inferring a behavioral characteristic of a party to the transaction based on a large volume of data concerning a multitude of parties. That inferred characteristic may be dynamic in nature.
Abstract:
A method and apparatus are disclosed for determining the limit that the quality of path performance data imposes on the accuracy of a learning machine for predicting path performance degradation. A plurality of learning machines of increasing capacity are trained using training data and tested using test data, and the training error rates and test error rates are calculated. The asymptotic error rates of the learning machines are calculated and compared. When the change in asymptotic error rate falls below a certain threshold, the asymptotic error rate estimates the accuracy limit for a learning machine for predicting path performance degradation. The accuracy limit is derived from insufficiencies in the path performance data and is applicable to any learning machine trained on and applied to the path performance data, regardless of the complexity of the learning machine or the size of the training data set.
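The comparison of asymptotic error rates can be sketched as a stopping rule. The abstract does not say how the asymptotic error rate is calculated; taking the midpoint of the training and test error rates is an assumption made here for illustration, and the function names and the `tolerance` parameter are hypothetical.

```python
def asymptotic_error(train_error, test_error):
    # Assumption: the midpoint of training and test error approximates the
    # error both would converge to with unlimited training data.
    return (train_error + test_error) / 2.0

def accuracy_limit(errors_by_capacity, tolerance):
    """Estimate the accuracy limit from machines of increasing capacity.

    errors_by_capacity: list of (train_error, test_error) pairs, ordered
    by increasing capacity. Returns the first asymptotic estimate whose
    change from the previous estimate is below tolerance, or None if the
    estimates never settle.
    """
    previous = None
    for train_error, test_error in errors_by_capacity:
        current = asymptotic_error(train_error, test_error)
        if previous is not None and abs(previous - current) < tolerance:
            return current
        previous = current
    return None

runs = [(0.30, 0.40), (0.20, 0.32), (0.18, 0.30), (0.17, 0.305)]
limit = accuracy_limit(runs, tolerance=0.01)
```

In this made-up series the estimate stops improving at the fourth machine, which suggests that added capacity no longer helps and the remaining error comes from the data.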
Abstract:
A computer-implemented method for defining a segment based on interaction proneness includes receiving online activity data that specifies instances of presentation for one or more content items, and instances of user interaction detected for any of the content items. The method includes training at least one predictive model on the online activity data, the predictive model trained to predict interaction proneness based on one or more characteristics associated with the instances of user interaction. The method includes identifying, using the predictive model, at least one of the characteristics as being associated with the interaction proneness. The method includes generating at least one segment definition that takes into account the identified characteristic.
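A small sketch can make the path from activity data to a segment definition concrete. The abstract does not specify the predictive model, so a per-characteristic interaction-rate lift stands in for it here; that substitution, the `min_lift` threshold, the `include_any` output format, and the function name `define_segment` are all assumptions.

```python
from collections import Counter

def define_segment(events, min_lift=1.5):
    """Build a segment definition from presentation and interaction data.

    events: list of (characteristics, interacted) pairs, one per
    presentation of a content item.
    Assumption: a simple lift score replaces the trained predictive model.
    """
    shown, clicked = Counter(), Counter()
    for characteristics, interacted in events:
        for c in characteristics:
            shown[c] += 1
            clicked[c] += int(interacted)
    base_rate = sum(int(i) for _, i in events) / len(events)
    # Keep characteristics whose interaction rate clearly exceeds the base rate.
    selected = sorted(
        c for c in shown
        if base_rate > 0 and (clicked[c] / shown[c]) / base_rate >= min_lift
    )
    return {"include_any": selected}

events = [
    ({"mobile", "evening"}, True),
    ({"mobile", "evening"}, True),
    ({"desktop", "evening"}, False),
    ({"desktop", "morning"}, False),
    ({"mobile", "evening"}, False),
    ({"desktop", "evening"}, False),
]
segment = define_segment(events)
```

With this toy data only "mobile" is associated with a markedly higher interaction rate, so the generated segment definition includes that characteristic alone.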
Abstract:
A method and apparatus for determining the limit on learning machine accuracy imposed by the quality of data. A plurality of learning machines of increasing capacity are trained using training data and tested using test data, and the training error rates and test error rates are calculated. The asymptotic error rates of the learning machines are calculated and compared. When the change in asymptotic error rate falls below a certain rate, the asymptotic error rate estimates the limit on learning machine accuracy imposed by the data.