Abstract:
A system for training a model to predict a sequence (e.g., a sequence of words) given a context is disclosed. The model can be trained to make these predictions using a combination of individual predictions compared to ground truth and sequences of predictions based on previous predictions, where the resulting sequence is compared to the ground-truth sequence. In particular, the model can initially be trained using the individual predictions. The model can then be further trained over the training data in multiple iterations, where each iteration includes two processes for each training element. In the first process, an initial part of the sequence is predicted, and the model parameters are updated after each prediction. In the second process, the entire remainder of the sequence is predicted and compared to the corresponding training sequence, and the model parameters are adjusted to encourage or discourage each prediction.
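A minimal sketch of this two-process training loop, assuming a toy bigram score table in place of a real model; the scoring rule, the greedy `predict` helper, and all names are illustrative assumptions, not the disclosed system:

```python
def train_mixer_style(sequences, vocab, epochs=3, prefix_len=2, lr=0.5):
    """Toy sketch: per-word teacher-forced updates on a sequence prefix
    (process 1), then free-running generation of the remainder compared
    to the reference sequence (process 2)."""
    # score[context][next] -- a toy bigram score table standing in for model parameters
    score = {c: {w: 0.0 for w in vocab} for c in vocab}

    def predict(ctx):
        # greedy prediction given the previous word
        return max(score[ctx], key=score[ctx].get)

    for _ in range(epochs):
        for seq in sequences:
            # Process 1: predict an initial part; update after each prediction
            for i in range(1, min(prefix_len + 1, len(seq))):
                ctx, truth = seq[i - 1], seq[i]
                guess = predict(ctx)
                score[ctx][truth] += lr          # encourage the true word
                if guess != truth:
                    score[ctx][guess] -= lr      # discourage the wrong guess
            # Process 2: free-run the remaining words from the model's own
            # predictions, then compare the whole tail to the reference
            ctx, generated = seq[min(prefix_len, len(seq) - 1)], []
            for i in range(prefix_len + 1, len(seq)):
                nxt = predict(ctx)
                generated.append((ctx, nxt, seq[i]))
                ctx = nxt
            for c, g, t in generated:
                score[c][g] += lr if g == t else -lr  # reward matches, penalize misses
    return score, predict
```

After a few passes over even a single training sequence, the table learns the sequence's transitions through both update paths.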
Abstract:
In one embodiment, a sequence of input words is received. Each of the input words is encoded as an indicator vector, wherein a sequence of the indicator vectors captures features of the sequence of input words. The sequence of the indicator vectors is then mapped to a distribution of a contextual probability of a first output word in a sequence of output words. For each subsequent output word, the sequence of the indicator vectors is encoded with a context, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous output words; and the encoded sequence of the indicator vectors and the context is mapped to the distribution of the contextual probability of the subsequent output word. Finally, a condensed summary is generated using a decoder by maximizing the contextual probability of each of the output words.
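The encode-then-decode loop above can be sketched with a toy score table standing in for trained network weights: `one_hot` builds the indicator vectors, and a fixed window of previous output words is slid forward one word per step. The `weights` table and all names are assumptions for illustration only.

```python
import math

def one_hot(word, vocab):
    # indicator vector: 1.0 at the word's index, 0.0 elsewhere
    return [1.0 if w == word else 0.0 for w in vocab]

def summarize(input_words, vocab, weights, window=2, max_len=3):
    """Toy greedy decoder: score each candidate output word against the
    input (via summed indicator vectors) and a fixed window of previously
    emitted words, then take the argmax of the softmax distribution."""
    # counts[i] = how often vocab[i] occurs in the input (sum of indicator vectors)
    counts = [sum(col) for col in zip(*[one_hot(w, vocab) for w in input_words])]
    summary, context = [], ["<s>"] * window    # fixed window of previous outputs
    for _ in range(max_len):
        scores = []
        for cand in vocab:
            s = sum(counts[i] * weights.get((vocab[i], cand), 0.0)
                    for i in range(len(vocab)))
            s += sum(weights.get((c, cand), 0.0) for c in context)
            scores.append(s)
        z = [math.exp(v) for v in scores]
        probs = [v / sum(z) for v in z]        # contextual probability distribution
        best = vocab[probs.index(max(probs))]  # maximize the contextual probability
        summary.append(best)
        context = context[1:] + [best]         # slide the window forward
    return summary
```

A real decoder would use beam search over learned weights; the greedy argmax here just makes the windowed conditioning concrete.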
Abstract:
In one embodiment, a method includes receiving a text query that includes n-grams. A vector representation of each n-gram is determined using a deep-learning model. A nonlinear combination of the vector representations of the n-grams is determined, and an embedding of the text query is determined based on the nonlinear combination. The embedding of the text query corresponds to a point in an embedding space, and the embedding space includes a plurality of points corresponding to a plurality of label embeddings. Each label embedding is based on a vector representation of a respective label determined using the deep-learning model. Label embeddings are identified as being relevant to the text query by applying a search algorithm to the embedding space. Points corresponding to the identified label embeddings are within a threshold distance of the point corresponding to the embedding of the text query in the embedding space.
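A rough sketch of this pipeline, substituting a hash-based stand-in for the learned n-gram vectors and a linear scan for the search algorithm; both are loudly assumed, since a real system would use a trained deep-learning model and an indexed nearest-neighbor search:

```python
import math, hashlib

def ngram_vector(ngram, dim=8):
    # deterministic stand-in for a learned n-gram vector (hash-based;
    # a real system would look these up in a trained model)
    h = hashlib.md5(ngram.encode()).digest()
    return [(b / 255.0) - 0.5 for b in h[:dim]]

def embed(ngrams, dim=8):
    # nonlinear combination: elementwise tanh of the summed n-gram vectors
    total = [0.0] * dim
    for g in ngrams:
        for i, v in enumerate(ngram_vector(g, dim)):
            total[i] += v
    return [math.tanh(v) for v in total]

def nearby_labels(query_ngrams, label_table, threshold=0.5, dim=8):
    """Linear-scan stand-in for the search algorithm: return labels whose
    embedding lies within `threshold` of the query embedding."""
    q = embed(query_ngrams, dim)
    out = []
    for label, ngrams in label_table.items():
        e = embed(ngrams, dim)
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, e)))
        if dist <= threshold:
            out.append(label)
    return out
```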
Abstract:
Systems, methods, and non-transitory computer-readable media can acquire video content for which video feature descriptors are to be determined. The video content can be processed based at least in part on a convolutional neural network including a set of two-dimensional convolutional layers and a set of three-dimensional convolutional layers. One or more outputs can be generated from the convolutional neural network. A plurality of video feature descriptors for the video content can be determined based at least in part on the one or more outputs from the convolutional neural network.
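The 2D-then-3D structure can be illustrated with fixed averaging kernels standing in for learned convolution weights; this is a toy sketch of the layer arrangement, not the disclosed network:

```python
def conv2d_frame(frame, k=2):
    # toy 2D convolution over one frame: a k-by-k averaging kernel
    h, w = len(frame), len(frame[0])
    return [[sum(frame[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
             for j in range(w - k + 1)] for i in range(h - k + 1)]

def conv3d_time(frames, t=2):
    # toy temporal (3D) convolution: average t consecutive 2D feature maps
    out = []
    h, w = len(frames[0]), len(frames[0][0])
    for s in range(len(frames) - t + 1):
        out.append([[sum(frames[s + dt][i][j] for dt in range(t)) / t
                     for j in range(w)] for i in range(h)])
    return out

def video_descriptors(video):
    """Sketch of the pipeline: per-frame 2D convolutions followed by a
    temporal 3D convolution; the flattened outputs stand in for the
    video feature descriptors. Kernels are fixed averages, not learned."""
    maps = [conv2d_frame(f) for f in video]   # 2D stage, frame by frame
    pooled = conv3d_time(maps)                # 3D stage, across time
    return [v for fmap in pooled for row in fmap for v in row]
```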
Abstract:
In one embodiment, a method includes accessing a first set of entities, with which a user has interacted, and a second set of entities in a social-networking system. A first set of vector representations of the first set of entities is determined using a deep-learning model. A target entity is selected from the first set of entities, and the vector representation of the target entity is removed from the first set. The remaining vector representations in the first set are combined to determine a vector representation of the user. A second set of vector representations of the second set of entities is determined using the deep-learning model. Similarity scores are computed between the user and each of the target entity and the entities in the second set of entities. Vector representations of entities in the second set of entities are updated based on the similarity scores using the deep-learning model.
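A toy sketch of the hold-one-out user representation and the similarity-driven update, assuming cosine similarity, averaging as the combination, and a fixed-step nudge as the update rule; all function names and the update rule itself are assumptions:

```python
import math

def cosine(u, v):
    # cosine similarity between two vectors (0.0 for a zero vector)
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def user_vector(entity_vectors, target):
    # hold out the target entity, average the rest to represent the user
    rest = [v for name, v in entity_vectors.items() if name != target]
    dim = len(next(iter(entity_vectors.values())))
    return [sum(v[i] for v in rest) / len(rest) for i in range(dim)]

def update_second_set(user, target_vec, second_set, lr=0.1):
    """Toy update sketch: nudge a second-set entity's vector away from
    the user whenever it scores higher than the held-out target."""
    target_score = cosine(user, target_vec)
    for name, vec in second_set.items():
        if cosine(user, vec) > target_score:
            second_set[name] = [x - lr * u for x, u in zip(vec, user)]
    return second_set
```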
Abstract:
In one embodiment, a method includes retrieving a first vector representation of a first entity, with which a user has interacted, and a second vector representation of a second entity, with which the user has not interacted. The first and second vector representations are determined using an initial deep-learning model. A first similarity score is computed between a vector representation of the user and the first vector representation, and a second similarity score is computed between the vector representation of the user and the second vector representation. Using the initial deep-learning model, the second vector representation is updated if the second similarity score is greater than the first similarity score. An updated deep-learning model is generated based on the initial deep-learning model and on the updated second vector representation.
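The pairwise rule above reduces to a few lines, assuming dot-product similarity and a fixed-step update; this is an illustrative sketch and the names are assumptions:

```python
def pairwise_update(user, pos, neg, lr=0.1):
    """Sketch of the pairwise comparison: if the un-interacted entity's
    vector (neg) outscores the interacted one (pos) against the user
    vector, push neg away from the user; return the (possibly) updated
    second vector and both similarity scores."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    s_pos, s_neg = dot(user, pos), dot(user, neg)
    if s_neg > s_pos:
        neg = [n - lr * u for n, u in zip(neg, user)]  # updated second vector
    return neg, (s_pos, s_neg)
```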
Abstract:
In one embodiment, a method includes receiving, from a client system, a text input comprising one or more n-grams, determining, using a deep-learning model, a vector representation of the text input based on the one or more n-grams, determining an embedding of the vector representation of the text input in a d-dimensional embedding space, identifying one or more labels based on, for each of the one or more labels, a respective similarity of an embedding of a vector representation of the label in the embedding space to the embedding of the vector representation of the text input, and sending, to the client system in response to the received text input, instructions for presenting a user interface comprising one or more of the identified labels as suggested labels.
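The suggestion flow can be sketched end to end, assuming cosine similarity over the embedding space and a caller-supplied `embed_fn` standing in for the deep-learning model; the top-k list is what the client UI would present as suggested labels:

```python
import math

def suggest_labels(text, label_embeddings, embed_fn, k=3):
    """Sketch of the flow: embed the input text, rank label embeddings
    by cosine similarity to it, and return the top k as suggestions.
    `embed_fn` and all names here are assumptions."""
    q = embed_fn(text)

    def cos(u, v):
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(x * x for x in v))
        return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

    # rank every label by similarity of its embedding to the text embedding
    ranked = sorted(label_embeddings.items(), key=lambda kv: cos(q, kv[1]),
                    reverse=True)
    return [label for label, _ in ranked[:k]]
```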