-
Publication No.: US10657332B2
Publication date: 2020-05-19
Application No.: US15850382
Filing date: 2017-12-21
Applicant: Facebook, Inc.
Inventor: Ying Zhang, Reshef Shilon, Jing Zheng
IPC: G06F40/49, G06F16/35, G06F40/30, G06F40/44, G06F40/58, G06F40/216, G06F40/284
Abstract: Exemplary embodiments relate to techniques to classify or detect the intent of content written in a language for which a classifier does not exist. These techniques involve building a code-switching corpus via machine translation, generating a universal embedding for words in the code-switching corpus, training a classifier on the universal embeddings to generate an embedding mapping/table; accessing new content written in a language for which a specific classifier may not exist, and mapping entries in the embedding mapping/table to the universal embeddings. Using these techniques, a classifier can be applied to the universal embedding without needing to be trained on a particular language. Exemplary embodiments may be applied to recognize similarities in two content items, make recommendations, find similar documents, perform deduplication, and perform topic tagging for stories in foreign languages.
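The pipeline named in the abstract (code-switching corpus, universal embedding, classifier trained once, application to a language with no classifier of its own) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy `LEXICON` stands in for a machine-translation system, simple co-occurrence counts stand in for the universal embedding, a nearest-centroid rule stands in for the classifier, and all function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy lexicon; a stand-in for a real machine-translation system.
LEXICON = {"team": "equipo", "wins": "gana", "match": "partido",
           "player": "jugador", "goal": "gol", "chef": "cocinero",
           "cooks": "cocina", "bread": "pan"}

# Labeled training data exists only in English.
TRAIN = [("team wins match", "sports"), ("player scores goal", "sports"),
         ("chef cooks pasta", "food"), ("chef bakes bread", "food")]

def code_switch(sentence, lexicon):
    """Return the sentence plus variants with one word swapped for its translation."""
    words = sentence.split()
    variants = [words]
    for i, w in enumerate(words):
        if w in lexicon:
            variants.append(words[:i] + [lexicon[w]] + words[i + 1:])
    return variants

def build_embedding_table(corpus):
    """Map each word to a sparse co-occurrence vector (assumed stand-in for a
    learned universal embedding). Words of both languages share one space
    because they appear in the same mixed-language sentences."""
    table = defaultdict(lambda: defaultdict(float))
    for words in corpus:
        for i, w in enumerate(words):
            for j, c in enumerate(words):
                if i != j:
                    table[w][c] += 1.0
    return {w: dict(ctx) for w, ctx in table.items()}

def embed(text, table):
    """Sum the vectors of all known words; unknown words are skipped."""
    vec = defaultdict(float)
    for w in text.split():
        for dim, val in table.get(w, {}).items():
            vec[dim] += val
    return dict(vec)

def cosine(a, b):
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def train_classifier(labeled, table):
    """Nearest-centroid classifier: one summed vector per label."""
    centroids = defaultdict(lambda: defaultdict(float))
    for text, label in labeled:
        for dim, val in embed(text, table).items():
            centroids[label][dim] += val
    return {label: dict(vec) for label, vec in centroids.items()}

def classify(text, centroids, table):
    vec = embed(text, table)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

corpus = [v for text, _ in TRAIN for v in code_switch(text, LEXICON)]
table = build_embedding_table(corpus)
centroids = train_classifier(TRAIN, table)

# Spanish input is classified without any Spanish training labels.
print(classify("equipo gana partido", centroids, table))  # sports
print(classify("cocinero cocina pan", centroids, table))  # food
```

The classifier only ever sees vectors from the shared table, so it never needs to know which language a word came from. In a realistic setting the co-occurrence counts would presumably be replaced by a trained embedding model and the centroid rule by a stronger classifier.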
-
Publication No.: US20190197119A1
Publication date: 2019-06-27
Application No.: US15850382
Filing date: 2017-12-21
Applicant: Facebook, Inc.
Inventor: Ying Zhang, Reshef Shilon, Jing Zheng
CPC classification number: G06F17/2845, G06F17/2785, G06F17/289
Abstract: Exemplary embodiments relate to techniques to classify or detect the intent of content written in a language for which a classifier does not exist. These techniques involve building a code-switching corpus via machine translation, generating a universal embedding for words in the code-switching corpus, training a classifier on the universal embeddings to generate an embedding mapping/table; accessing new content written in a language for which a specific classifier may not exist, and mapping entries in the embedding mapping/table to the universal embeddings. Using these techniques, a classifier can be applied to the universal embedding without needing to be trained on a particular language. Exemplary embodiments may be applied to recognize similarities in two content items, make recommendations, find similar documents, perform deduplication, and perform topic tagging for stories in foreign languages.
-