-
1.
Publication No.: US12093646B2
Publication date: 2024-09-17
Application No.: US17151088
Filing date: 2021-01-15
Applicant: Recruit Co., Ltd.
Inventors: Zhengjie Miao, Yuliang Li, Xiaolan Wang, Wang-Chiew Tan
IPC classification: G06F40/284, G06F40/289, G06N5/04, G06N20/00
CPC classification: G06F40/284, G06F40/289, G06N5/04, G06N20/00
Abstract: Disclosed embodiments relate to extracting classification information from input text. Techniques can include obtaining input text, identifying a plurality of tokens in the input text, pre-training a machine learning model, determining tagging information of the plurality of tokens using a first classification layer of the machine learning model, pairing sequences of tokens using the tagging information associated with the plurality of tokens, wherein the paired sequences of tokens are determined by a second classification layer, determining one or more attribute classifiers to apply to the one or more paired sequences, wherein the attribute classifiers are determined by a third classification layer of the machine learning model, evaluating sentiments of the paired sequences, wherein the sentiments of the paired sequences are determined by a fourth classification layer of the language machine learning model, aggregating sentiments of the paired sequences associated with an attribute classifier, and storing the aggregated sentiments.
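The final aggregation step named in this abstract can be illustrated with a minimal sketch. It assumes the tagging, pairing, attribute, and sentiment layers have already produced their labels; the `PairedSequence` type, its field names, and the sample hotel-review data are hypothetical and do not come from the patent.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PairedSequence:
    """One paired span plus labels assumed to come from the earlier layers."""
    aspect: str      # token sequence tagged as an aspect (hypothetical field)
    opinion: str     # token sequence tagged as an opinion (hypothetical field)
    attribute: str   # label assumed to come from the attribute step
    sentiment: str   # label assumed to come from the sentiment step


def aggregate_sentiments(pairs):
    """Count sentiment labels per attribute across all paired sequences."""
    totals = defaultdict(Counter)
    for pair in pairs:
        totals[pair.attribute][pair.sentiment] += 1
    # Convert to plain dicts so the result is easy to store or serialize.
    return {attribute: dict(counts) for attribute, counts in totals.items()}


# Made-up example data standing in for model output.
pairs = [
    PairedSequence("room", "spotless", "cleanliness", "positive"),
    PairedSequence("bathroom", "dirty", "cleanliness", "negative"),
    PairedSequence("staff", "friendly", "service", "positive"),
    PairedSequence("sheets", "fresh", "cleanliness", "positive"),
]
aggregated = aggregate_sentiments(pairs)
print(aggregated)
```

Counting labels is only one possible aggregation; the abstract does not say whether sentiments are counted, averaged, or combined in some other way.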
-
2.
Publication No.: US11934783B2
Publication date: 2024-03-19
Application No.: US18295735
Filing date: 2023-04-04
Applicant: Recruit Co., Ltd.
Inventors: Yoshihiko Suhara, Behzad Golshan, Yuliang Li, Chen Chen, Xiaolan Wang, Jinfeng Li, Wang-Chiew Tan, Çağatay Demiralp, Aaron Traylor
IPC classification: G06F40/284, G06F16/35, G06F18/211, G06N7/01
CPC classification: G06F40/284, G06F16/35, G06F18/211, G06N7/01
Abstract: Disclosed embodiments relate to natural language processing. Techniques can include receiving input text, extracting, from the input text, at least one modifier and aspect pair, receiving data from a knowledgebase, based on the at least one modifier and aspect pair and commonsense data, generating one or more premise embeddings, converting the input text into tokens, generating at least one vector for one or more of the tokens based on an analysis of the tokens, combining the at least one vector with the one or more premise embeddings to create at least one combined vector, and analyzing the at least one combined vector wherein the analysis generates an output indicative of a feature of the input text.
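The step of combining a token vector with premise embeddings can be sketched as follows. The abstract does not say how the combination is performed, so mean pooling followed by concatenation is an assumption made purely for illustration, and the function names and numeric values are hypothetical.

```python
def mean_pool(vectors):
    """Average equal-length vectors element-wise."""
    if not vectors:
        raise ValueError("need at least one vector")
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("vectors must share one length")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def combine(token_vector, premise_embeddings):
    """Concatenate a token vector with the pooled premise embedding.

    Assumption: pooling plus concatenation is one plausible reading of
    "combine"; other schemes (sums, attention) are equally possible.
    """
    return list(token_vector) + mean_pool(premise_embeddings)


# Made-up values standing in for encoder output and premise embeddings.
token_vector = [0.5, -1.0, 2.0]
premise_embeddings = [[1.0, 3.0], [3.0, 5.0]]
combined = combine(token_vector, premise_embeddings)
print(combined)
```

A downstream classifier would then take `combined` as its input in place of the token vector alone.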
-
3.
Publication No.: US20220229984A1
Publication date: 2022-07-21
Application No.: US17151088
Filing date: 2021-01-15
Applicant: Recruit Co., Ltd.
Inventors: Zhengjie Miao, Yuliang Li, Xiaolan Wang, Wang-Chiew Tan
IPC classification: G06F40/284, G06N20/00, G06N5/04, G06F40/289
Abstract: Disclosed embodiments relate to extracting classification information from input text. Techniques can include obtaining input text, identifying a plurality of tokens in the input text, pre-training a machine learning model, determining tagging information of the plurality of tokens using a first classification layer of the machine learning model, pairing sequences of tokens using the tagging information associated with the plurality of tokens, wherein the paired sequences of tokens are determined by a second classification layer, determining one or more attribute classifiers to apply to the one or more paired sequences, wherein the attribute classifiers are determined by a third classification layer of the machine learning model, evaluating sentiments of the paired sequences, wherein the sentiments of the paired sequences are determined by a fourth classification layer of the language machine learning model, aggregating sentiments of the paired sequences associated with an attribute classifier, and storing the aggregated sentiments.
-
4.
Publication No.: US11620448B2
Publication date: 2023-04-04
Application No.: US17008572
Filing date: 2020-08-31
Applicant: Recruit Co., Ltd.
Inventors: Yoshihiko Suhara, Behzad Golshan, Yuliang Li, Chen Chen, Xiaolan Wang, Jinfeng Li, Wang-Chiew Tan, Çağatay Demiralp, Aaron Traylor
IPC classification: G06F40/284, G06F16/35, G06K9/62, G06N7/00
Abstract: Disclosed embodiments relate to natural language processing. Techniques can include receiving input text, extracting, from the input text, at least one modifier and aspect pair, receiving data from a knowledgebase, based on the at least one modifier and aspect pair and commonsense data, generating one or more premise embeddings, converting the input text into tokens, generating at least one vector for one or more of the tokens based on an analysis of the tokens, combining the at least one vector with the one or more premise embeddings to create at least one combined vector, and analyzing the at least one combined vector wherein the analysis generates an output indicative of a feature of the input text.
-
5.
Publication No.: US20220351071A1
Publication date: 2022-11-03
Application No.: US17246354
Filing date: 2021-04-30
Applicant: Recruit Co., Ltd.
Inventors: Yuliang Li, Xiaolan Wang, Zhengjie Miao
IPC classification: G06N20/00, G06N5/02, G06F16/21, G06F40/284
Abstract: Disclosed embodiments relate to generating training data for a machine learning model. Techniques can include accessing a machine learning model from a machine learning model repository and identifying a data set associated with the machine learning model. The identified data set is utilized to generate a set of data augmentation operators. The data augmentation operators are applied on a selected sequence of tokens associated with the machine learning model to generate sequences of tokens. A subset of sequences of tokens are selected and stored in a training data repository. The stored sequences of tokens are provided to the machine learning model as training data.
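The idea of applying augmentation operators to a token sequence and keeping a subset of the results can be sketched as below. The abstract does not name the operators or the selection rule, so token deletion, adjacent swap, and "keep the first few distinct candidates" are all assumptions chosen for illustration, as are the function names.

```python
import random


def delete_token(tokens, rng):
    """Hypothetical operator: drop one randomly chosen token."""
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def swap_adjacent(tokens, rng):
    """Hypothetical operator: swap one random pair of neighbouring tokens."""
    i = rng.randrange(len(tokens) - 1)
    out = list(tokens)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def augment(tokens, operators, per_operator, keep, seed=0):
    """Apply each operator several times, then keep a distinct subset."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    candidates = [op(tokens, rng) for op in operators for _ in range(per_operator)]
    selected = []
    for candidate in candidates:
        # Skip unchanged sequences and duplicates of earlier candidates.
        if candidate != list(tokens) and candidate not in selected:
            selected.append(candidate)
    return selected[:keep]


tokens = ["the", "room", "was", "very", "clean"]
augmented = augment(tokens, [delete_token, swap_adjacent], per_operator=3, keep=3)
for sequence in augmented:
    print(" ".join(sequence))
```

In a real system the selection step would more likely score candidates with a model than take the first distinct ones; that filter is left out here to keep the sketch short.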