-
Publication No.: WO2023075908A1
Publication Date: 2023-05-04
Application No.: PCT/US2022/041513
Filing Date: 2022-08-25
Applicant: TENCENT AMERICA LLC
Inventor: JIN, Lifeng; SONG, Linfeng; XU, Kun; YU, Dong
Abstract: There is included a method and apparatus comprising computer code for a joint training method using neural networks with noise-robust losses comprising encoding input tokens from a noisy dataset into input vectors using an input encoder; predicting a label based on the input vectors using a classifier model; calculating a beta value based on the input vectors and the label using a label quality predictor model, wherein the beta value is instance-specific for each training instance; and joint training more than one model using a first modified loss function based on the beta value and an entropy value.
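The abstract above does not give the form of the modified loss. As a minimal sketch only, one way to combine an instance-specific beta value with an entropy value is to interpolate between cross-entropy on the (possibly noisy) label and the entropy of the classifier's own prediction. The interpolation form and the names `beta_weighted_loss`, `cross_entropy`, and `entropy` are assumptions for illustration, not the formula claimed in the application.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the given label index."""
    return -math.log(probs[label])


def entropy(probs):
    """Shannon entropy of the predicted distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def beta_weighted_loss(logits, label, beta):
    """Hypothetical per-instance loss: beta near 1 trusts the label,
    beta near 0 falls back on the entropy of the prediction.
    This interpolation is an assumed form, not taken from the source."""
    probs = softmax(logits)
    return beta * cross_entropy(probs, label) + (1.0 - beta) * entropy(probs)


# Each training instance carries its own beta (here supplied by hand; in the
# abstract it comes from a label quality predictor model).
batch = [([2.0, 0.0], 0, 0.9), ([0.5, 1.5], 0, 0.2)]
losses = [beta_weighted_loss(logits, y, b) for logits, y, b in batch]
```

In a joint training setup, the beta values would themselves be produced by a trainable model, so gradients would flow into both the classifier and the quality predictor; the sketch only shows the per-instance loss computation.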
-
Publication No.: WO2022177629A1
Publication Date: 2022-08-25
Application No.: PCT/US2021/063787
Filing Date: 2021-12-16
Applicant: TENCENT AMERICA LLC
Inventor: SONG, Linfeng
Abstract: A method of training a natural language neural network comprises obtaining at least one constituency span; obtaining a training video input; applying a multi-modal transform to the video input, thereby generating a transformed video input; comparing the at least one constituency span and the transformed video input using a compound Probabilistic Context-Free Grammar (PCFG) model to match the at least one constituency span with corresponding portions of the transformed video input; and using results from the comparison to learn a constituency parser.
-
Publication No.: WO2022177630A1
Publication Date: 2022-08-25
Application No.: PCT/US2021/063791
Filing Date: 2021-12-16
Applicant: TENCENT AMERICA LLC
Inventor: SONG, Linfeng
Abstract: A method of generating a neural-network-based open-domain dialogue model includes receiving an input utterance from a device having a conversation with the dialogue model, obtaining a plurality of candidate replies to the input utterance from the dialogue model, determining a plurality of discriminator scores for the candidate replies based on reference-free discriminators, determining a plurality of quality scores associated with the candidate replies, and training the dialogue model based on the quality scores.
-
Publication No.: WO2021242369A1
Publication Date: 2021-12-02
Application No.: PCT/US2021/023120
Filing Date: 2021-03-19
Applicant: TENCENT AMERICA LLC
Inventor: SONG, Linfeng
Abstract: A method, computer program, and computer system for training a graph-to-text generation network are provided. Encoded graph information corresponding to a target sentence is received, and the encoded graph information is decoded based on a biaffine attention score. One or more loss values are determined based on the decoded information, whereby the graph-to-text generation network is trained by minimizing the one or more loss values. A first loss value is generated by reconstructing one or more triple relations based on the biaffine attention score, and a second loss value predicts the graph as a linearized sequence.
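For readers unfamiliar with the term, a biaffine attention score between two vectors is commonly computed as a bilinear term plus a linear term over their concatenation plus a bias. The sketch below uses that common form; the parameter names `U`, `w`, and `b` and the tiny dimensions are assumptions for illustration, and the abstract does not state which variant the application uses.

```python
def biaffine_score(h_i, h_j, U, w, b):
    """Common biaffine form: h_i^T U h_j + w . [h_i; h_j] + b.

    h_i, h_j: lists of floats (e.g. decoder states for two words)
    U: matrix (list of rows) of shape len(h_i) x len(h_j)
    w: weight vector over the concatenation [h_i; h_j]
    b: scalar bias
    """
    bilinear = sum(
        h_i[a] * U[a][c] * h_j[c]
        for a in range(len(h_i))
        for c in range(len(h_j))
    )
    linear = sum(wk * xk for wk, xk in zip(w, h_i + h_j))
    return bilinear + linear + b


# Toy parameters (hypothetical values, chosen so the arithmetic is easy to follow).
U = [[1.0, 0.0], [0.0, 1.0]]   # identity, so the bilinear term is a dot product
w = [1.0, 1.0, 1.0, 1.0]
score = biaffine_score([1.0, 2.0], [3.0, 4.0], U, w, 0.5)
# bilinear = 1*3 + 2*4 = 11, linear = 1+2+3+4 = 10, bias = 0.5 -> 21.5
```

When one such score is computed per relation label, the scores for a word pair can be normalized to predict the relation between them, which is one plausible way a triple-reconstruction loss could be built on top of this score.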
-
Publication No.: WO2022186875A1
Publication Date: 2022-09-09
Application No.: PCT/US2021/063788
Filing Date: 2021-12-16
Applicant: TENCENT AMERICA LLC
Inventor: SONG, Linfeng
IPC: G06F40/20, G06F40/205, G06F40/284, G06F40/30, G10L15/22
Abstract: A method, computer program, and computer system are provided for representing multi-turn conversations. Data corresponding to a conversation having one or more utterances is received, contextual representations are identified for the one or more utterances, a span corresponding to the identified contextual representations is determined, and the one or more utterances are rewritten based on maximizing a probability associated with the determined span.
-
Publication No.: WO2021152568A1
Publication Date: 2021-08-05
Application No.: PCT/IB2021/050837
Filing Date: 2021-02-02
Applicant: TENCENT AMERICA LLC
Inventor: SONG, Linfeng
IPC: G06F40/00
Abstract: A method, computer program, and computer system are provided for extracting relations between one or more entities in a sentence. A forest corresponding to probabilities of relations between each pair of the entities is generated, and the generated forest is encoded with relation information for each of the pairs of entities. One or more features are extracted based on the generated forest and the encoded relation information, and a relation is predicted between the entities of each pair of entities based on the extracted features.
-
Publication No.: WO2021112936A1
Publication Date: 2021-06-10
Application No.: PCT/US2020/049308
Filing Date: 2020-09-04
Applicant: TENCENT AMERICA LLC
Inventor: SONG, Linfeng
IPC: G06F40/00
Abstract: A method, computer program, and computer system for recovering a dropped pronoun are provided. The method, computer program, and computer system involve receiving data corresponding to one or more input words and determining contextual representations for the received input word data. The dropped pronoun may be identified based on a probability value associated with the contextual representations, and a span associated with one or more of the received input words may be determined. The span may correspond to which of the input words the dropped pronoun refers.
-
Publication No.: WO2023091240A1
Publication Date: 2023-05-25
Application No.: PCT/US2022/045140
Filing Date: 2022-09-29
Applicant: TENCENT AMERICA LLC
Inventor: SONG, Linfeng
Abstract: Hierarchical context tagging for utterance rewriting comprising computer code for obtaining source tokens and context tokens, encoding the source tokens and the context tokens to generate source contextualized embeddings and context contextualized embeddings, tagging the source tokens with tags indicating a keep or delete action for each source token of the source tokens, selecting a rule to insert before each source token, wherein the rule contains a sequence of one or more slots, and generating spans from the context tokens, wherein each span corresponds to one of the one or more slots of the selected rule.
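To make the keep/delete tags, rules, and slots concrete, here is a minimal sketch of how the predicted pieces could be assembled into a rewritten utterance once the model has produced them. The data encoding (the `"KEEP"`/`"DELETE"` strings, `"_"` as the slot marker, and the function name `apply_rewrite`) is hypothetical and is not taken from the application; the sketch covers only the final assembly step, not the tagging or span prediction.

```python
SLOT = "_"  # hypothetical marker for a slot inside a rule


def apply_rewrite(source_tokens, actions, rules, slot_spans):
    """Assemble a rewritten utterance.

    source_tokens: list of tokens in the utterance to rewrite
    actions:       "KEEP" or "DELETE" for each source token
    rules:         for each source token, a list of rule pieces to insert
                   before it; a piece is either a literal token or SLOT
    slot_spans:    for each source token, a list of context spans (each a
                   list of tokens), one per slot in that token's rule
    """
    output = []
    for token, action, rule, spans in zip(source_tokens, actions, rules, slot_spans):
        fills = iter(spans)
        for piece in rule:
            if piece == SLOT:
                output.extend(next(fills))  # fill the slot with a context span
            else:
                output.append(piece)        # literal token from the rule
        if action == "KEEP":
            output.append(token)
    return output


# Context: "I watched the movie yesterday ." / Source: "why is it good"
rewritten = apply_rewrite(
    ["why", "is", "it", "good"],
    ["KEEP", "KEEP", "DELETE", "KEEP"],
    [[], [], [SLOT], []],
    [[], [], [["the", "movie"]], []],
)
# rewritten == ["why", "is", "the", "movie", "good"]
```

Here the pronoun "it" is deleted and a one-slot rule inserted before it is filled with the context span "the movie".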
-
Publication No.: WO2023069194A1
Publication Date: 2023-04-27
Application No.: PCT/US2022/041515
Filing Date: 2022-08-25
Applicant: TENCENT AMERICA LLC
Inventor: SONG, Linfeng
Abstract: There is included a method and apparatus for sentiment analysis for multi-turn conversations comprising computer code for obtaining input dialogues; extracting sentiment expressions based on sentence embeddings corresponding to the input dialogues; generating polarity values based on the sentence embeddings corresponding to the input dialogues; and determining a target mention associated with at least one of the sentiment expressions based on the sentiment expressions and the sentence embeddings, wherein the determining of the target mention includes generating rich contextual representations based on the sentence embeddings and the sentiment expressions; and determining the target mention based on calculated boundaries, wherein the calculated boundaries are generated using the rich contextual representations.
-
Publication No.: WO2022177631A1
Publication Date: 2022-08-25
Application No.: PCT/US2021/063792
Filing Date: 2021-12-16
Applicant: TENCENT AMERICA LLC
Inventor: SONG, Linfeng
IPC: G10L15/00, G10L15/197, G10L15/22, G10L15/26
Abstract: A method, computer program, and computer system are provided for parsing multi-party dialogue. Dialogue data having one or more elementary discourse units is received. A local representation and a global representation are determined for each of the elementary discourse units based on performing a pairwise comparison on the elementary discourse units. Relationships between the elementary discourse units are identified based on the determined local and global representations. A contextual link is predicted between non-adjacent elementary discourse units based on the identified relationships.