Patent search ap:("ZHEJIANG LAB") AND inv:"Lixin Shi" Page 1

1.

Granted invention patent
Method and apparatus of NER-oriented Chinese clinical text data augmentation (In force)

Publication (announcement) No.: US11972214B2

Publication (announcement) date: 2024-04-30

Application No.: US18348317

Filing date: 2023-07-06

Applicant: ZHEJIANG LAB

Inventor: Jingsong Li, Lixin Shi, Ran Xin, Zongfeng Yang, Yu Tian, Tianshu Zhou

IPC: G06F17/00, G06F40/169, G06F40/284, G06F40/295, G06F40/30, G06F40/40

CPC classification number: G06F40/295, G06F40/169, G06F40/284, G06F40/30, G06F40/40

Abstract: Disclosed are a method and an apparatus for NER-oriented Chinese clinical text data augmentation. Unannotated data and label-linearized annotated data are obtained through data preprocessing. Using the unannotated data, part of the information in the text is concealed and the concealed part is predicted from the retained information, while an entity word-level discrimination task is introduced, to pre-train a span-based language model. In the fine-tuning stage, a plurality of decoding mechanisms are introduced: based on the pre-trained span-based language model, a relationship between a text vector and text data is obtained, and linearized data with entity labels is converted into the text vector. In the prediction stage of the text generation model, text is generated through forward decoding and reverse decoding to obtain augmented data with annotation information.
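
The abstract mentions label linearization of annotated data but does not give the exact format. The following is a minimal sketch of one possible scheme, assuming character-level BIO tags and inline markers of the form `<LABEL>` and `</LABEL>`; the function names `linearize` and `delinearize`, the marker syntax, and the `DIS` label are all hypothetical and are not taken from the patent.

```python
from typing import List, Optional, Tuple


def linearize(chars: List[str], tags: List[str]) -> List[str]:
    """Wrap each entity span in <LABEL> ... </LABEL> markers (assumed format)."""
    out: List[str] = []
    open_label: Optional[str] = None  # label of the entity currently open
    for ch, tag in zip(chars, tags):
        continues = tag.startswith("I-") and tag[2:] == open_label
        if not continues and open_label is not None:
            # The current entity ends before this character.
            out.append(f"</{open_label}>")
            open_label = None
        if tag != "O" and open_label is None:
            # A B- tag (or a stray I- tag) starts a new entity.
            open_label = tag[2:]
            out.append(f"<{open_label}>")
        out.append(ch)
    if open_label is not None:
        out.append(f"</{open_label}>")
    return out


def delinearize(tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Recover characters and BIO tags from a linearized sequence."""
    chars: List[str] = []
    tags: List[str] = []
    label: Optional[str] = None
    at_start = False
    for tok in tokens:
        if tok.startswith("</") and tok.endswith(">"):
            label = None
        elif tok.startswith("<") and tok.endswith(">"):
            label = tok[1:-1]
            at_start = True
        elif label is None:
            chars.append(tok)
            tags.append("O")
        else:
            chars.append(tok)
            tags.append(("B-" if at_start else "I-") + label)
            at_start = False
    return chars, tags


# Hypothetical example sentence: "the patient has a history of diabetes".
example_chars = list("患者有糖尿病史")
example_tags = ["O", "O", "O", "B-DIS", "I-DIS", "I-DIS", "O"]
linearized = linearize(example_chars, example_tags)
```

Under this assumed scheme, a generation model trained on linearized sequences emits text and labels together, and `delinearize` turns each generated sequence back into an annotated NER sample.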
