-
1.
Publication No.: US20230109692A1
Publication date: 2023-04-13
Application No.: US17822722
Filing date: 2022-08-26
Inventors: ANUMITA DASGUPTA, INDRAJIT BHATTACHARYA, GIRISH KESHAV PALSHIKAR, PRATIK SAINI, SANGAMESHWAR SURYAKANT PATIL, SOHAM DATTA, PRABIR MALLICK, SAMIRAN PAL, SUNIL KUMAR KOPPARAPU, AISHWARYA CHHABRA, AVINASH KUMAR SINGH, KAUSTUV MUKHERJI, MEGHNA ABHISHEK PANDHARIPANDE, ANIKET PRAMANICK, ARPITA KUNDU, SUBHASISH GHOSH, CHANDRASEKHAR ANANTARAM, ANAND SIVASUBRAMANIAM, GAUTAM SHROFF
Abstract: This disclosure relates generally to a method and system for providing assistance to interviewers. Technical interviewing is immensely important for enterprises but requires significant domain expertise and investment of time. The present disclosure assists interviewers with a framework delivered via an interview assistant (IA) bot. The method initiates an interview session for a job description by selecting a set of qualified candidate resumes to be interviewed. Further, the IA bot recommends to each interviewer a set of question and reference answer pairs prior to initiating the interview. At each interview step, the IA bot records the interview history and recommends a revised set of questions to the interviewer. Further, an assessment score is determined for the candidate using the reference answer extracted from a resource corpus. Additionally, statistics about the interview process are generated, such as the number and nature of questions asked and how these vary, to identify outliers for corrective action.
-
2.
Publication No.: US20230061773A1
Publication date: 2023-03-02
Application No.: US17822714
Filing date: 2022-08-26
Inventors: SANGAMESHWAR SURYAKANT PATIL, SAMIRAN PAL, AVINASH KUMAR SINGH, SOHAM DATTA, GIRISH KESHAV PALSHIKAR, INDRAJIT BHATTACHARYA, HARSIMRAN BEDI, YASH AGRAWAL, VASUDEVA VARMA KALIDINDI
IPC classification: G06F16/332, G06F16/35, G06F40/30, G06F40/295
Abstract: Questions play a central role in the assessment of a candidate's expertise during an interview or examination. However, manually generating such questions from input text documents requires specialized expertise and experience. Further, available techniques for automated question generation require an input sentence as well as an answer phrase in that sentence to generate a question. This in turn requires large training datasets consisting of tuples of input sentence, answer phrase, and the corresponding question. Additionally, training datasets are available for general-purpose text, but not for technical text. The present application provides systems and methods for generating technical questions from technical documents. The system extracts meta information and linguistic information of the text data present in technical documents. The system then identifies relationships that exist in the provided text data. The system further creates one or more graphs based on the identified relationships. The created graphs are then used by the system to generate technical questions.
-
3.
Publication No.: US20240126791A1
Publication date: 2024-04-18
Application No.: US18470657
Filing date: 2023-09-20
Inventors: ANUMITA DASGUPTABANDYOPADHYAY, PRABIR MALLICK, TAPAS NAYAK, INDRAJIT BHATTACHARYA, SANGAMESHWAR SURYAKANT PATIL
IPC classification: G06F16/31, G06F16/332
CPC classification: G06F16/31, G06F16/3329
Abstract: This disclosure relates generally to long-form answer extraction and, more particularly, to long-form answer extraction based on a combination of sentence index generation techniques. Existing answer extraction techniques have achieved significant progress for extractive short answers; however, less progress has been made for long-form questions that require explanations. Further, state-of-the-art long-answer extraction techniques produce poorer long-form answers or do not address sparsity, which becomes an issue in longer contexts. Additionally, pre-trained generative sequence-to-sequence models are gaining popularity for factoid answer extraction tasks. Hence the disclosure proposes long-form answer extraction based on several steps, including training a set of generative sequence-to-sequence models comprising a sentence indices generation model and a sentence index spans generation model. The trained set of generative sequence-to-sequence models is further utilized for long-form answer extraction based on a union of several sentence index generation techniques comprising sentence indices and sentence index spans.
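The union step described in this abstract can be illustrated with a minimal sketch. It assumes the two trained models have already produced their outputs: a list of individual sentence indices and a list of inclusive (start, end) index spans. The function names `expand_spans` and `union_answer` are hypothetical and do not come from the patent; the models themselves are not reproduced here.

```python
from typing import Iterable, List, Set, Tuple


def expand_spans(spans: Iterable[Tuple[int, int]]) -> Set[int]:
    """Expand inclusive (start, end) sentence index spans into a set of indices."""
    indices: Set[int] = set()
    for start, end in spans:
        indices.update(range(start, end + 1))
    return indices


def union_answer(
    sentences: List[str],
    predicted_indices: Iterable[int],
    predicted_spans: Iterable[Tuple[int, int]],
) -> str:
    """Build a long-form answer from the union of both index predictions.

    Assumption: indices outside the context are discarded, and the selected
    sentences are emitted in their original document order.
    """
    selected = set(predicted_indices) | expand_spans(predicted_spans)
    ordered = sorted(i for i in selected if 0 <= i < len(sentences))
    return " ".join(sentences[i] for i in ordered)


if __name__ == "__main__":
    context = ["S0.", "S1.", "S2.", "S3.", "S4."]
    # One model predicted sentences 0 and 3, the other predicted the span 2..3.
    print(union_answer(context, [0, 3], [(2, 3)]))
```

Taking the union rather than the intersection favours recall: a sentence picked by either prediction is kept, which is one plausible reading of why the two output formats are combined.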
-
4.
Publication No.: US20240119075A1
Publication date: 2024-04-11
Application No.: US18479646
Filing date: 2023-10-02
Inventors: PRABIR MALLICK, SAMIRAN PAL, AVINASH KUMAR SINGH, ANUMITA DASGUPTA, SOHAM DATTA, KAAMRAAN KHAN, TAPAS NAYAK, INDRAJIT BHATTACHARYA, GIRISH KESHAV PALSHIKAR
IPC classification: G06F16/332, G06F16/33, G06F40/186, G06F40/284, G06F40/289, G06F40/30, G06F40/40
CPC classification: G06F16/3329, G06F16/3344, G06F40/186, G06F40/284, G06F40/289, G06F40/30, G06F40/40
Abstract: Conventional question and answer (QA) datasets are created for generating factoid questions only; the present disclosure generates a long-form technical QA dataset from textbooks. Initially, the system receives a technical textbook document and extracts a plurality of contexts. Further, a first plurality of questions is generated based on the plurality of contexts. A plurality of answerable questions is further generated based on the plurality of contexts using an unsupervised template-based matching technique. Further, a combined plurality of questions is generated by combining the first plurality of questions and the plurality of answerable questions. Further, answers for the combined plurality of questions are generated using an autoregressive language model, and a mapping score is computed. Further, a plurality of optimal answers is selected based on the corresponding mapping score. Finally, a long-form technical question and answer dataset is generated based on the combined plurality of questions and the optimal answers.
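The final selection step of this abstract (keeping the best-scoring answer per question) might be sketched as follows. The abstract does not define how the mapping score is computed, so the token-overlap score below is purely a stand-in assumption, as are the names `mapping_score` and `select_optimal_answers` and the `threshold` parameter.

```python
from typing import Dict, List


def mapping_score(question: str, answer: str) -> float:
    """Stand-in score (an assumption, not the patent's definition):
    the fraction of question tokens that also occur in the answer."""
    q_tokens = set(question.lower().split())
    a_tokens = set(answer.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & a_tokens) / len(q_tokens)


def select_optimal_answers(
    candidates: Dict[str, List[str]], threshold: float = 0.5
) -> Dict[str, str]:
    """Keep, for each question, the highest-scoring candidate answer.

    Questions whose best answer scores below the hypothetical threshold
    are dropped from the resulting dataset.
    """
    dataset: Dict[str, str] = {}
    for question, answers in candidates.items():
        if not answers:
            continue
        best = max(answers, key=lambda a: mapping_score(question, a))
        if mapping_score(question, best) >= threshold:
            dataset[question] = best
    return dataset


if __name__ == "__main__":
    generated = {
        "what is a binary tree": [
            "graphs have edges",
            "a binary tree is a tree where each node has at most two children",
        ],
        "define quicksort": ["graphs have edges"],
    }
    print(select_optimal_answers(generated))
```

In a fuller pipeline the candidate answers would come from the autoregressive language model mentioned in the abstract; here they are supplied by hand so the selection logic can be read in isolation.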
-
5.
Publication No.: US20240095466A1
Publication date: 2024-03-21
Application No.: US18450588
Filing date: 2023-08-16
IPC classification: G06F40/40, G06F40/137, G06F40/205, G06Q50/20, G06V30/413
CPC classification: G06F40/40, G06F40/137, G06F40/205, G06Q50/20, G06V30/413, G06V2201/10
Abstract: The present disclosure provides a method for document-structure-based unsupervised long-form technical question generation. Initially, the system receives a textbook document. Further, PDF metadata is extracted from the textbook document using a Natural Language Processing (NLP) technique. Further, a plurality of structures is identified from the textbook document based on the PDF metadata using an NLP-based filtering technique. Further, a plurality of index-based question templates and Table of Contents (TOC) based question templates are obtained from a plurality of predefined question templates using the plurality of structures. Further, a plurality of long-form technical questions is generated using the obtained index-based and TOC-based question templates. The plurality of long-form technical questions is further evaluated by the system using a plurality of metrics. Further, the generated plurality of long-form technical questions is used to fine-tune a supervised question generation model for generating optimal questions from document structure.
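The template-filling step of this abstract can be sketched under simple assumptions: the TOC entries have already been extracted as plain strings, and the templates are short format strings. The two templates and the name `questions_from_toc` are hypothetical illustrations; the patent's predefined templates are not given in the abstract.

```python
from typing import List, Sequence

# Hypothetical templates for illustration only.
TOC_TEMPLATES: Sequence[str] = (
    "Explain the concept of {topic}.",
    "What are the key ideas behind {topic}?",
)


def questions_from_toc(
    toc_entries: Sequence[str], templates: Sequence[str] = TOC_TEMPLATES
) -> List[str]:
    """Fill every template with every non-empty TOC entry, in TOC order."""
    questions: List[str] = []
    for entry in toc_entries:
        topic = entry.strip()
        if not topic:
            continue  # skip blank lines left over from extraction
        for template in templates:
            questions.append(template.format(topic=topic))
    return questions


if __name__ == "__main__":
    for q in questions_from_toc(["Binary Search Trees", "  ", "Hash Tables"]):
        print(q)
```

Questions produced this way need no labelled training data, which matches the unsupervised framing of the abstract; they could then serve as training examples for a supervised question generation model.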
-