METHODS AND SYSTEMS FOR PREDICTING DIFFICULTY OF LONG FORM TECHNICAL QUESTIONS USING WEAK SUPERVISION

    Publication number: US20240111964A1

    Publication date: 2024-04-04

    Application number: US18454136

    Filing date: 2023-08-23

    CPC classification number: G06F40/40 G06F16/35 G06F40/137 G06F40/186

    Abstract: Technical interviewing is important to organizations for assessing a candidate and making a hiring decision. For effective technical interviewing, predicting the difficulty of long form technical questions is crucial. The present disclosure provides systems and methods for predicting the difficulty of long form technical questions using weak supervision from textbooks. Further, zero-shot pre-trained large language models and an unsupervised template-based technique are used for generating questions. Furthermore, a difficulty score is assigned to the generated questions based on context difficulty and task difficulty. The context difficulty of the generated questions is computed using the hierarchical structure of the textbooks, and the task difficulty is computed by determining a similarity between the generated questions and Bloom's taxonomy levels. In the present disclosure, a few supervised question difficulty prediction models are trained by means of weak supervision using the generated questions and corresponding difficulty scores, and are further evaluated for prediction performance using a gold-standard question difficulty dataset.
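
    The scoring idea in the abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: the verb lists per Bloom's level, the use of word overlap as the similarity measure, the use of chapter position as a stand-in for the textbook hierarchy, and the weighted average that combines the two scores.

```python
import re

# Hypothetical verb lists per Bloom's taxonomy level (1 = lowest, 6 = highest).
BLOOM_LEVELS = {
    1: {"define", "list", "name", "state"},
    2: {"explain", "describe", "summarize"},
    3: {"apply", "use", "implement", "solve"},
    4: {"analyze", "compare", "differentiate"},
    5: {"evaluate", "justify", "critique"},
    6: {"design", "create", "propose"},
}


def tokens(text):
    """Lowercase word set of a question."""
    return set(re.findall(r"[a-z]+", text.lower()))


def task_difficulty(question):
    """Score in (0, 1]: the Bloom level whose verbs overlap the question most.

    Word overlap is a stand-in for the similarity measure; a question that
    matches no level falls back to level 1.
    """
    words = tokens(question)
    best_level, best_overlap = 1, 0
    for level, verbs in BLOOM_LEVELS.items():
        overlap = len(words & verbs)
        if overlap > best_overlap:
            best_level, best_overlap = level, overlap
    return best_level / len(BLOOM_LEVELS)


def context_difficulty(chapter_index, num_chapters):
    """Assumption: material from later chapters (1-based index) is harder."""
    return chapter_index / num_chapters


def difficulty_score(question, chapter_index, num_chapters, weight=0.5):
    """Assumed combination: weighted average of context and task difficulty."""
    context = context_difficulty(chapter_index, num_chapters)
    task = task_difficulty(question)
    return weight * context + (1 - weight) * task
```

    Scores produced this way would serve only as weak labels for training a supervised difficulty predictor; an embedding-based similarity could replace the word overlap without changing the overall structure.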

    Method and system for long-form answer extraction based on combination of sentence index generation techniques

    Publication number: US12111856B2

    Publication date: 2024-10-08

    Application number: US18470657

    Filing date: 2023-09-20

    CPC classification number: G06F16/31 G06F16/3329

    Abstract: This disclosure relates generally to long-form answer extraction and, more particularly, to long-form answer extraction based on a combination of sentence index generation techniques. Existing answer extraction techniques have achieved significant progress for extractive short answers; however, less progress has been made for long-form questions that require explanations. Further, state-of-the-art long-answer extraction techniques produce poorer long-form answers or do not address sparsity, which becomes an issue in longer contexts. Additionally, pre-trained generative sequence-to-sequence models are gaining popularity for factoid answer extraction tasks. Hence the disclosure proposes long-form answer extraction based on several steps, including training a set of generative sequence-to-sequence models comprising a sentence indices generation model and a sentence index spans generation model. The trained set of generative sequence-to-sequence models is further utilized for long-form answer extraction based on a union of several sentence index generation techniques comprising sentence indices and sentence index spans.
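
    The union step described in the abstract above can be sketched as follows. The output formats are assumptions made for illustration (one model is assumed to emit space-separated sentence indices such as "0 3", the other inclusive spans such as "2-3"), and the function names are hypothetical; the patent's actual model outputs are not given here.

```python
def parse_indices(output):
    """Parse assumed indices-model output, e.g. "0 3" -> {0, 3}."""
    return {int(tok) for tok in output.split()}


def parse_spans(output):
    """Parse assumed spans-model output, e.g. "2-3 7-9", as inclusive ranges."""
    selected = set()
    for tok in output.split():
        start, end = (int(part) for part in tok.split("-"))
        selected.update(range(start, end + 1))
    return selected


def extract_long_answer(sentences, indices_output, spans_output):
    """Join the sentences selected by either model, in document order."""
    chosen = parse_indices(indices_output) | parse_spans(spans_output)
    # Ignore any index that falls outside the context.
    return " ".join(sentences[i] for i in sorted(chosen) if 0 <= i < len(sentences))
```

    Taking the union means a sentence missed by one generation technique can still be recovered by the other, at the cost of possibly including extra sentences.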

    Method and system for generating annotations and field-names for relational schema

    Publication number: US11880345B2

    Publication date: 2024-01-23

    Application number: US17463591

    Filing date: 2021-09-01

    CPC classification number: G06F16/211

    Abstract: This disclosure relates generally to generating annotations and field-names for a relational schema. Typically, most domains have relational database (RDB) systems built for them instead of domain ontologies, and the linguistic information of the schema is usually not used to recover the domain terms. The disclosed method and system facilitate generating annotations and field-names for a relational schema while considering the linguistic information of the schema, by using a model trained through a proposed training technique. The trained model comprises at least one knowledge graph and a set of associated parameters. The trained model is further used to perform a plurality of tasks, wherein the plurality of tasks includes generating a plurality of new field-names for a relational schema through a stochastic generative process and generating a new annotation for a field-name of a relational schema through a probabilistic inference technique.
