Conversational agent pipeline trained on synthetic data

    公开(公告)号:US10210861B1

    公开(公告)日:2019-02-19

    申请号:US16146924

    申请日:2018-09-28

    申请人: Apprente, Inc.

    摘要: In one embodiment synthetic training data items are generated, each comprising a) a textual representation of a synthetic sentence and b) one or more transcodes of the synthetic sentence comprising one or more actions and one or more entities associated with the one or more actions. For each synthetic training data item, the textual representation of the synthetic sentence is converted into a sequence of phonemes that represent the synthetic sentence. A first machine learning model is then trained as a transcoder that determines transcodes comprising actions and associated entities from sequences of phonemes, wherein the training is performed using a first training dataset comprising the plurality of synthetic training data items that comprise a) sequences phonemes that represent synthetic sentences and b) transcodes of the synthetic sentences. The transcoder may be used in a conversational agent.