Using deep learning techniques to determine the contextual reading order in a form document
Abstract:
Techniques are disclosed for determining reading order in a document. A current labeled text run (R1), a RIGHT text run (R2) and a DOWN text run (R3) are generated. The R1 labeled text run is processed by a first LSTM, the R2 labeled text run is processed by a second LSTM, and the R3 labeled text run is processed by a third LSTM, wherein each of the LSTMs generates a respective internal representation (R1′, R2′ and R3′). As will be appreciated, deep learning tools other than LSTMs can be used. The respective internal representations R1′, R2′ and R3′ are concatenated or otherwise combined into a vector or tensor representation and provided to a classifier network that generates a predicted label for a next text run as RIGHT, DOWN or EOS in the reading order of the document.
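The data flow described above can be illustrated with a minimal sketch in plain Python. It is not the claimed implementation: the class names `TinyLSTM` and `ReadingOrderModel`, the character-level one-hot encoding, the hidden size, and the single linear-plus-softmax classifier are all assumptions made for illustration, and the weights are random and untrained, so the predicted label is arbitrary. Only the structure follows the abstract: three separate LSTMs encode R1, R2 and R3, the resulting R1′, R2′ and R3′ are concatenated, and a classifier scores RIGHT, DOWN and EOS.

```python
import math
import random

LABELS = ["RIGHT", "DOWN", "EOS"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rng, rows, cols):
    # Small random weights; a trained model would learn these instead.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyLSTM:
    """Hypothetical character-level LSTM encoder (untrained)."""

    def __init__(self, rng, vocab_size, hidden):
        self.vocab_size = vocab_size
        self.hidden = hidden
        # One weight matrix per gate: input, forget, output, candidate.
        # Each acts on the one-hot character joined with the previous state.
        self.weights = [rand_matrix(rng, hidden, vocab_size + hidden)
                        for _ in range(4)]

    def encode(self, text):
        h = [0.0] * self.hidden  # hidden state
        c = [0.0] * self.hidden  # cell state
        for ch in text:
            x = [0.0] * self.vocab_size
            x[ord(ch) % self.vocab_size] = 1.0  # assumed one-hot encoding
            i, f, o, g = (matvec(w, x + h) for w in self.weights)
            c = [sigmoid(fk) * ck + sigmoid(ik) * math.tanh(gk)
                 for ik, fk, gk, ck in zip(i, f, g, c)]
            h = [sigmoid(ok) * math.tanh(ck) for ok, ck in zip(o, c)]
        return h  # final hidden state serves as the internal representation


class ReadingOrderModel:
    """Three encoders plus a linear softmax classifier (assumed design)."""

    def __init__(self, hidden=8, vocab_size=128, seed=0):
        rng = random.Random(seed)
        # Separate LSTMs for the current (R1), RIGHT (R2) and DOWN (R3) runs.
        self.encoders = [TinyLSTM(rng, vocab_size, hidden) for _ in range(3)]
        self.w_out = rand_matrix(rng, len(LABELS), 3 * hidden)

    def predict(self, r1, r2, r3):
        combined = []  # concatenation of R1', R2' and R3'
        for encoder, run in zip(self.encoders, (r1, r2, r3)):
            combined.extend(encoder.encode(run))
        logits = matvec(self.w_out, combined)
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        return LABELS[probs.index(max(probs))], probs


model = ReadingOrderModel(seed=0)
label, probs = model.predict("First name", "Last name", "Address")
print(label, [round(p, 3) for p in probs])
```

In this sketch the final hidden state of each LSTM stands in for the internal representation, and concatenation is the combining step; the abstract also allows other ways of combining the three representations and other deep learning tools in place of the LSTMs.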