Methods and systems for automated detection of personal information using neural networks
Abstract:
A method, a computing device, and a non-transitory machine-readable medium for detecting personal information are provided. Terms of interest are extracted from a corpus of raw text drawn from a collection of documents. For each term, the surrounding sentence is extracted as a target sentence, yielding a plurality of target sentences. Each surrounding sentence includes at least one reference to a data subject. A matrix of feature information is generated for each target sentence, forming a plurality of matrices. A neural network model is trained, using the matrices as input, to compute an output that indicates the likelihood that a given sentence contains personal information.
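The steps recited in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the vocabularies `TERMS_OF_INTEREST` and `DATA_SUBJECT_REFS`, the three per-token features, the fixed matrix height `MAX_TOKENS`, the example corpus, and its labels are all hypothetical. The model is a single sigmoid unit trained by gradient descent, used here as the simplest stand-in for the neural network model; the abstract does not specify an architecture.

```python
import math
import re

# Hypothetical vocabularies: the abstract does not say how terms of interest
# or data-subject references are identified.
TERMS_OF_INTEREST = {"address", "email", "salary", "diagnosis"}
DATA_SUBJECT_REFS = {"he", "she", "his", "her", "customer", "patient"}
MAX_TOKENS = 12   # assumed fixed number of rows per feature matrix
N_FEATURES = 3    # assumed columns: is_term, is_data_subject_ref, has_digit


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def split_sentences(raw_text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", raw_text)
    return [p.strip() for p in parts if p.strip()]


def target_sentences(raw_text):
    """Return (term, sentence) pairs where the sentence surrounding a term
    of interest also contains at least one data-subject reference."""
    targets = []
    for sentence in split_sentences(raw_text):
        tokens = set(tokenize(sentence))
        if tokens & DATA_SUBJECT_REFS:
            for term in sorted(tokens & TERMS_OF_INTEREST):
                targets.append((term, sentence))
    return targets


def feature_matrix(term, sentence):
    """Build a MAX_TOKENS x N_FEATURES matrix, one row per token position,
    truncated or zero-padded to a fixed height."""
    rows = []
    for tok in tokenize(sentence)[:MAX_TOKENS]:
        rows.append([
            float(tok == term),
            float(tok in DATA_SUBJECT_REFS),
            float(any(ch.isdigit() for ch in tok)),
        ])
    rows += [[0.0] * N_FEATURES for _ in range(MAX_TOKENS - len(rows))]
    return rows


def flatten(matrix):
    return [value for row in matrix for value in row]


def predict(weights, bias, matrix):
    """Sigmoid output in (0, 1): likelihood of personal information."""
    z = bias + sum(w * x for w, x in zip(weights, flatten(matrix)))
    return 1.0 / (1.0 + math.exp(-z))


def train(matrices, labels, epochs=200, lr=0.5):
    """Stochastic gradient descent on the cross-entropy loss."""
    weights = [0.0] * (MAX_TOKENS * N_FEATURES)
    bias = 0.0
    for _ in range(epochs):
        for matrix, label in zip(matrices, labels):
            error = predict(weights, bias, matrix) - label
            weights = [w - lr * error * x
                       for w, x in zip(weights, flatten(matrix))]
            bias -= lr * error
    return weights, bias


# Hypothetical corpus; the third sentence has no data-subject reference
# and is therefore not selected as a target sentence.
corpus = ("The patient gave her address as 12 Elm Street. "
          "Each customer record has an address field. "
          "The address bus is 32 bits wide.")

targets = target_sentences(corpus)
matrices = [feature_matrix(term, sentence) for term, sentence in targets]
labels = [1.0, 0.0]  # hypothetical annotations, one per target sentence
weights, bias = train(matrices, labels)
scores = [predict(weights, bias, m) for m in matrices]
```

In this sketch, `scores` holds one likelihood per target sentence. A practical system would replace the keyword sets with a proper term extractor, use richer per-token features, and train a multi-layer network on a labelled corpus far larger than two sentences.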