Method and electronic device for generating semantic representation of document to determine data security risk

    Publication No.: US12050682B2

    Publication date: 2024-07-30

    Application No.: US18144239

    Filing date: 2023-05-07

    IPC classification: G06F21/55

    CPC classification: G06F21/552 G06F2221/034

    Abstract: A method and an electronic device (100) are disclosed for generating semantic representation of a document to determine data security risk associated with the document. The method includes receiving, by a document semantics controller (160) of the electronic device (100), a document in an electronic form and determining, by the document semantics controller (160), raw text. Further, the method includes generating, by the document semantics controller (160), a plurality of sentence blocks using the raw text and determining, by the document semantics controller (160), embeddings for the plurality of sentence blocks. Further, the method includes determining, by the document semantics controller (160), the semantic representation of the document based on the embeddings for each of the sentence blocks; and generating, by the document semantics controller (160), the semantic representation of the document to determine the data security risk associated with the document.
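
The flow in the abstract above (raw text, then sentence blocks, then block embeddings, then one document-level representation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the regex sentence splitter, the fixed block size, the hashed bag-of-words vector standing in for a learned embedding model, the mean pooling, and the names `DIM`, `sentence_blocks`, `embed_block` and `document_representation`.

```python
import hashlib
import math
import re

DIM = 16  # hypothetical embedding width for this toy example


def sentence_blocks(raw_text, sentences_per_block=2):
    """Split raw text into sentences and group them into fixed-size blocks."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", raw_text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_block])
        for i in range(0, len(sentences), sentences_per_block)
    ]


def embed_block(block):
    """Toy stand-in for a learned embedding: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", block.lower()):
        # A stable hash keeps the word-to-bucket mapping identical across runs.
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def document_representation(raw_text, sentences_per_block=2):
    """Average the block embeddings into a single document-level vector."""
    blocks = sentence_blocks(raw_text, sentences_per_block)
    if not blocks:
        return [0.0] * DIM
    embeddings = [embed_block(b) for b in blocks]
    return [sum(col) / len(embeddings) for col in zip(*embeddings)]
```

A downstream risk check would consume the vector returned by `document_representation`; how the patent scores risk from that representation is not stated in the abstract, so it is left out of the sketch.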

    METHOD AND ELECTRONIC DEVICE TO ASSIGN APPROPRIATE SEMANTIC CATEGORIES TO DOCUMENTS WITH ARBITRARY GRANULARITY

    Publication No.: US20240202215A1

    Publication date: 2024-06-20

    Application No.: US18083522

    Filing date: 2022-12-18

    IPC classification: G06F16/28 G06F18/23213

    CPC classification: G06F16/285 G06F18/23213

    Abstract: Embodiments herein disclose a method for determining at least one semantic category of at least one document using an electronic device 100. The method includes receiving at least one document embedding indicating a semantic representation of at least one document. Further, the method includes determining a probable set of semantic categories of a plurality of semantic categories associated with the document embedding based on an execution of the at least one document embedding on a plurality of proto-models. Further, the method includes receiving the semantic model associated with each of the probable set of semantic categories. Further, the method includes executing the at least one document embedding on the received semantic model. Further, the method includes determining the at least one semantic category out of the probable set of semantic categories, of the at least one document embedding based on the at least one executed document.
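
The two-stage selection described in the abstract above (proto-models narrow the candidate categories, then the full semantic model of each candidate decides) might look roughly like the following sketch. The abstract does not say what a proto-model or a semantic model is, so this example assumes, purely for illustration, that a proto-model is a category centroid compared by cosine similarity and that a semantic model is any callable returning a score. The names `cosine`, `shortlist`, `classify` and `load_semantic_model` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def shortlist(doc_embedding, proto_models, top_k=2):
    """Stage 1: rank categories by similarity to cheap proto-models (assumed centroids)."""
    ranked = sorted(
        proto_models,
        key=lambda cat: cosine(doc_embedding, proto_models[cat]),
        reverse=True,
    )
    return ranked[:top_k]


def classify(doc_embedding, proto_models, load_semantic_model, top_k=2):
    """Stage 2: fetch and run the full model only for the shortlisted categories."""
    candidates = shortlist(doc_embedding, proto_models, top_k)
    # load_semantic_model(cat) is assumed to return a callable: embedding -> score.
    scores = {cat: load_semantic_model(cat)(doc_embedding) for cat in candidates}
    return max(scores, key=scores.get)
```

The design point the sketch tries to capture is that the expensive per-category models are loaded only for the few categories that survive the cheap first pass, which is what lets the number of categories grow without running every model on every document.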

    METHOD AND ELECTRONIC DEVICE FOR GENERATING SEMANTIC REPRESENTATION OF DOCUMENT TO DETERMINE DATA SECURITY RISK

    Publication No.: US20230273992A1

    Publication date: 2023-08-31

    Application No.: US18144239

    Filing date: 2023-05-07

    IPC classification: G06F21/55

    CPC classification: G06F21/552 G06F2221/034

    Abstract: A method and an electronic device (100) are disclosed for generating semantic representation of a document to determine data security risk associated with the document. The method includes receiving, by a document semantics controller (160) of the electronic device (100), a document in an electronic form and determining, by the document semantics controller (160), raw text. Further, the method includes generating, by the document semantics controller (160), a plurality of sentence blocks using the raw text and determining, by the document semantics controller (160), embeddings for the plurality of sentence blocks. Further, the method includes determining, by the document semantics controller (160), the semantic representation of the document based on the embeddings for each of the sentence blocks; and generating, by the document semantics controller (160), the semantic representation of the document to determine the data security risk associated with the document.

    Method and electronic device for generating semantic representation of document to determine data security risk

    Publication No.: US11687647B2

    Publication date: 2023-06-27

    Application No.: US17160369

    Filing date: 2021-01-27

    IPC classification: G06F21/55

    CPC classification: G06F21/552 G06F2221/034

    Abstract: A method and an electronic device (100) are disclosed for generating semantic representation of a document to determine data security risk associated with the document. The method includes receiving, by a document semantics controller (160) of the electronic device (100), a document in an electronic form and determining, by the document semantics controller (160), raw text. Further, the method includes generating, by the document semantics controller (160), a plurality of sentence blocks using the raw text and determining, by the document semantics controller (160), embeddings for the plurality of sentence blocks. Further, the method includes determining, by the document semantics controller (160), the semantic representation of the document based on the embeddings for each of the sentence blocks; and generating, by the document semantics controller (160), the semantic representation of the document to determine the data security risk associated with the document.

    METHOD AND ELECTRONIC DEVICE FOR GENERATING SEMANTIC REPRESENTATION OF DOCUMENT TO DETERMINE DATA SECURITY RISK

    Publication No.: US20210256115A1

    Publication date: 2021-08-19

    Application No.: US17160369

    Filing date: 2021-01-27

    IPC classification: G06F21/55 G06F40/35

    Abstract: A method and an electronic device (100) are disclosed for generating semantic representation of a document to determine data security risk associated with the document. The method includes receiving, by a document semantics controller (160) of the electronic device (100), a document in an electronic form and determining, by the document semantics controller (160), raw text. Further, the method includes generating, by the document semantics controller (160), a plurality of sentence blocks using the raw text and determining, by the document semantics controller (160), embeddings for the plurality of sentence blocks. Further, the method includes determining, by the document semantics controller (160), the semantic representation of the document based on the embeddings for each of the sentence blocks; and generating, by the document semantics controller (160), the semantic representation of the document to determine the data security risk associated with the document.