Challenges and Open Problems of Legal Document Anonymization

Csányi, Gergely Márk; Nagy, Dániel; Vági, Renátó; Vadász, János Pál; Orosz, Tamás

DOI : 10.3390/sym13081490

URI : http://hdl.handle.net/20.500.12944/20330

MTMT : 32164020

Date : 2021

Journal title : Symmetry

Journal volume : 13

Journal issue number : 8

Pages : 1-25

Document type : journal article

Subject : data mining, text mining, text recognition, machine learning, knowledge engineering, Social sciences, Political and legal sciences

Abstract :

Data sharing is a central aspect of judicial systems. Openly accessible documents can make the judiciary more transparent. On the other hand, published legal documents can contain a great deal of sensitive information about the persons or companies involved. For this reason, these documents must be anonymized to prevent privacy breaches. The General Data Protection Regulation (GDPR) and other modern privacy-protecting regulations define private data strictly, covering both direct and indirect identifiers. Legal documents contain a wide range of attributes regarding the involved parties. Moreover, they can contain additional information about the relations between the involved parties and about rare events. Hence, the personal data can be represented by a sparse matrix of these attributes. The application of Named Entity Recognition methods is essential for a fair anonymization process but is not sufficient on its own. Machine learning-based methods should be used together with anonymization models, such as differential privacy, to reduce re-identification risk. At the same time, the information content (utility) of the text should be preserved. This paper aims to summarize and highlight the open and symmetrical problems from the fields of structured and unstructured text anonymization. The possible methods for anonymizing legal documents are discussed and illustrated by case studies from Hungarian legal practice.
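The abstract's point that masking names alone is not enough can be illustrated with a minimal sketch. It is not taken from the paper: the records, attribute names, and helper functions below are hypothetical, and the measure used (the share of records whose combination of indirect identifiers is unique) is only one simple proxy for re-identification risk.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r.get(q) for q in quasi_identifiers) for r in records)


def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination occurs exactly once."""
    if not records:
        return 0.0
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    unique = sum(
        1
        for r in records
        if sizes[tuple(r.get(q) for q in quasi_identifiers)] == 1
    )
    return unique / len(records)


# Hypothetical attributes that remain in four documents after names were masked.
# Missing values (None) reflect the sparse nature of such attribute data.
records = [
    {"occupation": "teacher", "town": "A", "event": "traffic accident"},
    {"occupation": "teacher", "town": "A", "event": "traffic accident"},
    {"occupation": "mayor", "town": "B", "event": "bribery"},
    {"occupation": "teacher", "town": "B", "event": None},
]

risk_all = unique_fraction(records, ["occupation", "town", "event"])
risk_occupation = unique_fraction(records, ["occupation"])
print(risk_all, risk_occupation)
```

In this toy data, half of the records are unique once occupation, town, and event are combined, even though no name appears in any of them. Generalizing or suppressing attributes lowers that share, but it also removes information, which is the utility trade-off the abstract mentions.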

Files in this item

Name: symmetry-13-01490 (1).pdf

Size: 1.153Mb

Format: PDF
