Author dc.contributor.author | Csányi Gergely Márk | |
Author dc.contributor.author | Nagy Dániel | |
Author dc.contributor.author | Vági Renátó | |
Author dc.contributor.author | Vadász János Pál | |
Author dc.contributor.author | Orosz Tamás | |
Date accessioned dc.date.accessioned | 2023-04-05T09:26:07Z | |
Date available dc.date.available | 2023-04-05T09:26:07Z | |
Date issued dc.date.issued | 2021 | |
ISSN dc.identifier.issn | 2073-8994 | |
URI dc.identifier.uri | http://hdl.handle.net/20.500.12944/20330 | |
Abstract dc.description.abstract | Data sharing is a central aspect of judicial systems. Openly accessible documents can make the judiciary system more transparent. On the other hand, published legal documents can contain a great deal of sensitive information about the involved persons or companies. For this reason, the anonymization of these documents is obligatory to prevent privacy breaches. The General Data Protection Regulation (GDPR) and other modern privacy-protecting regulations have strict definitions of private data containing direct and indirect identifiers. In legal documents, there is a wide range of attributes regarding the involved parties. Moreover, legal documents can contain additional information about the relations between the involved parties and rare events. Hence, the personal data can be represented by a sparse matrix of these attributes. The application of Named Entity Recognition methods is essential for a fair anonymization process but is not enough. Machine learning-based methods should be used together with anonymization models, such as differential privacy, to reduce re-identification risk. On the other hand, the information content (utility) of the text should be preserved. This paper aims to summarize and highlight the open and symmetrical problems from the fields of structured and unstructured text anonymization. The possible methods for anonymizing legal documents are discussed and illustrated by case studies from the Hungarian legal practice. | |
Language dc.language | en | |
Keyword dc.subject | data mining, text mining, text recognition, machine learning, knowledge engineering | |
Title dc.title | Challenges and Open Problems of Legal Document Anonymization | |
Type dc.type | journal article | |
Date updated dc.date.updated | 2023-04-04T12:08:25Z | |
Version dc.description.version | publisher's version | |
Access dc.rights.accessRights | open access | |
dc.description.notes | Funding Agency and Grant Number: National Research, Development and Innovation Fund of Hungary under the 2020-1.1.2-PIACI KFI funding scheme [2020-1.1.2-PIACI-KFI-2020-00049] Funding text: Project No. 2020-1.1.2-PIACI-KFI-2020-00049 has been implemented with the support provided from the National Research, Development and Innovation Fund of Hungary, financed under the 2020-1.1.2-PIACI KFI funding scheme. | |
DOI dc.identifier.doi | 10.3390/sym13081490 | |
Discipline dc.subject.discipline | Social sciences | |
Science branch dc.subject.sciencebranch | Political and legal sciences | |
MTMT identifier dc.identifier.mtmt | 32164020 | |
Journal dc.identifier.journalTitle | Symmetry | |
Volume dc.identifier.journalVolume | 13 | |
Issue number dc.identifier.journalIssueNumber | 8 | |
Pages dc.format.page | 1-25 | |
WoS identifier dc.identifier.wos | 000689974900001 | |
Scopus identifier dc.identifier.scopus | 85112768075 | |
Abbreviated journal title dc.identifier.journalAbbreviatedTitle | SYMMETRY-BASEL | |
Year of issue dc.description.issuedate | 2021 | |
Author's department dc.contributor.department | Állam- és Jogtudományi Doktori Iskola | |
Author's department dc.contributor.department | Információs Társadalom Kutatóintézet | |
Author's department dc.contributor.department | Villamos Energetika Tanszék | |