000151313 001__ 151313
000151313 005__ 20251017144618.0
000151313 0247_ $$2doi$$a10.1016/j.fsidi.2021.301120
000151313 0248_ $$2sideral$$a124080
000151313 037__ $$aART-2021-124080
000151313 041__ $$aeng
000151313 100__ $$0(orcid)0000-0003-1666-5325$$aMartín-Pérez, M.$$uUniversidad de Zaragoza
000151313 245__ $$aBringing order to approximate matching: Classification and attacks on similarity digest algorithms
000151313 260__ $$c2021
000151313 5060_ $$aAccess copy available to the general public$$fUnrestricted
000151313 5203_ $$aFuzzy hashing or similarity hashing (a.k.a. bytewise approximate matching) converts digital artifacts into an intermediate representation that allows efficient (fast) identification of similar objects, e.g., for blacklisting. These techniques have gained considerable popularity over the past decade, with new algorithms being developed and released to the digital forensics community. When a new algorithm is released (e.g., as part of a scientific article), it is frequently compared with other algorithms to outline the benefits, and sometimes also the weaknesses, of the proposed approach. However, given the wide variety of algorithms and approaches, it is impossible to provide direct comparisons with all existing algorithms. In this paper, we present the first classification of approximate matching algorithms, which allows easier description and comparison. To this end, we first reviewed existing literature to understand the techniques various algorithms use and to familiarize ourselves with the common terminology. Our findings allowed us to develop a categorization relying heavily on the terminology proposed by NIST SP 800-168. In addition to the categorization, this article presents an abstract set of attacks against algorithms and explains why they are feasible. Lastly, we detail the characteristics needed to build robust algorithms that prevent attacks. We believe that this article helps newcomers, practitioners, and experts alike to better compare algorithms and to understand their potential, their characteristics, and the implications they may have on forensic investigations.
000151313 536__ $$9info:eu-repo/grantAgreement/ES/DGA/T21-20R-DISCO$$9info:eu-repo/grantAgreement/ES/MICIU/Medrese-RTI2018-098543-B-I00$$9info:eu-repo/grantAgreement/ES/MINECO-INCIBE/INCIBEC-2015-02486$$9info:eu-repo/grantAgreement/ES/MINECO-INCIBE/INCIBEI-2015-27300
000151313 540__ $$9info:eu-repo/semantics/openAccess$$aby-nc-nd$$uhttps://creativecommons.org/licenses/by-nc-nd/4.0/deed.es
000151313 590__ $$a1.805$$b2021
000151313 591__ $$aCOMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS$$b94 / 112 = 0.839$$c2021$$dQ4$$eT3
000151313 591__ $$aCOMPUTER SCIENCE, INFORMATION SYSTEMS$$b131 / 163 = 0.804$$c2021$$dQ4$$eT3
000151313 592__ $$a1.23$$b2021
000151313 593__ $$aComputer Science Applications$$c2021$$dQ1
000151313 593__ $$aPathology and Forensic Medicine$$c2021$$dQ1
000151313 593__ $$aMedical Laboratory Technology$$c2021$$dQ1
000151313 593__ $$aInformation Systems$$c2021$$dQ1
000151313 594__ $$a5.0$$b2021
000151313 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000151313 700__ $$0(orcid)0000-0001-7982-0359$$aRodríguez, R.J.$$uUniversidad de Zaragoza
000151313 700__ $$aBreitinger, F.
000151313 7102_ $$15007$$2570$$aUniversidad de Zaragoza$$bDpto. Informát.Ingenie.Sistms.$$cÁrea Lenguajes y Sistemas Inf.
000151313 773__ $$g36, Suppl. (2021), 301120 [9 pp.]$$pForensic sci. int. digital invest.$$tForensic science international. Digital investigation$$x2666-2825
000151313 8564_ $$s334204$$uhttps://zaguan.unizar.es/record/151313/files/texto_completo.pdf$$yPublished version
000151313 8564_ $$s2676072$$uhttps://zaguan.unizar.es/record/151313/files/texto_completo.jpg?subformat=icon$$xicon$$yPublished version
000151313 909CO $$ooai:zaguan.unizar.es:151313$$particulos$$pdriver
000151313 951__ $$a2025-10-17-14:20:38
000151313 980__ $$aARTICLE