000087544 001__ 87544
000087544 005__ 20200716101509.0
000087544 0247_ $$2doi$$a10.1016/j.eswa.2019.01.050
000087544 0248_ $$2sideral$$a110344
000087544 037__ $$aART-2019-110344
000087544 041__ $$aeng
000087544 100__ $$aSelvi, J.
000087544 245__ $$aDetection of algorithmically generated malicious domain names using masked N-grams
000087544 260__ $$c2019
000087544 5060_ $$aAccess copy available to the general public$$fUnrestricted
000087544 5203_ $$aMalware detection is a challenge that has increased in complexity in the last few years. A widely adopted strategy is to detect malware by means of analyzing network traffic, capturing the communications with their command and control (C&C) servers. However, some malware families have shifted to a stealthier communication strategy, since anti-malware companies maintain blacklists of known malicious locations. Instead of using static IP addresses or domain names, they algorithmically generate domain names that may host their C&C servers. Hence, blacklist approaches become ineffective since the number of domain names to block is large and varies from time to time. In this paper, we introduce a machine learning approach using Random Forest that relies on purely lexical features of the domain names to detect algorithmically generated domains. In particular, we propose using masked N-grams, together with other statistics obtained from the domain name. Furthermore, we provide a dataset built for experimentation that contains regular and algorithmically generated domain names, coming from different malware families. We also classify these families according to their type of domain generation algorithm. Our findings show that masked N-grams provide detection accuracy that is comparable to that of other existing techniques, but with much better performance.
000087544 536__ $$9info:eu-repo/grantAgreement/ES/DGA/T21-17R-DISCO
000087544 540__ $$9info:eu-repo/semantics/openAccess$$aby-nc-nd$$uhttp://creativecommons.org/licenses/by-nc-nd/3.0/es/
000087544 590__ $$a5.452$$b2019
000087544 591__ $$aCOMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE$$b21 / 136 = 0.154$$c2019$$dQ1$$eT1
000087544 591__ $$aOPERATIONS RESEARCH & MANAGEMENT SCIENCE$$b2 / 83 = 0.024$$c2019$$dQ1$$eT1
000087544 591__ $$aENGINEERING, ELECTRICAL & ELECTRONIC$$b32 / 266 = 0.12$$c2019$$dQ1$$eT1
000087544 592__ $$a1.494$$b2019
000087544 593__ $$aArtificial Intelligence$$c2019$$dQ1
000087544 593__ $$aEngineering (miscellaneous)$$c2019$$dQ1
000087544 593__ $$aComputer Science Applications$$c2019$$dQ1
000087544 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/acceptedVersion
000087544 700__ $$0(orcid)0000-0001-7982-0359$$aRodríguez, R.J.$$uUniversidad de Zaragoza
000087544 700__ $$aSoria-Olivas, E.
000087544 7102_ $$15007$$2570$$aUniversidad de Zaragoza$$bDpto. Informát.Ingenie.Sistms.$$cÁrea Lenguajes y Sistemas Inf.
000087544 773__ $$g124 (2019), 156-163$$pExpert syst. appl.$$tExpert Systems with Applications$$x0957-4174
000087544 8564_ $$s297880$$uhttps://zaguan.unizar.es/record/87544/files/texto_completo.pdf$$yPostprint
000087544 8564_ $$s54122$$uhttps://zaguan.unizar.es/record/87544/files/texto_completo.jpg?subformat=icon$$xicon$$yPostprint
000087544 909CO $$ooai:zaguan.unizar.es:87544$$particulos$$pdriver
000087544 951__ $$a2020-07-16-09:17:48
000087544 980__ $$aARTICLE