Behavior-Driven Development in Malware Analysis
Abstract
A daily task of malware analysts is extracting behaviors from malicious binaries. Such behaviors include domain generation algorithms, cryptographic algorithms, and deinstallation routines. Ideally, this tedious task would be automated, but so far scientific solutions have not progressed beyond proofs of concept. Malware analysts therefore continue to reimplement behaviors of interest manually. Often, however, they merely translate the assembly code of the malicious binary into a higher-level language. This yields poorly readable, undocumented code whose correctness is not ensured. Furthermore, the process that malware analysts currently follow leads to poor focus, since they deal with too much binary code at once. In this paper, we aim to overcome these shortcomings by improving the malware analysis process with regard to the reimplementation of malicious behaviors. We achieve this by integrating Behavior-Driven Development into the malware analysis process, and we explain in detail how this integration can be done. In a case study on the highly obfuscated malware Nymaim, we show the feasibility of our approach.
Copyright (c) 2015 Thomas Barabosch, Elmar Gerhards-Padilla
This work is licensed under a Creative Commons Attribution 4.0 International License.