From Words to Intelligence
Leveraging the Cyber Operation Constraint Principle, Natural Language Understanding, and Association Rules for Cyber Threat Analysis
This paper proposes a system for collecting and structuring blog articles about cyber-attacks, so that security researchers can more readily compare the modus operandi of threat actors.
Grounding our work in criminology, we also formulate a Cyber Operation Constraint Principle that could inform future research. From this principle we derive a tool, the AbductionReductor, which has the potential to augment partial knowledge of a threat actor's behaviour while its actions are being investigated.
Our approach has the potential to support cyber threat analysis and investigation. Future research should address the challenge of synchronic and diachronic linguistic analysis.
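The abstract does not describe how the AbductionReductor works internally. As a purely illustrative sketch, and not the authors' implementation, the following Python code shows one way association rules (named in the title) could be mined from structured reports and used to suggest behaviours that an analyst has not yet observed. The report contents, technique labels, thresholds, and the function names `mine_rules` and `suggest` are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical structured reports: each is the set of techniques one article describes.
reports = [
    {"phishing", "powershell", "credential_dumping"},
    {"phishing", "powershell", "credential_dumping", "exfiltration"},
    {"phishing", "powershell"},
    {"exploit_public_app", "web_shell", "credential_dumping"},
    {"exploit_public_app", "web_shell"},
]


def mine_rules(reports, min_support=0.4, min_confidence=0.8):
    """Mine rules 'antecedent -> consequent' from itemsets of size 2 or 3."""
    n = len(reports)
    counts = Counter()
    # Count how many reports contain each itemset of size 1, 2, or 3.
    for techniques in reports:
        items = sorted(techniques)
        for size in (1, 2, 3):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1

    rules = []
    for itemset, count in counts.items():
        if len(itemset) < 2 or count / n < min_support:
            continue
        # Try each member of a frequent itemset as the single consequent.
        for consequent in itemset:
            antecedent = itemset - {consequent}
            confidence = count / counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, confidence))
    return rules


def suggest(observed, rules):
    """Return unobserved techniques implied by the rules, with their best confidence."""
    suggestions = {}
    for antecedent, consequent, confidence in rules:
        if antecedent <= observed and consequent not in observed:
            best = suggestions.get(consequent, 0.0)
            suggestions[consequent] = max(best, confidence)
    return suggestions


rules = mine_rules(reports)
partial_observation = {"phishing"}
hypotheses = suggest(partial_observation, rules)
print(hypotheses)
```

With this toy data, observing only `phishing` yields `powershell` as a candidate to look for, because every hypothetical report mentioning the former also mentions the latter. A suggestion of this kind is a lead for the analyst to verify, not a conclusion.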
Copyright (c) 2023 Ronan Mouchoux, François Moerman
This work is licensed under a Creative Commons Attribution 4.0 International License.