
Authors:
(1) S M Rakib Hasan, Department of Computer Science and Engineering, BRAC University, Dhaka, Bangladesh (sm.rakib.hasan@g.bracu.ac.bd);
(2) Aakar Dhakal, Department of Computer Science and Engineering, BRAC University, Dhaka, Bangladesh (aakar.dhakal@g.bracu.ac.bd).
In response to the escalating complexity of malware, researchers have been exploring innovative approaches to strengthen cybersecurity efforts. The development and application of advanced techniques, such as machine learning algorithms and behavioural analysis, have shown promise in augmenting the accuracy and efficiency of malware detection. Furthermore, as malware authors become more adept at evading detection, the focus has shifted toward understanding and countering the methods used for obfuscation.
The study [1] discusses the use of static disassembly and machine learning for malware classification, proposing four small-scale, easy-to-extract features. The authors compare their proposed features with detailed behaviour-related features such as API sequences and show that the proposed features provide macroscopic information about malware, achieving high accuracy with a smaller feature vector. The paper also surveys approaches that combine malware analysis with machine learning and hand-designed static features. These include using strings, registry changes, and API sequences to distinguish malware variants, extracting 3-grams of byte codes, and using image representations with the K-Nearest Neighbors algorithm. Finally, the study notes challenges in malware analysis, such as concept drift and the need for interpretability and explainability in models and features.
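To make the byte 3-gram idea concrete, the following is a minimal sketch of how overlapping byte n-grams could be counted from a binary. It is not the feature set proposed in [1]; the function name `byte_ngrams` and the sample bytes are hypothetical and chosen only for illustration.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 3) -> Counter:
    """Count every overlapping run of n consecutive bytes in data."""
    if n <= 0:
        raise ValueError("n must be positive")
    # A window starting at i covers data[i:i + n]; stop when the window would overrun.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


# Hypothetical 7-byte string standing in for a code section (assumption, not real malware).
sample = bytes.fromhex("558bec558bec90")
counts = byte_ngrams(sample, n=3)

# The most frequent 3-grams could then serve as entries of a fixed-length feature vector.
top = counts.most_common(2)
```

In practice the number of distinct n-grams grows quickly with file size, so a classifier would typically keep only a selected subset of them as features.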
According to another study [2], malware analysis involves two main tasks: malware detection and malware classification, with the latter assigning each sample to the correct malware family. Malware classification systems can be divided into two groups: those based on dynamic analysis and those based on static analysis. Dynamic analysis methods capture the behaviour of the program at runtime by monitoring interactions with the operating system, such as analysing API calls and their temporal order or extracting behavioural graphs based on API call parameters. Static analysis methods perform analysis without executing the program, using techniques like pattern detection, bytecode analysis, or disassembling the code to extract information on the program’s content. Both dynamic and static analysis rely on the extraction of features for malware detection and classification. The study [2] focuses on malware classification based on the extraction of static features from PE malware designed for MS Windows systems.
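As a rough illustration of what extracting static features without executing a program can look like, the sketch below computes a few simple properties directly from a file's raw bytes. These are not the features used in [2]; the function names and the choice of features (size, byte entropy, and a check for the `MZ` magic bytes that open a PE file's DOS header) are assumptions made for this example.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def static_features(data: bytes) -> dict:
    """Hypothetical feature extractor: reads the bytes only, never runs the file."""
    return {
        "size": len(data),
        "entropy": byte_entropy(data),
        # PE files start with the two-byte DOS header magic "MZ".
        "has_mz_header": data[:2] == b"MZ",
    }


# Hypothetical input: a fake header followed by padding (assumption, not a real PE file).
features = static_features(b"MZ" + bytes(62))
```

High byte entropy is often treated as a hint that content is packed or encrypted, which is one reason such a value might be included alongside structural features.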
Another paper [3] explores malware obfuscation techniques, covering encrypted, oligomorphic, polymorphic, and metamorphic malware, which are used to evade antivirus scanners. It provides an overview of the history of malware developed to defeat signature-based antivirus scanners. The paper discusses the obfuscation techniques commonly used by polymorphic and metamorphic malware, such as subroutine reordering, instruction substitution, and code transposition. It highlights the limitations of encrypted malware and the development of oligomorphic and polymorphic malware to overcome those limitations. The paper also mentions the use of emulation and armouring techniques by antivirus tools to detect and prevent polymorphic malware.
The paper [4] proposes a novel framework for Android malware detection using various features and a multimodal deep-learning method. The framework is based on static analysis and aims to distinguish between malware and benign applications. The authors compare the accuracy of their model with other deep neural network models and evaluate the framework’s performance in terms of efficiency in model updates, the usefulness of diverse features, and the feature representation method. The proposed framework outperforms a previously proposed method in terms of accuracy, precision, recall, and F-measure. The features generated by the framework effectively capture application characteristics for malware detection. The paper introduces the first application of multimodal deep learning to Android malware detection and provides experimental results to evaluate the performance of the framework.
Another paper [5] demonstrated experimentally that dynamic analysis-based detection performs equally well on obfuscated and non-obfuscated malware, while static analysis-based detection is more accurate on non-obfuscated samples and degrades considerably under obfuscation. However, this effect can be mitigated by including obfuscated samples in the learning phase.
The paper [6] provides a comprehensive review of malware detection approaches and recent detection methods, aiming to help researchers gain a general understanding of these approaches and their pros and cons. The authors discuss signature-based and heuristic-based detection approaches, which are fast and efficient for known malware but struggle with unknown malware. They also mention behaviour-based, model-checking-based, and cloud-based approaches, which perform well for unknown and complicated malware. Additionally, deep-learning-based, mobile-device-based, and IoT-based approaches are emerging to detect both known and unknown malware. However, no single approach can detect all malware in the wild. The literature review section of the paper mentions several related studies. For example, [7] proposes a technique to extract malware behaviours from system calls and use a phylogenetic tree to improve the classification. Fukushima et al. [8] propose a behaviour-based detection approach for Windows OS, and [9] suggests a bounded feature space behaviour modelling method. Other studies focus on system-centric behaviour models and hardware-enhanced architectures for malware detection.
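The trade-off described for signature-based detection, fast on known malware but blind to unknown variants, can be sketched with a deliberately simplified model in which a "signature" is just the SHA-256 hash of a whole sample. Real scanners typically use richer signatures such as byte patterns; the payload, database, and function names below are hypothetical and serve only as an illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the input bytes."""
    return hashlib.sha256(data).hexdigest()


def is_known_malware(data: bytes, signatures: set) -> bool:
    """Simplified signature check: exact hash lookup in a set of known digests."""
    return sha256_hex(data) in signatures


# Hypothetical known sample and signature database (assumption, not real malware).
known_sample = b"hypothetical-known-malicious-payload"
signature_db = {sha256_hex(known_sample)}

# A variant differing by a single appended byte, standing in for an obfuscated copy.
variant = known_sample + b"\x00"

detected_known = is_known_malware(known_sample, signature_db)
detected_variant = is_known_malware(variant, signature_db)
```

The lookup is a constant-time set membership test, which reflects why this style of detection is fast, while the variant slips through because its digest no longer matches any stored signature.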
This paper is available on arXiv under the CC BY-SA 4.0 DEED license.