If you are interested in R&D on AI-based malware detection, you can use the WMI API to extract useful information about each malware sample and build a malware dataset, particularly for file-less malware. Windows Management Instrumentation (WMI) is an API for monitoring and managing Windows operating systems. For example, it can monitor process creation, services, and privilege information for each sample, and it can help determine whether a sample is packed by checking its allocated virtual size. Threat actors also use WMI with malicious intent, for instance to develop file-less malware. You can write a WMI script in C, C++, Python, or PowerShell. The attached image shows an example of the collected data. Source code: https://lnkd.in/ganSBPai You can extend this code to extract more information through WMI using these links: https://lnkd.in/gcxwDyf4 https://lnkd.in/g-H2NzqV Reference: Black Hat Python (book)
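As a rough illustration of the WMI-based data collection described above, here is a minimal Python sketch. It is not the code behind the linked source. It assumes the third-party `wmi` package (Windows only, requires pywin32) and records a handful of documented `Win32_Process` properties. The field selection, the helper names `process_to_row`, `snapshot_processes`, and `write_dataset`, and the CSV layout are all hypothetical choices made for this example.

```python
import csv

# Win32_Process properties recorded per process. The selection is an
# assumption for this sketch; the class exposes many more fields.
FIELDS = ["ProcessId", "ParentProcessId", "Name",
          "ExecutablePath", "CommandLine", "VirtualSize"]


def process_to_row(proc):
    """Turn one Win32_Process-like object into a flat dict for the dataset."""
    row = {name: getattr(proc, name, None) for name in FIELDS}
    # 64-bit WMI integers may arrive as strings, so normalise VirtualSize to int.
    if row["VirtualSize"] is not None:
        row["VirtualSize"] = int(row["VirtualSize"])
    return row


def snapshot_processes():
    """Query all running processes through WMI (Windows only)."""
    import wmi  # third-party package, imported lazily: pip install wmi
    return [process_to_row(p) for p in wmi.WMI().Win32_Process()]


def write_dataset(rows, path):
    """Write the collected rows to a CSV file with one column per field."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

On a Windows analysis machine, calling `write_dataset(snapshot_processes(), "processes.csv")` while a sample is running would produce one row per process. Keeping `process_to_row` separate from the WMI query makes it easy to add more properties later or to test the conversion without Windows.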
As the volume of malware grows, an approach called shared code analysis, or similarity analysis, can save malware researchers a great deal of reverse engineering work. Shared code analysis compares two malware samples by estimating the percentage of pre-compilation source code they share. Four measures can be used to identify similarity between malware samples: (1) instruction-sequence-based similarity (x86 assembly instructions); (2) string-based similarity; (3) IAT-based similarity (Import Address Table); (4) dynamic API-call-based similarity (you can collect malicious API calls from logs). Benefits of the shared code analysis approach: it determines a new malware sample's code similarity to thousands of previously seen samples; it identifies new malware families based on shared code; it visualizes malware relationships to reveal the most common techniques that threat actors use (this is important when building an ML-based malware detector); and it can replace much of the manual reverse engineering work.
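To make the similarity measures above concrete, here is a small Python sketch. The post does not name a specific metric, so the Jaccard index (size of the intersection divided by size of the union) is used here as one common choice, and that choice is an assumption. The sketch applies it to two of the four measures: printable strings and n-grams of an instruction sequence. The sample bytes and opcode lists are made up for illustration.

```python
import re


def extract_strings(data: bytes, min_len: int = 5) -> set:
    """Return the set of printable ASCII runs of at least min_len bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return {m.decode("ascii") for m in re.findall(pattern, data)}


def ngrams(sequence, n: int = 3) -> set:
    """Return the set of n-grams (tuples) of an instruction or API-call sequence."""
    return {tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard index: shared features divided by all distinct features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical raw bytes standing in for two malware samples.
sample_a = (b"\x00\x01http://example.test/gate.php\x00cmd.exe /c whoami"
            b"\x00\x02CreateRemoteThread\x00")
sample_b = (b"\x7f\x00http://example.test/gate.php\x00CreateRemoteThread"
            b"\x00powershell -enc\x00")

# Hypothetical disassembled opcode sequences for the same two samples.
ops_a = ["push", "mov", "sub", "call", "test", "jz"]
ops_b = ["push", "mov", "sub", "call", "cmp", "jnz"]

string_similarity = jaccard(extract_strings(sample_a), extract_strings(sample_b))
opcode_similarity = jaccard(ngrams(ops_a), ngrams(ops_b))

print(f"string-based similarity: {string_similarity:.2f}")
print(f"instruction 3-gram similarity: {opcode_similarity:.2f}")
```

The same `jaccard` function works for the other two measures if each sample is reduced to a set of imported function names (IAT) or to n-grams of logged API calls. Comparing a new sample against a large collection then amounts to computing this score against every stored feature set.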