IronNet Blog

The need for behavior-based detection as attackers adopt uncommon coding languages

According to recent findings from the BlackBerry Research and Intelligence Team, exotic programming languages are gaining popularity among both APTs (advanced persistent threats) and cybercriminals.

By writing or rewriting their malware in uncommon programming languages, malware developers can easily bypass static signature-based detections, creating a significant hole in our cyber defenses. One way to close this gap is to use behavior-based detection to tag anomalous behavior if and when traditional signatures fail.

Signature-based detection vs. Behavior-based detection

Signature-based detection typically monitors inbound network traffic to find predefined patterns and sequences that match known indicators of compromise (IOCs). Behavior-based detection, on the other hand, goes beyond just identifying patterns linked to specific types of attacks or malware. 

Behavioral analytics examine the patterns and activities of users and applications in a network to create a behavioral baseline that learns and adapts to the dynamic nature of an organization’s raw network traffic. Using network traffic analysis, behavior-based solutions identify anomalies in a network that deviate from standard activity, allowing defenders to catch previously unseen threats.
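To make the idea of a behavioral baseline concrete, here is a minimal sketch in Python. It assumes a single metric (hourly outbound bytes for one host) and a simple standard-deviation threshold; the function names, sample numbers, and threshold are hypothetical and chosen only for illustration. Real behavioral analytics draw on far richer features and models.

```python
from statistics import mean, stdev

def build_baseline(history):
    """Return (mean, standard deviation) of past observations for one host."""
    return mean(history), stdev(history)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that sits more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical hourly outbound byte counts for one workstation.
history = [1200, 1350, 1100, 1280, 1320, 1190, 1250]
baseline = build_baseline(history)

print(is_anomalous(1300, baseline))   # within normal variation for this host
print(is_anomalous(50000, baseline))  # far outside the learned baseline
```

Note that the check never asks what the traffic contains or which language produced it. It only asks whether the host is behaving differently from its own history.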

How developers are integrating exotic programming languages into malware 

One way threat actors are getting creative is by “wrapping” commodity malware in loaders and droppers coded in unconventional languages in order to obfuscate the first stage of the infection process. Other developers are taking it a step further and completely rewriting existing malware code to create new and improved variants.

And some threat actors are doing both. APT28 – a group known to be associated with Russia’s General Staff Main Intelligence Directorate (GRU) – leverages a multi-language kill chain and has repeatedly employed unusual languages in its development process. For example, APT28’s Zebrocy backdoor was originally written in Delphi in 2015, but was rewritten from Delphi to Go in 2018. A year later in 2019, the Zebrocy downloader first popped up in Nim, but was later seen rewritten from Nim to Go in October 2020. 

APT28 still leverages the same initial intrusion vector and many of the same tactics, suggesting it is likely easier for malicious actors to port their original malware code to other languages rather than changing their tactics, techniques, and procedures (TTPs) to dodge defenders. Because TTPs are really just adversarial "behaviors," they are more difficult for an attacker to change and are thus the best type of indicators for defenders to focus on.

Uncommon programming languages make signature-based detection inadequate

Until recently, it’s been rare to see malware written in these languages. As such, reverse engineers are not as familiar with their implementation, and malware analysis tools and sandboxes may have a difficult time analyzing their samples. Additionally, compared with traditional C-based languages, uncommon programming languages are harder to identify and decipher because they produce more complex and convoluted binaries.

Here’s where it becomes a problem. Anti-virus (AV) products and endpoint detection and response (EDR) solutions rely on pre-existing lists of common detections to scan and sandbox C-language executables. When these tools encounter a binary built from an unrecognized language, they often let it pass, and the malicious activity goes unflagged because the tool has no heuristics that map the unfamiliar code to known malicious actions.

Rewriting malware breaks the static signatures for well-known malware families, and the lack of an identifiable signature makes this tactic attractive to threat actors who want to add layers of obfuscation to their attacks. And since signature-based detection depends on specific static characteristics within a file, it is virtually useless when encountering unknown malware strains.
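A small Python sketch shows why a rewrite defeats a static signature. It assumes the simplest possible signature, a SHA-256 file hash, and uses two made-up byte strings as stand-ins for a Delphi build and a Go build of the same malware. Real signatures also include byte patterns and rules, but they depend on static file content in the same way.

```python
import hashlib

def sha256_signature(payload: bytes) -> str:
    """Compute a static, content-based signature (here: a SHA-256 hash)."""
    return hashlib.sha256(payload).hexdigest()

def matches_signature(payload: bytes, signatures: set) -> bool:
    """Return True if the payload's hash is on the list of known-bad hashes."""
    return sha256_signature(payload) in signatures

# Hypothetical stand-ins for two builds of the same malware family.
# The described behavior is identical; only the bytes on disk differ.
original_build = b"delphi-build: connect; download; execute"
rewritten_build = b"go-build: connect; download; execute"

# The defender has only ever seen the original build.
known_bad = {sha256_signature(original_build)}

print(matches_signature(original_build, known_bad))   # the known sample is caught
print(matches_signature(rewritten_build, known_bad))  # the rewrite slips through
```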

The signature-based approach will forever be a “cat and mouse game” of defenders making small tweaks to their signatures to try to catch ever-changing threat actors. Because coding concepts for malware remain largely the same from one language to the next, it is not much of a “jump” for developers to adopt new languages, making signature-based detection an ineffective security solution in these cases.

Why there needs to be a focus on behavior/network-based detection

At first glance, it may seem that there isn’t a good solution to combat exotic malware. However, even though threat actors change the coding language to slip past signature-based EDRs, the actions and behavior of the malware remain unchanged. This means it’s time for defenders to focus on how malware interacts with the system and the network in order to effectively tag dynamic behavior. That’s why network-based behavioral analytics are the most effective way to track and defend against malware written in exotic languages.

For example, once malware – no matter the language it is written in – infiltrates a host machine, it establishes communication with an external server to receive instructions, download additional payloads, or exfiltrate information. Because of this, defenders are able to monitor host activity along with the type and quantity of traffic entering and exiting the network to detect malicious behavior by unknown malware variants.
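One common pattern in this kind of command-and-control traffic is "beaconing": a host checking in with an external server at suspiciously regular intervals. The sketch below is a simplified illustration in Python. The function names, thresholds, and sample timestamps are assumptions for the example, not a production detection; it measures only how regular the gaps between connections are.

```python
from statistics import mean, pstdev

def beacon_score(timestamps):
    """Coefficient of variation of the gaps between connections.

    Lower values mean more regular timing. Returns None when there are
    too few events to measure.
    """
    if len(timestamps) < 3:
        return None
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        return 0.0
    return pstdev(gaps) / average_gap

def looks_like_beacon(timestamps, max_cv=0.1, min_events=5):
    """Flag a host/destination pair whose connections are unusually regular."""
    if len(timestamps) < min_events:
        return False
    score = beacon_score(timestamps)
    return score is not None and score <= max_cv

# Hypothetical connection times (seconds) from one host to one external server.
implant_checkins = [0, 60, 120, 181, 240, 300]  # roughly every 60 seconds
human_browsing = [0, 5, 200, 230, 900, 905]     # bursty and irregular

print(looks_like_beacon(implant_checkins))
print(looks_like_beacon(human_browsing))
```

Nothing in this check depends on whether the implant was compiled from Delphi, Go, Nim, or anything else. Real implants often add jitter to their check-in times, so a timing test like this would be one signal among several rather than a verdict on its own.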

To be clear, signature-based detections should not simply be thrown out; however, security professionals must integrate behavioral detections on both endpoints and the network to ensure full coverage.

A deeper look at the uncommon languages 

Examples of malware written in these languages:

Go: Zebrocy, WellMess, ElectroRAT, Robbinhood, NanoCore Dropper

Rust: RustyBuer, Convuster Adware, PyOxidizer

Nim: Nim-based Cobalt Strike loaders, NimzaLoader, DeroHe, Zebrocy

DLang: Vovalex, RemcosRat, OutCrypt, DShell

(Data compiled from BlackBerry Research and Intelligence Team)

DLang

First released in 2001, with version 1.0 following in 2007, DLang (aka D) is a general-purpose programming language with a C-like syntax. Given its similarity to C-based languages, developers can author and port code in D quickly and efficiently with a relatively shallow learning curve.

Despite the features of D that make it an attractive language for malware developers, there are only a handful of documented D-based malicious executables. One of these is the Vovalex ransomware family, which first popped up in February 2021 and is the first observed ransomware written in D. By today’s standards, Vovalex is a relatively unsophisticated strain that lacks many of the features found in modern ransomware, like spreading mechanisms or deleting shadow copies; however, it can still cause considerable damage to victims.

Nim

Nim was first released in 2008, and it was designed with three objectives in mind: efficiency, expressiveness, and elegance. A statically typed, compiled language, Nim is becoming increasingly popular due to features that set it apart from other options, like a low barrier to entry for new developers, support for multiple programming paradigms, and native, dependency-free binaries.

More malware strains are being written in Nim, such as NimzaLoader, which was first distributed in a phishing campaign by threat actor TA800 in February 2021. Some researchers speculate that NimzaLoader is a variant of BazarLoader – which was predominantly used by TA800 in the past – but distinct differences between the two pieces of malware lead some analysts to consider NimzaLoader as its own malware family. 

Rust

Rust is a relatively new programming language that has become a fan favorite among developers, offering a solution to many of the pain points common in other languages. Originally sponsored by Mozilla and stable since its 1.0 release in 2015, the Rust language provides a sizable improvement in memory safety over more traditional languages, with features such as a borrowing system that eliminates many classes of exploitable memory bugs and an ownership model that makes concurrent code safer.

Rust has been used by malware developers to create entirely new malware strains, rewrite backdoors or loaders, and develop new variants of existing malware. One example is RustyBuer, a new variation of the Buer malware loader that has been found targeting over 50 industry verticals. Another is Convuster, a piece of adware written in Rust that has been targeting macOS systems since March 2021.

Go 

Go is a simple, reliable, and efficient language, and it has become attractive to malware authors because its large binaries often evade AV and EDR detection. An open-source, statically typed, compiled language, Go also makes malware analysis and reverse engineering of a suspect binary significantly more difficult.

There has been continuous growth in malicious payloads developed in this language, and Go-based malware has been hitting systems on a near-regular basis over the last year. This includes Russian APT28’s Zebrocy and APT29’s WellMess, which were both developed into new variants using Go. One example of using Go to design and develop an entirely new malware strain from the ground up is ElectroRAT, which appears as a Trojanized version of common cryptocurrency-related applications and has the main goal of targeting and pilfering victims’ crypto wallets.

In Summary

Security professionals need to stay on top of the proliferation of malware being written in more uncommon languages to avoid being flooded with new threats that they are unable to detect and mitigate. 

The threat of malware being rewritten in once-considered “rare” languages is rising. Therefore, it is vital for the cybersecurity community to focus on the behavior of malware and improve the deployment of behavior/network-based detection analytics in order to remain proactive in defending against the malicious use of unusual programming languages.