There are lots of useful ways of looking at a file to determine whether it’s malware, and if it is, what it’s trying to do. Some of those ways involve dissecting the file on disk for clues to a file’s structure (static analysis), and some of those ways involve watching the file in action (dynamically) to see what it actually does. These are both useful pieces of the puzzle, and in most instances researchers will first use static and then dynamic methods of analyzing the file.
In Part 1 – Static Analysis we covered some methods of static analysis, namely Text View, Hex View and Assembler View. All three are ways of looking for clues that tell you what a file might be trying to do. For instance, is the file armored in some way to dissuade analysis? Is the text in the file indicative of a professional piece of software, or is it more casual, using curse words or “leetspeak”? Does the file have structures that seem to indicate it exploits vulnerabilities in other software, or does it seem to be trying to achieve persistence? None of these things is conclusive evidence on its own, but together they help create a picture of what the file is trying to do.
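As a rough illustration of the kind of clue-hunting Text View allows, here is a minimal sketch that pulls runs of printable ASCII out of a file's raw bytes. The function name `extract_strings` and the four-character minimum are assumptions made for this example, not part of any particular tool.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    # 0x20-0x7e covers space through tilde, i.e. the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# A made-up byte blob standing in for a file read with open(path, "rb").read().
sample = b"\x00\x01MZ\x90\x00hello world\x00\xffab\x00http://example.test/c2\x00"
found = extract_strings(sample)
```

Short fragments such as "MZ" or "ab" fall below the minimum length and are skipped, which keeps the output focused on text a human might actually read for clues.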
The other important part of static analysis is to figure out what sorts of dependencies a file has. The first and most obvious dependency is the type of operating system – does this file run on Windows, OS X or Linux, for example? A file may have other requirements to run too, depending on what programming language it was written in, or if it tries to spread (such as through an instant messaging or peer-to-peer file sharing app) or infect certain types of files.
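The operating system question can often be answered from the first few bytes of a file. The sketch below checks for the well-known signatures of Windows, Linux and OS X executables. The function name `guess_platform` and the labels it returns are assumptions chosen for this example, and a matching signature is only a hint, since the rest of the file may still be malformed or deliberately misleading.

```python
# Mach-O signatures for 32-bit and 64-bit files, in both byte orders.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce",
    b"\xfe\xed\xfa\xcf",
    b"\xce\xfa\xed\xfe",
    b"\xcf\xfa\xed\xfe",
}


def guess_platform(header: bytes) -> str:
    """Guess the target platform from the leading bytes of a file."""
    if header[:4] == b"\x7fELF":
        return "Linux or other Unix-like (ELF)"
    if header[:2] == b"MZ":
        return "Windows or DOS (MZ/PE)"
    if header[:4] in MACHO_MAGICS:
        return "OS X (Mach-O)"
    if header[:4] == b"\xca\xfe\xba\xbe":
        # This signature is shared by two formats, so it cannot be decided here.
        return "OS X universal binary or Java class file"
    return "unknown"
```

In practice you would pass in something like the first 16 bytes of the sample, read in binary mode, and treat the answer as a starting point for choosing which goat image to use.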
Once we have given the file a thorough look with static analysis methods, we hopefully have a good picture of how we need to set up our test environments.
You know that scene in Jurassic Park where they’re driving through the park and a goat is brought up as a snack for the T-Rex? (Apologies, I could only find the scene with added sheep-commentary.) In malware analysis, we have a similar idea for tempting malware to do its thing. Researchers use what we call a “sacrificial goat machine” – or just “goat” for short. This machine is set up to be the tastiest possible treat for the malware, by giving it all the conditions it needs to do what it intends to do, and appearing as much as possible to be a real user’s machine rather than a safely quarantined test machine.
In order to do this quickly, most researchers keep several standard “images” of physical or virtual machines that can be quickly returned to a known-clean state. This is helpful for repeating analysis on one file if needed, or for getting ready to analyze other files. Usually these include an image for each of several OS versions, to see if the malware behaves differently on one versus another. If, for instance, a researcher specializes in just Mac threats, they might have an image for each supported version of OS X plus any versions that are in beta. The same goes for researchers specializing in other operating systems. (Though things get complicated when you throw in different Linux flavors or the limitations of different carriers’ or handset manufacturers’ versions of Android.)
Once a researcher has an image all set up, the next thing they need to do is start up their recording tools, so they can see what changes are made by the file. A lot of malware is essentially silent if you’re just looking at it on your screen, so we need tools that will report any system changes or network traffic. And those tools need to be smart enough not to be fooled by rootkit techniques that try to hide the changes.
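To make the idea of recording system changes concrete, here is a minimal sketch of a before-and-after comparison of a directory tree. The names `snapshot` and `diff_snapshots` are assumptions made for this example. It is also exactly the sort of naive, user-level approach a rootkit could fool, so real monitoring tools have to work at a lower level and watch the registry, processes and network traffic as well.

```python
import hashlib
import os


def snapshot(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
            except OSError:
                # Unreadable files (locked, permission denied) are skipped.
                continue
            state[os.path.relpath(path, root)] = digest
    return state


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """List the files that were created, deleted or modified between snapshots."""
    shared = before.keys() & after.keys()
    return {
        "created": sorted(after.keys() - before.keys()),
        "deleted": sorted(before.keys() - after.keys()),
        "modified": sorted(p for p in shared if before[p] != after[p]),
    }
```

The intended use is to call `snapshot` on the goat before launching the sample, let the sample run, call `snapshot` again, and then pass both results to `diff_snapshots`.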
When everything’s ready to go, the researcher will start the file up (usually by just double-clicking it), and then let it do its thing for a few minutes. Sometimes the file will need a little extra coaxing to perform its various actions, so a researcher will usually spend those minutes interacting with the goat system like a regular user would, by opening files and moving around the system. This can activate the trigger events that some malware uses to verify that it is on a real user’s machine rather than in an automated honeypot machine.
Sometimes it can be helpful to isolate specific parts of a file’s behavior, especially if a sample is going to extraordinary lengths to hide its actions. For this purpose, we have what’s called a debugger. These tools were originally created to help programmers step through small sections of code, so they could find and correct bugs. Debugging can be equally useful to a malware researcher who wants to step through small sections of code to figure out specific behavior within a file. For instance, it can be very helpful for getting at a file’s decryption routine, for figuring out the passwords it uses to join C&C channels, or for identifying the conditions used for trigger events.
If you have ever wondered why you sometimes see really in-depth analysis way after a particular malware was first discovered, it’s often because the researcher went through much of the malware’s code with a debugger. Most malware is pretty small in size, but much of the code is usually convoluted and repetitive. Going through a sample in a debugger may require analyzing a section of code once, changing a variable, stepping back through the code again, then changing another variable and doing it yet again… it can be a very arduous and time-consuming process. This sort of thorough analysis isn’t something that gets done with every sample, but with high profile or particularly tricky malware as needed.
In the End
Malware analysis can be a fairly quick and dirty process or a months-long process, depending on the skill and effort of the malware author who created the sample as well as the malware analyst who receives it. If an analyst is looking at his or her umpteenth variant of a family that’s been publicly released, it can be dealt with in a matter of minutes. If it’s the first sample of a heavily armored and feature-rich spyware that’s hitting hundreds of thousands of users, dozens of researchers around the world are probably going to spend a lot of long nights trying to provide useful and juicy tidbits about its behavior. Hopefully we’ve given you some insight into what that process entails, so it’ll seem less mysterious.