Network Communication - Lab

1. Laboratory
2. Sample one
3. Sample 02
- 3.1. Check the networking libs and corresponding methods used by malware to communicate

1. Laboratory

Class: Malware Analysis and Incident Forensics
Topic: Network Communication

2. Sample one

2.1. Check networking libs and corresponding methods used by malware to communicate.

It’s possible to obtain this kind of information via static analysis, for example using PEStudio. In the libraries section we see that the PE uses:

kernel32.dll
advapi32.dll
urlmon.dll

A quick web search shows that urlmon.dll is a Windows system library that exposes APIs for working with URLs and URIs.

The imports section of PEStudio confirms this assumption: the function URLDownloadToCacheFile is clearly imported from urlmon.dll.

2.2. Can you identify suspicious strings linked to network communications?

In the strings section of PEStudio it’s possible to look at the strings present in the PE file. Some suspicious strings are:

www.practicalmalwareanalysis.com/%s/%c.png: it is a URL, but the format specifiers %s and %c are a clue that the executable fills in part of the path at run time, maybe using sprintf.
%c%c:%c%c:%c%c:%c%c:%c%c:%c%c: this format string resembles a MAC address.
%s-%s: this format string is probably used to concatenate two strings, maybe for beaconing purposes.
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/: this one looks like the Base64 alphabet, used for encoding strings.

In the following steps we have to validate these hypotheses, but it’s useful to have them at a preliminary stage, because they tell us where to focus the analysis.
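The same triage can be approximated outside PEStudio. The following is a minimal sketch, not the tool's actual logic: it pulls printable ASCII runs out of a byte blob and tags the three kinds of strings discussed above. The function names, the tag names and the synthetic blob are all assumptions made for illustration.

```python
import re

BASE64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
)


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def classify(s: str) -> list[str]:
    """Tag a string with the network-related hints it matches (hypothetical tags)."""
    tags = []
    if re.search(r"(https?://|www\.)", s, re.IGNORECASE):
        tags.append("url")
    if re.search(r"%[scdx]", s):
        tags.append("format-string")
    if BASE64_ALPHABET in s:
        tags.append("base64-alphabet")
    return tags


# Synthetic blob standing in for the PE file contents (not the real sample).
blob = (
    b"\x00\x01www.practicalmalwareanalysis.com/%s/%c.png\x00"
    b"\x02%s-%s\x00" + BASE64_ALPHABET.encode("ascii") + b"\x00ab\x00"
)

for s in extract_strings(blob):
    print(classify(s), s)
```

On a real sample, `blob` would be the bytes read from the executable; the tags only point at strings worth a closer look and prove nothing on their own.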

2.3. Which kind of activity can you spot on the network? And which strategy does the malware use to beacon to the C2?

To answer the question it’s sufficient to use fakenet and launch the sample.

It’s possible to see that the sample makes a GET request to the host http://www.practicalmalwareanalysis.com, at the path ODA6NmU6NmY6NmU6Njk6NjMtc3R1ZGVudAaa (which looks random at first), for the resource a.png. The user agent is also quite strange, but it may simply be the one used by urlmon.dll.

If the malware is launched again, the path that looked like a random string at first turns out not to be random, because the same string is used in every request. It’s therefore reasonable to assume that it is some form of beaconing, but this hypothesis must be validated.

It’s possible to use IDA to validate the hypothesis by analyzing the code and understanding how the beaconing process is implemented.

2.4. How is the beacon crafted?

Opening the sample in IDA brings us to the main function, located at 0x00401285. Thanks to IDA it’s easy to see that, together with other local variables, a pointer to a HwProfileInfo is allocated. After a memory allocation, the function GetCurrentHwProfileA is called. From the Microsoft documentation:

[GetCurrentHwProfileA] Retrieves information about the current hardware profile for the local computer.

This function returns a non-zero value if it succeeds and takes only one argument, a pointer to a HW_PROFILE_INFO structure, which the function fills with some information about the system. From the Microsoft documentation:

[HW_PROFILE_INFO] Contains information about a hardware profile. The GetCurrentHwProfile function uses this structure to retrieve the current hardware profile for the local computer.

In fact it contains the reported docking state of the computer (if it is a laptop), the globally unique identifier of the current hardware profile as a GUID string, and the display name for the current hardware profile.

The sample uses the second member of HW_PROFILE_INFO, the GUID, but not all of it: it takes twelve characters from the GUID and arranges them using the format string %c%c:%c%c:%c%c:%c%c:%c%c:%c%c. So the malware does not use the MAC address to uniquely identify the infected machine, but a substring of the GUID arranged like a MAC address.
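The arrangement step can be sketched in a few lines. This is only an illustration of the format string, not the sample's code: the helper name and the example GUID are hypothetical, and since the text does not say which twelve characters are taken, the choice of the last hex group below is an assumption.

```python
def mac_like_id(twelve_chars: str) -> str:
    """Mimic sprintf with "%c%c:%c%c:%c%c:%c%c:%c%c:%c%c" on twelve characters."""
    if len(twelve_chars) != 12:
        raise ValueError("exactly twelve characters are required")
    pairs = [twelve_chars[i:i + 2] for i in range(0, 12, 2)]
    return ":".join(pairs)


# Hypothetical GUID in the "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}" string form.
example_guid = "{12340001-4980-1920-6788-806e6f6e6963}"

# Assumption: the twelve characters are the last hex group (indexes 25 to 36).
print(mac_like_id(example_guid[25:37]))  # 80:6e:6f:6e:69:63
```

The result looks like a MAC address but carries no information about any network adapter, which is exactly the point made above.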

This gives another important clue that we need to verify. The string used in the GET request is ODA 6N mU 6N mY 6N mU 6N jk 6N jMtc3R1ZGVudAaa: note the repeating pattern, which may be related to %c%c : %c%c : %c%c : %c%c : %c%c : %c%c. The same observation provides a further clue: the last part of the string contains some information that is not present in the MAC-address-like string.

In fact, a few lines later, there is a call to GetUserNameA, which:

Retrieves the name of the user associated with the current thread

The username is stored in ebp+Buffer and, if the call succeeds, the sample puts the two strings, the GUID-derived identifier and the logon username, together with a sprintf that uses the format string %s-%s. This new string, which I’ll refer to as clear_text_fingerprint, is passed together with a buffer to sub_4010BB.

2.5. Can you spot any known string encoding method?

sub_4010BB takes clear_text_fingerprint and a buffer, then uses strlen to compute the length of clear_text_fingerprint and starts a loop that iterates over that length. Inside this first loop there is another loop, responsible for performing some operation every three characters by calling the function sub_401000.

sub_401000 accesses the string that starts at byte_4050C0, at the index stored in ecx. The string accessed is the alphabet ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/. The function also has some conditional statements that emit a default character when a condition is met; the character used is a.

Two approaches are possible now: try to understand exactly how the code works, or try to validate a simple hypothesis first. The purpose is to know how the malware acts on the infected machine and how the beacon message is crafted, not to know the exact implementation. A simple hypothesis that is easy to verify follows:

The sample builds a string of the form %c%c:%c%c:%c%c:%c%c:%c%c:%c%c-%s, so we know what we are looking for,
A strange string is used to perform the request: ODA6NmU6NmY6NmU6Njk6NjMtc3R1ZGVudAaa, and
The Base64 alphabet is used to encode the characters in sub_401000, so
maybe the sample uses Base64 to encode the string.

To verify this hypothesis I am going to use CyberChef.

There are two strange characters at the end of the decoded string, which otherwise matches our hypothesis perfectly. The mystery is easy to solve: remember that Base64 uses the equal sign as a padding character, but no equal signs are found in this string. In sub_401000 the code used a as a default character when some conditions were met, so maybe the sample uses a as its padding character. To validate this hypothesis, just substitute the last two a characters with equal signs: the string then decodes cleanly to 80:6e:6f:6e:69:63-student.
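The CyberChef check can also be reproduced with Python's standard base64 module. This is a behavioural sketch under the hypothesis stated above (standard Base64 with a instead of = as padding), not a reconstruction of sub_4010BB; the function names are hypothetical. Because a is also a regular symbol of the alphabet, the decoder cannot tell padding from data on its own, so the number of padding characters is passed in explicitly.

```python
import base64

CUSTOM_PAD = "a"  # default character observed in sub_401000
OBSERVED = "ODA6NmU6NmY6NmU6Njk6NjMtc3R1ZGVudAaa"  # path seen in the capture


def encode_beacon_id(clear_text: str) -> str:
    """Standard Base64, with every '=' padding character replaced by 'a'."""
    encoded = base64.b64encode(clear_text.encode("ascii")).decode("ascii")
    stripped = encoded.rstrip("=")
    return stripped + CUSTOM_PAD * (len(encoded) - len(stripped))


def decode_beacon_id(encoded: str, pad_len: int) -> str:
    """Swap the last pad_len 'a' characters back to '=' and decode."""
    if pad_len:
        if not encoded.endswith(CUSTOM_PAD * pad_len):
            raise ValueError("expected trailing custom padding")
        encoded = encoded[:-pad_len] + "=" * pad_len
    return base64.b64decode(encoded).decode("ascii")


# Decoding without fixing the padding leaves two stray bytes at the end.
naive = base64.b64decode(OBSERVED)
print(naive)  # b'80:6e:6f:6e:69:63-student\x06\x9a'

# Treating the last two 'a' characters as padding gives a clean fingerprint.
print(decode_beacon_id(OBSERVED, pad_len=2))  # 80:6e:6f:6e:69:63-student

# Round trip: encoding the fingerprint reproduces the observed path.
print(encode_beacon_id("80:6e:6f:6e:69:63-student") == OBSERVED)  # True
```

The round trip is what makes the hypothesis convincing: the same clear text, encoded with the modified padding, yields exactly the path seen on the network.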

2.6. What does the malware expect to receive in response to its beacon?

Proceeding in the main function, after the beacon encoding, there is a call to another function, sub_4011A3. This function takes the encoded string as its argument. In its first lines, the last character of the encoded string is extracted and pushed onto the stack together with the encoded string itself; the format string www.practicalmalwareanalysis.com/%s/%c.png is pushed as well, and a call to sprintf follows. Now we know with certainty how the beacon string is finalized: the encoded string becomes the path and its last character names the .png resource, which explains the a.png seen in the capture.

Next, a call to URLDownloadToCacheFile is performed. From the Microsoft documentation:
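The final sprintf step is simple enough to mirror directly. The sketch below uses the format string exactly as it appears in the strings output; the function name is hypothetical, and Python's % operator stands in for sprintf.

```python
BEACON_FORMAT = "www.practicalmalwareanalysis.com/%s/%c.png"


def build_beacon_url(encoded_id: str) -> str:
    """Fill %s with the encoded string and %c with its last character."""
    return BEACON_FORMAT % (encoded_id, encoded_id[-1])


print(build_beacon_url("ODA6NmU6NmY6NmU6Njk6NjMtc3R1ZGVudAaa"))
# www.practicalmalwareanalysis.com/ODA6NmU6NmY6NmU6Njk6NjMtc3R1ZGVudAaa/a.png
```

Because the last character comes from the encoded string, the requested file is a.png whenever the encoding ends with the a padding character.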

Downloads data to the Internet cache and returns the file name of the cache location for retrieving the bits.

If the function call succeeds, the sample calls CreateProcessA, passing the cached file name returned by URLDownloadToCacheFile as the lpApplicationName argument. lpApplicationName represents:

The name of the module to be executed. This module can be a Windows-based application. It can be some other type of module (for example, MS-DOS or OS/2) if the appropriate subsystem is available on the local computer.

So the sample is expecting an executable file.
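When inspecting what a server (or a simulated one) returns for the beacon, it can help to check whether the response body even looks like a Windows executable. The following is a minimal analyst-side sketch, not part of the sample: it only checks the MZ magic and the PE signature located through the offset stored at 0x3C. The function name and the synthetic test buffer are assumptions.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if data starts with an MZ header that points at 'PE\\0\\0'."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Synthetic, minimal header used only to exercise the check.
fake_header = bytearray(0x80)
fake_header[:2] = b"MZ"
struct.pack_into("<I", fake_header, 0x3C, 0x40)
fake_header[0x40:0x44] = b"PE\x00\x00"

print(looks_like_pe(bytes(fake_header)))        # True
print(looks_like_pe(b"\x89PNG\r\n\x1a\n" * 16))  # False
```

A real .png served at that path would fail this check, which is consistent with the resource name being only a disguise for an executable payload.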

2.7. How does it use the received data?

The sample tries to execute the received data. We can infer that the sample, after infecting a host, downloads and executes another sample.