Building the Poke encoder in Metasploit

So as mentioned in a previous post, I built a custom encoder that hides my payload as a string of words that static analysis by antivirus solutions cannot make sense of.

Unfortunately for me, building the shell code was slightly annoying. The steps were:

Use msfvenom with the payload of choice and output it with -f num
Delete the whitespace and the newline characters from the payload
Load the payload into the encoder in whatever language I wrote it in (right now: Python, Golang, CSharp and Nim).
Run the encoder and copy the output
Format the output to fit the shellcode program syntax
Compile the shellcode program.

This process was not very efficient and was prone to errors. Any modification in any of the steps would require me to start over. Regenerate, restrip, reencode, recompile, retest.

So what if I could build my encoder into Metasploit directly? I decided today was a great day to learn how.

Investigation

So where do we even start?

Well, I chose to start by going to the Metasploit GitHub page and looking for strings similar to any of the -f options used when generating a payload. We can narrow the search down by using uncommon strings.

We use the command msfvenom -l format to list the formats. I chose to use the transform string to search for my starting point.

Using this string, we search the Metasploit repo for any files that mention it.

The first result is pretty promising. We check out the file.

Development Environment

At this point, I felt it was a good idea to set up a development environment.

Metasploit has a good resource page for doing this.

The only real command we needed to run before cloning the repo was:

sudo apt update && sudo apt install -y git autoconf build-essential libpcap-dev libpq-dev zlib1g-dev libsqlite3-dev

After that, we can simply git clone the repo, and get started:

git clone https://github.com/rapid7/metasploit-framework.git

We cd into the new repo and run ./msfconsole to ensure that it runs normally.

Once msfconsole starts normally, we can begin developing our encoder.

Important to note

It is important to note that I use the term “encoder” improperly. I am actually building a transform format. A real encoder adds extra instructions to the resulting byte code so that it can decode itself from its encoded form. I am only using Metasploit to generate the encoded form, and using the shellcode program in other languages to decode it.

Also, after every change we will need to relaunch metasploit to apply the changes.

With the environment set up, I chose to use VSCode with its SSH extension to make further edits.

Starting Point

So from before, we can start with the file metasploit-framework/lib/msf/base/simple/buffer.rb.

After some testing, I found out that there are two functions that we need to edit for our custom encoder/transform to be an allowable option in Metasploit.

self.comment and self.transform. When a payload is generated within msfconsole, both of these functions are called.

If a format name is requested and a matching entry is missing from either function, the line raise BufferFormatError, "Unsupported buffer format: #{fmt}", caller runs.

Adding our own entry, poke, to both functions with a simple hello world string will allow us to hit that branch of the case statement.

Relaunching and calling the format poke returns the expected result.
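The shape of that change can be sketched with a small stand-in. This is not the real buffer.rb: the DemoBuffer module, its two formats, and the error's superclass are assumptions made up for illustration. Only the case dispatch and the BufferFormatError fallback mirror what is described above.

```ruby
# Simplified stand-in for the format dispatch (hypothetical module,
# not Metasploit's actual code).
module DemoBuffer
  class BufferFormatError < StandardError; end

  # Return the payload rendered in the requested format.
  def self.transform(buf, fmt = 'raw')
    case fmt
    when 'raw'
      buf
    when 'poke'
      'hello world' # placeholder until the real encoder is wired in
    else
      raise BufferFormatError, "Unsupported buffer format: #{fmt}", caller
    end
  end

  # Return the comment text for the requested format.
  def self.comment(buf, fmt = 'raw')
    case fmt
    when 'raw', 'poke'
      buf # this sketch adds no comment decoration
    else
      raise BufferFormatError, "Unsupported buffer format: #{fmt}", caller
    end
  end
end

puts DemoBuffer.transform("\x90\x90", 'poke')
```

If poke were left out of either case statement, the corresponding call would fall through to the else branch and raise.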

Rebuilding the encoder……again……

Alright, so we have built the encoder in several languages so far…

Full disclosure, I’ve never done work in Ruby before. So this was a struggle. It was likely also inefficient. But because we are not committing this to the official repo, we are okay with the inefficiencies.

I split the encoder into two functions.

self.pokeform - This function will be called from self.transform with the string of bytes passed to it. It ensures we have /usr/share/dict/american-english installed (our wordlist) and splits the bytes into an array.
self.WordToByte - Each byte that is passed to this function is given a word representation using the algorithm that we built before. It returns a word.

self.WordToByte(bytenumber)

Looking back, I know the name of the function is actually reversed. Should be byte to word….. Nice.

This function starts by opening the file /usr/share/dict/american-english and splitting (by newline) the file into an array of words.

Then we start an infinite loop that will only end once we get a word that fits the desired parameters.

Steps

Initialize a resultNumber variable to 0.
Get a random word and uppercase it.
If the word has a non-alphanumeric character, restart the loop at step 1.
Split the word into an array of characters.
For each character, if the ASCII value of the character:
1. Is less than or equal to 77, add the ASCII value to resultNumber and then AND it with 255 to make sure it does not exceed 255 (the maximum value of a byte).
2. Is more than 77, subtract the ASCII value from resultNumber and then AND it with 255 to ensure the value does not go below 0.
If resultNumber is equal to bytenumber, return the word and therefore end the loop. Else, repeat at step 1.
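To make the arithmetic concrete, here is a minimal sketch of just the scoring rule, worked on a few words. The helper names word_value and decode_words are made up for this example; decode_words shows how a decoder could turn a comma-separated run of words back into bytes with the same rule.

```ruby
# Score a word with the add/subtract rule described in the steps above.
def word_value(word)
  word.upcase.each_char.reduce(0) do |total, ch|
    code = ch.ord
    total = code <= 77 ? total + code : total - code
    total & 255 # keep the running total inside one byte (0..255)
  end
end

# Decode a comma-separated run of words back into byte values.
def decode_words(encoded)
  encoded.split(',').map { |word| word_value(word) }
end

p word_value('CAB')           # => 198  (67 + 65 + 66)
p word_value('NO')            # => 99   ((0 - 78) & 255 = 178, then 178 - 79)
p decode_words('CAB,NO,CAT')  # => [198, 99, 48]
```

Because negative totals wrap around through the AND with 255, many different words land on the same byte, which is what lets the encoder pick a different word each run.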

Source

  def self.WordToByte(bytenumber)
    # Read the wordlist and split it on whitespace into an array of words.
    wordlist_data = File.read("/usr/share/dict/american-english").split
    loop do
      resultNumber = 0
      testword = wordlist_data.sample.upcase # Pick a random word and uppercase it.
      # If the word has any non-alphanumeric chars, go to the next one.
      next if testword.index(/[^[:alnum:]]/) != nil
      testword.each_char do |l|
        if l.ord <= 77
          resultNumber += l.ord
        else
          resultNumber -= l.ord
        end
        resultNumber &= 255 # Keep the running total within one byte.
      end
      # If the word's result is equal to the desired byte, return the word.
      return testword if resultNumber == bytenumber.to_i
    end
  end

Comments

I should note that this is inefficient because we are reopening and rereading the file of words for every single byte, rather than reading it once. I will fix that eventually.
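One possible way to avoid the repeated reads, sketched under the assumption that the wordlist does not change while Metasploit is running: cache the parsed array the first time it is requested. The PokeWordlist module and its words method are hypothetical names, not part of the code in this post.

```ruby
# Hypothetical helper that reads each wordlist file only once per process.
module PokeWordlist
  DEFAULT_PATH = '/usr/share/dict/american-english'

  # Return the cached array of words for the path, reading the file on first use.
  def self.words(path = DEFAULT_PATH)
    @cache ||= {}
    @cache[path] ||= File.read(path).split
  end
end

if File.exist?(PokeWordlist::DEFAULT_PATH)
  first  = PokeWordlist.words
  second = PokeWordlist.words
  puts first.equal?(second) # same Array object, so the file was only read once
end
```

The word-picking loop could then call PokeWordlist.words instead of File.read, leaving the rest of the logic untouched.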

self.pokeform(buf)

The main function of our project. It takes a buffer that is supplied by metasploit with our shellcode.

Steps

Check whether our dictionary file is present. If not, error out with a message saying so.
We use the existing Rex::Text.to_num() function to turn the buffer from its current format into a string that I can work with, stored in the variable data. We also strip the whitespace so we can just use the commas to split it into an array later.
We initialize a new empty buffer buf for our final payload.
We split the data string into an array using .split(",") and put it into the bytearray variable.
For each element in bytearray, use the WordToByte function to turn our byte into a word, and push it to the buf array.
After all the bytes have been converted to words, I return the buf variable as a string using the .join method.

Source

  def self.pokeform(buf) # Transform bytecode to poke encoder
    # File.exists? was removed in Ruby 3.2, so use File.exist?.
    unless File.exist?('/usr/share/dict/american-english')
      # Raise an error if the dict file is not found.
      raise BufferFormatError, "/usr/share/dict/american-english not found", caller
    end

    # Get the original payload, in num format, and strip whitespace chars.
    data = Rex::Text.to_num(buf).gsub(/\s+/, "")
    buf = []
    bytearray = data.split(",") # Split the bytes by comma.
    bytearray.each do |x|
      buf << WordToByte(x.hex) # Find a word representation of the byte.
    end
    buf.join(",")
  end
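The strip, split and hex steps can be tried on their own without Metasploit. The sample string below is an assumption about what num-formatted text looks like (comma-separated 0x.. values with line breaks between rows); it is not captured output.

```ruby
# Sample text shaped like num-format output (assumed layout for illustration).
num_text = "0xfc, 0x48, 0x83,\r\n0xe4, 0xf0\r\n"

compact = num_text.gsub(/\s+/, '')       # "0xfc,0x48,0x83,0xe4,0xf0"
bytes   = compact.split(',').map(&:hex)  # String#hex accepts the 0x prefix

p bytes # => [252, 72, 131, 228, 240]
```

Each of those integers is what gets handed to WordToByte in the function above.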

Testing our encoder

We can save this file, run metasploit, and test our encoder.

./msfconsole
use windows/x64/exec
set cmd calc.exe
generate -f poke

Haha! It works! It is a tad slow on generation, but that is expected due to the inefficiencies that I introduced. And as mentioned in the previous blog post, if we run it again, it creates a completely new set of words that decodes to the same result.

Future improvements

So there are a couple of ways we can make this better. We will need to change the code so the wordlist file is opened and read once instead of once per byte.

We could also just prebuild the word per byte list….. Let’s not talk about that and just let me enjoy this.
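For the record, a prebuilt list could look something like the sketch below: score every usable word once and group the words by the byte they produce. The names word_value and build_word_table are hypothetical, and the tiny inline wordlist stands in for the system dictionary.

```ruby
# Score a word with the same add/subtract rule used by the encoder.
def word_value(word)
  word.each_char.reduce(0) do |total, ch|
    (ch.ord <= 77 ? total + ch.ord : total - ch.ord) & 255
  end
end

# Group usable words by the byte they encode: { 198 => ["CAB", ...], ... }.
def build_word_table(words)
  words.map(&:upcase)
       .reject { |w| w.empty? || w.match?(/[^[:alnum:]]/) }
       .group_by { |w| word_value(w) }
end

# Tiny inline list for demonstration only.
table = build_word_table(%w[cab no cat it's])
p table[198]        # => ["CAB"]
p table[48].sample  # => "CAT"
```

With a table like this, encoding a byte becomes a hash lookup plus a random pick instead of a retry loop. A byte with no matching word would come back as nil, so a real version would need to handle that case.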

We can also XOR each byte before it is matched to a word. On the OSEP, I used this as a final technique to hide from AV/EDR.
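As a rough illustration of that step, assuming a single-byte key (the key value and the xor_bytes name are arbitrary choices for this example):

```ruby
# XOR every byte with a single-byte key.
def xor_bytes(bytes, key)
  bytes.map { |b| b ^ key }
end

original = [0xfc, 0x48, 0x83]
masked   = xor_bytes(original, 0x5a)

p masked                   # => [166, 18, 217]
p xor_bytes(masked, 0x5a)  # => [252, 72, 131], the same key undoes it
```

The masked values are what would be matched to words, and the decoder would apply the same XOR after turning the words back into bytes.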

Finally, we could also dynamically assign the threshold (currently 77) that the program uses to decide whether an ASCII value is added to or subtracted from the result value.
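A minimal sketch of that idea, assuming the threshold is simply passed in as a parameter (word_value is a hypothetical helper name, and the default keeps the current value of 77):

```ruby
# Same scoring rule, but the add/subtract cutoff is a parameter instead of
# a fixed 77. An encoder and its decoder would have to agree on this value.
def word_value(word, threshold = 77)
  word.upcase.each_char.reduce(0) do |total, ch|
    (ch.ord <= threshold ? total + ch.ord : total - ch.ord) & 255
  end
end

p word_value('CAT')      # => 48  (T is above 77, so it is subtracted)
p word_value('CAT', 90)  # => 216 (every letter is at or below 90, so all are added)
```

The same word now maps to a different byte depending on the threshold, so the word list alone no longer reveals the payload.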

Final Notes

This was a great exercise to learn how Ruby works and how to add functionality to Metasploit. I don’t plan on opening a pull request against Metasploit because it’s rather a silly project, but it will stay alive in my GitHub somewhere.