oSpy crypto madness

So anyway, to start with the oldest event first: it all started when a good friend of mine, Youness Alaoui, came with a bunch of suggestions for things he’d very much like to see in oSpy. They were all very good; some were things I had already thought of, whilst others were completely new and quite interesting. The one we ended up discussing was one I had thought of in the past, but had been very unsure how to solve in a clean way.

The issue concerns the very neat interception that’s done on the crypto APIs, which lets you see encrypted network traffic, like HTTPS, in the clear. The downside was that you didn’t just see the unencrypted data entering one of the crypto functions and the encrypted data coming out; you were also bothered with the subsequent send call(s) transmitting that encrypted data. (And of course the same in reverse for receiving data.) The initial handshake got logged as well, and that is usually not very useful. So, there were basically two solutions:

1) Post-process the trace and coalesce these calls heuristically.

2) Intercept one level up, inside the respective API, where you’ve got access to both the socket and the data right before it gets encrypted and sent, or right after it has been received and decrypted, and log it there. Avoid logging recv/send/encrypt/decrypt for these cases by checking where they are called from.

I decided to go for the second approach, as that seemed like the best solution all things considered: less prone to error once it works. So the plan went like this: I’d start with wininet.dll’s API, as this seemed to be an API used in a lot of applications, disassemble its code with IDA, use oSpy to trace an app using this API, and then use oSpy’s IDA integration to conveniently examine the code where recv/send/EncryptMessage/DecryptMessage were called from.

The next step was to use oSpyAgent’s internal code signature matching API to have it scan through memory for the spots where I thought it made the most sense to tap in (inside the C++ classes’ code behind the exported C API). Once found, I’d write some JMP instructions and some magic trampoline code (I just love __declspec(naked) functions :-)), do the logging based on knowledge of the stack frame, perform the instructions overwritten by the JMP, and JMP back to the instruction right after them. Piece of cake, right? Yep, so it would seem.

My cunning plan seemed to work, but only for a while. I did signature matching to find the return addresses of the API calls I no longer wanted any logging for, and simply constructed a blacklist that the logging code in the recv/send/EncryptMessage/DecryptMessage hooks would check the return address against, skipping the logging for these known addresses.
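To make the signature matching step a bit more concrete, here is a minimal sketch of a byte-pattern scan with wildcards. It is not oSpyAgent's actual signature API: `SigByte`, `find_signature_offset` and `demo_find_call_site` are hypothetical names made up for illustration, and the scan runs over a plain byte vector rather than the mapped image of wininet.dll.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One signature byte: either an exact value or a wildcard that matches
// anything (handy for absolute addresses that differ between DLL builds).
struct SigByte {
  std::uint8_t value;
  bool wildcard;
};

// Hypothetical helper: returns the offset of the first match of `sig`
// inside `mem`, or -1 if there is no match.
long find_signature_offset(const std::vector<std::uint8_t> &mem,
                           const std::vector<SigByte> &sig)
{
  if (sig.empty() || sig.size() > mem.size())
    return -1;

  for (std::size_t base = 0; base + sig.size() <= mem.size(); ++base) {
    bool matched = true;
    for (std::size_t i = 0; i < sig.size(); ++i) {
      if (!sig[i].wildcard && mem[base + i] != sig[i].value) {
        matched = false;
        break;
      }
    }
    if (matched)
      return static_cast<long>(base);
  }
  return -1;
}

// Usage example on made-up bytes: look for a call through an import slot
// (FF 15 followed by a 4-byte address, wildcarded here) that is followed
// by `test eax, eax` (85 C0), and return the offset of the return address.
long demo_find_call_site()
{
  const std::vector<std::uint8_t> mem = {
    0x90, 0x90, 0xFF, 0x15, 0x78, 0x56, 0x34, 0x12, 0x85, 0xC0, 0xC3
  };
  const std::vector<SigByte> sig = {
    {0xFF, false}, {0x15, false},
    {0, true}, {0, true}, {0, true}, {0, true},
    {0x85, false}, {0xC0, false}
  };

  long call_offset = find_signature_offset(mem, sig);
  assert(call_offset >= 0);
  return call_offset + 6;  // the CALL is 6 bytes long
}
```

The offset computed at the end is the kind of value that would go on the blacklist, since the return address a hook sees is the address of the instruction right after the CALL.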

But then along came a badger, and it hit me really hard. It turned out that an internal C++ class called ICSocket was calling recv() whenever there was data to be read. When secure communications were involved, the object would actually be an ICSecureSocket instance, which is a class that inherits from ICSocket, and this would in turn handle the data after ICSocket had received it off the network. A natural way to design it, but really annoying for me. If I blacklisted the recv() call made from inside ICSocket’s “oh-there’s-data-to-be-read” method I’d effectively lose any data received for unencrypted protocols (i.e. the API functions not using an ICSecureSocket). Hmm.. this was no good, I needed a way to tell whether the object in question was an ICSecureSocket or not.

So what was the easiest way to solve that problem? Well, ICSocket has virtual functions, meaning it has a vtable, meaning that if I could get hold of the this-pointer I could just look at what the pointer at offset 0 points to. If it points to ICSecureSocket’s vtable then it’s a done deal; if not then, well, just log as usual. So, the way I solved this was to simply scan for the signature of the ICSecureSocket constructor in memory, and once found read out the address of the vtable from the instruction where the constructor sets it on the instance being constructed.

Next, just write a thin C++ class called CHookContext that keeps a dictionary of return-address to function-pointer mappings. Each hook would then just call its context’s ShouldLog() method with the return address, and this method would do a lookup in its dictionary. If no entry is found, just return true, everything’s good. If the function pointer is set to NULL, return false right away, meaning you’re certain you don’t want to log for this return address. If it’s set, call the function and decide whether to log based on the boolean it returns.

Quite simple, but there was a catch. This callback would need access to the CPU registers at the point where the API function was getting called, because it needed to get at the this-pointer, which would be either in a register or in a local variable. This led me to write another kind of hooking glue that I based off HOOK_GLUE_INTERRUPTIBLE. The result was simple and easy to use. Basically I used the PUSHAD instruction to save the contents of all general-purpose registers onto the top of the stack, and defined a C struct that maps exactly to how these are laid out on the stack. Then I would just have the C callbacks declared as having instances of this struct passed by value, and voila, problem solved. The code would then look like this:

First off, instantiate the glue:
HOOK_GLUE_EXTENDED(recv, (4 * 4))

Next, declare the two callbacks:

static int __cdecl
recv_called(BOOL carry_on,
            CpuContext ctx_before,
            void *bt_addr,
            void *ret_addr,
            SOCKET s,
            char *buf,
            int len,
            int flags)
{
  ...
}

static int __stdcall
recv_done(int retval,
          CpuContext ctx_after,
          CpuContext ctx_before,
          void *bt_addr,
          void *ret_addr,
          SOCKET s,
          char *buf,
          int len,
          int flags)
{
  ...
  if (g_recvHookContext.ShouldLog(ret_addr, &ctx_before))
    {
      ...
    }
  ...
}

And finally, in the wininet hooking code:

static void *g_ICSecureSocketVTable = NULL;

static bool
ICSocketReceiveContinue_ShouldLog(CpuContext *context, va_list args)
{
  ICSocket_base *self = *((ICSocket_base **) (context->ebp - 4));
  return (self->vtable != g_ICSecureSocketVTable);
}

void
hook_wininet()
{
  ...
  if (find_signature(&signatures[SIGNATURE_ICSOCKET_RECEIVE_CONTINUE_RECV],
                     &retAddr, &error))
    {
      g_recvHookContext.RegisterReturnAddress(retAddr, ICSocketReceiveContinue_ShouldLog);
    }
  ...
}
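
The listing uses CpuContext without showing it, so here is a sketch of what a struct mirroring PUSHAD has to look like. The push order is defined by the instruction itself; the field names (other than ebp, which the callback above reads) and the `simulate_pushad` helper are my own assumptions for illustration, and the helper only mimics the pushes on a fake stack so the layout can be checked on any machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// PUSHAD pushes EAX, ECX, EDX, EBX, the original ESP, EBP, ESI and EDI, in
// that order. The stack grows downwards, so EDI ends up at the lowest
// address and the struct lists the registers in reverse push order.
struct CpuContext {
  std::uint32_t edi;
  std::uint32_t esi;
  std::uint32_t ebp;
  std::uint32_t esp;  // value ESP had before PUSHAD executed
  std::uint32_t ebx;
  std::uint32_t edx;
  std::uint32_t ecx;
  std::uint32_t eax;
};

static_assert(sizeof(CpuContext) == 8 * sizeof(std::uint32_t),
              "the struct must not contain padding");
static_assert(offsetof(CpuContext, edi) == 0, "EDI is pushed last");
static_assert(offsetof(CpuContext, eax) == 28, "EAX is pushed first");

// Hypothetical helper: replays the eight pushes on a fake stack and then
// reads the result back through the struct, the way the glue's callback
// would see it.
CpuContext simulate_pushad(std::uint32_t eax, std::uint32_t ecx,
                           std::uint32_t edx, std::uint32_t ebx,
                           std::uint32_t esp, std::uint32_t ebp,
                           std::uint32_t esi, std::uint32_t edi)
{
  std::uint32_t stack[8];
  std::size_t top = 8;  // one past the current top; a push decrements it
  const std::uint32_t push_order[8] = {eax, ecx, edx, ebx, esp, ebp, esi, edi};

  for (std::uint32_t value : push_order)
    stack[--top] = value;
  assert(top == 0);

  CpuContext ctx;
  std::memcpy(&ctx, &stack[top], sizeof ctx);
  return ctx;
}
```

With a layout like this, a by-value CpuContext parameter lines up with the registers saved on the stack, which is what lets a callback read context->ebp and walk into the caller's frame.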

(Please ignore the mixture of coding styles by the way. I really feel bad about this. oSpy started out with its logging agent written in C, as what it originally did was really straight-forward, but eventually as the complexity grew way beyond that there was a switch to C++, and you know the rest. This will be cleaned up eventually, but for now please just bear with me.)
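
For the curious, the three-way decision made by ShouldLog() can be sketched in a few lines. This is not the real CHookContext: `HookContextSketch`, `FakeSocket`, `SkipSecureSockets` and the two vtable tags are hypothetical stand-ins, the callback signature is simplified (no va_list), and the CpuContext here just carries the this-pointer directly instead of a register snapshot that the callback would have to dig it out of.

```cpp
#include <cassert>
#include <unordered_map>

// Simplified stand-in: the real snapshot holds registers, and the callback
// finds the this-pointer via the stack frame.
struct CpuContext { const void *self; };

// Stand-in for an object whose first pointer-sized field is its vtable.
struct FakeSocket { const void *vtable; };

// The addresses of these two tags play the role of the two vtables.
static char g_plainVTableTag;
static char g_secureVTableTag;

typedef bool (*ShouldLogFunc)(const CpuContext *context);

class HookContextSketch {
public:
  // A null func means "never log calls returning to this address".
  void RegisterReturnAddress(const void *address, ShouldLogFunc func)
  {
    m_callers[address] = func;
  }

  bool ShouldLog(const void *returnAddress, const CpuContext *context) const
  {
    auto it = m_callers.find(returnAddress);
    if (it == m_callers.end())
      return true;                // unknown caller: log as usual
    if (it->second == nullptr)
      return false;               // blacklisted unconditionally
    return it->second(context);   // let the callback decide
  }

private:
  std::unordered_map<const void *, ShouldLogFunc> m_callers;
};

// Example callback: suppress logging only when the object at hand carries
// the "secure" vtable.
bool SkipSecureSockets(const CpuContext *context)
{
  assert(context != nullptr && context->self != nullptr);
  const FakeSocket *self = static_cast<const FakeSocket *>(context->self);
  return self->vtable != &g_secureVTableTag;
}
```

Registering a call site with SkipSecureSockets keeps plain sockets in the log and drops secure ones, registering it with a null pointer drops everything from that site, and any call site that was never registered is logged as before.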

As you can see it’s quite simple. But my oh my, I did not expect this seemingly tiny little feature request to be that much work. 🙂

~ by oleandre on March 4, 2007.

One Response to “oSpy crypto madness”

  1. This was among the most important features to me! Thank you very much for implementing it 😉
