oSpy visualizations

•March 4, 2007 • 1 Comment

Some of you might have seen oSpy’s visualization stuff. Basically it has a few visualizers that build visualizations on top of the parsed protocol data and show you the result in a neat timeline view, with each context (usually a network connection) in a column of its own, side by side. For example:

oSpy’s visualization feature

One thing that has started to really annoy me is the fact that oSpy doesn’t run on multiple platforms. Typically, as a Linux/OS X developer working on an interoperable replacement for some Windows-only software, you need to either have a Windows installation in dual-boot or in a virtual machine, or have someone capture the trace for you. Capturing is usually 5-10 minutes’ work, and then you have what you need. The really annoying part is that you’ll usually spend hours studying the trace, and having to use a Windows installation just for that feels really unnecessary. So, what I envisioned was that if I could export the raw data that these visualizations are built from to XML, one could write a neat little script that would parse this and output HTML, SVG, or whatever you want. I added this feature in a hurry, and Ali took on the challenge of writing a converter that would output an HTML visualization right away. Shortly afterwards he had something working, which you can see here. Check out Ali’s blog post for more info about this. Really great piece of work there Ali, you rock! :-)

oSpy crypto madness

•March 4, 2007 • 1 Comment

So anyway, to start with the oldest event first, it all started after a good friend of mine, Youness Alaoui, came with a bunch of suggestions that he’d very much like to see in oSpy. His suggestions were all very good, some of them were things which I had already thought of whilst others were completely new and quite interesting. The suggestion that we ended up discussing was one I had thought of in the past, but which I was very unsure of how to solve in a clean way. This issue was around the very neat interception that’s being done with the crypto APIs, meaning that you get to see encrypted network traffic, like HTTPS, in the clear. The downside to this was that not only did you see the unencrypted traffic entering one of the crypto functions, and the encrypted data coming out, but you were also bothered with the following send call(s) transmitting the encrypted data. (And of course the other way around for receiving data.) Part of this was also the initial handshake, which is usually not very useful. So, there were basically two solutions seen:

1) Post-process the trace and coalesce these calls in a heuristical way.

2) Intercept the algorithms one level up, from within the respective API where you’ve got access to both the socket and the data where it gets encrypted/after it gets decrypted and sent/received, and log it there. Avoid logging recv/send/encrypt/decrypt for these cases by checking where these are called from.

I decided to go for the second approach, as that seemed like the best solution all things considered — less prone to error once it works. So the plan went like this: I’d start with wininet.dll’s API, as this seemed to be an API used in a lot of applications, disassemble its code with IDA, use oSpy to trace an app using this API, and then use oSpy’s IDA integration in order to conveniently examine the code where recv/send/EncryptMessage/DecryptMessage were called from. The next step was then to use oSpyAgent’s internal code signature matching API to have it scan through memory to find the spots where I thought it made the most sense to tap in (inside the C++ classes’ code behind the exported C API), and once found write some JMP-instructions, some magic trampoline code (I just love __declspec(naked) functions :-) ), do the logging based on the knowledge of the stack frame, perform the instructions overwritten by the JMP, and JMP back to the next instruction after. Piece of cake, right? Yep, so it would seem. My cunning plan seemed to work, but only for a while. I did signature matching to find the return addresses of the API calls I didn’t want to do any logging for any longer, and simply constructed a blacklist that the logging code in the recv/send/EncryptMessage/DecryptMessage hooks would check the return-address against, and just avoid logging for these known addresses.
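To make the signature matching step a bit more concrete, here is a minimal sketch of scanning a block of memory for a byte pattern with wildcards. This is not oSpyAgent's actual matching API; the names `SigByte` and `find_pattern` are made up for illustration, and the buffer is just a vector standing in for a module's code section.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One element of a code signature: a concrete byte, or a wildcard when
// `any` is true (useful for call targets and other relocated operands).
// Hypothetical type, not taken from oSpy.
struct SigByte {
    std::uint8_t value;
    bool any;
};

// Returns the offset of the first match of `sig` inside `mem`, or -1 when
// the signature does not occur.
std::ptrdiff_t find_pattern(const std::vector<std::uint8_t>& mem,
                            const std::vector<SigByte>& sig)
{
    if (sig.empty() || sig.size() > mem.size())
        return -1;

    for (std::size_t base = 0; base + sig.size() <= mem.size(); ++base) {
        bool match = true;
        for (std::size_t i = 0; i < sig.size(); ++i) {
            if (!sig[i].any && mem[base + i] != sig[i].value) {
                match = false;
                break;
            }
        }
        if (match)
            return static_cast<std::ptrdiff_t>(base);
    }
    return -1;
}
```

If the signature starts at a near CALL (opcode E8 followed by a 32-bit displacement), the return address to blacklist would be the match offset plus 5, since that instruction is 5 bytes long.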

But then along came a badger, and it hit me really hard. It turned out that an internal C++ class called ICSocket was calling recv() whenever there was data to be read, and when there was secure communications involved, the object would actually be an ICSecureSocket instance, which is a class that inherits ICSocket, and this would in turn handle the data just received off the network after ICSocket had received it. A natural way to design it, but really annoying for me. If I blacklisted the recv() call made from inside ICSocket’s “oh-there’s-data-to-be-read”-method I’d effectively lose any data received for unencrypted protocols (i.e. the API functions not using an ICSecureSocket socket). Hmm.. this was no good, I needed a way to tell if the object in question was an ICSecureSocket or not. So what was the easiest way to solve that problem? Well, ICSocket has virtual functions, meaning it has a vtable, meaning that if I could get hold of the this-pointer I could just look at what the pointer at offset 0 points to. If it points to ICSecureSocket’s vtable then it was a done deal, if not then, well, just log as usual. So, the way I solved this was to simply scan for the signature of the ICSecureSocket constructor in memory, and once found read out the address of the vtable from where it sets it on the instance being constructed. Next, just write a thin C++ class called CHookContext that keeps a dictionary of return-address to function-pointer mappings. Each hook would then just call its context’s ShouldLog() method with the return-address, and this method would then do a lookup in its dictionary. If no entry was found, just return true, everything’s good. If the function-pointer is set to NULL, return false right away, meaning you’re certain you don’t want to log for this return-address. If it’s set, call the function and decide whether to log based on the boolean it returns. Quite simple, but there was a catch. This callback would need access to the CPU registers at the point where the API function was getting called, obviously because it needed to access the this-pointer, which would either be in a register or a local variable. So this led me to having to write another kind of hooking glue that I based off HOOK_GLUE_INTERRUPTIBLE. The result was simple and easy to use. Basically I used the PUSHAD instruction to save the contents of all general-purpose registers onto the top of the stack, and just defined a C struct that would map exactly to how these were placed on the stack. Then I would just have the C callbacks declared as having instances of this struct passed by value, and voila, problem solved. The code would then look like this:

First off, instantiate the glue:
HOOK_GLUE_EXTENDED(recv, (4 * 4))

Next, declare the two callbacks:

static int __cdecl
recv_called(BOOL carry_on,
            CpuContext ctx_before,
            void *bt_addr,
            void *ret_addr,
            SOCKET s,
            char *buf,
            int len,
            int flags)
{
  ...
}

static int __stdcall
recv_done(int retval,
          CpuContext ctx_after,
          CpuContext ctx_before,
          void *bt_addr,
          void *ret_addr,
          SOCKET s,
          char *buf,
          int len,
          int flags)
{
  ...
  if (g_recvHookContext.ShouldLog(ret_addr, &ctx_before))
    {
      ...
    }
  ...
}

And finally, in the wininet hooking code:

static void *g_ICSecureSocketVTable = NULL;

static bool
ICSocketReceiveContinue_ShouldLog(CpuContext *context, va_list args)
{
  ICSocket_base *self = *((ICSocket_base **) (context->ebp - 4));
  return (self->vtable != g_ICSecureSocketVTable);
}

void
hook_wininet()
{
  ...
  if (find_signature(&signatures[SIGNATURE_ICSOCKET_RECEIVE_CONTINUE_RECV],
                     &retAddr, &error))
    {
      g_recvHookContext.RegisterReturnAddress(retAddr, ICSocketReceiveContinue_ShouldLog);
    }
  ...
}

(Please ignore the mixture of coding styles by the way. I really feel bad about this. oSpy started out with its logging agent written in C, as what it originally did was really straight-forward, but eventually as the complexity grew way beyond that there was a switch to C++, and you know the rest. This will be cleaned up eventually, but for now please just bear with me.)
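For those who want to see the lookup logic spelled out, here is a rough sketch of the CHookContext idea together with a register struct laid out the way PUSHAD leaves things on the stack. It is not the actual oSpy source: the class name `HookContextSketch`, the simplified callback type (the real callback above also takes a va_list) and the member names are assumptions made for illustration.

```cpp
#include <cstdint>
#include <map>

// Register snapshot as PUSHAD leaves it on the stack, lowest address first.
// PUSHAD pushes EAX, ECX, EDX, EBX, the original ESP, EBP, ESI and EDI in
// that order, so EDI ends up at the top of the stack.
struct CpuContext {
    std::uint32_t edi, esi, ebp, esp, ebx, edx, ecx, eax;
};

// Simplified, hypothetical callback type.
typedef bool (*ShouldLogFunc)(const CpuContext* ctx);

class HookContextSketch {
public:
    // A null callback means "never log calls returning to this address".
    void RegisterReturnAddress(const void* retAddr, ShouldLogFunc func = nullptr)
    {
        m_entries[retAddr] = func;
    }

    bool ShouldLog(const void* retAddr, const CpuContext* ctx) const
    {
        auto it = m_entries.find(retAddr);
        if (it == m_entries.end())
            return true;             // unknown caller: log as usual
        if (it->second == nullptr)
            return false;            // blacklisted unconditionally
        return it->second(ctx);      // let the callback decide
    }

private:
    std::map<const void*, ShouldLogFunc> m_entries;
};
```

A map keyed on the return address keeps each hook down to a single lookup, and the three outcomes (no entry, null entry, callback) correspond to the three cases described above.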

As you can see it’s quite simple. But my oh my, I did not expect this seemingly tiny little feature request to be that much work. :-)

Long time no pop

•March 4, 2007 • Leave a Comment

Thought it was about time I pop the recent events off my stuff-I-should-blog-about stack. A lot of stuff has happened since the last time, and I’ve been meaning to blog two or three times since then but every single time I’ve had free time on hand and felt like blogging there’s been something coming out of nowhere distracting me. That’s scatterbrainedness for you. :-) So, without further ado I’ll just catch up by discussing each issue in a separate blog post following this one, instead of doing one really huge post.

Crack on toast

•February 10, 2007 • 1 Comment

Just been through the basic morning bootstrap code and waiting for my coffee machine to heat up, so I thought I’d just throw out a few words while I’m just sitting here being more or less useless (“My name is Ole André and I’m an addict.”). Got a new oSpy release (1.9.4) out on Wednesday evening, and this release has a couple of new features that I think are worth mentioning. I’ve added a visualization miniview, inspired by IDA’s graph mode navigation window, which you can see a screencast of here. Another useful feature is that you can now select columns by clicking on the headers and right-click and choose “Delete selected” to get rid of them. This can be very useful if you’re dealing with scenarios where some streams are completely uninteresting, like an IM client pulling advertisements over HTTP, where you obviously don’t want those to clutter your view when you’re looking at other HTTP transactions.

Late Thursday evening I pieced together another feature that will make it into the next oSpy release. Basically I’ve been very annoyed over not being able to select and copy text from the nodes, and of course also the fact that lines get cut off if they were too long. I’ve also been longing for a way to scroll within the nodes instead of having nodes that are like 1000 pixels high in cases where the body contains a lot of data. Rich text formatting is another thing that I’ve been feeling like should be there, as there’s already been code in the project for quite some time to do XML syntax-highlighting, so I added this as well. I only had about 2 hours to hack on this so I didn’t get it finished enough to do a release, but you can see a screenshot of what it looked like here. I’m not entirely happy with the syntax highlighting colors, and there are still some bugs to be ironed out. I had intended to finish it off after work yesterday, but I was too tired so I ended up heading straight to bed after watching a few episodes of Stargate.

Anyway, my coffee machine has heated up now so I guess it’s time to start hacking. :)

Stay tuned.

What a weekend

•February 5, 2007 • Leave a Comment

Had the most counter-productive weekend ever, or so I thought. Stayed late at work on Friday, came home and spent most of the night just relaxing and listening to Dream Theater. Slept until 19:30 and had to hurry out of bed to get to the closest grocery store before closing time. Ate pizza and watched Alien 3, a really disappointing movie (I knew that, but hey, about time I catch up with the so-called classics). Thought about hacking a little on pymsn, but felt completely useless, just like the night before. Went to bed at around 04:00, slept until 11:00, spent one hour to get out of the bed. Fired up my beloved La Pavoni Professional and made myself a huge cup of cappuccino and enjoyed it while hacking a little on pymsn.

Chatted with Ali (asabil), agreed to split up the work so that he would write the SOAP building blocks after I had pieced together some protocol transcripts that he could work from (since I’ve always got XP running in vmware, standing by to make traces with oSpy). The plan was that he would do the SOAP building blocks one by one while I’d do the actual logic with each, in order to parallelise our efforts as much as possible.

As oSpy does an ok job on parsing HTTP and even provides pretty-printing with syntax-highlighting of bodies containing XML, this wasn’t any problem. However, due to how much data is captured with a typical trace of Windows Live Messenger, with all the interception that is being done, scrolling through thousands and thousands of rows to find the ones I’m looking for isn’t exactly fun. Filtering out most of the stuff, like the WinCrypt API interception and output from WLM’s internal debugging infrastructure, wasn’t too good an option either because then I could miss out on some details surrounding the stuff I was looking for. Like for example some interesting debug message being output just before that could shed some light on what it was doing (and/or point me to the interesting bits of code for static analysis). I could’ve used the search function, but it just didn’t feel right in this case. Then I thought, “a feature to jump to the next packet row, and another one to jump to the next row being part of a transaction (i.e. HTTP), now that would be useful, and it’s not that much work to do it”. So I went ahead and implemented it, and after that going through the traces became a real breeze. I made the transcripts that Ali needed and put them online.
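The two jump features boil down to a forward search from the current row. Here is a small sketch of that idea; oSpy's own UI code is not shown in this post, so `TraceRow`, `RowKind` and the function names are invented for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row model for illustration only.
enum class RowKind { ApiCall, DebugMessage, Packet };

struct TraceRow {
    RowKind kind;
    int transaction_id;  // -1 when the row is not part of a parsed transaction
};

// Index of the first row after `current` that satisfies `pred`, or -1.
// Pass current = -1 to start searching from the top.
template <typename Pred>
std::ptrdiff_t find_next(const std::vector<TraceRow>& rows,
                         std::ptrdiff_t current, Pred pred)
{
    for (std::size_t i = static_cast<std::size_t>(current + 1); i < rows.size(); ++i) {
        if (pred(rows[i]))
            return static_cast<std::ptrdiff_t>(i);
    }
    return -1;
}

std::ptrdiff_t next_packet_row(const std::vector<TraceRow>& rows, std::ptrdiff_t current)
{
    return find_next(rows, current,
                     [](const TraceRow& r) { return r.kind == RowKind::Packet; });
}

std::ptrdiff_t next_transaction_row(const std::vector<TraceRow>& rows, std::ptrdiff_t current)
{
    return find_next(rows, current,
                     [](const TraceRow& r) { return r.transaction_id >= 0; });
}
```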

Now, what does making such a transcript mean in this case? Well, the idea was to take the HTTP headers of each interesting request, paste them into a text file, inspect the body and have oSpy pretty-print it, copy-paste that into the text file after the headers. Add a separator, i.e. along the lines of “\n—\n”, and do the same for the response. There were two different requests being made, and each was actually two request/response pairs in the first trace, as there’s a redirect when the client logs in for the first time (on a freshly created user account). This means that you have to repeat this boring operation 8 times for the first trace, and 4 times for the subsequent ones. I did it, but didn’t like it, as oSpy’s mission is to eliminate the boring tasks associated with reverse-engineering protocols.
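The manual procedure above is mechanical enough to sketch in a few lines. This is only an illustration of the transcript layout, not code from oSpy: the `HttpMessage` and `HttpExchange` structs, the `make_transcript` function and the exact separator string are assumptions, and the bodies are expected to be pretty-printed already.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical containers for one captured request/response pair.
struct HttpMessage {
    std::string headers;      // raw header block
    std::string pretty_body;  // pretty-printed body, empty if there is none
};

struct HttpExchange {
    HttpMessage request;
    HttpMessage response;
};

// Writes headers, then the body, for every request and response, with a
// separator line between consecutive messages.
std::string make_transcript(const std::vector<HttpExchange>& exchanges,
                            const std::string& separator = "\n---\n")
{
    std::string out;
    bool first = true;
    for (const HttpExchange& ex : exchanges) {
        for (const HttpMessage* msg : {&ex.request, &ex.response}) {
            if (!first)
                out += separator;
            first = false;
            out += msg->headers;
            if (!msg->pretty_body.empty()) {
                out += "\n\n";
                out += msg->pretty_body;
            }
        }
    }
    return out;
}
```

With four request/response pairs in the first trace this does the eight copy-paste rounds in one call.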

Anyway, I handed off the transcripts to Ali, and mentioned briefly “now it would be really cool to have oSpy generate such transcripts automatically”. Then I thought, “hmm, this isn’t that hard to do, I’ve already got the visualization stuff in oSpy that visualizes TCP and MSNP2P transactions.. it’s easy enough to write an HTTP visualizer and have it pretty-print bodies containing XML automatically, using the same code that’s already in there”. I dug in, and a few hours later the feature was implemented and working like a champ. What surprised me was how little time was actually necessary to implement the feature itself. I didn’t time it, but if I were to guess I’d say 10-15 minutes. The most time-consuming part was actually adding an optional headline to the “VisualTransaction” widgets that the conversation view organizes in a timeline view, and to refactor some code here and there.

The next thing I thought about was, “ok, now a simple visualizer for the MSN switchboard protocol would be useful”. I hacked one together, and apart from a few protocol parsing fixes that I did to the existing code, the hack itself took only about 5-10 minutes to do. The unified diff of the first version can be seen here. A screenshot of this in action here.

Not very useful alone, as it’s basically the same as viewing the TCP session in the dump view, except that the dump view doesn’t give you pretty-printed XML for the commands with XML in the payload, or fancy timestamps.

Anyway, as you might have guessed by now, there was a cunning plan behind all of this. What good is a timeline view with lots of fancy drawing if you can only visualize one protocol at a time?

So to make a long story short, I implemented this and added a separate dialog to make it possible to customize how you want the visualization. A couple of screenshots here and here. And, better, you can see this in action through the screencast here.

Needless to say I’ve released oSpy 1.9.3 with all of these features at the usual place. :-)

Hello world!

•January 19, 2007 • 1 Comment

.data
hello_world_msg db "Hello world!", 0dh, 0ah, "$"

.code
mov ah, 9
mov dx, offset hello_world_msg
int 21h

More on that story later.

Anyway, welcome to my personal blog. It was about time I got one, well, at least a blog that isn’t entirely about reverse-engineering related stuff, like my previous blog. This one definitely won’t be a high volume blog either, but hopefully that’ll improve with time.

Ah well, it’s getting late, I’m feeling completely useless and I should’ve been sleeping hours ago.