RTAudio (x-msrta) interop

When WLM got this new feature called voice calls, I remember being quite excited. Which protocols would they be using this time, which codecs, and so on? I was quite surprised to learn that they were using SIP, RTP and ICE this time around, instead of all-proprietary protocols for both signaling and transport like most attempts at similar features in the past. There was one disappointment though: if you’re not running Windows, and thus not able to use WLM, you won’t get the same experience as those using the official client. The reason is simple: they get to use an adaptive wideband codec that works well over lossy networks, and you don’t. If you want to be a first-class citizen in the Messenger world, you need to be running Windows.

As much fun as it would be to reverse-engineer this codec, I sadly don’t have enough spare time for that these days; I’m no longer a student with virtually infinite amounts of spare time like I was back in the libmimic days. I still have this passion for reverse-engineering though, and I find it great as a recreational activity in the life part of the work/life balance. So I sat down for a couple of hours one late evening with a fresh cup of Yerba Mate tea, poked around with IDA, and worked out the internal API of this codec inside the appropriate binary. I also learned that all the supported audio codecs share the same interfaces. Of course there aren’t any exported functions exposing them, but that’s just aesthetics anyway. Just load the DLL, scan through its memory for the signatures of what you need, declare some function pointers, wrap any structures you might need and write some assembly glue, and you’re all set. Use the wineloader approach used by various media players, and you can even run the code on a non-Windows OS, given that it’s x86.
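To make the "scan through its memory for the signatures" step a bit more concrete, here is a minimal sketch of a masked byte-pattern search in C. It is not the code from the plugin; the helper name find_signature and the mask convention (0 means "wildcard byte") are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the plugin): return a pointer to the first
 * place in [base, base + size) where `pattern` matches, or NULL if there is
 * no match. A mask byte of 0 marks a wildcard position, which is handy for
 * bytes that differ between builds, such as absolute addresses embedded in
 * instructions. */
static const uint8_t *
find_signature (const uint8_t * base, size_t size,
    const uint8_t * pattern, const uint8_t * mask, size_t pattern_len)
{
  size_t i, j;

  assert (pattern_len > 0);

  if (size < pattern_len)
    return NULL;

  for (i = 0; i <= size - pattern_len; i++) {
    for (j = 0; j < pattern_len; j++) {
      if (mask[j] != 0 && base[i + j] != pattern[j])
        break;
    }

    /* The inner loop ran to completion, so every non-wildcard byte matched */
    if (j == pattern_len)
      return base + i;
  }

  return NULL;
}
```

Once a match is found, the address can be cast to a function pointer type that mirrors the calling convention and arguments worked out in the disassembler.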

So what I did was put together a GStreamer plugin that would wrap all the encoders and decoders dynamically, just like gst-ffmpeg does with the codecs provided by FFmpeg. I first got this plugin working on Windows, and then ported it to Linux by using the wineloader code of gst-pitfdll as a starting point.

Plain encoding and decoding isn’t enough though; you also need error concealment to work well with lossy networks. Reverse-engineering this interface was more work, as I had to figure out some internals of their RTP stack in order to know more about the data structures and what the different fields meant. I also had to inherit from an internal C++ base class, and this was quite a fun experience. It’s all very simple though: it’s just a matter of figuring out the size of the class, inheriting from it the same way as you would with GObject/C, and writing some wrapper functions that emulate thiscall on top of stdcall, meaning that you put the this pointer into ecx and pass the rest of the arguments on the stack.
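The "inherit from it the same way as you would with GObject/C" idea can be sketched in portable C as below. Everything in it is a hypothetical stand-in: the names BaseHealer and MyHealer, the two vtable slots and the 28 bytes of opaque state are invented for illustration and say nothing about the real layout inside the DLL. The sketch also uses ordinary C calls for the vtable entries, whereas the real methods use thiscall.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct _BaseHealer BaseHealer;

/* Hypothetical vtable: one function pointer per virtual method, in order */
typedef struct {
  int (*get_id) (BaseHealer * self);
  int (*process) (BaseHealer * self, int n_samples);
} BaseHealerVTable;

struct _BaseHealer {
  const BaseHealerVTable *vtable;   /* first word, like a C++ object's vptr */
  unsigned char opaque[28];         /* assumed size of the base class state */
};

/* "Subclass": the parent comes first, so MyHealer * is also a BaseHealer * */
typedef struct {
  BaseHealer parent;
  int frames_seen;
} MyHealer;

static int
my_healer_get_id (BaseHealer * self)
{
  (void) self;
  return 42;
}

static int
my_healer_process (BaseHealer * self, int n_samples)
{
  MyHealer *healer = (MyHealer *) self;

  healer->frames_seen++;
  return n_samples;
}

static const BaseHealerVTable my_healer_vtable = {
  my_healer_get_id,
  my_healer_process
};

static void
my_healer_init (MyHealer * healer)
{
  assert (healer != NULL);
  memset (healer, 0, sizeof (MyHealer));
  healer->parent.vtable = &my_healer_vtable;
}

/* Stand-in for library code that only knows about the base class */
static int
library_feed (BaseHealer * base, int n_samples)
{
  return base->vtable->process (base, n_samples);
}
```

The important parts are that the vtable pointer is the first word of the object and that the subclass embeds the parent as its first member, so the library can treat a pointer to the subclass as a pointer to its own base class.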

I implemented the thiscall wrappers with a header file containing, for example:

typedef struct _MSEncoder MSEncoder;

HRESULT WINAPI ms_encoder_set_bitrate (MSEncoder * encoder, guint bitrate);

Here WINAPI is defined to the compiler-specific attribute that selects the stdcall calling convention. Now comes the interesting part, the C wrapper, which for MSVC looks like:

__declspec(naked) HRESULT WINAPI
ms_encoder_set_bitrate (MSEncoder * encoder, guint bitrate)
{
  INVOKE_VFUNC (16);
}

The naked attribute tells the compiler not to generate any prolog and epilog, meaning that you’re responsible for setting up the stack frame if you want one. In my case I don’t want one, as I simply want to put the first argument in ecx and shift the return address one slot up the stack, overwriting the slot that held the this pointer.

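So the macro INVOKE_VFUNC, which takes the byte offset of the method’s slot in the vtable (16 is the fifth entry with 4-byte pointers), is simply: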

#define INVOKE_VFUNC(func_offset) \
  __asm { \
    /* fill in ecx with 'this' pointer */ \
    __asm mov ecx, [esp + 4] \
    \
    /* put return address where 'this' pointer was */ \
    __asm pop edx \
    __asm mov [esp], edx \
    \
    /* call the function in the vtable */ \
    __asm mov edx, [ecx + 0] \
    __asm mov edx, [edx + func_offset] \
    __asm jmp edx \
  }
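For readers who prefer to see the stack shuffle spelled out, the following is a small simulation in plain C of the first three instructions of the macro. It does not execute any assembly; FakeCpu and simulate_thunk_prologue are hypothetical names, and the stack is modelled as an array of 4-byte slots where a higher index means a higher address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_STACK_SLOTS 8

/* Toy model of the registers and stack the thunk touches (illustration only) */
typedef struct {
  uint32_t ecx;
  uint32_t edx;
  size_t esp;                         /* index of the top-of-stack slot */
  uint32_t stack[FAKE_STACK_SLOTS];   /* one entry per 4-byte stack slot */
} FakeCpu;

static void
simulate_thunk_prologue (FakeCpu * cpu)
{
  assert (cpu->esp + 1 < FAKE_STACK_SLOTS);

  /* mov ecx, [esp + 4]: the first stdcall argument is the 'this' pointer */
  cpu->ecx = cpu->stack[cpu->esp + 1];

  /* pop edx: take the return address off the stack */
  cpu->edx = cpu->stack[cpu->esp];
  cpu->esp += 1;

  /* mov [esp], edx: store the return address over the old 'this' slot */
  cpu->stack[cpu->esp] = cpu->edx;
}
```

Starting from a stack of [return address, this, bitrate], the result is ecx = this and a stack of [return address, bitrate], which is the layout a thiscall method expects on entry. The method then removes its single stack argument on return, so the stack pointer ends up where the stdcall caller expects it.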

As gcc doesn’t support naked functions on x86, I wrote some fairly self-explanatory assembler code for the GNU Assembler. Naked functions are actually quite beautiful, and I really hope that gcc will support them on x86 one day.

So, if you want to play with the code you can browse it here (the plugin is in the “ext/mscodecs” subdirectory):

http://bazaar.launchpad.net/~oleavr/oabuild/gst-plugins-farsight/files

Or better:

bzr branch lp:~oleavr/oabuild/gst-plugins-farsight

To build just the plugin and play with it without installing it, simply do:

./configure

cd ext/mscodecs/

make

If you have a Windows installation with Messenger handy, copy RTMPLTFM.dll from the installation directory into /usr/local/lib/win32. However, if you don’t, make sure you accept their EULA and follow the instructions in the helper/README file.

Now to play with it:

export GST_PLUGIN_PATH=./.libs

gst-inspect-0.10 mscodecs

sender:

gst-launch-0.10 audiotestsrc samplesperbuffer=320 is-live=true ! msenc_rta16 bitrate=29000 ! rtprtaudiopay pt=114 ! udpsink

receiver without error concealment:

gst-launch-0.10 udpsrc ! application/x-rtp, payload=114, clock-rate=16000 ! rtprtaudiodepay ! msdec_rta16 ! alsasink

receiver with error concealment (experimental, very alpha):

gst-launch-0.10 udpsrc ! application/x-rtp, payload=114, clock-rate=16000 ! msrtahealer ! alsasink

(Add lossrate=15.0 to the encoder on the sender launch line to enable FEC in case you wanna test it on a lossy network or with “identity drop-probability=x” before the udpsink. Any value greater than 10.0 makes the encoder enable maximum FEC.)

It should now be really easy to set up an RTAudio voice call with a Messenger client using Farsight 2 — any takers? 🙂

~ by oleandre on May 31, 2008.

16 Responses to “RTAudio (x-msrta) interop”

  1. Thank you soo much.

  2. Hi, I am trying to use your code to decode X-MSRTA 16k. As rtmpltfm.dll hasn’t exported any functions to expose init, deinit and decode functions, can you please tell how can I get address of these functions from IDA.

  3. Hi, I am trying to use your code to decode X-MSRTA 16k. As rtmpltfm.dll hasn’t exported any functions to expose init, deinit and decode functions, can you please tell how can I get address of these functions from IDA.

  4. Hi,
    I am trying to test your plugin but after I compile and try to register it with gstreamer by “gst-inspect-0.10 mscodecs”, I get the following error

    Called unk_RegisterTraceGuidsW
    Called unk_RegisterTraceGuidsW
    Called unk_GetVersionExW
    Called unk_GetModuleHandleW
    Called unk_RegisterTraceGuidsW
    Called unk_IsDebuggerPresent
    Called unk__crt_debugger_hook
    Called unk_UnhandledExceptionFilter
    Called unk__crt_debugger_hook
    Called unk_TerminateProcess

    ERROR: Caught a segmentation fault while loading plugin file:
    ./.libs/libgstmscodecs.so

    Please either:
    - remove it and restart.
    - run with --gst-disable-segtrap and debug.
    Error initializing: Error re-scanning registry , child terminated by signal

    • Hi,

      It looks like you’re using a different version of the DLL. I haven’t tested the plugin since this blog post was written, so I’m pretty sure that the DLL shipped with later versions of Messenger has broken the internal API/ABI. I would suggest tracking down an old version from around the time of this blog post, and use the DLL from that version.

    • Looks like it was because of a different version of the dll. Using an older version (probably the one you reversed from) solved the problem

  5. Hi,
    is it possible to play audio from a captured pcap file already filtered for “payload=114”? I can’t figure it out how to solve “subclass did not specify output size”.

    This is what I get:

    gst-launch-0.10 -v filesrc location=rtp.dump ! rtprtaudiodepay ! msdec_rta16 ! alsasink
    Setting pipeline to PAUSED …
    Pipeline is PREROLLING …
    ERROR: from element /pipeline0/msdec_rta160: subclass did not specify output size
    Additional debug info:
    gstbasetransform.c(1511): gst_base_transform_handle_buffer (): /pipeline0/msdec_rta160:
    subclass did not specify output size
    ERROR: pipeline doesn’t want to preroll.
    Setting pipeline to NULL …
    FREEING pipeline …
    Called unk_??3@YAXPAX@Z

    Gstreamer did play with mad plugin, but only noise with this output:

    /pipeline0/mad0.src: caps = audio/x-raw-int, endianness=(int)1234, signed=(boolean)true, width=(int)32, depth=(int)32, rate=(int)8000, channels=(int)2
    /pipeline0/alsasink0.sink: caps = audio/x-raw-int, endianness=(int)1234, signed=(boolean)true, width=(int)32, depth=(int)32, rate=(int)8000, channels=(int)2

    • Yep, you can play straight from pcap. Put “pcapparse” after “filesrc” (provided you have a recent enough GStreamer release installed). Do “gst-inspect pcapparse” to learn about properties you may set to filter certain packets based on src/dstport etc.

    • hello,do you solve the problem of playing audio from a captured pcap file ? can you tell me how to do it,or can you tell me your email in order that I can consult you about the problem,thank you!!!

  6. any player is there for playing MSN RTP X-msrta codec packets? We r inerested to buy it.

  7. Hi, I am trying to use your code to decode X-MSRTA 16k. I can’t use the command “./configure” to compile the code .I get the follow error:
    configure: error: cannot find install-sh or install.sh
    I don’t know how to do,can you tell me how to compile the code including which code or file be needed. Thank you!!

  8. Hi, I am trying to use your code to decode X-MSRTA 16k.When I USE THE COMMAND”gst-inspect-0.10 mscodecs”.I get the follow ERROR:No such element or plugin ‘mscodecs’.But I already use the command “make&&make install”.What should I do?Thank you!

  9. Hi,
    is it possible to play audio from a captured pcap file already filtered for “payload=114”? I use “gst-launch -v filesrc location=1.pcap ! pcapparse dst-port=1863 ! rtprtaudiodepay ! msdec_rta16 ! alsasink”
    I get the error as follow:
    Setting pipeline to PAUSED …
    Pipeline is PREROLLING …
    ERROR: from element /GstPipeline:pipeline0/GstFileSrc:filesrc0: Internal data flow error.
    Additional debug info:
    gstbasesrc.c(2582): gst_base_src_loop (): /GstPipeline:pipeline0/GstFileSrc:filesrc0:
    streaming task paused, reason not-negotiated (-4)
    ERROR: pipeline doesn’t want to preroll.
    Setting pipeline to NULL …
    Freeing pipeline …
    Caught SIGSEGV accessing address 0xb78cc330
    Killed
    How can I do ?

  10. hi,I want to play audio from a captured pcap
    gst-launch-0.10 -v filesrc location=1.pcap ! pcapparse src-ip=202.115.36.124 dst-ip=202.115.36.233 src-port=23692 dst-port=12118 ! application/x-rtp, payload=114, clock-rate=16000 ! rtprtaudiodepay ! msdec_rta16 ! alsasink
    but I get the follow error:

    Setting pipeline to PAUSED …
    Pipeline is PREROLLING …
    /GstPipeline:pipeline0/GstCapsFilter:capsfilter0.GstPad:src: caps = application/x-rtp, payload=(int)114, clock-rate=(int)16000, media=(string)audio, encoding-name=(string)x-msrta
    /GstPipeline:pipeline0/GstRtpRTAudioDepay:rtprtaudiodepay0.GstPad:src: caps = audio/x-msrta, rate=(int)16000, channels=(int)1
    /GstPipeline:pipeline0/GstRtpRTAudioDepay:rtprtaudiodepay0.GstPad:sink: caps = application/x-rtp, payload=(int)114, clock-rate=(int)16000, media=(string)audio, encoding-name=(string)x-msrta
    WARNING: from element /GstPipeline:pipeline0/GstRtpRTAudioDepay:rtprtaudiodepay0: Could not decode stream.
    Additional debug info:
    gstbasertpdepayload.c(391): gst_base_rtp_depayload_chain (): /GstPipeline:pipeline0/GstRtpRTAudioDepay:rtprtaudiodepay0:
    Received invalid RTP payload, dropping
    WARNING: from element /GstPipeline:pipeline0/GstRtpRTAudioDepay:rtprtaudiodepay0: Could not decode stream.
    Additional debug info:
    gstbasertpdepayload.c(391): gst_base_rtp_depayload_chain (): /GstPipeline:pipeline0/GstRtpRTAudioDepay:rtprtaudiodepay0:
    Received invalid RTP payload, dropping
    /GstPipeline:pipeline0/msdec_rta16:msdec_rta160.GstPad:src: caps = audio/x-raw-int, width=(int)16, depth=(int)16, endianness=(int)1234, signed=(boolean)true, rate=(int)16000, channels=(int)1
    /GstPipeline:pipeline0/msdec_rta16:msdec_rta160.GstPad:sink: caps = audio/x-msrta, rate=(int)16000, channels=(int)1
    ERROR: from element /GstPipeline:pipeline0/msdec_rta16:msdec_rta160: ms_decoder_decode returned 0xc004e010
    Additional debug info:
    gstmsdec.c(225): gst_msdec_transform (): /GstPipeline:pipeline0/msdec_rta16:msdec_rta160
    ERROR: pipeline doesn’t want to preroll.
    Setting pipeline to NULL …
    /GstPipeline:pipeline0/msdec_rta16:msdec_rta160.GstPad:src: caps = NULL
    /GstPipeline:pipeline0/msdec_rta16:msdec_rta160.GstPad:sink: caps = NULL
    /GstPipeline:pipeline0/GstRtpRTAudioDepay:rtprtaudiodepay0.GstPad:src: caps = NULL
    /GstPipeline:pipeline0/GstRtpRTAudioDepay:rtprtaudiodepay0.GstPad:sink: caps = NULL
    /GstPipeline:pipeline0/GstCapsFilter:capsfilter0.GstPad:src: caps = NULL
    Freeing pipeline …
    Called unk_??3@YAXPAX@Z

    what should I do? Thank you.

  11. i have capture pcap file of video call,it is using x-rtvc1 codec(RTVideo),can u able to decode it and play it? If any tool then please inform us we r interested to buy it.
