This week took place the Innovation in Software Engineering Latin America Symposium at IBM Guadalajara. I participated with an Asterisk session giving an overview of the Asterisk capabilities with AEL, AGI and AMI. Also I mentioned how it was possible to run Asterisk on an IBM System i Linux LPAR effectively converting the System i in a powerful IP PBX. I plan to give the same talk, but with some adjustments at ENLi the next month.
Archive for September, 2007
Asterisk talk at IBM
Saturday, September 29th, 2007Overriding the virtual table in a C++ object
Friday, September 14th, 2007Before starting, I just want to mention that I recently joined http://www.planetalinux.org/ , is a nice group of Linux bloggers, if you read this blog and want to read some interesting posts ( mostly in spanish, but some guys post in english ) go there!
Having said that, let’s finish this post.
Yesterday I was discussing with a friend of mine about how polymorphism is implemented in C++, and that is, using a virtual table ( remember the “virtual” keyword in method definitions? ). A virtual table is, rawly speaking, just like an array of function pointers. Each created object with virtual methods needs a virtual table. So, where does the virtual table is stored?, I really don’t know, but I do know where I can find the address of the virtual table associated to an object ( at least in g++ 4.1.1 ), the first sizeof(void*) bytes of an object are used to store a pointer to the virtual table. With this knowledge, one could think that is possible to override the virtual table pointer of the object and call arbitrary functions, and yes, we can. Let’s see some fun code.
#include <iostream> using namespace std; class Parent { public: virtual void VirtFunc1() { cout << "Parent::VirtFunc1" << endl; } virtual void VirtFunc2() { cout << "Parent::VirtFunc2" << endl; } }; class Child : public Parent { public: void VirtFunc1() { cout << "Child::VirtFunc1" << endl; } void VirtFunc2() { cout << "Child::VirtFunc2" << endl; } }; typedef void (*virtual_function)(); struct FakeVirtualTable { virtual_function virtual_one; virtual_function virtual_two; }; void fake_virtual_one() { cout << "Faked virtual call 1" << endl; } void fake_virtual_two() { cout << "Faked virtual call 2" << endl; } int main() { /* declare a Child class and a base pointer to it. */ Child child_class_obj; Parent* parent_class_ptr = &child_class_obj; /* create our fake virtual table with pointers to our fake methods */ FakeVirtualTable custom_table; custom_table.virtual_one = fake_virtual_one; custom_table.virtual_two = fake_virtual_two; /* take the address of our stack virtual table and override the real object pointer to the virtual table */ FakeVirtualTable* table_ptr = &custom_table; memcpy(parent_class_ptr, &table_ptr, sizeof(void*)); /* call the methods ( but we're really calling the faked functions ) */ parent_class_ptr->VirtFunc1(); parent_class_ptr->VirtFunc2(); return 0; }
So, try to run that code and, of course, the expected result is having fake_virtual_one() and fake_virtual_two() functions called. No magic there, we just replace the first sizeof(void*) bytes of the object with our own table pointer. There is not a use I can think of right now, but it is funny ….
Overflow in AppConference
Sunday, September 2nd, 2007Some weeks ago I got an offer for a free VoIPSurfer license. Why? just because of a patch I did one year ago for an issue I had, the patch helped the guy who seems to own voipsurfer, so, it is kind of open source Karma. The offering made me remember the interesting issue I had and I thought today it is a good day to write about it. Some background information can be found in this thread where I discussed with some Asterisk developers the issue: http://lists.digium.com/pipermail/asterisk-dev/2006-November/024616.html
Asterisk, as a software PBX, supports conferencing. However, default implementation in 1.2.x versions did not support native audio mixing. Asterisk default conferencing application is “app_meetme” and it works only if you have zaptel hardware available or if zaptel driver zt_dummy.ko is installed, otherwise it cannot work because meetme() use zt_conf functionality. Because of this I started looking for alternatives that did not require zaptel. I found app_conference which at that time was part of the IaxClient project, now it seems to be an independent project.
Anyway, I started doing some testing and kaboom!, Asterisk crashed when one of our IAX2 servers joined the conference. Running valgrind lead me to find code like this:
void mix_slinear_frames( char *dst, const char *src, int samples ) { if ( dst == NULL ) return ; if ( src == NULL ) return ; int i, val ; for ( i = 0 ; i < samples ; ++i ) { val = ( (short*)dst )[i] + ( (short*)src )[i] ; if ( val > 0x7fff ) { ( (short*)dst )[i] = 0x7fff - 1 ; continue ; } else if ( val < -0x7fff ) { ( (short*)dst )[i] = -0x7fff + 1 ; continue ; } else { ( (short*)dst )[i] = val ; continue ; } } return ; }
This is a typical function where we’re in buffer overflow danger. It receives 2 buffers as arguments and just 1 len. So better for the caller to be sure that both src and dst have enough data to read/write from/to. I quickly found a call like this:
// allocate a mix buffer which fill large enough memory to // hold a frame, and reset it’s memory so we don’t get noise char* cp_listenerBuffer = malloc( AST_CONF_BUFFER_SIZE ) ; memset( cp_listenerBuffer, 0×0, AST_CONF_BUFFER_SIZE ) ; // point past the friendly offset right to the data cp_listenerData = cp_listenerBuffer + AST_FRIENDLY_OFFSET ; // reset the spoken list pointer cf_spoken = frames_in ; // really mix the audio for ( ; cf_spoken != NULL ; cf_spoken = cf_spoken->next ) { // // if the members are equal, and they // are not null, do not mix them. // if ( ( cf_send->member == cf_spoken->member ) && ( cf_send->member != NULL ) ) { // don’t mix this frame } else if ( cf_spoken->fr == NULL ) { ast_log( LOG_WARNING, “unable to mix conf_frame with null ast_framen†) ; } else { // mix the new frame in with the existing buffer mix_slinear_frames( cp_listenerData, (char*)( cf_spoken->fr->data ), cf_spoken->fr->samples); } }
So let’s look closely to the call. The first argument passed is cp_listenerData, a pointer to a brand new voice frame where all the mixing of the conference audio sources will be stored. As we can see in the code, the buffer is of a fixed length AST_CONF_BUFFER_SIZE, but since the pointer is advanced AST_FRIENDLY_OFFSET, then the effective length is AST_CONF_BUFFER_SIZE – AST_FRIENDLY_OFFSET. Let’s see some interesting defines:
// 160 samples 16-bit signed linear
#define AST_CONF_BLOCK_SAMPLES 160
// 2 bytes per sample ( i.e. 16-bit )
#define AST_CONF_BYTES_PER_SAMPLE 2
// 320 bytes for each 160 sample frame of 16-bit audio
#define AST_CONF_FRAME_DATA_SIZE 320
// 1000 ms-per-second / 20 ms-per-frame = 50 frames-per-second
#define AST_CONF_FRAMES_PER_SECOND ( 1000 / AST_CONF_FRAME_INTERVAL )
// account for friendly offset when allocating buffer for frame
#define AST_CONF_BUFFER_SIZE ( AST_CONF_FRAME_DATA_SIZE + AST_FRIENDLY_OFFSET )
So AST_CONF_BUFFER_SIZE is 320 + AST_FRIENDLY_OFFSET, since we don’t really care about the asterisk friendly offset since *real* data must start after this offset, we can say our buffer is 320 bytes long. Where this number comes from? Well, some assumptions are made to calculate that number:
1. Telephony most common sampling rate is 8000 samples per second.
2. Audio frames will be 20ms long.
3. Each sample will be encoded using 16 bits ( “slinear” format ).
Given those assumptions we can then calculate: At 8000 samples per second, 20ms audio frame will be made of
( 8000 samples per second * ( 20ms / 1000ms per second) )
that is 160 samples in one 20ms audio frame. Since each of those samples will be encoded with 16 bits, we have a audio frame byte length of:
( 160 samples * 16 bits per sample / 8 bits per byte ), that is 320 bytes, hence #define AST_CONF_FRAME_DATA_SIZE 320
Assumption of 8000 samples per second is an Asterisk requirement, currently Asterisk does not support other sampling rate. 16 bits per sample is assumed because the mixing will be done in slinear format, so this is ok too. However, 20ms audio frames is a *bad* assumption. Some common phones ( Linksys SPA? ) provide an option in their configuration page to change the RTP frame size, and that caused the crash. How many bytes does a 30ms frame has?
( 8000 samples per second * ( 30ms / 1000ms per second) )
that is 240 samples, each one of those samples of 2 bytes ( 16 bits ) give us a frame size of 480 bytes!, far more than the considered 320 bytes, and kaboom! we have a buffer overflow that could lead in the best case to a DoS attack.
Asterisk smoothers to the rescue!
Asterisk smoothers is a way to take variant size frames as input and return frames of single size. So, we can feed the smoother with 480 bytes frames or whichever size we get and the smoother will return us a frame of the size we want ( 320 bytes in this case ).
struct ast_smoother *ast_smoother_new(int size);
int ast_smoother_feed(struct ast_smoother *s, struct ast_frame *f, int swap);
struct ast_frame *ast_smoother_read(struct ast_smoother *s);
ast_smoother_free(struct ast_smoother *s);
So, I used this interface to smooth each frame with more samples ( hence size ) than required for the slinear audio mixing. Each time a new frame was read from a conference member I added this verification and partial fix code:
/* create the smoother if were receiving more samples than needed */ if ( AST_CONF_BLOCK_SAMPLES < fr->samples ) { if ( member->inSmoother == NULL ) { /* calculate bytes per sample */ ast_log(LOG_DEBUG, “%s frame has %d samples in %d bytesnâ€, member->channel_name, fr->samples, fr->datalen); float bytes_per_sample = ( (float)fr->datalen / (float)fr->samples ); ast_log(LOG_DEBUG, “%s frame has %f bytes per samplenâ€, member->channel_name, bytes_per_sample); float new_frame_len = ( AST_CONF_BLOCK_SAMPLES * bytes_per_sample ); /* WARNING, currently iLBC codec is not fully supported, sound is still choppy */ if ( fr->subclass == AST_FORMAT_ILBC ) { ast_log(LOG_DEBUG, “ILBC format only accepts datalen multiple of 50, so make it happynâ€); if ( new_frame_len < 50 ) { new_frame_len = 50; } } ast_log(LOG_DEBUG, “%s new frame size is %fnâ€, member->channel_name, new_frame_len); member->inSmoother = ast_smoother_new(new_frame_len); } ast_smoother_feed(member->inSmoother, fr); while ( ( sfr = ast_smoother_read( member->inSmoother ) ) ) { conf_frame* cfr = create_conf_frame( member, member->inFrames, sfr ) ; if ( cfr == NULL ) { ast_log( LOG_ERROR, “unable to malloc conf_framen†) ; return -1 ; } add_member_frame(member, cfr); } } else { conf_frame* cfr = create_conf_frame( member, member->inFrames, fr ) ; add_member_frame(member, cfr); }
Messy, but hey! it compiled! ship it! ( I don’t remember why in the hell I used float for bytes_per_sample, but im sure I had a good reason?? )
So, what I did was just calculate the proper frame size when more samples than needed were received, so, when converted to slinear, the frame will result in 240 bytes as required by mix_slinear_frame and it will not overflow mixing the audio properly.
Anyway, I still got a problem I did not solve at that time. For some reason iLBC codec only accepted frame sizes multiple of 50, and since slinear mixing required 240 ( not a multiple of 50 ) voice sounded choppy
I guess some improvements were made to app_conference since I used it last time, so I will test it with Asterisk 1.4 and let’s see how it goes …
I’ll post results later,