Converting 8.24 bit samples in CoreAudio on iOS

When working with CoreAudio on iOS, many of the sample applications use the iPhone’s canonical audio format, which is 32 bit 8.24 fixed-point audio. This is because it is the hardware’s ‘native’ format.

You end up with a buffer of fixed point data, which is a bit of a pain to deal with, because other libraries and source code tend to work with floating point samples between -1.0 and +1.0 or signed 16 bit integer samples.

You could force CoreAudio to give you 16 bit integer samples to start with (which means it does the conversion for you before giving you the audio buffer) or you could do the conversion yourself, as and when you need to. Converting only when you need to can be a more efficient way of doing things, depending on your needs.

In this post I want to show you how you can convert the native 8.24 fixed point sample data into 16 bit integer and/or floating point sample data…and give you an explanation of how it works. But first, I need to de-mystify some stuff to do with bits and bytes.

Bit Order != Byte Order

In Objective-C you can think of the bits of a binary number going from left to right. Just as in base 10, the most significant digit is the left-most digit:

128| 64| 32| 16| 8 | 4 | 2 | 1
 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0

The above binary number represents the integer 66 (64 + 2). We can apply bit shift operations to binary numbers, so if I shift all the bits right (>>) by 1 place I get:

128| 64| 32| 16| 8 | 4 | 2 | 1
 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1

This represents an integer value of 33 (32 + 1). The left-most bit has been newly introduced and padded with 0.
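If you want to check the shift for yourself, here is a minimal C sketch. The function name `shift_right_one` is made up for this example, and it uses the standard `uint8_t` type rather than anything from CoreAudio.

```c
#include <stdint.h>

/* Shift an unsigned 8-bit value right by one place.
   The right-most bit falls off and the vacated left-most bit becomes 0. */
uint8_t shift_right_one(uint8_t value)
{
    return (uint8_t)(value >> 1);
}
```

Calling `shift_right_one(66)` takes 01000010 to 00100001, which is 33.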

So. When you are thinking about bits, and bit shifting operations, think left to right in terms of significance. Got that? Right, now let’s move onto bytes.

The above examples dealt with a single byte (8 bits). When a multi-byte number is represented in a byte array, it can be either little endian or big endian. On Intel, and in terms of CoreAudio, little endian is used. This means the BYTE with the most significance has the highest memory address and the BYTE with the least significance has the lowest memory address (little-end-first = little-endian).
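To see byte order in action, here is a small C sketch that looks at how a 4 byte number is actually laid out in memory. The helper names `byte_at_offset` and `is_little_endian` are hypothetical, and the value 0x11223344 is just a convenient test pattern.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the byte stored at memory offset `index` (0 = lowest address)
   of a 32-bit value. */
uint8_t byte_at_offset(uint32_t value, size_t index)
{
    uint8_t bytes[sizeof value];
    memcpy(bytes, &value, sizeof value); /* copy the raw in-memory bytes */
    return bytes[index];
}

/* Returns 1 on a little endian machine, where the least significant
   byte (0x44 here) sits at the lowest address. */
int is_little_endian(void)
{
    return byte_at_offset(0x11223344u, 0) == 0x44;
}
```

Note that `0x11223344u >> 24` gives 0x11 regardless of what `is_little_endian()` returns, because shifts work on the value and not on the memory layout.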

See this post for why this is important when dealing with raw sample data in CoreAudio, and this post on codeproject for a more in-depth explanation. The most important thing to realise is that bit order and byte order significance are different beasts. Don’t get confused.

For the rest of this post, we are dealing with the representation of the binary digits from the perspective of the language, not the architecture, i.e. think in terms of bit order and not byte order.

Converting 8.24 bit samples to 16 bit integer samples

What does this mean? It means we are going to:

  • Preserve the sign of the sample data (+/- bit)
  • Throw away 8 bits of the 24 bit sample. We assume these bits contain extra precision that we just don’t need or are not interested in.
  • Be left with a signed 16 bit sample. A signed 16 bit integer can range from -32,768 to 32,767. This will be the resulting range of our sample.

Remember, we are thinking in terms of bit order; the most significant bit (or the ‘high order’ bit) is the left-most bit. Here is an example of a 32 bit (4 byte), 8.24 fixed point sample:

  8 bits  |         24 bit sample
 11111111 | 01101010 | 00011101 | 11001011

In 8.24 fixed point samples, the first 8 bits represent the sign. They are either all 0 or all 1. The next 24 bits represent the sample data. We want to preserve the sign, but chuck away 8 bits of the sample data to give us a signed 16 bit integer sample.
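The ‘all 0 or all 1’ rule only holds while the sample stays within the normal -1.0 to +1.0 range. As a rough sketch, you could check a sample like this. The function name `top_bits_are_sign_only` is invented for this example, and `int32_t` stands in for CoreAudio’s `SInt32`.

```c
#include <stdint.h>

/* Returns 1 when the top 8 bits of an 8.24 sample are all 0 or all 1,
   i.e. they carry nothing but the sign. */
int top_bits_are_sign_only(int32_t sample)
{
    uint32_t top = ((uint32_t)sample >> 24) & 0xFFu; /* isolate the top 8 bits */
    return top == 0x00u || top == 0xFFu;
}
```

A sample of exactly 1.0 is stored as `1 << 24`, which sets a bit in the top byte, so this check returns 0 for it.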

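The trick is to shift the bits 9 places to the right. It’s a crafty move. This is what happens to our 32 bits of data if we shift them right 9 places: 9 bits fall off the end, the sign bits get shunted up and 9 new bits appear on the left. The diagram shows the new bits padded with zeros; for a signed type the compiler will normally pad with copies of the sign bit instead, but it makes no difference here because those bits are about to be thrown away. We get left with:

  new bits |sign bits|                         gone
 00000000 | 01111111 | 10110101 | 00001110    111001011
                     |   first 16 bits    |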

We still have 32 bits of data with the bits shunted up. We are only interested in the first 16 bits of data (the right-most bits) that now contain the most significant bits of the 24 bit sample data. A brilliant side effect is that the first (left-most) bit of the first 16 bits represents the sign!

By casting the resulting 32 bits to a 16 bit signed integer we take the first 16 bits, which are the bits we want, and we have a signed 16 bit sample that ranges from -32,768 to 32,767. If we want this as a floating point value between -1.0 and +1.0 we can now simply divide by 32,768. Voilà.

The code is thus:

SInt16 sampleInt16 = (SInt16)(originalSample >> 9);
float sampleFloat = sampleInt16 / 32768.0;

Simple when you know how. And why!
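For anyone who wants to try this outside of an Xcode project, here is a slightly fuller sketch of the same conversion in plain C. It is only a sketch under a few assumptions: `int32_t` and `int16_t` stand in for `SInt32` and `SInt16`, the function names are made up for this example, the samples are within the normal -1.0 to +1.0 range, and the compiler does an arithmetic (sign preserving) right shift on negative numbers, which is what you normally get in practice.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one 8.24 fixed point sample to a signed 16 bit sample.
   Shifting right by 9 drops the low 9 bits; the cast keeps the low 16
   bits of what remains, whose top bit is the sign. */
int16_t fixed824_to_int16(int32_t sample)
{
    return (int16_t)(sample >> 9);
}

/* Convert one 8.24 fixed point sample to a float between -1.0 and +1.0,
   going via the 16 bit value as described above. */
float fixed824_to_float(int32_t sample)
{
    return (float)fixed824_to_int16(sample) / 32768.0f;
}

/* Convert a whole buffer of 8.24 samples to 16 bit samples.
   `dst` must have room for `count` samples. */
void convert_buffer(const int32_t *src, int16_t *dst, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        dst[i] = fixed824_to_int16(src[i]);
    }
}
```

The example sample from earlier, 11111111 01101010 00011101 11001011, is -9822773 as a signed 32 bit integer, and `fixed824_to_int16` turns it into -19186 (binary 10110101 00001110). A sample of 0.5, stored as `1 << 23`, comes out as 16384, or 0.5 as a float.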



Playing a Sound from a Dashcode Widget

I’ve just begun to play around with Dashcode for the first time. I wanted to develop a widget that, when clicked, plays a sound. Figured it would be pretty simple. It nearly was.

The general steps are as follows:

  • Launch Dashcode and select ‘Custom’ from the Dashboard section.
  • Click on Library (top right) and click Parts
  • Drag a button onto the widget
  • Drag and drop a sound file (.m4a or any other file supported by QuickTime) from Finder onto the widget
  • For the sake of following these instructions, rename the sound file element from ‘video’ to ‘sfx’

Okay. Now we just have to write some code that will play the sound when you click the button. Fortunately or unfortunately, Apple provide some example code. If you go to the Library again and select Code instead of Parts and type ‘play quicktime’ in the search box you get some sample code. This is what you will see:

// Values you provide
var qtElement = document.getElementById("elementID");	// replace with the ID of a QuickTime element

// QuickTime code
qtElement.Play();

Now, at this point it should simply be a case of opening up the inspector, navigating to the events tab (far right), creating an onclick handler, pasting in the code and renaming ‘elementID’ to ‘sfx’.

Well, it would be, if the sample code wasn’t completely wrong! The code that actually works is below:

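function onButtonClick(event) {
    var qtElement = document.getElementById("sfx").firstChild; // the div's first child is the media player
    qtElement.play(); // lower case play(), not Play()
}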

There are two problems with the original sample code.

  1. The actual element is just a div container. To get to the media player element you have to access the first child of this container.
  2. The Play() method has the wrong casing and should be lower case, not upper case.

It kinda makes you wonder what the point of providing sample code is if it doesn’t actually work? Anyhow. Hope someone finds this useful.

I would be interested to know if anyone knows of a better way of playing a sound effect. If you do, please post your suggestion in the comments.