http://audioprograming.wordpress.com/2012/03/03/android-audio-streaming-with-opensl-es-and-the-ndk/


Audio streaming in Android is a topic that has not been covered well in the Android documentation or in programming examples. To help fill that gap, I would like to discuss the use of the OpenSL ES API through the Android Native Development Kit (NDK). For those of you who are new to Android programming, it is worth explaining briefly how the various components of the development system fit together.

First, we have the top-level application programming environment, the Android SDK, which is Java-based. It supports audio streaming via the AudioTrack API, which is part of the SDK. There are various examples of AudioTrack applications around, including the pd-android and SuperCollider for Android projects.

In addition to the SDK, Android also provides a slightly lower-level programming environment, called the NDK, which allows developers to write C or C++ code that can be used in the application via the Java Native Interface (JNI). Since Android 2.3, the NDK includes the OpenSL ES API, which has not been widely used at the time of writing. One project currently employing it is Csound for Android. This note discusses the use of the OpenSL API and the NDK environment for the development of audio streaming apps.

Setting up the development environment

For this, you will need to go to the Google Android development site and download all the tools. These include the SDK, the NDK and the Eclipse plugin. You will also need to get the Eclipse IDE; the ‘classic’ version is probably the most suitable for this work. Instructions for installing these packages are very clear, and there is plenty of information on the internet to help you if things go wrong.

Another useful tool for Android development is SWIG, which is used to create the Java code that wraps the C functions we will write. It is not strictly required, because you can use the JNI directly. However, it is very handy, as the JNI is not the easiest piece of development software around (some would call it ‘a nightmare’). SWIG wraps C code very well and it simplifies the process immensely. We will use it in the example discussed here.

An example project

The example project we will be discussing can be obtained via git with the following command:

$git clone https://bitbucket.org/victorlazzarini/android-audiotest

Alternatively, these sources can be obtained from the same location as an archive, via the web page interface.

The project consists of an NDK project for the OpenSL streaming IO module and an Eclipse project for the example application. The NDK part is built first by running the top-level script

$sh build.sh

This simple script first sets up the location of the downloaded NDK (you will need to edit this to match the location on your system):

export ANDROID_NDK_ROOT=$HOME/work/android-ndk-r7

and then proceeds to call SWIG to build the Java interface code that will link our C OpenSL example module to the app. SWIG creates both a C++ file wrapping the C code and the Java classes we need in order to use it:

swig -java -package opensl_example -includeall -verbose \
     -outdir src/opensl_example -c++ -I/usr/local/include \
     -I/System/Library/Frameworks/JavaVM.framework/Headers \
     -I./jni -o jni/java_interface_wrap.cpp opensl_example_interface.i
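
For reference, the interface file passed to SWIG (opensl_example_interface.i) declares what gets exposed to Java. Conceptually, such a file looks something like the sketch below; this is an illustration rather than the project's exact contents, and the header name is an assumption:

%module opensl_example

%{
/* declarations pulled into the generated C++ wrapper;
   the header name here is assumed for illustration */
#include "opensl_example.h"
%}

/* functions exposed to Java as static methods of the
   opensl_example class */
void start_process();
void stop_process();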

When this is done, it calls the NDK build script,

$ANDROID_NDK_ROOT/ndk-build TARGET_PLATFORM=android-9 V=1

which will build a dynamically loadable module (.so) containing our native code. This script is hardwired to use the Android.mk file in the ./jni directory.
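
For completeness, an Android.mk for a module like this might look roughly as follows. This is a sketch rather than the project's actual file; the module name and source list are assumptions based on the files described in this post:

LOCAL_PATH := $(call my-dir)

include $(CLEAR_VARS)
# native module: opensl_io.c (streaming IO), opensl_example.c (processing)
# and the SWIG-generated wrapper
LOCAL_MODULE    := opensl_example
LOCAL_SRC_FILES := opensl_io.c opensl_example.c java_interface_wrap.cpp
# link against the OpenSL ES library provided by the NDK
LOCAL_LDLIBS    := -lOpenSLES
include $(BUILD_SHARED_LIBRARY)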

Once the NDK part is built, we can turn to Eclipse. After starting it, we import the project using File->Import and the ‘Import into existing workspace’ option. Eclipse will ask for the project directory; we just browse to and select the top-level one (android-audiotest). If everything has gone according to plan, you can plug in your device and build and run the project as an Android application. The application will be built and run on the device. At this point you will be able to talk into the mic and hear your voice over the speakers (or, more appropriately, through a pair of headphones).

The native interface code

Two source files make up the native part of this project: opensl_io.c, which contains all the audio streaming functions, and opensl_example.c, which uses them to implement the simple audio processing example. A reference for the OpenSL API is found in the OpenSL ES 1.0.1 specification, which is also distributed in the Android NDK docs/opensl directory. There we also find some documentation specific to the Android implementation of the API, which is available online as well.

Opening the device for audio output

The entry point into OpenSL is through the creation of the audio engine, as in

result = slCreateEngine(&(p->engineObject), 0, NULL, 0, NULL, NULL);

This initialises an engine object of type SLObjectItf (which in the example above is held in a data structure pointed to by p). Once the engine is created, it needs to be realised (this is a common pattern with OpenSL objects: creation followed by realisation). An engine interface is then obtained, which will be used subsequently to open and initialise the input and output devices (with their sources and sinks):

result = (*p->engineObject)->Realize(p->engineObject, SL_BOOLEAN_FALSE);
...
result = (*p->engineObject)->GetInterface(p->engineObject,
                                     SL_IID_ENGINE, &(p->engineEngine));

Once the interface to the engine object is obtained, we can use it to create other API objects. In general, for all API objects, we:

  1. create the object (instantiation)
  2. realise it (initialisation)
  3. obtain an interface to it (to access any features needed), via the GetInterface() method

In the case of playback, the first object to be created, and then realised, is the Output Mix (also an SLObjectItf):

const SLInterfaceID ids[] = {SL_IID_VOLUME};
const SLboolean req[] = {SL_BOOLEAN_FALSE};
result = (*p->engineEngine)->CreateOutputMix(p->engineEngine,
                                    &(p->outputMixObject), 1, ids, req);
...
result = (*p->outputMixObject)->Realize(p->outputMixObject,
                                                 SL_BOOLEAN_FALSE);

As we will not need to manipulate the Output Mix, we do not need to get its interface. Next, we configure the source and sink of the player object we are about to create. For output, the source is going to be a buffer queue, which is where we will send our audio sample data. We configure it with the usual parameters: data format, number of channels, sampling rate (sr), etc.:

SLDataLocator_AndroidSimpleBufferQueue loc_bufq =
                           {SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 2};
SLDataFormat_PCM format_pcm = {SL_DATAFORMAT_PCM,channels,sr,
               SL_PCMSAMPLEFORMAT_FIXED_16, SL_PCMSAMPLEFORMAT_FIXED_16,
               speakers, SL_BYTEORDER_LITTLEENDIAN};
SLDataSource audioSrc = {&loc_bufq, &format_pcm};

and the sink is the Output Mix we created above:

SLDataLocator_OutputMix loc_outmix = {SL_DATALOCATOR_OUTPUTMIX,
                                                p->outputMixObject};
SLDataSink audioSnk = {&loc_outmix, NULL};

The audio player object then gets created with this source and sink, and realised:

const SLInterfaceID ids1[] = {SL_IID_ANDROIDSIMPLEBUFFERQUEUE};
const SLboolean req1[] = {SL_BOOLEAN_TRUE};
result = (*p->engineEngine)->CreateAudioPlayer(p->engineEngine,
                    &(p->bqPlayerObject), &audioSrc, &audioSnk,
                     1, ids1, req1);
...
result = (*p->bqPlayerObject)->Realize(p->bqPlayerObject, 
                                             SL_BOOLEAN_FALSE);

Then we get the play interface of the player object,

result = (*p->bqPlayerObject)->GetInterface(p->bqPlayerObject, 
                                 SL_IID_PLAY,&(p->bqPlayerPlay));

and the buffer queue interface (of type SLAndroidSimpleBufferQueueItf)

result = (*p->bqPlayerObject)->GetInterface(p->bqPlayerObject,
       SL_IID_ANDROIDSIMPLEBUFFERQUEUE, &(p->bqPlayerBufferQueue));

The OpenSL API provides a callback mechanism for audio IO. However, unlike other asynchronous audio IO implementations, such as CoreAudio or JACK, the callback does not pass the audio buffers for processing as one of its arguments. Instead, the callback is only used to signal the application, indicating that the buffer queue is ready to receive data.

The buffer queue interface obtained above is used to register a callback, bqPlayerCallback, with our data structure p passed as the context:

result = (*p->bqPlayerBufferQueue)->RegisterCallback(
                      p->bqPlayerBufferQueue,bqPlayerCallback, p);
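
The player callback itself can be minimal. It mirrors the recorder callback shown later in this post, except that it signals the processing thread via the output lock. Here is a sketch along those lines, assuming the same OPENSL_STREAM structure and thread-lock helpers used in the project:

void bqPlayerCallback(SLAndroidSimpleBufferQueueItf bq, void *context)
{
  /* signal that the device has consumed a buffer and the
     queue is ready for more data */
  OPENSL_STREAM *p = (OPENSL_STREAM *) context;
  notifyThreadLock(p->outlock);
}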

Finally, the player interface is used to start audio playback:

result = (*p->bqPlayerPlay)->SetPlayState(p->bqPlayerPlay,
                                            SL_PLAYSTATE_PLAYING);

Opening the device for audio input

The process of setting up audio recording is very similar to that for playback. First we define our source and sink, which will be the audio input device and a buffer queue, respectively:

SLDataLocator_IODevice loc_dev = {SL_DATALOCATOR_IODEVICE,
                      SL_IODEVICE_AUDIOINPUT,
                      SL_DEFAULTDEVICEID_AUDIOINPUT, NULL};
SLDataSource audioSrc = {&loc_dev, NULL};
...
SLDataLocator_AndroidSimpleBufferQueue loc_bq =
                      {SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 2};
SLDataFormat_PCM format_pcm = {SL_DATAFORMAT_PCM, channels, sr,
          SL_PCMSAMPLEFORMAT_FIXED_16, SL_PCMSAMPLEFORMAT_FIXED_16,
          speakers, SL_BYTEORDER_LITTLEENDIAN};
SLDataSink audioSnk = {&loc_bq, &format_pcm};

Then we create the audio recorder object, realise it and get its record interface:

const SLInterfaceID id[1] = {SL_IID_ANDROIDSIMPLEBUFFERQUEUE};
const SLboolean req[1] = {SL_BOOLEAN_TRUE};
result = (*p->engineEngine)->CreateAudioRecorder(p->engineEngine,
                              &(p->recorderObject), &audioSrc,
                               &audioSnk, 1, id, req);
...
result = (*p->recorderObject)->Realize(p->recorderObject,
                                          SL_BOOLEAN_FALSE);
...
result = (*p->recorderObject)->GetInterface(p->recorderObject,
                           SL_IID_RECORD, &(p->recorderRecord));

The buffer queue interface is obtained and the callback set:

result = (*p->recorderObject)->GetInterface(p->recorderObject,
     SL_IID_ANDROIDSIMPLEBUFFERQUEUE, &(p->recorderBufferQueue));
...
result = (*p->recorderBufferQueue)->RegisterCallback(
                   p->recorderBufferQueue, bqRecorderCallback,p);

We can now start audio recording:

result = (*p->recorderRecord)->SetRecordState(
                      p->recorderRecord,SL_RECORDSTATE_RECORDING);

Audio IO

Streaming audio to/from the device is done with the Enqueue() method of the buffer queue interface (shown here in its generic SLBufferQueueItf form; the Android simple buffer queue interface has an Enqueue() with the same arguments):

SLresult (*Enqueue) (SLBufferQueueItf self,
                     const void *pBuffer, SLuint32 size);

This should be called whenever the buffer queue is ready for a new data buffer (either for input or output). As soon as the player or recorder object is set into the playing or recording state, the buffer queue is ready for data. After that, the callback mechanism is responsible for signaling the application that the buffer queue is ready for another block of data. We can call the Enqueue() method in the callback itself, or elsewhere. If we opt for the former, then to get the callback mechanism running we need to enqueue a buffer as we start recording or playing; otherwise the callback will never be called.

An alternative, used here, is to employ the callback only to notify the application, which waits for it whenever it has a full buffer to deliver (or an empty one to fill). In this case we use a double buffer: while one half is enqueued, the other is being filled or consumed by the application. This allows us to create a simple interface whose functions either take a block of audio to be written to the output buffer or return a block of samples read from the input buffer.
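
The members of the OPENSL_STREAM structure involved in this double-buffering scheme are roughly the following. This is a sketch reconstructed from the way they are used in the functions below; the actual structure also holds the engine, output mix, player and recorder objects, among other things:

typedef struct _opensl_stream {
  /* double buffers for input and output */
  short *inputBuffer[2], *outputBuffer[2];
  /* which half is currently being filled/consumed by the application */
  int currentInputBuffer, currentOutputBuffer;
  /* current sample position within that half */
  int currentInputIndex, currentOutputIndex;
  /* buffer sizes in samples, channel counts and sampling rate */
  int inBufSamples, outBufSamples;
  int inchannels, outchannels, sr;
  /* running stream time in seconds */
  double time;
  /* locks used by the callbacks to notify the processing thread */
  void *inlock, *outlock;
  /* ... plus the OpenSL objects and their interfaces ... */
} OPENSL_STREAM;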

Here is what we do for input. The callback is very minimal; it just notifies our processing thread that the buffer queue is ready:

void bqRecorderCallback(SLAndroidSimpleBufferQueueItf bq, void *context)
{
  OPENSL_STREAM *p = (OPENSL_STREAM *) context;
  notifyThreadLock(p->inlock);
}

Meanwhile, the processing loop calls the audio input function to get a block of samples. When the current input buffer has been fully consumed, we wait for the notification, enqueue that buffer to be refilled by the device, and switch to the other one:

int android_AudioIn(OPENSL_STREAM *p, float *buffer, int size){
  short *inBuffer;
  int i, bufsamps, index;
  if(p == NULL || p->inBufSamples == 0) return 0;
  bufsamps = p->inBufSamples;
  index = p->currentInputIndex;

  inBuffer = p->inputBuffer[p->currentInputBuffer];
  for(i=0; i < size; i++){
    /* when the current input buffer is exhausted, wait for the
       callback notification, enqueue it to be refilled and swap */
    if (index >= bufsamps) {
      waitThreadLock(p->inlock);
      (*p->recorderBufferQueue)->Enqueue(p->recorderBufferQueue,
                     inBuffer, bufsamps*sizeof(short));
      p->currentInputBuffer = (p->currentInputBuffer ? 0 : 1);
      index = 0;
      inBuffer = p->inputBuffer[p->currentInputBuffer];
    }
    /* convert 16-bit samples to floats in the -1 to 1 range */
    buffer[i] = (float) inBuffer[index++]*CONVMYFLT;
  }
  p->currentInputIndex = index;
  /* if there is no output stream, keep the time count here */
  if(p->outchannels == 0) p->time += (double) size/(p->sr*p->inchannels);
  return i;
}

For output, we do the reverse. The callback is exactly the same, except that it now notifies us (via the output lock) that the device has consumed a buffer. In the processing loop, we call a function that fills the output buffer with the blocks we pass to it. When the buffer is full, we wait for the notification so that we can enqueue the data and switch buffers:

int android_AudioOut(OPENSL_STREAM *p, float *buffer, int size){
  short *outBuffer;
  int i, bufsamps, index;
  if(p == NULL || p->outBufSamples == 0) return 0;
  bufsamps = p->outBufSamples;
  index = p->currentOutputIndex;
  outBuffer = p->outputBuffer[p->currentOutputBuffer];

  for(i=0; i < size; i++){
    /* convert floats to 16-bit samples */
    outBuffer[index++] = (short) (buffer[i]*CONV16BIT);
    /* when the current output buffer is full, wait for the
       callback notification, enqueue it and swap */
    if (index >= bufsamps) {
      waitThreadLock(p->outlock);
      (*p->bqPlayerBufferQueue)->Enqueue(p->bqPlayerBufferQueue,
                     outBuffer, bufsamps*sizeof(short));
      p->currentOutputBuffer = (p->currentOutputBuffer ? 0 : 1);
      index = 0;
      outBuffer = p->outputBuffer[p->currentOutputBuffer];
    }
  }
  p->currentOutputIndex = index;
  p->time += (double) size/(p->sr*p->outchannels);
  return i;
}

The interface

The code discussed above is structured into a minimal API for audio streaming with OpenSL. It contains five functions (and one opaque data structure):

/*
  Open the audio device with a given sampling rate (sr), input and
  output channels and IO buffer size in frames.
  Returns a handle to the OpenSL stream
*/
OPENSL_STREAM* android_OpenAudioDevice(int sr, int inchannels,
                                int outchannels, int bufferframes);
/*
Close the audio device
*/
void android_CloseAudioDevice(OPENSL_STREAM *p);
/*
Read a buffer from the OpenSL stream *p, of size samples.
Returns the number of samples read.
*/
int android_AudioIn(OPENSL_STREAM *p, float *buffer,int size);
/*
Write a buffer to the OpenSL stream *p, of size samples.
Returns the number of samples written.
*/
int android_AudioOut(OPENSL_STREAM *p, float *buffer,int size);
/*
Get the current IO block time in seconds
*/
double android_GetTimestamp(OPENSL_STREAM *p);
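
The timestamp is the running count of stream time in seconds, accumulated by the IO functions as they consume or produce sample blocks. A caller on the processing thread might use it simply as follows (a trivial sketch):

/* elapsed stream time, in seconds, since the device was opened */
double t = android_GetTimestamp(p);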

Processing

The example is completed by a trivial processing function, start_process(), which is wrapped in Java (via SWIG) so that it can be called by the application. It employs the API described above to copy the mono input to both channels of the stereo output:

p = android_OpenAudioDevice(SR,1,2,BUFFERFRAMES);
...
while(on) {
  samps = android_AudioIn(p,inbuffer,VECSAMPS_MONO);
  for(i = 0, j = 0; i < samps; i++, j += 2)
    outbuffer[j] = outbuffer[j+1] = inbuffer[i];
  android_AudioOut(p,outbuffer,VECSAMPS_STEREO);
}
android_CloseAudioDevice(p);

A stop_process() function is also supplied, so that we can stop the streaming when the application is closed.
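
How this works is straightforward (a minimal sketch, assuming a shared flag named on, as used in the loop above):

static int on;   /* set to 1 by start_process() before it enters its loop */

void stop_process()
{
  /* clearing the flag makes the while(on) loop exit,
     after which the audio device is closed */
  on = 0;
}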

The application code

Finally, completing the project, we have a small Java class, based on the Eclipse auto-generated application code, with the addition of a secondary thread and calls to the two wrapped native functions described above:

public class AudiotestActivity extends Activity {
    /** Called when the activity is first created. */
    Thread thread;
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        thread = new Thread() {
            public void run() {
                setPriority(Thread.MAX_PRIORITY);
                opensl_example.start_process();
            }
        };
        thread.start();   
    }
    public void onDestroy(){
        super.onDestroy();
        opensl_example.stop_process();
        try {
            thread.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        thread = null;
    }
}

Final Words

I hope these notes have been useful in shedding some light on the development of audio processing applications using native code and OpenSL. While it does not (yet) offer a lower-latency alternative to the SDK's AudioTrack, it can still potentially deliver better performance, as native code is not subject to virtual machine overheads such as garbage collection pauses. It also seems to be the way forward for audio development on Android. In the NDK documentation on the Android OpenSL implementation, we read that

as the Android platform and specific device implementations continue to evolve, an OpenSL ES application can expect to benefit from any future system performance improvements.

This indicates that Android audio developers should perhaps take notice of OpenSL. The lack of examples has not been very encouraging, but I hope this post is a step towards changing that.