Delegates and Native API Callbacks - Answer

§ April 16, 2009 16:35 by beefarino |

A while back I posted a little puzzle about an exception I was hitting after passing a delegate to a native API call.  In a nutshell, my application was passing an anonymous delegate to an unmanaged library call:

ResultCode result = ExternalLibraryAPI.Open(
    deviceHandle,
    delegate( IntPtr handle, int deviceId )
    {
        // ...
    }
);
VerifyResult( result ); 

The call returns immediately, and the unmanaged library would eventually invoke the callback in response to a user pressing a button on a device.  After a semi-random period, pressing the button would yield an exception in my application.  The questions I asked were:

  1. What's the exception?
  2. How do you avoid it?

I've waited a bit to post the answers to see if anyone besides Zach would chime in.  Zach correctly identified the nut of the problem - the delegate is being garbage collected because there is no outstanding reference to the anonymous delegate once the unmanaged call returns.  With no one referencing the delegate object, the garbage collector is free to reclaim it.  When the device button is pushed, the native library invokes the callback, which no longer exists in memory.  So, to answer the first question, the specific exception that is raised is a CallbackOnCollectedDelegate (strictly speaking, a managed debugging assistant that the debugger reports much like an exception):

CallbackOnCollectedDelegate was detected.
Message: A callback was made on a garbage collected delegate of type 'Device.Interop!Device.Interop.ButtonPressCallback::Invoke'. This may cause application crashes, corruption and data loss. When passing delegates to unmanaged code, they must be kept alive by the managed application until it is guaranteed that they will never be called.

The verbiage in this exception message answers my second question.  To avoid the exception, you need to hold a reference to any delegate you pass to unmanaged code for as long as you expect the delegate to be invoked.  You need to use an intermediary reference to the delegate to maintain its life.

Based on this, every one of these examples is doomed to fail eventually, because none of them maintains a reference to the delegate object being passed to the unmanaged library:

ExternalLibraryAPI.Open(
    deviceHandle,
    delegate( IntPtr handle, int deviceId )
    {
        // ...
    }
);
 
ExternalLibraryAPI.Open(
    deviceHandle,
    new ButtonPressCallback( this.OnButtonPress )
);
 
ExternalLibraryAPI.Open(
    deviceHandle,
    this.OnButtonPress
);

The correct way to avoid the problem is to hold an explicit reference to the specific delegate instance being passed to unmanaged code:

ButtonPressCallback buttonPressCallback = this.OnButtonPress;
ExternalLibraryAPI.Open(
    deviceHandle,
    buttonPressCallback
);
// hold the reference until we're sure no
// further callbacks will be made on the
// delegate, then we can release the
// reference and allow it to be GC'ed
buttonPressCallback = null;
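
The snippet above uses a local variable for brevity.  A local only keeps the delegate reachable while the method is running (and in optimized builds the JIT may treat a local as dead after its last use), so in practice the reference usually lives in a field.  Here is a minimal sketch of that idea, assuming a hypothetical ButtonPressSubscription wrapper; the class and its members are made up for illustration, and Marshal.GetFunctionPointerForDelegate merely stands in for handing the delegate to ExternalLibraryAPI.Open:

```csharp
using System;
using System.Runtime.InteropServices;

// Same shape as the callback in this post; redeclared so the sketch is self-contained.
public delegate void ButtonPressCallback( IntPtr deviceHandle, int deviceId );

// Hypothetical wrapper: owns the delegate for as long as callbacks may arrive.
public sealed class ButtonPressSubscription : IDisposable
{
    // The field is the outstanding reference: while this object is
    // reachable, the delegate (and its native stub) cannot be collected.
    private ButtonPressCallback callback;

    public int PressCount { get; private set; }

    public ButtonPressSubscription()
    {
        callback = OnButtonPress;
    }

    // True until Dispose releases the delegate.
    public bool IsAlive
    {
        get { return null != callback; }
    }

    // Stand-in for passing the delegate to the native library: the
    // marshaller hands unmanaged code a function pointer like this one.
    public IntPtr GetNativePointer()
    {
        if( null == callback )
        {
            throw new ObjectDisposedException( "ButtonPressSubscription" );
        }
        return Marshal.GetFunctionPointerForDelegate( callback );
    }

    private void OnButtonPress( IntPtr deviceHandle, int deviceId )
    {
        PressCount++;
    }

    // Call only once the native library will make no further callbacks.
    public void Dispose()
    {
        callback = null;
    }
}
```

The owner keeps the subscription in one of its own fields and disposes it only after the native library has been told to stop calling back.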

At first I thought Zach's pinning solution was correct; however, you can only pin blittable types, and delegates are not blittable, so "pinning a delegate" is neither possible nor necessary.  The details of how delegates are marshalled across the managed/unmanaged boundary are quite interesting, as I found out from Chris Brumme's blog:

Along the same lines, managed Delegates can be marshaled to unmanaged code, where they are exposed as unmanaged function pointers.  Calls on those pointers will perform an unmanaged to managed transition; a change in calling convention; entry into the correct AppDomain; and any necessary argument marshaling.  Clearly the unmanaged function pointer must refer to a fixed address.  It would be a disaster if the GC were relocating that!  This leads many applications to create a pinning handle for the delegate.  This is completely unnecessary.  The unmanaged function pointer actually refers to a native code stub that we dynamically generate to perform the transition & marshaling.  This stub exists in fixed memory outside of the GC heap.

However, the application is responsible for somehow extending the lifetime of the delegate until no more calls will occur from unmanaged code.  The lifetime of the native code stub is directly related to the lifetime of the delegate.  Once the delegate is collected, subsequent calls via the unmanaged function pointer will crash or otherwise corrupt the process. 

Thanks again Zach, and to everyone who reads my blog!



Delegates and Native API Callbacks

§ March 27, 2009 02:07 by beefarino |

For one of my contracts I'm implementing a .NET layer over a native third-party SDK.  The SDK makes heavy use of function pointers, which means, in terms of p/invoke and interoperability, I've got a lot of delegates flying around.  I recently hit a rather nasty bug in my .NET layer - see if you can figure this one out.

Consider this truncated example derived from the SDK:

typedef void (__stdcall *PCALLBACK)( HANDLE deviceHandle, LONG32 deviceID );
// ...
API_CALL RESULT_TYPE __stdcall Open( HANDLE deviceHandle, PCALLBACK pfnCallback );

In a nutshell, when Open is called, the pfnCallback function pointer is registered, along with the handle.  The Open function returns almost immediately, and the native library invokes the callback function periodically in response to user interaction with a device.

Here is the C# that enables the Open method to be called from .NET:

public delegate void Callback( IntPtr deviceHandle, int deviceID );
// ...
[DllImport( "ExternalLibrary.dll")]
public static extern ResultCode Open( IntPtr deviceHandle, Callback callback );

And here it is in use:

ResultCode result = ExternalLibraryAPI.Open(
    deviceHandle,
    delegate( IntPtr handle, int deviceId )
    {
        // ...
    }
);
VerifyResult( result ); 

The call to Open succeeds, and the callback is invoked whenever the user pushes the appropriate button on the device.  Eventually though, a nasty exception is raised as the user continues to fiddle with the device. 

I will tell you that:

  • there is no context or stack information to the exception;
  • there is not a bug in the third-party SDK;
  • this post has all the information you need to find and fix the problem - i.e., I'm not hiding something from you.

So my questions to you are:

  1. What's the exception?
  2. How do you avoid it?

Leave your answers as comments.  I'll post the answer in a few days.


Overlapped I/O Aborted by Terminating Thread

§ October 17, 2008 07:10 by beefarino |

A while back I posted about using Overlapped I/O from the .NET Framework.  I've started integrating the hardware with the rest of the project and hit a snag.  It seems that if a thread makes an overlapped I/O request and later terminates, the I/O request is aborted and your IOCompletionCallback routine receives error code 995 (system error code ERROR_OPERATION_ABORTED, which surfaces in .NET as a System.IO.IOException): "The I/O operation has been aborted because of either a thread exit or an application request".  I haven't looked into why this happens, but functionally it seems that the Windoze kernel assumes that the I/O request is valid only as long as the requesting thread is alive and kicking, which seems either perfectly reasonable or unreasonable depending on your perspective.  If you do happen to know the specifics of the kernel's behavior here, please comment; you'll save me some digging.

Example

Here is a unit test that illustrates the problem; a brief walkthrough follows the code:

[Test]
public void OIOTerminatesOnThreadExit()
{
    TcpListener listener = new TcpListener( IPAddress.Any, 8888 );
    TcpClient client = new TcpClient();

    Exception exception = null;

    Thread listeningThread = new Thread(
        delegate()
        {
            listener.Start();

            // block until we receive a client
            TcpClient myClient = listener.AcceptTcpClient();

            // initiate an overlapped I/O operation on
            //  the underlying socket
            myClient.GetStream().BeginRead(
                new byte[ 16 ], 0, 15,                    
                r => {
                    try
                    {
                        // calling EndRead should
                        //    yield an exception            
                        myClient.GetStream().EndRead( r );
                    }
                    catch( Exception e )
                    {
                        // save the exception for later
                        //  assertion and validation
                        exception = e;
                    }
                },
                null
            );
        }
    );

    // start the listening thread
    listeningThread.Start();

    // connect to the TcpListener, so it can initiate an
    //  overlapped I/O operation
    client.Connect( Dns.GetHostName(), 8888 );

    // wait for the listening thread to finish
    listeningThread.Join();

    // verify
    Assert.IsNotNull( exception );
    Assert.IsInstanceOfType( typeof( IOException ), exception );
    StringAssert.Contains(
        "The I/O operation has been aborted because of either a thread exit or an application request",
        exception.Message
    );
}

Note that for brevity this test contains no error handling or TCP timeouts, which it really should.

The test creates a thread that starts a TcpListener and waits for a connection.  Once a connection is established the thread issues an overlapped I/O read request on the network stream.  The AsyncCallback handler for the BeginRead operation just calls EndRead, saving any exception that occurs for further scrutiny.  Immediately following the overlapped I/O request, the listeningThread terminates normally.

Once the listeningThread is started, the unit test uses a TcpClient to connect to the TcpListener.  This will allow the listeningThread to make the overlapped I/O request and terminate.  After the TcpClient is connected, the test waits for the listeningThread to terminate.  

At this point, the test verifies that an exception was received from the BeginRead AsyncCallback handler and validates its type and content.
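
One thing the test glosses over: the callback runs on a different thread from the one doing the asserting, so strictly speaking the assertions could run before the callback has stored the exception.  Here is a minimal sketch of one way to close that gap, assuming a hypothetical CallbackProbe helper (it is not part of NUnit or the framework) that records whatever the callback throws and signals an event the test can wait on with a timeout:

```csharp
using System;
using System.Threading;

// Hypothetical helper: captures whatever a callback throws and lets the
// test block until the callback has actually run.
public sealed class CallbackProbe
{
    private readonly ManualResetEvent completed = new ManualResetEvent( false );
    private Exception captured;

    // Wrap the body of the AsyncCallback in this method.
    public void Run( Action body )
    {
        try
        {
            body();
        }
        catch( Exception e )
        {
            // save the exception for later assertion
            captured = e;
        }
        finally
        {
            // signal even on success so the waiter never hangs
            completed.Set();
        }
    }

    // Returns false if the callback did not finish within the timeout.
    public bool WaitForCallback( int millisecondsTimeout )
    {
        return completed.WaitOne( millisecondsTimeout );
    }

    // Null when the callback completed without throwing.
    public Exception CapturedException
    {
        get { return captured; }
    }
}
```

In the test, the BeginRead callback would become r => probe.Run( () => myClient.GetStream().EndRead( r ) ), and the test would assert that probe.WaitForCallback( 5000 ) returns true before inspecting probe.CapturedException.  The five second timeout is an arbitrary choice.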

Workaround

My current workaround is pretty simple: kick off the I/O operation from the thread pool instead of an application thread.  Thread pool threads don't really terminate like application threads; they just go back into the pool when their work unit is complete, and the kernel seems content to honor I/O requests from the thread pool even after the thread is returned to the pool (e.g., when you call an asynchronous BeginRead operation from your AsyncCallback handler).

Here's the example with the workaround applied:

[Test]
public void OIODoesNotTerminateOnThreadPoolThreadExit()
{
    TcpListener listener = new TcpListener( IPAddress.Any, 8888 );
    TcpClient client = new TcpClient();

    Exception exception = null;

    // start the listeningThread on the thread pool
    ThreadPool.QueueUserWorkItem(
        delegate( object unused )
        {
            listener.Start();

            // block until we receive a client
            TcpClient myClient = listener.AcceptTcpClient();

            // initiate an overlapped I/O operation on
            //  the underlying socket
            myClient.GetStream().BeginRead(
                new byte[ 16 ], 0, 15,                    
                r => {
                    try
                    {
                        myClient.GetStream().EndRead( r );
                    }
                    catch( Exception e )
                    {
                        // save the exception for later
                        //  assertion and validation
                        exception = e;
                    }
                },
                null
            );
        }
    );

    // connect to the TcpListener, so it can initiate an
    //  overlapped I/O operation
    client.Connect( Dns.GetHostName(), 8888 );

    // verify
    Assert.IsNull( exception );            
}       

The only changes here are that the listening thread is queued on the thread pool, and the test verifies that the exception remains null.  Using the thread pool seems to counter the kernel behavior and keep the outstanding overlapped I/O requests active even after the thread work unit is complete.



Using Overlapped I/O from Managed Code

§ July 26, 2008 04:01 by beefarino |

I've been working on a project that requires some low-level device driver interaction using the DeviceIoControl Windows API call.  Without going into too much detail, the driver's programming interface requires a thread-heavy implementation to monitor low-level GPIO inputs for signal changes.  Fortunately Windows has a built-in asynchronous programming model called Overlapped I/O that saves me the overhead of having to manage a bunch of threads for such low-level tripe.  In a nutshell, Overlapped I/O is the backbone of the asynchronous reading and writing of files and network streams in the .NET framework.

Unfortunately, there is no managed equivalent for device communication in the .NET framework; hand-rolling access to the DeviceIoControl API call is simple enough, but leveraging overlapped I/O from managed code requires some digging.  There are very few examples available, so I decided to post one in case anyone else out there happens to be working on such obscure (but very interesting!) projects too.

There are five steps to leveraging Overlapped I/O from managed code:

  1. Access the relevant routines from Kernel32.dll via P/Invoke;
  2. Create a file/device/pipe handle tagged for overlapped I/O;
  3. Bind the handle to the managed Thread Pool;
  4. Prepare a NativeOverlapped pointer;
  5. Pass the NativeOverlapped pointer to the relevant Win32 API (DeviceIoControl in my case).

Kernel32 P/Invoke

Using overlapped I/O requires some direct access to routines from kernel32.dll.  Below is a static class that imports the routines and defines the constants needed.  If you have any questions or suggestions please feel free to post a comment.

using System;
using System.Runtime.InteropServices;
using System.Threading;
using Microsoft.Win32.SafeHandles;
namespace PInvoke
{
    /// <summary>
    /// Static imports from Kernel32.
    /// </summary>
    static public class Kernel32
    {
        // FILE_FLAG_OVERLAPPED; public so callers can pass it to CreateFile
        public const uint Overlapped = 0x40000000;
        
        [ Flags ]
        public enum AccessRights : uint
        {
            GENERIC_READ                     = (0x80000000),
            GENERIC_WRITE                    = (0x40000000),
            GENERIC_EXECUTE                  = (0x20000000),
            GENERIC_ALL                      = (0x10000000)
        }
        [ Flags ]
        public enum ShareModes : uint
        {
            FILE_SHARE_READ                 = 0x00000001,
            FILE_SHARE_WRITE                = 0x00000002,
            FILE_SHARE_DELETE               = 0x00000004  
        }
        public enum CreationDispositions
        {
            CREATE_NEW          = 1,
            CREATE_ALWAYS       = 2,
            OPEN_EXISTING       = 3,
            OPEN_ALWAYS         = 4,
            TRUNCATE_EXISTING   = 5
        }
        [ DllImport( "kernel32.dll", SetLastError=true, CharSet=CharSet.Unicode ) ]
        public static extern SafeFileHandle CreateFile(
            string lpFileName,
            uint dwDesiredAccess,
            uint dwShareMode,
            IntPtr lpSecurityAttributes,
            uint dwCreationDisposition,
            uint dwFlagsAndAttributes,
            IntPtr hTemplateFile
        );
        [ DllImport( "kernel32.dll", SetLastError=true, CharSet=CharSet.Unicode ) ]
        public static extern bool CloseHandle(
            SafeHandle handle
        );
        [ DllImport( "kernel32.dll", SetLastError=true, CharSet=CharSet.Unicode ) ]
        unsafe public static extern bool DeviceIoControl(
            SafeHandle hDevice,
            uint dwIoControlCode,
            IntPtr lpInBuffer,
            uint nInBufferSize,
            IntPtr lpOutBuffer,
            uint nOutBufferSize,
            ref uint lpBytesReturned,
            NativeOverlapped *lpOverlapped
        );
    }
}

Creating the Device Handle

In order to leverage asynchronous I/O against the device handle, it needs to be opened specifically for asynchronous operations.  To do this use the CreateFile API call, passing the device path as the lpFileName parameter and specifying the Overlapped flag in the dwFlagsAndAttributes parameter:

SafeFileHandle deviceHandle = Kernel32.CreateFile(
    devicePath,
    ( uint )( Kernel32.AccessRights.GENERIC_READ | Kernel32.AccessRights.GENERIC_WRITE ),
    ( uint )( Kernel32.ShareModes.FILE_SHARE_READ | Kernel32.ShareModes.FILE_SHARE_WRITE ),
    IntPtr.Zero,
    ( uint )( Kernel32.CreationDispositions.OPEN_EXISTING ),
    Kernel32.Overlapped, // note the OVERLAPPED flag here
    IntPtr.Zero
);                    
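
CreateFile reports failure by returning INVALID_HANDLE_VALUE, which SafeFileHandle surfaces through its IsInvalid property, so it's worth checking the handle before going any further.  Here is a minimal sketch, assuming a hypothetical HandleGuard helper that is not part of the Kernel32 class above; the only framework members it relies on are SafeHandle.IsInvalid and the Win32Exception( int ) constructor:

```csharp
using System;
using System.ComponentModel;
using Microsoft.Win32.SafeHandles;

// Hypothetical helper for validating handles returned from CreateFile.
public static class HandleGuard
{
    // Throws a Win32Exception carrying the given error code if the handle
    // is null or invalid; otherwise returns the handle unchanged.
    public static SafeFileHandle ThrowIfInvalid( SafeFileHandle handle, int lastError )
    {
        if( null == handle || handle.IsInvalid )
        {
            throw new Win32Exception( lastError );
        }
        return handle;
    }
}
```

Typical use would be HandleGuard.ThrowIfInvalid( deviceHandle, Marshal.GetLastWin32Error() ) on the line immediately after the CreateFile call, so that no other P/Invoke call has a chance to overwrite the last error.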

Binding the Device Handle to the Thread Pool

The .NET thread pool keeps a separate set of I/O completion threads (by default, a maximum of 1000 per process) dedicated to nothing but handling overlapped I/O.  In order to make use of them, it is necessary to "bind" the device handle returned by calling CreateFile to the thread pool.  This is done using the ThreadPool.BindHandle call:

ThreadPool.BindHandle( deviceHandle );             

The API documentation for BindHandle gives almost no indication as to its purpose, but I've found this call is absolutely necessary in order to receive notifications of asynchronous operations.  (If you're interested in the details of what this call actually accomplishes, it seems to bind the handle to an I/O Completion Port owned by the Thread Pool; check out Chris Brumme's comments from this post on his blog for more.)
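
If you're curious how many I/O completion threads the pool will allow on your machine, ThreadPool.GetMaxThreads reports the figure.  Here is a small sketch, assuming a hypothetical ThreadPoolInfo helper; the number it returns depends on the runtime version and configuration:

```csharp
using System;
using System.Threading;

// Hypothetical helper for inspecting the managed thread pool limits.
public static class ThreadPoolInfo
{
    // Returns the maximum number of I/O completion threads the managed
    // thread pool is currently configured to use.
    public static int GetMaxCompletionPortThreads()
    {
        int workerThreads;
        int completionPortThreads;
        ThreadPool.GetMaxThreads( out workerThreads, out completionPortThreads );
        return completionPortThreads;
    }
}
```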

Creating a NativeOverlapped Pointer

Creating a NativeOverlapped pointer to pass to the DeviceIoControl API is relatively simple:

Overlapped overlapped = new Overlapped();
NativeOverlapped* nativeOverlapped = overlapped.Pack(
    DeviceWriteControlIOCompletionCallback,
    null
);             

After creating an instance of the Overlapped class, use its Pack method to create a pointer to a NativeOverlapped structure; the layout of this structure is byte-for-byte identical to the Win32 OVERLAPPED structure, which makes it suitable for passing directly to unmanaged API calls.  Note that the pointer returned by Overlapped.Pack is fixed (pinned) in memory until the pointer is passed to Overlapped.Unpack and Overlapped.Free.

The first parameter to Overlapped.Pack is an IOCompletionCallback delegate that will be called when the overlapped I/O operation completes.  The second argument could be an array of objects representing buffers of memory used for passing data to and from the device driver; to keep this post as simple as possible I'm leaving that feature out of the example code.

The IOCompletionCallback delegate is mandatory and follows this pattern:

unsafe void DeviceWriteControlIOCompletionCallback( uint errorCode, uint numBytes, NativeOverlapped* nativeOverlapped )
{    
    try
    {
        // ...
    }
    finally
    {
        System.Threading.Overlapped.Unpack( nativeOverlapped );
        System.Threading.Overlapped.Free( nativeOverlapped );
    }
}

The errorCode parameter will be 0 on a successful overlapped I/O operation, or a standard Win32 error code on a failure.  The numBytes parameter will contain the number of bytes received from the operation.

Along with whatever else needs to happen post-I/O, the fixed NativeOverlapped pointer must be unpacked and freed, or the application will both leak memory and fragment the managed heap over time.
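
Inside the try block of the callback, a nonzero errorCode can be turned into something more readable with Win32Exception.  Here is a minimal sketch, assuming a hypothetical CompletionStatus helper; the constant names are mine, while the values are the standard Win32 error codes:

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper for interpreting the errorCode passed to an
// IOCompletionCallback.
public static class CompletionStatus
{
    public const uint Success = 0;
    public const uint OperationAborted = 995; // ERROR_OPERATION_ABORTED

    // Returns null for a successful completion, otherwise a Win32Exception
    // wrapping the error code (its Message holds the system description).
    public static Win32Exception ToException( uint errorCode )
    {
        if( Success == errorCode )
        {
            return null;
        }
        return new Win32Exception( ( int )errorCode );
    }
}
```

The callback can then log or rethrow the result of CompletionStatus.ToException( errorCode ) while leaving the Unpack and Free calls in the finally block untouched.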

Passing the NativeOverlapped Pointer to DeviceIoControl

With a NativeOverlapped pointer in hand, call DeviceIoControl:

const int ErrorIOPending = 997;
uint bytesReturned = 0;
bool result = Kernel32.DeviceIoControl(
    deviceHandle,
    ( uint )ioCtlCode,
    IntPtr.Zero,
    0,
    IntPtr.Zero,
    0,
    ref bytesReturned,
    nativeOverlapped
);
if( result )
{
    // operation completed synchronously    
    
    System.Threading.Overlapped.Unpack( nativeOverlapped );
    System.Threading.Overlapped.Free( nativeOverlapped );
}
else
{
    int error = Marshal.GetLastWin32Error();    
    if( ErrorIOPending != error )
    {
        // failed to execute DeviceIoControl using overlapped I/O ...
        
        System.Threading.Overlapped.Unpack( nativeOverlapped );
        System.Threading.Overlapped.Free( nativeOverlapped );
    }
}

The nativeOverlapped pointer and overlapped device handle cause the kernel to attempt an asynchronous execution of DeviceIoControl.  

The call will return immediately with a boolean result whose semantics are a bit awkward.  If the result is true, it means the operation completed synchronously as opposed to asynchronously.  A false result indicates the operation may be executing asynchronously.  To verify this, check the Win32 error code via Marshal.GetLastWin32Error: if it's ERROR_IO_PENDING (997), the DeviceIoControl operation is executing asynchronously and the IOCompletionCallback delegate will be called when the operation is finished; otherwise, the error code will indicate why DeviceIoControl failed to execute asynchronously.

If DeviceIoControl executes synchronously, or if it fails to execute asynchronously, the NativeOverlapped pointer must still be unpacked and freed to avoid leaking memory and fragmenting the managed heap.
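
The three outcomes described above can be captured in a small helper so the calling code reads a little less awkwardly.  This is only a sketch: OverlappedStart and OverlappedResult are hypothetical names, and the helper does nothing but restate the decision table from the preceding paragraphs:

```csharp
using System;

// Hypothetical names for the three outcomes of starting an overlapped call.
public enum OverlappedStart
{
    CompletedSynchronously,
    Pending,
    Failed
}

public static class OverlappedResult
{
    public const int ErrorIOPending = 997; // ERROR_IO_PENDING

    // result:    the boolean returned by DeviceIoControl
    // lastError: Marshal.GetLastWin32Error(), read immediately after the call
    public static OverlappedStart Classify( bool result, int lastError )
    {
        if( result )
        {
            return OverlappedStart.CompletedSynchronously;
        }
        return ErrorIOPending == lastError
            ? OverlappedStart.Pending
            : OverlappedStart.Failed;
    }
}
```

Only the Pending case leaves the NativeOverlapped pointer for the IOCompletionCallback to unpack and free; per the discussion above, the other two cases clean it up at the call site.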

That's the nut of using Overlapped I/O from managed code.  If you do end up using this for something, please drop me a line, either as a comment to this post or using the contact link on this site; I'd love to hear what you're working on!