We want samples

June 18th, 2007 - Fernando Roberto

A few months after I was hired by Open, I was asked to take part in a lecture about Secure Code. My part was about stack overflows and how to take advantage of that programmer oversight to break into a program. The major point to note is that, in addition to programmers, the audience included software engineers and architects; the commercial staff, who had more contact with customers, were there as well, along with one or two people from the administrative sector. In summary, an audience in which few had ever heard of a call stack. One of them said: “I sort of remember from when I got my .Net certification something saying that structures are created on the stack and objects are created on the heap, or vice versa.” Anyway, things got more complicated when I started to show the sample code I had brought, with its PUSHs, MOVs and POPs. The result: some slept, while others just drooled and spoke in that alien language people use while they sleep.

I know that language very well because my wife always talks to me while she sleeps and I’m sitting in bed with the notebook. Although her words are completely incomprehensible, she always responds when I ask her questions. She introduces the subject by saying: “Admivoza bumizav”. Then I ask: “Why do you think so?” And she replies: “Zumirag abmish mua”.

But back to the subject: it became clear that the amount of technical detail was incompatible with the audience. So, not long ago, in a talk about Windows drivers, I tried to give an overview with few details, just to give an idea of what drivers are and how they contribute to the way the system works. After all, the entire development department was there, including architects, engineers and .Net coders (nothing against them). To my surprise, at the end of the lecture, everyone was expecting more details, something like: “Hey, what about the rest? Don’t you have any little example?”.
So, okay… In this post, I will write a minimal driver, but one that has some interaction with a test application.

At the beginning…

Here I will assume that you already know how to create a driver project from scratch and how to use Visual Studio to code drivers, so I will go straight to the point where we write the driver. Today’s example is a very basic echo driver that receives data through write operations and returns it through read operations. It works like this: you first write buffers, which are strings in our example, using the WriteFile() function, and everything is stored in the driver. Subsequent reads with the ReadFile() function bring back the same data that was written. This example will be very useful in other posts, and will also serve as a starting point for those wanting to write their first driver.

Since we will store the strings in a list, we need to create a buffer list composed of nodes defined as shown below.

//-f--> Type definition to be stored into
//      our buffer list.
typedef struct _BUFFER_ENTRY
{
    PVOID       pBuffer;    //-f--> Sent buffer
    ULONG       ulSize;     //      Buffer size
    LIST_ENTRY  Entry;      //      List node
 
} BUFFER_ENTRY, *PBUFFER_ENTRY;
 
 
//-f--> List head and mutex for protection 
LIST_ENTRY      g_BufferList;
KMUTEX          g_Mutex;

After defining the structure, we have to create a global variable that will be the list head, and also a mutex to protect our list from possible parallel accesses. That would happen if we had two test applications running at the same time. If you want more details about the DDK linked lists, there is another post that explains them.
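
For readers who have never met this kind of list, here is a minimal user-mode sketch of the idea. It is not the DDK implementation: ListEntry, InitList, InsertTail, RemoveHead, Item and ItemFromEntry are names made up for this illustration, standing in for LIST_ENTRY, InitializeListHead(), InsertTailList(), RemoveHeadList(), BUFFER_ENTRY and CONTAINING_RECORD. The mutex is left out, so the sketch assumes a single thread.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

// User-mode stand-in for the DDK's LIST_ENTRY (hypothetical name).
struct ListEntry {
    ListEntry* Flink;   // next node
    ListEntry* Blink;   // previous node
};

// An empty list is a head that points to itself in both directions.
inline void InitList(ListEntry* head) {
    head->Flink = head->Blink = head;
}

inline bool IsEmpty(const ListEntry* head) {
    return head->Flink == head;
}

// Link a node just before the head, which makes it the last element.
inline void InsertTail(ListEntry* head, ListEntry* entry) {
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

// Unlink and return the first element. The list must not be empty.
inline ListEntry* RemoveHead(ListEntry* head) {
    assert(!IsEmpty(head));
    ListEntry* first = head->Flink;
    head->Flink = first->Flink;
    first->Flink->Blink = head;
    return first;
}

// The payload structure embeds the node, like BUFFER_ENTRY does.
struct Item {
    int       Value;
    ListEntry Entry;
};

// Same idea as CONTAINING_RECORD: step back from the embedded node
// to the start of the structure that contains it.
inline Item* ItemFromEntry(ListEntry* entry) {
    return reinterpret_cast<Item*>(
        reinterpret_cast<char*>(entry) - offsetof(Item, Entry));
}
```

The point of embedding the node inside the payload is that the list code never needs to know what it is linking; only the last helper knows about Item.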

Writing DriverEntry routine

The first point to notice here is that, since our example is written in a .CPP file, the DriverEntry function definition must be marked extern “C”. Otherwise, the linker will not find the driver entry point. Early in the implementation, we can see the message that is sent to the debugger via KdPrint(). Next, I initialize our linked list head and the mutex that protects it. Now let’s set some members of the DriverObject structure. The first is the DriverUnload member, which receives a pointer to a callback function that tells the driver it is being unloaded. The next few members are the routines that will be called when the driver receives Create/Open, Close, Read and Write requests. I will talk about these routines in more detail a little later. Continuing, the DeviceObject is created, which will be our channel of communication with the driver. As I said at the lecture, every request a driver receives comes through a device. The IoCreateDevice() function does this for us. With the device created, I now configure it to use intermediate buffers. For this, I have to set the DO_BUFFERED_IO bit in the Flags field of the DeviceObject just created. Intermediate buffers? We’ll talk about BufferedIo versus DirectIo at a future opportunity. There are so many new concepts in this post and, unfortunately, I cannot go very deep into their details; otherwise nobody would read a post that never seems to end.

Having a DeviceObject is cool, but it is not everything in a driver’s life; for a User Mode application to be able to communicate with the driver, you must create a Symbolic Link. This is done next with the IoCreateSymbolicLink() function. The rest of the function code should not cause great surprise to most of you, but if in doubt, just send me an e-mail and we can settle this in a fight.

/****
***     DriverEntry
**
**      This is the driver entry point.
**      Knife in teeth and blood in eyes.
*/
 
extern "C"
NTSTATUS DriverEntry(IN PDRIVER_OBJECT pDriverObj,
                     IN PUNICODE_STRING pusRegistryPath)
{
    NTSTATUS        nts;
    PDEVICE_OBJECT  pDeviceObj = NULL;
 
    __try
    {
        //-f--> Saying hello to the Kernel Debugger
        KdPrint(("Starting KernelEcho driver...\n"));
 
        //-f--> Initializing the buffer list head and the mutex
        InitializeListHead(&g_BufferList);
        KeInitializeMutex(&g_Mutex, 0);
 
        //-f--> Setting the driver unload callback
        pDriverObj->DriverUnload = OnDriverUnload;
 
        //-f--> Setting the driver routines that
        //      will be supported.
        pDriverObj->MajorFunction[IRP_MJ_CREATE] = OnCreate;
        pDriverObj->MajorFunction[IRP_MJ_CLOSE] = OnClose;
        pDriverObj->MajorFunction[IRP_MJ_WRITE] = OnWrite;
        pDriverObj->MajorFunction[IRP_MJ_READ] = OnRead;
 
        //-f--> Creating the control device
        nts = IoCreateDevice(pDriverObj,
                             0,
                             &g_usDeviceName,
                             FILE_DEVICE_UNKNOWN,
                             0,
                             FALSE,
                             &pDeviceObj);
        if (!NT_SUCCESS(nts))
            ExRaiseStatus(nts);
 
        //-f--> Configuring I/O using intermediary buffer
        pDeviceObj->Flags |= DO_BUFFERED_IO;
 
        //-f--> Creating a symbolic link, so applications
        //      can reach this device.
        nts = IoCreateSymbolicLink(&g_usSymbolicLink,
                                   &g_usDeviceName);
        if (!NT_SUCCESS(nts))
            ExRaiseStatus(nts);
 
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        //-f--> Getting the error code
        nts = GetExceptionCode();
 
        //-f--> That will force the debugger to stop here.
        //      But only when this code is built in Checked mode.
        ASSERT(FALSE);
        KdPrint(("An exception occurred at " __FUNCTION__ "\n"));
 
        //-f--> Since we had problems during the initialization, let's
        //      undo what was done.
        if (pDeviceObj)
            IoDeleteDevice(pDeviceObj);
    }
 
    return nts;
}

Writing Dispatch Functions

I’ll also assume that you already have an idea of what an IRP is. Now let’s write the functions that handle them. They are called Dispatch Functions. These functions are registered at driver startup time, as you have seen in the code above. In the DriverObject structure, the MajorFunction member is an array of function pointers indexed by constants such as IRP_MJ_READ. The prototype is the same for all of these functions and is shown below. A Dispatch Function needs to handle IRPs following some rules, like everything else in the DDK. A minimal function could be written as follows:

/****
***     OnDispatch
**
**      A minimum Dispatch Function example
*/
 
NTSTATUS
OnDispatch(IN PDEVICE_OBJECT  pDeviceObj,
           IN PIRP            pIrp)
{
    //-f--> Here I fill the IRP status.
    pIrp->IoStatus.Status = STATUS_SUCCESS;
    pIrp->IoStatus.Information = 0;
 
    //-f--> Completing the IRP. After that, touching the IRP structure
    //      is absolutely prohibited. That does not belong to you anymore.
    IoCompleteRequest(pIrp, IO_NO_INCREMENT);
 
    //-f--> Return status code to the IoManager.
    return STATUS_SUCCESS;
}

But what would happen if we called the ReadFile() function with a handle to our device without having filled the IRP_MJ_READ slot? Would we burn forever in the marble of hell? In fact, if we take a look at the MajorFunction table before filling it, we can see that all slots hold the same address. Let’s put a breakpoint at the DriverEntry entrance, take a look at the table slots before they are filled in and see what’s there.

As we saw, this table is entirely initialized with a pointer to a single function whose implementation, apart from giving special treatment to power management IRPs, completes the IRP with the status STATUS_INVALID_DEVICE_REQUEST.
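
To make the mechanism concrete, here is a small user-mode model of such a table. Everything in it is hypothetical and only imitates the shape of the real thing: MajorCode, Status, Request, DriverModel and the functions are invented for this sketch, and the power management special case is ignored.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real constants and types live in the DDK headers.
enum MajorCode { MJ_CREATE, MJ_CLOSE, MJ_READ, MJ_WRITE, MJ_COUNT };
enum Status    { StatusSuccess, StatusInvalidDeviceRequest };

struct Request { Status Result; };

typedef Status (*DispatchFn)(Request*);

// Plays the role of the system's default routine: refuse the request.
Status DefaultDispatch(Request* req) {
    req->Result = StatusInvalidDeviceRequest;
    return req->Result;
}

// What a driver-supplied routine would do: complete with success.
Status OnReadModel(Request* req) {
    req->Result = StatusSuccess;
    return req->Result;
}

struct DriverModel {
    DispatchFn MajorFunction[MJ_COUNT];
};

// Before the driver's entry point runs, every slot holds the same pointer.
void InitDriverModel(DriverModel* drv) {
    for (int i = 0; i < MJ_COUNT; ++i)
        drv->MajorFunction[i] = DefaultDispatch;
}

// The caller's side: index the table by request type and call the slot.
Status Dispatch(DriverModel* drv, MajorCode code, Request* req) {
    assert(code < MJ_COUNT);
    return drv->MajorFunction[code](req);
}
```

In this model, a read sent before the MJ_READ slot is replaced simply comes back refused; nothing crashes, because no slot is ever empty.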

A Dispatch Function basically follows one of three alternatives for dealing with an IRP. In the case of filters, our driver could pass the IRP down to the driver it is attached to. Okay, noted here… “Do a post providing a filter example”. The second alternative is to hold the IRP for asynchronous processing and the last, but not least, is to simply complete the IRP. Notice that in our example, all we do is tell the application that the IRP was processed successfully and complete it. To complete the IRP, we use the IoCompleteRequest() function, which receives the IRP to be completed and the Priority Boost. What? Suppose your IRP involved some interaction with hardware; the thread that issued the request would spend some time waiting for it. The Boost temporarily raises that thread’s priority to compensate for the delay. Because this is not our case, we use the constant IO_NO_INCREMENT, which means no Boost. The DDK has a list of constants that determine the boost a thread should receive for each type of device. See the WDM.h excerpt (this definition may be in ntddk.h depending on your DDK version).

//
// Priority increment for completing CD-ROM I/O.  This is used by CD-ROM device
// and file system drivers when completing an IRP (IoCompleteRequest)
//
 
#define IO_CD_ROM_INCREMENT             1

At the end of this post, there is a link to download all the files needed to build the application and the driver. Notice, in the sample sources, that our OnCreate() and OnClose() Dispatch Functions are very similar to the example above. That’s because we take no action when a handle to our device is opened or closed.

Getting parameters from the IRP

In the remaining Dispatch Functions, OnRead() and OnWrite(), we must get the data the driver needs to process the IRP, such as the buffer supplied by the user and its size, for both writing and reading. The transfer size is in a Stack Location within the IRP; with buffered I/O, the buffer address itself comes from the IRP’s AssociatedIrp.SystemBuffer member. Wow, the more I pray, the more weird names appear to me… Stack Locations are parameter structures that are allocated along with the IRP. There is one Stack Location for each device in the device stack that the request goes through. This conversation can become quite fun, but we have a post to finish. Let’s leave the subject of Stack Locations to our future filter example. There, the matter makes more sense, but if you cannot contain your curiosity and want to know more, see what the reference says about Stack Locations. For now, let’s just consider that these parameters are there and that, to access this structure, we need to use the IoGetCurrentIrpStackLocation() macro. To give a more practical idea of all this bullshit, here goes the entire OnWrite() function code with tons of comments.

/****
***     OnWrite
**
**      This routine is called, whenever an application calls WriteFile()
**      using our device handle as a parameter.
*/
 
NTSTATUS
OnWrite(IN PDEVICE_OBJECT  pDeviceObj,
        IN PIRP            pIrp)
{
    PIO_STACK_LOCATION  pStack;
    PVOID               pUserBuffer;
    ULONG               ulSize;
    PBUFFER_ENTRY       pBufferEntry = NULL;
    NTSTATUS            nts;
    BOOLEAN             bMutexAcquired = FALSE;
 
    __try
    {
        //-f--> Say hello to the debugger
        KdPrint(("Writing into EchoDevice...\n"));
 
        //-f--> The Buffer address is a parameter that comes
        //      from the IRP
        pUserBuffer = (PCHAR)pIrp->AssociatedIrp.SystemBuffer;
        ASSERT(pUserBuffer != NULL);
 
        //-f--> Get the current stack location address from the IRP
        pStack = IoGetCurrentIrpStackLocation(pIrp);
 
        //-f--> Get the buffer size
        ulSize = pStack->Parameters.Write.Length;
 
        //-f--> Here, the node and the string buffer
        //      are allocated all together
        pBufferEntry = (PBUFFER_ENTRY) ExAllocatePoolWithTag(
            PagedPool,
            sizeof(BUFFER_ENTRY) + ulSize,
            ECHO_TAG);
 
        //-f--> If there is no memory, forget it...
        if (!pBufferEntry)
            ExRaiseStatus(STATUS_NO_MEMORY);
 
        //-f--> Initializing the structure
        pBufferEntry->pBuffer = (pBufferEntry + 1);
        pBufferEntry->ulSize = ulSize;
 
        //-f--> Copy the string from the buffer sent by the user
        //      to the buffer allocated here.
        RtlCopyMemory(pBufferEntry->pBuffer,
                      pUserBuffer,
                      ulSize);
 
        //-f--> Acquire the mutex that protects the list
        //      against simultaneous accesses.
        nts = KeWaitForMutexObject(&g_Mutex,
                                   UserRequest,
                                   KernelMode,
                                   FALSE,
                                   NULL);
        if (!NT_SUCCESS(nts))
            ExRaiseStatus(nts);
 
        //-f--> We need to remember this in case something
        //      really bad happens.
        bMutexAcquired = TRUE;
 
        //-f--> Insert the new node to the list's tail.
        InsertTailList(&g_BufferList,
                       &pBufferEntry->Entry);
 
        //-f--> Tell the IoManager that all data sent
        //      to the driver were read successfully.
        pIrp->IoStatus.Information = ulSize;
        nts = STATUS_SUCCESS;
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        //-f--> Get the error code.
        nts = GetExceptionCode();
 
        //-f--> That will force the debugger to stop here.
        //      But only when compiled in Checked mode.
        ASSERT(FALSE);
        KdPrint(("An exception occurred at " __FUNCTION__ "\n"));
 
        //-f--> If something went wrong after this buffer was
        //      allocated, let's release it.
        if (pBufferEntry)
            ExFreePool(pBufferEntry);
 
        //-f--> Tell the IoManager that no data was transferred.
        pIrp->IoStatus.Information = 0;
    }
 
    //-f--> Release the mutex
    if (bMutexAcquired)
        KeReleaseMutex(&g_Mutex,
                       FALSE);
 
    //-f--> Complete the IRP.
    pIrp->IoStatus.Status = nts;
    IoCompleteRequest(pIrp, IO_NO_INCREMENT);
    return nts;
}

Another important point to notice here is the filling of the Information member of the IoStatus structure in the IRP. In data transfer functions, this field tells the IoManager how much data was transferred from the application to the driver or vice versa. This field is directly reflected in the fourth parameter of the WriteFile() API, which serves exactly the same purpose. Once the parameters are received and validated, we allocate a node that will hold the buffer. Notice that we are allocating from paged memory; after all, all our functions will run at PASSIVE_LEVEL. Although this function is a Dispatch Function, that does not mean it runs at DISPATCH_LEVEL. Take it easy, these are very different things. Noting… “Post about IRQLs and POOL_TYPEs”. The OnRead() function is similar to OnWrite(), so I’ll spare you by not putting all its code here.
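
The trick of allocating the node and the string in one block can be tried in user mode. The sketch below uses malloc() in place of ExAllocatePoolWithTag(); BufferHeader, MakeEntry and CopyOut are names invented for this illustration. CopyOut only guesses at what a read could do with the stored data: the real OnRead() is in the download, and truncating to the caller’s capacity is an assumption of this sketch, not a description of it.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// User-mode stand-in for BUFFER_ENTRY, without the list node.
struct BufferHeader {
    void*  pBuffer;   // points just past this header
    size_t Size;      // payload size in bytes
};

// One malloc() holds the header followed by a copy of the data, mirroring
// the "sizeof(BUFFER_ENTRY) + ulSize" allocation in OnWrite().
BufferHeader* MakeEntry(const void* data, size_t size) {
    assert(data != nullptr || size == 0);
    BufferHeader* entry =
        static_cast<BufferHeader*>(std::malloc(sizeof(BufferHeader) + size));
    if (!entry)
        return nullptr;

    entry->pBuffer = entry + 1;   // first byte after the header
    entry->Size = size;
    if (size != 0)
        std::memcpy(entry->pBuffer, data, size);
    return entry;
}

// Assumed read behavior: copy out at most "capacity" bytes and report how
// many, the kind of count a driver places in IoStatus.Information.
size_t CopyOut(const BufferHeader* entry, void* out, size_t capacity) {
    size_t n = entry->Size < capacity ? entry->Size : capacity;
    if (n != 0)
        std::memcpy(out, entry->pBuffer, n);
    return n;
}
```

A single allocation means a single free: releasing the header releases the string with it, which is why the driver code frees only pBufferEntry.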

When my driver is unloaded

The OnDriverUnload() function is called when the driver is being unloaded. Here, in addition to emptying the buffer list of anything that may have been left behind in the driver, we delete the Symbolic Link and the DeviceObject that were created at startup time. Simple as that…

/****
***     OnDriverUnload
**
**      The party is over, go home, regards to your wife
**      and a kiss for your kids.
*/
 
VOID OnDriverUnload(IN PDRIVER_OBJECT   pDriverObj)
{
    PLIST_ENTRY     pEntry;
    PBUFFER_ENTRY   pBufferEntry;
 
    //-f--> Say good night
    KdPrint(("Terminating KernelEcho driver...\n"));
 
    //-f--> Here we remove all nodes that weren't read by
    //      the application. That would happen if the application calls
    //      WriteFile() but not ReadFile().
    while(!IsListEmpty(&g_BufferList))
    {
        //-f--> Take the first list node.
        pEntry = RemoveHeadList(&g_BufferList);
 
        //-f--> Get the outer structure from its node address.
        pBufferEntry = CONTAINING_RECORD(pEntry, BUFFER_ENTRY, Entry);
 
        //-f--> Finally, release the memory used by
        //      this node.
        ExFreePool(pBufferEntry);
    }
 
    //-f--> Deleting DeviceObject and SymbolicLink
    //      created at startup time.
    IoDeleteSymbolicLink(&g_usSymbolicLink);
    IoDeleteDevice(pDriverObj->DeviceObject);
}

Wow, what a horrible mistake! What if the driver is unloaded while some read or write operation is in progress? Does a dark future await us, with our souls damned for eternity? Does the Little Mermaid have something to do with it?

Well, better to leave our beliefs aside and focus on the DDK. A driver cannot be unloaded while there are still references to it or to its devices. Note that this routine has no return value, so we cannot tell the system whether or not the driver may be unloaded. If an application still has an open handle to any of its devices when you ask to stop the driver, the system will respond that the driver cannot be stopped. In that case, the OnDriverUnload() routine will not be called. Otherwise, if nothing prevents the driver from being unloaded and our routine is called, forget it… Your driver is already on its way to driver’s heaven.

The wonderful world of Userland

I will not put all the application source code in the post, but all sources are in a file available for download. One thing I think is worth showing here is how to get a handle to the device created by our sample driver.

    //-f--> Here we open a handle to the device that
    //      was created by the driver. Remember that our
    //      sample driver must be installed and running
    //      for the call below to work correctly.
    hDevice = CreateFile("\\\\.\\EchoDevice",
                         GENERIC_ALL,
                         0,
                         NULL,
                         OPEN_EXISTING,
                         0,
                         NULL);

Once the handle to the device is obtained, the Read, Write and Close operations will follow exactly as if we were performing the same operations with files. You don’t need to be a Jedi master to be able to use these functions.

    //-f--> It sends the received string to the driver via
    //      WriteFile.
    if (!WriteFile(hDevice,
                   szBuffer,
                   dwBytes,
                   &dwBytes,
                   NULL))

Installing and testing

I have shown in another post how to install a driver by hand. However, there are more civilized ways to install a driver. One of them is the OSR Driver Loader, a tool offered by OSR that installs your driver without rebooting the machine. Actually, this is a very simple procedure, but not simple enough to be covered in this post as well, so let’s use the tool for now.

After compiling the driver, put a copy of it in the System32\drivers directory of the victim machine. Then run the Driver Loader and fill in the fields as shown in the figure below.

Then click on Register Service to install the new driver and then click on Start Service to start it. Okay, now you can use the test application. The application is very simple to use. Once started, enter the strings that should be sent to the driver. An empty string marks the end of the input, and then the application starts reading back the same strings queued in the driver.

Phew! As we have seen, even a driver that does something simple requires a considerable amount of code and many different concepts. I know that some gaps remain in the post, but I hope I have helped. If you have questions about any point of the driver or even the test application, do not hesitate to ask or send your comments. Your feedback is very helpful in deciding the next posts.
Have fun!


KernelEcho.zip
