AUDACIA Software

The discrepancy between interfaces and dynamic_cast

Subtleties in the C++ implementation of COM interfaces.
Moritz Beutel, October 31st, 2008



  1. Introduction
  2. dynamic_cast<> on C++ classes
  3. COM in C++
  4. dynamic_cast<> on Delphi classes
  5. A pragmatic workaround
  6. References



Introduction


When working with COM interfaces, you cannot use dynamic_cast<> to cast an interface to a concrete class. If you try it, it might work, but it might just as well crash your application. Why is this?

Because dynamic_cast<> violates the contract. The definition of a COM interface specifies only the following layout requirements:
  • The binary layout of a COM object begins with a pointer to the Virtual Method Table.
  • The Virtual Method Table begins with three function pointers: QueryInterface(), AddRef() and Release().
Nothing more, nothing less.


dynamic_cast<> on C++ classes


Now what happens if you use dynamic_cast<> on an interface?
IMyInterface* intf = createObject ();
MyClass* mc = dynamic_cast <MyClass*> (intf);

Usually, a C++ compiler generates a call to a function in the runtime library that performs the dynamic cast. In C++Builder, this function is declared as follows (it can be found in $(BDS)\source\cpprtl\Source\except\xxtype.cpp):
void    __far * DEFCC   __DynamicCast(void      __far * objectPtr,
				      void      __far * vtablePtr,
				      void      __far * srctypPtr,
				      void      __far * dsttypPtr,
				      int               reference);

The first parameter points to the object to be cast. The second parameter is the VMT pointer (which is not necessarily placed at the beginning of the object layout for all C++ classes, although there is a compiler option to change this). srctypPtr and dsttypPtr point to the RTTI descriptor tables for the static types given (IMyInterface and MyClass), and reference specifies whether the cast was performed on a pointer or on a reference: reference casts throw a std::bad_cast exception on failure, while pointer casts simply return 0 in that case.
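The difference between the pointer and the reference form can be observed in standard C++ alone, without any COM types. The following is a minimal sketch; the types Base, Match and Other are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

// Hypothetical polymorphic types, for illustration only.
struct Base  { virtual ~Base () {} };
struct Match : Base {};
struct Other : Base {};

// Pointer form: a failed cast yields a null pointer.
bool pointerCastSucceeds (Base* b)
{
    return dynamic_cast <Match*> (b) != 0;
}

// Reference form: a failed cast throws std::bad_cast,
// because there is no such thing as a null reference.
bool referenceCastSucceeds (Base& b)
{
    try
    {
        Match& m = dynamic_cast <Match&> (b);
        (void) m;   // only the success of the cast matters here
        return true;
    }
    catch (const std::bad_cast&)
    {
        return false;
    }
}
```

Both functions report success for a Match object and failure for an Other object; only the failure mechanism differs.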

When reading the implementation of __DynamicCast(), you might notice that it looks into the VMT and extracts the entries at negative offsets. First, it decreases the object pointer by the value in VMT[-2]. This is necessary as C++ permits multiple inheritance. Furthermore, objects can have multiple VMTs for the various base classes, and therefore, the -1 offset of each VMT redirects to the primary VMT, the VMT associated with the first specified base class. From the primary VMT, the offset -3 is taken as a pointer to the type descriptor for the object's actual type.
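Why a pointer adjustment is needed at all can be shown with portable C++. The sketch below does not touch the VMT or any negative offsets (those are compiler internals); it only demonstrates the observable effect, using the hypothetical types First, Second and Both.

```cpp
#include <cassert>

// Hypothetical types: two polymorphic bases, each with its own data.
struct First  { int a; First  () : a (1) {} virtual ~First  () {} };
struct Second { int b; Second () : b (2) {} virtual ~Second () {} };
struct Both : First, Second {};

// dynamic_cast<void*> yields the address of the most derived object,
// no matter which base class subobject the pointer refers to.
void* completeObject (Second* s)
{
    return dynamic_cast <void*> (s);
}

// The two base class subobjects cannot share an address, so at least
// one base pointer differs from the address of the complete object.
bool basesLiveAtDifferentAddresses (Both& obj)
{
    First*  f = &obj;
    Second* s = &obj;
    return static_cast <void*> (f) != static_cast <void*> (s);
}
```

A runtime cast that starts from a Second* therefore has to move the pointer back to the beginning of the complete object before it can look anything up, which is the job of the offset stored in the VMT.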

This is what the object and its VMT might look like:


To achieve the polymorphic cast, dynamic_cast<> first casts the pointer to the actual object's type and then searches for the class that was specified in the base classes list of the object's type descriptor. If no such base class exists, it either returns 0 or throws std::bad_cast, depending on the reference parameter as mentioned above.

Obviously, this is an implementation detail; pretty much every C++ compiler does it differently. But we cannot guarantee that the actual object behind an interface was compiled by our compiler! It does not even need to be a C++ class - it could have been implemented in C, in Delphi or in a .NET language. By using dynamic_cast<> on an interface, we lie to the compiler: we say "this is a C++ class, please cast it at runtime", and, as Henry Spencer rightfully stated: "If you lie to the compiler, it will get its revenge."


COM in C++


So why does the compiler not prevent this? Why can we dynamic_cast<> an interface at all?


Because we lied to the compiler once before. Remember how COM interfaces are declared in C++:
class IUnknown
{
public:
    virtual HRESULT STDMETHODCALLTYPE QueryInterface (REFIID riid, void **ppvObject) = 0;
    virtual ULONG STDMETHODCALLTYPE AddRef (void) = 0;
    virtual ULONG STDMETHODCALLTYPE Release (void) = 0;
};

(For additional comfort, most Windows compilers support an annotation mechanism to associate the interface ID with the class, e.g. __declspec(uuid()) in C++Builder and Visual C++, but that is not relevant here.)

This declaration tells the compiler "IUnknown is an abstract base class in every sense". Therefore, it anticipates all the hidden information at the negative offsets of a VMT to be there.

As it turns out, we do not have much of a choice here. We could alternatively use the C-style declaration:
typedef struct _IUnknownVtbl
{
    HRESULT (STDMETHODCALLTYPE * QueryInterface) (
        IUnknown* This,
        REFIID    riid,
        void**    ppvObject);

    ULONG (STDMETHODCALLTYPE * AddRef) (
        IUnknown* This);

    ULONG (STDMETHODCALLTYPE * Release) (
        IUnknown* This);
} IUnknownVtbl;

struct IUnknown
{
    const IUnknownVtbl* lpVtbl;
};

But that would force us to do many explicit casts, thus sacrificing type safety. Also, we could accidentally call one object's method and pass another object as the This parameter. All this can be avoided by using the C++ variant.
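The mix-up of objects is easy to reproduce. The following sketch uses simplified, portable stand-ins for the Windows declarations (Counter and CounterVtbl are assumptions, not real COM types) and models only AddRef.

```cpp
#include <cassert>

struct Counter;

// Simplified stand-in for a C-style vtable; only AddRef is modelled.
struct CounterVtbl
{
    unsigned long (*AddRef) (Counter* This);
};

struct Counter
{
    const CounterVtbl* lpVtbl;   // first member, as the COM layout requires
    unsigned long      refcnt;
};

unsigned long Counter_AddRef (Counter* This)
{
    return ++This->refcnt;
}

const CounterVtbl counterVtbl = { &Counter_AddRef };

Counter makeCounter (void)
{
    Counter c = { &counterVtbl, 0 };
    return c;
}

// Intended usage: the object is named twice, and both names agree.
unsigned long addRefCorrectly (Counter* a)
{
    return a->lpVtbl->AddRef (a);
}

// The mistake: the method is looked up on a, but b is passed as This.
// This compiles cleanly and silently modifies b instead of a.
unsigned long addRefOnWrongObject (Counter* a, Counter* b)
{
    return a->lpVtbl->AddRef (b);
}
```

With a C++ abstract class, the call is written a->AddRef () and the compiler supplies the this pointer itself, so the second variant cannot be expressed by accident.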

Other languages such as Delphi introduced an "interface" keyword to restrict interfaces to the requirements given by COM. Unfortunately, the C++ standards committee is usually reluctant to add new keywords to the language. This has caused quite some confusion, as recycling the existing keywords has led to many inconsistencies:
template <class T> // does not restrict arguments to class types!
    class MyTemplate
{
public:                          // publicly visible

    static long counter;         // class variable

    const MyTemplate& operator = (const MyTemplate&); // takes and returns a reference
                                                      // to a constant object

    T getValue (void) const;     // can be called for constant objects

    ~MyTemplate (void) throw (); // specifies that the destructor cannot
                                 // throw an exception

    virtual void foo (void);     // a virtual function that takes no parameters

    void bar (void*);            // a function that takes an untyped pointer
};

struct Derived : public virtual MyTemplate <int> // public virtual inheritance
                                                 // (this is possible as a workaround
                                                 // for the DoD problem)
{
    // ...
};

static long counter; // global variable not visible to other source modules

void foo (void)
{
    static long counter; // global variable not visible to other functions
    throw std::runtime_error ("An error occurred."); // throws an exception
}
The next standard will add even more confusing keyword reuses: enum class (scoped enumerations), auto (automatic type determination), sizeof... (number of variadic template arguments), possibly default and delete (for default constructors, copy constructors and assignment operators) etc. Apparently, there is little hope for an interface keyword as no existing keyword could easily be used for that. (Wait, what about "register"? We don't use it anyway these days.)


dynamic_cast<> on Delphi classes


But, even if the internals of dynamic_cast<> are compiler-specific, it should be safe to use dynamic_cast<> on interfaces as long as all interfaces point to objects generated by the same compiler, shouldn't it?


Again, no. Even a single compiler might know multiple variants of dynamic_cast<>. For example, both major C++ compilers for Windows, the Visual C++ compiler and the C++Builder compiler, are "hybrid" compilers in some way. Visual C++ is able to compile both managed and unmanaged code into a single project and provides sophisticated interaction techniques, while C++Builder supports both Delphi-style and C++-style classes. Although Delphi-style classes cannot use multiple inheritance, they are free to implement as many interfaces as they like. Interfaces are, as explained above, represented by abstract C++ classes, and in order to permit Delphi classes to derive from them, the compiler checks whether the abstract C++ classes meet all interface requirements (no data members, no private functions, no multiple bases).

As I explained in "Why Delphi classes must be created on the heap", a Delphi class has a VMT layout quite different from a C++ class:



Obviously, dynamic_cast<> must use a different mechanism than the one described above to cast between Delphi classes. Tracing a dynamic_cast<> call on a Delphi class reveals that the compiler recognizes the difference and instead generates a call to the __DynamicCastVCLptr() or the __DynamicCastVCLref() function:
void	*__fastcall _EXPFUNC __DynamicCastVCLref(void *ptr, void *targetVMT);
void	*__fastcall _EXPFUNC __DynamicCastVCLptr(void *ptr, void *targetVMT);

As Delphi classes do not feature multiple inheritance, a pointer to a Delphi class always points to the beginning of the object, and no pointer adjustment is required when doing polymorphic casts. Therefore, __DynamicCastVCLptr() and __DynamicCastVCLref() only check whether the destination class is a base class of the actual type or the actual type itself and plainly return the given pointer upon success. If the base class cannot be found, __DynamicCastVCLptr() returns 0 and __DynamicCastVCLref() throws a std::bad_cast exception.
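The behavior described above can be modelled in a few lines of portable C++. This is only a sketch of the idea, not the runtime library's code: ClassInfo and Object are hypothetical stand-ins for the Delphi class descriptor and for an object whose first field refers to it.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

// Hypothetical stand-in for a Delphi class descriptor. With single
// inheritance, one parent link per class is sufficient.
struct ClassInfo
{
    const ClassInfo* parent;   // 0 for the root class
};

// Hypothetical stand-in for a Delphi object; classInfo plays the
// role of the VMT pointer at the beginning of the object.
struct Object
{
    const ClassInfo* classInfo;
};

// Model of the pointer variant: walk up the parent chain and return
// the unchanged pointer on success, 0 on failure.
void* castPtr (Object* obj, const ClassInfo* target)
{
    for (const ClassInfo* ci = obj->classInfo; ci != 0; ci = ci->parent)
        if (ci == target)
            return obj;
    return 0;
}

// Model of the reference variant: same lookup, but failure is
// reported by throwing std::bad_cast.
void* castRef (Object* obj, const ClassInfo* target)
{
    void* result = castPtr (obj, target);
    if (result == 0)
        throw std::bad_cast ();
    return result;
}

// A small hypothetical hierarchy: root <- component <- button,
// plus a class that is unrelated to component and button.
const ClassInfo rootClass      = { 0 };
const ClassInfo componentClass = { &rootClass };
const ClassInfo buttonClass    = { &componentClass };
const ClassInfo unrelatedClass = { &rootClass };
```

Note that no pointer arithmetic takes place anywhere: a successful cast returns exactly the address that was passed in.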


Mixing Delphi and C++ classes is one of the easiest ways to get into trouble with dynamic_cast<> and interfaces:
#include <System.hpp>
#include <unknwn.h>
#include <tchar.h>
#pragma hdrstop

#define DO_QUERYINTERFACE(I)                    \
    if (__uuidof (I) == riid)                   \
    {                                           \
        AddRef ();                              \
        *ppvObject = static_cast <I*> (this);   \
        return S_OK;                            \
    }

class TMyDelphiClass : public TInterfacedObject, public IUnknown
{
public:
    virtual HRESULT STDMETHODCALLTYPE QueryInterface (REFIID riid, void** ppvObject)
    {
        DO_QUERYINTERFACE (IUnknown)
        return TInterfacedObject::QueryInterface (riid, ppvObject);
    }

    virtual ULONG STDMETHODCALLTYPE AddRef (void)
    { return TInterfacedObject::_AddRef (); }
    virtual ULONG STDMETHODCALLTYPE Release (void)
    { return TInterfacedObject::_Release (); }
};

class MyCppClass : public IUnknown
{
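private:
    LONG refcnt;

public:
    MyCppClass (void) : refcnt (0) {}  // start with a defined reference count
    virtual ~MyCppClass (void) {}      // lets "delete this" destroy derived classes correctly
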
    virtual HRESULT STDMETHODCALLTYPE QueryInterface (REFIID riid, void** ppvObject)
    {
        *ppvObject = 0;
        DO_QUERYINTERFACE (IUnknown)
        return E_NOINTERFACE;
    }

    virtual ULONG STDMETHODCALLTYPE AddRef (void)
    { return InterlockedIncrement (&refcnt); }
    virtual ULONG STDMETHODCALLTYPE Release (void)
    {
        LONG ref = InterlockedDecrement (&refcnt);
        if (ref == 0)
            delete this;
        return ref;
    }
};

int _tmain (void)
{
        // (The _di_-prefixed interfaces are typedefs to the DelphiInterface<>
        // smart pointer which imitates Delphi's intrinsic interface handling.)
    _di_IUnknown i;
    i = new TMyDelphiClass;

        // This should return 0, but instead, it raises an AV because it
        // invokes __DynamicCast() which expects a C++-style VMT layout.
    MyCppClass* cc = dynamic_cast <MyCppClass*> (&*i);

        // This could fail at compile time, but the compiler apparently is
        // not smart enough. Therefore, supposing that we survived the first AV
        // by sheer luck, we would get a second one here for the same reason.
    TMyDelphiClass* dc = dynamic_cast <TMyDelphiClass*> (&*i);
}


A pragmatic workaround


But what to do if you need a way to cast to the actual class type?

Although dynamic_cast<> cannot be used for COM interfaces, COM does specify a way to cast an interface at runtime: the QueryInterface() function. So if you want to be able to cast to your class, you could use the annotation facility of your compiler and give your class an ID for which you test in your implementation of QueryInterface():
class __declspec(uuid("{22807D4C-1770-472D-A44B-4D87B53D0189}")) MySelfAwareClass
    : public MyCppClass
{
    virtual HRESULT STDMETHODCALLTYPE QueryInterface (REFIID riid, void **ppvObject)
    {
        DO_QUERYINTERFACE (MySelfAwareClass)
        return MyCppClass::QueryInterface (riid, ppvObject);
    }
};

int _tmain (void)
{
    _di_IUnknown i = new MySelfAwareClass;

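    MySelfAwareClass* msac = 0;
    if (SUCCEEDED (i->QueryInterface (__uuidof (MySelfAwareClass),
            reinterpret_cast <void**> (&msac))))
    {
        // ... use msac ...
        msac->Release (); // QueryInterface() added a reference
    }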
}

The fact that GUIDs are, well, globally unique ensures that our extension to QueryInterface() does not interfere with any COM requirements.
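The same idea can be wrapped in a small helper so that call sites read almost like a cast. The sketch below is portable and deliberately avoids the Windows headers: IUnknownLike, Id, classId() and queryCast<>() are hypothetical names, and an ID is modelled as the address of a unique static tag rather than as a real GUID.

```cpp
#include <cassert>

typedef const void* Id;                 // stand-in for a GUID (assumption)

static const char unknownTag   = 0;     // each tag has a unique address
static const char selfAwareTag = 0;

Id unknownId (void) { return &unknownTag; }

struct IUnknownLike                     // stand-in for IUnknown
{
    virtual bool          queryInterface (Id id, void** out) = 0;
    virtual unsigned long addRef  (void) = 0;
    virtual unsigned long release (void) = 0;
};

class SelfAware : public IUnknownLike
{
    unsigned long refcnt;
public:
    SelfAware (void) : refcnt (0) {}
    virtual ~SelfAware (void) {}

    static Id classId (void) { return &selfAwareTag; }

    virtual bool queryInterface (Id id, void** out)
    {
        if (id == unknownId ())
            *out = static_cast <IUnknownLike*> (this);
        else if (id == classId ())      // the class answers to its own ID
            *out = static_cast <SelfAware*> (this);
        else
        {
            *out = 0;
            return false;
        }
        addRef ();
        return true;
    }
    virtual unsigned long addRef (void) { return ++refcnt; }
    virtual unsigned long release (void)
    {
        unsigned long ref = --refcnt;
        if (ref == 0)
            delete this;
        return ref;
    }
};

class Plain : public IUnknownLike       // knows nothing about SelfAware
{
    unsigned long refcnt;
public:
    Plain (void) : refcnt (0) {}
    virtual ~Plain (void) {}

    virtual bool queryInterface (Id id, void** out)
    {
        *out = (id == unknownId ()) ? static_cast <IUnknownLike*> (this) : 0;
        if (*out == 0)
            return false;
        addRef ();
        return true;
    }
    virtual unsigned long addRef (void) { return ++refcnt; }
    virtual unsigned long release (void)
    {
        unsigned long ref = --refcnt;
        if (ref == 0)
            delete this;
        return ref;
    }
};

// Returns 0 on failure. On success the caller owns one additional
// reference and has to call release() on the result.
template <class T>
T* queryCast (IUnknownLike* intf)
{
    void* p = 0;
    return intf->queryInterface (T::classId (), &p) ? static_cast <T*> (p) : 0;
}
```

Unlike dynamic_cast<>, this helper only relies on the three functions every interface is guaranteed to provide, so an object that does not know the requested ID simply refuses the query instead of crashing.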


References


[1] The Delphi and C++Builder Runtime Library source code (delivered with C++Builder)
[2] Raymond Chen, The layout of a COM object, February 5th, 2004
[3] The C++ Standards Committee, State of C++ Evolution (Post-Antipolis 2008 Mailing), June 30th, 2008

