Chapter 14: Polymorphism

Don't hesitate to send in feedback: send an e-mail if you like the C++ Annotations; if you think that important material was omitted; if you find errors or typos in the text or the code examples; or if you just feel like e-mailing. Send your e-mail to Frank B. Brokken.

Please state the document version you're referring to, as found in the title (in this document: 6.5.0) and please state chapter and paragraph name or number you're referring to.

All received mail is processed conscientiously, and received suggestions for improvements will usually have been processed by the time a new version of the Annotations is released. Except for the incidental case I will normally not acknowledge the receipt of suggestions for improvements. Please don't interpret this as me not appreciating your efforts.

As we have seen in chapter 13, C++ provides the tools to derive classes from base classes, and to use base class pointers to address derived objects. As we've also seen, when using a base class pointer to address an object of a derived class, the type of the pointer determines which member function will be used. This means that a Vehicle *vp, pointing to a Truck object, will incorrectly compute the truck's combined weight in a statement like vp->weight(). The reason for this should now be clear: vp calls Vehicle::weight() and not Truck::weight(), even though vp actually points to a Truck.

Fortunately, a remedy is available. In C++ a Vehicle *vp may call a function Truck::weight() when the pointer actually points to a Truck.

The terminology for this feature is polymorphism: it is as though the pointer vp changes its type from a base class pointer to a pointer to the class of the object it actually points to. So, vp might behave like a Truck * when pointing to a Truck, and like an Auto * when pointing to an Auto, etc. (In one of the Star Trek movies, Capt. Kirk was in trouble, as usual. He met an extremely beautiful lady who, however, later on changed into a hideous troll. Kirk was quite surprised, but the lady told him: ``Didn't you know I am a polymorph?'')

Polymorphism is realized by a feature called late binding. It's called that way because the decision which function to call (a base class function or a function of a derived class) cannot be made at compile time, but is postponed until the program is actually executed: only then is it determined which member function will actually be called.

14.1: Virtual functions

The default behavior of the activation of a member function via a pointer or reference is that the type of the pointer (or reference) determines the function that is called. E.g., a Vehicle * will activate Vehicle's member functions, even when pointing to an object of a derived class. This is referred to as early or static binding, since the function to call is known at compile time. Late or dynamic binding is achieved in C++ using virtual member functions.

A member function becomes a virtual member function when its declaration starts with the keyword virtual. Once a function is declared virtual in a base class, it remains a virtual member function in all derived classes; even when the keyword virtual is not repeated in a derived class.

As far as the vehicle classification system is concerned (see section 13.1) the two member functions weight() and setWeight() might well be declared virtual. The relevant sections of the class definitions of the class Vehicle and Truck are shown below. Also, we show the implementations of the member functions weight() of the two classes:

    class Vehicle
    {
        public:
            virtual int weight() const;
            virtual void setWeight(int wt);
    };

    class Truck: public Auto        // Auto is (indirectly) derived from Vehicle
    {
        public:
            void setWeight(int engine_wt, int trailer_wt);
            int weight() const;
    };

    int Vehicle::weight() const
    {
        return d_weight;            // Vehicle's weight data member (name assumed)
    }

    int Truck::weight() const
    {
        return Auto::weight() + d_trailer_weight;
    }
Note that the keyword virtual only needs to appear in the Vehicle base class. There is no need (but there is also no penalty) to repeat it in derived classes: once virtual, always virtual. On the other hand, a function may be declared virtual anywhere in a class hierarchy: the compiler will be perfectly happy if weight() is declared virtual in Auto, rather than in Vehicle. The specific characteristics of virtual member functions would then, for the member function weight(), only be available via pointers or references to Auto (or to classes derived from Auto). With a Vehicle pointer, static binding would still be used. The effect of late binding is illustrated below:
    Vehicle v(1200);            // vehicle with weight 1200
    Truck t(6000, 115,          // truck with cabin weight 6000, speed 115,
          "Scania", 15000);     // make Scania, trailer weight 15000
    Vehicle *vp;                // generic vehicle pointer

    int main()
    {
        vp = &v;                            // see (1) below
        cout << vp->weight() << endl;

        vp = &t;                            // see (2) below
        cout << vp->weight() << endl;

        cout << vp->speed() << endl;     // see (3) below
    }
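Since the function weight() is defined virtual, late binding is used: at (1) Vehicle::weight() is called, and at (2) Truck::weight() is called. At (3) a syntax error is generated: speed() is not a member of Vehicle, and hence it cannot be called via a Vehicle *. The example illustrates that when a pointer to a class is used only the functions which are members of that class can be called. These functions may be virtual. However, this only influences the type of binding (early vs. late) and not the set of member functions that is visible to the pointer.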

A virtual member function cannot be a static member function: a virtual member function is still an ordinary member function in that it has a this pointer. As static member functions have no this pointer, they cannot be declared virtual.

14.2: Virtual destructors

When the operator delete releases memory occupied by a dynamically allocated object, or when an object goes out of scope, the appropriate destructor is called to ensure that memory allocated by the object is also deleted. Now consider the following code fragment (cf. section 13.1):
    Vehicle *vp = new Land(1000, 120);

    delete vp;          // object destroyed
In this example an object of a derived class (Land) is destroyed using a base class pointer (Vehicle *). For a `standard' class definition this will mean that Vehicle's destructor is called, instead of the Land object's destructor. This not only results in a memory leak when memory is allocated in Land, but it will also prevent any other task normally performed by the derived class's destructor from being completed (or, better: started). A Bad Thing.

In C++ this problem is solved using virtual destructors. By applying the keyword virtual to the declaration of a destructor the appropriate derived class destructor is activated when the argument of the delete operator is a base class pointer. In the following partial class definition the declaration of such a virtual destructor is shown:

    class Vehicle
    {
        public:
            virtual ~Vehicle();
            virtual size_t weight() const;
    };
By declaring a virtual destructor, the above delete operation (delete vp) will correctly call Land's destructor, rather than Vehicle's destructor.

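From this discussion we are now able to formulate the following situations in which a destructor should be defined: first, a destructor should be defined when the class allocates memory (or other resources) that must be released again; second, a virtual destructor should be defined when the class contains at least one virtual member function, so that derived class objects are completely destroyed when they are deleted through base class pointers.

In the second case, the destructor often doesn't have any special tasks to perform. In these cases the virtual destructor is given an empty body. For example, the definition of Vehicle::~Vehicle() may be as simple as: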
    Vehicle::~Vehicle()
    {}
Often the destructor will be defined inline below the class interface.

temporary note: With the gnu compiler 4.1.2 an annoying bug prevents virtual destructors from being defined inline below their class interfaces unless the virtual destructor is explicitly declared inline within the interface. Until the bug has been repaired, inline virtual destructors should be defined as follows (using the class Vehicle as an example):

    class Vehicle
    {
        ...
        public:
            inline virtual ~Vehicle();  // note the `inline'
            ...
    };

    inline Vehicle::~Vehicle()          // inline implementation
    {}                                  // is kept unaltered.

14.3: Pure virtual functions

Until now the base class Vehicle contained its own, concrete, implementations of the virtual functions weight() and setWeight(). In C++ it is also possible only to mention virtual member functions in a base class, without actually defining them. The functions are concretely implemented in a derived class. This approach, in some languages (like C#, Delphi and Java) known as an interface, defines a protocol, which must be implemented by derived classes. This implies that derived classes must take care of the actual definition: the C++ compiler will not allow the definition of an object of a class in which one or more such member functions are left unimplemented. The base class thus enforces a protocol by declaring a function by its name, return value and arguments. The derived classes must take care of the actual implementation. The base class itself therefore defines only a model or mold, to be used when other classes are derived.

Such base classes are also called abstract classes or abstract base classes. Abstract base classes are the foundation of many design patterns (cf. Gamma et al. (1995)), allowing the programmer to create highly reusable software. Some of these design patterns are covered by the Annotations (e.g., the Template Method in section 20.3), but for a thorough discussion of Design Patterns the reader is referred to Gamma et al.'s book.

Functions that are only declared in the base class are called pure virtual functions. A function is made pure virtual by prefixing the keyword virtual to its declaration and by postfixing it with = 0. An example of a pure virtual function occurs in the following listing, where the definition of a class Object requires the implementation of the conversion operator operator string():

    #include <string>

    class Object
    {
        public:
            virtual operator std::string() const = 0;
    };
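Now, all classes derived from Object must implement the operator string() member function, or their objects cannot be constructed. This is neat: all objects derived from Object can now always be converted to string objects, so their textual representation can, e.g., be inserted into ostream objects. Note that in that case the conversion must be requested explicitly (e.g., by first initializing a string from the object): the insertion operator for string objects is a function template, and user-defined conversions are not considered when the arguments of function templates are deduced.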

Should the virtual destructor of a base class be a pure virtual function? The answer to this question is no: a class such as Vehicle should not require derived classes to define a destructor. In contrast, Object::operator string() can be a pure virtual function: in this case the base class defines a protocol which must be adhered to.

Realize what would happen if we would define the destructor of a base class as a pure virtual destructor: according to the compiler, the derived class object can be constructed: as its destructor is defined, the derived class is not a pure abstract class. However, inside the derived class destructor, the destructor of its base class is implicitly called. This destructor was never defined, and the linker will loudly complain about an undefined reference to, e.g., Virtual::~Virtual().

Often, but not necessarily always, pure virtual member functions are const member functions. This allows the construction of constant derived class objects. In other situations this might not be necessary (or realistic), and non-constant member functions might be required. The general rule for const member functions applies also to pure virtual functions: if the member function will alter the object's data members, it cannot be a const member function. Often abstract base classes have no data members. However, the prototype of the pure virtual member function must be used again in derived classes. If the implementation of a pure virtual function in a derived class alters the data of the derived class object, then that function cannot be declared as a const member function. Therefore, the designer of an abstract base class should carefully consider whether a pure virtual member function should be a const member function or not.

14.3.1: Implementing pure virtual functions

Pure virtual member functions may be implemented. To implement a pure virtual member function, provide it with its normal = 0; specification, but implement it nonetheless. Since the = 0; ends in a semicolon, the pure virtual member is always at most a declaration in its class, but an implementation may either be provided inline below the class interface or it may be defined as a non-inline member function in a source file of its own.

Implemented pure virtual member functions may be called for derived class objects, or from members of their own class or of derived classes, by specifying the base class and scope resolution operator with the function to be called. The following small program shows some examples:

#include <iostream>

class Base
{
    public:
        virtual ~Base();
        virtual void pure() = 0;
};

inline Base::~Base()
{}

inline void Base::pure()
{
    std::cout << "Base::pure() called\n";
}

class Derived: public Base
{
    public:
        virtual void pure();
};

inline void Derived::pure()
{
    Base::pure();
    std::cout << "Derived::pure() called\n";
}

int main()
{
    Derived derived;

    derived.pure();
    derived.Base::pure();

    Derived *dp = &derived;

    dp->pure();
    dp->Base::pure();
}
// Output:
//      Base::pure() called
//      Derived::pure() called
//      Base::pure() called
//      Base::pure() called
//      Derived::pure() called
//      Base::pure() called

Implementing a pure virtual function has limited use. One could argue that the pure virtual function's implementation may be used to perform tasks that can already be performed at the base-class level. However, there is no guarantee that the base class virtual function will actually be called from the derived class's overriding version of the member function (unlike a base class constructor, which is automatically called from a derived class constructor). Since the base class implementation will therefore at most be called optionally, its functionality could as well be implemented in a separate member, which can then be called without the requirement to mention the base class explicitly.

14.4: Virtual functions in multiple inheritance

As mentioned in chapter 13 a class may be derived from multiple base classes. Such a derived class inherits the properties of all its base classes. Of course, the base classes themselves may be derived from classes yet higher in the hierarchy.

Consider what would happen if more than one `path' would lead from the derived class to the base class. This is illustrated in the code example below: a class Derived is doubly derived from a class Base:

    class Base
    {
        int d_field;
        public:
            void setfield(int val);
            int field() const;
    };
    inline void Base::setfield(int val)
    {
        d_field = val;
    }
    inline int Base::field() const
    {
        return d_field;
    }

    class Derived: public Base, public Base
    {
    };
Due to the double derivation, the functionality of Base now occurs twice in Derived. This leads to ambiguity: when the function setfield() is called for a Derived object, which function should that be, since there are two? In such a duplicate derivation, C++ compilers will normally refuse to generate code and will (correctly) identify an error.

The above code clearly duplicates its base class in the derivation, which can of course easily be avoided by not doubly deriving from Base. But duplication of a base class can also occur through nested inheritance, where an object is derived from, e.g., an Auto and from an Air (see the vehicle classification system, section 13.1). Such a class would be needed to represent, e.g., a flying car (such as the one in James Bond vs. the Man with the Golden Gun...). An AirAuto would ultimately contain two Vehicles, and hence two weight fields, two setWeight() functions and two weight() functions.

14.4.1: Ambiguity in multiple inheritance

Let's investigate more closely why an AirAuto introduces ambiguity when derived from Auto and Air. The duplication of Vehicle data is further illustrated in Figure 13.

Figure 13 is shown here.
Figure 13: Duplication of a base class in multiple derivation.


The internal organization of an AirAuto is shown in Figure 14.

Figure 14 is shown here.
Figure 14: Internal organization of an AirAuto object.


The C++ compiler will detect the ambiguity in an AirAuto object, and will therefore fail to compile a statement like:

    AirAuto cool;

    cout << cool.weight() << endl;
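The question of which member function weight() should be called cannot be answered by the compiler. The programmer has two possibilities to resolve the ambiguity explicitly. First, the function call where the ambiguity occurs can be modified, using the scope resolution operator to select the intended path, as in cool.Auto::weight(). Second, a dedicated member function weight() can be defined for the class AirAuto, which itself calls, e.g., Auto::weight(). The second possibility is preferable, since it relieves the programmer who uses the class AirAuto of special precautions.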

However, apart from these explicit solutions, there is a more elegant one, discussed in the next section.

14.4.2: Virtual base classes

As illustrated in Figure 14, an AirAuto represents two Vehicles. The result is not only an ambiguity in the functions which access the weight data, but also the presence of two weight fields. This is somewhat redundant, since we can assume that an AirAuto has just one weight.

We can achieve the situation that an AirAuto is only one Vehicle, yet uses multiple derivation. This is realized by defining the base class that is multiply mentioned in a derived class's inheritance tree as a virtual base class. For the class AirAuto this means that the derivation of Land and Air is changed:

    class Land: virtual public Vehicle
    {
        // etc
    };

    class Auto: public Land
    {
        // etc
    };


    class Air: virtual public Vehicle
    {
        // etc
    };

    class AirAuto: public Auto, public Air
    {
    };
The virtual derivation ensures that via the Land route, a Vehicle is only added to a class when a virtual base class was not yet present. The same holds true for the Air route. This means that we can no longer say via which route a Vehicle becomes a part of an AirAuto; we can only say that there is an embedded Vehicle object. The internal organization of an AirAuto after virtual derivation is shown in Figure 15.

Figure 15 is shown here.
Figure 15: Internal organization of an AirAuto object when the base classes are virtual.


Note the following:

Summarizing, using virtual derivation avoids ambiguity when member functions of a base class are called. Furthermore, duplication of data members is avoided.

14.4.3: When virtual derivation is not appropriate

In contrast to the previous definition of a class such as AirAuto, situations may arise where the double presence of the members of a base class is appropriate. To illustrate this, consider the definition of a Truck from section 13.4:
    class Truck: public Auto
    {
        int d_trailer_weight;

        public:
            Truck();
            Truck(int engine_wt, int sp, char const *nm,
                   int trailer_wt);

            void setWeight(int engine_wt, int trailer_wt);
            int weight() const;
    };

    Truck::Truck(int engine_wt, int sp, char const *nm,
                  int trailer_wt)
    :
        Auto(engine_wt, sp, nm)
    {
        d_trailer_weight = trailer_wt;
    }

    int Truck::weight() const
    {
        return                  // sum of:
            Auto::weight() +    //   engine part plus
            d_trailer_weight;   //   the trailer
    }
This definition shows how a Truck object is constructed to contain two weight fields: one via its derivation from Auto and one via its own int d_trailer_weight data member. Such a definition is of course valid, but it could also be rewritten. We could derive a Truck from an Auto and from a Vehicle, thereby explicitly requesting the double presence of a Vehicle; one for the weight of the engine and cabin, and one for the weight of the trailer. A small point of interest here is that a derivation like
    class Truck: public Auto, public Vehicle
is not accepted by the C++ compiler (or is at least flagged as ambiguous): a Vehicle is already part of an Auto, and is therefore not needed. An intermediate class solves the problem: we derive a class TrailerVeh from Vehicle, and Truck from Auto and from TrailerVeh. All ambiguities concerning the member functions are then resolved for the class Truck:
    class TrailerVeh: public Vehicle
    {
        public:
            TrailerVeh(int wt);
    };

    inline TrailerVeh::TrailerVeh(int wt)
    :
        Vehicle(wt)
    {}
	
    class Truck: public Auto, public TrailerVeh
    {
        public:
            Truck();
            Truck(int engine_wt, int sp, char const *nm, int trailer_wt);
            void setWeight(int engine_wt, int trailer_wt);
            int weight() const;
    };

    inline Truck::Truck(int engine_wt, int sp, char const *nm,
                        int trailer_wt)
    :
        Auto(engine_wt, sp, nm),
        TrailerVeh(trailer_wt)
    {}
	
    inline int Truck::weight() const
    {
        return                      // sum of:
            Auto::weight() +        //   engine part plus
            TrailerVeh::weight();   //   the trailer
    }

14.5: Run-time type identification

C++ offers two ways to retrieve the type of objects and expressions while the program is running. The possibilities of C++'s run-time type identification are limited compared to languages like Java. Normally, C++ uses static type checking and static type identification. Static type checking and determination is possibly safer and certainly more efficient than run-time type identification, and should therefore be used wherever possible. Nonetheless, C++ offers run-time type identification by providing the dynamic_cast and typeid operators. These operators operate on class type objects, containing at least one virtual member function.

14.5.1: The dynamic_cast operator

The dynamic_cast<>() operator is used to convert a base class pointer or reference to, respectively, a derived class pointer or reference.

A dynamic cast is performed at run time. A prerequisite for using the dynamic cast operator is the existence of at least one virtual member function in the base class.

In the following example a pointer to the class Derived is obtained from the Base class pointer bp:

    #include <iostream>

    class Base
    {
        public:
            virtual ~Base();
    };
    inline Base::~Base()
    {}

    class Derived: public Base
    {
        public:
            char const *toString();
    };
    inline char const *Derived::toString()
    {
        return "Derived object";
    }

    int main()
    {
        Base *bp;
        Derived *dp;
        Derived d;

        bp = &d;

        dp = dynamic_cast<Derived *>(bp);

        if (dp)
            std::cout << dp->toString() << std::endl;
        else
            std::cout << "dynamic cast conversion failed\n";
    }
Note the test: in the if condition the success of the dynamic cast is checked. This must be done at run time, as the compiler can't do this all by itself. If a base class pointer is provided, the dynamic cast operator returns 0 on failure and a pointer to the requested derived class on success. Consequently, if there are multiple derived classes, a series of checks could be performed to find the actual derived class to which the pointer points (in the next example the derived classes are left empty):
    #include <iostream>

    class Base
    {
        public:
            virtual ~Base();
    };
    inline Base::~Base()
    {}

    class Derived1: public Base
    {};
    class Derived2: public Base
    {};

    int main()
    {
        Base *bp;
        Derived1 *d1;
        Derived1 d;
        Derived2 *d2;

        bp = &d;

        if ((d1 = dynamic_cast<Derived1 *>(bp)))
            std::cout << "bp points to a Derived1 object\n";
        else if ((d2 = dynamic_cast<Derived2 *>(bp)))
            std::cout << "bp points to a Derived2 object\n";
    }
Alternatively, a reference to a base class object may be available. In this case the dynamic_cast<>() operator will throw an exception if it fails. For example:
    #include <iostream>
    #include <typeinfo>

    class Base
    {
        public:
            virtual ~Base();
            virtual char const *toString();
    };
    inline Base::~Base()
    {}
    inline char const *Base::toString()
    {
        return "Base::toString() called";
    }

    class Derived1: public Base
    {};

    class Derived2: public Base
    {};

    void process(Base &b)
    {
        try
        {
            std::cout << dynamic_cast<Derived1 &>(b).toString() << std::endl;
        }
        catch (std::bad_cast const &)
        {}

        try
        {
            std::cout << dynamic_cast<Derived2 &>(b).toString() << std::endl;
        }
        catch (std::bad_cast const &)
        {
            std::cout << "Bad cast to Derived2\n";
        }
    }

    int main()
    {
        Derived1 d;

        process(d);
    }
    /*
        Generated output:

        Base::toString() called
        Bad cast to Derived2
    */
In this example the type std::bad_cast is introduced. An exception of type std::bad_cast is thrown if the dynamic cast of a reference to a derived class object fails.

Note the form of the catch clause: bad_cast is the name of a type. In section 16.4.1 the construction of such a type is discussed.

The dynamic cast operator is a useful tool when an existing base class cannot or should not be modified (e.g., when the sources are not available), and a derived class may be modified instead. Code receiving a base class pointer or reference may then perform a dynamic cast to the derived class to access the derived class's functionality.

Casts from a base class reference or pointer to a derived class reference or pointer are called downcasts.

One may wonder what the difference is between a dynamic_cast and a reinterpret_cast. Both may be used with pointers as well as with references. So what's the difference when, e.g., both arguments are pointers?

When the reinterpret_cast is used, we tell the compiler that it literally should re-interpret a block of memory as something else. A well known example is obtaining the individual bytes of an int. An int consists of sizeof(int) bytes, and these bytes can be accessed by reinterpreting the location of the int value as a char *. When using a reinterpret_cast the compiler offers absolutely no safeguard. The compiler will happily reinterpret_cast an int * to a double *, but the resulting dereference produces at the very least a meaningless value.

The dynamic_cast will also reinterpret a block of memory as something else, but here a run-time safeguard is offered. The dynamic cast fails when the requested type doesn't match the actual type of the object we're pointing at. The dynamic_cast's purpose is also much more restricted than the reinterpret_cast's purpose, as it should only be used for downcasting to derived classes having virtual members.

14.5.2: The `typeid' operator

As with the dynamic_cast<>() operator, typeid is usually applied to base class references (or dereferenced base class pointers) that actually refer to derived class objects. Similarly, the base class should contain one or more virtual functions.

In order to use the typeid operator, source files must

    #include <typeinfo>
Actually, the typeid operator returns an object of type type_info, which may, e.g., be compared to other type_info objects.

The class type_info may be implemented differently by different implementations, but at the very least it has the following interface:

    class type_info
    {
        public:
            virtual ~type_info();
            bool operator==(const type_info &other) const;
            bool operator!=(const type_info &other) const;
            char const *name() const;
        private:
            type_info(type_info const &other);
            type_info &operator=(type_info const &other);
    };
Note that this class has a private copy constructor and overloaded assignment operator. This prevents the normal construction or assignment of a type_info object. Such type_info objects are constructed and returned by the typeid operator. Implementations, however, may choose to extend or elaborate the type_info class and provide, e.g., lists of functions that can be called with a certain class.

If the typeid operator is given a base class reference (where the base class contains at least one virtual function), it will indicate that the type of its operand is the derived class. For example:

    class Base
    {
        public:
            virtual ~Base() {}  // at least one virtual function
    };
    class Derived: public Base
    {};

    Derived d;
    Base    &br = d;

    cout << typeid(br).name() << endl;
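In this example the typeid operator is given a base class reference. It will print a text identifying the class Derived, being the class br actually refers to (the exact text is implementation dependent: some compilers print ``Derived'', others print a decorated form of that name). If Base does not contain virtual functions, a text identifying Base would have been printed.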

The typeid operator can be used to determine the name of the actual type of expressions, not just of class type objects. For example:

    cout << typeid(12).name() << endl;     // prints:  int
    cout << typeid(12.23).name() << endl;  // prints:  double
Note, however, that the above example is suggestive at most of the type that is printed. It may be int and double, but this is not necessarily the case. If portability is required, make sure no tests against these static, built-in text-strings are required. Check out what your compiler produces in case of doubt.

In situations where the typeid operator is applied to determine the type of a derived class, it is important to realize that a base class reference should be used as the argument of the typeid operator. Consider the following example:

    class Base
    {
        public:
            virtual ~Base() {}  // at least one virtual function
    };
    class Derived: public Base
    {};

    Base *bp = new Derived;     // base class pointer to derived object

    if (typeid(bp) == typeid(Derived *))    // 1: false
        ...
    if (typeid(bp) == typeid(Base *))       // 2: true
        ...
    if (typeid(bp) == typeid(Derived))      // 3: false
        ...
    if (typeid(bp) == typeid(Base))         // 4: false
        ...
    if (typeid(*bp) == typeid(Derived))     // 5: true
        ...
    if (typeid(*bp) == typeid(Base))        // 6: false
        ...

    Base &br = *bp;

    if (typeid(br) == typeid(Derived))      // 7: true
        ...
    if (typeid(br) == typeid(Base))         // 8: false
        ...
Here, (1) returns false as a Base * is not a Derived *. (2) returns true, as the two pointer types are the same, (3) and (4) return false as pointers to objects are not the objects themselves.

On the other hand, if *bp is used in the above expressions, then (1) and (2) return false as an object (or reference to an object) is not a pointer to an object, whereas (5) now returns true: *bp actually refers to a Derived class object, and typeid(*bp) will return typeid(Derived). A similar result is obtained if a base class reference is used: 7 returning true and 8 returning false.

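When the operator typeid is applied to a dereferenced 0-pointer to a class having virtual functions (e.g., typeid(*bp) where bp is 0) a bad_typeid exception is thrown. Applying typeid to the 0-pointer itself (typeid(bp)) simply produces the pointer's type.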

14.6: Deriving classes from `streambuf'

The class streambuf (see section 5.7 and figure 4) has many (protected) virtual member functions (see section 5.7.1) that are used by the stream classes using streambuf objects. By deriving a class from the class streambuf these member functions may be overridden in the derived classes, thus implementing a specialization of the class streambuf for which the standard istream and ostream objects can be used.

Basically, a streambuf interfaces to some device. The normal behavior of the stream-class objects remains unaltered. So, a string extraction from a streambuf object will still return a consecutive sequence of non white space delimited characters. If the derived class is used for input operations, the following member functions are serious candidates to be overridden: underflow(), uflow(), xsgetn(), pbackfail() and showmanyc(). Examples in which some of these functions are overridden will be given later in this section.

When the derived class is used for output operations, the next member functions should be considered: overflow() and xsputn().

For derived classes using buffers and supporting seek operations, consider these member functions: setbuf(), seekoff(), seekpos() and sync().

Next, consider the following problem, which will be solved by constructing a class CapsBuf derived from streambuf. The problem is to construct a streambuf writing its information to the standard output stream in such a way that all white-space delimited series of characters are capitalized. The class CapsBuf obviously needs an overridden overflow() member and a minimal awareness of its state. Its state changes between `Capitalize' and `Literal' as follows: the start state is `Capitalize'; the state becomes `Capitalize' after a white space character was processed; and it becomes `Literal' after a non-white space character was processed.

A simple variable to remember the last character allows us to keep track of the current state. Since `Capitalize' is similar to `last character processed is a white space character' we can simply initialize the variable with a white space character, e.g., the blank space. Here is the initial definition of the class CapsBuf:
#include <iostream>
#include <streambuf>
#include <ctype.h>

class CapsBuf: public std::streambuf
{
    int d_last;

    public:
        CapsBuf()
        :
            d_last(' ')
        {}

    protected:
        int overflow(int c)             // interface to the device.
        {
            std::cout.put(isspace(d_last) ? toupper(c) : c);
            return d_last = c;
        }
};
An example of a program using CapsBuf is:
    #include "capsbuf1.h"
    using namespace std;

    int main()
    {
        CapsBuf     cb;

        ostream     out(&cb);

        out << hex << "hello " << 32 << " worlds" << endl;

        return 0;
    }
    /*
        Generated output:

        Hello 20 Worlds
    */
Note the use of the insertion operator, and note that all type and radix conversions (inserting hex and the value 32, coming out as the ASCII-characters '2' and '0') are neatly done by the ostream object. The real purpose in life for CapsBuf is to capitalize series of ASCII-characters, and that's what it does very well.
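The `device' doesn't have to be the standard output stream. The following sketch is a variation on CapsBuf, under a few assumptions of our own: the class name CapsStringBuf, its std::string sink and the function capitalized() are hypothetical. It also shows two defensive measures that the class above does not take: the end-of-file value is not written, and characters are converted to unsigned char before being passed to isspace() and toupper():

```cpp
#include <cctype>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical variant of CapsBuf: characters are appended to a string
// supplied by the caller, rather than being written to std::cout.
class CapsStringBuf: public std::streambuf
{
    std::string &d_sink;
    int d_last;

    public:
        explicit CapsStringBuf(std::string &sink)
        :
            d_sink(sink),
            d_last(' ')
        {}

    protected:
        int overflow(int c)
        {
            if (c == traits_type::eof())        // nothing to write
                return traits_type::not_eof(c);

            unsigned char ch = static_cast<unsigned char>(c);
            d_sink += static_cast<char>(
                        std::isspace(d_last) ? std::toupper(ch) : ch);
            d_last = ch;
            return c;
        }
};

// Inserts the same items as the example program and returns what was
// collected by the CapsStringBuf object.
std::string capitalized()
{
    std::string text;
    CapsStringBuf cb(text);
    std::ostream out(&cb);

    out << std::hex << "hello " << 32 << " worlds" << '\n';
    return text;
}
```

Since the collected text is available as a string, the behavior of the buffer can be inspected by the program itself, which is convenient when testing.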

Next, we realize that inserting characters into streams can also be realized by a construction like

    cout << cin.rdbuf();
or, boiling down to the same thing:
    cin >> cout.rdbuf();
Realizing that this is all about streams, we now try, in the main() function above:
    cin >> out.rdbuf();
We compile and link the program to the executable caps, and start:
    echo hello world | caps
Unfortunately, nothing happens.... Nor do we get any reaction when we try the statement cin >> cout.rdbuf(). What's wrong here?

The difference between cout << cin.rdbuf(), which does produce the expected results, and our use of cin >> out.rdbuf() is that the operator>>(streambuf *) member function (and its insertion counterpart) performs a streambuf-to-streambuf copy only if the respective stream modes are set up correctly. So, the argument of the extraction operator must point to a streambuf into which information can be written. By default, no stream mode is set for a plain streambuf object. As there is no constructor for a streambuf accepting an ios::openmode, we force the required ios::out mode by defining an output buffer using setp(). We define a buffer, but don't want to use it, so we let its size be 0. Note that this differs from passing 0-arguments to setp(), as that would indicate `no buffering', which would not alter the default situation. Although any non-0 value could be used for the empty [begin, begin) range, we decided to define a (dummy) local char variable in the constructor, and use [&dummy, &dummy) to define the empty buffer. This effectively defines CapsBuf as an output buffer, thus activating the

    istream::operator>>(streambuf *)
member. As the variable dummy is not used by setp() it may be defined as a local variable. Its only purpose in life is to indicate to setp() that no buffer is used. Here is the revised constructor of the class CapsBuf:
    CapsBuf::CapsBuf()
    :
        d_last(' ')
    {
        char dummy;
        setp(&dummy, &dummy);
    }
Now the program can use either
    out << cin.rdbuf();
or:
    cin >> out.rdbuf();
Actually, the ostream wrapper isn't really needed here:
    cin >> &cb;
would have produced the same results.
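The extraction into a streambuf can also be tried without involving cin. The following sketch makes some assumptions of its own: the class CapsStringBuf (collecting its output in a std::string) and the function capitalizeViaExtraction() are hypothetical names, and an istringstream takes the place of cin. The constructor uses the empty put area described above:

```cpp
#include <cctype>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical CapsBuf variant collecting its output in a string.
class CapsStringBuf: public std::streambuf
{
    std::string &d_sink;
    int d_last;

    public:
        explicit CapsStringBuf(std::string &sink)
        :
            d_sink(sink),
            d_last(' ')
        {
            char dummy;
            setp(&dummy, &dummy);   // empty put area: every character
        }                           // is passed on to overflow()

    protected:
        int overflow(int c)
        {
            if (c == traits_type::eof())
                return traits_type::not_eof(c);

            unsigned char ch = static_cast<unsigned char>(c);
            d_sink += static_cast<char>(
                        std::isspace(d_last) ? std::toupper(ch) : ch);
            d_last = ch;
            return c;
        }
};

// Copies all characters of `input' into a CapsStringBuf, using the
// istream's operator>>(streambuf *) member.
std::string capitalizeViaExtraction(std::string const &input)
{
    std::istringstream in(input);
    std::string text;
    CapsStringBuf cb(text);

    in >> &cb;
    return text;
}
```

Whether the empty put area is actually required depends on the library implementation being used; the sketch merely follows the approach described in this section.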

It is not clear whether the setp() solution proposed here is actually a kludge. After all, shouldn't the ostream wrapper around cb inform the CapsBuf that it should act as a streambuf for doing output operations?

14.7: A polymorphic exception class

Earlier in the Annotations (section 8.3.1) we hinted at the possibility of designing a class Exception whose process() member would behave differently, depending on the kind of exception that was thrown. Now that we've introduced polymorphism, we can further develop this example.

By now it will probably be clear that our class Exception should be a polymorphic base class (i.e., a base class declaring virtual member functions), from which special exception handling classes can be derived. It could even be argued that Exception can be an abstract base class declaring only pure virtual member functions. In the discussion in section 8.3.1 a member function severity() was mentioned which might not be a proper candidate for a pure virtual member function, but for that member we can now use the completely general dynamic_cast<>() operator.

The (abstract) base class Exception is designed as follows:

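    #ifndef INCLUDED_EXCEPTION_H_   // names starting with an underscore and
    #define INCLUDED_EXCEPTION_H_   // a capital are reserved for the compiler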

    #include <iostream>
    #include <string>

    class Exception
    {
        friend std::ostream &operator<<(std::ostream &str,
                                        Exception const &e);
        std::string d_reason;

        public:
            virtual ~Exception();
            virtual void process() const = 0;
            virtual operator std::string() const;
        protected:
            Exception(char const *reason);
    };

        inline Exception::~Exception()
        {}
        inline Exception::operator std::string() const
        {
            return d_reason;
        }
        inline Exception::Exception(char const *reason)
        :
            d_reason(reason)
        {}
        inline std::ostream &operator<<(std::ostream &str, Exception const &e)
        {
            return str << e.operator std::string();
        }

    #endif
The operator string() member function of course replaces the toString() member used in section 8.3.1. The friend operator<<() function is using the (virtual) operator string() member so that we're able to insert an Exception object into an ostream. Apart from that, notice the use of a virtual destructor, doing nothing.

A derived class FatalException: public Exception could now be defined as follows (using a very basic process() implementation indeed):

    #ifndef INCLUDED_FATALEXCEPTION_H_
    #define INCLUDED_FATALEXCEPTION_H_

    #include <cstdlib>              // declares std::exit()
    #include "exception.h"

    class FatalException: public Exception
    {
        public:
            FatalException(char const *reason);
            void process() const;
    };
        inline FatalException::FatalException(char const *reason)
        :
            Exception(reason)
        {}
        inline void FatalException::process() const
        {
            std::exit(1);           // a fatal exception ends the program
        }
    #endif

The translation of the example at the end of section 8.3.1 to the current situation can now easily be made (using derived classes WarningException and MessageException), constructed like FatalException:

    #include <iostream>
    #include "message.h"
    #include "warning.h"
    using namespace std;

    void initialExceptionHandler(Exception const *e)
    {
        cout << *e << endl;         // show the plain-text information

        if
        (
            !dynamic_cast<MessageException const *>(e)
            &&
            !dynamic_cast<WarningException const *>(e)
        )
            throw;                  // Pass on other types of Exceptions

        e->process();               // Process a message or a warning
        delete e;
    }
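Note that the handler uses a plain throw statement, so it must be called while an exception is being handled, i.e., from within a catch block. The following self-contained sketch shows such a call. It is a simplification under our own assumptions: the classes are reduced to their bare essentials, reason() and dispatch() are hypothetical members and functions, only WarningException is handled locally, and FatalException::process() does not end the program here:

```cpp
#include <iostream>
#include <string>

class Exception
{
    std::string d_reason;

    public:
        virtual ~Exception()
        {}
        virtual void process() const = 0;
        std::string const &reason() const   // hypothetical accessor
        {
            return d_reason;
        }
    protected:
        explicit Exception(char const *reason)
        :
            d_reason(reason)
        {}
};

class WarningException: public Exception
{
    public:
        explicit WarningException(char const *reason)
        :
            Exception(reason)
        {}
        void process() const
        {
            std::cout << "warning: " << reason() << '\n';
        }
};

class FatalException: public Exception
{
    public:
        explicit FatalException(char const *reason)
        :
            Exception(reason)
        {}
        void process() const        // a real implementation would end
        {}                          // the program
};

void initialExceptionHandler(Exception const *e)
{
    if (!dynamic_cast<WarningException const *>(e))
        throw;                      // pass on other types of Exceptions

    e->process();                   // process a warning
    delete e;
}

// Throws `thrown' and reports where the exception ended up.
std::string dispatch(Exception *thrown)
{
    std::string result("handled");
    try
    {
        try
        {
            throw thrown;
        }
        catch (Exception const *e)
        {
            initialExceptionHandler(e);     // may rethrow
        }
    }
    catch (Exception const *e)              // receives rethrown exceptions
    {
        delete e;
        result = "passed on";
    }
    return result;
}
```

A call like dispatch(new WarningException("disk almost full")) is handled by initialExceptionHandler() itself, whereas a FatalException is rethrown and reaches the outer catch block.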

14.8: How polymorphism is implemented

This section briefly describes how polymorphism is implemented in C++. It is not necessary to understand how polymorphism is implemented if using this feature is the only intention. However, we think it's nice to know how polymorphism is at all possible. Besides, the following discussion does explain why there is a cost of polymorphism in terms of memory usage.

The fundamental idea behind polymorphism is that the compiler does not know at compile time which function to call; the appropriate function will be selected at run time. That means that the address of the function must be stored somewhere, to be looked up prior to the actual call. This `somewhere' place must be accessible from the object in question. E.g., when a Vehicle *vp points to a Truck object, then vp->weight() calls a member function of Truck; the address of this function is determined from the actual object which vp points to.

A common implementation is the following: An object containing virtual member functions holds as its first data member a hidden field, pointing to an array of pointers containing the addresses of the virtual member functions. The hidden data member is usually called the vpointer, the array of virtual member function addresses the vtable. Note that the discussed implementation is compiler-dependent, and is by no means dictated by the C++ ANSI/ISO standard.

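The table of addresses of virtual functions is shared by all objects of the class. Multiple classes may even share the same table. The overhead in terms of memory consumption is therefore one extra pointer field per object, plus one table of pointers per (derived) class, storing the addresses of the class's virtual functions.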
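The per-object part of this overhead can be made visible using sizeof. The sketch below compares two otherwise identical structs; the names Plain, WithVirtual and overheadPerObject() are hypothetical. The actual sizes depend on the compiler and the platform, so no specific values are claimed here:

```cpp
#include <cstddef>

struct Plain                        // no virtual members
{
    int d_value;

    void set(int value)
    {
        d_value = value;
    }
};

struct WithVirtual                  // same data, but virtual members
{
    int d_value;

    virtual ~WithVirtual()
    {}
    virtual void set(int value)
    {
        d_value = value;
    }
};

// The number of extra bytes each object needs when the class
// has virtual members (implementation dependent).
std::size_t overheadPerObject()
{
    return sizeof(WithVirtual) - sizeof(Plain);
}
```

With common implementations the difference is at least the size of a pointer, and often somewhat more due to alignment requirements.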

Consequently, a statement like vp->weight() first inspects the hidden data member of the object pointed to by vp. In the case of the vehicle classification system, this data member points to a table of two addresses: one pointer for the function weight() and one pointer for the function setWeight(). The actual function which is called is determined from this table.
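To make the mechanism somewhat more tangible, here is a rough emulation of a vpointer and vtable using plain structs and function pointers. This is merely a sketch of the idea, not what any particular compiler generates. All names are our own, and the member d_trailerWeight is an assumption about the extra data a Truck might hold:

```cpp
struct Vehicle;                         // forward declaration

struct VTable                           // one table per class
{
    int  (*weight)(Vehicle const *);
    void (*setWeight)(Vehicle *, int);
};

struct Vehicle
{
    VTable const *d_vptr;               // the hidden `vpointer'
    int d_weight;
};

struct Truck                            // `derived': Vehicle part comes first
{
    Vehicle d_base;
    int d_trailerWeight;                // assumed extra data of a Truck
};

int vehicleWeight(Vehicle const *v)
{
    return v->d_weight;
}
void vehicleSetWeight(Vehicle *v, int weight)
{
    v->d_weight = weight;
}
int truckWeight(Vehicle const *v)       // Truck's own version of weight()
{
    // valid, as the Vehicle part is the first member of a Truck
    Truck const *t = reinterpret_cast<Truck const *>(v);
    return t->d_base.d_weight + t->d_trailerWeight;
}

VTable const vehicleTable = { vehicleWeight, vehicleSetWeight };
VTable const truckTable   = { truckWeight,   vehicleSetWeight };

// What a call like vp->weight() boils down to in this emulation:
// fetch the table via the object, then call through the table.
int callWeight(Vehicle const *vp)
{
    return vp->d_vptr->weight(vp);
}
```

An object is set up by letting its d_vptr member point to the table of its class, e.g., Truck t = { { &truckTable, 3000 }, 1500 }. Thereafter callWeight(&t.d_base) reaches truckWeight(), although only a Vehicle const * is passed.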

The internal organization of the objects having virtual functions is further illustrated in figures 16 and 17 (provided by Guillaume Caumon).

Figure 16: Internal organization of objects when virtual functions are defined.

Figure 17: Complementary figure, provided by Guillaume Caumon


As can be seen from figures 16 and 17, all objects which use virtual functions must have one (hidden) data member to address a table of function pointers. The objects of the classes Vehicle and Auto both address the same table. The class Truck, however, introduces its own version of weight(): therefore, this class needs its own table of function pointers.

14.9: Undefined reference to vtable ...

Occasionally, the linker will complain with a message like the following:
    In function
        `Derived::Derived[in-charge]()':
        : undefined reference to `vtable for Derived'
This error is caused by the absence of the implementation of a virtual function in a derived class, while the function is mentioned in the derived class's interface.

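Such a situation can easily be created: define a (complete) base class having a virtual member function, mention that virtual member function in the derived class's interface as well, but omit its implementation. The compiler doesn't complain, but the linker does. Here is an example producing the error: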
    class Base
    {
        public:
            virtual void member();
    };

        inline void Base::member()
        {}

    class Derived: public Base
    {
        public:
            virtual void member();      // only declared
    };

    int main()
    {
        Derived d;  // Will compile, since all members were declared.
                    // Linking will fail, since we don't have the
                    // implementation of Derived::member()
    }
It's of course easy to correct the error: implement the derived class's missing virtual member function.
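A corrected version might look as follows. To make the effect observable, member() returns an int in this sketch, and a virtual destructor and the function callMember() were added; these are our own modifications of the example:

```cpp
class Base
{
    public:
        virtual ~Base();
        virtual int member();
};

    inline Base::~Base()
    {}
    inline int Base::member()
    {
        return 1;
    }

class Derived: public Base
{
    public:
        virtual int member();       // declared ...
};

    int Derived::member()           // ... and now also implemented
    {
        return 2;
    }

// Calls member() through a base class reference.
int callMember(Base &object)
{
    return object.member();
}
```

Now that Derived::member() has been implemented the linker finds everything it needs, and callMember() selects Derived::member() for Derived objects.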

14.10: Virtual constructors

As we have seen (section 14.2) C++ supports virtual destructors. Like many other object oriented languages (e.g., Java), however, the notion of a virtual constructor is not supported. The absence of a virtual constructor turns into a problem when only a base class reference or pointer is available, and a copy of a derived class object is required. Gamma et al. (1995) developed the Prototype Design Pattern to deal with this situation.

In the Prototype Design Pattern each derived class is given the task to make available a member function returning a pointer to a new copy of the object for which the member is called. The usual name for this function is clone(). A base class supporting `cloning' only needs to define a virtual destructor and a `virtual copy constructor': a pure virtual function having the prototype virtual Base *clone() const = 0.

Since clone() is a pure virtual function all derived classes must implement their own `virtual constructor'.
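A minimal sketch of this setup is shown next. The functions duplicate() and copyKeepsType() are hypothetical, merely illustrating that a copy made through a base class reference keeps the type of the original object:

```cpp
#include <typeinfo>

class Base
{
    public:
        virtual ~Base()
        {}
        virtual Base *clone() const = 0;
};

class Derived1: public Base
{
    public:
        virtual Base *clone() const
        {
            return new Derived1(*this);     // uses the copy constructor
        }
};

// Copies an object that is only known through a Base reference.
// The caller becomes the owner of the returned object.
Base *duplicate(Base const &original)
{
    return original.clone();
}

// Returns true if the copy is a Derived1 object as well.
bool copyKeepsType()
{
    Derived1 object;
    Base *copy = duplicate(object);

    bool sameType = typeid(*copy) == typeid(Derived1);

    delete copy;
    return sameType;
}
```

Since duplicate() returns a pointer to allocated memory, its caller must eventually delete the copy. Managing such pointers is the kind of task a wrapper class can take over.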

This setup suffices in most situations where we have a pointer or reference to a base class, but fails for example with abstract containers. We can't create a vector<Base>, with Base featuring the pure virtual clone() member in its interface, as Base() is called to initialize new elements of such a vector. This is impossible as clone() is a pure virtual function, so a Base object can't be constructed.

The intuitive solution, providing clone() with a default implementation, defining it as an ordinary virtual function, fails too as the container calls the normal Base(Base const &) copy constructor, which would then have to call clone() to obtain a copy of the copy constructor's argument. At this point it becomes unclear what to do with that copy, as the new Base object already exists, and contains no Base pointer or reference data member to assign clone()'s return value to.

An alternative and preferred approach is to keep the original Base class (defined as an abstract base class), and to manage the Base pointers returned by clone() in a separate class Clonable. In chapter 16 we'll encounter means to merge Base and Clonable into one class, but for now we'll define them as separate classes.

The class Clonable is a very standard class. As it contains a pointer member, it needs a copy constructor, destructor, and overloaded assignment operator (cf. chapter 7). It's given at least one non-standard member: Base &get() const, returning a reference to the derived object to which Clonable's Base * data member refers, and optionally a Clonable(Base const &) constructor to allow promotions from objects of classes derived from Base to Clonable.

Any non-abstract class derived from Base must implement Base *clone() const, returning a pointer to a newly created (allocated) copy of the object for which clone() is called.

Once we have defined a derived class (e.g., Derived1), we can put our Clonable and Base facilities to good use.

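In the next example we see main() in which a vector<Clonable> was defined. An anonymous Derived1 object is thereupon inserted into the vector. This proceeds as follows: the anonymous Derived1 object is created; it is used to initialize a temporary Clonable object through Clonable(Base const &bp), which calls Derived1::clone() to obtain a newly allocated Derived1 object; finally the vector stores a copy of that Clonable object, made by Clonable's copy constructor, which calls clone() once more.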

In this sequence, two temporary objects are used: the anonymous object and the Derived1 object constructed by the first Derived1::clone() call. The third Derived1 object is inserted into the vector. Having inserted the object into the vector, the two temporary objects are destroyed.

Next, the get() member is used in combination with typeid to show the actual type of the Base & object: a Derived1 object.

The most interesting part of main() is the line vector<Clonable> v2(bv), where a copy of the first vector is created. As shown, the copy keeps intact the actual types of the Base references.

At the end of the program, we have created two Derived1 objects, which are then correctly deleted by the vectors' destructors. Here is the full program, illustrating the `virtual constructor' concept:

    #include <iostream>
    #include <vector>
    #include <typeinfo>

    class Base
    {
        public:
            virtual ~Base();
            virtual Base *clone() const = 0;
    };

        inline Base::~Base()
        {}

    class Clonable
    {
        Base *d_bp;

        public:
            Clonable();
            ~Clonable();
            Clonable(Clonable const &other);
            Clonable &operator=(Clonable const &other);

            // New for virtual constructions:
            Clonable(Base const &bp);
            Base &get() const;

        private:
            void copy(Clonable const &other);
    };

        inline Clonable::Clonable()
        :
            d_bp(0)
        {}
        inline Clonable::~Clonable()
        {
            delete d_bp;
        }
        inline Clonable::Clonable(Clonable const &other)
        {
            copy(other);
        }

        Clonable &Clonable::operator=(Clonable const &other)
        {
            if (this != &other)
            {
                delete d_bp;
                copy(other);
            }
            return *this;
        }

        // New for virtual constructions:
        inline Clonable::Clonable(Base const &bp)
        {
            d_bp = bp.clone();      // allows initialization from
        }                           // Base and derived objects
        inline Base &Clonable::get() const
        {
            return *d_bp;
        }

        void Clonable::copy(Clonable const &other)
        {
            if ((d_bp = other.d_bp))
                d_bp = d_bp->clone();
        }

    class Derived1: public Base
    {
        public:
            ~Derived1();
            virtual Base *clone() const;
    };

        inline Derived1::~Derived1()
        {
            std::cout << "~Derived1() called\n";
        }
        inline Base *Derived1::clone() const
        {
            return new Derived1(*this);
        }

    using namespace std;

    int main()
    {
        vector<Clonable> bv;

        bv.push_back(Derived1());
        cout << "==\n";

        cout << typeid(bv[0].get()).name() << endl;
        cout << "==\n";

        vector<Clonable> v2(bv);
        cout << typeid(v2[0].get()).name() << endl;
        cout << "==\n";
    }