Chapter 7: Classes and memory allocation

Don't hesitate to send in feedback: send an e-mail if you like the C++ Annotations; if you think that important material was omitted; if you find errors or typos in the text or the code examples; or if you just feel like e-mailing. Send your e-mail to Frank B. Brokken.

Please state the document version you're referring to, as found in the title (in this document: 7.0.0) and please state chapter and paragraph name or number you're referring to.

All received mail is processed conscientiously, and received suggestions for improvements will usually have been processed by the time a new version of the Annotations is released. Except for the incidental case I will normally not acknowledge the receipt of suggestions for improvements. Please don't interpret this as me not appreciating your efforts.

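In contrast to the set of functions which handle memory allocation in C (i.e., malloc() etc.), the operators new and delete are specifically meant to be used with the features that C++ offers. Important differences between malloc() and new, all discussed later in this chapter, are that new is given a type, from which the compiler determines the required amount of memory; that new calls a constructor when an object is allocated; and that a failing allocation activates an error function instead of having to be detected by inspecting a returned pointer.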

A comparable relationship exists between free() and delete: delete makes sure that when an object is deallocated, a corresponding destructor is called.

The automatic calling of constructors and destructors when objects are created and destroyed has a number of consequences which we shall discuss in this chapter. Many problems encountered during C program development are caused by incorrect memory allocation or memory leaks: memory is not allocated, not freed, not initialized, boundaries are overwritten, etc. C++ does not `magically' solve these problems, but it does provide a number of handy tools.

Unfortunately, frequently used C functions that allocate memory, like strdup(), are malloc() based and should therefore preferably not be used anymore in C++ programs. Instead, corresponding functions based on the operator new are preferred. Also, since the class string is available, there is less need for these functions in C++ than in C. In cases where operations on char * are preferred or necessary, comparable functions based on new could be developed. E.g., for the function strdup() a comparable function char *strdupnew(char const *str) could be developed as follows:

    #include <cstring>

    char *strdupnew(char const *str)
    {
        return str ? strcpy(new char [strlen(str) + 1], str) : 0;
    }
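
As a usage sketch: since strdupnew() obtains its memory through new[], the duplicate should eventually be released using delete[] rather than free(). The helper duplicateMatches() below is hypothetical and only serves the illustration:

```cpp
#include <cassert>
#include <cstring>

char *strdupnew(char const *str)
{
    return str ? std::strcpy(new char[std::strlen(str) + 1], str) : 0;
}

// hypothetical helper: duplicates text, compares the copy, releases it
bool duplicateMatches(char const *text)
{
    assert(text != 0);                  // the sketch expects a real string

    char *copy = strdupnew(text);
    bool same = copy != text && std::strcmp(copy, text) == 0;

    delete[] copy;                      // new[] is matched by delete[]
    return same;
}
```

Calling duplicateMatches("hello") is expected to return true: the copy lives at a different address but holds the same characters.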

7.1: The operators `new' and `delete'

C++ defines two operators to allocate and deallocate memory. These operators are new and delete.

The most basic example of the use of these operators is given below. An int pointer variable is used to point to memory which is allocated by the operator new. This memory is later released by the operator delete.

    int *ip;

    ip = new int;
    delete ip;
Notes:

The operator new can be used to allocate primitive types and to allocate objects. When a non-class type is allocated (a primitive type or a struct type without a constructor), the allocated memory is not guaranteed to be initialized to 0. Alternatively, an initialization expression may be provided:

    int *v1 = new int;          // not guaranteed to be initialized to 0
    int *v2 = new int();        // initialized to 0
    int *v3 = new int(3);       // initialized to 3
    int *v4 = new int(3 * *v3); // initialized to 9
When class-type objects are allocated, a constructor is always called, and the allocated memory is initialized by that constructor. Arguments for the constructor, if any, are specified between parentheses following the type. For example, to allocate a string object the following statement can be used:
        string *s = new string();
Here, the default constructor was used, and s will point to the newly allocated, but empty, string. If overloaded forms of the constructor are available, these can be used as well. E.g.,
        string *s = new string("hello world");
which results in s pointing to a string containing the text hello world.
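
The initialization forms shown above can be checked with a small sketch. The function initializedSum() is a hypothetical name, introduced here only for the illustration:

```cpp
#include <cassert>
#include <string>

// hypothetical helper: sums the values stored by three allocations
int initializedSum()
{
    int *zero = new int();              // explicitly initialized to 0
    int *three = new int(3);            // initialized to 3
    std::string *text = new std::string("hello world");

    assert(*zero == 0);

    int sum = *zero + *three + static_cast<int>(text->length());

    delete zero;                        // each new is matched by a delete
    delete three;
    delete text;

    return sum;                         // 0 + 3 + 11
}
```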

7.1.1: Allocating arrays

Operator new[] is used to allocate arrays. The generic notation new[] is an abbreviation used in the Annotations. Actually, the number of elements to be allocated is specified as an expression between the square brackets, which are prefixed by the type of the values or class of the objects that must be allocated:
        int *intarr = new int[20];   // allocates 20 ints
Note well that operator new is a different operator than operator new[]. In section 9.9 redefining operator new[] is covered.

Arrays allocated by operator new[] are called dynamic arrays. They are constructed during the execution of a program, and their lifetime may exceed the lifetime of the function in which they were created. Dynamically allocated arrays may last for as long as the program runs.

When new[] is used to allocate an array of primitive values or an array of objects, new[] must be specified with a type and an (unsigned) expression between square brackets. The type and expression together are used by the compiler to determine the required size of the block of memory to make available. With the array allocation, all elements are stored consecutively in memory. The array index notation can be used to access the individual elements: intarr[0] will be the very first int value, immediately followed by intarr[1], and so on until the last element: intarr[19]. With non-class types (primitive types, struct types without constructors, pointer types) the returned allocated block of memory is not guaranteed to be initialized to 0.

To allocate arrays of objects, the new[]-bracket notation is used as well. For example, to allocate an array of 20 string objects the following construction is used:

        string *strarr = new string[20];   // allocates 20 strings
Note here that, since objects are allocated, constructors are automatically used. So, whereas new int[20] results in a block of 20 uninitialized int values, new string[20] results in a block of 20 initialized string objects. With arrays of objects the default constructor is used for the initialization. Unfortunately it is not possible to use a constructor having arguments when arrays of objects are allocated. However, it is possible to overload operator new[] and provide it with arguments which may be used for a non-default initialization of arrays of objects. Overloading operator new[] is discussed in section 9.9.
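
That the elements of a dynamically allocated array of objects are default constructed can be verified with a short sketch. The function allEmpty() is a hypothetical helper, not part of the Annotations:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// hypothetical helper: true if all strings of a new array start out empty
bool allEmpty(std::size_t size)
{
    assert(size > 0);                   // the sketch expects a real array

    std::string *strarr = new std::string[size];
    bool empty = true;

    for (std::size_t idx = 0; idx < size; ++idx)
        empty = empty && strarr[idx].empty();

    delete[] strarr;                    // new[] is matched by delete[]
    return empty;
}
```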

Some compilers (e.g., the GNU compiler, as an extension comparable to C's variable-length arrays) also allow arrays of variable size to be defined as local arrays within functions, without resorting to the operator new[]. Such arrays are not dynamic arrays, but local arrays, and their lifetime is restricted to the lifetime of the block in which they were defined.

Once allocated, all arrays are fixed size arrays. There is no simple way to enlarge or shrink arrays: there is no renew operator. In section 7.1.3 an example is given showing how to enlarge an array.

7.1.2: Deleting arrays

A dynamically allocated array may be deleted using operator delete[]. Operator delete[] expects a pointer to a block of memory, previously allocated using operator new[].

When an object is deleted, its destructor (see section 7.2) is called automatically, comparable to the calling of the object's constructor when the object was created. It is the task of the destructor, as discussed in depth later in this chapter, to do all kinds of cleanup operations that are required for the proper destruction of the object.

The operator delete[] (empty square brackets) expects as its argument a pointer to an array of objects. This operator will now first call the destructors of the individual objects, and will then delete the allocated block of memory. So, the proper way to delete an array of Objects is:

    Object *op = new Object[10];
    delete[] op;
Realize that delete[] only has an additional effect if the block of memory to be deallocated consists of objects. With pointers or values of primitive types normally no special action is performed. Following int *it = new int[10] the statement delete[] it simply returns the memory occupied by all ten int values to the common pool. Nothing special happens.

Note especially that an array of pointers to objects is not handled as an array of objects by delete[]: the array of pointers to objects doesn't contain objects, so the objects are not properly destroyed by delete[], whereas an array of objects contains objects, which are properly destroyed by delete[]. In section 7.2 several examples of the use of delete versus delete[] will be given.

The operator delete is a different operator than operator delete[]. In section 9.9 redefining delete[] is discussed. The rule of thumb is: if new[] was used, also use delete[].

7.1.3: Enlarging arrays

Once allocated, all arrays are arrays of fixed size. There is no simple way to enlarge or shrink arrays: there is no renew operator. In this section an example is given showing how to enlarge an array. Enlarging arrays is only possible with dynamic arrays. Local and global arrays cannot be enlarged. When an array must be enlarged, the following procedure can be used: allocate a new block of memory of the larger size; copy the old array's contents to the new array; delete the old array (see section 7.1.2); and let the pointer to the old array point to the newly allocated array. The following example focuses on the enlargement of an array of string objects:
    #include <string>
    using namespace std;

    string *enlarge(string *old, unsigned oldsize, unsigned newsize)
    {
        string *tmp = new string[newsize];  // allocate larger array

        for (unsigned idx = 0; idx < oldsize; ++idx)
            tmp[idx] = old[idx];            // copy old to tmp

        delete[] old;                       // using [] due to objects

        return tmp;                         // return new array
    }

    int main()
    {
        string *arr = new string[4];        // initially: array of 4 strings

        arr = enlarge(arr, 4, 6);           // enlarge arr to 6 elements.

        delete[] arr;                       // release the enlarged array
    }

7.1.4: The `placement new' operator

Although normally there should be an operator delete call for every call to operator new, there is a notable exception to that rule. It is called the placement new operator.

In this variant of operator new the operator accepts a block of memory and initializes it using a constructor of choice. The block of memory should of course be large enough to contain the object, and it should be suitably aligned for the object's type.

The placement new operator uses the following syntax (using Type to indicate the data type that is used):

    Type *new(void *memory) Type(arguments);
Here, memory is a block of memory at least sizeof(Type) bytes large (usually memory will point to an array of characters), and Type(arguments) is any constructor of the class Type.

The placement new operator comes in handy when the memory to contain one or more objects is already available or when its size is known beforehand. The memory could have been statically or dynamically allocated; when allocated dynamically the appropriate destructor must eventually be called to destroy the objects and the block of memory.

The question of how to call the destructors of objects initialized using the placement new operator is an interesting one:

It should come as no surprise that the object's destructors aren't called (assuming a dynamically allocated substrate for the objects). After all, the memory on which the objects are defined is known to the run-time system as a block of (e.g.) characters, for which no destructors are defined. Even calling delete on the pointer returned by the placement new operator doesn't help: it's still the same memory address as the original block of characters.

So, how then can the destructors of objects initialized by the placement new operator be called? The answer may be surprising: they must be called explicitly. Here is an example:

#include <iostream>
#include <new>
using namespace std;

class Object
{
    public:
        Object();
        ~Object();
};
inline Object::Object()
{
    cout << "Constructor\n";
}
inline Object::~Object()
{
    cout << "Destructor\n";
}

int main()
{
    char buffer[2 * sizeof(Object)];

    Object *obj = new(buffer) Object;       // placement new, 1st object
    new(buffer + sizeof(Object)) Object;    // placement new, 2nd object

    // delete obj;                          // DON'T DO THIS

    obj[0].~Object();                       // destroy 1st object
    obj[1].~Object();                       // destroy 2nd object
}

// Displays:
//  Constructor
//  Constructor
//  Destructor
//  Destructor

It's clearly dangerous to use the placement new operator: object destruction is not guaranteed, and since destruction must be performed manually there's no guarantee that object destruction takes place in the reverse order of object construction.

If in the above example buffer would have been allocated dynamically, a final statement

    delete [] buffer;
should be added to main. It would merely return buffer's allocated memory to the common pool, without calling any object destructor whatsoever.
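
The dynamically allocated variant can be sketched as follows. The class Counter and the function placementDemo() are hypothetical names, used only to make the number of live objects observable:

```cpp
#include <cassert>
#include <new>

class Counter                       // hypothetical: counts live objects
{
    public:
        static int s_alive;
        Counter()   { ++s_alive; }
        ~Counter()  { --s_alive; }
};
int Counter::s_alive = 0;

// returns the number of live Counter objects after the cleanup
int placementDemo()
{
    char *buffer = new char[2 * sizeof(Counter)];   // raw memory

    Counter *first = new(buffer) Counter;
    Counter *second = new(buffer + sizeof(Counter)) Counter;
    assert(Counter::s_alive == 2);

    second->~Counter();             // explicit destructor calls, here in
    first->~Counter();              // reverse order of construction

    delete[] buffer;                // returns the memory, calls no destructor
    return Counter::s_alive;
}
```

After the two explicit destructor calls no Counter objects remain, so placementDemo() is expected to return 0.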

7.2: The destructor

Comparable to the constructor, classes may define a destructor. This function is the opposite of the constructor in the sense that it is invoked when an object ceases to exist. For objects which are local non-static variables, the destructor is called when the block in which the object is defined is left: the destructors of objects that are defined in nested blocks of functions are therefore usually called before the function itself terminates. The destructors of objects that are defined somewhere in the outer block of a function are called just before the function returns (terminates). For static or global variables the destructor is called before the program terminates.

However, when a program is interrupted using an exit() call, the destructors are called only for global objects existing at that time. Destructors of objects defined locally within functions are not called when a program is forcefully terminated using exit().
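
The order in which destructors of local objects are called can be made visible with a small sketch. The class Tracer and the global g_log are hypothetical: each destructor appends its object's identifying character to g_log.

```cpp
#include <cassert>
#include <string>

std::string g_log;                  // collects the destructor calls

class Tracer                        // hypothetical class
{
    char d_id;

    public:
        Tracer(char id) : d_id(id)  {}
        ~Tracer()                   { g_log += d_id; }
};

void scopes()
{
    Tracer outer('o');
    {
        Tracer inner('i');
    }                               // inner is destroyed here
    g_log += '|';
}                                   // outer is destroyed here

std::string destructionOrder()
{
    g_log.clear();
    scopes();
    return g_log;                   // "i|o"
}
```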

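The definition of a destructor must obey the following rules: the destructor has the same name as its class, prefixed by a tilde (~); and it has no arguments and no return value.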

The destructor for the class Person is thus declared as follows:
    class Person
    {
        public:
            Person();               // constructor
            ~Person();              // destructor
    };
The position of the constructor(s) and destructor in the class definition is dictated by convention: first the constructors are declared, then the destructor, and only then other members are declared.

The main task of a destructor is to make sure that memory allocated by the object (e.g., by its constructor) is properly deleted when the object goes out of scope. Consider the following definition of the class Person:

    class Person
    {
        char *d_name;
        char *d_address;
        char *d_phone;

        public:
            Person();
            Person(char const *name, char const *address,
                   char const *phone);
            ~Person();

            char const *name() const;
            char const *address() const;
            char const *phone() const;
    };

    inline Person::Person()
    :
        d_name(0),                  // no memory allocated yet
        d_address(0),
        d_phone(0)
    {}

/*
    person.ih contains:

    #include "person.h"
    char *strdupnew(char const *str);
*/
The task of the constructor is to initialize the data fields of the object. E.g., the constructor expecting three arguments is defined as follows:
    #include "person.ih"

    Person::Person(char const *name, char const *address, char const *phone)
    :
        d_name(strdupnew(name)),
        d_address(strdupnew(address)),
        d_phone(strdupnew(phone))
    {}
In this class the destructor is necessary to prevent memory allocated for the fields d_name, d_address and d_phone from becoming unreachable when an object ceases to exist, which would produce a memory leak. The destructor of an object is called automatically when the object goes out of scope or, for a dynamically allocated object, when it is deleted. Since it is the task of the destructor to delete all memory that was dynamically allocated and used by the object, the task of Person's destructor is to delete the memory to which its three data members point. As that memory was allocated by new[] (in strdupnew()), delete[] is used. The implementation of the destructor would therefore be:
    #include "person.ih"

    Person::~Person()
    {
        delete[] d_name;
        delete[] d_address;
        delete[] d_phone;
    }
In the following example a Person object is created, and its data fields are printed. After this the showPerson() function stops, resulting in the deletion of memory. Note that in this example a second object of the class Person is created and destroyed dynamically by, respectively, the operators new and delete.
    #include "person.h"
    #include <iostream>
    using namespace std;

    void showPerson()
    {
        Person karel("Karel", "Marskramerstraat", "038 420 1971");
        Person *frank = new Person("Frank", "Oostumerweg", "050 403 2223");

        cout << karel.name()     << ", " <<
                karel.address()  << ", " <<
                karel.phone()    << endl <<
                frank->name()    << ", " <<
                frank->address() << ", " <<
                frank->phone()   << endl;

        delete frank;
    }
The memory occupied by the object karel is deleted automatically when showPerson() terminates: the C++ compiler makes sure that the destructor is called. Note, however, that the object pointed to by frank is handled differently. The variable frank is a pointer, and a pointer variable is itself no Person. Therefore, before showPerson() terminates, the memory occupied by the object pointed to by frank should be explicitly deleted; hence the statement delete frank. The operator delete will make sure that the destructor is called, thereby deleting the three char * strings of the object.

7.2.1: New and delete and object pointers

The operators new and delete are used when an object of a given class is allocated. As we have seen, one of the advantages of the operators new and delete over functions like malloc() and free() is that new and delete call the corresponding constructors and destructors. This is illustrated in the next example:
    Person *pp = new Person();  // ptr to Person object

    delete pp;                  // now destroyed
The allocation of a new Person object pointed to by pp is a two-step process. First, the memory for the object itself is allocated. Second, the constructor is called, initializing the object. In the above example the constructor is the argument-free version; it is however also possible to use a constructor having arguments:
    frank = new Person("Frank", "Oostumerweg", "050 403 2223");
    delete frank;
Note that, analogously to the construction of an object, the destruction is also a two-step process: first, the destructor of the class is called to delete the memory allocated and used by the object; then the memory which is used by the object itself is freed.

Dynamically allocated arrays of objects can also be manipulated by new and delete. In this case the size of the array is given between the [] when the array is created:

    Person *personarray = new Person [10];
The compiler will generate code to call the default constructor for each object which is created. As we have seen in section 7.1.2, the delete[] operator must be used here to destroy such an array in the proper way:
    delete[] personarray;
The presence of the [] ensures that the destructor is called for each object in the array.

What happens if delete rather than delete[] is used? Consider the following situation, in which the destructor ~Person() is modified so that it will tell us that it's called. In a main() function an array of two Person objects is allocated by new, to be deleted by delete []. Next, the same actions are repeated, albeit that the delete operator is called without []:

    #include <iostream>
    #include "person.h"
    using namespace std;

    Person::~Person()
    {
        cout << "Person destructor called" << endl;
    }

    int main()
    {
        Person *a  = new Person[2];

        cout << "Destruction with []'s" << endl;
        delete[] a;

        a = new Person[2];

        cout << "Destruction without []'s" << endl;
        delete a;

        return 0;
    }
/*
    Generated output:
Destruction with []'s
Person destructor called
Person destructor called
Destruction without []'s
Person destructor called
*/
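Looking at the generated output, we see that the destructors of the individual Person objects are called if the delete[] syntax is followed, while only the first object's destructor is called if the [] is omitted. Note that this output merely shows what happened with the compiler used here: formally, using delete on memory allocated by new[] has undefined behavior, so other results (including a crash) are possible as well.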

If no destructor is defined, it is not called. This may seem to be a trivial statement, but it has severe implications: objects which allocate memory will result in a memory leak when no destructor is defined. Consider the following program:

    #include <iostream>
    #include "person.h"
    using namespace std;

    Person::~Person()
    {
        cout << "Person destructor called" << endl;
    }

    int main()
    {
        Person **a = new Person* [2];

        a[0] = new Person[2];
        a[1] = new Person[2];

        delete[] a;

        return 0;
    }
This program produces no output at all. Why is this? The variable a is defined as a pointer to a pointer. For this situation, however, there is no defined destructor. Consequently, the [] is ignored.

Now, as the [] is ignored, only the array a itself is deleted, because here `delete[] a' deletes the memory pointed to by a. That's all there is to it.

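Of course, we don't want this, but require the Person objects pointed to by the elements of a to be deleted too. In this case we have two options: either the program itself visits all elements of a, deleting each of them in turn using delete[] before a itself is deleted; or the pointers are wrapped in a class whose destructor performs that deletion.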
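
Visiting the elements of the pointer array and deleting each of them before deleting the array itself can be sketched as follows. The class Item is a hypothetical stand-in for Person; it merely counts its destructor calls:

```cpp
#include <cassert>
#include <cstddef>

class Item                          // hypothetical stand-in for Person
{
    public:
        static int s_destroyed;
        ~Item()                     { ++s_destroyed; }
};
int Item::s_destroyed = 0;

// returns the number of destructor calls
int destroyAll()
{
    std::size_t const rows = 2;
    Item **a = new Item *[rows];

    Item::s_destroyed = 0;

    for (std::size_t idx = 0; idx < rows; ++idx)
        a[idx] = new Item[2];

    for (std::size_t idx = 0; idx < rows; ++idx)
        delete[] a[idx];            // destroys the two Items of this row

    delete[] a;                     // releases the array of pointers
    assert(Item::s_destroyed == 4);

    return Item::s_destroyed;
}
```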

7.2.2: The function set_new_handler()

The C++ run-time system makes sure that when memory allocation fails, an error function is activated. By default this function throws a bad_alloc exception (see section 8.10), terminating the program. Consequently, in the default case it is never necessary to check the return value of the operator new. This default behavior may be modified in various ways. One way is to redefine the function handling failing memory allocation. However, any user-defined function must comply with the following prerequisites: it has no parameters, and it has no return value (its return type is void).

The redefined error function might, e.g., print a message and terminate the program. The user-written error function becomes part of the allocation system through the function set_new_handler().

The implementation of an error function is illustrated below (this implementation applies to the Gnu C/C++ requirements. Actually trying out the program given in the example is not encouraged, as it will slow down the computer enormously due to the resulting use of the operating system's swap area):

    #include <iostream>
    #include <cstdlib>
    #include <cstring>
    #include <new>
    using namespace std;

    void outOfMemory()
    {
        cout << "Memory exhausted. Program terminates." << endl;
        exit(1);
    }

    int main()
    {
        long allocated = 0;

        set_new_handler(outOfMemory);       // install error function

        while (true)                        // eat up all memory
        {
            memset(new int [100000], 0, 100000 * sizeof(int));
            allocated += 100000 * sizeof(int);
            cout << "Allocated " << allocated << " bytes\n";
        }
    }
After installing the error function it is automatically invoked when memory allocation fails, and the program exits. Note that memory allocation may fail in indirectly called code as well, e.g., when constructing or using streams or when strings are duplicated by low-level functions.

So much for the theory. On some systems the `out of memory' condition may actually never be reached, as the operating system may interfere before the run-time support system gets a chance to stop the program.

Note that it may not be assumed that the standard C functions which allocate memory, such as strdup(), malloc(), realloc() etc. will trigger the new handler when memory allocation fails. This means that once a new handler is installed, such functions should not automatically be used in an unprotected way in a C++ program. An example using new to duplicate a string, was given in a rewrite of the function strdup() (see section 7).
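
The activation of a new handler can also be observed without exhausting memory. The following is a sketch only: it assumes an implementation on which a request for half the addressable memory fails at once, and the names reportAndThrow() and handlerIsInvoked() are hypothetical. The handler records that it was called and then throws bad_alloc, so that the failing allocation ends instead of being retried.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

static bool handlerCalled = false;

void reportAndThrow()               // hypothetical new handler
{
    handlerCalled = true;
    throw std::bad_alloc();         // give up rather than retry
}

// true if the handler was activated by a failing allocation
bool handlerIsInvoked()
{
    std::new_handler old = std::set_new_handler(reportAndThrow);
    handlerCalled = false;

    try
    {
        // assumption: a request of this size cannot be satisfied
        std::size_t huge = static_cast<std::size_t>(-1) / 2;
        void *mem = ::operator new(huge);
        ::operator delete(mem);     // only reached if allocation succeeded
    }
    catch (std::bad_alloc const &)
    {}

    std::set_new_handler(old);      // restore the previous handler
    return handlerCalled;
}
```

A handler that neither frees memory, throws, nor terminates the program would be called again and again, which is why the sketch throws.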

7.3: The assignment operator

Variables which are structs or classes can be directly assigned in C++ in the same way that structs can be assigned in C. The default action of such an assignment for non-class type data members is a straight byte-by-byte copy from one data member to another. Now consider the consequences of this default action in a function such as the following:
    void printperson(Person const &p)
    {
        Person tmp;

        tmp = p;
        cout << "Name:     " << tmp.name()       << endl <<
                "Address:  " << tmp.address()    << endl <<
                "Phone:    " << tmp.phone()      << endl;
    }
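We shall follow the execution of this function step by step. First, the function defines the local object tmp, whose default constructor is called. Next, p is assigned to tmp: by default the data members are copied bytewise, so tmp's pointer members now point to the same memory as p's pointer members. Finally, when printperson() terminates, tmp is destroyed: its destructor deletes the memory pointed to by its pointer members, which is the memory still in use by p. Having executed printperson(), the object which was referenced by p now contains pointers to deleted memory.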

This situation is undoubtedly not a desired effect of a function like the above. The deleted memory will likely become occupied during subsequent allocations: the pointer members of p have effectively become wild pointers, as they don't point to allocated memory anymore. In general it can be concluded that

every class containing pointer data members is a potential candidate for trouble.
Fortunately, it is possible to prevent these troubles, as discussed in the next section.
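
That the default assignment copies pointers rather than the memory they point to can be shown with a sketch. The struct Holder is hypothetical; it deliberately has no destructor, so the shared memory is deleted only once:

```cpp
#include <cassert>
#include <cstring>

struct Holder                       // hypothetical: no constructor or destructor
{
    char *d_text;
};

// true if default assignment leaves both objects pointing at the same memory
bool sharesMemory()
{
    Holder first;
    first.d_text = std::strcpy(new char[6], "hello");

    Holder second;
    second = first;                 // default assignment: copies the pointer

    bool same = first.d_text == second.d_text;
    assert(std::strcmp(second.d_text, "hello") == 0);

    delete[] first.d_text;          // second.d_text is now a wild pointer
    return same;
}
```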

7.3.1: Overloading the assignment operator

Obviously, the right way to assign one Person object to another, is not to copy the contents of the object bytewise. A better way is to make an equivalent object: one with its own allocated memory, but which contains the same strings.

The `right' way to duplicate a Person object is illustrated in Figure 6.

Figure 6: Private data and public interface functions of the class Person, using the `correct' assignment.


There are several ways to duplicate a Person object. One way would be to define a special member function to handle assignments of objects of the class Person. The purpose of this member function would be to create a copy of an object, but one with its own name, address and phone strings. Such a member function might be:

    void Person::assign(Person const &other)
    {
        // delete our own previously used memory
        delete[] d_name;                // allocated by new[]: use delete[]
        delete[] d_address;
        delete[] d_phone;

        // now copy the other Person's data
        d_name = strdupnew(other.d_name);
        d_address = strdupnew(other.d_address);
        d_phone = strdupnew(other.d_phone);
    }
Using this tool we could rewrite the offending function printperson():
    void printperson(Person const &p)
    {
        Person tmp;

        // make tmp a copy of p, but with its own allocated memory
        tmp.assign(p);

        cout << "Name:     " << tmp.name()       << endl <<
                "Address:  " << tmp.address()    << endl <<
                "Phone:    " << tmp.phone()      << endl;

        // now it doesn't matter that tmp gets destroyed..
    }
By itself this solution is valid, although it is a purely symptomatic solution. This solution requires the programmer to use a specific member function instead of the operator =. The basic problem, however, remains if this rule is not strictly adhered to. Experience teaches that errare humanum est: a solution which doesn't enforce special actions is therefore preferable.

The problem of the assignment operator is solved using operator overloading: the syntactic possibility C++ offers to redefine the actions of an operator in a given context. Operator overloading was mentioned earlier, when the operators << and >> were redefined to be used with streams (like cin, cout and cerr), see section 3.1.2.

Overloading the assignment operator is probably the most common form of operator overloading. However, a word of warning is appropriate: the fact that C++ allows operator overloading does not mean that this feature should be used at all times. A few rules are:

Using these rules, operator overloading is minimized which helps keep source files readable. An operator simply does what it is designed to do. Therefore, I consider overloading the insertion (<<) and extraction (>>) operators in the context of streams ill-chosen: the stream operations do not have anything in common with the bitwise shift operations.

7.3.1.1: The member `operator=()'

To achieve operator overloading in the context of a class, the class is simply expanded with a (usually public) member function naming the particular operator. That member function is thereupon defined.

For example, to overload the assignment operator =, a function operator=() must be defined. Note that the function name consists of two parts: the keyword operator, followed by the operator itself. When we augment a class interface with a member function operator=(), then that operator is redefined for the class, which prevents the default operator from being used. Previously (in section 7.3.1) the function assign() was offered to solve the memory-problems resulting from using the default assignment operator. However, instead of using an ordinary member function it is much more common in C++ to define a dedicated operator for these special cases. So, the earlier assign() member may be redefined as follows (note that the member operator=() presented below is a first, rather unsophisticated, version of the overloaded assignment operator. It will be improved shortly):

    class Person
    {
        public:                             // extension of the class Person
                                            // earlier members are assumed.
            void operator=(Person const &other);
    };
and its implementation could be
    void Person::operator=(Person const &other)
    {
        delete[] d_name;                    // delete old data
        delete[] d_address;
        delete[] d_phone;

        d_name = strdupnew(other.d_name);   // duplicate other's data
        d_address = strdupnew(other.d_address);
        d_phone = strdupnew(other.d_phone);
    }
The actions of this member function are similar to those of the previously proposed function assign(), but now its name ensures that this function is also activated when the assignment operator = is used. There are actually two ways to call overloaded operators:
    Person pers("Frank", "Oostumerweg", "403 2223");
    Person copy;

    copy = pers;                // first possibility
    copy.operator=(pers);       // second possibility
Actually, the second possibility, explicitly calling operator=(), is not used very often. However, the code fragment does illustrate two ways to call the same overloaded operator member function.

7.4: The `this' pointer

As we have seen, a member function of a given class is always called in the context of some object of the class. There is always an implicit `substrate' for the function to act on. C++ defines a keyword, this, to address this substrate (note that this is not available in the not yet discussed static member functions).

The this keyword is a pointer variable, which always contains the address of the object in question. The this pointer is implicitly declared in each member function (whether public, protected, or private). Therefore, it is as if each member function of the class Person contains the following declaration:

    extern Person *const this;
A member function like name(), which returns the name field of a Person, could therefore be implemented in two ways: with or without the this pointer:
    char const *Person::name() const    // implicit usage of `this'
    {
        return d_name;
    }

    char const *Person::name() const    // explicit usage of `this'
    {
        return this->d_name;
    }
The this pointer is not frequently used explicitly. However, situations do exist where the this pointer is actually required (cf. chapter 15).
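
That this holds the address of the object for which a member function is called can be verified with a sketch. The class Probe and the function thisIsAddress() are hypothetical:

```cpp
#include <cassert>

class Probe                         // hypothetical class
{
    int d_value;

    public:
        Probe(int value) : d_value(value)   {}
        Probe const *self() const           { return this; }
        int value() const                   { return this->d_value; }
};

// true if `this' inside a member equals the address of the calling object
bool thisIsAddress()
{
    Probe probe(12);

    assert(probe.value() == 12);
    return probe.self() == &probe;
}
```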

7.4.1: Preventing self-destruction using `this'

As we have seen, the operator = can be redefined for the class Person in such a way that two objects of the class can be assigned, resulting in two copies of the same object.

As long as the two variables are different ones, the previously presented version of the function operator=() will behave properly: the memory of the assigned object is released, after which it is allocated again to hold new strings. However, when an object is assigned to itself (which is called auto-assignment), a problem occurs: the allocated strings of the receiving object are first deleted, resulting in the deletion of the memory of the right-hand side variable, which we call self-destruction. An example of this situation is illustrated here:

    void fubar(Person &p)
    {
        p = p;          // auto-assignment!
    }
In this example it is perfectly clear that something unnecessary, possibly even wrong, is happening. But auto-assignment can also occur in more hidden forms:
    Person one;
    Person two;
    Person *pp = &one;

    *pp = two;
    one = *pp;
The problem of auto-assignment can be solved using the this pointer. In the overloaded assignment operator function we simply test whether the address of the right-hand side object is the same as the address of the current object: if so, no action needs to be taken. The definition of the function operator=() thus becomes:
    void Person::operator=(Person const &other)
    {
        // only take action if address of the current object
        // (this) is NOT equal to the address of the other object

        if (this != &other)
        {
            delete[] d_name;        // strdupnew() allocates arrays,
            delete[] d_address;     // so delete[] must be used
            delete[] d_phone;

            d_name = strdupnew(other.d_name);
            d_address = strdupnew(other.d_address);
            d_phone = strdupnew(other.d_phone);
        }
    }
This is the second version of the overloaded assignment function. One still better version remains to be discussed.

As a subtlety, note the usage of the address operator '&' in the statement

    if (this != &other)
The variable this is a pointer to the `current' object, while other is a reference, which is an `alias' for an actual Person object. The address of the other object is therefore &other, while the address of the current object is this.
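The effect of the test can be tried out in isolation. The following is a minimal sketch, assuming a hypothetical class Text that owns a single C string; strdupnew() is written here as we assume it was proposed at the start of this chapter:

```cpp
#include <cstring>

// Assumed implementation of the strdupnew() function proposed earlier.
char *strdupnew(char const *str)
{
    return std::strcpy(new char[std::strlen(str) + 1], str);
}

// Hypothetical class owning one C string.
class Text
{
    char *d_text;

    public:
        Text(char const *text)
        :
            d_text(strdupnew(text))
        {}

        Text(Text const &other)
        :
            d_text(strdupnew(other.d_text))
        {}

        ~Text()
        {
            delete[] d_text;
        }

        void operator=(Text const &other)
        {
            if (this != &other)         // skip auto-assignment
            {
                delete[] d_text;
                d_text = strdupnew(other.d_text);
            }
        }

        char const *text() const
        {
            return d_text;
        }
};

// Performs a hidden auto-assignment through a pointer and reports
// whether the object's contents survived.
bool survivesAutoAssignment()
{
    Text one("Frank");
    Text *pp = &one;

    one = *pp;                          // auto-assignment in disguise
    return std::strcmp(one.text(), "Frank") == 0;
}
```

Without the test, the assignment would delete d_text and then try to duplicate the memory that was just deleted.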

7.4.2: Associativity of operators and this

According to C++'s syntax, the assignment operator associates from right to left. I.e., in statements like:
    a = b = c;
the expression b = c is evaluated first, and the result is assigned to a.

So far, the implementation of the overloaded assignment operator does not permit such constructions, as an assignment using the member function returns nothing (void). We can therefore conclude that the previous implementation does solve an allocation problem, but concatenated assignments are still not allowed.

The problem can be illustrated as follows. When we rewrite the expression a = b = c to the form which explicitly mentions the overloaded assignment member functions, we get:

        a.operator=(b.operator=(c));
This variant does not compile: the sub-expression b.operator=(c) yields void, and the class Person contains no member function with the prototype operator=(void).

This problem too can be remedied using the this pointer. The overloaded assignment function expects as its argument a reference to a Person object. It can also return a reference to such an object. This reference can then be used as an argument in a concatenated assignment.

It is customary to let the overloaded assignment return a reference to the current object (i.e., *this). The (final) version of the overloaded assignment operator for the class Person thus becomes:

    Person &Person::operator=(Person const &other)
    {
        if (this != &other)
        {
            delete[] d_address;     // strdupnew() allocates arrays,
            delete[] d_name;        // so delete[] must be used
            delete[] d_phone;

            d_address = strdupnew(other.d_address);
            d_name = strdupnew(other.d_name);
            d_phone = strdupnew(other.d_phone);
        }
        // return current object. The compiler will make sure
        // that a reference is returned
        return *this;
    }
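
That returning *this indeed permits concatenated assignments can be illustrated with a reduced sketch. The class Number is a hypothetical stand-in that merely stores an int, so that the example stays free of memory management:

```cpp
// Hypothetical class, introduced for illustration only.
class Number
{
    int d_value;

    public:
        Number(int value = 0)
        :
            d_value(value)
        {}

        Number &operator=(Number const &other)
        {
            if (this != &other)
                d_value = other.d_value;
            return *this;           // allows a = b = c
        }

        int value() const
        {
            return d_value;
        }
};

// Assigns c to b, and the result of that assignment to a.
int chained()
{
    Number a;
    Number b;
    Number c(42);

    a = b = c;                      // a.operator=(b.operator=(c))
    return a.value() + b.value();
}
```

Since the returned reference refers to the left-hand side object itself, no extra copy is made when the result is passed on to the next assignment.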

7.5: The copy constructor: initialization vs. assignment

In the following sections we shall take a closer look at another usage of the operator =. Consider, once again, the class Person. The class has the following characteristics: Now consider the following code fragment. The statement references are discussed following the example:
    Person karel("Karel", "Marskramerstraat", "038 420 1971"); // see (1)
    Person karel2;                                             // see (2)
    Person karel3 = karel;                                     // see (3)

    int main()
    {
        karel2 = karel3;                                    // see (4)
        return 0;
    }
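
Which member is used by each of the four kinds of statements can be made visible with a small tracing class. This is a sketch under assumptions: the class Tracer, its static member s_last and the function trace() are hypothetical, and a constructor taking one char const * stands in for Person's three-argument constructor:

```cpp
#include <string>

// Hypothetical class that records which special member was called last.
class Tracer
{
    public:
        static std::string s_last;

        Tracer()
        {
            s_last = "default constructor";
        }

        Tracer(char const *)
        {
            s_last = "constructor(char const *)";
        }

        Tracer(Tracer const &)
        {
            s_last = "copy constructor";
        }

        Tracer &operator=(Tracer const &)
        {
            s_last = "assignment operator";
            return *this;
        }
};

std::string Tracer::s_last;

// Returns the member called by each of the four kinds of statement.
std::string trace()
{
    std::string log;

    Tracer one("Karel");        log += Tracer::s_last + '\n';  // cf. (1)
    Tracer two;                 log += Tracer::s_last + '\n';  // cf. (2)
    Tracer three = one;         log += Tracer::s_last + '\n';  // cf. (3)
    two = three;                log += Tracer::s_last + '\n';  // cf. (4)

    return log;
}
```

Note that the third statement uses the = sign, but it creates a new object and therefore calls a constructor, not the assignment operator.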

The simple rule emanating from these examples is that whenever an object is created, a constructor is needed. All constructors have the following characteristics:

Therefore, we conclude that, given the above statement (3), the class Person must be augmented with a copy constructor:
    class Person
    {
        public:
            Person(Person const &other);
    };
The implementation of the Person copy constructor is:
    Person::Person(Person const &other)
    {
        d_name    = strdupnew(other.d_name);
        d_address = strdupnew(other.d_address);
        d_phone   = strdupnew(other.d_phone);
    }
The actions of copy constructors are comparable to those of the overloaded assignment operators: an object is duplicated, so that it will contain its own allocated data. The copy constructor, however, is simpler in two respects: it never has to delete previously allocated memory, and it never has to test for auto-assignment, since the object under construction is new.

Apart from the above mentioned quite obvious usage of the copy constructor, the copy constructor has other important tasks. All of these tasks are related to the fact that the copy constructor is always called when an object is initialized using another object of its class. The copy constructor is called even when this new object is a hidden or temporary variable.

Copy constructors are not called in all situations, though. To demonstrate this, consider a function person(), returning a Person object, written in the following form:
    Person person()
    {
        string name;
        string address;
        string phone;

        cin >> name >> address >> phone;

        return Person(name.c_str(), address.c_str(), phone.c_str());
    }
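This code fragment is perfectly valid, and illustrates the use of an anonymous object: the Person object is constructed directly in the return statement and never receives a name. Anonymous objects are best treated as constant objects: they cease to exist once the expression using them has been evaluated, so modifying them is rarely meaningful, and they cannot be bound to non-const references. The use of an anonymous object in the above example illustrates that object return values should be considered constant objects, even though the keyword const is not explicitly mentioned in the return type of the function (as in Person const person()).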

As another example, once again assuming the availability of a Person(char const *name) constructor, consider:

    Person namedPerson()
    {
        string name;

        cin >> name;
        return name.c_str();
    }
Here, even though the type of the return expression name.c_str() doesn't match the return type Person, there is a constructor available to construct a Person from a char const *. Since such a constructor is available, the (anonymous) return value can be constructed by implicitly converting the char const * value to a Person object using that constructor.

Contrary to the situation we encountered with the default constructor, the default copy constructor remains available once a constructor (any constructor) is defined explicitly. The copy constructor can be redefined, but if not, then the default copy constructor will still be available when another constructor is defined.

7.5.1: Similarities between the copy constructor and operator=()

The similarities between the copy constructor and the overloaded assignment operator are reinvestigated in this section. We present here two primitive functions which often occur in our code, and which we think are quite useful. Note the following features of copy constructors, overloaded assignment operators, and destructors: the duplication of an object's data occurs in the copy constructor and in the overloaded assignment operator, while the deletion of allocated memory occurs in the overloaded assignment operator and in the destructor.

The above two actions (duplication and deletion) can be implemented in two private functions, say copy() and destroy(), which are used in the overloaded assignment operator, the copy constructor, and the destructor. When we apply this method to the class Person, its copy constructor merely calls copy(), its destructor merely calls destroy(), and its overloaded assignment operator calls destroy() followed by copy(), unless an auto-assignment is detected.

What we like about this approach is that the destructor, copy constructor and overloaded assignment functions are now completely standard: they are independent of a particular class, and their implementations can therefore be used in every class. Any class dependencies are reduced to the implementations of the private member functions copy() and destroy().
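
The approach could be realized for the class Person along the following lines. This is a sketch, not necessarily the implementation used elsewhere in the Annotations: strdupnew() is written as we assume it was proposed at the start of this chapter, and only the members needed for the illustration are shown:

```cpp
#include <cstring>

// Assumed implementation of the strdupnew() function proposed earlier.
char *strdupnew(char const *str)
{
    return std::strcpy(new char[std::strlen(str) + 1], str);
}

class Person
{
    char *d_name;
    char *d_address;
    char *d_phone;

    public:
        Person(char const *name, char const *address, char const *phone)
        :
            d_name(strdupnew(name)),
            d_address(strdupnew(address)),
            d_phone(strdupnew(phone))
        {}

        Person(Person const &other)
        {
            copy(other);
        }

        ~Person()
        {
            destroy();
        }

        Person &operator=(Person const &other)
        {
            if (this != &other)
            {
                destroy();
                copy(other);
            }
            return *this;
        }

        char const *name() const
        {
            return d_name;
        }

    private:
        void copy(Person const &other)      // class-specific duplication
        {
            d_name    = strdupnew(other.d_name);
            d_address = strdupnew(other.d_address);
            d_phone   = strdupnew(other.d_phone);
        }

        void destroy()                      // class-specific deletion
        {
            delete[] d_name;
            delete[] d_address;
            delete[] d_phone;
        }
};
```

Only copy() and destroy() mention the data members; the three public members above them could be moved to another class after merely changing the class name.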

Note that the copy() member function is responsible for the copying of the other object's data fields to the current object. We've shown the situation in which a class only has pointer data members. In most situations classes have non-pointer data members as well. These members must be copied in the copy constructor as well. This can simply be realized in the copy constructor's body, except for the initialization of reference data members, which must be initialized using the member initializer method, introduced in section 6.4.2. However, in this case the overloaded assignment operator can't be fully implemented either, as reference members cannot be given another value once initialized. An object having reference data members is inseparably attached to its referenced object(s) once it has been constructed.
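
The consequences of a reference data member can be illustrated by the following sketch. The class Observer is hypothetical, and the check uses std::is_copy_assignable from the C++11 header type_traits, which is an assumption about the available compiler rather than something the text above relies on:

```cpp
#include <type_traits>      // C++11; used only to inspect the class

// Hypothetical class with a reference data member.
class Observer
{
    int &d_target;          // bound once, at construction

    public:
        Observer(int &target)
        :
            d_target(target)            // member initializer is required
        {}

        Observer(Observer const &other)
        :
            d_target(other.d_target)    // the copy refers to the same int
        {}

        int &target() const
        {
            return d_target;
        }
};

// The compiler cannot generate an assignment operator for Observer:
// a reference cannot be made to refer to another object.
bool const observerIsAssignable =
                        std::is_copy_assignable<Observer>::value;
```

Here a copy-constructed Observer refers to the same int as the original, and observerIsAssignable is expected to be false.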

7.5.2: Preventing certain members from being used

As we've seen in the previous section, situations may be encountered in which a member function can't do its job in a completely satisfactory way. In particular: an overloaded assignment operator cannot do its job completely if its class contains reference data members. In this and comparable situations the programmer might want to prevent the (accidental) use of certain member functions. This can be realized in the following ways:
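
One commonly used technique is to declare the member in the class's private section without providing an implementation. The following sketch illustrates it for a hypothetical class Session; the check again assumes the C++11 header type_traits is available:

```cpp
#include <type_traits>      // C++11; used only to inspect the class

// Hypothetical class whose objects may be copied at construction,
// but not assigned.
class Session
{
    int d_id;

    public:
        Session(int id)
        :
            d_id(id)
        {}

        int id() const
        {
            return d_id;
        }

    private:
        // Declared but intentionally not implemented: code outside the
        // class cannot call it, and accidental use inside the class
        // results in a linker error.
        Session &operator=(Session const &other);
};

// From outside the class, Session objects are not assignable.
bool const sessionIsAssignable =
                        std::is_copy_assignable<Session>::value;
```

With compilers supporting C++11 or later, the same intent can be stated directly by declaring the member in the public section as Session &operator=(Session const &other) = delete.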

7.6: Conclusion

Two important extensions to classes have been discussed in this chapter: the overloaded assignment operator and the copy constructor. As we have seen, classes with pointer data members, addressing allocated memory, are potential sources of memory leaks. The two extensions introduced in this chapter represent the standard way to prevent these memory leaks.

The simple conclusion is therefore: classes whose objects allocate memory which is used by these objects themselves, should implement a destructor, an overloaded assignment operator and a copy constructor as well.