One of the most common questions from db4o users is: why doesn't db4o allow equals() and hashCode() to be used to identify objects in the database? At first glance this seems like a very attractive contract: let the developer decide what should be the basis for comparing objects and making them unique in the database. For example, if database identity were based on an object's field values, duplicate objects could not be stored in the database, as they would automatically be considered one object.
Yes, it looks attractive, but there is a huge pitfall: when we deal with objects, we also deal with their references to each other, which together form a unique object graph that can be very complex. Preserving these references becomes a task of storing many-to-many relationships. This task can only be solved by giving each object a unique identification in memory, and not only in the database, which means that the identification can't depend on information stored in the object (such as an aggregate of field values).
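To see this clearly, let's look at an example. Suppose we have Pilot{string name} and Car{Pilot pilot} classes whose equals methods are based on comparing field values. A car is stored together with its pilot; later the pilot's name is changed and the pilot is stored again. Under value-based identity, the renamed pilot no longer matches the pilot that the stored car refers to.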
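A minimal Java sketch of such a scenario is given here for illustration. The ValueStore class is a hypothetical stand-in for a database that recognizes objects only through equals(); it is not db4o's API, and the pilot names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class ValueIdentityDemo {

    static class Pilot {
        String name;
        Pilot(String name) { this.name = name; }

        // Equality is based purely on field values.
        @Override public boolean equals(Object o) {
            return o instanceof Pilot && Objects.equals(name, ((Pilot) o).name);
        }
        @Override public int hashCode() { return Objects.hashCode(name); }
    }

    static class Car {
        Pilot pilot;
        Car(Pilot pilot) { this.pilot = pilot; }

        @Override public boolean equals(Object o) {
            return o instanceof Car && Objects.equals(pilot, ((Car) o).pilot);
        }
        @Override public int hashCode() { return Objects.hashCode(pilot); }
    }

    /** Hypothetical store that identifies objects only via equals(). */
    static class ValueStore {
        final List<Pilot> pilots = new ArrayList<>();
        final List<Car> cars = new ArrayList<>();

        void store(Pilot p) {
            Pilot copy = new Pilot(p.name);            // persist a snapshot of the values
            if (!pilots.contains(copy)) pilots.add(copy);
        }

        void store(Car c) {
            store(c.pilot);                            // store the referenced pilot first
            Car copy = new Car(new Pilot(c.pilot.name));
            if (!cars.contains(copy)) cars.add(copy);
        }
    }

    static ValueStore renameScenario() {
        ValueStore db = new ValueStore();
        Pilot pilot = new Pilot("Michael");
        Car car = new Car(pilot);
        db.store(car);

        pilot.name = "Rubens";                         // change the in-memory pilot
        db.store(pilot);                               // intended as an update
        return db;
    }

    public static void main(String[] args) {
        ValueStore db = renameScenario();
        System.out.println("stored pilots: " + db.pilots.size());               // 2
        System.out.println("stored car's pilot: " + db.cars.get(0).pilot.name); // Michael
    }
}
```

In this sketch the store cannot tell that the second store() call was meant as an update: it ends up with two pilots, and the stored car still points at the old one.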
Now, this was a simple example, and it can be solved by updating the car object together with the pilot. But what happens if thousands of objects reference this pilot instance? They would all have to be retrieved and updated. Further, those objects may themselves be referenced from elsewhere, so a single update to a pilot object could potentially trigger a rewrite of the whole database.
Objects without identity also make Transparent Persistence and Activation impossible, as there will be no way to decide which instance is the right one for update or activation.
So unique identification of database objects in memory is unavoidable, and identity based on an object reference is the most straightforward way to get this identification.
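As a rough illustration of reference-based identity, the following Java sketch uses a hypothetical ReferenceRegistry that assigns an ID to each in-memory reference. It is an assumption made for illustration only and does not describe how db4o is implemented internally.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class ReferenceIdentityDemo {

    static class Pilot {
        String name;
        Pilot(String name) { this.name = name; }
    }

    /** Hypothetical registry: maps object references (not values) to IDs. */
    static class ReferenceRegistry {
        // IdentityHashMap compares keys with ==, never with equals().
        private final Map<Object, Long> ids = new IdentityHashMap<>();
        private long nextId = 1;

        long idFor(Object obj) {
            Long id = ids.get(obj);
            if (id == null) {                 // first time this reference is seen
                id = nextId++;
                ids.put(obj, id);
            }
            return id;
        }
    }

    public static void main(String[] args) {
        ReferenceRegistry registry = new ReferenceRegistry();

        Pilot pilot = new Pilot("Michael");
        long before = registry.idFor(pilot);

        pilot.name = "Rubens";                // field values change...
        long after = registry.idFor(pilot);   // ...but the identity does not

        Pilot other = new Pilot("Rubens");    // same values, different reference
        long otherId = registry.idFor(other);

        System.out.println(before == after);  // true
        System.out.println(after == otherId); // false
    }
}
```

Because the ID follows the reference, changing a field never changes which stored object is meant, and objects that refer to the pilot do not need to be rewritten.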