Re: DLL hell (was: Cocoa newbie frustration)
- Subject: Re: DLL hell (was: Cocoa newbie frustration)
- From: "Erik M. Buck" <email@hidden>
- Date: Tue, 9 Oct 2001 10:48:05 -0500
Related to the DLL Hell problem, C++ has uniquely fragile base classes.
Objective-C suffers from a much more minor form of the Fragile Base Class
Problem involving instance variable layout only. The fragile base class
problem makes it very difficult and ugly to build reusable frameworks and
component software with C++. Hence COM, SOM, DCOM, IDL...
Here is a short article from Be Inc. describing the Fragile Base Class
problem with C++ (Quoted without permission):
BE ENGINEERING INSIGHTS: What's the Fragile Base Class (FBC) Problem?
By Peter Potrebic
Try as we might to separate and hide our implementation of the BeOS, certain
dependencies are created the moment a developer compiles and links against
our dynamically loaded libraries. These dependencies include:
The size of objects (i.e. structs or classes)
The offsets to "visible" (public or protected) data
The existence and size of the vtable
The offsets to the virtual functions in the vtable
When your app is compiled and linked, it records all these statistics. If
any of these things changes in the library, the compiled app will no longer
run. This is the "Fragile Base Class" problem.
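To make the "recorded statistics" concrete, here is a minimal sketch (not from the Be article). The Widget struct and the v1/v2 namespaces are hypothetical stand-ins for two releases of the same library class; the constants play the role of the numbers an app bakes in when it is compiled against the first release.

```cpp
#include <cassert>   // assert, handy when exercising the sketch
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // std::int32_t

// Hypothetical library class as shipped in release 1.
namespace v1 {
struct Widget {
    std::int32_t width;
    std::int32_t height;
};
}

// The same class in release 2: a data member was inserted in front.
namespace v2 {
struct Widget {
    std::int32_t flags;   // new member
    std::int32_t width;
    std::int32_t height;
};
}

// What an app compiled against release 1 has recorded in its own code.
constexpr std::size_t kAppSize = sizeof(v1::Widget);
constexpr std::size_t kAppHeightOffset = offsetof(v1::Widget, height);

// True only if the app's recorded numbers still describe release 2.
bool LayoutStillMatches() {
    return kAppSize == sizeof(v2::Widget) &&
           kAppHeightOffset == offsetof(v2::Widget, height);
}
```

In this sketch LayoutStillMatches() returns false: both the size and the offset of "height" moved, so an app built against release 1 would read and allocate the wrong bytes.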
With the almost-in-your-hands Preview Release, we're putting a stake in the
ground: The Preview Release will be forward-compatible with subsequent
releases of the BeOS. How far into the future will this compatibility last?
Frankly, we don't know -- we'll run it out as far as we can, but if we hit a
brick wall we'll reassess our position.
To achieve the goal of forward-compatibility, we had to take certain steps.
If you're designing your own library, and if you want to be able to re-release
your library without breaking your clients' code, you may want to follow
similar steps.
The Bright Side
Before we look at our FBC solution, let's look at what CAN change without
breaking compatibility:
Non-virtual functions. A class can adopt as many new non-virtuals as it
wants. Old code won't be able to take advantage of the new functions, of
course, but you won't break anything.
New classes. New classes are permitted as long as they don't change the
inheritance hierarchy of the existing classes.
Implementation. The way that existing functions are implemented is allowed
to change. Re-implementing old functions is obviously not something to be
done lightly, but it's not going to tickle the FBC problem.
The Dark Side
Here are the matters that affect the FBC problem, and our solution for each:
* The Size of Objects Cannot Change
The "size of an object" means the cumulative size of its data members. If
more data members are added, the class will break compatibility. In the
Preview Release, we've reserved an "appropriate" amount of data in each
class:
uint32 _reserved[X];
Where "X" is determined on a class-by-class basis. If we anticipate that the
class won't ever change (BRect, for example), then we didn't add any
padding. If the object is small but it might grow, then we added a little --
maybe 25-50% of the current size. For example, the BMessenger object has 13
bytes of real data; we've padded it with 7 extra bytes. Large classes get
even more.
So what happens three months from now when the "right amount" ends up being
too little? The final "_reserved" value can be used to point to another
structure that accommodates the new data (taking into account that a pointer
and an int32 might not be the same size).
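A minimal sketch of the reserved-padding idea follows. It is an illustration only, not Be's code: Gadget, GadgetExtra, and NewSetting are hypothetical names. The pointer is copied into the tail of the reserved array with memcpy, so it works whether a pointer is 4 or 8 bytes, as long as the array is big enough.

```cpp
#include <cassert>   // assert, handy when exercising the sketch
#include <cstdint>   // std::int32_t, std::uint32_t
#include <cstring>   // std::memcpy, std::memset

// Hypothetical data that a later release needs but the class has no room for.
struct GadgetExtra {
    std::int32_t newSetting;
};

class Gadget {
public:
    Gadget();
    ~Gadget();
    Gadget(const Gadget&) = delete;             // keep the sketch simple
    Gadget& operator=(const Gadget&) = delete;

    std::int32_t Value() const;
    void SetNewSetting(std::int32_t value);     // added in the later release
    std::int32_t NewSetting() const;

private:
    GadgetExtra* Extra() const;
    void StoreExtra(GadgetExtra* extra);

    std::int32_t fValue;
    std::uint32_t _reserved[4];                 // padding shipped from day one
};

// All member functions are out of line, so clients never see the layout.
Gadget::Gadget() : fValue(0) {
    std::memset(_reserved, 0, sizeof(_reserved));
    StoreExtra(nullptr);
}

Gadget::~Gadget() { delete Extra(); }

std::int32_t Gadget::Value() const { return fValue; }

// Read the extension pointer back out of the tail of the reserved area.
GadgetExtra* Gadget::Extra() const {
    GadgetExtra* extra = nullptr;
    const unsigned char* tail =
        reinterpret_cast<const unsigned char*>(_reserved) +
        sizeof(_reserved) - sizeof(extra);
    std::memcpy(&extra, tail, sizeof(extra));
    return extra;
}

// Store the extension pointer in the tail of the reserved area.
void Gadget::StoreExtra(GadgetExtra* extra) {
    static_assert(sizeof(extra) <= sizeof(_reserved),
                  "reserved area too small to hold a pointer");
    unsigned char* tail = reinterpret_cast<unsigned char*>(_reserved) +
                          sizeof(_reserved) - sizeof(extra);
    std::memcpy(tail, &extra, sizeof(extra));
}

void Gadget::SetNewSetting(std::int32_t value) {
    GadgetExtra* extra = Extra();
    if (extra == nullptr) {
        extra = new GadgetExtra();              // allocated on first use
        StoreExtra(extra);
    }
    extra->newSetting = value;
}

std::int32_t Gadget::NewSetting() const {
    const GadgetExtra* extra = Extra();
    return extra ? extra->newSetting : 0;
}
```

The size of Gadget never changes here; the new data lives in a separately allocated structure that old clients never need to know about.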
* Offsets of Publicly Visible Data Members Cannot Change
When thinking about the FBC problem, realize that the "protected" C++ keyword
really means "public." Anything that is "protected" is publicly visible. Any
"public" or "protected" data members are fixed in stone: Their offsets and
sizes can never change.
There isn't a "solution" to this because it really isn't a problem; it's
just something you must be aware of if you're making your own library.
* Be Wary of Inline Functions
If an inline function exposes the size/offset of a private data member then
that private member is actually public: Its size and offset can never
change. We've removed all such inlines from the kits. Unless there's some
overriding performance issue, I'd recommend you do the same in your
libraries. Remember: The only safe inlines are those that only reference
public members (data or functions) or non-virtual private member functions.
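The inline hazard can be simulated as follows. This is a sketch under assumptions: Counter and the v1/v2 namespaces are hypothetical, the members are left public only so that offsetof can be used, and StaleInlinedCount() stands in for the machine code an old app would contain after the compiler inlined the release 1 accessor.

```cpp
#include <cassert>   // assert, handy when exercising the sketch
#include <cstddef>   // offsetof
#include <cstdint>   // std::int32_t
#include <cstring>   // std::memcpy

// Release 1: the accessor is inline, so fCount's offset leaks into clients.
namespace v1 {
struct Counter {
    std::int32_t fCount;
    std::int32_t Count() const { return fCount; }
};
}

// Release 2: a member was added in front and the accessor is out of line.
namespace v2 {
struct Counter {
    std::int32_t fFlags;
    std::int32_t fCount;
    std::int32_t Count() const;
};
std::int32_t Counter::Count() const { return fCount; }
}

// Simulates what an app compiled against release 1 actually executes:
// "load an int32 at the offset fCount had in release 1".
std::int32_t StaleInlinedCount(const void* object) {
    std::int32_t value;
    std::memcpy(&value,
                static_cast<const unsigned char*>(object) +
                    offsetof(v1::Counter, fCount),
                sizeof(value));
    return value;
}
```

Applied to a release 2 object, the stale inlined read returns fFlags instead of fCount, while a call through the out-of-line v2 accessor gets the right answer because the library's own code knows the new layout.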
* VTable Now for the Future
If a class/struct is *ever* going to have a vtable, it had better have one now.
Adding that first virtual function changes the size of the class, and
possibly the offsets of every single data member.
Add a dummy virtual (or a few) now or forever hold your peace. If a class
doesn't need virtual functions, then you don't need to do anything.
Even if a class already has virtual functions, you may want to add more --
do it now or never. In the Be kits, most classes have additional reserved
virtual functions; look at the beginning of any private section in a Be
header file and you'll see them:
class BWhatAPain {
public:
    ...
private:
    virtual void _ReservedWhatAPain1();
    virtual void _ReservedWhatAPain2();
    virtual void _ReservedWhatAPain3();
    ...
};
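A quick way to see why the first virtual matters and later ones do not is to compare sizes. The three structs below are hypothetical, and the results assume a mainstream ABI in which each polymorphic object carries a single hidden vtable pointer.

```cpp
#include <cassert>   // assert, handy when exercising the sketch
#include <cstdint>   // std::int32_t

// No virtual functions: the object is just its data.
struct NoVirtuals {
    std::int32_t fData;
};

// One reserved virtual: the object gains a hidden vtable pointer.
struct OneVirtual {
    virtual void ReservedSlot1();
    std::int32_t fData;
};

// A second virtual only grows the vtable, not the object.
struct TwoVirtuals {
    virtual void ReservedSlot1();
    virtual void ReservedSlot2();
    std::int32_t fData;
};

void OneVirtual::ReservedSlot1() {}
void TwoVirtuals::ReservedSlot1() {}
void TwoVirtuals::ReservedSlot2() {}

// Adding the first virtual changes the object's size.
bool FirstVirtualChangesSize() {
    return sizeof(OneVirtual) > sizeof(NoVirtuals);
}

// Adding further virtuals leaves the object's size alone.
bool ExtraVirtualKeepsSize() {
    return sizeof(TwoVirtuals) == sizeof(OneVirtual);
}
```

Both functions return true on the common ABIs, which is the reason a reserved virtual added today is cheap, while a first virtual added later breaks every compiled client.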
* The Ugly Safety Net
For some classes, it's difficult to estimate the correct number of extra
virtuals. Too many is okay, but too few can be bad.
To solve this problem, an additional "ioctl"-like virtual function can be
added to the class hierarchy for unlimited (but ugly) extensibility.
"Perform" is the name of choice for this type of function. As an example,
look in the BArchivable class:
class BArchivable {
public:
    ...
    virtual status_t Perform(uint32 d, void *arg);
};
If the function is needed, we can define "selector codes" and use the
Perform function like so:
ptr->Perform(B_SOME_ACTION, data);
It's not pretty, but it gives us room if a class runs out of dummy virtuals.
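Here is a small sketch of how such a Perform hook might be used. Everything in it is an assumption made for illustration: Archivable and View are not the Be classes, the status_t typedef and the kOK/kError values are local stand-ins, and the selector codes are invented.

```cpp
#include <cassert>   // assert, handy when exercising the sketch
#include <cstdint>   // std::int32_t, std::uint32_t

typedef std::int32_t status_t;             // local stand-in for the Be type
const status_t kOK = 0;                    // hypothetical status values
const status_t kError = -1;

const std::uint32_t kGetVersion = 1;       // hypothetical selector codes,
const std::uint32_t kResetCount = 2;       // defined after the first release

class Archivable {
public:
    virtual ~Archivable() {}
    // The one escape hatch: its vtable slot exists from the first release.
    virtual status_t Perform(std::uint32_t code, void* arg);
};

status_t Archivable::Perform(std::uint32_t code, void* arg) {
    switch (code) {
    case kGetVersion:
        if (arg == nullptr) return kError;
        *static_cast<std::int32_t*>(arg) = 2;
        return kOK;
    default:
        return kError;                     // unknown selector
    }
}

class View : public Archivable {
public:
    View() : fCount(5) {}
    status_t Perform(std::uint32_t code, void* arg) override;
    std::int32_t Count() const { return fCount; }
private:
    std::int32_t fCount;
};

status_t View::Perform(std::uint32_t code, void* arg) {
    if (code == kResetCount) {             // handled at this level
        fCount = 0;
        return kOK;
    }
    return Archivable::Perform(code, arg); // everything else goes up
}
```

New behavior is reached through a selector code rather than a new vtable entry, so the vtable layout stays untouched; the price is the untyped void* argument.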
* Public Virtual Function Order Cannot Change
The order that public virtual functions appear in the header file is set in
concrete. The Metrowerks compiler orders vtable entries based on the order
in which they appear. Virtual function order cannot be shuffled later on.
(We're lucky that entries aren't alphabetized! Think about it.)
Private virtuals, on the other hand, *can* be reordered. But that's only
because in the kits we don't define any private virtuals that can be (or
should be) overridden.
* A Dilemma
Looking at the last two items leads to an unfortunate problem. Let's say
that in a subsequent BeOS release, we want to use one of the dummy virtuals.
We can't simply move it to another part of the header file -- vtable order
is set in stone. But a function CAN be moved from private to public. As we
(at Be) need a dummy virtual, we'll simply "peel" the topmost private
function up into the public section.
For example:
class BWhatAPain {
public:
    ...
    virtual int32 NewDR10Function(... arglist ...);
private:
    virtual void _ReservedWhatAPain2();
    virtual void _ReservedWhatAPain3();
    ...
};
This is why we chose to stick the dummy virtuals at the top of the private
section.
But what if the class has a protected section that contains virtual
functions? Remember, you can't reorder your virtuals, but you can
"interleave" sections. It's not pretty...
class BAreWeHavingFunYet {
public:
    ...
protected:
    ...
    virtual int32 SomeOldProtectedVirtual();
public:
    virtual int32 NewDR10Function(... arglist ...);
private:
    virtual void _ReservedAreWeHavingFunYet2();
    virtual void _ReservedAreWeHavingFunYet3();
    ...
};
...but it works.
* Another Dilemma
There's another subtle issue dealing with overriding a virtual function.
I'll explain the problem in a moment, but first the solution: If a class
might ever need to override an inherited virtual function, it's much better
and simpler to override that function *now*.
Here's the problem. Let's say a kit declares a couple of classes thus:
class A {
public:
    virtual void X();
};
class B : public A
{ ... }; // i.e. B *doesn't* override A::X()
Now a developer creates their own class (C) that inherits from B, and
overrides the X() function as follows:
void C::X() {
    ...
    inherited::X(); // OUCH! statically resolved call to A::X()
    ...
}
The call to inherited isn't virtual. It's a statically resolved call to the
"closest" override; in this case, it resolves to A::X(). That's okay as far
as it goes -- but what if, in a subsequent release, class B *does* override
X()? The developer's code will *still* resolve inherited::X() as A::X() --
in other words, the developer will skip right over B::X().
The solution that covers all cases is to fully override all inherited
virtuals (where the implementation simply calls inherited). But that's
overkill; it could impact performance and would complicate our API. So we
applied some discretion in the kits; some classes override some functions,
others don't.
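The skipped override can be simulated in one program. This is a sketch under assumptions: OldC and NewC are hypothetical names for the same developer class compiled at two different times, and the explicit A::X and B::X calls stand in for what the compiler binds an "inherited" call to in each case.

```cpp
#include <cassert>   // assert, handy when exercising the sketch
#include <string>    // std::string

struct A {
    virtual ~A() {}
    virtual void X(std::string& trace) { trace += "A"; }
};

// The library's B as of the later release, where it *does* override X().
struct B : A {
    void X(std::string& trace) override {
        trace += "B";
        A::X(trace);
    }
};

// Developer class compiled against the earlier release, when B had no X():
// the "inherited" call was bound to A::X() at compile time and stays that way.
struct OldC : B {
    void X(std::string& trace) override {
        trace += "C";
        A::X(trace);                       // skips B::X()
    }
};

// The same source compiled when B already overrides X(), which is what
// "override it now" in the library guarantees from the start.
struct NewC : B {
    void X(std::string& trace) override {
        trace += "C";
        B::X(trace);
    }
};

// Calls X() through a base pointer and returns the order of calls.
std::string Trace(A& object) {
    std::string trace;
    object.X(trace);
    return trace;
}
```

With an OldC object the trace is "CA", so B's new behavior is silently lost; with NewC it is "CBA". Had B shipped a forwarding override in the first release, the developer's call would have been bound to B::X() all along.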
But let's say it's getting close to DR10, and we now realize that a couple
of our do-we-need-to-override guesses were wrong. There's a solution, but
it's complex, *very* ugly, and too much trouble to explain here.
* Other Miscellaneous Items
Never put an object that contains a vtable into shared memory. The vtable
contains addresses that are only valid for a particular address space.
You can't override the new/delete operators after the fact. It's now or
never.
That's all there is to it!