One use for introspection is building a serialization library: take an arbitrary object, write it out to disk (or the wire, etc.), and reconstruct that exact object later. In REALbasic 2008r1, you can serialize the object to disk but you cannot unserialize it. A few key components are missing before that becomes possible -- and that's what I'd like to discuss today.
For starters, you'd have to be able to create an object instance. This sounds simple, but it happens to have some interesting quirks. Plugin authors have always had access to an API call, REALnewInstance. It takes a string, or a REALclassRef, and returns a new instance of the object. However, in practice, this is a terrible idea for several reasons. First, if you pass in a string, the compiler has no way to know that the plugin relies on a given class. This means that a plugin can call REALnewInstance( "Dictionary" ), but the linker dead-strips the Dictionary class if the user doesn't make use of one within their project. Since most authors don't check whether a call to REALnewInstance fails, this leads to very hard-to-reproduce bugs in plugins. So we introduced the REALclassRef version of the API as a way for the plugin author to alert REALbasic that various classes are required. In the PluginEntry function, the author calls REALGetClassRef to get a class reference, which is the part that alerts REALbasic to the use of various classes (at compile time, which is why it needs to happen within PluginEntry). Problem solved, right?
Wrong... that only solves the dead-stripping problem. There's one more annoying issue that most people don't realize -- the object returned isn't fully initialized because no constructors have been called! The plugin author is required to look for a constructor method and invoke it manually! This is also something a lot of people forget to do, and can cause a lot of obscure issues.
So if we're going to design a way to create an object instance in REALbasic, we want to solve these kinds of issues: there should be no way to create an object via introspection that causes dead-stripping issues, and creation must always yield a fully-constructed object.
Well, for starters, let's say that there's no way to get an object instance via string -- that solves the dead-stripping problem by requiring the user to always use actual class references within their project. But we'll come back to this in a moment...
Solving the half-constructed problem is actually pretty easy -- have a way to get a list of constructor methods from a given TypeInfo object. Calling any of these constructor methods via Invoke always returns a new instance of the actual object, and calls the constructor automatically. In this way, it is impossible to get a half-constructed object, since the only way to create an object is by calling its constructor. This of course assumes that all objects have a constructor method, but that's easily abstracted away by returning a "fake" constructor method that takes no params -- it just creates the object instance, and doesn't need to call any actual constructor method since none exists.
Ok, so now we know how to make object instances. However, this is kind of silly too -- the only way to get a TypeInfo object is via an object instance itself! That means you'd have to do something like this:
dim ti as Introspection.TypeInfo = Introspection.GetType( new Class1 )
dim c as Class1 = ti.ConstructorMethods( 0 ).Invoke
If you've already had to call new Class1 just to get the TypeInfo, it becomes less interesting to get the constructor methods and invoke them, right?
So let's make this a bit more interesting -- let's define an operator which returns a TypeInfo object without requiring an object instance. Some may be tempted to say "aha, now we can make a TypeInfo from a string and create objects that way!" Please, don't give in to that temptation (yet). That still runs into the dead-stripping problem, since the linker may or may not know about the class type. Remember, the following code would be perfectly legal:
dim v as Variant = MagicalTypeInfoOperator( "Class1" ).ConstructorMethods( 0 ).Invoke
In that case, the compiler has no knowledge of the Class1 class since it'd be dead-stripped out. Instead, our magical operator needs to have a way to alert the compiler of the class reference itself. And there's already an operator which works this way: IsA. The form of IsA is: someBoolean = objectReference IsA ClassType, so obviously the compiler knows how to handle class types in that fashion (consequently, IsA is the easiest way to ensure a class isn't dead-stripped out of your project when dealing with purely generic types like variants or objects). So what if we had an operator that looked like this: someTypeInfo = MagicalTypeInfoOperator( ClassType )? That would neatly solve all of our problems, right? This would allow you to write code like this:
dim v as Variant = MagicalTypeInfoOperator( Class1 ).ConstructorMethods( 0 ).Invoke
and it wouldn't have any of the dead-stripping problems, yet it would be just as readable!
Well, hold on now! This doesn't exactly solve the original problem -- how do we unserialize an object from disk? In that case, we don't have an actual class reference handy, so all this work has been for nothing! Hah, not really... I assure you. ;-) But we do have to make some allowances, to be sure.
When you're making a disk format, or a wire format, 99% of the time you do not want the external source to make arbitrary object instances. That would be a major security hole, after all. Imagine if I were able to write out a FolderItem instance and start calling .Delete on it, all via the wire. Oops, your hard drive is gone. ;-) Generally, you know what object instances are legal to unserialize for your particular needs. Given that as a premise, the solution to the problem becomes quite easy -- wrap a dictionary with some extra functionality.
We'll define one method called: Sub LegalToCreate( ti as Introspection.TypeInfo ), and another method called: Function CreateInstance( name as String ) as Variant. The first method is the way we set up the list of items that are legal to create, and the second method is the way we create instances by name. The functions are quite simple:
Sub LegalToCreate( ti as Introspection.TypeInfo )
  // mMap is a Dictionary; register the type under its name
  mMap.Value( ti.Name ) = ti
End Sub
Function CreateInstance( name as String ) as Variant
  // Names that were never registered fall through and return nil
  dim ti as Introspection.TypeInfo = mMap.Lookup( name, nil )
  if ti <> nil then return ti.ConstructorMethods( 0 ).Invoke
End Function
So you'd use these methods like this:
LegalToCreate( MagicalTypeInfoOperator( Class1 ) )
dim v as Variant = CreateInstance( "Class1" )
Ta da! Now we've solved our original problem -- we can come up with a list of items which are legal to unserialize, and safely unserialize them from disk to create an actual instance.
Of course, all of this is purely academic right now since the constructor methods and magical type info operator do not exist. However, it's neat to think about the things you could do if they did exist, right?
I think this is precisely the problem you and I discussed when introspection was first shown in an alpha: once the bugs I ran into were fixed, you could serialize anything, but unserializing it was not as simple.
To date, I've just had every class return a serialized form of an instance and written that to disk. The reverse is a constructor that takes that same bunch of data and creates an instance from it.
Isn't returning a Variant from CreateInstance a bit wasteful? An Object type should be more efficient here, because that does not waste another object instance, whilst a Variant would be one object holding the other object reference. Right? (Just trying to understand how Variants work)
@Thomas -- actually, they'd be roughly equivalent in terms of performance. Variants are a really neat "trick" of the compiler when it comes to object references. Variants that hold objects are nothing more than thin wrappers around the object reference itself. Essentially, the compiler uses variant as a more specific type than object -- so if you've got an object instance stored in a variant, the compiler sees "variant" as the main type, and "object" as the subtype (in a fashion similar to the way the compiler sees "array" as the main type, and "object" as the subtype when working with an array of some objects). So there are no wasted object instances when a variant holds an object -- the variant and the object are one and the same.
Thanks, Aaron! That solves a long-standing mystery for me.
So I assume that RB2008r2 resolves this problem?
@Cosmo -- yes, 2008r2 has the ability to create object instances via introspection. Check out Introspection.ConstructorInfo and the GetTypeInfo operator.
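The allow-list wrapper from the post could be sketched against those names roughly as follows. This is an untested sketch under stated assumptions: that GetTypeInfo( ClassName ) returns an Introspection.TypeInfo, that TypeInfo.GetConstructors returns an array of Introspection.ConstructorInfo, and that ConstructorInfo offers GetParameters and Invoke. The class name TypeRegistry and the property mMap are made up for illustration. Instead of blindly taking constructor 0, it looks for a constructor that takes no parameters.

```realbasic
// Hypothetical class TypeRegistry with a private property: mMap as Dictionary.
// Assumed API (check the language reference for your release):
// TypeInfo.GetConstructors, ConstructorInfo.GetParameters, ConstructorInfo.Invoke.

Sub Constructor()
  mMap = new Dictionary
End Sub

Sub LegalToCreate( ti as Introspection.TypeInfo )
  // Register a type that is allowed to be created by name.
  mMap.Value( ti.Name ) = ti
End Sub

Function CreateInstance( name as String ) as Variant
  // Unregistered names yield nil rather than an arbitrary object.
  if not mMap.HasKey( name ) then return nil

  dim ti as Introspection.TypeInfo = mMap.Value( name )
  dim ctors() as Introspection.ConstructorInfo = ti.GetConstructors
  dim c as Introspection.ConstructorInfo

  // Invoke the first constructor that takes no parameters.
  for each c in ctors
    if UBound( c.GetParameters ) = -1 then return c.Invoke
  next

  // No parameterless constructor was found.
  return nil
End Function

// Usage, assuming a project class named Class1:
//   dim registry as new TypeRegistry
//   registry.LegalToCreate( GetTypeInfo( Class1 ) )
//   dim v as Variant = registry.CreateInstance( "Class1" )
```

Because the class is named in GetTypeInfo( Class1 ) rather than passed as a string, the compiler sees the reference, which is the property the post was after. Callers should still check the result for nil before using it.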