Overview of Quercus Robusta Library

This page requires HTML style sheets (today only IExplorer 5+, Netscape 4.7+ and Mozilla seems to support style sheets in predictable manner).

Back to the top page

Overview of Quercus Robusta Library

Content:

Inheritance and sub-typing

Containers

Iterators

Algorithms

Garbage Collection

Multi-threading

GUI

Overview

There are many various versions of notations and definitions of the terms described below. Before reading please understand that while it is perfectly valid to motivate one's definitions it is useless and counter-productive to claim that some definition is incorrect (because it is definition not statement). After all realists lost the battle to nominalists centuries ago.

While in many OO programming languages (e.g., C++, Java, Eiffel) inheritance and sub-typing are lumped together, they are quite different things. To understand this difference it is necessary to introduce some definitions:

There are some fundamental notions: object, state, behavior, message, method and value.

Object is identified sequence of values. That is object is pair (id, seq) where id is some identity (not further specified) unique among all objects and seq is mapping from ``instants of time'' to the pairs of form (m,s) where m is the methods suit and s is object's hidden internal state (which is in turn object?). That is seq:T -> M x O

Methods suite is mapping from language-dependent ``signatures'' to the methods (what is the method?).

In traditional languages there are two parts of the signature: one that is used during method lookup (name, types and names of parameters and results etc.) and another that is not: exceptions thrown, pre-post-conditions, concurrent access specifiers etc. This ``unused'' part is actually used but through separate language mechanisms: e.g., Java requires try/catch method calls in accordance with their specification etc. It would be nice to unify.

Object has some hidden state and behavior associated with this state. The only way to exhibit object's behavior is through sending messages to it. Object maps messages to methods i.e., pieces of code acting over their arguments and object's state. State of the object can change during object existence.

Value is object with it's state fixed. That is value is a pair (o,s) where o is an object and s is some possible state of the object o. Then one can look at the object as a collection of values sharing the same identity. (Does identity belong to the state?) There should be some structure imposed on this collection but in general (in the multi-threaded setting) there is no absolute time so I am not sure that this structure should be one of the linear ordering. For each object there is space of all possible values for this object (i.e., of all its possible states) determined for example by its class and object itself is just some kind of trajectory in this space.

Type system is some covering of the class of all values. That is, the type system partitions all values into possibly overlapping subsets called types. Power of the typing (i.e., existence of some type system built into the language) is evident in languages with the static type-checking. In any conventional language expressions produce values and are recursively built from simpler expressions. In the statically typed languages most basic expressions (variables, constants etc.) have types known at the compile time attached to them. Also, the statically typed language enforces some restrictions on possible ways of forming expressions such that once types of basic expressions are set, you can assign type to every valid expression (It is not that easy in practice because there are recursive types etc., so type-checker will probably need to employ some smart algorithm to check validness of some expressions but the general idea is clear.) such that every evaluation of this expression will produce value of that type. Now you can reason about the run-time behavior of the program using the information about types of values produced by expressions.

If your type-system becomes more fine-grained, that is the more types you introduce, then you get more information about run-time properties of programs; on the other hand typing will be more complex, hard to understand and programs will be harder to change. So it is necessary to make decision between complexity of type system and amount of run-time information it catches.

As it was noted earlier objects perform computations in response to the message sends. But it would be inconvenient if not impossible to build objects that can response to the each and every possible message. As a matter of fact in untyped language it is quite possible for object to receive message it doesn't understand. This doesn't necessarily mean that program will crash, to the contrary some systems (e.g., P. Curtis's LambdaMOO) make smart use of such situations to implement otherwise impossible behavior. But in general, to sensibly response to message you should understand it. Languages like Smalltalk rely on programmer discipline, documentation and external checking tools (soft-typing) to guarantee that every message send will be successfully resolved at run-time. But when building complex software with code created by multiple developers scattered around space and time one feels desire to get more robust mechanism to assert that execution will not end up with infamous

Concord.OilPressureSensor: message isSuitable:with: not understood

Especially when one happens to fly in this very Concord

The question naturally arises: can we use type systems described above to guarantee that every message send will be resolved to the method? The answer is yes. And the type-system required is remarkably clean and simple. To build it we need the notion of the interface.

Interface is a collection of message signatures. Message signature is language-dependent data that can be used to distinguish various messages. That is given some object o information in the signature S make it possible to predict the result of the lookup of any message of the form and message o.m where message m has (or conforming to) signature S. In the simplest case signature consists only of message name (selector). When language already enjoys some type system signature may include types of arguments and return values, exceptions possibly thrown during execution of method bound to this message, concurrency restrictions etc.

There is type TI naturally associated with any interface I: value (o,s) belongs to TI iff for every signature S in I and for every message m conforming to S any message send o.m( ... ) will result in successful lookup, that is iff object o can respond to all messages with signatures in I. This doesn't imply that messages with signatures in I are the only messages o can respond to.

Now if expression e has type TI than any message send in the form e.m( ... ) where m conforms to signature from I will be successfully resolved at run-time. And if moreover such ``type-safe'' sends are the only sends allowed by language than any message send in valid program will be successfully resolved and all messages will be understood.

Of course this doesn't mean that message send have to be the only operation and types of the form TI have to be the only types defined by the language.

If we limit ourselves to the language where the set of messages understood by the object doesn't depend on its state and is constant during object life-time (if for example method lookup is really implicit and is not controlled by the user) than it is obvious that we can lift interface-based typing from values to objects. That is we can assign types to the objects themselves.

In languages with types like [ 0 .. 9 ] an object may belong or not belong to such types depending on its state.

That is we have two separate but interconnected structures in the language: typing and operations. One of the purposes of typing is to assure correctness of operations. But operations themselves may be defined in terms of types. Of course it is possible to construct type-system such that any expression will execute correctly iff it is well-typed (it is enough to take all possible execution paths as types or some-such), but such type system will be patently undecidable (it will be impossible to build static type-checker).

Type-system based on interfaces guarantees that operations of some kind (implicit operation of the method lookup during the message send) will always succeed. But it also get us for free another valuable property: polymorphism by inclusion.

Method lookup during message send is implicit operation and array indexing is explicit. Interfaces give us type system expressible enough to assert the correctness of the method lookup. One can imagine type system to assert the correctness of the array indexing. It will be probably equivalent to some kind of the integer linear programming. On the other hand there are issues like concurrent access to the shared state which are much harder (?) to express as operations and to envisage a type system to deal with them.

Polymorphism is the property of parametrized code to work with parameters of different types. Polymorphism is divided (by Cardelly and Wagner(sp?)) into two main classes: universal and ad hoc. Examples of universal polymorphism are genericity (templates), virtual types and polymorphism by inclusion. Examples of ad hoc polymorphism are coercion and overloading. Polymorphism is the main mechanism for the code re-use (after cut-n-paste :-). And successful re-use is what makes possible creation of large libraries and frameworks which are the foundation of the large-scale software development.

As have been mentioned above types can overlap. Type A is subtype of type B if A is subset of B that is every value of type A is also value of type B. Note that sub-typing is defined as relation on types (not operation).

Suppose now, that message send is the only operation in our language with interface types assigned to the objects and we have code p(e:TI) parametrized over parameter e where e is required to be an object of type TI. If we have an expression of the form p(a) (This of course contradicts supposition that message send is the only operation because p(a) is manifestly not message send; what we mean is message send of the form x.f(a) where message f will be resolved to the method p.) where the argument a is of type TI; then well-typedness of p means that for every message send of the form e.m( ... ) in p message send a.m( ... ) will be successfully resolved. That is if we call p with argument a conforming to the interface I no message lookup will fail. But suppose we have object b conforming to the interface I' where I' is I plus some additional signatures. Then b evidently conforms to I and therefore belongs to TI and during call p(b) no message send will fail. That is code designed to work with parameter of type TI will work with argument of type TI' where I' is surinterface of interface I and TI' is subtype of type TI. Great! We get mechanism to re-use code.

Concrete type system provides for re-use of code using only operations that type system guarantees correctness of. For example interface-based type system will not guarantee you that index will not fall out the end of array or integer operations will not overflow. But different type system can. And you can use all of them in the same time.

It is very important to understand that inclusion polymorphism allows for re-use of client-side code: code that is re-used is using code (methods) of object that is changed. This is to be opposed to inheritance where code that is re-used is code of object that is changed. Inheritance is for the server-side re-use.

Fast

All

This page requires HTML style sheets (today only IExplorer 5+, Netscape 4.7+ and Mozilla seems to support style sheets in predictable manner).

Last modified: Thu Mar 9 20:43:09 2000