Features and Implementation of the OMG Meta-Object Facility
Aditya P. Bansod
Strevda[1]
This paper
takes a look at the Object Management Group’s Meta-Object Facility (MOF), a
technology developed to assist with and standardize the handling of meta/abstract
data. Simply put, metadata is data about data. This can be applied generally for
any type of data and the MOF works to address how to manage and handle this
data. This paper provides a thorough background on the technology, highlighting
features and drawbacks and goes further to suggest improvements to the current
architecture of the MOF. The paper also deals with the methods and ways to
implement the MOF technology in an object repository and the developmental
issues arising from that. The paper begins with a discussion of the MOF Core,
which is the foundation for the MOF, and then moves to discuss the MOF
Reflective, a special set of specifications that allows the MOF to perform
self-discovery operations. From there, the MOF’s interoperability with CORBA
is discussed, and the limitations of the MOF specification are further dealt
with. Lastly, the implementation issues of the MOF are discussed, detailing
the problems and oversights of the specification and proposing
workarounds to address them.
The world that
we live in is surrounded by ever more complex data. Cellular phone records,
Internet sales, and data of all types become larger and harder to manage as we
globalize our economy and expand our horizons across the world. But how do we
manage the data? For that matter, how do we manage the data about our data?
This is what is known as metadata[2]. The world is no longer controlled by money,
energy, or oil; rather, it is controlled by information. Thus, the management
of this information is paramount.
A business
executive using a Sprint Mobile Phone makes a call to one of his clients. His
call is recorded, catalogued, and billed by Sprint. Another man uses Sprint to
make a connection to a segment of the Sprint Internet backbone by accessing a
web page, also recorded, catalogued, and billed by Sprint. These transactions
happen every minute of every day, nonstop and without fail. The masses of data
that are collected by corporations these days are reaching unbelievable heights
and certainly unmanageable levels. That is why companies such as Sprint, MCI,
and other major industry players are looking to metadata to help them manage
the masses of data.
The concept of
metadata is new in the field of computer science, having only recently become a
recognized point of discussion. To explain what metadata is, it is necessary to
refer back to the previous example. The actual data in a cellular phone
record would be the number the person called, the phone number that the call
originated from, and perhaps the credit-card number to bill the phone call to.
The metadata about this phone call, however, would not be the actual data but
an abstraction of it: the fact that a phone number and a credit-card number
are being recorded, not the numbers themselves.
Metadata is useful for managing the actual data itself. Because the user has an
understanding of the structure of the data (i.e. the metadata), it is possible
to better handle, store, and predict changes in the actual data.
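The distinction between data and metadata can be made concrete with a minimal Python sketch. The field names (caller, callee, billing_card) and the conforms helper are hypothetical, chosen only to mirror the phone-call example; the values are placeholders.

```python
# The data: one concrete call record (placeholder values).
call_record = {
    "caller": "555-0100",
    "callee": "555-0199",
    "billing_card": "0000-0000-0000-0000",
}

# The metadata: which fields are recorded and what type each holds,
# with none of the actual values.
call_record_metadata = {
    "caller": str,
    "callee": str,
    "billing_card": str,
}


def conforms(record, metadata):
    """Check a record against its metadata: same fields, expected types."""
    if record.keys() != metadata.keys():
        return False
    return all(isinstance(record[field], kind) for field, kind in metadata.items())
```

Knowing only the metadata, a tool can validate, store, or migrate any call record without ever having seen one before.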
With such a
large scope, it is a surprise that metadata (and meta-objects, used
interchangeably) standards were never addressed until late 1997. In September
of 1997, the Object Management Group (OMG) – an industry standards body –
finalized a standard called the Meta-Object Facility[3]
[1] (commonly called the MOF) that is designed to standardize the interface
used to manage metadata and, more generally, meta-objects. The MOF has been
adopted quickly in the data management and object-oriented database
world, and companies such as Unisys and IBM have either modified or are working
on modifying their products to be MOF compliant. One of the key features of the
MOF is that it allows for interoperability with other OMG standards, such as
CORBA (Common Object Request Broker Architecture) [2], and is aligned with
UML, the Unified Modeling Language [3] (another OMG standard). The
interoperability with CORBA is specified in the MOF Specification with the
description of the MOF to IDL [1, 2] (Interface Definition Language, another
OMG technology) mappings. This facility allows for the MOF to work across
different computer platforms, and with a variety of programming languages
without regard for the underlying technology.
This paper will
examine features (architecture and interoperability) of the MOF
specification and the potential implementation (design and logic considerations)
of the MOF in a MOF-based Repository and, most importantly, will identify
potential flaws in the current architecture and suggest improvements. It aims to
show the manageability of metadata and meta-objects in general and to
allow the reader to understand conceptually the core of the Meta-Object
Facility, along with the features and limitations involved with many aspects
of it. However, to understand fully some of the concepts that will be developed
in this paper, it is necessary to have a solid understanding of object oriented
technologies. This can be found in Appendix I.
The Meta-Object
Facility is based on two packages, the Model Core and the Reflective Package
[1]. Most of the functionality and useful modeling elements of the MOF are
contained in the Core; the Reflective package serves a self-discovery purpose.
The core of the
MOF is based on its root element, the ModelElement.
All other elements in the MOF Core inherit from the ModelElement.
The ModelElement
provides the most elementary services, such as naming of elements, annotations
to describe the element, and so forth. The ModelElement itself
cannot be instantiated[4],
only inherited from, because it is defined as an abstract[5]
element. It also provides some key references[6],
such as the one between the class Tag and
itself (to provide for extensions to the element), along with constraints that
identify a set of limitations the ModelElement
may have.
From the ModelElement
other objects derive (thus inheriting its abilities), most notably Namespace. The Namespace class
ensures that all names within a name space are unique. The other key
feature of the Namespace
is that it allows for collections of elements (see the footnote on abstract classes)
through its association of Containment
to ModelElement.
This capability allows a child of Namespace,
such as Package, to contain other ModelElements
as the user’s model requires.
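The naming and containment rules described above can be sketched in Python. This is an illustration only, not the specification's interface: the class names come from the text, lookup_element mirrors the lookupElement operation mentioned later in this paper, and the add method and kind hook are assumptions made for the sketch.

```python
from abc import ABC, abstractmethod


class ModelElement(ABC):
    """Abstract root: carries a name and an annotation, cannot be instantiated."""

    def __init__(self, name, annotation=""):
        self.name = name
        self.annotation = annotation
        self.container = None  # set when a Namespace takes ownership

    @abstractmethod
    def kind(self):
        """Concrete subclasses report what they are (hypothetical hook)."""


class Namespace(ModelElement):
    """Holds ModelElements and guarantees that contained names are unique."""

    def __init__(self, name, annotation=""):
        super().__init__(name, annotation)
        self._contents = {}

    def add(self, element):
        if element.name in self._contents:
            raise ValueError(f"name {element.name!r} already used in {self.name!r}")
        self._contents[element.name] = element
        element.container = self

    def lookup_element(self, name):
        return self._contents[name]


class Package(Namespace):
    """A concrete child of Namespace that can actually be created."""

    def kind(self):
        return "Package"
```

Because ModelElement and Namespace leave kind unimplemented, Python refuses to instantiate them, which matches the abstract role the text gives them; only Package can be created and used as a container.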
The Namespace provides the
key ability to define containment, but in application the GeneralizableElement
type is perhaps more important; the GeneralizableElement
allows for discovery of many implementation-level[7]
properties of objects. One of the attributes of the GeneralizableElement
is visibility, which is of the type VisibilityKind,
an enumeration[8] of the values
public, private, and protected that defines whether or not other classes in the
MOF can access that GeneralizableElement.
This feature is primarily used to keep internal data types and operations
within classes. Another feature of the GeneralizableElement
is knowledge of its own supertypes and subtypes. This self-discovery
logic is provided by the attributes isRoot
and isLeaf, which identify whether the element has no
parents or no children, respectively. To further reinforce this
feature, the MOF specification provides a reference from GeneralizableElement
to GeneralizableElement
itself called “supertypes,” which allows a GeneralizableElement to collect
its own supertypes.
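A minimal Python sketch of these GeneralizableElement features follows. It treats isRoot and isLeaf as computed properties, following the description in this paper; that choice, the subtypes back-link, and the all_supertypes helper are assumptions of the sketch rather than definitions quoted from the specification.

```python
from enum import Enum


class VisibilityKind(Enum):
    PUBLIC = "public"
    PROTECTED = "protected"
    PRIVATE = "private"


class GeneralizableElement:
    def __init__(self, name, visibility=VisibilityKind.PUBLIC, supertypes=()):
        self.name = name
        self.visibility = visibility
        self.supertypes = list(supertypes)  # the "supertypes" reference
        self.subtypes = []                  # back-link kept for is_leaf
        for parent in self.supertypes:
            parent.subtypes.append(self)

    @property
    def is_root(self):
        return not self.supertypes

    @property
    def is_leaf(self):
        return not self.subtypes

    def all_supertypes(self):
        """Direct and inherited supertypes, depth first, without duplicates."""
        seen = []
        for parent in self.supertypes:
            for candidate in [parent, *parent.all_supertypes()]:
                if candidate not in seen:
                    seen.append(candidate)
        return seen
```

Following the reference transitively is what lets an element discover its whole ancestry, not just its immediate parents.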
The last few
important classes in the MOF can be described briefly. These are Package, Association,
DataType, Class, MofAttribute, and Operation.
In the MOF, a Package
is used to contain models which may be stored in a MOF Repository. Since Package inherits (two
levels removed) from Namespace,
it can contain all other classes needed to make a complete model. Packages are
most commonly used to contain Associations,
Classes, and DataTypes.
As noted earlier, Associations are used to provide referential links between
objects of the MOF, and are given names such as “supertypes,” as mentioned in the
description of GeneralizableElement.
Finally, before turning to Classes, DataTypes
exist to provide extensibility to the MOF by adding new fundamental types that
can contain data, ranging from simple types such as integers to enumerations
and complex structures.
Classes are the
fundamental unit used to describe models. All the elements of the MOF can be
stored in a MOF Repository by using the Class class. This ability to use the
MOF to describe the MOF is one of the key features that allows the MOF to be
self-descriptive. Inside a class, there can be Operations
and MofAttributes. Operations
are used to provide functionality to a class (e.g. Namespace has an
operation called lookupElement which will return a ModelElement
given a name), and MofAttributes allow for properties to help describe the
Class (e.g. name, visibility, etc.).
Even with all
this information about the key elements of the MOF Core, there is still a need
for discussion on the actual uses of the MOF in modeling. That is where the
aforementioned Class, DataType,
and Association become most useful. The MOF is capable
of defining an infinite number of models within itself by way of creating
instances of the Class class. As a proof of
concept, we shall take a brief view of actually storing the MOF within a
repository based on the MOF Specification. For example, let us take ModelElement.
If we were to recreate ModelElement
within the MOF we would first need to create an instance of Class.
The actual class that we have created has a set of empty properties, such as name,
annotation, qualifiedName,
etc. Then we must set these attributes so that we can begin our representation
of the ModelElement.
Since we have now satisfied the basic requirements of the MOF Repository, we
can now return to loading our model. If we look back at the MOF Specification,
we notice that ModelElement
has both attributes and operations. To model this in the Repository, we
must create instances of MofAttribute and Operation
for each specified attribute and operation, and then place them inside
the class we originally created.
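The loading steps above can be sketched in Python against a toy repository. The Repository and Instance classes, their methods, and the operation name exampleOperation are hypothetical stand-ins; a real MOF repository would expose its own interfaces, and ModelElement's real operations should be taken from the specification.

```python
class Instance:
    """A generic repository object: a metaclass name plus property values."""

    def __init__(self, metaclass, **values):
        self.metaclass = metaclass
        self.values = dict(values)
        self.contents = []

    def contain(self, child):
        self.contents.append(child)
        return child


class Repository:
    """Toy in-memory store; stands in for a real MOF Repository."""

    def __init__(self):
        self.objects = []

    def create(self, metaclass, **values):
        obj = Instance(metaclass, **values)
        self.objects.append(obj)
        return obj


repo = Repository()

# Step 1: create an instance of Class and fill in its properties.
model_element = repo.create(
    "Class", name="ModelElement", annotation="Root of the MOF Core"
)

# Step 2: one MofAttribute instance per attribute of ModelElement.
for attribute_name in ("name", "annotation", "qualifiedName"):
    model_element.contain(repo.create("MofAttribute", name=attribute_name))

# Step 3: one Operation instance per operation (placeholder name).
model_element.contain(repo.create("Operation", name="exampleOperation"))
```

The same three steps would load any other model, such as a corporate hierarchy; only the names and values change.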
To understand
this example, it is necessary to understand the key distinction between the MOF
architecture and its actual purpose. The MOF was designed to allow for models
and metamodels to be created and loaded into a Repository that was based on the
MOF itself. In the example, we took an element of the MOF (ModelElement)
and stored it in our fictitious Repository. We could just as easily have
illustrated storing a corporate hierarchy in the MOF Repository. Storing
the MOF model itself was chosen because it
demonstrates one of the central features of the MOF: its reflective capabilities.
Since the MOF can be stored in the MOF, it is possible to use other features of
the MOF specification such as the IDL Generation Rules to create IDL for the
MOF with minimal work (more discussion later). Further, it allows for the
modeler to use visual tools such as Rational’s Rose or Select’s Enterprise to
view the MOF Model from a Repository. More importantly though, the MOF Specification
allows for the ModelElement
itself to have a supertype (parent), which can be implemented as the object in
the Reflective package, RefObject.
The Reflective
Package allows for elements in the MOF to discover information about objects
that they describe. Basically, the Reflective interfaces provide services that
allow the user to “see” what is contained in the Model, find information about
the associations, invoke operations, and modify attributes without any prior
knowledge of the Model the Reflective is operating on. At the top of the
Reflective hierarchy is the RefBaseObject element, which
provides operations such as “itself,” which allow an
object to test if it is a specific type, thus letting it gain information about
itself. More important is the RefObject
class, which provides the class-level self-discovery.
The RefObject
lets an object that inherits from it (e.g. ModelElement)
create instances of itself, access operations, set properties, and so on.
Because every element of the MOF in an implementation inherits from the RefObject,
any object can use the functionality of the RefObject,
its ancestor on the inheritance tree. Thus, operations like allObjects
and createInstance can be used to manipulate instances of
objects without knowing their capabilities or any of their contained elements.
For example, allObjects returns a collection[9]
of all the objects of the most derived type (the object that is actually
calling allObjects). With this
collection at hand, the user can iterate through the instances and perform
operations on them.
The entire
Reflective package provides for a generic interface that allows for the user to
manipulate unknown objects. This generic interface allows for access not only
to objects, but also to Associations (RefAssociation)
and Packages (RefPackage). Like RefObject, these interfaces provide a
simple way to access information about their respective types.
The types of objects that Reflective interfaces are provided for are the same
ones that were described as useful for modeling. This is because those classes are
the most useful and would most likely be used in the “real world.” For all
other objects there is always RefObject. It is reassuring to
know that RefObject provides an operation
called invokeOperation that allows full
access to a non-modeling class.
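The generic access that the Reflective package offers can be approximated in Python. The operation names below echo createInstance, allObjects, and invokeOperation from the text, but the signatures, the value and set_value helpers, and the PhoneRecord class are assumptions made for this sketch.

```python
class RefObject:
    """Generic base: every subclass is reachable through the same few calls."""

    _instances = {}  # maps each concrete class to its live instances

    @classmethod
    def create_instance(cls, **attributes):
        obj = cls()
        for key, value in attributes.items():
            obj.set_value(key, value)
        RefObject._instances.setdefault(cls, []).append(obj)
        return obj

    @classmethod
    def all_objects(cls):
        """Instances of exactly the class the call was made on."""
        return list(RefObject._instances.get(cls, []))

    def value(self, name):
        return getattr(self, name)

    def set_value(self, name, value):
        setattr(self, name, value)

    def invoke_operation(self, name, *args):
        return getattr(self, name)(*args)


class PhoneRecord(RefObject):
    """A hypothetical modeled class; the generic code never names its members."""

    def describe(self, prefix):
        return f"{prefix}: {self.caller} -> {self.callee}"


first = PhoneRecord.create_instance(caller="555-0100", callee="555-0199")
PhoneRecord.create_instance(caller="555-0101", callee="555-0100")

# A generic tool can walk the instances and call operations by name.
summaries = [
    obj.invoke_operation("describe", "call") for obj in PhoneRecord.all_objects()
]
```

The caller needs only strings (attribute and operation names) to drive the object, which is what makes it possible to write tools that work on models they have never seen.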
To be
successful in a product-driven world, it was necessary to develop a set of
rules that would allow the MOF to work with existing object-oriented
technologies, primarily CORBA. CORBA allows objects that were developed on
different platforms using different programming languages to communicate with
each other and to use each other’s facilities through a component called the ORB, or
the Object Request Broker [2]. An ORB is vendor supplied for each platform
(e.g. Windows, Unix, VMS, etc.) and is readily available in products such as
Iona’s Orbix and Inprise’s Visibroker. To complete the distributed abilities
of CORBA, one more piece is needed to allow the objects to interoperate. This
piece is the IDL (Interface Definition Language).
An IDL file is a
text file, based loosely on C [2], that provides metadata (it does not provide
any true function code) about the objects that it is abstracting. From
an IDL file, an IDL compiler will produce bindings[10]
for a target language (most often C++), one for the ORB server and one for the
client programs. The ORB server uses the IDL information to know what
interfaces objects expose and how to let them be available in the distributed
environment. The client file is more important, in that it is directly used by
client CORBA-compliant applications for them to also know what is available.
The generated IDL bindings are also directly used by client applications to
access the objects via the ORB, because the bindings actually provide the
interface into the ORB.
Since the IDL
is such a key part of CORBA, it is no surprise that when the MOF was
made, the OMG addressed this by including a section called “The MOF to IDL
Mapping” [1]. This allows for a MOF model in a Repository to have IDL
interfaces created and defined for use with an ORB and CORBA clients. This
section details how to handle each type of element that could be in the MOF and
the rules to apply to create the appropriate IDL. Since IDL does not support
such MOF concepts as references or associations, it was necessary to create
appropriate mappings to allow for this. Also, the specification assumes that
the Repository that is being used is purely MOF based, which is not always the
case, thus additional rules need to be added to accommodate that.
The problem of
associations and references is directly addressed in the MOF Specification, and
an ingenious solution is proposed for each. In the case of references, each end
of the reference is required by constraint to belong to one class (interface in
IDL). Thus, in each of the interfaces for the ends of a reference, IDL is
generated that allows navigating the reference, modifying the reference, adding
a new instance of that reference, and so on. Note that different IDL is
generated depending on the multiplicity to allow for accessing an end that has
a greater than one cardinality, returning a collection. This allows for full
functionality for dealing with a reference. The generating of IDL for
associations is more complicated, but equally easy to implement. Because an
instance of an association can be created, it is necessary to generate an
interface for it. With this interface, it is possible to query all the
associations of a certain type (e.g. Tag-AttachesTo-ModelElement)
and perform other operations on them. This is different from a reference,
because a reference is used (by definition) to query a Classifier (and its subtype
Class) about its associations with other classes, but not about the association
itself.
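How the generated IDL might vary with multiplicity can be sketched with a small Python generator. The operation names and the "Set" suffix for collection types are illustrative assumptions, not the templates of the MOF to IDL Mapping; the point is only that a multi-valued end yields collection-returning and element-level operations.

```python
def reference_idl(name, target, upper):
    """Return IDL operation declarations for one reference.

    upper is the upper bound of the multiplicity; -1 is used here to
    mean "unbounded" (an assumption of this sketch).
    """
    multi_valued = upper == -1 or upper > 1
    if not multi_valued:
        return [
            f"{target} {name}();",
            f"void set_{name}(in {target} new_value);",
        ]
    collection = f"{target}Set"  # hypothetical collection type name
    return [
        f"{collection} {name}();",
        f"void set_{name}(in {collection} new_values);",
        f"void add_{name}(in {target} new_value);",
        f"void remove_{name}(in {target} old_value);",
    ]
```

Calling reference_idl("container", "Namespace", 1) gives a plain getter and setter, while reference_idl("supertypes", "GeneralizableElement", -1) adds the add and remove operations needed to manage a collection.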
On an
implementation level (a MOF Repository), it is necessary to define a new set of
rules that are not included in the MOF specification. Because many repositories
that support the MOF (IBM TeamConnection, Unisys UREP) have their own built-in
datatypes that are used in the MOF elements, the generation of IDL does not
follow the specification exactly [4]. For example, in a theoretical
Repository, there is no fundamental data type called “string”
but rather a “RepositoryString.” Therefore, the IDL generator needs to be aware of
certain mappings (e.g. RepositoryString to string, RepositoryInteger to long,
etc.) that would occur between an implementation MOF and the generated IDL for
that model. Similar problems arise with structures that are return types or
parameters in operations. For example, MultiplicityType
is a structure with four fields (lower, upper, isUnique, isOrdered). This,
however, cannot be constructed for persistence[11]
in a repository because datatypes are abstract classes. So, an implementation
would make MultiplicityType a class instead of a
datatype, allowing for persistence. Unfortunately, this does not generate
correct IDL, and additional rules would need to be implemented to make a proper
mapping.
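The extra mapping rule for repository-specific datatypes can be sketched as a lookup table consulted by the IDL generator. The two table entries come from the example above; the function names, the PhoneRecord interface, and the use of IDL attribute declarations are assumptions made for illustration.

```python
# Repository built-in types and the IDL types they should be emitted as.
REPOSITORY_TO_IDL = {
    "RepositoryString": "string",
    "RepositoryInteger": "long",
}


def idl_type(repository_type):
    """Translate a repository type name; unknown names pass through unchanged."""
    return REPOSITORY_TO_IDL.get(repository_type, repository_type)


def interface_idl(name, attributes):
    """Emit a small IDL interface from (attribute name, repository type) pairs."""
    lines = [f"interface {name} {{"]
    for attr_name, attr_type in attributes:
        lines.append(f"    attribute {idl_type(attr_type)} {attr_name};")
    lines.append("};")
    return "\n".join(lines)


text = interface_idl(
    "PhoneRecord",
    [("caller", "RepositoryString"), ("duration", "RepositoryInteger")],
)
```

A structure that has been turned into a class, such as MultiplicityType, would need a further rule of the same kind so that the generator emits the struct the specification expects rather than an interface.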
With the
standard tools and the explained extensions to the MOF Specification, it is
possible to generate fully CORBA compliant IDL that allows access to ORBs and
CORBA clients. With the IDL it is feasible to create a CORBA server that would
provide a back-end to the ORB, and CORBA clients that would access the ORB and
interface definitions for the ORB. This allows for full access to any model
that was created using the MOF, assuming that its functionality is implemented
in the CORBA server. Remember that IDL can be used by the ORB to realize the
available interfaces, used by a developer to make a CORBA server for the model,
and used by other developers to access that server. With all this linking
ability, it is no surprise that the OMG included the generation rules to
provide a standardized method to generate IDL from the MOF.
The Meta-Object
Facility was designed to be used to store, manipulate, and understand metadata.
However, this would not be possible if a MOF Server[12]
did not exist. Since a MOF repository is a database type system that would be
made up of the MOF elements (i.e. ModelElement,
Feature, Namespace,
etc.), it would be possible to create new instances of MOF objects and store
them in the repository. This, with some trivial additional work, solves one of
the core problems in the MOF, which is the lack of persistence. To facilitate
the creation of a MOF Server some design time considerations must be made and
some changes must be made as well.
In general
object-oriented terminology, to create an instance of an object one must
“construct” the object. For instance, if we wished to create an object of type MofAttribute
in the MOF, it would be necessary to construct an instance of it. This
functionality is not defined in the MOF specification, thus in an implementation
a developer must provide such an operation to allow for instances to be
created. The parameters of this operation are derived with a bit of logic. For
each attribute in the element, and all of its parents (all the way to ModelElement),
a parameter of that type is created. This allows for all the fundamental
information of an element to be filled in directly with a construct operation.
With all this information, a construct operation can be created so that an
object can exist within a repository.
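The rule for deriving the construct operation's parameters can be sketched in Python. The MetaClass helper is hypothetical, and the attribute lists below are abbreviated and illustrative; they are not the full set defined for these elements in the specification.

```python
class MetaClass:
    """Describes one element: its own attributes and its single parent."""

    def __init__(self, name, attributes, parent=None):
        self.name = name
        self.attributes = list(attributes)  # (attribute name, type name) pairs
        self.parent = parent


def construct_parameters(meta_class):
    """Collect attributes from the root ancestor down to meta_class itself."""
    chain = []
    current = meta_class
    while current is not None:
        chain.append(current)
        current = current.parent
    params = []
    for cls in reversed(chain):
        params.extend(cls.attributes)
    return params


def construct(meta_class, *args):
    """Create an instance, requiring one argument per derived parameter."""
    params = construct_parameters(meta_class)
    if len(args) != len(params):
        raise TypeError(
            f"{meta_class.name} construct expects {len(params)} arguments, "
            f"got {len(args)}"
        )
    instance = {"metaclass": meta_class.name}
    for (param_name, _type_name), value in zip(params, args):
        instance[param_name] = value
    return instance


model_element = MetaClass(
    "ModelElement", [("name", "string"), ("annotation", "string")]
)
feature = MetaClass("Feature", [("visibility", "VisibilityKind")], parent=model_element)
mof_attribute = MetaClass("MofAttribute", [("isDerived", "boolean")], parent=feature)

caller = construct(mof_attribute, "caller", "Originating number", "public", False)
```

Ordering the parameters from the root downward keeps the most general information (name, annotation) first, so every construct operation in the hierarchy begins with the same arguments.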
Another
implementation issue of note is the previously mentioned problem of structures
such as MultiplicityType. Many operations and
attributes in the MOF require the datatype MultiplicityType
to define the cardinality of an element. Examples of this include AssociationEnd
and MofAttribute, which need to describe their
multiplicity. Unfortunately, due to the object nature of Repositories in general, it is
impossible to utilize a static datatype unless it is a fundamental type.
Otherwise it is necessary to utilize an Object as the datatype. Thus, elements
such as MultiplicityType are converted to an
object, with each of their included datatypes converted into an attribute of
that object. It is also important to note that a construct operation needs to
be provided for this as well because it is now an object and not a datatype. It
is not necessary to supply this type of mapping for an enumeration. This is
because an enumeration does not contain any elements; it simply lists the
possible values that it can take.
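The conversion described above, a structure becoming an object with its own construct operation while an enumeration stays as it is, might look like this in Python. The UNBOUNDED constant, the validation rule, and the default arguments are assumptions of the sketch.

```python
from dataclasses import dataclass
from enum import Enum

UNBOUNDED = -1  # assumed marker for "no upper limit"


class VisibilityKind(Enum):
    """An enumeration needs no conversion: it only lists its possible values."""

    PUBLIC = "public"
    PROTECTED = "protected"
    PRIVATE = "private"


@dataclass
class MultiplicityType:
    """The former structure, now an object with one attribute per field."""

    lower: int
    upper: int
    is_ordered: bool
    is_unique: bool

    @classmethod
    def construct(cls, lower, upper, is_ordered=False, is_unique=False):
        if lower < 0 or (upper != UNBOUNDED and upper < lower):
            raise ValueError(f"invalid bounds {lower}..{upper}")
        return cls(lower, upper, is_ordered, is_unique)


zero_or_more = MultiplicityType.construct(0, UNBOUNDED, is_ordered=True)
```

Because MultiplicityType is now an ordinary object, the repository can persist it like any other instance, at the cost of the extra IDL mapping rule discussed earlier.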
Other than the
few cases where special mappings are required, a straightforward implementation
of the MOF model can be created in a Repository to make a MOF server. To make a
useful server, creating a server and programming it is not enough. There is a
need for bindings as well to allow it to be used with different languages [1].
This is where IDL, CORBA, and the ORB become very useful, but it is also
possible to generate bindings to allow a user to use the new repository on a
less distributed and more local level. If the MOF server were implemented in
C++, then it would be useful to allow not only C++ programs to access the
repository, but also C programs, Java programs, and OLE[13]
clients. Because the MOF specification does not describe how to generate
bindings, many commercially available repositories will generate them
automatically, thus easing the burden of developing in many environments.
The OMG has
defined a clear-cut method for modeling data and creating representations of
metadata with the MOF Specification, and vendors have started to develop
prototypes and applications to use and exploit the capabilities of the MOF. Thus,
it is no surprise that the MOF is enjoying such widespread support,
largely due to the substantial amount of work and vendor input that was integrated
into it. More and more, the world is seeing the benefits of
interoperability and standards-based computing, so the MOF will be a force to
reckon with in the object world as it moves to standardize all the Repository
vendors. To further enhance this sense of uniformity, the OMG has issued a new
standard, called the XML Metadata Interchange Format (XMI), that allows
models to move between vendors and still maintain their integrity.
This paper has
shown some of the details of the implementation of a MOF-based Repository that
are essential to know since they were not included in the MOF specification.
Also, some of the key elements of the MOF itself were described to show how the
MOF is built and how all elements interact with each other to form
the modeling elements of the MOF. Combine this with a MOF implementation, and a
completely functional MOF Server is born, allowing a user to create, delete,
and manipulate MOF elements in real time, using the MOF interfaces themselves.
Tie that in with the IDL generation, and it is possible to access a MOF server
across the world on an IBM mainframe even though the MOF server is running on a
Unix server. These features and implementation aspects of the MOF truly show
the future and promise of such a well-built specification.
As more and
more corporations realize the significance of their metadata, the field of
meta-object management and the MOF itself will enlarge in scope. With the
overall view provided of the major elements of the MOF, and the features and
limitations of the MOF described, it is easy to see that the MOF can be
extremely useful in more than just the modeling field. Its uses can be extended
to source code reverse engineering (proof of concepts exist for Java and C++),
data warehousing, and other large-scale applications. The foundations of the MOF
allow it to scale well without inherent size limitations, so it can be
used for real-time database support. The MOF’s future is bright, with no end in
sight and the continuing work of the OMG to make it better. It is certain that
the MOF is a specification that will not fade away, but will remain the modeling
and metadata standard.
[1] Cooperative Research Centre for Distributed Systems Technology, et al. Meta Object Facility Specification. OMG Document ad/97-08-04. September 1, 1997.
[2] Object Management Group. Common Object Request Broker Architecture, version 2.1. OMG Adopted Technology. June 1998.
[3] Rational Software, et al. Unified Modeling Language Specification. OMG Document ad/97-08-02. September 1, 1997.
[4] Unisys Corp. Universal Repository Type Library Reference. August 1998.
Object-oriented
technology first became widely available with the introduction of the C++
programming language. Some of the general terminology that is needed for an
understanding of objects is found in our colloquial English. For example, an
object can "inherit" from another object, and in the process, it
inherits its properties and attributes. The objects have parents and children,
much like in the real world. For example, a medical student will get his degree
in medicine, and will specialize in geriatrics. This will allow the student to
have knowledge of his general medical degree and to add information gained from
his specific degree. Also, for example, a child inherits from his parents, and
from each of them he may inherit specific properties such as eye color, hair
color, strength, and so forth; this is an example of multiple inheritance. Objects
may also have relations with other objects. A person would have a relationship
of client to his doctor; a child would have a relationship of sibling to his
sister, perhaps. The paradigm of objects can be extended to suit virtually any
application and more specifically, programming. Much of the terminology that is
used will be explained via footnotes at the first occurrence of an unfamiliar term.
The object-oriented revolution has led the way to streamlined and more efficient application
development.
[1] Work supported in part by the Strevda organization. Contact: 26845 Bridlewood Drive, Laguna Hills, CA 92653; aditya@manyads.com.
[2]
Metadata is the information describing the real information. For example, the
text of a book is the information, and table of contents is the metadata for
the book.
[3]
OMG Document ad/970814
[4]
To instantiate an object is to create an actual instance of it. For example, you
cannot create intelligence, but you can create a child who has intelligence.
[5]
An abstract class (i.e. object) cannot be instantiated, and thus cannot contain
other elements since it cannot exist on its own. This should not be confused
with a child class that inherits from the parent which may provide containment
capabilities (e.g. Package to Namespace)
[6]
Classes can connect to each other via associations and references (a navigable
association). They provide for fundamental linking capabilities between
objects. For example, a child has an association with his teacher.
[7]
The actual level where the MOF is used to make models.
[8]
An enumeration is a data type that contains certain named units that can be
referenced later by either name or ordinal.
[9]
Collections are sets of data with cardinality greater than one. They can be of
any datatype or of another class.
[10]
Bindings are interfaces from one programming language to another. With bindings, a
developer can write code once in one language and then generate bindings so that
it can be used from others.
[11]
Persistence is the ability for an object to be constructed in a repository, and
keep itself “alive.” Thus, the next time a repository is accessed, the object
is still available.
[12]
Identical to MOF Repository, but different than a CORBA Server. A CORBA server
would be used to access a MOF Server/Repository, but not be one itself.
[13]
Microsoft’s Object Linking and Embedding. It allows heterogeneous applications
to access objects. Analogous to CORBA, on a local level.