Alexis Smirnov
Thinking about software



 

Unifying object and relational data structures

In this note I’ll show how to use XSD as a basis for object-relational mapping. In particular, I’ll show when you might want to create a database schema from an XSD schema. This note is accompanied by a tool called xsd2db, which creates a database from an XSD file.

 

Introduction

Object/Relational mapping is a set of design patterns and methodologies that allow an object-oriented data structure to be persisted in a relational database. O/R mapping is an important part of just about every enterprise system that makes use of a relational database.

 

It is important to get O/R mapping right because it affects the design of both the database structure and the object structures. Transferring data from an in-memory object representation into a set of database records and back can make or break the performance, scalability and robustness of your systems. Complex, inconsistent mapping techniques can create maintenance nightmares for both database and application programmers. A single database often acts as a host for many different applications, many of which may be developed long after the database is put into operation. Database schemas that are difficult to represent as an object model will be difficult to use as a basis for any future applications.

 

Tools and methodologies

There are many tools, design methodologies and patterns for O/R mapping. The site www.agiledata.org run by Scott Ambler (particularly this essay) is a great place to start exploring the subject. Fowler’s book on patterns of enterprise architecture includes many useful O/R mapping patterns.

 

Once a mapping is achieved, the results are usually captured in two very different forms. The first is an OO data structure, in the form of UML diagrams or code written in an OO language such as C# or Java, depending on the platform. The second is a database schema, often represented as a somewhat database-specific SQL DDL script. This script is used to create an instance of a database. Regardless of how you go about defining and implementing O/R mapping, chances are you’ll end up with two disjoint artifacts.
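
To make the "two disjoint artifacts" point concrete, here is a minimal sketch in Python, with SQLite standing in for the database. The Bug class, its fields, the DDL text and the unmapped_fields helper are all hypothetical names invented for this illustration. The class and the DDL script are written separately, so nothing stops them from drifting apart; the helper only detects the drift after the fact.

```python
import sqlite3
from dataclasses import dataclass, fields


# Artifact 1: the object model, written by hand (hypothetical example class).
@dataclass
class Bug:
    id: int
    title: str
    severity: int  # added to the class, but never added to the DDL below


# Artifact 2: the database schema, also written by hand.
BUG_TABLE_DDL = """
CREATE TABLE Bug (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
)
"""


def unmapped_fields(cls, conn, table):
    """Return the names of class fields that have no column in the table."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    return [f.name for f in fields(cls) if f.name not in columns]


conn = sqlite3.connect(":memory:")
conn.execute(BUG_TABLE_DDL)
missing = unmapped_fields(Bug, conn, "Bug")
print(missing)
```

Here the class gained a severity member that the table never received, which is exactly the kind of mismatch that has to be caught and fixed by hand when the two artifacts are maintained separately.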

 

Among commercial tools for .NET, these two are good examples:

Deklarit helps solve the O/R mapping challenge by generating both the DB schema and the OO data structures from declarative definitions of business objects.

Pragmatier is an integrated modeling tool, code and database generator. It generates a complete data tier including an O/R data access layer with object-relational mapping and a database schema. It can also wrap and extend existing databases.

Scott Ambler maintains a good list of products and frameworks. Most of them are for the Java platform.

 

Challenge: Maintaining the mapping

In the experience of Martin Fowler and Pramod Sadalage, as described in their article, agile development methods can be used effectively during the development of a system with a database as its major component. One of the primary features of agile methods is their attitude towards change. They allow and expect change at any point in the development process. In this context O/R mapping has to be agile, just like any other component of the system. Since OO and relational structures are disjoint, in many cases both artifacts have to be refactored together to ensure that the software doesn’t break.

O/R mapping becomes more challenging if the system requires a flexible database schema. Take the example of a bug tracking system where an administrator can add and remove custom properties assigned to a bug. If the O/R mapping is static, the object model of such a bug tracking system cannot really represent those custom properties as first-class citizens, each backed by its own column.

An ontology-based system is another example where the disjoint nature of the OO and relational structures presents a challenge. (This example got me started thinking about this issue and is near and dear to me because of the architecture of Enterprise Privacy Manager.) The object model of an ontology-based system (such as an expert system or inference system) mirrors the structure of the ontology. When the ontology changes, so does the object model. If that object model is mapped onto a relational model, the relational model has to change too. Being able to evolve the ontology (a valuable feature of many ontology-based systems) is difficult when OO and relational structures are disjoint.

 

Unifying data schema in XSD

The core of the solution to this class of issues is to centralize the data schema definition. Such a centralized schema must be defined in a form that can be converted into both OO and relational representations. XSD is one such form. An XSD schema allows one to define rich data structures and can be converted into both OO and relational representations. Despite its complexities, XSD is the practical choice until better alternatives are introduced into the mainstream.

 

Ultimately, there’s ample room for innovation in integrating relational and object structures using simpler, more immediate constructs such as extensions to programming languages. MSR’s Erik Meijer and Wolfram Schulte are doing some very interesting work on the subject.

 

Here’s how XSD helps in maintaining the O/R mapping in the earlier examples:

Once the bug tracking data schema is captured in XSD, one can add a property to both the object and the table by adding it directly to the XSD file. Updates to the OO data structure (class) and to the relational schema (table) can be automated.

If an ontology changes, one can produce an XSD based on the new ontology and then produce or update a database schema from the new XSD.
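
As a rough illustration of the bug tracking case, the sketch below compares an old and a new version of an XSD and derives the statements needed to bring an existing table up to date. It is written in Python against SQLite purely for illustration. The Bug type, the customer property, the tiny type mapping and the function names are assumptions made for this example, and only flat complex types with built-in element types are handled.

```python
import sqlite3
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
# Assumed, deliberately tiny mapping from XSD built-in types to SQL types.
SQL_TYPES = {"string": "TEXT", "int": "INTEGER", "boolean": "INTEGER"}

# Hypothetical schema; {extra} is where a new property gets added.
XSD_TEMPLATE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Bug">
    <xs:sequence>
      <xs:element name="id" type="xs:int"/>
      <xs:element name="title" type="xs:string"/>
      {extra}
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

OLD_XSD = XSD_TEMPLATE.format(extra="")
NEW_XSD = XSD_TEMPLATE.format(
    extra='<xs:element name="customer" type="xs:string"/>'
)


def columns(xsd_text, type_name):
    """Map element names of a named complexType to SQL column types."""
    root = ET.fromstring(xsd_text)
    for ctype in root.iter(XS + "complexType"):
        if ctype.get("name") == type_name:
            return {
                # Drop the namespace prefix ("xs:int" -> "int") before lookup.
                el.get("name"): SQL_TYPES[el.get("type").split(":")[-1]]
                for el in ctype.iter(XS + "element")
            }
    raise KeyError(type_name)


def alter_statements(old_xsd, new_xsd, table):
    """Return ALTER TABLE statements for columns present only in new_xsd."""
    old_cols = columns(old_xsd, table)
    new_cols = columns(new_xsd, table)
    return [
        f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}"
        for name, sql_type in new_cols.items()
        if name not in old_cols
    ]


# Create the table from the old schema, then apply the derived changes.
conn = sqlite3.connect(":memory:")
old_columns = ", ".join(f"{n} {t}" for n, t in columns(OLD_XSD, "Bug").items())
conn.execute(f"CREATE TABLE Bug ({old_columns})")

statements = alter_statements(OLD_XSD, NEW_XSD, "Bug")
for statement in statements:
    conn.execute(statement)

column_names = [row[1] for row in conn.execute("PRAGMA table_info(Bug)")]
print(statements)
print(column_names)
```

The only hand-made change is the one extra element in the XSD; the table update is computed from it. A real tool would also have to deal with removed or retyped properties, which this sketch ignores.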

 

The key benefit of using XSD as a common format for data schema is that one needs to define the schema (or changes to it) only once, as opposed to doing it twice: once in the OO world and once again in the relational world.

 

 

Tools to help

Of course, defining your data in XSD wouldn’t be much help if there weren’t any tools that let you take your XSD schema and create OO and relational structures based on it.

In .NET, XSD to OO conversion is done by the xsd.exe utility included in the .NET Framework SDK. This utility generates a typed DataSet from XSD files. Shawn Wildermuth and Chris Sells have extended this utility to allow direct derivation from the resulting classes.
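
For readers who want to see the XSD to OO direction in miniature, here is a small Python sketch of the idea. It is not how xsd.exe works and it does not produce a typed DataSet; it only shows a class being derived from a schema rather than written by hand. The Bug schema, the type mapping and the class_from_xsd function are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import fields, make_dataclass

XS = "{http://www.w3.org/2001/XMLSchema}"
# Assumed, deliberately tiny mapping from XSD built-in types to Python types.
PY_TYPES = {"string": str, "int": int, "boolean": bool}

# Hypothetical schema used only for this illustration.
BUG_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Bug">
    <xs:sequence>
      <xs:element name="id" type="xs:int"/>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="resolved" type="xs:boolean"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""


def class_from_xsd(xsd_text, type_name):
    """Build a dataclass whose fields mirror a named complexType."""
    root = ET.fromstring(xsd_text)
    for ctype in root.iter(XS + "complexType"):
        if ctype.get("name") == type_name:
            members = [
                # Drop the namespace prefix ("xs:int" -> "int") before lookup.
                (el.get("name"), PY_TYPES[el.get("type").split(":")[-1]])
                for el in ctype.iter(XS + "element")
            ]
            return make_dataclass(type_name, members)
    raise KeyError(type_name)


Bug = class_from_xsd(BUG_XSD, "Bug")
bug = Bug(id=1, title="Crash on save", resolved=False)
field_names = [f.name for f in fields(Bug)]
print(bug)
print(field_names)
```

A code generator that writes source files, as xsd.exe does, is the more common design; building the class at run time simply keeps the sketch short.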

 

But what about the other half of the process, generating a database schema out of XSD? At the time of this writing I’m aware of only one RDBMS that supports this feature: Oracle 9i. Also, the XML Spy development tool has a feature that generates a database creation script based on XSD.

 

Roger McFarlane and I have written a standalone tool, xsd2db, that automates the process of taking an XSD schema and creating a database schema from it. Xsd2db supports SQL Server and MSDE (Jet) databases, and other databases may be supported in the future. At the time of this writing I’m not aware of any other standalone tool that does the same job.
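
The general shape of such a conversion can be sketched in a few lines of Python. This is not the xsd2db implementation and makes no claim about how xsd2db maps types or relationships. It assumes one table per named complex type and one column per element, uses SQLite as a stand-in database, and the schema, type mapping and ddl_from_xsd function are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
# Assumed, deliberately tiny mapping from XSD built-in types to SQL types.
SQL_TYPES = {"string": "TEXT", "int": "INTEGER", "boolean": "INTEGER"}

# Hypothetical schema with two flat complex types.
TRACKER_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Project">
    <xs:sequence>
      <xs:element name="id" type="xs:int"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Bug">
    <xs:sequence>
      <xs:element name="id" type="xs:int"/>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="resolved" type="xs:boolean"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""


def ddl_from_xsd(xsd_text):
    """Yield one CREATE TABLE statement per named complexType."""
    root = ET.fromstring(xsd_text)
    for ctype in root.iter(XS + "complexType"):
        table = ctype.get("name")
        if table is None:
            continue  # anonymous types are skipped in this sketch
        column_defs = []
        for el in ctype.iter(XS + "element"):
            # Drop the namespace prefix ("xs:int" -> "int") before lookup.
            sql_type = SQL_TYPES[el.get("type").split(":")[-1]]
            column_defs.append(f"{el.get('name')} {sql_type}")
        yield f"CREATE TABLE {table} ({', '.join(column_defs)})"


conn = sqlite3.connect(":memory:")
statements = list(ddl_from_xsd(TRACKER_XSD))
for statement in statements:
    conn.execute(statement)

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(statements)
print(tables)
```

Keys, relationships between types and database-specific column types are where a real tool has to do most of its work; they are left out here on purpose.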

 

If you’re interested in getting xsd2db or knowing more about it, please let me know. We’re looking to take xsd2db public and would appreciate your feedback.

 

 

Conclusion

O/R mapping is an important part of the architecture of many enterprise applications. While many methodologies and tools exist to make the job easier, all of them use separate schemas for OO and relational data, which makes it difficult to manage changes to data structures. Using a common XSD schema helps to centralize the definition of the data structure in one source. Tools are available to generate both object and relational representations from the single XSD file.



© Copyright 2003 Alexis Smirnov.
Last update: 5/6/2003; 5:47:28 PM.