Summer 1999   

Universal Data Access - Part 1

As recently as a few years ago, the options for bringing MultiValue data to the PC desktop were quite limited. Today’s world on the other hand offers a veritable potpourri of MultiValue data access tools. The difficulty these days is not in finding a solution, but instead, in deciding what solution best fits your needs. Robert Houben explains.

This article attempts to lay out the standards based data access solutions available to the MultiValue community today. The intention is to provide you with some guidelines on how to identify the right solution for your needs. Next issue we will give examples of how this new technology can be used in today’s applications.

Standards Based Database Access

In 1991, Microsoft began work with Lotus, Digital, IBM, Apple and other members of the X/Open standards committee called the SQL Access Group (SAG). This group developed a specification for a symmetrical Call Level Interface (CLI) and a subset of ANSI SQL that could be implemented by database vendors and ISV’s, providing some level of interoperability. Microsoft delivered this strategy to the world with the name Open DataBase Connectivity (ODBC) and launched it at the ODBC 1.0 Beta conference on March 9-10, 1992.

This was the first time that a standards-based data access strategy was delivered to the software community. Liberty was involved indirectly in the creation of Microsoft’s ISAM ODBC drivers and was at this beta conference. Within less than a year of this conference Liberty had released an ODBC driver that understood multivalued data and was the first non-Microsoft vendor to produce an ODBC driver of any kind. ODBC went on to gain market acceptance as a strategy for accessing relational data.

At the time of the ODBC 1.0 beta conference, Microsoft, in talking about their future directions, acknowledged that almost 80% of all corporate data lived in non-relational data stores, and they knew that they needed to have a strategy for accessing this data. During the next 1 or 2 years, while Microsoft struggled with how best to solve this issue, since ODBC was such a widespread standard, an amazing number of non-relational databases were coerced into looking relational so they could participate in the standards that had been set. It seems that even a poor standard that was a bad fit was better than no standard at all!

Today, developers have a number of standards-based solutions to consider when looking at strategies for accessing data from industry standard tools and technologies. These choices can be confusing, and choosing between them can prove a challenging task.

These are the data access strategies that we will be considering:

Relationship to Standards

Figure 1 (right) shows how the various data access strategies being considered relate to the SAG CLI standard.

UDA Figure 1Note that OLEdb is not based on a standard. One could argue, however, that anything done by Microsoft, with a distribution of 65 million systems, is a de-facto standard. Also note how the JDBC standard leverages the ODBC standard. This is one of the benefits of standards. There is a huge incentive to developers to leverage existing standards, even when developing new ones. In this game, the consumer is the winner!

Data Access Strategies
Key Decision Point – Availability

This is probably the most critical factor in determining what strategy to use. If the client application that you plan to use only supports one data access strategy, then you are limited to that method i.e. Excel for Office 95 will only deal with ODBC drivers.

Fortunately, most of the data access methods are either available now, or will be shortly for multivalued databases.

ODBC

ODBC, presently at Release 3.5, is the most widely supported standard for data access. In addition to being a de-facto standard, it is also based on de-jure standards, the SAG CLI and ANSI/92 SQL. Client-side drivers are available for Windows, UNIX, Macintosh and other environments. A large number of databases, both relational and non-relational, are available. Unfortunately for multivalued users, only Win32 drivers and a limited number of UNIX drivers for UniVerse are available at this time.
On non-Windows platforms, the latest standard supported is ODBC 2.10. Very few drivers have been written to the ODBC 3.x standard. 3.x was a significant departure from 2.x and prior, with no significant benefits to make it desirable. Microsoft has added two features to ODBC, both of which are outside of the standard, and both of which could have been retro-fitted to ODBC 2.x. These are:
 
 
These features are especially important because they are a significant enabling tool for the Microsoft Transaction Server and Microsoft’s multi-tier architecture, called Digital interNet Architecture (DNA). For more information on this, see http://www.microsoft.com/dna/.
 
The ODBC driver from Liberty is compatible with all MultiValue databases, hardware platforms, operating systems and communication protocols. This means that a single application such as Microsoft Excel or Crystal Reports can seamlessly access data from your MultiValue system, regardless of the hardware, DBMS, operating system or communication protocol used.
 
One important area of consideration when choosing an ODBC driver is to look at the complexity of your data. Some ODBC drivers, especially some of the free ones bundled with Operating Systems, require your data to be very well structured, with all multivalues defined with association values, before they will access the data correctly. If your legacy multivalued data does not follow the “normal” conventions of multivalued data, you may need to consider a more expensive, commercial third party ODBC vendor’s tool.
 
When to Use ODBC
When your client applications are Win32 only, then ODBC becomes a likely candidate.
 
Also, if your client applications all support ODBC and you have an available ODBC driver for your database platform, this again would suggest ODBC as the best solution.
 
ODBC drivers are available on a few UNIX platforms, for a few multivalued databases only. If you need access from UNIX and happen to have such a driver available, then ODBC is again an excellent option.
 
OLEdb
 
This methodology, unlike ODBC, is not standards based, although some would argue that anything done by the Redmond Giant is a standard just by virtue of the size of its install base. The other issue with OLEdb is its proprietary nature. It is a standard that was intended for Win32, and ActiveX in particular.
 
OLEdb and ActiveX are not available under UNIX. In a recent move, however, Microsoft is making available a strategy and set of tools to enable programmers who develop non-GUI Distributed Component Object Model (DCOM) components to port these to UNIX. DCOM is the Object Request Broker (ORB) component of Microsoft’s ActiveX technology. With this in place, it should be possible to create an OLEdb driver for Windows, and port it to UNIX. You can also access an OLEdb driver on a Windows system from a UNIX-based DCOM component.
 
While this is possible, Microsoft has not made the conversion an easy process. It typically involves developing and debugging your component under Win32, then going through a number of convoluted steps to do the conversion, using a third party toolkit. Then, if it turns out that your component needs debugging, you’re going to have a difficult job ahead of you…
 
OLEdb supports both Connection Pooling and Distributed Transactions.
 
One of the wonderful strengths of OLEdb is the ability, using Visual Studio 6.0 wizards, to create an OLEdb consumer object that consumes an OLEdb driver. You can build ActiveX components and include them in your “package”. This package, using MTS, can be configured to remote the ODBC calls, so that instead of installing the OLEdb drivers on all the clients, they are installed on a server and the calls happen from the server, which also pools the connections. The client application accesses the OLEdb driver as though it were local, handles the database as though it were the only client that had to access it (ignore locking and concurrency, MTS does this for you) and pretends to connect/disconnect before and after each access. MTS magically gives your components distributed, cached and pooled connections, and distributed transactions, with concurrency control and distributed, two-phase, commit/rollback.
 
Implementation of an MTS based application is not for the faint of heart. It requires OLEdb drivers that do not crash, or they run the risk of “hanging” that pooled connection. There are other areas of complexity. To my knowledge the only companies seriously developing with this technology were making constant use of Microsoft Professional Services.

Liberty Integration Software is planning the release of both an OLE DB interface that supports full relational query capability and an OLEdb driver that supports access to the pure multivalued structures.
 
In another key move, Microsoft, in implementing OLEdb drivers for things like POP3 servers, have run into the concept of reply-to fields with multiple values (multivalues). Suddenly, Microsoft is releasing tools, like their Hierarchical Flex-grid, that support multivalues, even with Liberty’s current ODBC driver.
 
When to Use OLEdb
OLEdb is best used when you are planning to access data on Win32 platforms only. At present, it is not a viable option for non-Win32 environments.

Also, very few applications support OLEdb drivers, unless they have full ODBC-style query support. OLEdb drivers with that level of support are presently few and far between i.e. Microsoft Office 97 suite supports OLEdb drivers, provided they are full-featured enough.

Microsoft provides a query component, which adds query support to level 1 OLEdb drivers. While this technically enables SQL syntax against drivers that don’t natively support it, for large databases this is likely to be extremely inefficient, since it literally means that the driver must manipulate the entire table in many instances.
If the nature of your project is such that you need high-end scalability, distributed transactions, or other high-end features, and you feel that you can afford the development costs, then the DNA architecture, with MTS, is a good fit.

If you are a Visual Basic or Visual C++ developer with a need to create custom applications to access multivalued data, then OLEdb is a good choice.
 
ActiveX Data Objects (ADO)
 
ADO is a tool that abstracts the OLEdb layer. It is generally recommended that access to OLEdb drivers be done through ADO. Use of MTS pretty well mandates the use of ADO. Drag and drop creation and/or usage of OLEdb based components uses ADO by default. This is really an extension of OLEdb, and not a separate data access strategy. It can access ODBC drivers through the OLEdb to ODBC bridge.
 
Universal Data Access (UDA)
UDA Figure 2
This is the name that Microsoft gives to it’s overall Data Access Strategy. Figure 2 (right) gives a picture of what this architecture looks like.
 
For more information on Universal Data Access see http://www.microsoft.com/data/
 
Java Database Connectivity (JDBC)

The JDBC standard, like ODBC, is standards based, being based on the SAG CLI, and modeled after ODBC to benefit from the existing industry investment in ODBC. It also has the benefit of being machine neutral. At this time, Liberty’s JDBC driver has been certified 100% Pure Java and has been run on a large number of client systems, all without requiring a recompile. This is clearly the option for people who want the widest coverage, and there is even a JDBC driver to access ODBC drivers, allowing you to leverage existing Win32 or UNIX based ODBC drivers.

JDBC is a one-language tool. It is intended to be used from Java. If you are planning to write your applications in C or C++, this is probably a problem. If you are happy developing in Java, then you have the issue of determining if a Java Runtime Environment (JRE) is available on all the platforms for which you are developing. Most commercial platforms have a JRE at this time.

Java supports a number of RPC connectivity schemes, including RMI, CORBA and Java Beans, which provide basically the same type of functionality as DCOM. Unlike C, C++ or Visual Basic working with DCOM, Java working with RMI, CORBA or Java Beans allow you to very easily integrate RPC connectivity. Even accessing DCOM under Java with Visual J++ is much simpler than the same connectivity under Visual C++ or Visual Basic. This is because Java is Object Oriented by nature, while both C++ and Visual Basic are procedural languages that were coerced into providing object oriented constructs.

As a side note, IBM has tools that provide similar functionality to MTS, although the development methodologies are much different.

Unfortunately, most Windows-based applications support ODBC only. So you don’t have the option to use JDBC in those cases.
 
When to Use JDBC
If you ask yourself the following questions and get a YES to all of them, you probably want to use JDBC:
  1. Do you need to run your application on multiple diverse systems?
  2. Do all of these systems have a JRE available?
  3. If you are using a 3rd party application, does it support JDBC access as an option?
  4. If you are creating a home-grown application, are you willing to code it, or a portion of it, in Java?
 
 
The benefit here is obvious. With just a little care, a single application can work for all platforms.

While the promise of Write Once Run Anywhere (WORA) is not an automatic reality, it can be achieved with just a little effort and testing. The place where problems usually occur are with user interface and timing issues between threads. Obviously the performance characteristics of specific machines vary and the number of pixels and behavior of GUI controls differ between operating systems, so these areas will always require testing and attention. I liken this to the observation that while portability between multivalued platforms is not a given, it often has been something that has been achieved with a weekend’s work. There have been many cases where a Java application has been coded to run on multiple platforms.

If any of the answers above was a NO, however, JDBC is probably not an option for you.

Summary

Today, the MultiValue community has a great deal of choice in the standards based tools and technologies available to them to integrate MultiValue databases with the world of Windows and the Web. Whatever your new project may be, you will most likely have a number of different avenues available. As opposed to a few years back, where the trick was to find a solution that worked, the trick today will be isolating what tools and technologies will best suit your needs.

In the next issue, Universal Data Access will be discussed from the application point of view. Sample applications and how they make use of Universal Data Access will be presented.

Robert Houben
Liberty Integration Software

Universal Data Access - Part 2


Last Updated: 30 June 1999

Home | | What's New | | News | | Articles | | Workspace | | Pick of the Web | | InfoCentre | | Parting Shots | | About Us | | Subscribe |