Universal
Data Access - Part 1
As recently as a few years ago, the options for bringing MultiValue data
to the PC desktop were quite limited. Todays world on the other hand offers
a veritable potpourri of MultiValue data access tools. The difficulty these
days is not in finding a solution, but instead, in deciding what solution best
fits your needs. Robert Houben explains.
This article attempts to lay out the standards based data access solutions
available to the MultiValue community today. The intention is to provide you
with some guidelines on how to identify the right solution for your needs. Next
issue we will give examples of how this new technology can be used in todays
applications.
Standards Based Database Access
In 1991, Microsoft began work with Lotus, Digital, IBM, Apple and other members
of the X/Open standards committee called the SQL Access Group (SAG). This group
developed a specification for a symmetrical Call Level Interface (CLI) and a
subset of ANSI SQL that could be implemented by database vendors and ISVs,
providing some level of interoperability. Microsoft delivered this strategy
to the world with the name Open DataBase Connectivity (ODBC) and launched it
at the ODBC 1.0 Beta conference on March 9-10, 1992.
This was the first time that a standards-based data access strategy was delivered
to the software community. Liberty was involved indirectly in the creation of
Microsofts ISAM ODBC drivers and was at this beta conference. Within less
than a year of this conference Liberty had released an ODBC driver that understood
multivalued data and was the first non-Microsoft vendor to produce an ODBC driver
of any kind. ODBC went on to gain market acceptance as a strategy for accessing
relational data.
At the time of the ODBC 1.0 beta conference, Microsoft, in talking about their
future directions, acknowledged that almost 80% of all corporate data lived
in non-relational data stores, and they knew that they needed to have a strategy
for accessing this data. During the next 1 or 2 years, while Microsoft struggled
with how best to solve this issue, since ODBC was such a widespread standard,
an amazing number of non-relational databases were coerced into looking relational
so they could participate in the standards that had been set. It seems that
even a poor standard that was a bad fit was better than no standard at all!
Today, developers have a number of standards-based solutions to consider when
looking at strategies for accessing data from industry standard tools and technologies.
These choices can be confusing, and choosing between them can prove a challenging
task.
These are the data access strategies that we will be considering:
- ODBC
- OLEdb
- ActiveX Data Objects (ADO)
- Universal Data Access (UDA)
- Java Database Connectivity (JDBC)
Relationship to Standards
Figure 1 (right) shows how the various data access strategies being considered
relate to the SAG CLI standard.
Note
that OLEdb is not based on a standard. One could argue, however, that anything
done by Microsoft, with a distribution of 65 million systems, is a de-facto
standard. Also note how the JDBC standard leverages the ODBC standard. This
is one of the benefits of standards. There is a huge incentive to developers
to leverage existing standards, even when developing new ones. In this game,
the consumer is the winner!
Data Access Strategies
Key Decision Point Availability
This is probably the most critical factor in determining what strategy to use.
If the client application that you plan to use only supports one data access
strategy, then you are limited to that method i.e. Excel for Office 95 will
only deal with ODBC drivers.
Fortunately, most of the data access methods are either available now, or will
be shortly for multivalued databases.
- ODBC
ODBC, presently at Release 3.5, is the most widely supported standard for
data access. In addition to being a de-facto standard, it is also based on
de-jure standards, the SAG CLI and ANSI/92 SQL. Client-side drivers are available
for Windows, UNIX, Macintosh and other environments. A large number of databases,
both relational and non-relational, are available. Unfortunately for multivalued
users, only Win32 drivers and a limited number of UNIX drivers for UniVerse
are available at this time.
- On non-Windows platforms, the latest standard supported is ODBC 2.10. Very
few drivers have been written to the ODBC 3.x standard. 3.x was a significant
departure from 2.x and prior, with no significant benefits to make it desirable.
Microsoft has added two features to ODBC, both of which are outside of the
standard, and both of which could have been retro-fitted to ODBC 2.x. These
are:
-
- Connection Pooling (this requires Microsoft Transaction Server (MTS)),
and
- Distributed Transactions (this requires both MTS and server-side 2-phase
commit support).
-
- These features are especially important because they are a significant enabling
tool for the Microsoft Transaction Server and Microsofts multi-tier
architecture, called Digital interNet Architecture (DNA). For more information
on this, see http://www.microsoft.com/dna/.
-
- The ODBC driver from Liberty is compatible with all MultiValue databases,
hardware platforms, operating systems and communication protocols. This means
that a single application such as Microsoft Excel or Crystal Reports can seamlessly
access data from your MultiValue system, regardless of the hardware, DBMS,
operating system or communication protocol used.
-
- One important area of consideration when choosing an ODBC driver is to look
at the complexity of your data. Some ODBC drivers, especially some of the
free ones bundled with Operating Systems, require your data to be very well
structured, with all multivalues defined with association values, before they
will access the data correctly. If your legacy multivalued data does not follow
the normal conventions of multivalued data, you may need to consider
a more expensive, commercial third party ODBC vendors tool.
-
- When to Use ODBC
When your client applications are Win32 only, then ODBC becomes a
likely candidate.
-
- Also, if your client applications all support ODBC and you have an available
ODBC driver for your database platform, this again would suggest ODBC as the
best solution.
-
- ODBC drivers are available on a few UNIX platforms, for a few multivalued
databases only. If you need access from UNIX and happen to have such a driver
available, then ODBC is again an excellent option.
-
- OLEdb
-
- This methodology, unlike ODBC, is not standards based, although some would
argue that anything done by the Redmond Giant is a standard just by virtue
of the size of its install base. The other issue with OLEdb is its proprietary
nature. It is a standard that was intended for Win32, and ActiveX in particular.
-
- OLEdb and ActiveX are not available under UNIX. In a recent move, however,
Microsoft is making available a strategy and set of tools to enable programmers
who develop non-GUI Distributed Component Object Model (DCOM) components to
port these to UNIX. DCOM is the Object Request Broker (ORB) component of Microsofts
ActiveX technology. With this in place, it should be possible to create an
OLEdb driver for Windows, and port it to UNIX. You can also access an OLEdb
driver on a Windows system from a UNIX-based DCOM component.
-
- While this is possible, Microsoft has not made the conversion an easy process.
It typically involves developing and debugging your component under Win32,
then going through a number of convoluted steps to do the conversion, using
a third party toolkit. Then, if it turns out that your component needs debugging,
youre going to have a difficult job ahead of you
-
- OLEdb supports both Connection Pooling and Distributed Transactions.
-
- One of the wonderful strengths of OLEdb is the ability, using Visual Studio
6.0 wizards, to create an OLEdb consumer object that consumes an OLEdb driver.
You can build ActiveX components and include them in your package.
This package, using MTS, can be configured to remote the ODBC calls, so that
instead of installing the OLEdb drivers on all the clients, they are installed
on a server and the calls happen from the server, which also pools the connections.
The client application accesses the OLEdb driver as though it were local,
handles the database as though it were the only client that had to access
it (ignore locking and concurrency, MTS does this for you) and pretends to
connect/disconnect before and after each access. MTS magically gives your
components distributed, cached and pooled connections, and distributed transactions,
with concurrency control and distributed, two-phase, commit/rollback.
-
- Implementation of an MTS based application is not for the faint of heart.
It requires OLEdb drivers that do not crash, or they run the risk of hanging
that pooled connection. There are other areas of complexity. To my knowledge
the only companies seriously developing with this technology were making constant
use of Microsoft Professional Services.
Liberty Integration Software is planning the release of both an OLE DB interface
that supports full relational query capability and an OLEdb driver that supports
access to the pure multivalued structures.
-
- In another key move, Microsoft, in implementing OLEdb drivers for things
like POP3 servers, have run into the concept of reply-to fields with multiple
values (multivalues). Suddenly, Microsoft is releasing tools, like their Hierarchical
Flex-grid, that support multivalues, even with Libertys current ODBC
driver.
-
- When to Use OLEdb
OLEdb is best used when you are planning to access data on Win32 platforms
only. At present, it is not a viable option for non-Win32 environments.
Also, very few applications support OLEdb drivers, unless they have full ODBC-style
query support. OLEdb drivers with that level of support are presently few
and far between i.e. Microsoft Office 97 suite supports OLEdb drivers, provided
they are full-featured enough.
Microsoft provides a query component, which adds query support to level 1
OLEdb drivers. While this technically enables SQL syntax against drivers that
dont natively support it, for large databases this is likely to be extremely
inefficient, since it literally means that the driver must manipulate the
entire table in many instances.
If the nature of your project is such that you need high-end scalability,
distributed transactions, or other high-end features, and you feel that you
can afford the development costs, then the DNA architecture, with MTS, is
a good fit.
If you are a Visual Basic or Visual C++ developer with a need to create custom
applications to access multivalued data, then OLEdb is a good choice.
-
- ActiveX Data Objects (ADO)
-
- ADO is a tool that abstracts the OLEdb layer. It is generally recommended
that access to OLEdb drivers be done through ADO. Use of MTS pretty well mandates
the use of ADO. Drag and drop creation and/or usage of OLEdb based components
uses ADO by default. This is really an extension of OLEdb, and not a separate
data access strategy. It can access ODBC drivers through the OLEdb to ODBC
bridge.
-
- Universal Data Access (UDA)
-
- This is the name that Microsoft gives to its overall Data Access Strategy.
Figure 2 (right) gives a picture of what this architecture looks like.
-
- For more information on Universal Data Access see http://www.microsoft.com/data/
-
- Java Database Connectivity (JDBC)
The JDBC standard, like ODBC, is standards based, being based on the SAG CLI,
and modeled after ODBC to benefit from the existing industry investment in
ODBC. It also has the benefit of being machine neutral. At this time, Libertys
JDBC driver has been certified 100% Pure Java and has been run on a large
number of client systems, all without requiring a recompile. This is clearly
the option for people who want the widest coverage, and there is even a JDBC
driver to access ODBC drivers, allowing you to leverage existing Win32 or
UNIX based ODBC drivers.
JDBC is a one-language tool. It is intended to be used from Java. If you are
planning to write your applications in C or C++, this is probably a problem.
If you are happy developing in Java, then you have the issue of determining
if a Java Runtime Environment (JRE) is available on all the platforms for
which you are developing. Most commercial platforms have a JRE at this time.
Java supports a number of RPC connectivity schemes, including RMI, CORBA and
Java Beans, which provide basically the same type of functionality as DCOM.
Unlike C, C++ or Visual Basic working with DCOM, Java working with RMI, CORBA
or Java Beans allow you to very easily integrate RPC connectivity. Even accessing
DCOM under Java with Visual J++ is much simpler than the same connectivity
under Visual C++ or Visual Basic. This is because Java is Object Oriented
by nature, while both C++ and Visual Basic are procedural languages that were
coerced into providing object oriented constructs.
As a side note, IBM has tools that provide similar functionality to MTS, although
the development methodologies are much different.
Unfortunately, most Windows-based applications support ODBC only. So you dont
have the option to use JDBC in those cases.
-
- When to Use JDBC
If you ask yourself the following questions and get a YES to all of
them, you probably want to use JDBC:
-
- Do you need to run your application on multiple diverse systems?
- Do all of these systems have a JRE available?
- If you are using a 3rd party application, does it support JDBC access
as an option?
- If you are creating a home-grown application, are you willing to code
it, or a portion of it, in Java?
-
- The benefit here is obvious. With just a little care, a single application
can work for all platforms.
While the promise of Write Once Run Anywhere (WORA) is not an automatic reality,
it can be achieved with just a little effort and testing. The place where
problems usually occur are with user interface and timing issues between threads.
Obviously the performance characteristics of specific machines vary and the
number of pixels and behavior of GUI controls differ between operating systems,
so these areas will always require testing and attention. I liken this to
the observation that while portability between multivalued platforms is not
a given, it often has been something that has been achieved with a weekends
work. There have been many cases where a Java application has been coded to
run on multiple platforms.
If any of the answers above was a NO, however, JDBC is probably not an option
for you.
Summary
Today, the MultiValue community has a great deal of choice in the standards
based tools and technologies available to them to integrate MultiValue databases
with the world of Windows and the Web. Whatever your new project may be, you
will most likely have a number of different avenues available. As opposed to
a few years back, where the trick was to find a solution that worked, the trick
today will be isolating what tools and technologies will best suit your needs.
In the next issue, Universal Data Access will be discussed from the application
point of view. Sample applications and how they make use of Universal Data Access
will be presented.
Robert Houben
Liberty Integration Software
Universal Data Access - Part 2
Last Updated: 30
June 1999