AMCY5.COM
Your Business Partner
Projects Reports Proposals Presentations Letters
Every Thing for Your Business
 
HomeProjectsReportsPresentationsLettersProposalsContact Us
 
Marketing Finance HRM IT OB Research ENT Economics

Project on RDBMS and ORDBMS

Share |

We are providing Projects for your business growth and to meet new challenges. Here are some projects prepared by our team of "Developing New Projects" for the Guarantee of your business growth


History of Database

 

Database

 

It is often said that we live in an information society and that information is a very valuable resource (or, as some people say, information is power). In this information society, the term database has become a rather common term although its meaning seems to have become somewhat vague as the importance of database systems has grown. Some people use the term database of an organization to mean all the data in the organization (whether computerized or not). Other people use the term to mean the software that manages the data. We would use it to mean a collection of computerized information such that it is available to many people for various uses.

 

A database is a well organized collection of data that are related in a meaningful way which can be accessed in different logical orders but are stored only once. The data in the database is therefore integrated, structured, and shared.

Or

A Database can be considered an organized collection of information.

 For example, it is easy to find a book in a library because books are arranged in an organized manner. The tools that help us

Ü  Store Data

Ü  Adding new Information

Ü  Changing the existing information

Ü  Deleting the existing information

Ü  Viewing the information

The data is managed by a Database Management System.

The phone book is a good example of a simple database. For example, to find the phone number of Mr. Jonathan Smith, living at No.5 4 th Avenue Apt 15, New York, NY, one may use the following process: Get the phone book for the city of New York. Flip to the section where the last name is ‘Smith’. More than one Mr. Smith may be listed. In that case, look for Mr. Smith whose first name is Jonathan. Look for Jonathan Smith living at No. 5 4 th Avenue Apt 15, New York, NY to get the corresponding phone number. In the example above, the phone number can be found because the information in the phone book was organized in a certain manner.

In the early days of computing, computers were mainly used for solving numerical problems. Even in those days, it was observed that some of the tasks were common in many problems that were being solved. It therefore was considered desirable to build special subroutines which perform frequently occurring computing tasks such as computing Sin(x). This lead to the development of mathematical subroutine libraries that are now considered integral part of any computer system.

 

By the late fifties storage, maintenance and retrieval of non-numeric data had become very important. Again, it was observed that some of the data processing tasks occurred quite frequently e.g. sorting. This lead to the development of generalized routines for some of the most common and frequently used tasks. A general routine by its nature tends to be somewhat less efficient than a routine designed for a specific problem. In the late fifties and early sixties, when the hardware costs were high, use of general routines was not very popular since it became a matter of trade-off between hardware and software costs. In the last two decades, hardware costs have gone down dramatically while the cost of building software has gone up. It is no more a trade-off between hardware and software costs since hardware is relatively cheap and building reliable software is expensive.

Database management systems evolved from generalized routines for file processing as the users demanded more extensive and more flexible facilities for managing data. Database technology has undergone major changes during the 1970's. An interesting history of database systems is presented by Fry and Sibley (1976).

 

Types of Databases

 

There are many techniques and technologies for managing

Ü  DBMS  (Database Management Systems)

§  dBase, Fox Pro

Ü  HDBMS  (Hierarchical DBMS)

§  IMS

Ü  NDBMS  (Network DBMS)

§  IDMS

Ü  RDBMS  (Relational DBMS)

§  Oracle, SQL Server, Sybase, Informix, etc.

Ü  ORDBMS  (Object Relational DBMS)

§  Oracle

 

DBMS: - Database Management Systems have been used for quite a while. Software packages like dBase; Clipper, etc had facilities for managing data. The user had to take care of two aspects – specify what data was to be retrieved and how to go and get the data.

HDBMS: - Managed the data in a hierarchical fashion. IBM’s product IMS is a Hierarchical Database. It is used in mainframe computers. It is still in use in some installations. Over a period of time the demands of the users prompted a search for alternate approaches.

 

NDBMS: - Managed the data according to network models. IBM’s product IDMS is a Network Database. It is also used in mainframe computers. Like the HDBMS (IMS), it is still used in a few installations. Over a period of time disadvantages of the network database technology prompted the search for an alternate solution. With more powerful personal computer systems and the inherent limitations of the Hierarchical and Network database models, the popularity of the Relational Database Model increased.

RDBMS: - Was invented by the team lead by Dr. Edmund F. Codd and funded by IBM in the early 1970’s. The Relational Model is based on the principles of relational algebra. This model also known as the Relational Database Management System is very popular and is in use by a majority of the Fortune 500 companies. This model will be used in great detail in this course. Many vendors sell products that conform to this model. Some of these vendors are: Oracle, SQL Server, Sybase, Informix, Ingress, Gupta SQL, DB2, and Microsoft Access.

ORDBMS: - This model has been proposed and is presently being implemented by the Oracle Corporation. This model addresses the shortcomings of the RDBMS model. The Object Relational Database Management System is not purely object oriented. However, it is implemented via the same relational engine that drives the Relational Database Management System.

 

Data is stored in tables. A table is the combination of columns and rows. Each column contains a certain type of data about a person. For example, the first column contains the last name, the second column contains the first name and the third column contains the address. Each row contains all types of data about a person.

 

Client / Server Database Concepts

The Client computer sends requests to the server. The Server processes the requests and delivers results to the client.

Server: - This is usually a high powered computer. Typically it could consist of several processors (2, 3 or 4), a very large amount of RAM (2GB or 4GB), a very large amount of hard disk storage space (possibly a few hundred GB), high speed network data transfer rates, etc. The server need not be from a specific vendor. It could be purchased from any vendor. The server platform may be UNIX, Windows NT, AIX, Sun Solaris, Novell, etc.

Client: - These are usually personal computers. They have processing power of their own. The client operating system may be Windows 95, Windows 98, Windows 2000, Windows NT Workstation, Sun Sparc Station, Mac OS, etc.

Operation: - Operation of the client server network is as follows:

• The server is powered on and is ready to receive connections / requests. The network administrator configures the server.

• Each client is powered on and is required to provide user identification and authentication information (such as a user ID and a password) to connect to the server. In a multiple server environment, the client computer may also be required to specify the location of the server to which it must connect. This setup creates the infrastructure for other programs to be used on this configuration. In most environments, databases, e-mail servers, web servers are run on a server computer. Clients use the resources offered by these databases, websites, etc.

 

 

 

DBMS

(Database Management System)

 

As discussed earlier, a database is a well organized collection of data. To be able to carry out operations like insertion, deletion and retrieval, the database needs to be managed by a substantial package of software. This software is usually called a Database Management System (DBMS). The primary purpose of a DBMS is to allow a user to store, update and retrieve data in abstract terms and thus make it easy to maintain and retrieve information from a database. A DBMS relieves the user from having to know about exact physical representations of data and having to specify detailed algorithms for storing, updating and retrieving data.

 

A DBMS is usually a very large software package that carries out many different tasks including the provision of facilities to enable the user to access and modify information in the database. The database is an intermediate link between the physical database, the computer and the operating system, and on the other hand, the users. To provide the various facilities to different types of users, a DBMS normally provides one or more specialized programming languages often called Database Languages. Different DBMS provide different database languages although a language called SQL has recently taken on the role of a de facto standard. Database languages come in different forms. A language is needed to describe the database to the DBMS as well as provide facilities for changing the database and for defining and changing physical data structures. Another language is needed for manipulating and retrieving data stored in the DBMS. These languages are called Data Description Languages (DDL) and Data Manipulation Languages (DML) respectively.

To summarize, a database system consists of

  1. The database (data)
  2. A DBMS (software)
  3. A DDL and a DML (Part of the DBMS)Application programs

 

Some DBMS packages are marketed by computer manufacturers that will run only on that manufacturer's machines (e.g. IBM's IMS) but increasingly independent software houses are designing and selling DBMS software that would run on several different types of machines (e.g. ORACLE).

 

Evolution of Database Management Systems

 

The history of Database Management Systems (DBMS) is an evolutionary history. Each successive wave of innovation can be seen as a response either to some limiting aspect of the technology preceding it or to demands from new application domains. In economic terms, successful information technology lowers the costs incurred when implementing a solution to a particular problem. Lowering development costs increases the range of information management problems that can be dealt with in a cost-effective way, leading to new waves of systems development.

 


As an example of changing development costs, consider that early mainframe information systems became popular because they could make numerical calculations with unprecedented speed and accuracy (by the standards of the time). This had a major impact on how businesses managed financial assets and how engineering projects were undertaken. It took another evolutionary step—the invention of higher level languages such as COBOL—to get to the point that the economic return from a production management system justified the development cost. It took still another innovation—the adoption of relational DBMS before customer management and human resource systems could be automated. Today it is difficult to imagine how you could use COBOL and ISAM to build the number and variety of e-commerce Web sites that have sprung up over the last three years. And looking ahead, as we move into an era of ubiquitous, mobile computers and the growing importance of digital media, it is difficult to see how RDBMS technology can cope.

 

Early Information Systems

 

Pioneering information systems of the late 1960s and early 1970s were constrained by the capacity of the computers on which they ran. Hardware was relatively expensive, and by today’s standards, quite limited. Consequently, considerable human effort was devoted to optimizing programming code. Diverting effort from functionality to efficiency constrained the ambition of early developers, but the diversion of effort was necessary because it was the only way to develop an information system that ran with adequate performance. Information systems of this era were implemented as monolithic programs combining user-interface, data processing logic, and data management operations into a single executable on a single machine. Such programs intermingled low-level descriptions of data structure details, how records were laid out on disk, how records were interrelated, with user-interface management code. User-interface management code would dictate, for example, where an element from a disk record went on the screen. Maintaining and evolving these systems, even to perform tasks as simple as adding new data fields and indexes, required programmers to make changes involving most aspects of the system. Experience has shown that development is the least expensive stage of an information system’s life cycle. Programming and debugging take time and money, but by far the largest percentage of the total cost of an information system is incurred after it goes into production. As the business environment changes, there is considerable pressure to change systems with it. Changing a pre-relational information system, either to fix bugs or to make enhancements, is extremely difficult. Yet, there is always considerable pressure to do so. In addition, hardware constantly gets faster and cheaper. A second problem with these early systems is that as their complexity increased, it became increasingly important to detect errors in the design stage. The most pervasive design problem was redundancy, which occurs when storage structure designs record an item of information twice.

Different DBMS Architectures

There are three types of Architectures in the Database. They are as under

Hierarchical Model

The hierarchical model is the basis of the oldest database management systems, which grew out of early attempts to organize data for the U.S space program. Since these databases were ad-hoc solution to immediate problems, they were created without the strong theoretical foundations that later systems had. Their designers were familiar with file organizations and data structures, and used these concepts to solve the immediate data representation problems of users. The hierarchical model uses the tree as its basic data structure. Nodes of the trees in the hierarchical model represent data records or record segments, which are the portions of the data records. Relationships are represented as links or pointers between nodes.

 

The Network Model   

The network uses a network or plex structure, which is data structure consisting of nodes and branches. Unlike a tree, a plex structure allows a node to have more than one parent. The nodes of the network represent records of various types. Relationships between records are represented as links, which become pointers in the implementation.

Relational Model

The relational model was proposed by Codd in 1970 and continues to be the subject of much research. It is now widely used by both mainframe and microcomputers-based DBMSs because of its simplicity from user’s point of view and its power. The relation model uses the theory of relations from mathematics and adapts it for use in database theory. In relation model both entities and relationships are represented by relations which are physically represented as tables or two-dimensional arrays, and attributes as columns of those tables.

 

Before going into the detail of the Relational and Object Relational Database Management Systems we should have a knowledge of the Relational Design.

 

RDBMS

(Relational Database Management System)

 

In 1970 an IBM researcher named Ted Codd wrote a paper that described a new approach to the management of “large shared data banks.” In his paper Codd identifies two objectives for managing shared data. The first of these is data independence, which dictates that applications using a database be maintained independent of the physical details by which the database organizes itself. Second, he describes a series of rules to ensure that the shared data is consistent by eliminating any redundancy in the database’s design. Codd’s paper deliberately ignores any consideration of how his model might be implemented. He was attempting to define an abstraction of the problem of information management: a general framework for thinking about and working with information. The Relational Model Codd described had three parts: a data model,

a means of expressing operations in a high-level language, and a set of design principles that ensured the elimination of certain kinds of redundant data problems. Codd’s relational model views data as being stored in tables containing a variable number of rows (or records), each of which has a fixed number of columns. Something like a telephone directory or a registry of births, deaths, and marriages, is a good analogy for a table. Each entry contains different information but is identical in its form to all other entries of the same kind. The relational model also describes a number of logical operations that could be performed over the data. These operations included a means of getting a subset of rows (all names in the telephone directory with the surname “Brown”), a subset of columns (just the name and number), or data from a combination of tables (a person who is married to a person with a particular phone number). By borrowing mathematical techniques from predicate logic, Codd was able to derive a series of design principles that could be used to guarantee that the database’s structure was free of the kinds of redundancy so problematic in other systems. Greatly expanded by later writers, these ideas formed the basis of the theory of normal forms. Properly applied, the system of normal form rules can ensure that the database’s logical design is free of redundancy and, therefore, any possibility of anomalies in the data.

Elements of a Relational Database

There are four basic elements which are necessary for the Relational Database. Those are given below.

Ü  Relations

Ü  Attributes

Ü  Domains

Ü  The Relational Operators.

Codd describe a data storage system that possessed three characteristics that were sorely needed at that time:

1.      Logical data independence: - This desirable characteristic means that changes made to an attribute (column), for example, an increase or decrease in size have no perceivable effect on other attributes for the same relation (table). Logical data independence was attractive to data processing organizations because it could substantially reduce the cost of software maintenance.

2.      Referential and data integrity: - Unlike other database systems, a relational database would relieve the application software of the burden of enforcing integrity constraints. Codd described two characteristics that would be maintained by a relational database, referential integrity and data integrity.

3.      Ad hoc query: - This characteristic would enable the user to tell the database which data to retrieve without indicating how to accomplish the task.

Some time passed before a commercial product actually implemented some of the relational database features that Codd described. Today many vendors sell Relational Database Management Systems; some of the more well-known vendors are Oracle, Sybase, IBM, Informix, Microsoft, and Computer Associates. Of these vendors, Oracle has emerged as the leader. The Oracle RDBMS engine has been ported to more platforms than any other database product. Because of Oracle's multiplatform support, many application software vendors have chosen Oracle as their database platform. And now, Oracle Corporation has ported the same RDBMS engine to the desktop environment with its release of Personal Oracle.

Relational Database Implementation

Today's relational databases implement a number of extremely useful features that Codd did not mention in his original article. However, as of this writing, no commercially available database is a full implementation of Codd's rules for relational databases.

 

During the early 1970s, several parallel research projects set out to implement a working RDBMS. This turned out to be very hard. It wasn’t until the late 1980s that RDBMS products worked acceptably in the kinds of high-end, online transactions processing applications served so well by earlier technologies. Despite the technical shortcomings RDBMS technology exploded in popularity because even the earliest products made it cheaper, quicker, and easier to build information systems. For an increasing number of applications, economics favored spending more on hardware and less on people. RDBMS technology made it possible to develop information systems that, while desirable from a business management point of view, had been deemed too expensive. To emphasize the difference between the relational and pre-relational approaches, a four hundred line C program can be replaced by the SQL-92 expression in Listing below.

CREATE TABLE Employees (

Name VARCHAR(128),

DOB DATE,

Salary DECIMAL(10,2),

Address VARCHAR(128)

);

 

The code in list implements considerably more functionality than a C program because RDBMS provide transactional guarantees for data changes. They automatically employ locking, logging, and backup and recovery facilities to guarantee the integrity of the data they store. Also, RDBMS provide elaborate security features. Different tables in the same database can be made accessible only by different groups of users. All of this built-in functionality means that developers focus more of their effort on their system’s functionality and less on complex technical details.

Relational Database Theory

With relational database, systems can be considered complete unless it discusses the basic concepts of relational database theory. In a nutshell:

Ü  Every entity has a set of attributes that uniquely define an instance of that entity.

Ü  No part of the primary key can be null.

Ü  Every entity has a set of attributes that uniquely identify each row in that entity.

Ü  A table cannot have duplicate rows.

Ü  Tables are related to one another.

Ü  The order of the rows in a table is arbitrary.

Ü  The order of the columns in a table is arbitrary.

Architecture of RDBMS

There are three standard levels for Relational Database management system

Ü  Conceptual level

Ü  External level

Ü  Internal level

 

 

External Level

The external level consist of many different external views or external models of the database. Each user has a model of the real world represented in a form that is suitable for that user. A particular user interacts with only certain entities in the real world and is interested in only some of their attributes and relationships. Therefore, that user’s view contain only information about those aspects of the real world.

Conceptual Level

The conceptual level consists of base tables, which are physical stored tables. These tables are created by Database Administrator using a CREATE Table command. Once the table is created the DBA can create “VIEW” for Users. A view may be a subset of single base table or may be created by combining base tables or performing other Operations on them.

Internal Level

The level which covers the physical implementation of the database. It includes the data structure and file organization used to store data on physical storage devices. The internal schema, written in DDL, is a complete description of the internal model. It includes such items as how data is represented, how records are sequenced, what index exist, what pointers exist. An internal record is a single stored record. It is the unit that is passed up to the internal level.

Primary Key

Every entity has a set of attributes that uniquely define an instance of that entity. This set of attributes is called the primary key. The primary key may be composed of a single attribute

No Part of the Primary Key Is Null

A basic tenet of relational theory is that no part of the primary key can be null. If you think about that idea for a moment, it seems intuitive: The primary key must uniquely identify each row in an entity; therefore, if the primary key (or a part of it) is null, it wouldn't be able to identify anything.

Data Integrity

According to relational theory, every entity has a set of attributes that uniquely identify each row in that entity. Relational theory also states that no duplicate rows can exist in a table.

Referential Integrity

Tables are related to one another through foreign keys. A foreign key is one table column for which the set of possible values is found in the primary key of a second table. Referential integrity is achieved when the set of values in a foreign key column is restricted to the primary key that it references or to the null value. Once the database designer declares primary and foreign keys, enforcing data and referential integrity is the responsibility of the DBMS.

Five Normal Forms in Relational Database Theory

INTRODUCTION

The normal forms defined in relational database theory represent guidelines for record design. The guidelines corresponding to first through fifth normal forms are presented here, in terms that do not require an understanding of relational theory. The design guidelines are meaningful even if one is not using a relational database system. We present the guidelines without referring to the concepts of the relational model in order to emphasize their generality, and also to make them easier to understand. Our presentation conveys an intuitive sense of the intended constraints on record design, although in its informality it may be imprecise in some technical details. A comprehensive treatment of the subject is provided by Date.

The normalization rules are designed to prevent update anomalies and data inconsistencies. With respect to performance tradeoffs, these guidelines are biased toward the assumption that all non-key fields will be updated frequently. They tend to penalize retrieval, since data which may have been retrievable from one record in an un-normalized design may have to be retrieved from several records in the normalized form. There is no obligation to fully normalize all records when actual performance requirements are taken into account.

FIRST NORMAL FORM

First normal form deals with the "shape" of a record type. Under first normal form, all occurrences of a record type must contain the same number of fields. First normal form excludes variable repeating fields and groups. This is not so much a design guideline as a matter of definition. Relational database theory doesn't deal with records having a variable number of fields.

SECOND AND THIRD NORMAL FORMS

Second and third normal forms deal with the relationship between non-key and key fields. Under second and third normal forms, a non-key field must provide a fact about the key, us the whole key, and nothing but the key. In addition, the record must satisfy first normal form. We deal now only with "single-valued" facts. The fact could be a one-to-many relationship, such as the department of an employee, or a one-to-one relationship, such as the spouse of an employee. Thus the phrase "Y is a fact about X" signifies a one-to-one or one-to-many relationship between Y and X. In the general case, Y might consist of one or more fields, and so might X. In the following example, QUANTITY is a fact about the combination of PART and WAREHOUSE.

Second Normal Form

Second normal form is violated when a non-key field is a fact about a subset of a key. It is only relevant when the key is composite, i.e., consists of several fields. Consider the following inventory record:

---------------------------------------------------

| PART | WAREHOUSE | QUANTITY | WAREHOUSE-ADDRESS |

====================-------------------------------

The key here consists of the PART and WAREHOUSE fields together, but WAREHOUSE-ADDRESS is a fact about the WAREHOUSE alone. The basic problems with this design are:

  • The warehouse address is repeated in every record that refers to a part stored in that warehouse.
  • If the address of the warehouse changes, every record referring to a part stored in that warehouse must be updated.
  • Because of the redundancy, the data might become inconsistent, with different records showing different addresses for the same warehouse.
  • If at some point in time there are no parts stored in the warehouse, there may be no record in which to keep the warehouse's address.

To satisfy second normal form, the record shown above should be decomposed into (replaced by) the two records:

------------------------------------------------------------

| PART | WAREHOUSE | QUANTITY |WAREHOUSE | WAREHOUSE-ADDRESS |

====================----------- =============-----------------

When a data design is changed in this way, replacing unnormalized records with normalized records, the process is referred to as normalization. The term "normalization" is sometimes used relative to a particular normal form. Thus a set of records may be normalized with respect to second normal form but not with respect to third. The normalized design enhances the integrity of the data, by minimizing redundancy and inconsistency, but at some possible performance cost for certain retrieval applications. Consider an application that wants the addresses of all warehouses stocking a certain part. In the unnormalized form, the application searches one record type. With the normalized design, the application has to search two record types, and connect the appropriate pairs.

Third Normal Form

Third normal form is violated when a non-key field is a fact about another non-key field, as in

------------------------------------

| EMPLOYEE | DEPARTMENT | LOCATION |

============------------------------

The EMPLOYEE field is the key. If each department is located in one place, then the LOCATION field is a fact about the DEPARTMENT -- in addition to being a fact about the EMPLOYEE. The problems with this design are the same as those caused by violations of second normal form:

  • The department's location is repeated in the record of every employee assigned to that department.
  • If the location of the department changes, every such record must be updated.
  • Because of the redundancy, the data might become inconsistent, with different records showing different locations for the same department.
  • If a department has no employees, there may be no record in which to keep the department's location.

To satisfy third normal form, the record shown above should be decomposed into the two records:

------------------------- -------------------------

| EMPLOYEE | DEPARTMENT | | DEPARTMENT | LOCATION |

============------------- ==============-----------

To summarize, a record is in second and third normal forms if every field is either part of the key or provides a (single-valued) fact about exactly the whole key and nothing else.

Functional Dependencies

In relational database theory, second and third normal forms are defined in terms of functional dependencies, which correspond approximately to our single-valued facts. A field Y is "functionally dependent" on a field (or fields) X if it is invalid to have two records with the same X-value but different Y-values. That is, a given X-value must always occur with the same Y-value. When X is a key, then all fields are by definition functionally dependent on X in a trivial way, since there can't be two records having the same X value.

There is a slight technical difference between functional dependencies and single-valued facts as we have presented them. Functional dependencies only exist when the things involved have unique and singular identifiers (representations). For example, suppose a person's address is a single-valued fact, i.e., a person has only one address. If we don't provide unique identifiers for people, then there will not be a functional dependency in the data:

----------------------------------------------

| PERSON | ADDRESS |

-------------+--------------------------------

| John Smith | 123 Main St., New York |

| John Smith | 321 Center St., San Francisco |

----------------------------------------------

Although each person has a unique address, a given name can appear with several different addresses. Hence we do not have a functional dependency corresponding to our single-valued fact.

Similarly, the address has to be spelled identically in each occurrence in order to have a functional dependency. In the following case the same person appears to be living at two different addresses, again precluding a functional dependency.

---------------------------------------

| PERSON | ADDRESS |

-------------+-------------------------

| John Smith | 123 Main St., New York |

| John Smith | 123 Main Street, NYC |

---------------------------------------

We are not defending the use of non-unique or non-singular representations. Such practices often lead to data maintenance problems of their own. We do wish to point out, however, that functional dependencies and the various normal forms are really only defined for situations in which there are unique and singular identifiers. Thus the design guidelines as we present them are a bit stronger than those implied by the formal definitions of the normal forms.

For instance, we as designers know that in the following example there is a single-valued fact about a non-key field, and hence the design is susceptible to all the update anomalies mentioned earlier.

------------------------------------------------------

| EMPLOYEE | FATHER | FATHER'S-ADDRESS |

|============------------+----------------------------

| Art Smith | John Smith | 123 Main St., New York |

| Bob Smith | John Smith | 123 Main Street, NYC |

| Cal Smith | John Smith | 321 Center St., San Francisco |

------------------------------------------------------

However, in formal terms, there is no functional dependency here between FATHER'S-ADDRESS and FATHER, and hence no violation of third normal form.

FOURTH AND FIFTH NORMAL FORMS

Fourth and fifth normal forms deal with multi-valued facts. The multi-valued fact may correspond to a many-to-many relationship, as with employees and skills, or to a many-to-one relationship, as with the children of an employee (assuming only one parent is an employee). By "many-to-many" we mean that an employee may have several skills, and a skill may belong to several employees.

Note that we look at the many-to-one relationship between children and fathers as a single-valued fact about a child but a multi-valued fact about a father.

In a sense, fourth and fifth normal forms are also about composite keys. These normal forms attempt to minimize the number of fields involved in a composite key, as suggested by the examples to follow.

Fourth Normal Form

Under fourth normal form, a record type should not contain two or more independent multi-valued facts about an entity. In addition, the record must satisfy third normal form.

The term "independent" will be discussed after considering an example.

Consider employees, skills, and languages, where an employee may have several skills and several languages. We have here two many-to-many relationships, one between employees and skills, and one between employees and languages. Under fourth normal form, these two relationships should not be represented in a single record such as

-------------------------------

| EMPLOYEE | SKILL | LANGUAGE |

===============================

Instead, they should be represented in the two records

-------------------- -----------------------

| EMPLOYEE | SKILL | | EMPLOYEE | LANGUAGE |

==================== =======================

Note that other fields, not involving multi-valued facts, are permitted to occur in the record, as in the case of the QUANTITY field in the earlier PART/WAREHOUSE example.

The main problem with violating fourth normal form is that it leads to uncertainties in the maintenance policies. Several policies are possible for maintaining two independent multi-valued facts in one record:

(1) A disjoint format, in which a record contains either a skill or a language, but not both:

-------------------------------

| EMPLOYEE | SKILL | LANGUAGE |

|----------+-------+----------|

| Smith | cook | |

| Smith | type | |

| Smith | | French |

| Smith | | German |

| Smith | | Greek |

-------------------------------

This is not much different from maintaining two separate record types. (We note in passing that such a format also leads to ambiguities regarding the meanings of blank fields. A blank SKILL could mean the person has no skill, or the field is not applicable to this employee, or the data is unknown, or, as in this case, the data may be found in another record.)

(2) A random mix, with three variations:

 

 

 

(a) Minimal number of records, with repetitions:

-------------------------------

| EMPLOYEE | SKILL | LANGUAGE |

|----------+-------+----------|

| Smith | cook | French |

| Smith | type | German |

| Smith | type | Greek |

-------------------------------

(b) Minimal number of records, with null values:

-------------------------------

| EMPLOYEE | SKILL | LANGUAGE |

|----------+-------+----------|

| Smith | cook | French |

| Smith | type | German |

| Smith | | Greek |

-------------------------------

(c) Unrestricted:

-------------------------------

| EMPLOYEE | SKILL | LANGUAGE |

|----------+-------+----------|

| Smith | cook | French |

| Smith | type | |

| Smith | | German |

| Smith | type | Greek |

-------------------------------

(3) A "cross-product" form, where for each employee, there must be a record for every possible pairing of one of his skills with one of his languages:

-------------------------------

| EMPLOYEE | SKILL | LANGUAGE |

|----------+-------+----------|

| Smith | cook | French |

| Smith | cook | German |

| Smith | cook | Greek |

| Smith | type | French |

| Smith | type | German |

| Smith | type | Greek |

-------------------------------

Other problems caused by violating fourth normal form are similar in spirit to those mentioned earlier for violations of second or third normal form. They take different variations depending on the chosen maintenance policy:

  • If there are repetitions, then updates have to be done in multiple records, and they could become inconsistent.
  • Insertion of a new skill may involve looking for a record with a blank skill, or inserting a new record with a possibly blank language, or inserting multiple records pairing the new skill with some or all of the languages.
  • Deletion of a skill may involve blanking out the skill field in one or more records (perhaps with a check that this doesn't leave two records with the same language and a blank skill), or deleting one or more records, coupled with a check that the last mention of some language hasn't also been deleted.

Fourth normal form minimizes such update problems.

1 Independence

We mentioned independent multi-valued facts earlier, and we now illustrate what we mean in terms of the example. The two many-to-many relationships, employee: skill and employee: language, are "independent" in that there is no direct connection between skills and languages. There is only an indirect connection because they belong to some common employee. That is, it does not matter which skill is paired with which language in a record; the pairing does not convey any information. That's precisely why all the maintenance policies mentioned earlier can be allowed.

In contrast, suppose that an employee could only exercise certain skills in certain languages. Perhaps Smith can cook French cuisine only, but can type in French, German, and Greek. Then the pairings of skills and languages becomes meaningful, and there is no longer an ambiguity of maintenance policies. In the present case, only the following form is correct:

-------------------------------

| EMPLOYEE | SKILL | LANGUAGE |

|----------+-------+----------|

| Smith | cook | French |

| Smith | type | French |

| Smith | type | German |

| Smith | type | Greek |

-------------------------------

Thus the employee: skill and employee: language relationships are no longer independent. These records do not violate fourth normal form. When there is interdependence among the relationships, then it is acceptable to represent them in a single record.

2 Multivalued Dependencies

For readers interested in pursuing the technical background of fourth normal form a bit further, we mention that fourth normal form is defined in terms of multivalued dependencies, which correspond to our independent multi-valued facts. Multivalued dependencies, in turn, are defined essentially as relationships which accept the "cross-product" maintenance policy mentioned above. That is, for our example, every one of an employee's skills must appear paired with every one of his languages. It may or may not be obvious to the reader that this is equivalent to our notion of independence: since every possible pairing must be present, there is no "information" in the pairings. Such pairings convey information only if some of them can be absent, that is, only if it is possible that some employee cannot perform some skill in some language. If all pairings are always present, then the relationships are really independent.

We should also point out that multivalued dependencies and fourth normal form apply as well to relationships involving more than two fields. For example, suppose we extend the earlier example to include projects, in the following sense:

  • An employee uses certain skills on certain projects.
  • An employee uses certain languages on certain projects.

If there is no direct connection between the skills and languages that an employee uses on a project, then we could treat this as two independent many-to-many relationships of the form EP:S and EP:L, where "EP" represents a combination of an employee with a project. A record including employee, project, skill, and language would violate fourth normal form. Two records, containing fields E,P,S and E,P,L, respectively, would satisfy fourth normal form.

Fifth Normal Form

Fifth normal form deals with cases where information can be reconstructed from smaller pieces of information that can be maintained with less redundancy. Second, third, and fourth normal forms also serve this purpose, but fifth normal form generalizes to cases not covered by the others.

We will not attempt a comprehensive exposition of fifth normal form, but illustrate the central concept with a commonly used example, namely one involving agents, companies, and products. If agents represent companies, companies make products, and agents sell products, then we might want to keep a record of which agent sells which product for which company. This information could be kept in one record type with three fields:

-----------------------------

| AGENT | COMPANY | PRODUCT |

|-------+---------+---------|

| Smith | Ford | car |

| Smith | GM | truck |

-----------------------------

This form is necessary in the general case. For example, although agent Smith sells cars made by Ford and trucks made by GM, he does not sell Ford trucks or GM cars. Thus we need the combination of three fields to know which combinations are valid and which are not.

But suppose that a certain rule was in effect: if an agent sells a certain product, and he represents a company making that product, then he sells that product for that company.

-----------------------------

| AGENT | COMPANY | PRODUCT |

|-------+---------+---------|

| Smith | Ford | car |

| Smith | Ford | truck |

| Smith | GM | car |

| Smith | GM | truck |

| Jones | Ford | car |

-----------------------------

In this case, it turns out that we can reconstruct all the true facts from a normalized form consisting of three separate record types, each containing two fields:

------------------- --------------------- -------------------

| AGENT | COMPANY | | COMPANY | PRODUCT | | AGENT | PRODUCT |

|-------+---------| |---------+---------| |------+---------|

| Smith | Ford | | Ford | car | | Smith | car |

| Smith | GM | | Ford | truck | | Smith | truck |

| Jones | Ford | | GM | car | | Jones | car |

------------------- | GM | truck | -------------------

---------------------

These three record types are in fifth normal form, whereas the corresponding three-field record shown previously is not.

Roughly speaking, we may say that a record type is in fifth normal form when its information content cannot be reconstructed from several smaller record types, i.e., from record types each having fewer fields than the original record. The case where all the smaller records have the same key is excluded. If a record type can only be decomposed into smaller records which all have the same key, then the record type is considered to be in fifth normal form without decomposition. A record type in fifth normal form is also in fourth, third, second, and first normal forms.

Fifth normal form does not differ from fourth normal form unless there exists a symmetric constraint such as the rule about agents, companies, and products. In the absence of such a constraint, a record type in fourth normal form is always in fifth normal form.

One advantage of fifth normal form is that certain redundancies can be eliminated. In the normalized form, the fact that Smith sells cars is recorded only once; in the un-normalized form it may be repeated many times.

It should be observed that although the normalized form involves more record types, there may be fewer total record occurrences. This is not apparent when there are only a few facts to record, as in the example shown above. The advantage is realized as more facts are recorded, since the size of the normalized files increases in an additive fashion, while the size of the un-normalized file increases in a multiplicative fashion. For example, if we add a new agent who sells x products for y companies, where each of these companies makes each of these products, we have to add x+y new records to the normalized form, but xy new records to the un-normalized form.

It should be noted that all three record types are required in the normalized form in order to reconstruct the same information. From the first two record types shown above we learn that Jones represents Ford and that Ford makes trucks. But we can't determine whether Jones sells Ford trucks until we look at the third record type to determine whether Jones sells trucks at all.

The following example illustrates a case in which the rule about agents, companies, and products is satisfied, and which clearly requires all three record types in the normalized form. Any two of the record types taken alone will imply something untrue.

-----------------------------

| AGENT | COMPANY | PRODUCT |

|-------+---------+---------|

| Smith | Ford | car |

| Smith | Ford | truck |

| Smith | GM | car |

| Smith | GM | truck |

| Jones | Ford | car |

| Jones | Ford | truck |

| Brown | Ford | car |

| Brown | GM | car |

| Brown | Toyota | car |

| Brown | Toyota | bus |

-----------------------------

------------------- --------------------- -------------------

| AGENT | COMPANY | | COMPANY | PRODUCT | | AGENT | PRODUCT |

|-------+---------| |---------+---------| |-------+---------|

| Brown | Ford | | GM | truck | | Jones | truck |

| Brown | GM | | Toyota | car | | Brown | car |

| Brown | Toyota | | Toyota | bus | | Brown | bus |

------------------- --------------------- -------------------

Fifth Normal Form

Observe that:

  • Jones sells cars and GM makes cars, but Jones does not represent GM.
  • Brown represents Ford and Ford makes trucks, but Brown does not sell trucks.
  • Brown represents Ford and Brown sells buses, but Ford does not make buses.

Fourth and fifth normal forms both deal with combinations of multivalued facts. One difference is that the facts dealt with under fifth normal form are not independent, in the sense discussed earlier. Another difference is that, although fourth normal form can deal with more than two multivalued facts, it only recognizes them in pair wise groups. We can best explain this in terms of the normalization process implied by fourth normal form. If a record violates fourth normal form, the associated normalization process decomposes it into two records, each containing fewer fields than the original record. Any of this violating fourth normal form is again decomposed into two records, and so on until the resulting records are all in fourth normal form. At each stage, the set of records after decomposition contains exactly the same information as the set of records before decomposition.

In the present example, no pairwise decomposition is possible. There is no combination of two smaller records which contains the same total information as the original record. All three of the smaller records are needed. Hence an information-preserving pairwise decomposition is not possible, and the original record is not in violation of fourth normal form. Fifth normal form is needed in order to deal with the redundancies in this case.

UNAVOIDABLE REDUNDANCIES

Normalization certainly doesn't remove all redundancies. Certain redundancies seem to be unavoidable, particularly when several multivalued facts are dependent rather than independent. In the example shown, it seems unavoidable that we record the fact that "Smith can type" several times. Also, when the rule about agents, companies, and products is not in effect, it seems unavoidable that we record the fact that "Smith sells cars" several times.

INTER-RECORD REDUNDANCY

The normal forms discussed here deal only with redundancies occurring within a single record type. Fifth normal form is considered to be the "ultimate" normal form with respect to such redundancies.

Other redundancies can occur across multiple record types. For the example concerning employees, departments, and locations, the following records are in third normal form in spite of the obvious redundancy:

In fact, two copies of the same record type would constitute the ultimate in this kind of undetected redundancy.

Inter-record redundancy has been recognized for some time , and has recently been addressed in terms of normal forms and normalization .

CONCLUSION

While we have tried to present the normal forms in a simple and understandable way, we are by no means suggesting that the data design process is correspondingly simple. The design process involves many complexities which are quite beyond the scope of this paper. In the first place, an initial set of data elements and records has to be developed, as candidates for normalization. Then the factors affecting normalization have to be assessed:

  • Single-valued vs. multi-valued facts.
  • Dependency on the entire key.
  • Independent vs. dependent facts.
  • The presence of mutual constraints.
  • The presence of non-unique or non-singular representations.

And, finally, the desirability of normalization has to be assessed, in terms of its performance impact on retrieval applications.

 

 

 

RDBMS & The Oracle Product Line

As the world's leading vendor of relational database software, Oracle Corporation supports its flagship product, the Oracle RDBMS, on more than 90 platforms. The release of Personal Oracle for Windows extends Oracle's reach to the most popular desktop operating system; Microsoft Windows.

The Oracle RDBMS is available in the following three configurations:

Ü  The Oracle Universal Server can support many users on highly scaleable platforms such as Sun, HP, Pyramid, and Sequent. The various options that are available with this configuration include the distributed option, by which several Oracle databases on separate computers can function as a single logical database. The Oracle Enterprise Server is available for a wide variety of operating systems and hardware configurations. Oracle Web Server ;an integrated system for dynamically generating HTML output from the content of an Oracle database; is also included with the Universal Server.

Ü  The Oracle Workgroup Server is designed for workgroups and is available on NetWare, Windows NT, SCO UNIX, and UnixWare. The Oracle Workgroup Server is a cost-effective and low-maintenance solution for supporting small groups of users. Oracle Web Server is also available as an option with the Oracle Workgroup Server.

Ü  Personal Oracle is a Windows-based version of the Oracle database engine that offers the same functionality that exists in the Oracle Universal Server and the Oracle Workgroup Server. Even though Personal Oracle cannot function as a database server by supporting multiple users, it still provides an excellent environment for experimentation and prototyping.

We are discussed here generally about personal oracle As in case of relational database management system

Oracle with Structured Query Language (SQL)

You communicate with Personal Oracle through Oracle's version of the Structured Query Language (SQL, usually pronounced sequel). SQL is a nonprocedural language; unlike C or COBOL in which you must describe exactly how to access and manipulate data, SQL specifies what to do. Internally, Oracle determines how to perform the request. SQL exists as an ANSI standard as well as an industry standard. Oracle's implementation of SQL adheres to Level 2 of the ANSI X3.135-1989/ISO 9075-1989 standard with full implementation of the Integrity Enhancement Feature. Oracle (as well as other database vendors) provides many extensions to ANSI SQL.

In addition, Oracle's implementation of SQL adheres to the U.S. government standard as described in the Federal Information Processing Standard Publication (FIPS PUB) 127, entitled Database Language SQL.

 

Oracle Data Types

 

The Oracle DBMS provides a number of different data types for the storage of the different forms of data in a manner that is most suitable to the manipulations that are likely to be performed. As part of the Data Modeling process it is important that the most appropriate data types are identified. Where there is any doubt (e.g. storing a year as a number or as a date), the data type chosen should reflect the way the data will be used most often. It is possible to inter-convert between data types, but this will reduce the efficiency of queries. The use of these data types in table design is "Manipulating Oracle Tables".

The available Oracle data types the most widely used are these:

Ü  VARCHAR2

Ü  CHAR

Ü  NUMBER

Ü  DATE

Ü  Long

VARCHAR2

This data type should be used for any columns which may contain characters. This includes alphabetic letters, together with _, -, !, ?, + etc. and numbers as a character representation (i.e. they look like numbers when displayed, but they cannot be subjected to arithmetic manipulation. Similarly an attempt to do a greater than (>), or less than (<) operation will not have the correct arithmetic result- VARCHAR2 values are compared character by character up to the first character that differs. Whichever value has the "greater" character in that position is considered "greater". Characters are compared via the ASCII character set with the largest ASCII value (i.e. 255) is considered greatest.) VARCHAR2 columns may be up to 2000 characters wide.

CHAR

This data type is now effectively replaced by VARCHAR2 - CHAR should no longer be used, although is still recognized and valid. The CHAR data type is fixed length up to a maximum of 255 characters, whereas VARCHAR2 is of variable length up to 2000 characters. The variable length of VARCHAR2 gives it significant storage and performance advantages over CHAR.

NUMBER

Any kind of number; positive, negative, integer or real. Up to 22 digits may be entered. For comparison, a larger value is considered greater than a smaller value, with all negative values smaller than all positive values.

DATE

 

A special data type with some of the characteristics of both Character and Number Data-types. It is used for the storage of date and time information. The operators =, > or < etc. can be used for dates but in where clauses such as

WHERE HIREDATE = '01-DEC-95'

 

the date value, on the right, must appear in quotes. For comparison, a later date is considered greater than an earlier date. Oracle dates must lie in the range 1st January, 4712 BC to 31st December, 4712 AD.

The standard Oracle date format is:

Ü  Digit Day followed by

Ü  Letter Month and

Ü  Digit Year

All should be separated by hyphens e.g. '01-JAN-95'

LONG

 

Can hold strings of characters up to 2 gigabytes long. Only one column of LONG data type can be defined per table. Binary data can only be inserted into these using the programming language interface (an example of binary data storage in a LONG data type would be storing a picture in an oracle table).

Importantly, there are significant restrictions on the use of character information stored in LONG data types. While the INSERT and UPDATE statements can be used to insert and modify LONG data types, and the SELECT statement to retrieve this data, no functions can be used to manipulate a retrieved a LONG data type and a LONG column can never be used in a WHERE clause.

When retrieving LONG data types the display length of the column is set to 80 characters, and anything stored in the column beyond this will be truncated. The following command is required to extend the display length of a LONG field, within SQL*Plus:

SQL> SET LONG xxxx

 

The value xxxx should be replaced with a number representing thelongest value likely to be stored in the LONG column.

Above are the general use data types in oracle, after similar with the data Types of oracle the next step in sense of RDBMS is that to clear knowledge of Creating the tables In Oracle

DBMS Vs RDBMS

The characteristic that differentiates a DBMS from an RDBMS is that the RDBMS provides a set-oriented database language. For most RDBMSs, this set-oriented database language is SQL. Set oriented means that SQL processes sets of data in groups. Two standards organizations, the American National Standards Institute (ANSI) and the International Standards Organization (ISO), currently promote SQL standards to industry. The ANSI-92 standard is the standard for the SQL used throughout this book. Although these standard-making bodies prepare standards for database system designers to follow, all database products differ from the ANSI standard to some degree. In addition, most systems provide some proprietary extensions to SQL that extend the language into a true procedural language.

Problems with RDBMS

Starting in the late 1980s, several deficiencies in relational DBMS products began receiving a lot of attention. The first deficiency is that the dominant relational language, SQL-92, is limiting in several important respects. For instance, SQL-92 supports a restricted set of built-in types that accommodate only numbers and strings, but many database applications began to include deal with complex objects such as geographic points, text, and digital signal data. A related problem concerns how this data is used. Conceptually simple questions involving complex data structures turn into lengthy SQL-92 queries.

The second deficiency is that the relational model suffers from certain structural shortcomings. Relational tables are flat and do not provide good support for nested structures, such as sets and arrays. Also, certain kinds of relationships, such as sub typing, between database objects are hard to represent in the model. (Subtyping occurs when we say that one kind of thing, such as a SalesPerson, is a subtype of another kid of thing, such as an Employee.) SQL-92 supports only independent tables of rows and columns. The third deficiency is that RDBMS technology did not take advantage of object-oriented (OO) approaches to software engineering which have gained widespread acceptance in industry. OO techniques reduce development costs and improve information system quality by adopting an object-centric view of software development. This involves integrating the data and behavior of a real-world entity into a single software module or component. A complex data structure or algorithmically sophisticated operation can be hidden behind a set of interfaces. This allows another programmer to make use of the complex functionality without having to understand how it is implemented. The relational model did a pretty good job handling most information management problems. But for an emerging class of problems RDBMS technology could be improved upon.

 

 

Object-Oriented DBMS

 

Object-Oriented Database Management Systems (OODBMS) are an extension of OO programming language techniques into the field of persistent data management. For many applications, the performance, flexibility, and development cost of OODBMS are significantly better than RDBMS or ORDBMS. The chief advantage of OODBMS lies in the way they can achieve a tighter integration between OO languages and the DBMS. Indeed, the main standards body for OODBMS, the Object Database Management Group (ODMG) defines an OODBMS as a system that integrates database capabilities with object-oriented programming language capabilities. The idea behind this is that so far as an application developer is concerned, it would be useful to ignore not only questions of how an object is implemented behind its interface, but also how it is stored and retrieved.

 

All developers have to do is implement their application using their favorite OO programming language, such as C++, Smalltalk, or Java, and the OODBMS takes care of data caching, concurrency control, and disk storage. In addition to this seamless integration, OODBMS possess a number of interesting and useful features derived mainly from the object model. In order to solve the finite type system problem that constrains SQL-92, most OODBMS feature an extensible type system. Using this technique, an OODBMS can take the complex objects that are part of the application and store them directly. An OODBMS can be used to invoke methods on these objects, either through a direct call to the object or through a query interface. And finally, many of the structural deficiencies in SQL-92 are overcome by the use of OO ideas such as inheritance and allowing sets and arrays. OODBMS products saw a surge of academic and commercial interest in the early 1990s, and today the annual global market for OODBMS products runs at about $50 million per year. In many application domains, most notably computer-aided design or manufacturing (CAD/CAM), the expense of building a monolithic system to manage both database and application is balanced by the kinds of performance such systems deliver.

 

Problems with OODBMS

 

Regrettably, much of the considerable energy of the OODBMS community has been expended relearning the lessons of twenty years ago. First, OODBMS vendors have rediscovered the difficulties of tying database design too closely to application design. Maintaining and evolving an OODBMS-based information system is an arduous undertaking. Second, they relearned that declarative languages such as SQL-92 bring such tremendous productivity gains that organizations will pay for the additional computational resources they require. You can always buy hardware, but not time. Third, they re-discovered the fact that a lack of a standard data model leads to design errors and inconsistencies. In spite of these shortcomings OODBMS technology provides effective solutions to a range of data management problems. Many ideas pioneered by OODBMS have proven themselves to be very useful and are also found in ORDBMS. Object-relational systems include features such as complex object extensibility, encapsulation, inheritance, and better interfaces to OO languages.

 

  

ORDBMS

(Object Relational Database Management System)

 

ORDBMS Evolution from RDBMS

One of the most popular sort algorithms is a called insertion sort. Insertion sort is an O (N2) algorithm. Roughly speaking, the time it takes to sort an array of records increases with the square of the number of records involved, so it should only be used for record sets containing less than about 25 rows. But insertion sort is extremely efficient when the input is almost in sorted order, such as when you are sorting the result of an index scan. List presents an implementation of the basic insertion sort algorithm for an array of integers.

 

InsertSort( integer arTypeInput[] )

{ integer nTypeTemp;

integer InSizeArray, nOuter, nInner;

nSizeArray = arTypeInput[].getSize();

for ( nOuter = 2; nOuter <= nSizeArray; nOuter ++)

{

vTypeTemp = arTypeInput[nOuter];

nInner = nOuter - 1;

while (( nInner > 0 ) &&

(arTypeInput[nInner] > vTypeTemp)) {

arTypeInput[nInner+1] =

arTypeInput[nInner];

nInner--;

}

arTypeInput[nInner+1] = vTypeTemp;

}}

 

Sorting algorithms such as this can be generalized to make them work with any data type. A generalized version of this insert sort algorithm appears in list, as shown above. All this algorithm requires is logic to compare two type instances. If one value is greater, the algorithm swaps the two values. In the generalized version of the algorithm, a function pointer is passed into the sort as an argument. A function pointer is simply a reference to a memory address where the Compare() function’s actual implementation can be found. At the critical point, when the algorithm decides whether to swap two values, it passes the data to this function and makes its decision based on the function’s return result.

 

InsertSort( Type arTypeInput[],

(int) Compare( Type, Type) )

{

Type vTypeTemp;

integer InSizeArray, nOuter, nInner;

nSizeArray = arTypeInput[].getSize();

for ( nOuter = 2; nOuter <= nSizeArray; nOuter ++)

{

vTypeTemp = arTypeInput[nOuter];

nInner = nOuter - 1;

while (( nInner > 0 ) &&

Compare(arTypeInput[nInner],vTypeTemp) > 0 )

{

Swap(arTypeInput[nInner+1],arTypeInput[nInner]);

nInner--;

}

Swap(arTypeInput[nInner+1], vTypeTemp);

}}

 

Note how the functionality of the Swap() operation is something that can be handled by IDS without requiring a type specific routine. The ORDBMS knows how big the object being swapped is, and whether the object is in memory or is written out to disk. To use the generalized algorithm in the List, all that IDS needs is the Compare() logic. The ORDBMS handles the looping, branching, and exception handling. Almost all sorting algorithms involve looping and branching around a Compare(), as does B-Tree indexing and aggregate algorithms such as MIN() and MAX(). Part of the process of extending the ORDBMS framework with new data types involves creating functions such as Compare() that IDS can use to manage instances of the type. All data management operations implemented in the ORDBMS is generalized in this way. In an RDBMS, the number of data types was small so that the Compare() routines for each could be hard-coded within the engine.

 

Sort( void ** parRecords, IFX_TYPE * pRecKey ... )

{

switch(pRecKey->typenum)

{

case IFX_INT_TYPE:

case IFX_CHAR_TYPE:

InsertSort(parRecords, IntCompare);

break;

case IFX_FLOAT_TYPE:

InsertSort(parRecords, DoubleCompare);

break;

// etc for each SQL-92 type

default:

ifx_internal_raiseerror('No Such Type');

break;

}}

 

To turn an RDBMS into an ORDBMS, you need to modify the code shown in List to allow the engine to access Compare() routines other than the ones it has built-in. If the data type passed as the second argument is one of the SQL-92 types, the sorting function proceeds as it did before. But if the data type is not one of the SQL-92 types, the ORDBMS assumes it is being asked to sort an extended type Every user-defined function embedded within the ORDBMS is recorded in its system catalogs, which are tables that the DBMS uses to store information about databases. When asked to sort a pile of records using a user-defined type, the ORDBMS looks in these system catalogs to find a user-defined function called compare that takes two instances of the type and returns an INTEGER. If such a function exists, the ORDBMS uses it in place of a built-in logic. If the function is not found, IDS generates an exception. The listing below shows the modified sorting facility.

 

Sort( void ** parRecords, IFX_TYPE * pRecKey . . . )

{

switch(pRecKey->typenum)

{

case IFX_INT_TYPE:

case IFX_CHAR_TYPE:

InsertSort(parRecords, IntCompare);

break;

case IFX_FLOAT_TYPE:

InsertSort (parRecords, DoubleCompare);

break;

// etc for each SQL-92 type

. .

default: // ah! Not SQL-92. Must be user-defined.

if( (pUDRCompare=udr_lookup(pRecKey->typenum,

"Compare")) ==NULL)

ifx_internal_error("No Such Function.');

else

InsertSort (parRecords, pUDRCompare );

break;

}}

 

Using the ORDBMS generalization of the sort utility has several implications:

Ü  When you take a database application developed to use the previous generation of RDBMS technology and upgrade to an ORDBMS, you should see no changes at all. The size of the ORDBMS executable is slightly increased, and there are some new directories and install options, but if all you do is to run the RDBMS application on the ORDBMS, the new code and facilities are never invoked.

Ü  This scheme implies extensive modifications to the RDBMS code. You not only need to find every place in the engine that such modifications are necessary, but also need to provide the infrastructure in the engine to allow the extensions to execute within it.

Ü  Finally, you should know how general such an extensibility mechanism is. As long as you can specify the structure of your data type, and define an algorithm to compare one instance of the type with another, you can use the engine’s facilities to sort or index instances of the type and you can use OR-SQL to specify how you want this done. In practice, renovating an RDBMS in this manner is an incredibly complex and demanding undertaking.

 

ORDBS or Object Relational Database Management System is already defined. Now we are looking forward to its features which are giving different facilities to their users.

 

Features of ORDBMS

 

ORDBMS synthesize the features of RDBMS with the best ideas of OODBMS.

Ü  Although ORDBMS reuse the relational model as SQL interprets it, the OR data model is opened up in novel ways. New data types and functions can be implemented using general-purpose languages such as C and Java. In other words, ORDBMS allow developers to embed new classes of data objects into the relational data model abstraction.

Ü  ORDBMS schema has additional features not present in RDBMS schema. Several OO structural features such as inheritance and polymorphism are part of the OR data model.

Ü  ORDBMS adopt the RDBMS query-centric approach to data management. All data access in an ORDBMS is handled with declarative SQL statements. There is no procedural, or object-at-a-time, navigational interface. ORDBMS persist with the idea of a data language that is fundamentally declarative, and therefore mismatched with procedural OO host languages. This significantly affects the internal design of the ORDBMS engine, and it has profound implications for the interfaces developers use to access the database.

Ü  From a systems architecture point of view, ORDBMS are implemented as a central server program rather than as distributed data architectures typical in OODBMS products. However, ORDBMS extend the functionality of DBMS products significantly, and an information system using an ORDBMS can be deployed over multiple machines. To see what this looks such as, consider again the Employees example. Using an ORDBMS, you would represent this data as shown below.

 

CREATE TABLE Employees (

Name PersonName NOT NULL,

DOB DATE NOT NULL,

Salary Currency NOT NULL,

Address StreetAddress NOT NULL

PRIMARY KEY ( Name, DOB )

);

 

An object-relational table is structurally very similar to its relational counterpart and the same data integrity and physical organization rules can be enforced over it. The difference between object-relational and relational tables can be seen in the section stipulating column types. In the object-relational table, readers familiar with RDBMS should recognize the DATE type, but the other column types are completely new. From an object-oriented point of view, these types correspond to class names, which are software modules that encapsulate state and (as we shall see) behavior.

 

INSERT INTO SendPage

( Pager_Number, Pass_Code, Message )

SELECT E.Pager_Number,

E.Pass_Code,

Print(E.Name) ||

': Call 1-800-TEMPS-R-US for immediate INFORMIX DBA

job'

FROM Temporary_Employees E

WHERE Contains (GeoCircle('(-122.514, 37.221)',

'60 miles')),

E.LivesAt )

AND DocContains ( E.Resume,

'INFORMIX and Database

Administrator')

AND NOT IsBooked ( Period(TODAY, TODAY + 7),

E.Booked );

 

In a SQL-92 DBMS, SendPage could be only a table. The effect of this query would then be to insert some rows into the SendPage table. However, in an ORDBMS, SendPage might actually be an active table, which is an interface to the communications system used to send electronic messages. The effect of this query would then be to communicate with the matching temporary workers!

 

Motivation

Relational model (70’s): Clean and simple.

Ü  Great for administrative data.

Ü  Not as good for other kinds of data (e. g., multimedia, networks, CAD).

Object- Oriented models (80’s): Complicated, but some influential ideas.

Ü  Complex data types.

Ü  Object identity/ references.

Ü  ADTs (encapsulation, behavior goes with data).

Ü  Inheritance.

Idea: Build DBMS based on OO model.

 

 

 

Object- Relational

Idea: Add OO features to the type system of

SQL. I. e. “plain old SQL”, but...

Ü  Columns can be of new types (ADTs)

Ü  User- defined methods on ADTs

Ü  Columns can be of complex types

Ü  Reference types and “deref”

Ü  Inheritance and collection inheritance

Ü  Old SQL schemas still work! (Backwards compatibility)

 

 

Complex Types

User can use type constructors to generate new types:

Ü  Setof ( foo)

Ü  arrayof( foo)

Ü  listof( foo)

Ü  Row (n1 t1... nk tk)

Can be nested:

Ü  Setof (arrayof (int))

 

 

ADTs: User- Defined Atomic Types

Built- in SQL types (int, float, text, etc.) are limited.

Ü  Even these types have simple methods associated with them (math, LIKE, etc.)

ORDBMS: User can define new atomic types (& methods) if a type cannot be naturally defined in terms of the built- in types:

create type jpeg (internallength = variable,

input = jpeg_ in, output = jpeg_ out);

Need input & output methods for types. e. g., Convert from text to internal type and back.

 

Reference Types & Deref.

In most ORDBMS, every object has an OID. So, can “point” to objects --reference types!

Ü  ref ( theater_ t)

Don’t confuse reference and complex types!

Ü  mytheater row( tno integer, name text, address text, phone integer)

Ü  Theaterref( theater_ t)

Both look same at output, but are different!!

Ü  Deletion, update, “sharing”

Ü  Similar to “by value” vs. “by reference” in PL

 

 

Modifications to support ORDBMS

Parsing

Ü  Type- checking for methods pretty complex.

Query Rewriting

Ü  Often useful to turn path expression into joins!

Ü  Collection hierarchy’ Unions

Optimization

Ü  New algebra operators needed for complex types.

o   Must know how to integrate them into optimization.

Ü  WHERE clause expression can be expensive!

 

ORDBMS Advantages

 

ORDBMS technology improves upon what came before it in three ways. The first improvement is that can enhance a system’s overall performance. The IDS product can achieve higher levels of data throughput or better response times than is possible using RDBMS technology and external logic because ORDBMS extensibility makes it possible to move logic to where the data resides, instead of always moving data to where the logic is running. This effect is particularly pronounced for data intensive applications such as decision support systems and in situations in which the objects in your application are large, such as digital signal or time series data. But in general, any application can benefit. The second improvement relates to the way that integrating functional aspects of the information system within the framework provided by the ORDBMS improves the flexibility of the overall system. Multiple unrelated object definitions can be combined within a single ORDBMS database. At runtime, they can be mingled within a query expression created to answer some high-level question. Such flexibility is very important because it reduces the costs associated with information system development and ongoing maintenance. The third benefit of an ORDBMS relates to the way information systems are built and managed. An ORDBMS system catalogs become a metadata repository that records information about the modules of programming logic integrated into the ORDBMS. Over time, as new functionality is added to the application and as the programming staff changes, the system’s catalogs can be used to determine the extent of the current system’s functionality and how it all fits together. The fourth benefit is that the IDS product’s features make it possible to incorporate into the database data sets that are stored outside it. This allows you to build federated databases. From within single servers, you can access data distributed over multiple places.

 

Flexibility and Functionality

 

Explaining flexibility is more difficult because the advantages of flexibility are harder to quantify. But it is probably more valuable than performance over the long run. To understand why, let’s continue with the financial company example.

As it turned out, the more profound result of the integration effort undertaken by our financial firm was that the CalcValue() operation was liberated from its procedural setting. Before, developers had to write and compile (and debug) a procedural C program every time they wanted to use CalcValue() to answer a business question. With the ORDBMS, they found that they could simply write a new OR-SQL query instead. For example, the query in which is Listed below is showing a join involving a table that, while in the database, was beyond the scope of the original (portfolio) problem. “What is the SUM() of the values of instruments in our portfolio grouped by region and sector?”

 

 

 

SELECT IS.Country,

IS.Region,

SUM (CalcValue ( I.Issuer_Class, I.Principal, I.Issue_Date, I.Rate ))

FROM Instruments I, Issuers IS

WHERE I.Portfolio = :Portfolio_Argument

AND I.Issuer = IS.Name

GROUP BY IS.Country, IS.Region;

 

With the addition of a small Web front end, end users could use the database to find out which the most valuable instrument in their portfolio was or which issuer’s instruments performed most poorly, and so on. It was known that the system had the data necessary to answer all these questions before. The problem was that the cost of answering them using C and SQL-92 was simply too high.

 

Maintainability

 

After some investigation, the developers discovered that, over time, there had been several versions of the CalcValue() function. Also, once the CalcValue() algorithm was explained to end users, they had suggestions that might improve it. With the previous system, such speculative changes were extremely difficult to implement. They required a recode-recompile relink redistribute cycle. But with the ORDBMS, the alternative algorithms could be integrated with ease. In fact, none of the components of the overall information system had to be brought down. The developers simply wrote the alternative function in C, compiled it into a shared library, and dynamically linked it into the engine with another name. What all of this demonstrates is that the ORDBMS permitted the financial services company to react to changing business requirements far more rapidly than it could before. By embedding the most valuable modules of logic into a declarative language, they reduced the amount of code they had to write and the amount of time required to implement new system functionality. None of this was interesting when viewed through the narrow perspective of system performance. But is made a tremendous difference to the business’s operational efficiency.

 

 

Object-Relational DBMS Applications

 

Extensible databases provide a significant boost for developers building traditional business data processing applications. By implementing a database that constitutes a better model of the application’s problem domain the information system can be made to provide more flexibility and functionality at lower development cost. Business questions such as the one in the List might be answered in systems using an SQL-92 and C. Doing so, however, involves a more complex implementation and requires considerably more effort. The more important effect of the technology is that it makes it possible to build information systems to address data management problems usually considered to be too difficult.  One way to characterize applications in which an object-relational DBMS is the best platform is to focus on the kind of data involved. For thirty years, software engineers have used the term “data entry” to describe the process by which information enters the system. Human users enter data using a keyboard. Today many information systems employ electronics to capture information. Video cameras, environmental sensors, and specialized monitoring equipment record data in rich media systems, industrial routing applications, and medical imaging systems. Object-relational DBMS technology excels at this kind of application. It would be a mistake to say that ORDBMS are only good for digital content applications. As we shall see in this book OR techniques provide considerable advantages over more low-level RDBMS approaches even in traditional business data processing applications. But as other technology changes move us towards applications in which data is recorded rather than entered, ORDBMS will become increasingly necessary.

 

ORDBMS Concepts and Terminology

 

Application Domain & Description

Complex data analysis: - You can integrate sophisticated statistical and special purpose analytic algorithms into the ORDBMS and use them in knowledge discovery or data mining applications. For example, it is possible to answer questions such as “Which attribute of my potential customers indicates most strongly that they will spend money with me?”

Text and documents: - Simple cases permit you to find all documents that include some word or phrase. More complex uses would include creating a network that reflected similarity between documents.

Digital asset management: - The ORDBMS can manage digital media such as video, audio, and still images. In this context, manage means more than store and retrieve. It also means “convert format,” “detect scene changes in video and extract first frame from new scene,” and even “What MP3 audio tracks do I have that include this sound?”

Geographic data: - For traditional applications, this might involve “Show me the lat/long coordinates corresponding to this street address.” This might be extended to answer requests such as “Show me all house and contents policy holders within a quarter mile of a tidal water body.” For next-generation applications, with a GPS device integrated with a cellular phone, it might even be able to answer the perpetual puzzler “Car 54, where are you?”

Bio-medical: - Modern medicine gathers lots of digital signal data such as CAT scans and ultrasound imagery. In the simplest case, you can use these images to filter out “all cell cultures with probable abnormality.” In the more advanced uses, you can also answer questions such as “show me all the cardiograms which are ‘like’ this cardiogram.”

 

 

 

Data Model

 

A data model is a way of thinking about data, and the object-relational data model amounts to objects in a relational framework. An ORDBMS chief task is to provide a flexible framework for organizing and manipulating software objects corresponding to real-world phenomenon.

The object-relational data model can be broken into three areas:

Ü  Structural Features. This aspect of the data model deals with how a database’s data can be structured or organized.

Ü  Manipulation. Because a single data set often needs to support multiple user views, and because data values need to be continually updated to reflect the state of the world, the data model provides a means to manipulate data.

Ü  Integrity and Security. A DBMS data model allows the developers to declare rules that ensure the correctness of the data values in the database. In the first two chapters of this book, we introduce and describe the features of an ORDBMS that developers use to build information systems.

 

Enhanced Table Structures

 

An OR database consists of a group of tables made up of rows. All rows in a table are structurally identical in that they all consist of a fixed number of values of specific data types stored in columns that are named as part of the table’s definition. The most important distinction between SQL-92 tables and object relational database tables is the way that ORDBMS columns are not limited to a

  

 


standardized set of data types. Figure illustrates what an object-relational table looks like. The first thing to note about this table is the way that its column headings consist of both a name and a data type. Second, note how several columns have internal structure. In a SQL-92 DBMS, such structure would be broken up into several separate columns, and operations over a data value such as Employee’s Name would need to list every component column. Third, this table contains several instances of unconventional data types. Lives At is a geographic point, which is a latitude/longitude pair that describes a position on the globe. Resume contains documents, which is a kind of Binary Large Object (BLOB). In addition to defining the structure of a table, you can include integrity constraints in its definition. Tables should all have a key, which is a subset of attributes whose data values can never be repeated in the table. Keys are not absolutely required as part of the table’s definition, but they are a very good idea. A table can have several keys, but only one of these is granted the title of primary key. In our example table, the combination of the Name and DOB columns contains data values that are unique within the table. On balance, it is far more likely that an end user made a data entry mistake than two employees share names and dates of birth.

 

 

 

Object-Oriented Schema Features

 

Another difference between relational DBMS and ORDBMS is the way in which object-relational database schema supports features co-opted from object-oriented approaches to software engineering. We have already seen that an object-relational table can contain exotic data types. In addition, object-relational tables can be organized into new kinds of relationships, and a table’s columns can contain sets of data objects. In an ORDBMS, tables can be typed; that is, developers can create a table with a record structure that corresponds to the definition of a data type. The type system includes a notion of inheritance in which data types can be organized into hierarchies. This naturally supplies a mechanism whereby tables can be arranged into hierarchies too. In the figure illustrates how the Employees table might look as part of such a

 

 


hierarchy. In most object-oriented development environments, the concept of inheritance is limited to the structure and behavior of object classes. However, in an object-relational database, queries can address data values through the hierarchy. When you write an OR-SQL statement that addresses a table, all the records in its subtables become involved in the query too.

Extensibility: - User-Defined Types and User-Defined Functions

 

The concept of extensibility is a principal innovation of ORDBMS technology One of the problems you encounter developing information systems using SQL-92 is that modeling complex data structures and implementing complex functions can be difficult. One way to deal with this problem is for the DBMS vendor to build more data types and functions into their products. Because the number of interesting new data types is very large, however, this is not a reasonable solution. A better approach is to build the DBMS engine so that it can accept the addition of new, application specific functionality. Developers can specialize or extend many of the features of an ORDBMS: the data types, OR-SQL expressions, the aggregates, and so on. In fact, it is useful to think of the core ORDBMS as being a kind of software backplane, which is a framework into which specialized software modules are embedded.

 

Database Stored Procedures

 

Almost all RDBMS allow you to create database procedures that implement business processes. This allows developers to move considerable portions of an information system’s total functionality into the DBMS. Although centralizing CPU and memory requirements on a single machine can limit scalability, in many situations it can improve the system’s overall throughput and simplify its management. By implementing application objects within the server, using Java, for examples, it becomes possible, though not always desirable, to push code implementing one of an application-level object’s behaviors into the ORDBMS. The interface in the external program simply passes the work back into the IDS engine. Figure represents the contrasting approaches. An important point to remember is that with Java, the same logic can be deployed either within the ORDBMS or within

 


an external program without changing the code in any way, or even recompiling it. The novel idea that the ORDBMS can be used to implement many of the operational features of certain kinds of middleware. Routine extensibility, and particularly the way it can provide the kind of functionality illustrated in Figure, is a practical application of these ideas. But making such system scalable requires using other features of the ORDBMS: the distributed database functionality, commercially available gateways, and the open storage manager (introduced below). Combining these facilities provides the kind of location transparency necessary for the development of distributed information systems.

 

Storage Management

 

Traditionally the main purpose of a DBMS was to centralize and organize data storage. A DBMS program ran on a single, large machine. It would take blocks of that machine’s disk space under its control and store data in them. Over time, RDBMS products came to include ever more sophisticated data structures and ever more efficient techniques for memory caching, data scanning, and storage of large data objects. In spite of these improvements, only a fraction of an organization’s data can ever be brought together into one physical location. Data is often distributed among many systems, which is the consequence of figure. Routine Extensibility and the ORDBMS as the Object-Server Architecture autonomous information systems development using a variety of technologies, or through organizational mergers, or because the data is simply not suitable for storage in any DBMS. To address this, the IDS product adds a new extensible storage management facility.


In the figure, we illustrate this Virtual Table Interface concept. ORDBMS possess storage manager facilities similar to RDBMS. Disk space is taken under the control of the ORDBMS, and data is written into it according to whatever administrative rules are specified. All the indexing, query processing, and cache management techniques that are part of an RDBMS are also used in an ORDBMS. Further, distributed database techniques can be adapted to incorporate user-defined types and functions. However, all of these mechanisms must be re-implemented to generalize them so that they can work for user-defined types. For example, page management is generalized to cope with variable length OPAQUE type objects. You can also integrate code into the engine to implement an entirely new storage manager. Developers still use OR-SQL as the primary interface to this data, but instead of relying on internal storage, the ORDBMS can use the external file system to store data. Any data set that can be represented as a table can be accessed using this technique. Developers can also use this technique to get a snapshot of the current state of live information. It is possible to represent the current state of the operating system’s processes or the current state of the file system as a table. You can imagine, for example, an application intended to help manage a fleet of trucks and employees servicing air conditioners or elevators. Each truck has a device combining a Global Positioning System (GPS) with a cellular phone

that allows a central service to poll all trucks and to have them “phone in” their current location. With the ORDBMS, you can embed Java code that activates the paging and response service to implement a virtual table, and then write queries over this new table, as shown in Listing below. “Find repair trucks and drivers within ’50 miles’ of ‘Alan Turing’ who are qualified to repair the ‘T-20’ air conditioning unit.”

 

SELECT T.Location, T.DriversName

FROM Trucks T, Customers C

WHERE Distance (T.Location, C.Location) < '50 miles'

AND C.Name = 'Alan Turing'

AND DocContains ( T.DriverQualifications,

'Repair for T-20');

 

This functionality should fundamentally change the way you think about a DBMS. In an object-relational DBMS, SQL becomes the medium used to combine components to perform operations on data. Most data will be stored on a local disk under the control of the DBMS, but that is not necessarily the case. The Virtual Table Interface tutorial describes these features in more detail.

 

The Real Benefits of Object-Relational DB-Technology for Object-Oriented Software Development

 

 

Abstract: - Object-oriented programming languages (OOPLs like C++, Java, etc.) have established themselves in the development of complex software systems for more than a decade. With the integration of object-oriented concepts, object-relational database management systems (ORDBMS) aim at supporting new generation software systems better and more efficiently. Facing the situation that nowadays more and more software development teams use OOPLs ‘on top of’ (O) RDBMS, i.e. access (object-) relational databases from applications developed in OOPL, this benefit is on our investigations on assessing the contribution of object-relational database technology to object-oriented software development. First, a conceptual examination shows that there is still a considerable gap between the object-relational paradigm (as represented by the SQL: 1999 standard) and the object-oriented paradigm. Second, empirical studies (performed by using our new benchmark approach) point at mechanisms, which are not part of SQL: 1999 but would allow reducing the mentioned gap. Thus, we encourage the integration of such mechanisms, e. g., support for navigation and complex objects (structured query results), into ORDBMS in order to be really beneficial for new generation software systems.

 

Motivation

 

Object-oriented programming languages (OOPL), such as C++, Java, SmallTalk, etc., have established themselves in the development of complex software system for more than a decade. Both, the structure of these systems as well as the structure of objects managed by these systems have become very complex. Object-oriented concepts offered by OOPL are well suited for managing complex structured objects. However, there are additional requirements, such as persistence and transaction-protected manipulation, which can only be fulfilled efficiently by integrating a database management system (DBMS). Consequently, database technology becomes one of the core technologies of modern software systems. In times of the ‘breakthrough’ of object-oriented system development, two kinds of DBMS were of practical relevance: relational DBMS (RDBMS) and object-oriented DBMS (OODBMS). Using OODBMS has proved inefficient/inflexible for reasons.

 

Using an RDBMS, on one hand, requires to overcome the well known impedance mismatch , i.e., performing the non-trivial task of mapping complex object structures and navigational data processing (at the OOPL layer) to the set-oriented, descriptive query language (SQL92), which supports just a simple, flat data model. Despite this considerable mapping overhead, mature RDBMS technology (index structures, optimization, integrity control, etc.), on the other hand, contributes to keep the overall system performance acceptable. Several commercial systems mapping object-oriented structures onto the relational data model are currently available. Such systems are often referred to as Persistent Object System built on Relation (shortly: POS).

The object-relational wave in database technology has decisively reduced the gap between RDBMS and OOPL. Although object-relational DBMS (ORDBMS) are able to (internally) manage object-oriented structures the required seamless coupling of OOPL and ORDBMS is not yet possible, be-cause (as in SQL92) results of SQL:1999 queries are rather (sets of data) topples than (desired sets of) objects. In summary, the gap between OOPL and ORDBMS can be traced back to a whole bunch of modeling and operational aspects, as we will detail in the following sections. Furthermore, the SQL: 1999 standard and the commercially available ORDBMS differ very much in their object-oriented features. Thus, it is by no means clear, how a given object-oriented design can be mapped to a given ORDBMS (most) efficiently, or which features should be offered by ORDBMS in general in order to enable an efficient mapping of object-oriented structures, respectively. Our long-term objective is to influence the further development of ORDBMS to-wards a better support of object-oriented software development (minimal mapping overhead). Thus, we have proposed a new benchmark approach in allowing to assess a given ORDBMS by taking into account both, its own performance as well as the required mapping overhead.

 

Conceptual Considerations

 

There is a multiplicity of object data models, for example ODMG, UML, COM, C++ and Java. All these models support the basic concepts of the OO paradigm; however, there are certain differences. Independently from the modeling language used in the OO software development (e. g., UML), SQL: 1999 must be coupled with a concrete OOPL. In accordance to their overall relevance and conceptual vicinity, we concentrate on the object model of C++ and ODMG and compare it with the SQL: 1999 standard.

 

Modeling Aspects

Object Orientation in OOPL. The concept object represents the foundation of the OO paradigm. An object is the encapsulation of data representing a semantic unit w. r. t. its structures/values and its behavior. It conforms to a particular class. In fact, a class implements an object type (classification) which is characterized by a name as well as a set of attributes and methods. Each attribute conforms to a certain data type and is either single-valued or set-valued (collection types). Furthermore, a data type can be scalar (e. g., integer, Boolean, string, etc.) or complex. In the latter case corresponding values can be references (association) or objects of other classes (aggregation) so that complex structures can be modeled. A class may implement methods (behavior) which can be invoked in order to change the object’s state. Classes may be arranged within class hierarchies. A class inherits structures and behavior from its super classes (inheritance), but may refine these definitions (specialization). Due to space restrictions we do not give a more detailed description of the OO paradigm, but discuss how the OR data model conforms to OO concepts.

 

Object Orientation in SQL: 1999. While the relational data model (SQL2) did not support semantic modeling concepts sufficiently, in SQL: 1999 the fundamental extension supporting object-orientation is the structured user-defined data type (UDT). UDT, which can be considered as object types, can be treated in the same way as pre-defined data types (built-in data types). Consequently, similarly to the type system of OOPL the type system of SQL: 1999 is extensible. UDT may be complex structured and, therefore, may not only contain predefined data types but also set-valued attributes (collection types) and even other UDT (aggregation) or references (associations). Obviously, UDT are comparable to the classes of the OO paradigm. However, according to the SQL: 1999 standard a UDT must be associated with a table. The notion of typed table, also referred to as object table, allows to persistently managing instances of a certain UDT within a table. Each topple of such a table represents an instance (object) of a particular UDT and is identified by a unique object identifier (OID) which can be sys-tem- generated or user-defined. Besides instantiable UDT, SQL: 1999 also supports non-instantiable UDT, which conforms to the notion of abstract classes in OOPL. In addition, UDT may have methods (behavior) which are either system-generated or implemented by users. They may participate in type hierarchies, in which more specialized types (subtypes) inherit structure and behavior from more general types (super-types), but may specialize corresponding definitions. Thus, SQL: 1999 supports polymorphism and substitutability, however, multiple-inheritance is not supported. Due to the association of UDT with tables SQL: 1999 does not support encapsulation and, consequently, there is nothing like the degree of encapsulation known from OOPL (public, protected, private).

 

 

Operational Aspects

 

Besides the fundamental modeling aspects discussed so far, we also have to examine operational aspects in order to figure out the conceptual distance adequately. The following aspects are most relevant to our consideration:

 

Descriptive Queries vs. Navigational Processing: - While OOPL processing is inherently navigational, SQL supports a set-oriented, descriptive query language. Both navigational and set-oriented query processing are important to modern software systems. Therefore, ORDBMS should also directly facilitate navigational processing to fulfill this requirement of OO applications. Direct support of navigational access by the DBMS would mean that a database object referred to by its OID can be provided as in-stance of an OOPL class. However, to the best of our knowledge none of the currently available ORDBMS directly supports this notion of navigation. A naive coupling of OOPL with descriptive SQL requires to issue one or several corresponding SQL queries to the database for processing a dereferencing operation, e. g. GetObject(Ref), and retrieving the requested object from the database server. Such a processing scheme will surely lead to bad runtime behaviour of the entire system, since the costs of transforming a navigational operation to SQL queries, of evaluating these queries in the DBS, and of the client/ server communication can be very high. Obviously, the lack of DB APIs in directly supporting navigational access impairs the system efficiency badly. Thus, either direct support for navigation 1 must be provided or efficient perfecting mechanisms exploit-1. In section 5 we will see that one of the commercially available ORDBMS provides some basic means for a direct access to objects by OID. Measurement results show that this is at least a step into the right direction set-oriented database access and, thereby, reducing the number of database round-trips must be applied in order to effectively couple OOPLs with ORDBMS.

 

Structured Query Results: - As already mentioned several times before, OOPL support complex structured object types, especially by the possibility of nesting complex data types as well as using collection types and references. We have also mentioned previously that these facilities of modelling complex structured objects have been integrated into SQL by the SQL: 1999 standard. Unfortunately, because of the traditional basic concepts of SQL, complex structures (actually supported both in OOPL and ORDBMS) get lost at the DBMS interface, since only (sets of) simply structured, flat data tuples can be retrieved. Therefore, if we want to couple an OOPL with an (O)RDBMS, it is necessary to separately retrieve simple fragments of complex objects by issuing several SQL queries, and then rebuild complex object structures at the programming language level (see Fig. 1). The mentioned problem even gets worse, if not individual complex objects, but complex structures (object graphs) containing numerous related objects interconnected by object references are to be selected as units. Obviously, the lack of direct support for complex structured objects at the DB API reveals a bottleneck between the two paradigms, and prevents new generation software systems from exploiting the potential power of ORDBMS most effectively.

 

Object Behaviour: -. Of course, the operational aspects also encompass the object behaviour implemented in the database. Because of special implementation aspects these methods (UDF) can almost exclusively be executed at the server side, or, if these UDF or special client invokable pendants are executed at the client side, it cannot be guaranteed that these pendants perform the original semantics. For example, there may be complex dependencies between UDF and integrity constraints, e. g., referential integrity constraints and triggers, which are implemented by using SQL and are automatically ensured by the DBMS. Thus, it is almost impossible to support calling UDF at the OOPL level in the same (‘natural’) way as usually object methods can be called. There-fore, we do not consider a mapping of object methods in this paper and restrict our considerations to navigational and set-oriented access.

 

Mapping Rules

 

In the previous section, we outlined the conceptual distance between the OO paradigm and SQL: 1999. Considering an individual ORDBMS, its OO features determine the overhead which has at least to be spent in order to bridge this distance. Nevertheless, in theory there is an entire spectrum of possibilities to design the required mapping layer. At this point, we want to mention that there are some more aspects of ORDBMS, which OO applications may benefit from, but which cannot be captured in this paper, e. g., and facilities for integrating external data sources into database processing. The other hand, on how far the OO features are to be exploited. Regarding the first point (‘natural’ coding), we demand that the programmer must not be burdened by having to take data management aspects into account. Thus, programming must be independent of the database as well as the mapping layer design. Regarding the second point (degree of exploiting OO features), we want to outline the two extremes of the mentioned spectrum, i.e., pure relational mapping and full exploitation of the OO features offered by the considered ORDBMS

 

Pure Relational Mapping: - As mentioned before, there are several commercial POS mapping OO structures to relational tables. Objects are represented by table rows. Since RDBMS do not support set-valued attributes, user-defined data types, and object references, additional tables are required to store corresponding data and to connect them with the corresponding class tables via foreign keys. Thus, several tables may be required to map a given class. Principally, there are several ways of representing a class hierarchy in the relational model. After studying pros and cons, we decided to use the horizontal partitioning approach, since it provides good performance in most cases, and is also used in most commercial POS.

 

Object-Relational Mapping: - Exploiting the OO features of ORDBMS is commonly argued to be more promising, but the real benefits in comparison to the pure relational mapping are not very well studied yet. This paper will give some performance evaluations later on. Before outlining general mapping rules exploiting OO features of ORDBMS, we have to re-emphasize the following point. Our benchmark approach, which will be out-lined in the subsequent section, assesses a certain ORDBMS by taking the required mapping overhead into account. In order to be fair, the mapping layer used throughout the measurements must be designed in an optimal way w. r. t. the capabilities of the ORDBMS considered. Therefore, the design of the mapping layer may differ with the ORDBMS to be assessed. In the following, we just outline general mapping rules, which are based on the SQL: 1999 concepts, in order to provide some basic understanding on how a mapping layer can be designed. A C++ class maps to a UDT in SQL: 1999. Non-instantiable UDT corresponds to abstract C++ classes. A UDT is associated with exactly one table (typed table) to initialize its instances. Each tuple in this table represents a persistent instance (object) of a particular class and is associated with a system-generated OID. Embedded objects (aggregation) entirely belong to their top-level object and, therefore, do not own an OID. Extents are mapped to the list constructor of C++ STL. Keys are managed at the mapping layer by applying the map constructor of C++ STL. A C++ class hierarchy maps to a hierarchy of structured UDT. However, SQL: 1999 only supports single-inherit-. So that multi-inheritance has to be simulated at the mapping layer. SQL references are mapped to C++ pointers. Since SQL: 1999 does not support diametric references, a relationship type is broken down into two separate primary-key/foreign-key connections and the mapping layer maintains the referential integrity. Except for mutator and observer methods, which are generated by the mapping layer w. r. t. the constraints de-fined in the user database schema, object behaviour is not yet considered in our performance investigation. Navigation is supported by offering the function GetObject(Ref) which, in the case that the DB API does not directly support OID-based object fetching, is implicitly transformed into an SQL query. After having discussed (modelling and operational) discrepancies between ORDBMS and OOPL as well as the mapping rules needed to bridge the gap, we proceed with our performance evaluations.

 

Performance Evaluation

 

Our discussion shows that there is only a small difference between the OO and OR paradigms w. r. t. modelling aspects, but a considerable distance w. r. t. the operational aspects and the application semantics. In order to further evaluate this distance as well as to quantify the overhead required for bridging this gap, we propose a configurable benchmark approach [18, 26]. Remind, we do not consider OODBMS, but ORDBMS, because we more and more have to face the situation that people are using OOPL for software development and (O) RDBMS for data management purposes so that there is a need for a more detailed examination of the efficiency of possible coupling mechanisms. Consequently, the OO-Benchmark representing an important standard for benchmarking OO systems is not appropriate for our purposes. The performance of RDBMS or ORDBMS has traditionally been evaluated in isolation by applying a standard benchmark directly at the DBMS interface. Sample benchmarks are the Wisconsin benchmark, the TPC benchmark as well as the Bucky benchmark [6]. These benchmarks are very suitable for comparing different DBMS with each other. However, none of these benchmarks helps to assess the contributions of a DBMS to OO software development. Consequently, these other approaches do not take the typical application server architecture and the fact that the DBMS capabilities determine the overhead of the required application/mapping layer into account. Furthermore, data types as well as operations of the applications we consider may differ significantly (double-edged sword), so that a standard benchmark can not cover the entire spectrum. Therefore, we propose an open, configurable benchmark system allowing examining the entire system (incl. mapping layer) w. r. t. to its typical applications. Such a system will also help us to get results transcending those reported on in this paper (see succeeding sections), e. g., and more detailed examinations of navigational support. In the following, we outline our first prototype.

 

Benchmark System

 

An open, configurable benchmark system is not necessarily difficult to be applied, as our approach proves. Indeed, our current prototype offers predefined configurations, which w. r. t. data-base size are small, medium and large in order to be sufficiently scalable. Both, structures (data type and type hierarchies) as well as complexity of data in the standard configurations are determined in cooperation with one of the eading software vendors for business standard software and, thus, represent a wide spectrum of typical application domains. In addition, our benchmark system can be simply configured according to the particular properties of a concrete application. Among other possibilities, users can directly control the generation process of the benchmark database, e. g., specify the database size, the complexity of the class hierarchy and the complexity of the individual objects, in order to take care that the special requirements of the application in mind are taken into account. After having generated the benchmark database the user may select from a given set of query templates and indicate how many times each template is to be instantiated.

 

New query templates can be easily added, if the existing templates do not reflect application characteristics sufficiently. Based on these user selections/ specifications the load generator creates a set of queries which is passed to the query executor, which, in turn, serves as a kind of driver for measurements. Users can also specify which kinds of measurement data are to be collected by the system, i. e., amount of time spent at the DB or the mapping layer for query transformation, or the time spent for SQL query evaluation, data loading, and/or result set construction. Corresponding values are collected by the data collector during execution of the query set and after-wards stored in the DBS for further evaluations. As explained in more detail in , the special challenges of this benchmark approach are, on one hand, to properly take into account the requirements of OO system development, and, on the other hand, to guarantee an optimal mapping w. r. t. the particular capabilities of an individual ORDBMS.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Benchmark and Measurements

 

In order to get a complete performance evaluation, we concentrate on answering the following questions: 1. which performance gains offer ORDBMS in comparison to RDBMS regarding their usage in OO software development?

2. Which additional overhead has to be spent at the mapping layer in order to bridge the gap between the OO and OR paradigms and how does it behave facing different query types?

3. To which extent is the system performance influenced by the capabilities of the (O) RDBMS API?

In order to be able to answer the first two questions, we have selected a set of typical benchmarking queries according to a long-term study of a leading software company. These queries represent a wide spectrum of typical operations in the target applications of ORDBMS. We have compared a purely relational mapping with an object-relational one (by means of exploiting its OO modelling power) by using a currently available commercial ORDBMS. This way we ‘measured’ how OO software development can leverage from the OO extensions offered by ORDBMS (e. g., structured UDT, references, etc.). The operations considered for that purpose are implemented as query tem-plates and grouped in following categories:

 

Navigation operations: Navigation operations, such as GetObject(OID), are not directly supported by almost all currently available ORDBMS. Considering such operations helps us to assess the performance of ORDBMS in supporting navigational processing. We hope that corresponding results ‘help’ ORDBMS vendors to make ORDBMS as efficient as OODBMS are in this concern.

Queries with simple predicates on scalar attributes: Queries of this category have simple predicates just containing a single comparison operation on a scalar attribute. This group mainly serves to provide a performance baseline that can be helpful when interpreting results of more complex queries.

Queries with predicates on UDT: This group contains queries with simple predicates (a single comparison operation) on attributes of structured, non-atomic data types. Thus, it mainly serves for assessing the efficiency of mapping UDT to (O) RDBMS.

Queries with predicates on set-valued attributes: This group contains queries with simple predicates (a single IN operation) on nested sets. ORDBMS directly support set-valued data types. In the relational mapping, several tables (according to the degree of nesting), which are connected by primary/foreign-keys, are necessary.

Queries with path predicates: This group contains queries evaluating simple predicates after path traversals. These queries allow to evaluate the efficiency of processing dereferencing operations (path traversals) in ORDBMS.

Queries with complex predicates: Queries of this group contain complex predicates challenging both query transformation as well as query optimization.

Queries on the class hierarchy: While all other queries exclusively deliver direct in-stances of a single queried class, queries of this group deliver transitive instances as well. Predicates conform to those of the second category. This group of queries allows to evaluate the efficiency of the ORDBMS in handling class hierarchies (inheritance). The comparison with the relational mapping has been expected to demonstrate the advantages of ORDBMS. The third question posed at the beginning of this section deals with the capabilities of the DB interface especially w. r. t. support for complex structured objects and navigational access. In order to examine these aspects, we performed measurements on two different (commercially successful) ORDBMS. One of these systems offers the more traditional interface, whereas the second one provides some basic means of supporting complex structures objects. We performed our measurements 4 on a benchmarking database with 100 classes and 250000 instances (configuration medium). In order to use a representatively structured class hierarchy, we studied typical application scenarios of a renowned vendor of business standard software and parameterized our population algorithm accordingly. We measured the database time (DB time) and the total system time (TS time). The DB time of SQL queries is the time elapsed between delegating the queries to the DBMS and receiving back the results (open cursors, traverse iterators). It includes the time for client/ server communication, the time for evaluating the queries within the DBS and the time for loading the complete result sets. This has to be taken into account, when analyzing the measurement results. The TS time is defined as the total elapsed time from issuing a query operation at the OOPL level until having received the complete result set. It contains the time spent within the mapping layer as well as the DB time. We think that these 3 questions have to be answered before we can think about, how OR technology can be improved in order to support OOPL better and more efficiently. In the following section, we report on our measurement results.

 

Measurement Results and Observations

 

ORDBMS vs. RDBMS

 

In the first test series, we have compared a purely relational mapping with an OR mapping by using one of the leading currently available commercial ORDBMS. The hardware and the soft-ware configurations are left unspecified, to avoid the usual legal and competitive problems with publishing performance numbers for commercial products.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


All performance measurements are averages of multiple trials, with more trials for higher variance measurements. For each DBMS tested, we put much effort in optimization (e.g., indexes) and mapping layer design in order to achieve the

 

best performance possible. Vestigation aims at quantifying the benefits of OO extensions offered by ORDBMS in more detail. Due to space restrictions, it is not possible to analyze all results in detail. It can be ob-served that the OR mapping outperforms the purely relational mapping in all query categories. Although the OR mapping shows only tiny ad-vantages in retrieving small result sets, it provides performance gains of up to 40% in retrieving large result sets, or processing queries on class hierarchies. The reasons are twofold. First, the OO features provided by the ORDBMS contribute to keep the complexity of the mapping layer low (better query evaluation strategy, less overhead for synthesizing the result set) and to reduce client/ server communication (less queries). Second, the implementation of the ORDBMS (we used) enables performance improvements, (even) if using OO extensions. In the purely relational case, it is not possible to map a (complex) class to exactly one table. Mapping set-valued attributes, aggregations and (m:n)-relationships properly demands several tables interconnected by primary/foreign keys.

 

DB Time

 

Performance Characteristics of the Mapping Layer

 

In order to characterize the performance of the mapping layer adequately, we have investigated simple queries with different selectivity. The results are presented in Figure. As already mentioned in section 2, the DB APIs of almost all currently available ORDBMS do not support navigational access directly. Hence, navigational operation, such as GetObject (Ref), must be transformed to a database query (SQL: 1999), such as "Select * from... Where OID = Ref", by the mapping layer. Such a query strategy, especially when intensively dealing with navigational operations as usually required by most OO applications, obviously leads to high processing overhead spent in the database system as well as very high communication costs (over 70% of the entire system time). The (probably not very astonishing) observation is that the traditional query strategy is not adequate for supporting navigational access.

   

 


The DB time of set-oriented queries shows only a slight ascent with increasing result sets, while the additional mapping overhead increases rapidly. When retrieving 1250 objects, the time spent at the mapping layer even exceeds 86% of the total system time. This observation can be explained as follows. In the early days of ORDBMS, these systems comparable to RDBMS were not very successful in supporting navigational access, but excellent in processing set-oriented access (as they are still today). Unfortunately, OO applications can hardly benefit from this advantage, because the ORDBMS API is ‘inherited’ from traditional RDBMS and, therefore, still only supports simple, flat data. In lack of an extensible DB API which may generically support complex data types defined by the user, complex objects in an ORDBMS have to be first ‘disassembled’ into scalar values, and afterwards reconstructed (at the mapping layer) to objects of a certain class in the particular OOPL. This kind of overhead gets dramatic with increasing result set cardinality and impairs the en-tire system efficiency significantly. Regarding these measurement results, we can draw the following conclusions. In order to be able to support navigational access better, the ORDB API should directly sup-port navigational operations like GetObject(Ref), so that the costs of transforming navigational operations to SQL queries and for evaluating these queries can be avoided. Furthermore, it should also support the notion of complex objects directly and offer the possibility of retrieving complex objects as units. According to our examinations, such improvements can increase the entire system efficiency by up to 400%.

 

 

Supports for Complex Objects

 

As already mentioned before, the lack of direct support for complex objects and navigational access, the API level extremely impairs the overall system efficiency. Fortunately, a leading ORDBMS vendor already offers an extended call level interface, which, as we can see later in this section, directly supports navigational access as well as retrieval of complex objects as units, and, in addition, even retrieval of complex object graphs as units. Navigation is enabled by the possibility of autonomously retrieving complex structured objects (by OID) as instances of C structures. This simplifies the mapping to OOPL, such as C++, considerably and, therefore, is undoubtedly the first step into the right direction, although this mechanism does not yet supports the actually wanted seamless coupling (transparent transformation from a database object to an instance of an OOPL class). The mentioned support for complex objects at the level of the DB API allows to directly retrieving a complex object’s data from the database into the main memory by specifying its OID or a predicate. Therefore, the expensive query processing strategy described earlier, that can be avoided. Remind that the measurements described in section earlier, that have been performed on an ORDBMS that does not possess a DB API as the one described in this section. To show the importance of and the corresponding demand on a suitable support for complex objects at the DB API level, we repeated the measurements described earlier, that on the ORDBMS referred to in this section and providing the mentioned complex object support at its API. Illustrates the measurement results. Obviously, the additional overhead spent at the mapping layer is now independent from the cardinality of the query result sets. Thus, the direct supports of complex objects at the DB API results in a clear performance gain (up to 400%). The direct support for navigational access at the DB API level mentioned in this section, allowing to directly access objects by calling a function like Get-Object( Ref), avoids expensive processes (query transformation, data types conversion and object re-construction). This obviously contributes to improve performance significantly. Fig. 6a shows a comparison between a query strategy (transforming a navigational operation to an SQL query) and a navigational strategy (directly calling a GetObject(OID) function at the DB API). The advantages of the navigational strategy are obvious. With a direct support of navigational access the entire system efficiency increases by approximately 200%. OO applications often want a set of objects interconnected by object references (object graph) to be retrieved completely within just a single database interaction. Fig. 6b shows a comparison of two strategies for retrieving complex object graphs. In this measurement, we used the ORDBMS directly supporting navigation as well as retrieval of complex object graphs. It can be seen clearly that strategy I exploiting the ability of retrieving object graphs exhibits a performance gain of about 100% already at a result set cardinality of 13 objects. The design of a new DB API, which directly supports complex structured objects, is by no means an easy job and requires generic design methods, because user-defined data types can be arbitrarily structured, e. g., contain other complex data types, such as UDT, references and set-valued attributes. Furthermore, a DB API has always to be multi-lingual requiring to support all common programming languages simultaneously and, therefore, making it very difficult to offer the best of both worlds (DBMS and OOPL) without any compromises.

 

Conclusions

 

We have emphasized the importance of assessing ORDBMS w. r. t. their capabilities of supporting OOPL. We have first qualitatively considered ORDBMS and OOPL regarding modelling and operational aspects relevant for object-oriented software development. As the object-relational (SQL: 1999) and the object-oriented data model are essentially coming together, the operational distance between these two paradigms is still considerable so that an additional mapping layer is necessary to over-come this gap. Regrettably, such an additional layer impairs the performance of the overall system considerably. Additionally, we performed quantitative examinations (measurements) in order to assess ORDBMS in their capabilities of supporting OOPL. Indirectly, these measurements are supposed to contribute to promoting the optimal utilization of currently available ORDBMS in object-oriented system development and to guide the future development of ORDBMS in a way that the support for OOPL is improved. Regarding our performance examinations, we have motivated the necessity of an open, configurable benchmark approach, because not only the performance of ORDBMS themselves but also the additional overhead, which is necessary for bridging the conceptual and operational distance between ORDBMS and OOPL, have to be taken into account and, therefore, properly characterized. It has been clearly illustrated by our performance measurements that object-relational database technology gets more and more mature, not only conceptually (data model, query processing), but also w. r. t. performance. The model facilities contribute to keep the mapping layer ‘thin’ in contrast to RDBMS. This, on one hand, reduces the implementation efforts, and, on the other hand, increases the entire system efficiency. Despite the available object-oriented ex-tensions, which entail an unambiguous gain in comparison to RDBMS, the potential benefit of object-relational database technology in our opinion is not yet exhausted, since the traditional DB API is so far not capable of successfully supporting object-oriented principles. The DB API of almost all ORDBMS still can not support navigational operations and complex object structures directly so that new generation software systems can not take advantage of object-relational database technology in an optimal way. It can be called the ‘bottleneck’ between the object-relational and object-oriented paradigms. Our examinations have shown clearly that a new DB API directly supporting navigational operations and complex objects is necessary. Obviously, it is not easy to equip ORDBMS with such a new interface. For example, such an interface must be multi-lingual. For answering the question, how user-defined data types can be effectively represented at the OOPL level, further research efforts are required. Altogether, the problem of a seamless and effective mapping of SQL: 1999 to OOPL, such as C++ and Java, has to be worked on further. Our future work will mainly be looking for possible solutions. Furthermore, we plan intensive studies of ORDBMS capabilities for supporting navigational access, because ORDBMS are still behind OODBMS in this concern. Generally, after having characterized the performance aspects in more details, our long-term objective is to develop applicable concepts contributing to increase the performance of ORDBMS.

 

 

 

 

 

Bibliography

 

www.yahoo.com

www.askjeeves.com

www.hotmail.com

www.oracle.com

References

1. Bernhard, R., Flehmig, M., Mahdoui, A., Ritter, N., Steiert, H.-P., Zhang, W.P.:

“Building a Persistent Class System on top of (O)RDBMS - Concepts and Evalua-tions”,

Internal Report, University of Kaiserslautern, 1999

2. Bernstein, P.A., Harry, B., Sanders, P.J., Shutt, D., Zander, J.: “The Microsoft Re-pository”,

Proc. VLDB Conf., 1997, pp. 3-12

3. Bernstein, P.A., Pal, S., Shutt, D.: “Context-Based Prefetch for Implementing Ob-jects

on Relations”, Proc. VLDB Conf., 1999, pp. 327-338

4. Bitton, D., DeWitt, D.J., Turbyfill, C.: “Benchmarking Database Systems: A Sys-tematic

Approach”, Proc. VLDB Conf., 1983, pp. 8-19

5. Carey, M.J., DeWitt, D.J.: “Of Objects and Databases: A Decade of Turmoil”,

Proc. VLDB Conf., 1996, pp. 3-14

6. Carey, M.J., DeWitt, D.J., Naughton, J.F., Asgarian, M., Brown, P., Gehrke, J.E.,

Shah, D.N.: “The Bucky Object-Relational Benchmark”, Proc. VLDB Conf., 1996,

pp. 135-146

7. Carey, M.J., DeWitt, D.J., Kant, C., Naughton, J.F.: “A Status Report on the OO7

OODBMS Benchmarking Effort”, Proc. ACM OOPSLA, 1994, pp. 414-426

8. Carey, M.J., Doole, D., Mattos, N.M.: “O-O, What Have They Done to DB2?”,

Proc. 1999 25th. VLDB Conf., pp. 542-553

9. Cattell, R.G.G., Barry, D., Bartels, D., et al: “The Object Database Stand-ard:

ODMG 2.0”, Morgan-Kaufman Publishers, San Mateo, 1997

10. Gray, J.: “The Benchmark Handbook for Database and Transaction Processing

Systems”, Morgen Kaufmann Publishers, San Mateo, CA, USA, 2nd Ed., 1993

11. Gulutzan, P., Pelzer, T.: “SQL-99 Complete, Really”, R&D Publications, 1999

12. Keller, A., Jensen, R., Agrawal, S.: “Persistence Software: Bridging Object-Ori-ented

Programming and Relational Database”, Proc. ACM SIGMOD Conf., 1993,

pp. 523-528

13. Mahnke, W., Steiert, H.-P.: “The Application Protential of ORDBMS in the Design

Environments“, Proc. CAD 2000, Berlin, pp. 219-239 (in German)

14. Ontos Business Data Server, http://www.ontos.com

15. Poet Object Server, POET Software, POET SQL Object Factory, http://poet.com/

16. Rao, B.R.: “Object-oriented Databases: Technology, Applications, and Products”,

McGraw-Hill, New York, 1994

17. RogueWave Software, DBTools.h++,

http://www.roguewav3e.com/products/dbtools/

 





   
Contact Us Privacy Policy Terms of Service Send Feed Back About Us Help  Copy Rights © 2009 AMCY5.COM All Rights Reserved