Front cover

IBM Informix:
Integration Through
Data Federation
Managing information and protecting
development assets

Integrating heterogeneous
sources of data

Data federation and join
optimization

Chuck Ballard
Nigel Davies
Marcelo Gavazzi
Martin Lurie
Jochen Stephani

ibm.com/redbooks
International Technical Support Organization

IBM Informix: Integration Through Data Federation

August 2003

SG24-7032-00
Note: Before using this information and the product it supports, read the information in
“Notices” on page vii.

First Edition (August 2003)

This edition applies to DB2 Information Integrator Version 8.1, Informix Enterprise Gateway
Manager Version 7.3.1, Informix Dynamic Server Version 9.4, Informix Extended Parallel Server
Version 8.4, IBM Red Brick Version 6.2, DB2 Version 8.1, and Oracle9i.

© Copyright International Business Machines Corporation 2003. All rights reserved.


Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP
Schedule Contract with IBM Corp.
Contents

Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
The team that wrote this redbook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Become a published author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
Comments welcome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv

Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Executive summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

Chapter 1. Data federation overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13


1.1 The challenge of information integration . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2 To federate or not to federate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.3 Data federation examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.3.1 Data federation with DB2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3.2 Data federation with Informix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.4 Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.4.1 Naming conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.4.2 Data types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

Chapter 2. The data federation project . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31


2.1 Environment and server configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.1.1 Client configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.1.2 Database connectivity using DB2 Information Integrator . . . . . . . . . 35
2.1.3 Database connectivity using Informix EGM . . . . . . . . . . . . . . . . . . . 36
2.1.4 Other configuration options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.2 Schema definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3 Naming conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.3.1 Database servers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.3.2 Database identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.3.3 References to remote tables within the federated database . . . . . . . 45
2.3.4 User IDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

Chapter 3. Overview of project data sources . . . . . . . . . . . . . . . . . . . . . . . . 47


3.1 DB2 UDB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.1.1 DB2 UDB architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.1.2 DB2 UDB functions and features . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.2 Informix Dynamic Server (IDS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53



3.2.1 IDS architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.2.2 Functions and features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.3 Informix Extended Parallel Server (XPS) . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.3.1 XPS architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.3.2 Functions and features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.4 IBM Red Brick Warehouse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.4.1 Red Brick architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.4.2 Functions and features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.5 Oracle9i and Microsoft Excel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

Chapter 4. DB2 Information Integrator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73


4.1 Product overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.1.1 Systems environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.1.2 Components and packaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.2 Installation and configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.2.1 Pre-installation requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.2.2 Installation instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.2.3 Post-installation tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.3 Creating and using wrappers and nicknames . . . . . . . . . . . . . . . . . . . . . . 97
4.3.1 Create a wrapper . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.3.2 Define a server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.3.3 Define user mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.3.4 Create and authorize nicknames . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.3.5 Testing the connection to a data source . . . . . . . . . . . . . . . . . . . . . 107
4.4 Considerations for use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.4.1 Statistics and index specifications. . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.4.2 Informix synonyms and I-STAR . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.4.3 Transaction support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.4.4 User mapping authentication for Informix data sources . . . . . . . . . 113
4.4.5 Pass-through . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.4.6 Data type mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.4.7 Server options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.4.8 Capacity planning for DB2 Information Integrator . . . . . . . . . . . . . . 119
4.5 Troubleshooting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.5.1 Connecting to data sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.5.2 Error messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.5.3 Informix environment settings and DB2DJ.INI file. . . . . . . . . . . . . . 120
4.5.4 Capturing server options and nickname definitions . . . . . . . . . . . . 121
4.5.5 Access plans and the query optimizer . . . . . . . . . . . . . . . . . . . . . . 123
4.5.6 Diagnostic information to collect . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
4.5.7 Tracing DB2 Information Integrator wrapper calls. . . . . . . . . . . . . . 126

Chapter 5. Informix Enterprise Gateway Manager . . . . . . . . . . . . . . . . . . 131



5.1 Product overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.1.1 Systems environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.2 Installation and configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.2.1 Pre-installation requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.2.2 Installation instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.2.3 Configuration tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.3 Considerations for use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
5.3.1 Using non-Informix SQL statements . . . . . . . . . . . . . . . . . . . . . . . . 149
5.3.2 Object names in queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.3.3 Using views versus synonyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.3.4 Using Informix 4GL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.3.5 Distributed transactions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.3.6 Multiple data sources within a single application . . . . . . . . . . . . . . 151
5.3.7 Effect of accuracy of statistical information . . . . . . . . . . . . . . . . . . . 151
5.3.8 Effect of Informix style system catalog . . . . . . . . . . . . . . . . . . . . . . 152
5.3.9 Oracle date and datetime conversion . . . . . . . . . . . . . . . . . . . . . . . 152
5.3.10 Using cursors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.3.11 Temporary files for cursor handling. . . . . . . . . . . . . . . . . . . . . . . . 153
5.3.12 Limiting the number of rows returned in a query . . . . . . . . . . . . . . 153
5.4 Troubleshooting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

Chapter 6. Data federation in action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159


6.1 Test application examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
6.1.1 Query 1 - Summarizing regional data . . . . . . . . . . . . . . . . . . . . . . . 160
6.1.2 Query 2 - Federated views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.1.3 Query 3 - Consolidating the data . . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.2 Testing with DB2 Information Integrator . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.2.1 Join and insert testing with DB2 II . . . . . . . . . . . . . . . . . . . . . . . . . . 170
6.3 Testing with Informix Enterprise Gateway Manager . . . . . . . . . . . . . . . . 181
6.3.1 Join and insert testing with EGM. . . . . . . . . . . . . . . . . . . . . . . . . . . 182
6.3.2 Updating multiple data sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

Chapter 7. Optimization in a federated environment . . . . . . . . . . . . . . . . 193


7.1 Performance options and considerations . . . . . . . . . . . . . . . . . . . . . . . . 194
7.2 Remote data source tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
7.3 Server options in DB2 II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
7.3.1 Pushdown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
7.3.2 Maximal pushdown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
7.3.3 Collating sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
7.4 Use of Materialized Query Tables (MQTs) . . . . . . . . . . . . . . . . . . . . . . . 204
7.5 Remote data source catalog statistics. . . . . . . . . . . . . . . . . . . . . . . . . . . 209
7.6 Optimizer hints in query text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212

Chapter 8. Hints and tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

8.1 Federated inserts with Informix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
8.2 Using DATE columns for a Union operation . . . . . . . . . . . . . . . . . . . . . . 219
8.3 Use of current schema with DB2 Interactive Explain . . . . . . . . . . . . . . . 219
8.4 DB2 problem determination series tutorial . . . . . . . . . . . . . . . . . . . . . . . 220
8.5 Losing connection to a data source. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
8.6 Using views versus synonyms with data federation . . . . . . . . . . . . . . . . 220

Appendix A. DB2 II and Informix EGM function summary . . . . . . . . . . . 225


Data federation summary table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226

Appendix B. Nonrelational wrappers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229


Microsoft Excel wrapper . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
XML wrapper . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

Abbreviations and acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241


IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Other publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Online resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
How to get IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Help from IBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245



Notices

This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area.
Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product, program, or service that
does not infringe any IBM intellectual property right may be used instead. However, it is the user's
responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not give you any license to these patents. You can send license
inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such provisions
are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES
THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer
of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at
any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm
the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on
the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.

COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrates programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the
sample programs are written. These examples have not been thoroughly tested under all conditions. IBM,
therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy,
modify, and distribute these sample programs in any form without payment to IBM for the purposes of
developing, using, marketing, or distributing application programs conforming to IBM's application
programming interfaces.



Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:

AIX®, CT™, DataBlade™, DataJoiner®, DB2®, DB2 Universal Database™, Distributed
Relational Database Architecture™, DRDA®, ^™, IBM®, ibm.com®, IMS™, Informix®,
iSeries™, Lotus®, Lotus Notes®, NetVista™, Notes®, OS/390®, Red Brick™,
Red Brick Vista®, Redbooks™, Redbooks (logo)™, Tivoli®, z/OS™

The following terms are trademarks of other companies:

ActionMedia, LANDesk, MMX, Pentium and ProShare are trademarks of Intel Corporation in the United
States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the
United States, other countries, or both.

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun
Microsystems, Inc. in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Other company, product, and service names may be trademarks or service marks of others.



Preface

This IBM® Redbook is primarily intended for use by IBM Informix® Customers
and Business Partners, but much of the information in this publication can be
used in any data federation implementation. The purpose is to demonstrate how
your heterogeneous sources of information can be federated and enable the
implementation, use, and maintenance of a robust, managed information
environment.

The information in this publication will help you understand how the IBM Informix
database family of products can better be used together, and with other
information sources. The primary focus is on moving towards an integrated
information environment. There are several ways to provide this type of
environment, and we will discuss those alternatives.

We demonstrate how you can start down the path towards an integrated
information environment through the use of data federation technology. There are
two primary tools available to help you along, and we will discuss them and their
capabilities in later chapters. Those tools are DB2® Information Integrator and
Informix Enterprise Gateway Manager.

We have implemented Informix Dynamic Server (IDS), Informix Extended


Parallel Server (XPS), IBM Red Brick™ Warehouse, DB2 UDB, and Oracle9i, in
a federated data implementation, by using DB2 Information Integrator and
Informix Enterprise Gateway Manager. Then we provide examples of the
capabilities of those two data federation products, and compare them at a high
level to give you an understanding of the capabilities of both.

In addition to data federation and integration, there are other related topics of
interest in this publication. The following is a brief description of the topics and
how this publication is organized:
• “Introduction” on page 1 provides a high level overview of the contents of this
publication. It enables the reader to get a basic understanding of the contents
and conclusions presented in this publication, without the requirement of
reading all the detailed and technical supporting information.
• Chapter 1, “Data federation overview” on page 13, gives an overview of the
topic of data federation. It explains why it is strategic and describes the
benefits. Some best practices are provided to help guide you through
planning for and implementing your federated data environment. It can deliver
significant advantages and enable you to get faster and easier access to your
information assets.



• Chapter 2, “The data federation project” on page 31, describes the project
environment and the systems architecture used during the development of
this publication. It defines the hardware and software products used, and the
operating systems. It also describes the database and data schema used for
our testing. We used the Informix Stores Demo database as the basis, which
is typically very familiar to Informix customers. However, the schema was
modified somewhat to provide the tables and data distribution required in our
federated environment. The sample application that is part of the Stores
Demo database was used to simulate a real production environment for our
testing.
• Chapter 3, “Overview of project data sources” on page 47, presents an
overview of products used for our federated data environment. This chapter is
primarily for those not familiar with the database management systems
(DBMS) used. We take a brief look at the architecture, and some of the
features and functions that will be valuable in configuring the databases for
use in a federated environment. The DBMS’s used were DB2 UDB, Informix
Dynamic Server, Informix Extended Parallel Server, IBM Red Brick
Warehouse, and Oracle9i. We also demonstrated data federation with
Microsoft Excel and XML files. This topic is discussed in Appendix B,
“Nonrelational wrappers” on page 229.
• Chapter 4, “DB2 Information Integrator” on page 73, is an overview of the
DB2 Information Integrator product. It became generally available this year
and is well positioned to help access, manipulate, and integrate the many
data sources present in most typical companies. We cover the architecture,
some functions and features, and instructions for installing and configuring
the product. In addition, we provide some considerations for its use to help in
your implementation project.
• Chapter 5, “Informix Enterprise Gateway Manager” on page 131, is an
overview of the Informix Enterprise Gateway Manager (EGM) product. This is
a product with similar functionality to the DB2 Information Integrator, that has
been available from Informix. We describe the architecture, some of the
functions and features, and instructions for installing and configuring the
product. In addition, we provide some considerations for its use.
• Chapter 6, “Data federation in action” on page 159, takes you through some
of the testing we did with the data federation products. We describe the
sample queries used and present the results of our tests. The information
primarily centers around join execution in a heterogeneous data source
environment. Most of the tests were performed with both the DB2 Information
Integrator and the Informix Enterprise Gateway Manager products, as there is
interest in both products.
• Chapter 7, “Optimization in a federated environment” on page 193, takes the
testing in Chapter 6, “Data federation in action” on page 159, a step further. In
Chapter 6 we focused on the capability of join processing. This chapter



focuses on how well, and how fast, those joins can be performed. The
objective here was to test the optimization capabilities of the products. This is
a very important consideration that can impact resource utilization,
performance, and response times.
• Chapter 8, “Hints and tips” on page 217, presents some hints and tips to help
you as you implement your federated environment. These are typically
situations and issues we encountered in our installation and testing. We
provide you with the solution or workaround to help make your
implementation easier and faster.
• Appendix A, “DB2 II and Informix EGM function summary” on page 225,
presents a summary table of the high-level functions and features desirable in
a data federation environment. Then we list if, and how well, the DB2
Information Integrator and Informix Enterprise Gateway Manager products
provide that capability. This helps clarify some of the basic capabilities of each
product.
• Appendix B, “Nonrelational wrappers” on page 229, presents information on
two of the DB2 Information Integrator wrappers that can be used for
nonrelational data sources. In particular, those are Microsoft Excel and XML
files. In each case we demonstrate how data from relational tables and these
nonrelational files can be joined together as if the data were from the same
data source.

The team that wrote this redbook


This redbook was produced by a team of specialists from around the world. Four
members worked at the International Technical Support Organization, in San
José, California and one worked remotely in Boston, Massachusetts.

Figure 1 Left to right: Nigel Davies, Marcelo Gavazzi, Chuck Ballard, Jochen Stephani

Chuck Ballard is a Project Leader at the International Technical Support


Organization, in San Jose, California. He has over 35 years of experience,
holding positions in the areas of product engineering, sales, marketing, technical
support, and management. His expertise is primarily in the areas of database,
data management, data warehousing, business intelligence, and process
re-engineering. He has written extensively on these subjects, taught classes, and
presented at conferences and seminars worldwide. Chuck has both a Bachelors
degree and a Masters degree in Industrial Engineering from Purdue University.

Nigel Davies is an IT Architect working at the Cumberland Forest Application


Centre in Sydney for IBM Global Services Australia. Although now living and
working in Australia, he is originally from the United Kingdom, where he earned
his Masters degree in Chemistry at Oxford University. He has over 22 years of
experience in the IT industry. During this time he has worked as a programmer,
designer, business analyst, DBA, and even project manager, before finally
settling down as an IT architect. Nigel has been working with DB2 since 1987
and has previously published articles on DB2 performance for the UK trade
journal Info Plus. He has worked for IBM GSA for seven years.



Marcelo Gavazzi is an Advisory Software Engineer within the IBM Software
Group organization working at the Latin America Call Center in Miami, Florida.
He is originally from Brazil and holds a degree in Computer Sciences from
Universidade Mackenzie. He has more than ten years of experience supporting
RDBMS systems, including Informix database servers and DB2 UDB. Marcelo is
a certified “Informix Dial-up engineer” for interventions on critical systems
throughout the Latin America region, providing diagnostics and fixes in the
emergency support line. He is also involved in training projects disseminating
new technologies to the technical support community. He holds professional
certifications on both Informix and DB2 UDB products.

Jochen Stephani is a DM Consultant/IT Specialist working on the Technical


Sales Team in IBM’s Information Management Division in Munich, Germany. He
joined Informix in April 1996 and has more than seven years of experience
working with Informix databases on high-end OLTP and business intelligence
projects, including benchmarks, prototyping, and performance tuning. He has
contributed to many important projects across the region and is a ’trusted
advisor’ for strategic customers. Based on his
experience and product skills, his major focus as a
DM Consultant is to provide architectural guidance.
He holds a degree in Computer Sciences from the
Fachhochschule of Rosenheim.

Martin Lurie (pictured at left) was the remote team


member. Marty is a Systems Engineer in IBM’s Data
Management Division, and is located in Boston,
Massachusetts. He is an IBM-certified DB2 DBA, IBM
certified Business Intelligence Solutions Professional, and an Informix-certified
Professional.

Other Contributors
Thanks to the following people for their contributions to this project:
Omkar Nimbalkar - Informix Product Management and Marketing
Dwaine Snow - Product Manager, Informix Dynamic Server
Patricia Quinn - Brand Manager, Informix Dynamic Server
Mohan Natraj - WW Brand Manager, XPS and Red Brick Warehouse
IBM Informix Marketing, Support, and Development

Benjamin Wilde - DB2 II Development Manager


Walter Alvey - DB2 DataJoiner® Development
Kenneth Gee - Red Brick Warehouse Support

Yang Jason Sun - UDB Query Compiler
Jacques Labrie - DB2 II Scenario Testing
Yannick Barel - Information Integration Technology Solutions
Josette Huang - Manager, Information Integration
Glenn Smith - Information Integration Install Developer
Joe Baginski - Manager, Information Integration Build and Install
IBM Silicon Valley Development Lab

Kenro Yamagata - Information Management Technology Development


Yamato Software Development Lab - IBM Japan

Christina Hanna - Brio IT Specialist


Cumberland Forest Application Centre, Sydney Australia

Mary Comianos - Operations and Communications


Deanna Polm - Residency Administration
Emma Jacobs - Graphics Designer
International Technical Support Organization, San Jose Center

Julie Czubik - Technical Editor


International Technical Support Organization, Poughkeepsie Center

Become a published author


Join us for a two- to six-week residency program! Help write an IBM Redbook
dealing with specific products or solutions, while getting hands-on experience
with leading-edge technologies. You'll team with IBM technical professionals,
Business Partners and/or customers.

Your efforts will help increase product acceptance and customer satisfaction. As
a bonus, you'll develop a network of contacts in IBM development labs, and
increase your productivity and marketability.

Find out more about the residency program, browse the residency index, and
apply online at:
ibm.com/redbooks/residencies.html

Comments welcome
Your comments are important to us!



We want our Redbooks™ to be as helpful as possible. Send us your comments
about this or other Redbooks in one of the following ways:
• Use the online Contact us review redbook form found at:
ibm.com/redbooks
• Send your comments in an Internet note to:
[email protected]
• Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. QXXE Building 80-E2
650 Harry Road
San Jose, California 95120-6099

Introduction

Information technology is constantly changing, and improvements in hardware


and software are coming at an ever increasing pace. At the same time, the
business environment is also changing rapidly. For example, the results of the
many acquisitions and mergers are having a dramatic impact on those
businesses. This is because each business typically brings with it different
business processes, hardware, software, and applications, and they are all
incompatible with each other. The burden of keeping it all working, so the
business can continue to function, falls primarily on the Information Technology
(IT) organizations.

This fast moving, heterogeneous, and ever growing type of business environment
becomes a stumbling block in making the acquisition, merger, or move to a new
technology successful. Because business cannot afford to stop doing business
until all the processes are integrated and working smoothly, it can be a huge
challenge. It can impact day-to-day business processes, the business transaction
processing systems, and the population of data warehouses that are so critical in
helping management make timely and informed decisions.

Data federation to the rescue


Many businesses are now looking to data federation technology to remove this
stumbling block. Data federation is not really new—the concepts have been
around for some time. However, the technology has not been there to support it.
That technology is here now, and IBM has it. There is a brief overview that



follows, but for more detailed information please read Chapter 1, “Data federation
overview” on page 13.

Defining data federation


Data federation is a process that enables data from multiple heterogeneous data
sources to appear as if it is contained in a single relational database. We use the
terms data federation and information integration in this publication. However, we
do not consider these two terms to mean the same thing. In this publication we
primarily focus on the topic of data federation.

While we are discussing terms, be aware also that the terms information
integration or data integration are typically used rather loosely around the world.
You must always insist on definitions of these terms rather than accepting them
at face value. There can be varying levels of integration, but any form of
integration is typically quite difficult, resource intensive to achieve, and quite
expensive. But, depending on your definition, it could be said that data federation
is actually some level of information integration.

Data federation uses a single language to access all data sources, and that is
SQL. So, all data can be accessed and used from an application program, tool,
or program product using the same relational SQL. This means that companies
can begin moving to a standardized data access language for all their data
sources, which can mean a huge productivity boost for them. It also means huge
savings in finding, training, and maintaining skilled programming resources. This
standardization should result in a reduction in the number of different skilled
resources that a company will require to manage, maintain, and use their legacy
data.
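As a purely illustrative sketch, the following query shows what such access
might look like once a federated environment is in place. The nicknames
ids_customer and ora_orders are hypothetical names, assumed to have been
defined in the federated database and to map to the Stores Demo customer table
in Informix Dynamic Server and an orders table in Oracle9i:

   SELECT c.customer_num, c.lname, SUM(o.ship_charge) AS total_ship_charge
   FROM ids_customer c, ora_orders o
   WHERE o.customer_num = c.customer_num
   GROUP BY c.customer_num, c.lname

To the application this is ordinary relational SQL against two local tables;
the fact that the rows actually come from two different database servers is
hidden by the federated layer.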

How data federation works


Each heterogeneous data source is defined and described to the data federator
as virtual tables. The data sources can reside anywhere, so that definition
contains the access path to the real location of the data. There is also a
mechanism that transforms the federator SQL into the language used by the real
data source. In the case of the DB2 Information Integrator product, that
mechanism is called a wrapper.
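As a minimal sketch of how these definitions might look in DB2 Information
Integrator (the actual steps are described in Chapter 4, “DB2 Information
Integrator” on page 73), the statements below register a wrapper, a server, a
user mapping, and a nickname for an Informix data source. The server, node,
user ID, password, and schema names shown are placeholders, and the wrapper
library name varies by platform (for example, db2informix.dll on Windows):

   CREATE WRAPPER informix LIBRARY 'db2informix.dll';
   CREATE SERVER ids_server TYPE informix VERSION '9.4' WRAPPER informix
          OPTIONS (NODE 'ids_node', DBNAME 'stores_demo');
   CREATE USER MAPPING FOR USER SERVER ids_server
          OPTIONS (REMOTE_AUTHID 'informix', REMOTE_PASSWORD 'mypasswd');
   CREATE NICKNAME fedtest.customer FOR ids_server."informix"."customer";

Once the nickname exists, fedtest.customer can be referenced in queries exactly
as if it were a local DB2 table.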

However, there is more to data federation than simple data access. Data from
these multiple heterogeneous data sources will typically need, for example, to
be joined together, aggregated, and summarized. This is not a simple task,
especially when the data sources reside in distributed systems in different
operating environments. Also, in addition to accessing the data, performance and
resource requirements must be considered. For example, where will these
operations take place? Will all the data have to be transferred back to the
federator? What will be the impact on the network?



All of these concerns must be addressed by the federator if it is to provide
adequate performance and result in the savings promised from reduced resource
requirements. These types of concerns and considerations are discussed in
various chapters of this publication. In general, they all fall into a category we
refer to as optimization. That is, we need to optimize the access and use of the
heterogeneous data sources so that they satisfy the performance and response
time criteria of our users.
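One example of such an optimization control in DB2 Information Integrator is
the pair of server options PUSHDOWN and DB2_MAXIMAL_PUSHDOWN, which influence
how much of the query processing is pushed down to the remote data source.
These options are discussed in Chapter 7, “Optimization in a federated
environment” on page 193. The server name below is a placeholder, and whether
ADD or SET is required depends on whether the option is already defined for
that server:

   ALTER SERVER ids_server OPTIONS (SET PUSHDOWN 'Y');
   ALTER SERVER ids_server OPTIONS (ADD DB2_MAXIMAL_PUSHDOWN 'Y');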

What is included in this redbook


This publication contains information on the topic of data federation and
information integration. It is primarily oriented to users of Informix database
management systems, which includes Informix Dynamic Server, Extended
Parallel Server, and Red Brick Warehouse. We wanted to provide information
that would describe how they can implement data federation and gain the
benefits from it. So, we define and position data federation and information
integration with respect to this particular set of users.

To do so, we have included other data sources that are also typically
implemented by this set of users. For example, we also include Oracle9i, DB2
UDB, and Microsoft Excel. We discuss the concepts of, and considerations for
using, data federation. The benefits are many and are discussed in numerous
books and articles on the subject. We present some examples of those benefits
in this publication just to give you a feel for why you should consider
implementing a federated data environment.

We implemented a federated data environment during the development of this


publication and ran numerous tests on it to understand its behavior. We discuss
the results of that testing in the chapters that follow. We also provide
considerations for you as you prepare to implement your federated environment.
In addition to the obvious benefits of being able to work with data from
heterogeneous sources, there are many other tangible benefits to data
federation. For example, it can enable you to begin standardizing your application
development efforts. This can be a huge benefit. It impacts development times,
the type of development resources needed, and the availability of skilled
resources.

Data federation is described and discussed in later chapters of this publication,


so we will keep the discussion here very short. Data federation is simply a means
of enabling you to access and use data from heterogeneous sources as if they
were all the same source. Think about the savings that means to you and your
application development organization. Think about what it means for easing the
burdens during a merger or acquisition. Think about the effort it can save as you
move to new hardware and software technologies.

All data federation implementations are not alike. You must understand the
concepts, but you also must understand the criteria that can enable you to
evaluate the candidates. You must come to know the functions, features,
performance, ease of use, resource requirements, and future product direction,
to select wisely. Once you know all this, we think you will select DB2 Information
Integrator.

The project systems environment


In this publication we give you information to help in the installation,
configuration, use, and management of a federated data environment. With a
product such as DB2 Information Integrator (II), for example, you can access
data from many data sources, such as DB2, Informix, Microsoft, ODBC, Oracle,
Sybase, Teradata, OLE DB, Excel, XML, message queues, Web services, flat
files, life sciences data sources, and IBM Lotus Extended Search sources such
as content repositories, LDAP directories, WWW, e-mail databases, and
syndicated content. The good news is that you can access all these sources as if
they were the same source. That can represent a huge savings in development
time, requirement for skilled resources, and conversion or migration costs.

There are two products available from IBM that can participate in a federated
data environment. We chose to test both of these products and document the
results in this publication. They are:
• DB2 Information Integrator (II)
• Informix Enterprise Gateway Manager (EGM)

However, in addition to the above products there is an additional capability for


federating data in an Informix database environment, called I-STAR. This
capability is built into the Informix Dynamic Server and Extended Parallel Server.
Therefore, when these two data sources are used together, I-STAR will typically
be used to perform the data joins. Also, we have demonstrated that even when
these two data sources are used in conjunction with other heterogeneous
sources, I-STAR may be invoked during the federation of the data. This verifies
yet another level of federation that is possible in an Informix database
environment. For a definition of I-STAR refer to “Informix I-STAR Technology” on
page 27.

We configured a test environment that would be appropriate for most IBM Informix
customers. That environment consisted of the following five database
management systems:
• Informix Dynamic Server V9.4
• Informix Extended Parallel Server V8.4
• IBM Red Brick Warehouse V6.2
• DB2 UDB V8.1
• Oracle9i



The tests were to demonstrate that the heterogeneous data sources could be
accessed and joined together with the federation products. We also explored the
architectures of those products and their capabilities. There is a function/feature
summary table in Appendix A, “DB2 II and Informix EGM function summary” on
page 225. This information will help you evaluate their capabilities and determine
which you can best use to work with the many other sources of data that are out
there—especially those already in your own organization.

To help you get started towards implementing a federated environment, we


implemented the products ourselves so we could provide you with some
guidelines and instructions to make your implementation easier and faster. For
our implementation, we used three Intel-based IBM servers, each with 760 MB of
memory. They had the following software configuration:
• Server 1 was running SuSE Linux V8.1 with the following software:
– Informix IDS V9.4
– Informix Extended Parallel Server V8.4
– IBM Red Brick Warehouse V6.2
– DB2 UDB V8.1
• Server 2 was running Windows/2000 with the following software:
– Informix IDS V9.4
– DB2 UDB V8.1
– Oracle9i
– DB2 Information Integrator V8.1
• Server 3 was running Windows/2000 with the following software:
– Informix Enterprise Gateway Manager V7.31

Servers 1 and 2 were primarily database servers. We used Server 3, a


Windows/2000 server, to run Informix Enterprise Gateway Manager. Although it
could have run on Server 1 or Server 2, we configured a third server as a test
variable. We wanted to demonstrate a feature of this product that enables it to
run on a separate processor that has no local RDBMS. This feature worked
properly and there were no problems encountered that were a function of running
on this separate server.

Rather than writing sample applications to test the federated environment we


decided to use existing query products. The schema for the Informix Stores
Demo database that we used is configured to support a Stores Application, so
that was all that was necessary for us to meet our objectives. The primary
objective of the tests was to demonstrate the capabilities of DB2 Information
Integrator and Informix Enterprise Gateway Manager for accessing and joining
heterogeneous data sources. To do this we simply needed to issue appropriate
queries against the federated environment and view the results. In addition we
needed to explore the query optimization capabilities of DB2 Information

Integrator and Informix Enterprise Gateway Manager, and their ability to rewrite
the query plans. These products also enabled us to view those new query plans
to validate any optimization that took place.

To query the federated environment and validate optimization, we used the


following three products:
• Server Studio Java Edition for Informix
• DB2 Command Center
• Brio Explorer V6.5.2

For all the specific details on the project environment, refer to Chapter 2, “The
data federation project” on page 31. Here you will also discover the reasons
behind our selections for the environment.

Using these products simplified the data access and enabled fast development
and execution of the test queries. A sample database, the Stores Demo
database, comes with IDS and was used as the test vehicle for the
implementation. It is a database, and application scenario, familiar to most
Informix customers. Using it should promote easier understanding of the results
of our tests. Using this environment provided us a means of demonstrating how
quickly solutions can be developed and implemented, even when using multiple
heterogeneous data sources. It truly demonstrates the value of a federated data
environment.

We believe this publication will provide, in one document, the information you
need to begin implementing and testing a powerful federated data environment. It
will support the movement to a more standard development environment, saving
you time and money, and minimizing the need for skilled development resources
on all of those data sources.

The DB2 Information Integrator has demonstrated the capability to support a


heterogeneous data environment and provide improved performance with the
use of its powerful optimization capabilities. Once you have tried it, we believe
you will realize its power, ease of implementation and use, and significant
savings. You will then be on a path towards a more standardized and highly
productive development environment.



Executive summary
“Introduction” on page 1, discusses the need for information integration.
Companies, perhaps yours included, have grown through differing data
management technologies over the years, which has left them with a legacy. That
legacy includes:
• Lots of data of many different types and in many different formats
• Data created by and residing in many different applications
• Data developed in several data management technologies
• Data stored in a number of different data management products
• Data management products purchased from a number of different vendors
• Data management products installed that are no longer supported or
available
• Numerous different data schemas
• Data in many locations around the world
• Data from mergers or acquisitions
• Data with the same, or similar, names but having different meanings
• Data entities, such as customer, vendor, or company, that are the same but
spelled or named differently

All this adds significant costs to your organization to maintain, manage, and use.
It is because of all this that information integration is needed. Also, there
are significant savings associated with it.

In this publication we explore and discuss information integration. It can come in


many flavors with differing capabilities. Enterprise information integration is not
an easy or a short-term proposition. However, it is a direction in which you should
be moving. The volume of data an organization collects is growing at a significant
rate, particularly with the advent of the Internet. Accurately analyzing it, and
extracting the value of informed decision making from it, requires you to manage it.

Implementing information integration, of course, requires supporting technology


and products. One approach that is demonstrating significant capabilities and
benefits is data federation, and that is the main focus of this publication. We
discuss and demonstrate data federation as implemented by two IBM data
federation products:
• DB2 Information Integrator (DB2 II)
• Informix Enterprise Gateway Manager (EGM), in conjunction with the Informix
Database Servers

We have purposely restricted our scope of the project to federation of relational
data sources. Our project environment and architecture is defined in detail in
Chapter 2, “The data federation project” on page 31.

To make this exercise more realistic, we have constructed a case study centered
around a fictitious company based in the United States. This fictitious company
has the states organized into seven regions. Each region uses the same
database schema, but has implemented it using a different database server or
operating system. Most businesses hopefully would be less complex than this.
However, we wanted to demonstrate the robust data federation capabilities that
are available to you. We have used the Informix Stores Demo database schema
for our case study. That is a sample database supplied with Informix Dynamic
Server (IDS) and Informix Extended Parallel Server (XPS). Most Informix users
are already familiar with this schema, which should aid the understanding of our
case study.

We use this case study to explore the data federation technology available today
from IBM. We discuss the strengths and weaknesses, and provide insight into
how it works and how to leverage the technology to your benefit.

Data federation technology is certainly impressive, giving you a great start down
the path to fully realize the vision of information integration, as expressed in 1.1,
“The challenge of information integration” on page 14.

The technology today requires some further development in the area of


distributed update capability across heterogeneous platforms. However, in both
the DB2 and Informix family of products, data integrity is assured for updates
across distributed databases using a two-phase commit strategy. In the current
releases of both DB2 II and EGM, there are limitations in the way updates can be
performed across heterogeneous distributed databases. These limitations are
discussed in 4.4, “Considerations for use” on page 108, and 5.3, “Considerations
for use” on page 149.

The purpose of this publication is to explore the topic of data federation and
demonstrate how it can be used by companies with Informix, and who also have
other heterogeneous sources of data.

We created a federated data environment in a systems configuration built on


Linux and Microsoft Windows/2000 operating systems. To enable the federated
environment, we used the DB2 Information Integrator and Informix Enterprise
Gateway Manager. These two products provide a number of similar capabilities,
but have dissimilar architectures. We have documented the results from our
testing and have used them to see how well they provide the functions and
features typically desired in data federation. The overall results demonstrated
that the DB2 Information Integrator provided more functionality, particularly in the
area of higher degrees of optimization. The end result of optimization is that less



data may be transferred over the network and you will experience better
performance and improved resource utilization. And, as mentioned in “The
project systems environment” on page 4, we also demonstrated the data
federation capabilities of an Informix database feature called I-STAR. I-STAR is a
data federation capability that was developed to federate data from the Informix
Dynamic Server and Extended Parallel Server data sources. It complements and
extends the federation capabilities of DB2 Information Integrator and the Informix
Enterprise Gateway Manager.

To enable client access to the test environment, we used the following query
products:
• Brio Explorer
• DB2 Command Center
• Server Studio JE (Java Edition)

These products enabled easy development of queries required to test the data
federation and to display the results. The graphical representation of the data
schema made it very simple to visualize the testing being done. In addition, it
enabled us to view any modification to the SQL as a result of query optimization
by DB2 Information Integrator or Informix Enterprise Gateway Manager.

Working with a federated information environment can significantly reduce your
application development efforts and protect your development investment. This is
because the applications can be written as if they are accessing one common
data source rather than many. This saves the time typically required to
understand the various data structures and their related programming
requirements. This also has the impact of minimizing the on-going maintenance
effort for those applications. There can also be significant savings in avoiding the
acquisition and training of the skilled resources that are typically required when
there are many different data sources that a company, as well as its application
developers and data administrators, must use and manage.

If you are in the position of having many data sources that are used to support
your organization, this publication is required reading. It will enable you to start
down the path to a more standardized approach for application development, one
that can provide reduced development costs, a lower requirement for skilled
resources, and less maintenance burden.

After you have read the highlights and benefits you will have a better
understanding of the value of data federation. We then encourage you to read
the remainder of this publication to get more in-depth knowledge that will solidify
your understanding of, and appreciation for, data federation.

Highlights and benefits
This section contains some of the highlights and benefits of data federation that
we have demonstrated during our testing for this project. They are discussed
throughout the publication, but we have summarized some of them here for easy
and fast reading. The benefits of data federation are many and can be realized
from more than one of the highlight areas.
• Reduces your need for skilled resources
Using a federated data approach enables you to access data from multiple
disparate data sources as if they all were the same. Therefore, you should not
need as many skilled resources to work with the different data sources you
might have in your organization. This is particularly true when it comes to your
application development staff. Your developers can now focus on developing
applications rather than trying to understand all the nuances of the different
data sources. It will also make it easier for you to find and employ skilled
resources.
• Use standard SQL
You can use standard SQL, and SQL Expressions, in your applications. And,
since it is DB2 SQL, you will always have a ready pool of available resources
from which to fill your development resource requirements.
• Query optimization
You will get better performance and resource utilization because your queries
will be optimized. This is unique to DB2 II. It means that more data operations
will be pushed down to their lowest point of execution, which means less data
will be sent across your network. The result is better performance and less
resource utilization.
• Use of summary tables
Summary tables are often used to give faster query performance. Now there
is also a type of summary table whose contents are defined by a query. It is
called a Materialized Query Table (MQT). But, if programmers are not aware
of them they will use the original detailed tables. With DB2 II, the DB2
optimizer will be aware of these MQTs and automatically rewrite the query to
use them—for a significant performance boost.
• Access nonrelational data
You can also access nonrelational data sources, such as Excel files, with the
same relational SQL used to access relational data. You can even join the
nonrelational data with relational data, quickly and easily. This means no more
writing extract programs to take data from nonrelational sources and create
new, or temporary, relational sources so you can join the data together. This
represents a tremendous savings in application development and data
maintenance.

• Enables integration and consolidation
There are several ways to enable integration of your data resources. You can
physically consolidate or logically integrate. Of course you can do some of
both and ease the consolidation effort. Data federation gives you the option to
integrate or consolidate on a planned and flexible schedule. This would apply
to integration required as the result of a merger or acquisition as well.

As you read the remainder of this publication you will discover many more
highlights and benefits that will help as you make those critical decisions on data
integration and consolidation.


Chapter 1. Data federation overview


In today’s e-business on-demand environment, integrating information across
and beyond the enterprise is a competitive mandate. Initiatives such as customer
relationship management, supply chain management, and business intelligence
are based on successfully integrating information from multiple data sources.

In this chapter we discuss the challenge of information integration. As with most
technology directions, there are multiple levels of information integration and
multiple approaches to consider when trying to accomplish it. One such
approach is data federation. It is this approach on which we will primarily focus
during the discussions.



1.1 The challenge of information integration
In most corporations, it is a given that different functional areas and departments
will use different database systems, operating environments, techniques,
applications, and software products to produce, store, retrieve, and analyze
critical data. It happens for many reasons, such as:
• Mergers and acquisitions
• Personal preferences
• Changing technology
• New product availability

As a result, corporations have their data stored in many different formats, on
many different database systems, in many different applications, and in many
different operating environments. This poses a huge challenge for the information
technology (IT) department when data from these different areas must be
consolidated, either temporarily or permanently. Also, this must be done if
management is to get an accurate picture of what is going on in their businesses.
It is a difficult position in which most corporations find themselves.

There are disparate sources of data everywhere, little of which is, or can easily
be, integrated. For examples refer to Figure 1-1 on page 15. Most of the
decisions made that resulted in this situation were well justified and probably
good business decisions—at the time. But, now they are finding it extremely
costly to consolidate, much less integrate, these sources of data to give
management the information they need for decision making. So why do they not
integrate all their sources of data?

Figure 1-1 Data environment today

Information integration is a much-discussed goal, but it can be a very difficult one
to accomplish. When integration is attempted, it is most typically on a small scale
and on an application-at-a-time basis. Therefore, it is not an architected
enterprise data integration effort but rather integration to meet the requirements
of a specific application or systems implementation. But, it is a small step
forward.

Due to organizational, operational, or legacy data source constraints, any
integration is typically a very expensive proposition—let alone enterprise
information integration. In most organizations data is dispersed over a number of
operating environments and data sources, both inside and outside the
organization. Therefore it takes many different skilled resources to understand
how to access, combine, and use the data from the many different types of data
sources. Businesses continue to pursue that goal, but also look to other
approaches to meet their requirements.

The vision of information integration is that users should be able to read and
update all of the data they use as if it resided in a single source.

Information integration technology should shield the requester from all the
complexities associated with accessing the data in its diverse locations and
formats and without regard to their accompanying semantics and access
methods.

The challenge of information integration can be met using either of the two
fundamental approaches below, or a combination of both:
• Data federation
• Data consolidation

We discuss these two approaches in more detail in the following sections.

Data federation
Data federation is not a new concept and has been around for some time. It is the
ability to combine data from multiple heterogeneous data sources. It is a logical
integration that typically takes place in real time. Although it is not full integration,
many are finding that they can satisfy much of their data integration needs with
data federation.

Data federation refers to an architecture where a relational database
management system enables access to heterogeneous data sources. This is
depicted in Figure 1-2 on page 17. It enables access using the common
language of the RDBMS, typically Structured Query Language (SQL). To do this
some technology and mechanism must be used to translate the SQL into the
language of the heterogeneous data source. Also, it must be able to perform
such non-trivial tasks as handling the differences in data types that exist across
these heterogeneous data sources.

Although data federation sounds relatively simple, there are many considerations,
and much care must be taken. To date, most data federation has been
accomplished by custom programming. Now, however, there are products to
help, such as the DB2 Information Integrator (DB2 II).
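
To give a feel for how a federated server presents a remote source through SQL, the
following is a minimal sketch of registering an Informix data source in DB2 II and
defining a nickname for one of its tables. The wrapper, server, schema, and table
names are illustrative assumptions, not our actual project configuration:

   CREATE WRAPPER informix;
   -- DB2 II supplies a default Informix wrapper library for the platform;
   -- the Informix Client SDK must be installed on the federated server.
   CREATE SERVER ifmx_west TYPE informix VERSION '9.4' WRAPPER informix
      OPTIONS (NODE 'ifmx_node', DBNAME 'stores_demo');
   CREATE USER MAPPING FOR USER SERVER ifmx_west
      OPTIONS (REMOTE_AUTHID 'informix', REMOTE_PASSWORD '********');
   CREATE NICKNAME west.customer FOR ifmx_west."informix"."customer";

   -- Applications now address the nickname as if it were a local DB2 table:
   SELECT customer_num, fname, lname FROM west.customer;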

Figure 1-2 Data federation architecture (a variety of clients issue SQL and SQL/XML to a
data federation engine, which performs transformation and mapping to reach the DB2
family, Informix, Oracle, Sybase, SQL Server, Teradata, text, XML, Excel, WebSphere MQ,
biological data and algorithms, the Web, email, and other sources through ODBC)

Some of the earlier federation attempts involved gateways, which allow a DBMS
to route a query to another database system. Most accomplished this by using a
standard access mechanism called Open Database Connectivity (ODBC). We
will also demonstrate the use of a gateway in this publication. Informix has such a
gateway, called the Informix Enterprise Gateway Manager (EGM), which we will
use along with DB2 Information Integrator. Although EGM is a very robust
gateway, you will find that DB2 II has a number of important capabilities that EGM
does not—for satisfying the requirements of data federation.

The following are a few of the capabilities that should be inherent in a product
supporting data federation:
• Conceal differences in the implementation and use of a heterogeneous data
source. It must behave as if it were part of the single database environment.
• Support for many heterogeneous data sources.
• Ability to exploit unique capabilities of the heterogeneous data source.
• Ability to easily add support for new data sources. In addition, the federating
DBMS should be able to extend its capabilities to operate on the
heterogeneous data source. This could include using the DBMS optimizer, for
example.

For more detailed information on the topic of data federation, there is an excellent
article in the IBM Systems Journal, Volume 41, Number 4, 2002. It is called
“Data Integration Through Database Federation,” by L. M. Haas, E. T. Lin, and
M. A. Roth. Some of the above information was distilled from that article.

Gartner Group estimates that 70 percent of corporate application development
budgets are allocated to accessing and federating disparate data.

Data federation (or distributed access) achieves the vision of information
integration by giving the appearance that the various federated data sources exist
in the same place. The creation of this illusion is the role of the federated
database server. This is the only server that the end user or application will
access directly, and it is in this way that the benefits of the single data source
illusion are experienced.

Data consolidation
Data consolidation (or data placement) physically brings the source data
together from a variety of locations into one place in advance, so that a user
query does not need to be distributed. This approach typically uses either
extract, transformation and load (ETL), or replication functionality. As with the
federated approach above, the end user or application interacts only with this one
physical consolidated database server to enjoy the single data source
experience.

With data consolidation, both the remote data request and transfer must occur
before the end user or application request is issued. It is logical therefore that the
request to the remote data source is basically formulated one time only during
data requirements definition, while the transfer of data typically occurs many
times according to some defined cycle or trigger. Neither the data request nor the
data transfer to and from the remote data source is directly related to the end
user’s request, although, hopefully, there is some relationship between the end
user request and the data architect’s prediction of the types of queries to be serviced.

Data consolidation or placement is the traditional approach to integrating
information and, in contrast to data federation, moves the data to the query. It has
always been considered less complex than data federation, as data consolidation
creates a second, local copy of the data, pre-processed as required, thus
reducing the need for extensive data manipulation and remote access within the
user query. Data consolidation, because it operates off the critical time path of
the user’s query, also allows for substantial and complex transformation of the
data to address issues of cleanliness, semantic and temporal consistency, and
so on. It therefore exhibits varying levels of complexity. At its simplest, it is a

18 IBM Informix: Integration Through Data Federation


manually initiated database unload followed by a load of the target system. At its
most complex, it may involve the automated, real-time, multi-way synchronization
of databases on a number of remote systems. In most cases today, it is
somewhere in between.

A key consideration for data consolidation is the maximum latency that can be
tolerated when transferring the data from source to target. Typically, business
needs specify how up-to-date a copy of the data must be. In data warehouses,
for example, the frequency required might be daily or weekly and the latency of
data consolidation can easily extend to many hours. At the other extreme, the
need for almost real-time data, such as in stock market systems, requires
minimum latency in data consolidation.

Two of the most important factors determining the minimum latency possible in
data consolidation are the complexity of the transformations required and the
volumes of data to be transferred. These factors lead to two complementary
approaches to consolidating data. ETL is optimized for larger data volumes and
is often associated with more complex transformations, while data replication
emphasizes the transfer of individual data records and is often restricted to
simpler transformations.

1.2 To federate or not to federate


Data federation and data consolidation are actually similar concepts. Both involve
requesting and receiving data that originally resides outside the physical confines
of the database server with which the end users or applications interact. The key
difference is in the timing of the data requests to, and transfers from, the remote
data source and the central server. With data federation, both the remote data
request and transfer occur after the end user or application has issued the
request to the federated database server.

But from the end users’ points-of-view, or that of the applications acting on their
behalf, data federation and data consolidation act in opposite ways. Data
federation integrates the required information synchronously, directly from its
original sources, acting only after the end user decides what information is
required. It must therefore return the result in a time frame that is acceptable to
the user or requesting application. Data consolidation operates in advance of the
user query, allowing itself more time to perform the required processing.
However, the data architect needs to make decisions in advance regarding what
data will be required. Secondly, because data consolidation is creating a second
copy of the data, it requires a larger quantity of permanent data storage than the
data federation approach.

It is sometimes suggested that data federation is a technique that can be used to
provide users direct, unplanned access to any data, anywhere, because a
federated query allows data to be combined in real time. Although this could be a
theoretically true statement, it is a potentially dangerous suggestion. Such
unrestrained access could lead to significant performance problems that could
impact the end user, the federated database server, and, perhaps more
importantly, the source systems. Furthermore, data federation can actually
demand even more rigorous analysis, modelling, control, and planning than data
consolidation. This would happen, for example, to avoid significant performance
and semantic problems for users if they try to combine data that is inconsistent in
meaning and structure.

Consolidated data stores, such as data warehouses and operational data stores
constructed through data placement, have the capability to avoid such issues
using a range of techniques. For example, they could do so by using a
consistent, point-in-time, snapshot approach. The data consolidation approach
can still have advantages in some areas when compared to a federated
database. However, data federation may eliminate the need for building a data
mart. If the volume of queries is not large, and often can be satisfied with
summary tables, a huge productivity boost can be realized by eliminating such
things as the need for a detainment and the corresponding creation of a new
server, and movement of significant quantities of data to populate it. Of course,
for frequent complex queries that need access to the lowest level of detailed
data, a detainment or data warehouse is the preferred solution.

Most queries require tables to be joined together, which can be resource
intensive and time consuming. So, how can you speed up queries that require a
join operation? Well, the fastest way to join two tables is not to join them at all.
The Transaction Processing Performance Council (TPC) TPC-D benchmark, now withdrawn,
proved conclusively that if the answer to a query is pre-computed in a summary
table, the query runs faster.

How can this summary table speed-up technique be applied to a federated
environment? To really take advantage of summary tables, they have to be
used whenever possible. How can we make that happen? Let us give an
example. A summary of data at a monthly interval can satisfy a query at a
monthly, quarterly, or annual level of aggregation. If the query optimizer is smart
enough to rewrite the query to operate on the summary table instead of seeking
the detail, then we will get an advantage.

We tested this capability by adding a summary table, and it works. In DB2, the
summary table is called a Materialized Query Table (MQT). A powerful feature of
DB2 II is that it can recognize and understand the MQT. In our test, it recognized
that the MQT could satisfy the query we submitted. So DB2 II intercepted the
SQL for the query and rewrote it to access the MQT rather than the base tables.

This meant that there were no joins to perform, just a simple access to the MQT.
That resulted in a significant reduction in processing resources and response
time.
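
The following sketch shows the general shape of such a summary table defined over
nicknames. The table and column names are loose borrowings from the Stores Demo
schema and are illustrative assumptions only; MQTs in a federated environment are
explored in detail in Chapter 7:

   -- Illustrative MQT; REFRESH DEFERRED, maintained by the system
   CREATE TABLE sales_by_month AS
      (SELECT o.customer_num,
              YEAR(o.order_date) AS order_year,
              MONTH(o.order_date) AS order_month,
              SUM(i.total_price) AS month_total
       FROM west.orders o, west.items i
       WHERE o.order_num = i.order_num
       GROUP BY o.customer_num, YEAR(o.order_date), MONTH(o.order_date))
      DATA INITIALLY DEFERRED REFRESH DEFERRED ENABLE QUERY OPTIMIZATION;

   REFRESH TABLE sales_by_month;

   -- For REFRESH DEFERRED tables, allow the optimizer to consider the MQT:
   SET CURRENT REFRESH AGE ANY;
   -- A monthly, quarterly, or annual aggregation query written against the
   -- detail tables can then be rewritten by the optimizer to read sales_by_month.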

Arguments in favor of data federation


Data federation is likely to be the appropriate method of data integration when:
• Real-time or near real-time access to rapidly changing data is required.
Making copies of rapidly changing data can be costly, and there will always be
some latency in the process. Through data federation, the original data is
accessed directly and joined in the query. However, the performance, security,
availability, and privacy aspects of accessing the original data must be
considered.
• Direct immediate write access to the original data is required.
Working on a data copy is generally not advisable when there is a need to
insert or update data, because data integrity issues between the original data
and the copy may occur. Even if a two-way data consolidation tool is available,
complex two-phase commit locking schemes are required. However, writing
directly to the database of another application is generally prohibited.
Therefore, it is generally recommended to call the owning application, via an
application program interface (API) invoked by the federated database server,
to request any updates.
• It is technically difficult to use copies of the source data.
When users require access to widely heterogeneous data and content, it may
be difficult to bring all the structured and unstructured data together in a
single local copy. Or, when the source data has a very specialized structure,
or has dependencies on other data sources, it may not be possible to sensibly
query a local copy of the data. In such cases, accessing the original source is
recommended.
• The cost of copying the data exceeds that of accessing it remotely.
The performance impacts and network costs associated with querying remote
data sets must be compared with the network, storage, and maintenance
costs of storing multiple copies of data. In some cases, there will be a clear
preference for a federation-based approach, such as when:
– Data volumes of the original sources are too large to justify copying it.
– Data is too seldom used to justify copying it.
– A very small or unpredictable percentage of the data is ever used.
– Data is accessed from many remote and distributed locations, which
would imply multiple copies.

• It is illegal or forbidden to make copies of the source data.
Creating a local copy of source data that is controlled by another organization
or that resides on the Internet may be impractical, due to security, privacy, or
licensing restrictions. However, federated access with discrete queries may
be permitted. If access to the sensitive data is only of an on-demand nature,
this access can be audited (if applicable) at the remote source using the
normal auditing techniques applied at that data source. This ability of the
remote data source to know who has accessed what would be lost under the
data consolidation approach.
• The users’ needs are not known in advance.
Allowing users immediate, ad hoc access to any enterprise data is an obvious
argument in favor of data federation. However, care is needed with this approach.
The potential exists for users to create queries, accidentally or otherwise, that
negatively impact both source systems and network performance, and that
disappoint the user by yielding poor response times. In addition, because of
the typical level of semantic inconsistencies across data stores within
organizations (as a result of differing data latencies, formats, and data
structures), there is a significant risk that such queries could return answers
that are of little or no value.

Arguments against data federation


Not surprisingly, the arguments against the data federation approach are the very
same as those in favor of data consolidation:
• Read-only access to reasonably stable data is required.
The data federation approach will present the remote data in real time. This
may not be advantageous to the end user or application, which would prefer
to suffer some latency in the data in order to be insulated from the continuous
flux of information in the remote operational data sources.
• Users need historical or trending data.
Historical and trending data is seldom available in operational data sources,
but requires a data consolidation approach to build up historical data over
time. This is a very common data warehousing requirement.
• Data access performance or availability is an overriding consideration.
Users routinely want quick data access. Despite the best efforts of a
well-designed federated server working in unison with the remote data
sources, the volumes of data required may necessitate that a local,
pre-processed copy of the data be made available. As seen in data
warehousing environments, the queries to be serviced can be very complex
or may require a multi-dimensional view of historical or trending data. As a
result, data consolidation is a fundamental technique in data warehousing.
However, data federation can still be used in conjunction with data
consolidation to provide support for the lowest drill-down level. In this case the
data warehouse acts as both the consolidated database server and the
federated database server.
• User needs are repeatable and can be predicted in advance.
When user queries are well-defined, repeated, and require access to only a
known subset of the source data, it may be cheaper to create a copy of the
data for local access and use. This approach also insulates the remote
operational data sources from large, complex, or poorly structured queries.
• Data transformations or joins needed are complex or long-running.
In cases where significant data transformations are required or where joins
are complex or long-running, it is inadvisable to have them run synchronously
as part of a user query due to potentially poor performance and high costs. In
such cases, creating a copy of the data through data consolidation would
seem to be more advantageous.

Using both data federation and data consolidation


It is likely that there will be cases where a combination of data federation and
data consolidation techniques is the best option.

One case is where a federated query can leverage data consolidation
functionality transparently. This is because sometimes a federated query will not
work, because of network outages, for example. In that situation, data federation
can use data consolidation to create or manage cached data. On the other hand,
data consolidation tools may be optimized for only a subset of available data
sources. Using data federation along with them can expand that number, and
allow pre-joining of data for a performance gain.

Later on in Chapter 7, “Optimization in a federated environment” on page 193,
we consider exactly such a combination of techniques through the use of
Materialized Query Tables (MQTs). MQTs are data stores of preprocessed data
from the remote data sources stored at the local federated server. We consider a
user query against the federated database where the underlying tables reside in
one or more remote databases. We demonstrate how we can exploit the
federated server to transform the query to instead reference the MQT residing
locally at the federated server.

1.3 Data federation examples


The previous sections have discussed the pros and cons of data federation, and
described it in general terms. In this section we present examples of data
federation in environments where DB2 is the base, and where Informix is the
base. These are simply to provide you with an idea of the capabilities for data
federation. In fact, either DB2 or Informix can be the base, or you can use a
combination of both. Specific details on how to implement data federation,
capabilities, and considerations are discussed in later chapters.

In the following sections, there are figures that depict the general environments
of DB2 and Informix, when used as the base for a data federation. There is a
wide range of support with both products that can provide significant benefits as
you integrate the information in your enterprise. And, as we have discussed,
there are differing ways to implement an integrated environment. The primary
message here is that whether you have DB2, or Informix, or both, you have
powerful tools and capabilities at your disposal that can save you time, effort, and
resources as you integrate the information in your enterprise.

Data federation has become a necessity with the evolution and availability of
numerous data management products. Between new technology advances and
the normal evolution of business requirements, companies find themselves with
quite a number of different, and incompatible, data sources. Most companies are
using some form of data federation, even if they call it by another name. Also, it is
typically a customized approach that is very expensive and resource intensive to
support and maintain.

Even with the advent of relational technology and databases, companies find
themselves with more than one relational DBMS. This could be for many
reasons, but it became a reality and the norm rather than the exception. To
minimize the skilled resources and extra work required in application
development, as well as to satisfy the continued push for data integration,
companies demanded additional capabilities from their vendors. In the
beginning, that support was primarily for relational technology.

Vendors responded with architectures and products to give companies some
relief. As examples, IBM introduced data replication technology for data
movement, and Distributed Relational Database Architecture™ (DRDA®) to
support standard interfaces for data exchange. A first step was to provide easy
data access and interchange among IBM relational databases.

Data replication became a fast and reliable way to have multiple images of data
in multiple locations, all kept in synchronization. The technology quickly moved
beyond relational, for example, with support for the IBM Information Management
System (IMS™) database technology.

Third-party products and tools became available to support inter-vendor data
access and exchange, and the move to integration and data federation was on.
The technology is still growing and becoming more sophisticated. The goal is
simple: Access to any data source, anywhere, with the same programming
language or tool.

1.3.1 Data federation with DB2
We show in Figure 1-3 an example of the data federation capabilities when using
DB2 as the base. You can see that there are a number of tools to enable you to
implement a robust federated data environment. There is access to relational and
nonrelational databases, file systems, tools, and Web data sources. In most
cases, the DB2 approach uses the native API for each data source for
compatibility, breadth of support, and performance reasons.

From a programmer perspective, all data sources will appear to be the same.
That is, they will all appear to be DB2 sources. Also, there is access to both
structured and unstructured data (universal data) through the use of DB2
Extenders. Extenders are extensions to DB2 that enable additional capabilities.
The primary product to be used for data federation is the DB2 Information
Integrator, and it is discussed in great detail in subsequent chapters.

Figure 1-3 DB2 data federation (the DB2 family across many platforms, DB2 clients and
DB2 Connect, extenders for universal data, native interfaces to Informix, Oracle, Sybase,
Microsoft, and Teradata, and access to nonrelational and external sources such as IMS,
VSAM, Lotus Notes, MS Exchange, MS Excel, MS Access, Web data, and files through
gateways, OLE/OLE DB, EDA/SQL, table UDFs, and Data Links)

One of the primary advantages of DB2 Information Integrator is that it takes
advantage of the DB2 optimizer. The optimizer examines the environment to
determine the best way to access a data source. For example, it looks at the
processors involved, network speeds, and statistics regarding the type, size, and
characteristics of the data source. It then develops the best access plan to be
used for the data sources involved. This may include performing some of the
work on the local processor and some on remote processors, whereas many
older technologies simply bring all the data back to the local processor and work
on it there. This is rarely an optimal approach. DB2 Information Integrator can
then rewrite the SQL access commands to retrieve the data with the approach
that results in the lowest resource cost. That is why the DB2 optimizer is
categorized as a cost-based optimizer.
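
If you want to see the access plan the optimizer chose for a particular federated
query, DB2 provides explain facilities. The following is a small sketch of one way to
capture and format a plan, assuming the explain tables have already been created;
the query and object names are illustrative:

   SET CURRENT EXPLAIN MODE EXPLAIN;

   SELECT c.customer_num, SUM(i.total_price) AS total
   FROM west.customer c, west.orders o, west.items i
   WHERE c.customer_num = o.customer_num
     AND o.order_num = i.order_num
   GROUP BY c.customer_num;

   SET CURRENT EXPLAIN MODE NO;

   -- Format the captured plan from the command line, for example:
   --    db2exfmt -d feddb -1 -o plan.txt
   -- The SHIP operators and remote statements in the formatted plan show how
   -- much of the work was pushed down to the remote data sources.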

We present a number of examples, in later chapters, that result in the
development of new query plans and the rewrite of SQL by the DB2 optimizer.
These examples will clearly demonstrate the advantages gained and benefits
received by this capability.

1.3.2 Data federation with Informix


Informix products can also be used as the base for data federation. They can
provide a number of capabilities that are similar to those described in the DB2
data federation example. We depict a data federation environment, using
Informix as the base, in Figure 1-4 on page 27. The primary product used for
data federation in that environment is Informix Enterprise Gateway Manager. We
briefly describe that approach here, but it will be discussed in greater detail in
subsequent chapters.

To address data federation, a number of vendors developed what were referred to
as gateway products. That is, they provided a gateway to other heterogeneous
sources of data. For example, Informix introduced their Enterprise Gateway. The
first version of this gateway used IBM DRDA to enable access to data on the IBM
mainframe processors. At that time, the mainframe was where vast amounts of
data were stored, so to enable mid-range solutions, vendors needed to give their
customers access to that mainframe source of data.

Support was then expanded to enable access to other vendor data sources,
regardless of the platform, but access was primarily through the use of ODBC
rather than the native DBMS API. This approach does enable data access, but
can bring with it performance issues. This is why DB2 has focused on using
native APIs, and it is one of the differentiators of the DB2 approach.

Over time, gateways improved and began to address some of these issues. For
example, as we discuss in later chapters, the Informix Enterprise Gateway
Manager has the capability to change the data access plan as well. However,
there are also some limitations.

In the Informix example, you will note that it can also enable access to both
structured and unstructured data. Informix DataBlades are used for the interface
to unstructured data. These are extensions to the database that enable this
capability, and a number of others.

Early on, Informix introduced I-STAR as a way to enable easy access and
exchange of data between the different Informix relational databases. This is one
of the ways Informix provided the capability to change access plans and
distribute some of the query execution tasks. It was another step towards data
federation. I-STAR began as a separate product, but over time was integrated
into the Informix base technology.

Figure 1-4 Informix data federation (Informix with DataBlades for universal data, the
Enterprise Gateway with DRDA reaching DB2 390, Informix Enterprise Gateway Manager
reaching DB2 UDB, Oracle, Sybase, Microsoft, and Teradata, plus third-party gateways
and the Virtual Table Interface for other external data sources)

Informix I-STAR Technology


In the early days of integration and data federation, Informix developed a
technology called I-STAR. It was a separate product at that time, but has since
been integrated into the Informix code base for Informix Dynamic Server and
Informix Extended Parallel Server.

I-STAR allows users to query and update more than one database across
multiple database servers, within a single transaction. The database servers can
reside on a single host computer or on the same network. A two-phase commit
protocol ensures that transactions are uniformly committed, or rolled back,
across multiple database servers. You can also use Informix database servers
with Informix Enterprise Gateway products to manipulate data in non-Informix
databases. The heterogeneous commit protocol ensures that updates to one or
more Informix databases and one non-Informix database in a single transaction
are uniformly committed or rolled back. You can also use I-STAR in a
heterogeneous environment that conforms to X/Open.
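
As a brief illustration of the style of access this enables, an application connected to
one IDS instance can reference databases on other Informix database servers directly
in its SQL, using database@server:table naming, and can update them inside a single
transaction. The server, database, and data values below are assumptions for
illustration only:

   -- Illustrative distributed query across two IDS instances
   SELECT c.customer_num, c.lname, o.order_num, o.order_date
   FROM stores_demo@ids_west:customer c, stores_demo@ids_east:orders o
   WHERE c.customer_num = o.customer_num;

   -- Distributed updates in one transaction are protected by two-phase commit
   BEGIN WORK;
   UPDATE stores_demo@ids_west:customer
      SET phone = '415-555-0100' WHERE customer_num = 101;
   UPDATE stores_demo@ids_east:orders
      SET backlog = 'n' WHERE customer_num = 101;
   COMMIT WORK;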

There is more detailed information on the capabilities of both DB2 Information
Integrator and Informix Enterprise Gateway Manager in the chapters that follow.

1.4 Considerations
In this section we discuss some individual techniques to be considered with data
federation. In later chapters, we make specific recommendations based on the
test environment we used.

1.4.1 Naming conventions


If you think naming conventions are not too important, we would like you to
review that position now. In this project we found ourselves managing more
discrete, but similar, data sources than ever before. If you are clear about which
data source you are looking at, you will be able to federate your different data
sources much more easily.

Good naming conventions are invaluable when interpreting query plans to
analyze any potential performance problems you may encounter. As you will see
when you read 2.1, “Environment and server configuration” on page 33, our
database schema was not overly complex. However, we did purposely create
quite a lot of complexity in the number and types of database servers. This was
to simulate an environment that might be more closely comparable with the one
in which you work.

In our environment, it is the multiplication factor provided by having the same
schema distributed over a large number of database servers that really
demonstrates that naming conventions are critical.

In this context we are referring to the local names you allocate in the federated
database server to the remote data sources. Of course the existing remote data
sources will already have their own names and we are not suggesting that you
embark on a campaign to rename all of those.

In our environment, we used a naming convention, the main purpose of which is
to aid clarity for you. Consequently, once you understand the convention, you can
easily determine the remote data source for a given federated database object.

Bear in mind that the names you allocate in the federated database server for the
remote data sources are the names by which all your end users and applications
will address these objects in the single data source illusion that the federated
database server creates. Of course, you may choose to obscure the data source
for the federated database objects from your end users and applications by
applying aliases, synonyms, or views over the top of your other names.
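
For example, a convention that encodes the region and the source DBMS in each local
name keeps query plans easy to read, while a view can restore the single-table illusion
for end users. The names below are purely illustrative and are not our exact
convention:

   -- Illustrative convention: <schema>.<region>_<dbms>_<table>
   CREATE NICKNAME fed.r1_ids_customer FOR ifmx_region1."informix"."customer";
   CREATE NICKNAME fed.r2_ora_customer FOR ora_region2."STORES"."CUSTOMER";

   -- A view hides the physical naming and the distribution from end users
   CREATE VIEW fed.customer AS
      SELECT * FROM fed.r1_ids_customer
      UNION ALL
      SELECT * FROM fed.r2_ora_customer;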

You may, or may not, choose to follow something similar to our naming approach.
After reading this publication, and hopefully getting a good understanding of what
data federation is all about, we do at least request that you give some thought to
naming conventions. It is much more cost effective to do this at the beginning of
this data federation journey.

1.4.2 Data types


In general, the data types in your remote data sources will already be determined
by the time you decide to federate your data, so any difficulties you face in this
area will be the legacy of other system design decisions taken long before you
arrive at this point.

In addition, certain higher-value data types are either not supported or are
supported with some limitations by the currently available data federation
technology. Examples are certain binary large objects (BLOBs), character large
objects (CLOBs), and some date/time data types on some platforms.

Fortunately, there are some weapons in the data federation arsenal to help you
here. We discuss some of the problems we encountered and how we dealt with
them for DB2 II in 4.4, “Considerations for use” on page 108, and for EGM in 5.3,
“Considerations for use” on page 149.
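
One such weapon, shown here only as a hedged sketch, is the ability in DB2 II to adjust
the local data type under which a remote column is exposed, which can sidestep some
of the limitations mentioned above. The nickname and column names are illustrative:

   -- Illustrative: expose a troublesome remote column under a friendlier local type
   ALTER NICKNAME west.customer
      ALTER COLUMN cust_notes LOCAL TYPE VARCHAR(2000);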


Chapter 2. The data federation project


In this chapter we introduce you to the environment and architecture used for our
product implementations and testing. We used this environment to exercise the
capabilities of the products, to validate specific functional scenarios, and to
evaluate and document the results for inclusion in this publication.

In setting up the environment, our objectives were to:


• Create an environment that was complex enough to be a reasonable model
for a real-life commercial data federation scenario.
• Encompass the primary database management systems (DBMSs) typically in
use in Informix installations. We recognize that most Informix installations will
perhaps not have all the DBMSs we used, but most will probably have enough
of a subset to gain value from the information presented.
• Have an environment that would constitute a reasonable compromise
between meeting the above objectives and remaining manageable within our
constraints of server availability and time.

We also introduce you to our case study, which we hope will give you something
near a touch and feel experience of how you could use data federation within
your own organization.

We have constructed the case study scenario centered around a fictitious
company that we have based in the United States. This fictitious company has
the states organized into seven regions. Each region in our case study uses the
same database schema, but has implemented it using a different database
management system or operating system. Most businesses would probably be
less complex than this. However, we wanted to demonstrate the robustness of
the data federation capabilities that are available to you. We have used the
Stores Demo database schema, which is a sample database supplied with
Informix Dynamic Server (IDS) and Informix Extended Parallel Server (XPS).
Most companies that have Informix installed are already familiar with this
schema, which should aid the understanding of our case study.

As the starting point for the case study, we populated the databases in each
region with significant quantities of meaningful test data. We then integrated the
disparate databases from the regions into a single federated corporate database.
To simulate daily operations, we constructed sample queries to run against our
federated database. Sometimes we ran into problems with incompatible data
types in the different DBMSs and had to overcome these issues. We analyzed
access plans produced by the query optimizer for execution of our sample
queries, and then investigated the effect on performance of altering various
configuration settings.
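
To give a flavor of those daily operations, the following hedged sketch is
representative of the simpler cross-region queries we ran against the federated
database; the object names are illustrative rather than our exact definitions:

   -- Illustrative cross-region query against the federated database
   SELECT s.state, COUNT(*) AS customer_count
   FROM (SELECT customer_num, state FROM fed.r1_ids_customer
         UNION ALL
         SELECT customer_num, state FROM fed.r2_ora_customer) AS s
   GROUP BY s.state
   ORDER BY customer_count DESC;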

In constructing this case study, we discovered many considerations for building a
federated system. You can read about all of this in detail in Chapter 4, Chapter 5,
and Chapter 6.

32 IBM Informix: Integration Through Data Federation


Random documents with unrelated
content Scribd suggests to you:
The Project Gutenberg eBook of The Roman
Festivals of the Period of the Republic
This ebook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this ebook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.

Title: The Roman Festivals of the Period of the Republic

Author: W. Warde Fowler

Release date: March 4, 2019 [eBook #59007]

Language: English

Credits: Produced by Ted Garvin, David King, and the Online


Distributed Proofreading Team at http://www.pgdp.net.

*** START OF THE PROJECT GUTENBERG EBOOK THE ROMAN


FESTIVALS OF THE PERIOD OF THE REPUBLIC ***
Transcriber’s Note:

Footnotes have been collected at the end of the text,


and are linked for ease of reference.
The Roman Festivals of the Period of the Republic

THE ROMAN FESTIVALS OF THE


PERIOD OF THE REPUBLIC
AN INTRODUCTION TO THE STUDY OF THE
RELIGION OF THE ROMANS
BY

W. WARDE FOWLER, M.A.


FELLOW AND SUB-RECTOR OF LINCOLN
COLLEGE, OXFORD

London
MACMILLAN AND CO., Limited
NEW YORK: THE MACMILLAN COMPANY
1899

All rights reserved

OXFORD: HORACE HART

PRINTER TO THE UNIVERSITY

FRATRIS FILIIS

I. C. H. F
H. G. C. F

BONAE SPEI ADOLESCENTIBUS


PREFACE

A word of explanation seems needed about the form this book has
taken. Many years ago I became specially interested in the old
Roman religion, chiefly, I think, through studying Plutarch’s
Quaestiones Romanae, at a time when bad eyesight was compelling
me to abandon a project for an elaborate study of all Plutarch’s
works. The ‘scrappy’ character not only of the Quaestiones, but of all
the material for the study of Roman ritual, suited weak eyes better
than the continual reading of Greek text; but I soon found it
necessary to discover a thread on which to hang these fragments in
some regular order. This I naturally found in the Fasti as edited by
Mommsen in the first volume of the Corpus Inscriptionum
Latinarum; and it gradually dawned on me that the only scientific
way of treating the subject was to follow the calendar throughout
the year, and to deal with each festival separately. I had advanced
some way in this work, when Roscher’s Lexicon of Greek and Roman
Mythology began to appear in parts, and at once convinced me that
I should have to do my work all over again in the increased light
afforded by the indefatigable industry of the writers of the Roman
articles. I therefore dropped my work for several years while the
Lexicon was in progress, and should have waited still longer for its
completion, had not Messrs. Macmillan invited me to contribute a
volume on the Roman religion to their series of Handbooks of
Archaeology and Antiquities.
Having once set out on the plan of following the Fasti, I could not
well abandon it, and I still hold it to be the only sound one:
especially if, as in this volume, the object is to exhibit the religious
side of the native Roman character, without getting entangled to any
serious extent in the colluvies religionum of the last age of the
Republic and the earlier Empire. The book has thus taken the form
of a commentary on the Fasti, covering in a compressed form almost
all the public worship of the Roman state, and including incidentally
here and there certain ceremonies which strictly speaking lay outside
that public worship. Compression has been unavoidable; yet it has
been impossible to avoid stating and often discussing the conflicting
views of eminent scholars; and the result probably is that the book
as a whole will not be found very interesting reading. But I hope
that British and American students of Roman history and literature,
and possibly also anthropologists and historians of religion, may find
it useful as a book of reference, or may learn from it where to go for
more elaborate investigations.
The task has often been an ungrateful one—one indeed of

Dipping buckets into empty wells


And growing old with drawing nothing up.

The more carefully I study any particular festival, the more (at least
in many cases) I have been driven into doubt and difficulty both as
to reported facts and their interpretation. Had the nature of the
series permitted it, I should have wished to print the chief passages
quoted from ancient authors in full, as was done by Mr. Farnell in his
Cults of the Greek States, and so to present to the reader the actual
material on which conclusions are rightly or wrongly based. I have
only been able to do this where it was indispensable: but I have
done my best to verify the correctness of the other references, and
have printed in full the entries of the ancient calendars at the head
of each section. Professor Gardner, the editor of the series, has
helped me by contributing two valuable notes on coins, which will be
found at the end of the volume: and I hope he may some day find
time to turn his attention more closely to the bearing of numismatic
evidence on Roman religious history.
It happens, by a curious coincidence, that I am writing this on the
last day of the old Roman year; and the lines which Ovid has
attached to that day may fitly express my relief on arriving at the
end of a very laborious task:

Venimus in portum, libro cum mense peracto,


Naviget hinc alia iam mihi linter aqua.

W. W. F.
Oxford: Feb. 28, 1899.
CONTENTS

Introduction 1

Calendar 21

Festivals of March 33

Festivals of April 66

Festivals of May 98

Festivals of June 129

Festivals of July 173

Festivals of August 189

Festivals of September 215

Festivals of October 236

Festivals of November 252

Festivals of December 255


Festivals of January 277

Festivals of February 298

Conclusion 332

Notes on Two Coins 350

Indices 353
ABBREVIATIONS.

The following are the most important abbreviations which occur in


the notes:
C. I. L. stands for Corpus Inscriptionum Latinarum. Where the
volume is not indicated the reference is invariably to the second
edition of that part of vol. i which contains the Fasti (Berlin, 1893).
Marquardt or Marq. stands for the third volume of Marquardt’s
Römische Staatsverwaltung, second edition, edited by Wissowa
(Berlin, 1885). It is the sixth volume of the complete Handbuch der
Römischen Alterthümer of Mommsen and Marquardt.
Preller, or Preller-Jordan, stands for the third edition of Preller’s
Römische Mythologie by H. Jordan (Berlin, 1881).
Myth. Lex. or Lex. stands for the Ausführliches Lexicon der
Griechischen und Römischen Mythologie, edited by W. H. Roscher,
which as yet has only been completed to the letter N.
Festus, or Paulus, stands for K. O. Müller’s edition of the fragments
of Festus, De Significatione Verborum, and the Excerpta ex Festo of
Paulus Diaconus; quoted by the page.
INTRODUCTION
I. The Roman Method of Reckoning the Year.[1]
There are three ways in which the course of the year may be
calculated. It can be reckoned—
1. By the revolution of the moon round the earth, twelve of which =
354 days, or a ring (annus), sufficiently near to the solar year to be
a practicable system with modifications.
2. By the revolution of the earth round the sun i. e. 365-1/4 days; a
system which needs periodical adjustments, as the odd quarter (or,
more strictly, 5 hours 48 minutes 48 seconds) cannot of course be
counted in each year. In this purely solar year the months are only
artificial divisions of time, and not reckoned according to the
revolutions of the moon. This is our modern system.
3. By combining in a single system the solar and lunar years as
described above. This has been done in various ways by different
peoples, by adopting a cycle of years of varying length, in which the
resultants of the two bases of calculation should be brought into
harmony as nearly as possible. In other words, though the difference
between a single solar year and a single lunar year is more than 11
days, it is possible, by taking a number of years together and
reckoning them as lunar years, one or more of them being
lengthened by an additional month, to make the whole period very
nearly coincide with the same number of solar years. Thus the
Athenians adopted for this purpose at different times groups or
cycles of 8 and 19 years. In the Octaeteris or 8-year cycle there
were 99 lunar months, 3 months of 30 days being added in 3 of the
8 years—a plan which falls short of accuracy by about 36 hours.
Later on a cycle of 19 years was substituted for this, in which the
discrepancy was greatly reduced. The Roman year in historical times
was calculated on a system of this kind, though with such inaccuracy
and carelessness as to lose all real relation to the revolutions both of
earth and moon.
But there was a tradition that before this historical calendar came
into use there had been another system, which the Romans
connected with the name of Romulus. This year was supposed to
have consisted of 10 months, of which 4—March, May, July, October
—had 31 days, and the rest 30; in all 304. But this was neither a
solar nor a lunar year; for a lunar year of 10 months = 295 days 7
hours 20 minutes, while a solar year = 365-1/4. Nor can it possibly
be explained as an attempt to combine the two systems. Mommsen
has therefore conjectured that it was an artificial year of 10 months,
used in business transactions, and in periods of mourning, truces[2],
&c., to remedy the uncertainty of the primitive calculation of time;
and that it never really was the basis of a state calendar. This view
has of course been the subject of much criticism[3]. But no better
solution has been found; the hypothesis that the year of 10 months
was a real lunar year, to which an undivided period of time was
added at each year’s end, to make it correspond with the solar year
and the seasons, has not much to recommend it or any analogy
among other peoples. It was not, then, the so-called year of
Romulus which was the basis of the earliest state-calendar, but
another system which the Romans themselves usually ascribed to
Numa. This was originally perhaps a lunar year; at any rate the
number of days in it is very nearly that of a true lunar year (354
days 8 hours 48 minutes)[4]. It consisted of 12 months, of which
March, May, July, October had 31 days, and the rest 29, except
February, which had 28. All the months therefore had an odd
number of days, except the one which was specially devoted to
purification and the cult of the dead; according to an old
superstition, probably adopted from the Greeks of Southern Italy[5],
that odd numbers were of good omen, even numbers of ill omen.
This principle, as we shall see, holds good throughout the Roman
calendar.
But this reckoning of the year, if it ever existed at all, could not have
lasted long as it stood. As we know it in historical times, it has
become modified by applying to it the principle of the solar year. The
reason for this should be noted carefully. A lunar year, being about
11 days short of the solar year, would in a very short time become
out of harmony with the seasons. Now if there is one thing certain
about the Roman religious calendar, it is that many at least of its
oldest festivals mark those operations of husbandry on which the
population depended for its subsistence, and for the prosperous
result of which divine agencies must be propitiated. These festivals,
when fixed in the calendar, must of course occur at the right
seasons, which could not be the case if the calendar were that of a
purely lunar year. It was therefore necessary to work in the solar
principle; and this was done[6] by a somewhat rude expedient, not
unlike that of the Athenian Octaeteris, and probably derived from
it[7]. A cycle of 4 years was devised, of which the first had the 355
days of the lunar year, the second 355 + 22, the third 355 again,
and the fourth 355 + 23. The extra periods of 22 and 23 days were
inserted in February, not at the end, but after the 23rd (Terminalia)
[8]
. The total number of days in the cycle was 1465, or about 1 day
too much in each year; and in course of time even this system got
out of harmony with the seasons and had to be rectified from time
to time by the Pontifices, who had charge of the calendar. Owing to
ignorance on their part, misuse or neglect of intercalation had put
the whole system out of gear before the last century of the Republic.
All relation to sun and moon was lost; the calendar, as Mommsen
says, ‘went on its own way tolerably unconcerned about moon and
sun.’ When Caesar took the reform of the calendar in hand the
discrepancy between it and the seasons was very serious; the
former being in advance of the latter probably by some weeks.
Caesar, aided by the mathematician Sosigenes, put an end to this
confusion by extending the year 46 B.C. to 445 days, and starting
afresh on Jan. 1, 45 B.C.[9]—a day henceforward to be that of the
new year—with a cycle of 4 years of 365 days[10]; in the last of which
a single day was added, after the Terminalia. This cycle produced a
true solar year with a slight adjustment at short intervals; and after
a few preliminary blunders on the part of the Pontifices, lasted
without change until A.D. 1582, when Pope Gregory XIII set right a
slight discrepancy by a fresh regulation. This regulation was only
adopted in England in 1752, and is still rejected in Russia and by the
Greek Church generally.
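Before passing to the order of the months, the arithmetic of the pre-Julian cycle described above may be set out, taking 365¼ days as the length of the solar year, as the text does:

\[
355 + (355 + 22) + 355 + (355 + 23) = 1465 \text{ days},
\]
\[
\frac{1465}{4} = 366\tfrac{1}{4} \text{ days in the average year}, \qquad 366\tfrac{1}{4} - 365\tfrac{1}{4} = 1 \text{ day too much in each year}.
\]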
II. Order of Months in the Year.
That the Roman year originally began with March is certain[11], not
only from the evidence of the names of the months, which after
June are reckoned as 5th (Quinctilis), 6th (Sextilis), and so on, but
from the nature of the March festivals, as will be shown in treating
of that month. In the character of the religious festivals there is a
distinct break between February and March, and the operations both
of nature and of man take a fresh turn at that point. Between the
festivals of December and those of January there is no such break.
No doubt January 1, just after the winter solstice, was even at an
early time considered in some sense as a beginning; but it is going
too far to assume, as some have done, that an ancient religious or
priestly year began at that point[12]. It was not on January 1, but on
March 1, that the sacred fire in the Aedes Vestae was renewed and
fresh laurels fixed up on the Regia, the two buildings which were the
central points of the oldest Roman religion[13]. March 1, which in
later times at least was considered the birthday of the special
protecting deity of the Romans, continued to be the Roman New
Year’s Day long after the official beginning of the year had been
changed to January 1[14]. It was probably not till 153 B.C., when the
consuls began to enter on office on January 1, that this official
change took place; and the date was then adopted, not so much for
religious reasons as because it was convenient, when the business of
administration was increasing, to have the consuls in Rome for some
time before they left for their provinces at the opening of the war
season in March.
No rational account can in my opinion be given of the Roman
religious calendar of the Republic unless it be taken as beginning
with March; and in this work I have therefore restored the old order
of months. With the Julian calendar I am not concerned; though it is
unfortunate that all the Roman calendars we possess, including the
Fasti of Ovid, date from after the Julian era, and therefore present
us with a distorted view of the true course of the old Roman
worship.
Next after March came Aprilis, the month of opening or unfolding
vegetation; then Maius, the month of growing, and Junius, that of
ripening and perfecting. After this the names cease to be descriptive
of the operations of nature; the six months that follow were called,
as four of them still are, only by their positions relative to March, on
which the whole system of the year thus turned as on a pivot.
The last two months of the twelve were January and February. They
stand alone among the later months in bearing names instead of
mere numbers, and this is sufficient to suggest their religious
importance. That they were not mere appendages to a year of ten
months is almost certain from the antique character of the rites and
festivals which occur in them—Agonia, Carmentalia, Lupercalia, &c.;
and it is safer to consider them as marking an ancient period of
religious importance preparatory to the beginning of the year, and
itself coinciding with the opening of the natural year after the winter
solstice. This latter point seems to be indicated in the name
Januarius, which, whether derived from janua, ‘a gate,’ or Janus, ‘the
god of entrances,’ is appropriate to the first lengthening of the days,
or the entrance of the sun on a new course; while February, the
month of purifying or regenerative agencies (februa), was, like the
Lent of the Christian calendar, the period in which the living were
made ready for the civil and religious work of the coming year, and
in which also the yearly duties to the dead were paid.
It is as well here to refer to a passage of Ovid (Fasti, ii. 47 foll.),
itself probably based on a statement of Varro, which has led to a
controversy about the relative position of these two months:
Sed tamen antiqui ne nescius ordinis erres,
Primus, ut est, Iani mensis et ante fuit.
Qui sequitur Ianum, veteris fuit ultimus anni,
Tu quoque sacrorum, Termine, finis eras.
Primus enim Iani mensis, quia ianua prima est;
Qui sacer est imis manibus, imus erat.
Postmodo creduntur spatio distantia longo
Tempora bis quini continuasse viri.

This plainly means that from the time when March ceased to be the
first month, the year always began with January and ended with
February; in other words the order was January, March, April, and so
on, ending with February; until the time of the Decemvirate, when
February became the second month, and December the last, as at
present, January still retaining its place. A little consideration of
Ovid’s lines will, however, suggest the conclusion that he, and his
authority, whoever that may have been, were arguing aetiologically
rather than on definite knowledge. January, they thought, must
always have been the first month, because janua, ‘a door,’ is the first
thing, the entrance, through which you pass into a new year as into
a house or a temple. How, they would argue, could a month thus
named have ever been the eleventh month? This once supposed
impossible, it was necessary to infer that the place of January was
the first, from the time of its introduction, and that it was followed
by March, April, &c., February coming last of all, immediately after
December; and finally that at the time of the Decemvirs, who are
known to have made some alterations in the calendar, the positions
of January and February were reversed, January remaining the first
month, but February becoming the second.
III. The Divisions of the Month.
The Romans, with their usual conservatism, preserved the shell of the
lunar system of reckoning long after the reality had disappeared.
The month was at all times divided by the real or imaginary phases
of the moon, though a week of eight days was introduced at an
early period, and though the month was no longer a lunar one.
The two certain points in a lunar month are the first appearance of
the crescent[15] and the full moon; between these is the point when
the moon reaches the first quarter, which is a less certain one.
Owing to this uncertainty of the reckoning of the first days of the
month there were no festivals in the calendars on the days before
the first quarter (Nones), with the single exception of the obscure
Poplifugia on July 5. The day of the new moon was called Kalendae,
as Varro tells us, ‘quod his diebus calantur eius mensis nonae a
pontificibus, quintanae an septimanae sint futurae, in Capitolio in
curia Calabra sic: Dies te quinque calo, Iuno Covella. Septem dies te
calo Iuno Covella’[16]. All the Kalends were sacred to Juno, whose
connexion with the moon is certain though not easy to explain.
With the Nones, which were sacred to no deity, all uncertainty
ceased. The Ides, or day of the full moon, was always the eighth
after the first quarter. This day was sacred to Jupiter; a fact which is
now generally explained as a recognition of the continuous light of
the two great heavenly bodies during the whole twenty-four
hours[17]. On the Nones the Rex sacrorum (and therefore before him
the king himself) announced the dates of the festivals for the month.
There was another internal division of the month, with which we are
not here specially concerned, that of the Roman week or nundinal
period of eight days, which is indicated in all the calendars by the
letters A to H. The nundinae were market days, on which the rustic
population came into Rome; whether they were also feast days
(feriae) was a disputed question even in antiquity.
IV. The Days.
Every day in the Roman calendar has a certain mark attached to it,
viz. the letters F, C, N, NP, EN, Q.R.C.F., Q.St.D.F., or FP. All of these
have a religious significance, positive or negative.
F, i. e. fas or fastus, means that on the day so marked civil and
especially judicial business might be transacted without fear of divine
displeasure[18]. Correctness in the time as well as place of all human
actions was in the mind of the early Roman of the most vital
importance; and the floating traditional ideas which governed his life
before the formation of the State were systematized and kept secret
by kings and priests, as a part, so to speak, of the science of
government. Not till B.C. 304 was the calendar published, with its
permissive and prohibitive regulations[19].
C (comitialis) means that the day so marked was one on which the
comitia might meet[20], and on which also legal business might be
transacted, as on the days marked F, if there were no other
hindrance. The total number of days thus available for secular
business, i.e. days marked F and C, was in the Julian calendar 239
out of 365.
N, i. e. nefastus, meant that the day so marked was religiosus,
vitiosus, or ater; as Gellius has it[21], ‘tristi omine et infames
impeditique, in quibus et res divinas facere et rem quampiam novam
exordiri temperandum est.’ Some of these days received the mark in
historical times for a special reason, e. g. a disaster to the State;
among these were the postriduani or days following the Kalends,
Nones and Ides, because two terrible defeats had occurred on such
days[22]. But most of them (in all they are 57) were probably so
marked as being devoted to lustrations, or worship of the dead or of
the powers of the earth, and therefore unsuitable for worldly
business. One long series of such dies nefasti occurs Feb. 1-14, the
time of purification; another, April 5-22, in the month occupied by
the rites of deities of growing vegetation; a third, June 5-14, when
the rites of the Vestals preparatory to harvest were taking place; and
a fourth, July 1-9, for reasons which are unfortunately by no means
clear to us.
NP was not a mark in the pre-Julian calendars, for it was apparently
unknown to Varro and Ovid. Verrius Flaccus seems to have
distinguished it from N, but his explanation is mutilated, even as it
survives in Festus[23]. No one has yet determined for certain the
origin of the sign, and discussion of the various conjectures would
be here superfluous[24]. It appears to distinguish, in the Julian
calendars, those days on which fell the festivals of deities who were
not of an earthly and therefore doubtful character from those
marked N. Thus in the series of dies nefasti in February and April the
Ides in each case have the mark NP as being sacred to Jupiter.
EN. We have a mutilated note in the calendar of Praeneste which
indicates what this abbreviation meant, viz. endotercisus =
intercisus, i. e. ‘cut into parts’[25]. In morning and evening, as Varro
tells us, the day was nefastus, but in the middle, between the
slaying of the victim and the placing of the entrails upon the altar, it
was fastus. But why eight days in the calendar were thus marked we
do not know, and have no data for conjecturing. All the eight were
days coming before some festival, or before the Ides. Of the eight
two occur in January and two in February, the others in March,
August, October and December. But on such facts no conjectures
can be built.
Q.R.C.F. (Quando Rex Comitiavit Fas) will be explained under March
24; the only other day on which it occurs is May 24. Q.St.D.F.
(Quando stercus delatum fas) only occurs on June 15, and will there
be fully dealt with.
FP occurs thrice, but only in three calendars. Feb. 21 (Feralia) is thus
marked in Caer.[26], but is F in Maff. April 23 (Vinalia) is FP in Caer.
but NP in Maff. and F in Praen. Aug. 19 (Vinalia rustica) is FP in Maff.
and Amit., F in Antiat. and Allif., NP in Vall. Mommsen explains FP as
fastus principio, i. e. the early part of the day was fastus, and
suggests that in the case of the Feralia, as the rites of the dead were
performed at night, there was no reason why the earlier part of the
day should be nefastus. But in the case of the two Vinalia we can
hardly even guess at the meaning of the mark, and it does not seem
to have been known to the Romans themselves.
V. The Calendars still surviving.
The basis of our knowledge of the old Roman religious year is to be
found in the fragments of calendars which still survive. None of
these indeed is older than the Julian era; and all but one are mere
fragments. But from the fragments and the one almost perfect
calendar we can infer the character of the earlier calendar with
tolerable certainty.
The calendar, as the Romans generally believed, was first published
by Cnaeus Flavius, curule aedile, in 304 B.C., who placed the fasti
conspicuously in the Forum, in order that every one might know on
what days legal business might be transacted[27]; in other words, a
calendar was published with the marks of the days and the
indications of the festivals. After this we hear nothing until 189 B.C.,
when a consul, M. Fulvius Nobilior, adorned his temple of Hercules
and the Muses with a calendar which contained explanations or
notes as well as dates[28]. These are the only indications we have of
the way in which the pre-Julian calendar was made known to the
people.
But the rectification of the calendar by Julius, and the changes then
introduced, brought about a multiplication of copies of the original
one issued under the dictator’s edict[29]. Not only in Rome, but in the
municipalities round about her, where the ancient religious usage of
each city had since the enfranchisement of Italy been superseded,
officially at least, by that of Rome, both public and private copies
were made and set up either on stone, or painted on the walls or
ceiling of a building.
Of such calendars we have in all fragments of some thirty, and one
which is all but complete. Fourteen of these fragments were found in
or near Rome, eleven in municipalities such as Praeneste, Caere,
Amiternum, and others as far away as Allifae and Venusia; four are
of uncertain origin[30]; and one is a curious fragment from Cisalpine
Gaul[31]. Most of them are still extant on stone, but for a few we
have to depend on written copies of an original now lost[32]. No day
in the Roman year is without its annotation in one or more of these;
the year is almost complete, as I have said, in the Fasti Maffeiani;
and several others contain three or four months nearly perfect[33].
Two, though in a fragmentary condition, are of special interest. One
of these, that of the ancient brotherhood of the Fratres Arvales,
discovered in 1867 and following years in the grove of the brethren
near Rome, contains some valuable additional notes in the
fragments which survive of the months from August to November.
The other, that of Praeneste, containing January, March, April and
parts of February and December, is still more valuable from the
comments it contains, most of which we can believe with confidence
to have come from the hand of the great Augustan scholar Verrius
Flaccus. We are told by Suetonius that Verrius put up a calendar in
the forum at Praeneste[34], drawn up by his own hand; and the
date[35] and matter of these fragments found at Praeneste agree with
what we know of the life and writings of Verrius. It is unlucky that
recent attempts to find additional fragments should have been
entirely without result; for the whole annotated calendar, if we
possessed it, would probably throw light on many dark corners of
our subject.
To these fragments of Julian calendars, all drawn up between B.C. 31
and A.D. 46, there remain to be added two in MSS.: (i) that of
Philocalus, A.D. 354, (ii) that of Polemius Silvius, A.D. 448; neither of
which is of much value for our present purpose, though they will
be occasionally referred to. Lastly, we have two farmer’s almanacs
on cubes of bronze, which omit the individual days, but are of use as
showing the course of agricultural operations under the later
Empire[36].
All these calendars, some of which had been printed wholly or in
part long ago, while a few have only been discovered of late, have
been brought together for the first time in the first volume of the
Corpus Inscriptionum Latinarum, edited by Mommsen with all his
incomparable skill and learning, and furnished with ample
elucidations and commentaries. And we now have the benefit of a
second edition of this by the same editor, to whose labours in this as
in every other department of Roman history it is almost impossible
to express our debt in adequate words. All references to the
calendars in the following pages will be made to this second edition.
A word remains to be said about the Fasti of Ovid[37], which is a
poetical and often fanciful commentary on the calendar of the first
half of the Julian year, i.e. January to June inclusive; each month
being contained in one book. Ovid tells us himself[38] that he
completed the year in twelve books; but the last six were probably
never published, for they are never quoted by later writers. The first
six were written but not published before the poet’s exile, and taken
in hand again after the death of Augustus, but only the first book
had been revised when the work was cut short by Ovid’s death.
Ovid’s work merits all praise as a literary performance, for the
neatness and felicity of its versification and diction; but as a source
of knowledge it is too much of a medley to be used without careful
criticism. There is, however, a great deal in it that helps us to
understand the views about the gods and their worship, not only of
the scholars who pleased themselves and Augustus by investigating
these subjects, but also of the common people both in Rome and in
the country. But the value varies greatly throughout the work. Where
the poet describes some bit of ritual which he has himself seen, or
tells some Italian story he has himself heard, he is invaluable; but as
a substitute for the work of Varro on which he drew, he only
increases our thirst for the original. No great scholar himself, he
aimed at producing a popular account of the results of the work of
scholars, picking and choosing here and there as suited his purpose,
and not troubling himself to write with scientific accuracy. Moreover,
he probably made free use of Alexandrine poets, and especially of
Callimachus, whose Aetia is in some degree his model for the whole
poem; and thus it is that the work contains a large proportion of
Greek myth, which is often hard to distinguish from the fragments of
genuine Italian legend which are here and there imbedded in it. Still,
when all is said, a student of the Roman religion should be grateful
to Ovid; and when after the month of June we lose him as a
companion, we may well feel that the subject not only loses with
him what little literary interest it can boast of, but becomes for the
most part a mere investigation of fossil rites, from which all life and
meaning have departed for ever.
VI. The Calendar of the Republic and its Religious Festivals.
All the calendars still surviving belong, as we saw, to the early
Empire, and represent the Fasti as revised by Julius. But what we
have to do with is the calendar of the Republic. Can it be recovered
from those we still possess? Fortunately this is quite an easy task, as
Mommsen himself has pointed out[39]; we can reconstruct for certain
the so-called calendar of Numa as it existed throughout the
Republican era. The following considerations must be borne in mind:
1. It is certain that Caesar and his advisers would alter the familiar
calendar as little as possible, acting in the spirit of persistent
conservatism from which no true Roman was ever free. They added
10 days to the old normal year of 355 days, i. e. two at the end of
January, August, and December, and one at the end of April, June,
September, and November; but they retained the names of the
months, and their division by Kalends, Nones, and Ides, and also the
signs of the days, and the names of all festivals throughout the year.
Later on further additions were made, chiefly in the way of
glorification of the Emperors and their families; but the skeleton
remained as it had been under the Republic.
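These ten added days may be checked against the Republican month-lengths given in the first section; the tabulation that follows is merely such a check, not anything found in the calendars themselves, and it yields the distribution of days still in use:

\[
3 \times 2 + 4 \times 1 = 10 \text{ days added};
\]
\[
7 \times 31 + 4 \times 30 + 28 = 217 + 120 + 28 = 365 \text{ days},
\]

the seven months of 31 days being January, August, and December (each raised from 29 by the two added days) together with March, May, July, and October, which had 31 already; the four of 30 days being April, June, September, and November (each raised from 29 by one day); February keeping its 28.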
2. It is almost certain that the Republican calendar itself had never
been changed from its first publication down to the time of Caesar.
There is no historical record of any alteration, either by the
introduction of new festivals or in any other way. The origin of no
festival is recorded in the history of the Republic, except the second
Carmentalia, the Saturnalia, and the Cerealia[40]; and in these three
cases we can be morally certain that the record, if such it can be
called, is erroneous.
3. If Julius and his successors altered only by slight additions, and if
the calendar which they had to work on was of great antiquity and
unchanged during the Republic, how, in the next place, are we to
distinguish the skeleton of that ancient calendar from the Julian and
post-Julian additions? Nothing is easier; in Mommsen’s words, it is
not a matter of calculation; a glance at the Fasti is sufficient. In all
these it will be seen that the numbers, names, and signs of the days
were cut or painted in large capital letters; while ludi, sacrifices, and
all additional notes and comments appear in small capital letters. It
cannot be demonstrated that the large capital letters represent the
Republican calendar; but the circumstantial evidence, so to speak, is
convincing. For inscribed in these large capitals is all the information
which the Roman of the Republic would need; the dies fasti,
comitiales, nefasti, &c.; the number of the days in the month; the
position of the Nones and the Ides and the names of those days on
which fixed festivals took place; all this in an abbreviated but no
doubt familiar form. The minor sacrificial rites, which concerned the
priests and magistrates rather than the people, he did not find there;
they would only have confused him. The moveable festivals, too, he
did not find there, as they changed their date from year to year and
were fixed by the priesthood as the time for each came round. The
ludi, or public games, were also absent from the old calendar, for
they were, originally at least, only adjuncts to certain festivals out of
which they had grown in course of time. Lastly, all rites which did
not technically concern the State as a whole, but only its parts and
divisions[41], i. e. of gentes and curiae, of pagi (paganalia), montes
(Septimontium) and sacella (Sacra Argeorum), could not be included
in the public calendar of the Roman people.
But the Roman of the Republic, even if his calendar were confined to
the indications given by the large capital letters in the Julian
calendar, could find in these the essential outline of the yearly round
of his religious life. This outline we too can reconstruct, though the
detail is often wholly beyond our reach. For this detail we have to fall
back upon other sources of information, which are often most
unsatisfactory and difficult to interpret. What are these other
sources, of what value are they, and how can that value be tested?
Apart from the surviving Fasti, we have to depend, both for the
completion of the religious calendar, and for the study and
interpretation of all its details, chiefly on the fragmentary remains of
the works of the two great scholars of the age of Julius and
Augustus, viz. Varro and Verrius Flaccus, and on the later
grammarians, commentators, and other writers who drew upon their
voluminous writings. Varro’s book de Lingua Latina, though not
complete, is in great part preserved, and contains much information
taken from the books of the pontifices, which, did we but possess
them, would doubtless constitute our one other most valuable record
besides the Fasti themselves[42]. Such, too, is the value of the
dictionary of Verrius Flaccus, which, though itself lost, survives in the
form of two series of condensed excerpts, made by Festus probably
in the second century, A.D., and by Paulus Diaconus as late as the
beginning of the ninth[43]. Much of the work of Varro and Verrius is
also imbedded in the grammatical writings of Servius the
commentator on Virgil, in Macrobius, Nonius, Gellius, and many
others, and also in Pliny’s Natural History, and in some of the
Christian Fathers, especially St. Augustine and Tertullian; but all
these need to be used with care and caution, except where they
quote directly from one or other of their two great predecessors.
The same may be said of Laurentius Lydus[44], who wrote in Greek a
work de Mensibus in the sixth century, which still survives. To these
materials must be added the great historical writers of the Augustan
age; Livy, who, uncritical as he was, and incapable of distinguishing
the genuine Italian elements in religious tradition from the accretions
of Greek and Graeco-Etruscan myth, yet supplies us with much
material for criticism; and Dionysius of Halicarnassus, who as a
foreigner resident for some time in Rome, occasionally describes
ritual of which he was himself a witness. The Roman lives of
Plutarch, and his curious collection entitled Roman Questions, also
contain much interesting matter, taken from several sources, e.g.
Juba, the learned king of Mauritania, but as a rule ultimately
referable to Varro. Beyond these there is no one author of real
importance; but the ‘plant’ of the investigator will include of course
the whole of Roman literature, and Greek literature so far as it