IMS in the Parallel Sysplex Volume II: Planning the IMSplex
Jouko Jäntti
Bill Stillwell
Gary Wicks
ibm.com/redbooks
International Technical Support Organization
July 2003
SG24-6928-00
Note: Before using this information and the product it supports, read the information in “Notices” on
page vii.
This edition applies to IMS Version 8 (program number 5655-C56) or later for use with the OS/390 or z/OS
operating system.
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not give you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of
express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring
any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and
distribute these sample programs in any form without payment to IBM for the purposes of developing, using,
marketing, or distributing application programs conforming to IBM's application programming interfaces.
ActionMedia, LANDesk, MMX, Pentium and ProShare are trademarks of Intel Corporation in the United
States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the
United States, other countries, or both.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems,
Inc. in the United States, other countries, or both.
C-bus is a trademark of Corollary, Inc. in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
SET, SET Secure Electronic Transaction, and the SET Logo are trademarks owned by SET Secure Electronic
Transaction LLC.
Other company, product, and service names may be trademarks or service marks of others.
Preface
This IBM Redbook is the second volume of a series of redbooks called IMS™ in the Parallel
Sysplex®. These redbooks describe how IMS exploits the Parallel Sysplex functions and how
to plan for, implement, and operate IMS systems working together in a Parallel Sysplex. We
use the term IMSplex to refer to multiple IMSs that are cooperating with each other in a
Parallel Sysplex environment to process a common shared workload. Although we generally
think of an IMSplex in terms of online environments, an IMSplex can include batch IMS jobs
as well as IMS utilities.
IMS in the Parallel Sysplex, Volume I: Reviewing the IMSplex Technology, SG24-6908
describes the Parallel Sysplex and how IMS exploits the Parallel Sysplex to provide user
services including data sharing, shared queues, VTAM® Generic Resources, Automatic
Restart Manager (ARM), and systems management functions. When migrating an IMS
system from a single, non-sharing environment to one which invokes some or all of these
services, or even when incorporating additional function into an existing IMSplex (for
example, upgrading a data sharing system to also use shared queues), the migration process
must be carefully planned. Many decisions and compromises must be made, perhaps even
some application or database changes. There will also be changes to system definition and to
operational procedures.
This redbook addresses the development of the migration plan and identifies some of the
steps and considerations you might encounter when developing the plan. The result of this
exercise is not to perform any of the implementation tasks but to identify those tasks which
must be done, and to create a plan for accomplishing them. For example, the plan can identify
as a task the establishment of a naming convention for system data sets. The naming
convention itself is not a part of the plan, but is a result of implementing the plan.
In this book we present planning considerations for the IMSplex. Separate chapters are
devoted to:
Block level data sharing
Shared queues
Connectivity
Systems management
The overall IMSplex environment
Jouko Jäntti is a Project Leader specializing in IMS with the IBM International Technical
Support Organization, San Jose Center. He holds a bachelor’s degree in Business
Information Technology from Helsinki Business Polytechnic, Finland. Before joining the ITSO
in September 2001, Jouko worked as an Advisory IT Specialist at IBM Global Services,
Finland. Jouko has been working on several e-business projects with customers as a
specialist in IMS, WebSphere®, and UNIX on the OS/390® platform. He has also been
responsible for the local IMS support for Finnish IMS customers. He joined IBM in 1997.
Bill Stillwell is a Senior Consulting I/T Specialist and has been providing technical support
and consulting services to IMS customers as a member of the Dallas Systems Center for 20
years. During that time, he developed expertise in application and database design, IMS
performance, Fast Path, data sharing, shared queues, planning for IMS Parallel Sysplex
exploitation and migration, DBRC, and database control (DBCTL). He also develops and
teaches IBM education and training courses, and is a regular speaker at the annual IMS
Technical Conferences in the United States and Europe.
Gary Wicks is a Certified I/T Specialist in Canada. He has 28 years of experience with IBM in
software service and is currently on assignment as an IMS Development Group Advocate. He
holds a degree in mathematics and physics from the University of Toronto. His areas of
expertise include IMS enablement in Parallel Sysplex environments and he has written
several IBM Redbooks™ on the subject over the last six years.
Thanks to the following people for their contributions to this project:
Rich Conway
Bob Haimowitz
International Technical Support Organization, Poughkeepsie Center
Rose Levin
Jim Bahls
Mark Harbinski
Claudia Ho
Judy Tse
Sandy Stoob
Pedro Vera
IBM Silicon Valley Laboratory, USA
Knut Kubein
IBM Germany
Suzie Wendler
IBM USA
Your efforts will help increase product acceptance and customer satisfaction. As a bonus,
you'll develop a network of contacts in IBM development labs, and increase your productivity
and marketability.
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
We want our Redbooks to be as helpful as possible. Send us your comments about this or
other Redbooks in one of the following ways:
Use the online Contact us review redbook form found at:
ibm.com/redbooks
Send your comments in an Internet note to:
[email protected]
Mail your comments to:
IBM® Corporation, International Technical Support Organization
Dept. QXXE Building 80-E2
650 Harry Road
San Jose, California 95120-6099
This volume deals primarily with the first phase, planning. It identifies, in the form of questions
you must answer and issues you must address, what you have to consider
when migrating from a non-sysplex environment to an IMSplex environment. This volume is
also useful when you are upgrading your current IMSplex to take advantage of other IMS
Parallel Sysplex capabilities, such as migrating from data sharing only to a shared queues
environment, or to one that includes the Common Service Layer available with IMS Version 8.
The first volume in this series of three volumes, IMS in the Parallel Sysplex, Volume I:
Reviewing the IMSplex Technology, SG24-6908, describes the IMSplex, its functionality, and
its benefits.
The third volume contains the implementation and operational details. For example, you may
decide during the planning phase that you want to migrate to shared queues to meet some
business requirement. IMS in the Parallel Sysplex, Volume III: IMSplex Implementation and
Operations, SG24-6929 describes how you implement shared queues, including parameter
changes, JCL changes, new address spaces, Coupling Facility structure definitions, and so
forth.
Most users will not go directly from a plan to a production environment. They will first create a
test environment (perhaps several) and a development environment (perhaps several).
Although this document is written as though all of the planning were for the production
environment, similar planning, preparation, implementation, and operational phases exist for
the test and development environments, with the same kinds of questions and tasks. The
answers and tasks may be different, but the questions are the same.
Throughout the planning process, there are several concepts that must be in the forefront of
all your planning activities. You might recognize fewer or more, but each has a purpose, and
that purpose must be satisfied in any planning process.
The assumption here is that the existing environment is being replaced (or upgraded) for one
or more reasons, such as better performance, availability, operability, or perhaps reduced
cost. The target environment must continue to provide the equivalent function with
performance and availability at least as good as the existing environment. So, in order to
define a target environment which will do this, it is first necessary to understand the existing
environment. The last thing you want to do is to get into production in the new environment
and then find out that some process, application, or database has some unique requirements
that are not met by the new IMSplex environment.
The following describes some of the characteristics of the existing environment that should be
known before defining the target.
What is the current workload?
This should be documented in terms that will facilitate the distribution of that workload
over two or more processors, and should include transaction volumes as well as batch jobs, BMPs,
and support jobs such as image copies, reorganizations, and so forth.
Who are the heavy resource users?
Which of the transactions or batch processes identified in the previous question require
the greatest number of CPU or I/O resources, and which transactions have the highest
volumes? Are there any existing problems (for example, locking conflicts) that must be
accounted for in the new environment? It might be necessary to make special provisions
for them in the target environment.
What are the service level commitments?
What agreements exist for transaction response times, batch elapsed times, and
availability? How about recovery times following database or system failure? Are users
billed according to the resources they use?
What are the network connections to the existing IMS?
Who are the end-users and how are they connected? VTAM? TCP/IP using OTMA? Any
changes expected within the planning horizon?
To what systems is the existing IMS connected?
This should include other IMSs, DB2s, CICSs, IMS Connect, WebSphere MQ, DB2®
stored procedures, WebSphere Application Server programs, and any other intelligent
systems or devices that might be sensitive to the identity of the IMS system to which they
are connected.
What are the online and batch schedules?
Always keep in mind the target environment. You may begin with some idea of what that
configuration is, but you should continually review your first ideas to see if that configuration
must change as you identify the function that meets your objectives. There will probably be
multiple target environments, which include test, development, and production environments.
Defining the environment should include the configuration of the Parallel Sysplex itself, the
number of IMS subsystems in the IMSplex, composition of the VTAM generic resource group
and/or OTMA group, the OS/390 or z/OS system on which each IMS will run, other
subsystems outside of the target IMSplex environment to which the target IMSs will connect
(for example, other IMSs and IMSplexes connected by MSC or ISC), and the Coupling Facilities to
be used for the IMSplex structures.
The elements of the target configuration, along with some of the questions you must answer,
are as follows:
Processors (CPCs) and LPARs
How many and which LPARs will IMS be active in?
OS/390 or z/OS systems
The OS/390 or z/OS systems in the target configuration should be identified by the
processors and LPARs on which they run and the types of work they will support. The
types of work include the IMS subsystems and support processes they will handle.
Coupling Facilities
The Coupling Facilities and Coupling Facility structures to support IMS should be
identified. How many Coupling Facilities are available for IMS structures? What type of
processor are they (for example, 967x or zSeries®)? What is the CF Level of the Coupling
Facility Control Code (CFCC)? Structures required in an IMSplex may include:
– Data sharing structures:
• IRLM (lock structure)
• VSAM cache structure (directory-only cache structure)
• OSAM cache structure (directory-only or store-through cache structure)
• Shared DEDB VSO structure(s) (store-in cache structures)
– Shared queues:
• Shared message queue and shared EMH queue primary and overflow structures
(list structures)
• System Logger structures (list structures)
– Common Service Layer
• Resource structure (list structure)
IMS subsystems
These are IMS online systems (either IMS Transaction Manager, DC Control, or Database
Control), IMS batch jobs, and IMS support processes, such as utilities. Identify where IMS
batch jobs will execute. Support processes may include database image copies, database
recovery, database reorganization, log archiving, and message queue management. The
OS/390 or z/OS systems on which they will run should be identified.
IMS Transaction Manager connectivity
For work to execute in an IMS environment, the user must be able to connect to an IMS
control region. When there was only one control region, it was fairly obvious which one to
connect to, and what to do when that one control region was not available. BMPs always
connected to the one IMS control region. When there are multiple control regions, it is
necessary to define how work will flow to each IMS in the IMSplex. This may include:
– SNA network
These are the “traditional” means of gaining access to an IMS transaction manager.
These devices may be real (for example, a 3270 terminal) or simulated (for example,
Telnet (TN3270)). End-users may gain access through VTAM Generic Resources, or by logging on
directly to a specific IMS APPLID.
During this planning phase, you should determine both the processing and business impact of
the failure of any component of the target environment. Identify those applications which must
be given priority in a degraded processing environment. You must also consider what users
who are connected to a failed component should do (for example, log on to another IMS, or
wait for the IMS restart).
Some of the tasks in this phase will be decision-type tasks (for instance, how many copies of
SDFSRESL do I want?). Others will be implementing some of these decisions (such as,
making two copies of SDFSRESL). At the conclusion of this phase, you are ready to migrate
to your target environment.
Throughout the description of the Parallel Sysplex services, we have tried to use IMS’s
implementation of Parallel Sysplex services as examples.
Since that time, each release of IMS has more fully exploited Parallel Sysplex services. There
is every indication that future releases of IMS will add even more Parallel Sysplex exploitation
to enable a more dynamic, available, manageable, scalable, and well performing environment
for database, transaction, and systems management. To be able to take advantage of these
functions, and any future enhancements, it is important that you have at least a cursory
understanding of what a Parallel Sysplex is, how it works, and why it is important to IMS and
to your installation.
While this is simply a very high level overview of the Parallel Sysplex, more detailed
information is provided in a series of IBM publications on the Parallel Sysplex and Parallel
Sysplex services. These publications can be found by searching for the term sysplex on the
IBM publications center Web site:
http://ehone.ibm.com/public/applications/publications/cgibin/pbi.cgi
The Parallel Sysplex was introduced later in the 1990s and added hardware and software
components to provide for sysplex data sharing. In this context, data sharing means the ability
for sysplex member systems and subsystems to store data into, and retrieve data from, a
common area. IMS data sharing is a function of IMS subsystems which exploits sysplex data
sharing services to share IMS databases. Since IMS did not exploit the base sysplex, from
this point on we will use the term sysplex as shorthand notation for a Parallel Sysplex.
In this series of redbooks, the term sysplex, when used alone, refers to a Parallel Sysplex.
When the sysplex is not a Parallel Sysplex, and the distinction is important, we use the term
base sysplex. We use the term IMSplex to refer to one or more IMSs cooperating in a single
data sharing environment.
There are surely many more and each new product provided by IBM, as well as other
software vendors, may take advantage of these sysplex services. How to use them is
documented in a series of IBM publications. Examples include:
z/OS MVS Programming: Sysplex Services Guide, SA22-7617
z/OS MVS Programming: Sysplex Services Reference, SA22-7618
z/OS MVS Setting Up a Sysplex, SA22-7625
Figure 2-1 shows some of the major components of a possible Parallel Sysplex configuration.
(The figure shows CPCs and Coupling Facilities running the CFCC, connected by CPC-CF and CF-CF links,
with structures in the Coupling Facilities and couple data sets (CDSs) shared by all systems.)
Figure 2-1 Parallel Sysplex configuration
There are many XCF groups in a Parallel Sysplex, and IMS and IMSplex-related programs
are members of many XCF groups. Groups are dynamically created when the first member
joins the group. For example, when the IRLMs are initiated, they join an XCF group using the
name specified in the IRLMGRP= parameter. The group is dynamically created by XCF when
the first member joins. Information about the group is maintained in the sysplex couple data
set. We refer to this as a data sharing group, but it is also an XCF group. Being a member of a
group entitles that member to use XCF signaling and monitoring services.
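For reference, the XCF group name and the lock structure name are among the parameters passed to the
IRLM on its startup procedure (DXRJPROC). The following is a minimal sketch with illustrative values
only; the parameter names, in particular LOCKTABL=, should be verified against the sample procedure
shipped with your IRLM level:

SCOPE=GLOBAL        sysplex (global) data sharing
IRLMNM=IRLA         IRLM subsystem name that the local IMSs connect to
IRLMID=001          identifier of this IRLM within the group
IRLMGRP=DSGROUP1    XCF group name; must be the same for all IRLMs in the data sharing group
LOCKTABL=LOCKIMS1   lock structure name, as defined in the CFRM policy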
An example of this is the IRLMs in a data sharing group. Each IRLM updates a status field in
CSA indicating that it is still alive and kicking (active). If one IRLM fails, XCF will detect that
the IRLM has stopped updating its status field and (try to) drive that IRLM's status user
routine. If the IRLM has truly failed, then the status user routine will not respond and XCF will
notify the other IRLMs of the failure, at which time they will begin action to read the retained
locks from the IRLM structure.
OS/390 also has a status field, but it is on the sysplex CDS instead of in CSA. If OS/390 fails,
or the processor fails, then this status field will not be updated. XCFs running on other
members of the sysplex monitor each other’s status field and will detect the status missing
condition and take action to isolate the failed system. XCF will then drive the group user
routines of the surviving members to notify them of the failure.
To take advantage of ARM, the program must register with ARM as a restartable element.
Most of the system-type address spaces in an IMSplex register with ARM by default, but this
can be overridden by an execution parameter (for example, ARMRST=Y|N). IMS dependent
regions are not supported by ARM - they must be restarted by operator command or
automation. Specifically, the following address spaces which may be a part of, or connected
to, your IMSplex may register with ARM:
IMS control regions (DB/DC, DBCTL, DCCTL)
Fast Database Recovery region (FDBR)
Internal Resource Lock Manager (IRLM)
Common Queue Server (CQS)
CSL address spaces (SCI, RM, and OM)
IMS Connect
DB2
CICS
VTAM
It is the responsibility of the application (for example, IMS) to enable itself to be restarted by
ARM. This is done by registering as a restart element to the automatic restart management
service of the Parallel Sysplex. ARM is activated in the sysplex by defining an ARM policy in
the ARM CDS and then activating that policy.
SETXCF START,POLICY,TYPE=ARM,POLNAME=yourpolicyname
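For illustration, the ARM policy itself is defined in the ARM CDS with the IXCMIAPU administrative
data utility before being activated with the SETXCF command above. The sketch below is a hedged,
minimal example; the policy, restart group, and element names are placeholders, and the complete
keyword set is described in z/OS MVS Setting Up a Sysplex, SA22-7625:

//DEFARM   EXEC PGM=IXCMIAPU
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DATA TYPE(ARM)
  DEFINE POLICY NAME(ARMPOL01) REPLACE(YES)
    RESTART_GROUP(IMSPLEX0)
      ELEMENT(IMSA)
        RESTART_ATTEMPTS(3)
        TERMTYPE(ELEMTERM)
      ELEMENT(IRLA)
        RESTART_ATTEMPTS(3)
/*

Placing an IMS control region and its IRLM in the same restart group is intended to keep them
together if ARM restarts them on another system after a system failure.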
The following are highlights of the automatic restart management services. Most can be
selected as options or overridden in the ARM policy.
Abends
The registered element will be restarted on the same OS/390 image. When registering,
IMS identifies some abend types as not restartable. In general, these are abend types for
which you would not want an automatic restart, for example:
– U0020 - IMS is modified down
– U0028 - /CHE ABDUMP
– U0624 - /SWITCH SYSTEM (XRF planned takeover)
– U0758 - message queues are full
– U0759 - message queue error
Following an abend, IMS will be restarted with AUTO=Y causing /ERE.
System failures
The above is a sampling of the capabilities and options you have with ARM. There are several
other parameters you can specify in the ARM policy which have not been identified here.
Refer to OS/390 documentation for a complete description of ARM and the ARM policy and
parameters (see z/OS MVS Programming: Sysplex Services Guide, SA22-7617).
Shared data resides in one of three structure types in the Coupling Facility — cache
structures, lock structures, and list structures. XES provides a set of services to connect to
and define the attributes of these structures (connection services), and another set of
services unique to each structure type (cache services, lock services, and list services).
Connectors use these services to invoke the functions supported by that structure type.
There are three types of structures that can be allocated in a Coupling Facility - cache
structures, lock structures, and list structures. Each type is designed to support certain data
sharing functions and has its own set of XES services. Programs such as IMS, which want to
access a structure and invoke these XES services for data sharing, connect to it using XES
connection services. After connecting to the structure, the program then accesses data in the
structure using XES cache, lock, or list services.
The directory also identifies which connectors (IMSs) have that block in their local buffer pool,
and a vector index relating the directory entry to a buffer in the local buffer pool. When one
connector updates a block of data on DASD, the CFCC uses the vector index in the directory
entry to invalidate that buffer in other connectors having the same block in their buffer pools.
Note that all cache structures have directory entries, and all directory entries perform the
same functions.
Figure 2-3 is an illustration of a directory-only IMS cache structure (for example, the VSAM
cache structure). In this example, BLK-A is in IMSX’s buffer pool in a buffer represented by
vector-index-1 in IMSX’s local cache vector. BLK-A is also in IMSY’s buffer pool in a buffer
represented by vector-index-4 in IMSY’s local cache vector. The bits in the local cache vector
were assigned to the buffers by each IMS when it connected to the structure.
IMS uses this type of structure for all shared VSAM databases, and has the option to use it for
OSAM.
Figure 2-4 illustrates a store-through cache structure. Note that not all directory entries have
data entries. This may happen when IMS reads a block of data into a buffer but does not write
it to the structure. OSAM has an option, for example, to not write data to the OSAM cache
structure unless it has been changed. If it is changed, OSAM will write it first to DASD and
then to the structure.
Although there is only one OSAM cache structure, whether or not to cache data in that
structure is an option set at the subpool level. When this type of structure is used but not all
OSAM data is cached, the blocks that are not cached still have directory entries (because they are
in buffers) but no data entries.
IMS uses these structures for OSAM. Caching is optional by subpool; not all OSAM buffer pools need
to have the same caching option, which is specified on the IOBF statement.
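As an illustration, the caching option is the last positional value on the IOBF= statement (in the
DFSVSMxx PROCLIB member, or DFSVSAMP for batch). The subpool IDs and values below are a sketch only;
verify the positional parameters against the IOBF= description for your IMS level:

IOBF=(4096,100,N,N,OSM1,C)   4K subpool OSM1: cache changed blocks only
IOBF=(8192,50,N,N,OSM2,N)    8K subpool OSM2: no Coupling Facility caching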
IMS uses store-in cache structures only for shared VSO AREAs in a data entry database
(DEDB). Store-in cache structures can offer a very significant performance benefit over
store-through structures because there is no I/O. If a shared VSO CI is updated many times
between two IMS system checkpoints, then with a store-through structure (such as OSAM),
that block or CI is written to DASD each time it is committed — perhaps hundreds or
thousands of times between system checkpoints. If it is in a store-in cache structure (such as
shared VSO), then it gets written to the structure each time it is committed, but only gets
written to DASD once per system checkpoint, saving perhaps hundreds or thousands of
DASD I/Os.
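For reference, the structures used by a shared VSO area are named when the area is registered with
DBRC. A hedged sketch follows; the database, area, and structure names are placeholders, and CFSTR2
is needed only if you want Fast Path dual (two copies of the) VSO structures:

INIT.DBDS DBD(DEDBX01) AREA(AREA01) VSO PRELOAD CFSTR1(DEDBX01VSO1) CFSTR2(DEDBX01VSO2)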
WRITE_DATA request
IMS issues a WRITE_DATA request to write a block of data to a store-through or store-in
cache structure. If IMS has updated that block of data, then the request will contain an
indicator that the data has been changed. If any other IMS has registered interest in the data
item, its buffer will be invalidated by flipping a bit in the local cache vector. Which bit to flip is
determined by looking at the directory entry. A signal is then sent to the processor to flip that
bit in the local cache vector, indicating the buffer is invalid. The IMS which had its buffer
invalidated is removed from the directory entry. Note that this IMS does not know its buffer
has been invalidated until it tests the bit setting in the local cache vector on a refetch
(explained below).
CROSS_INVALidate request
When IMS updates a block of data on DASD and does not write it to the structure (for
example, if it is a directory-only structure), then of course there is no WRITE_DATA request to
invalidate the buffers. In this case, IMS will issue a CROSS_INVALidate request to tell the
CFCC to invalidate any other IMS’s buffer with registered interest in the data item.
TESTLOCALCACHE request
Before any IMS uses data in any buffer, it must first test to determine if the contents of that
buffer are still valid. To do this, it must test the bit in the local cache vector to see if some other
IMS has cross-invalidated it. This is a TESTLOCALCACHE request and must be issued
before each access to the buffer. If a buffer is found to be invalid, then that IMS must
re-register interest in the data item and, if not returned on the READ_DATA request, then
reread it from DASD.
Figure 2-6 and Figure 2-7 are examples of IMS using cache services to read and update
data.
1. A program in IMSX issues a get call to ROOTA1 which is in BLK-A. IMSX issues a
READ_DATA request to register interest in BLK-A and provides the vector index into the
local cache vector. A directory entry for BLK-A is created in the cache structure.
2. IMSX reads BLK-A from DASD.
3. IMSX writes BLK-A to the structure using a WRITE_DATA request, indicating that the data is
not changed.
4. A program in IMSY issues a call to ROOTA2, which is also in BLK-A. IMSY must register
interest in BLK-A using a READ_DATA request. This will update the directory entry for
BLK-A to indicate IMSY’s interest and its vector index.
5. If BLK-A is in the structure when IMSY issues the READ_DATA request, it will be moved to
IMSY’s buffer. If it is not in the structure, then IMSY would read it from DASD.
Figure 2-7 illustrates the update: after the data is written to DASD, it is written to the structure,
and the write operation invokes buffer invalidation in the other IMS.
1. If the program on IMSX updates ROOTA1, IMSX will first write it to DASD.
2. IMSX will then issue a WRITE_DATA request with the changed data indicator on.
3. The CFCC will send a signal to IMSY’s processor to flip the bit pointed to by IMSY’s vector
index in the directory entry and remove IMSY’s interest from the directory entry.
4. If IMSY needs to use the buffer again, it must test the validity of the buffer by issuing the
TESTLOCALCACHE request. When it finds that it is invalid, it will re-register interest in
BLK-A. At this time, since BLK-A (with the updated data) is in the structure, it will be read
into IMSY’s buffer.
If this were a store-in cache structure (shared VSO), the only difference would be that step-1
would be skipped. Updated blocks in the structure would be written to DASD later (at IMS
system checkpoint time) using cast-out processing (another function of XES cache services).
When the cache structure also contains data, then enough space must be allocated to hold
as much data as the user wants. Space for data entries is used just like space in a buffer pool
- the least recently used data entry will be purged to make room for a new one. This is
unavoidable, unless the database being cached is very small and will fit entirely within the
cache structure, so like a buffer pool, the sizing of the structure will affect its performance.
IMS uses the IRLM to manage locks in a block level data sharing environment. The IRLM
uses XES lock services to manage the locks globally. It is the IRLM, not IMS, that connects to
the lock structure. In the context of the definition of the components of a lock structure, the
connected user is an IRLM and the resource is an IMS database resource (for example, a
record or a block).
The lock table consists of an array of lock table entries. The number and size of the lock table
entries is determined by the first connector at connection time. Each lock table entry identifies
connectors (IRLMs) with one or more clients (IMSs) that have an interest in a resource which
has “hashed” to that entry in the lock table. The entry also identifies a contention manager, or
global lock manager, for resources hashing to that entry. The purpose of the lock table is to
provide an efficient way to tell whether or not there is a potential contention for the same
resource among the data sharing clients. If there is no potential contention, a lock request can
be granted immediately without the need to query other lock managers for each individual
lock request. If the lock table indicates that there is potential contention, then additional work
is needed to determine whether there is a conflict between the two requestors. This is the
responsibility of the global lock manager.
In the example shown in Figure 2-8, a lock request issued by an IMS system (let’s say IMS3)
to IRLM3 for RECORDA hashes to the third entry in the lock table (it has a hash value of 3).
Interest is registered in that entry and IRLM3 is set as the global lock manager or contention
manager for that entry. There has also been a lock request to IRLM1 for RECORDB, which hashes to
the same entry. This is potential lock contention, which must be resolved by the global lock
manager. It is impossible
to tell from the information in the lock table whether the two requests are compatible or not. It
is the responsibility of the global lock manager to resolve this. Also in our example, an IRLM1
client has also requested a lock for RECORDC which hashed to entry 5 in the lock table.
Since there is no other interest from any IRLM client for this entry, the lock can be granted
immediately.
We will discuss how potential lock conflicts are resolved when we discuss XES lock services.
(The figure shows the lock table, with interest registered by IRLM1, IRLM2, and IRLM3, and the
record list: a lock for a record retrieved using a PCB with an update PROCOPT goes into the record
list, and if any IRLM fails, the surviving IRLMs are informed by XCF.)
Figure 2-8 shows the record list component, or the list of record data entries showing the level
of interest for specific resources. For the IMS data sharing environment, these entries
represent record and block locks which IMS has acquired with update intent. It does not mean
only those resources which IMS has updated, but those resources which IMS might have
updated. If an IMS database record was retrieved with an update processing option
(PROCOPT), then the lock for that resource goes into this record list. These record data
entries in the record list are used only for recovery when a connector (IRLM) fails. They are
not used for granting or rejecting locks at any other time. They are not used, for example, to
resolve the potential conflict indicated by the lock table.
We will discuss how these are used when we discuss XES lock services.
When the request is made, the hash value is used as an index into the lock table. The lock
table entry will be updated to show interest by the requestor in this entry. If there is no other
interest registered in that entry, the lock can be granted immediately.
In the example started in Figure 2-9, IMS3 has issued a lock request to IRLM3 for an
exclusive lock on ROOTA. IRLM3 uses the resource name to create a hash value (HV=4) and
then submits an XES lock services OBTAIN request. The hash table entry indexed by HV=4
shows no existing interest in any resource that would hash to this entry. The entry is updated
to show interest by IRLM3 and sets IRLM3 as the global lock manager for this entry. Record
data is added to the record list with information passed on the lock request, including the
resource name and lock holder. The lock is granted.
Figure 2-10 continues the example. In this illustration, IMS1 has requested a lock on ROOTB.
The resource name hashes to the same value as ROOTA (HV=4). When the lock request is
issued to lock services for an exclusive lock on ROOTB, the lock table entry shows that there
is already some interest in this entry, but we cannot tell from this whether it is for the same
resource. IRLM3’s contention exit is driven to resolve the potential conflict. IRLM3 has kept
the information about the other lock in a data space and can determine immediately that the
requests are for two different resources and the lock can be granted. IRLM1’s interest is
updated in the lock table entry, a record data entry is created, and IRLM3 keeps track of
IRLM1’s locked resource in a data space.
Figure 2-11 illustrates what happens when an IRLM fails and surviving IRLMs are notified by
XCF.
(Figure 2-11: XCF status monitoring services inform the other members of the group of the failure,
and the surviving IRLMs read the failed IRLM's record list from the lock structure.)
Help with sizing the IRLM lock structure can be found at:
http://www.ibm.com/servers/eserver/zseries/cfsizer
Other users of list structures include XCF for signalling, JES2 checkpoint, VTAM Generic
Resources and multinode persistent sessions, and just about anything that needs to be
shared across the sysplex and which does not meet the specific requirements of cache or
lock services.
(The figure shows a list structure, with list entries (LE) and their adjunct areas (ADJ) chained
from the list headers.)
The following describes these components and how they are used.
List headers
List headers are the anchor points for lists. The first connector to a list structure
determines how many list headers the structure should have. Each list entry is chained off
one list header. There is no preset meaning to a list header. The connector determines
how the list headers are to be used. The Common Queue Server (CQS) for IMS shared
queues allocates 192 list headers in a shared queues structure. It assigns some of them
for transactions, some for messages to LTERMs, some for MSC links, and so on. Some list
headers are used by CQS itself to help manage the shared queues.
List entries
This is where the information is kept. A list entry can consist of one, two, or three
components — list entry controls, an adjunct area, and a data entry. All list entries on a list
header form a list. All list entries on a list header that have the same entry key form a
sublist.
– List entry controls
Each list entry must have a list entry control (LEC) component. The LEC contains the
identification information for the list entry, such as the list header number, the list entry
ID, and the entry key.
When an EMC is queued to an EVENTQ, the EVENTQ transitions from empty to non-empty
and the list notification vector for that connector is set causing the list transition exit specified
by the connector to be driven. It is the responsibility of the list notification exit to determine
what the event was (that is, which EMC was queued to the EVENTQ). An XES list service
request called DEQUEUE_EVENT will read the EMC on the event queue and dequeue it. The
EVENTQ is now empty again and an EMC will not be queued again to the EVENTQ until the
sublist becomes empty again and then non-empty.
Figure 2-13 and Figure 2-14 show an example of CQS monitoring of the shared queues
structures.
(Figure 2-13: the list notification vectors for CQSA and CQSB, and a shared queues structure with
sublists of messages for transactions W, X, and Y.)
Note that, in the above example, the EMC for TRANY is not queued to the EVENTQ. This is
because it only occurs when the TRANY sublist transitions from empty to non-empty. When
the first TRANY arrived, this would have happened, the same as it is doing for TRANX. It will
not occur again for TRANY unless all the messages for TRANY are read and dequeued by
IMS and the sublist becomes empty and then non-empty again.
Note also that, although there are TRANW messages on the queue, no CQS has registered
interest so there is no EMC for TRANW. When (if) any CQS does register interest, an EMC
will be created and immediately queued to the EVENTQ.
(Figure 2-14: the message sublists, the EMCs registered by CQSA, CQSB, and CQSC for TRANX, TRANY,
and TRANZ, and the event queues to which the EMCs for TRANX have been queued for CQSA and CQSC.)
The enqueuing of the EMC to CQSA’s EVENTQ causes CQSA’s list notification vector to be
set, which in turn causes the list transition exit to be driven. All CQS knows at this time is that
there is something on the queue, but does not know what. The exit will invoke list services to
read the EMC and dequeue it from the EVENTQ. Now that CQS knows what was on the
queue, it can notify IMSA. IMSA will, eventually, read the transaction from the shared queues,
process it, and delete it. A similar sequence of events will occur for CQSC and IMSC, except
that this time, when IMSC tries to read the message, it will already be gone (unless another
arrived in the meantime). This is called a false schedule in IMS and incurs some wasted
overhead. However, this technique is self-tuning as workload increases. The shared queues
planning chapters in this book will discuss this and suggest ways to minimize the occurrence
of false schedules.
CQS uses all of these services in support of IMS shared queues and the Resource Manager.
One or more failed-persistent connections will occur if the structure (for example, an IMS shared
queues structure) is defined with persistent connections and a connector (for example, CQS) fails.
To delete the structure, you must first delete the failed-persistent connections. This can be done
by entering the following commands:
SETXCF FORCE,CONNECTION,STRNAME=structure-name,CONNAME=connection-name (or CONNAME=ALL)
SETXCF FORCE,STRUCTURE,STRNAME=structure-name
In an IMS shared queues environment, the shared queues structures and their connections
are persistent. Even though all of the IMSs and CQSs are down, the message queue
structures will persist. If you want to get rid of them, the equivalent, for example, of an IMS
cold start without shared queues, you must delete the structures.
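As an illustration, assuming a shared queues primary structure named QMSGIMS01 (a placeholder name),
you could display the structure to confirm that its remaining connections are failed-persistent,
force the connections, and then force the structure:

D XCF,STR,STRNAME=QMSGIMS01
SETXCF FORCE,CONNECTION,STRNAME=QMSGIMS01,CONNAME=ALL
SETXCF FORCE,STRUCTURE,STRNAME=QMSGIMS01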
User-managed rebuild
There must be at least one active connector to the structure and all connectors must agree to
coordinate the rebuild process. Briefly, this means that the connectors coordinate to quiesce
activity, to connect to the new instance, to propagate data to the new structure instance, and
then to signal completion, at which time the original instance is deallocated (deleted).
User-managed rebuild can be used to move a structure from one Coupling Facility to another,
or to recover from a loss of connectivity. It may be initiated by one of the connectors, or by
operator command:
SETXCF START,REBUILD,STRNAME=structure-name<,LOC=OTHER>
Specifying LOC=OTHER causes the structure to be rebuilt on another Coupling Facility and
requires at least two Coupling Facilities to be specified in the PREFLIST in the CFRM policy.
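For illustration, the PREFLIST is part of the structure definition in the CFRM policy, which is
maintained with the IXCMIAPU utility. The fragment below is a sketch with placeholder structure and
Coupling Facility names; the CF statements and the other structure definitions are omitted:

  DATA TYPE(CFRM)
  DEFINE POLICY NAME(CFRMPOL1) REPLACE(YES)
    STRUCTURE NAME(LOCKIMS1)
              SIZE(65536)
              INITSIZE(32768)
              PREFLIST(CF01,CF02)
              REBUILDPERCENT(1)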
System-managed rebuild
System-managed rebuild may be invoked when there are no active connectors to the
structure. If there are active connectors, they only have to agree to quiesce activity until the
system has rebuilt the structure.
System-managed rebuild will not rebuild a structure due to loss of connectivity, structure
failure, or Coupling Facility failure. Such a rebuild requires an active connector and
user-managed rebuild.
Note: User-managed duplexing should not be confused with Fast Path support for dual
shared VSO structures.
Structure duplexing requires not only that the connector indicate support during connection,
but it also requires the CFRM couple data set to be formatted with SMDUPLEX, and for the
structure definition in the CFRM policy to specify DUPLEX(ALLOWED) or
DUPLEX(ENABLED). When duplexing is enabled, the structure will be duplexed
automatically by the system when the first connector allocates the structure. When duplexing
is allowed, it must be initiated, or terminated, by operator command.
Structure size can be altered only within the maximum and minimum sizes specified on the
structure definition statements in the CFRM policy (SIZE and MINSIZE). It may be initiated by
a connector or by the alter command:
SETXCF START,ALTER,STRNAME=structure-name,SIZE=new-size
When the command is used to alter the size of a lock structure, the size of the lock table itself
is not changed, only the amount of storage allocated for the record data entries.
Other structure characteristics, such as changing the entry-to-element ratio of a list structure,
can only be initiated by a connector (or by the system if autoalter is enabled). For IMS
structures, only CQS will initiate an alter, and then only to alter the size if it detects that the
structure has reached some user-specified threshold of data elements in use.
Autoalter
When autoalter is enabled, the system may initiate a structure alter based on one of two
factors:
If the system detects that the storage in use in a structure has reached a structure full
threshold specified in the policy, it may increase the size to relieve the constraint. At this
time it may also alter the entry-to-element ratio if it determines that it does not represent
the actual utilization of list entries and data elements.
If the system detects that storage in the Coupling Facility is constrained, it may decrease
the allocated storage for a structure that has a large amount of unused storage.
FULLTHRESHOLD(value) can also be specified in the CFRM policy to override the default structure
full threshold of 80%.
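The duplexing and autoalter options described above are also specified on the structure definition
in the CFRM policy. A hedged fragment with placeholder names and sizes might look like the following:

    STRUCTURE NAME(QMSGIMS01)
              SIZE(128000)
              INITSIZE(64000)
              PREFLIST(CF01,CF02)
              DUPLEX(ENABLED)
              ALLOWAUTOALT(YES)
              FULLTHRESHOLD(85)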
As a user, you decide which mode WLM runs in by using the z/OS modify command:
F WLM,MODE=GOAL - or -
F WLM,MODE=COMPAT
When running in goal mode, you must have defined your goals in one or more WLM policies
in the WLM couple data set (WLM CDS) and then activate one of these policies according to
your own objectives at the time they are activated:
VARY WLM,POLICY=xxxx (where xxxx is the policy name)
You may have a different policy for an off-shift batch workload environment than for a
prime-shift transaction oriented workload; or different policies for weekdays and weekends.
Whatever your objectives, you must define in the policy a workload management construct
consisting of:
Service policy At least one is required; may have multiples; only one active at a time
Workloads Arbitrary name used to group service classes together; for example,
IMSP (IMS production), IMSD (development), IMST (test), TSOSP
(systems programming), and so forth
Classification rules Associates unit of work (transactions, batch jobs, etc.) with service
class; for example, an IMS workload may be classified by its
subsystem type (IMS), by its subsystem ID (IMSA), by a transaction
code (TRANX), transaction class (3), userid (CIOBOB), or several
other criteria.
Service classes Associated with workload; contains service goals for that work; you
may assign importance to favor workload when resources are limited
Performance goals The desired level of service WLM uses to allocate a system resource
to a unit of work; for transactions, goal may be percentage within
response time (for example, 95% of TRANX should respond in less
than 0.30 seconds)
Running WLM in goal mode may have several side effects in the IMSplex, other than the
obvious effect of helping you allocate system resources to the important work in the system:
JES may route batch work (such as BMPs) to an OS/390 image, which is best meeting its
batch performance goals. For a BMP, this means that the IMSID= parameter in the BMP
must match the IMSGROUP parameter in DFSPBxxx, enabling the BMP to connect to
whichever IMS happens to be executing on the same image.
VTAM Generic Resources may route a logon request to that IMS, which is best meeting its
response time goals, overriding VTAM's attempt to distribute the logons evenly across the
generic resource group.
When IMS restarts, it may optionally join a VGR group using the GRSNAME specified in the
DFSDCxxx PROCLIB member. When the first IMS joins this group, VTAM dynamically
creates the group in a Coupling Facility list structure and adds this first IMS as a member in
the member list. As other IMSs join the group, they are added to the member list.
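For reference, the generic resource name is specified on the GRSNAME= parameter in DFSDCxxx; the
value below is illustrative and matches the group name used in the example that follows:

GRSNAME=IMSX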
When you log on using the generic resource name, VTAM makes a determination among the
active members of the group as to which IMS your logon request will be routed. Your LU is
then added to an affinity table in the list structure.
Figure 2-15 shows several IMSs running on several OS/390 images in the same Parallel
Sysplex. All the IMSs have joined a VGR group called IMSX. VTAM maintains a member list in
a CF list structure. As each user logs on, the LU name is added to an affinity table in the same
structure.
(Figure 2-15: the ISTGENERIC structure for generic resource group IMSX contains a member list
(IMSA on VTAM1, IMSB on VTAM2, IMSC on VTAM3, IMSD on VTAM4, and IMSE on VTAM5, each with its count
of logged-on users) and an affinity table (LU1-IMSA, LU2-IMSB, LU3-IMSC, LU4-IMSA).)
When you log on using the generic name, VTAM must decide to which IMS your logon
request should be sent. If an affinity already exists, then the logon request will be sent to that
application with which the LU already has the affinity. If no affinity exists, then there are three
possibilities included in VTAM’s decision:
1. Send the logon request to the member with the fewest logged on users.
2. Send the logon request to the member that WLM (when running in goal mode) recommends as best
meeting its performance goals.
3. Let the VTAM generic resource resolution exit select, or override the selection of, the member.
Without WLM in goal mode, and without the exit, VTAM will balance the logons across the
generic resource group.
Users such as CQS which do not have their own internal logging mechanism, but still have a
requirement to record data for later restart or recovery, or as an audit trail, may invoke the
services of the System Logger (program IXGLOGR) to connect to a log stream, write data to
that log stream, and later to retrieve data from that log stream. These log streams are
physically managed by the System Logger and implemented in a Coupling Facility list structure.
(The figure shows CQS writing log records for the MSGQ and EMHQ structures to log streams managed by
the MVS System Logger, with staging data sets, and with log data arriving from the other systems as
well.)
The log stream can be shared by multiple connectors. For example, multiple CQS address
spaces in a common shared queues environment can write to the same log stream. The
System Logger will merge the log records from these CQSs into a single, sequential log
stream. Figure 2-18 shows two CQSs writing log data to a single log stream with their log
records merged by the System Logger.
Figure 2-18 CQS merged log stream
If the log records are needed later for recovery or restart, they can be retrieved from the log
stream as necessary. CQS uses the log stream to record changes to the shared queues
structures. If a shared queues structure requires recovery, CQS will re-initialize the structure
using an SRDS and then read the log stream and apply the changes. This is similar to the
way IMS rebuilds its local message queues from a SNAPQ checkpoint and the OLDS/SLDS
message queue log records. Figure 2-19 shows how a single CQS can read log records
written to the merged log stream by all CQSs in the shared queues group in order to recover
a lost or damaged shared queues structure.
Figure 2-19 CQS reading the merged log stream
Because it is likely that a list structure cannot hold all of the data required in a log stream, as
the structure reaches a user-specified full threshold, the System Logger writes the data to
one or more offload data sets implemented as up to 168 VSAM linear data sets. When the log
data is no longer needed, the connector (for example, CQS) can delete it from the log stream.
CQS does not have to know the physical location of data in the log stream - in the structure or
on an offload data set. Just issuing requests for the log data will retrieve it from wherever it is
located.
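As an illustration, the logger structure and the log stream are defined in the LOGR policy, again
using IXCMIAPU; CQS is then pointed at the log stream by name in its own PROCLIB member. The names,
sizes, and thresholds below are placeholders, not recommendations:

  DATA TYPE(LOGR)
  DEFINE STRUCTURE NAME(MSGQLOG01) LOGSNUM(1)
         MAXBUFSIZE(65276) AVGBUFSIZE(4096)
  DEFINE LOGSTREAM NAME(IMSPLEX0.MSGQ.LOG)
         STRUCTNAME(MSGQLOG01)
         STG_DUPLEX(YES) DUPLEXMODE(COND)
         HIGHOFFLOAD(80) LOWOFFLOAD(20)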
Figure 2-20 shows graphically how the log stream exists partly in the logger structure (the
more recent records) and partly on the offload data sets (the older log records). When log
records are no longer needed for recovery, CQS will delete them from the log stream,
releasing the storage.
(Figure 2-20: the log stream spans the logger structure, the staging data sets, and the offload data
sets; the most recent log records are in the structure, older records have been offloaded, and the
storage for deleted records is available for reuse.)
It is not our intention to present all of the benefits of the Parallel Sysplex. These have been
documented in great detail in IBM documentation, trade magazines, trade shows, technical
conferences, and so on. We did, however, want to set the stage for why the Parallel Sysplex
environment is good for IMS.
IMS exploits many Parallel Sysplex functions, and each new release of IMS adds to that
growing list. Each of these functions requires the user to execute a plan for implementing the
IMSplex and operating in the IMSplex environment. The list, as of IMS Version 8, includes:
Block level data sharing
Shared queues
VTAM Generic Resources
VTAM multi-node persistent sessions
Automatic restart management
The value of a new product or feature is often difficult to quantify, especially because IMS
environments vary so widely. However, we believe that Parallel Sysplex technology is the
vehicle to carry large and not so large computer applications into the future. A look at the
history of IMS since Version 5 will show that each release adds more IMS features that are
based on Parallel Sysplex technology. There is little reason to believe that future releases of
IMS will not continue this progression into the Parallel Sysplex world — not only for capacity
and availability, but for function as well.
This volume addresses those planning considerations that must be a part of any
implementation and operational plan.
IMS in the Parallel Sysplex, Volume I: Reviewing the IMSplex Technology, SG24-6908,
contains a detailed review of block level data sharing. This volume and this chapter identify
some of the planning tasks necessary, either formally or informally, to implement block level
data sharing in your IMSplex environment.
This chapter addresses the first two activities in the planning process for data sharing:
Objectives and expectations
Functional planning
The next two activities are addressed in Chapter 7, “Putting it all together” on page 159:
Configuration planning
Security planning
Data sharing implementation and operations are addressed briefly in this chapter, but are
covered more fully in IMS in the Parallel Sysplex, Volume III: IMSplex Implementation and
Operations, SG24-6929.
Like all of the other planning chapters, the objective here is to ask questions which you must
answer when developing your plans. Each question is followed by some discussion of what
you should consider when answering the question.
Supported by the Parallel Sysplex, IMS Version 5 was the first IMS version to expand the
two-way block level data sharing limit of previous IMS releases to N-way data sharing. A
maximum of 255 IMS data sharing subsystems (and/or IMS batch jobs) can simultaneously
access and update a single set of IMS databases with full data integrity and good
performance. These IMS subsystems, with their associated IRLMs, comprise a data sharing
group. A maximum of 255 connectors to a cache structure are allowed, hence a maximum of
255 IMSs may be a part of the data sharing group. However, since each IRLM connects to a
lock structure, and lock structures allow only 32 connectors, you can have “only” 32 IRLMs in
a sysplex. Each IRLM can support multiple IMSs, batch and online, on the same LPAR.
IMS data sharing exploits the Parallel Sysplex architecture and services to improve greatly on
the performance and capabilities of the prior implementation of block level data sharing. The
components of the data sharing environment utilize the architecture and services of the
Parallel Sysplex, but must explicitly take advantage of that architecture and its services. For
example, IMS uses cache structures and XES cache services to register interest in data
items, to invoke buffer invalidation, and in some cases to cache data. The IRLM uses a lock
structure and XES lock services to manage lock requests from its IMS clients. XCF group,
signalling, and monitoring services are used to enable the IMSplex components to better
communicate and to know what is going on in the IMSplex.
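For reference, each data sharing IMS online system and batch job learns the data sharing structure
names from the CFNAMES statement in its DFSVSMxx (or DFSVSAMP) member; every member of the data
sharing group must specify the same names, and the CFIRLM name must match the lock structure name
used by the IRLMs and defined in the CFRM policy. The names below are placeholders:

CFNAMES,CFIRLM=LOCKIMS1,CFVSAM=VSAMSES01,CFOSAM=OSAMSES01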
Figure 3-1 shows the major components of a block level data sharing environment - an
integration of IMS with OS/390 software services and S/390 hardware functionality, including
Parallel Sysplex software services and hardware.
Figure 3-1 Block level data sharing components: IMS, DBRC, FDBR, and IRLM on each OS/390 image, using XCF and XES services, CTC links, Coupling Facility lock and cache structures, local cache vectors (LCV), shared databases (DBs), and the RECONs
3.2.1 Objectives
Objectives for implementing any new technology generally fall into one or more of the
following categories:
Capacity
Availability
Performance
Recoverability
Operability
Functionality
You should define, in your migration plan, what your objectives are - why you are
implementing data sharing for your IMS databases.
Planning considerations: Be sure you understand, and document, your objectives for
implementing data sharing. Why are you doing it, and how will you know when you are
successful?
3.2.2 Expectations
Expectations are quite different from objectives. You may be implementing data sharing for
availability, or capacity, but you should document your expectations for how (much) you
expect availability or capacity to improve. You should also document your expectations
relative to the other characteristics, which are not part of your objectives.
How do you expect your IMS system capacity to change? How much do you need?
This is one of the two major reasons for data sharing. When the workload becomes too
high for a single processor to handle efficiently, then a second (or third or fourth) LPAR is
added to the environment in order to distribute the workload. With more and more users
opening their applications (and databases) to the general public through the internet, the
potential is there for vastly increased transaction volumes.
Are you expecting your data to be more available than currently? Why?
While capacity may be a reason for data sharing, with the power and speed of today’s
processors, a more likely reason is the increased availability an application system gets by
virtue of executing on multiple OS/390 LPAR images. If one fails, near full availability can
be restored quickly by moving the workload from one system to another. Part of the
planning process will be to determine how to distribute the workload across the data
sharing IMSplex, and what to do in the event that something fails.
Do you expect performance to decline when data sharing is implemented? Online? Batch?
By how much? Is this acceptable? What if it is more?
You should expect that, according to some measurements, applications which share data
will not perform as well as those which do not share data. These measurements are
typically internal. For example, you may measure processor consumption after data
sharing and realize that more CPU is required in the data sharing environment. Or you
may measure I/O response time and find that with the busier activity, I/O response time is
slightly higher. You may even measure end-user response time and find that it went up
several percent. However, most well designed application systems (databases included)
continue to perform as well with data sharing as without when viewed from the perspective
of the end-user. For example, a transaction whose execution time was 100 milliseconds
without data sharing may require 110 milliseconds with data sharing. A 10% increase, but
not noticeable by the end-user. If your reason for going to data sharing is to relieve a
capacity constraint, then that end-user may actually see improved response times due to
spending less time in the input queue.
Batch or BMP jobs, however, which run for long periods of time may display a measurable
increase in elapsed time. This is especially true of a DLIBATCH job which has had
exclusive use of its databases and is now forced to compete with the online for that data,
or has been converted from DLIBATCH to BMP (with data sharing).
Capacity
Implementing data sharing for capacity reasons is not as common as it once was. With the
emergence of larger processors (the z900 2064-216 turbo 16-way can deliver over 3000
MIPS of processing power with up to 64 gigabytes of real memory), it is usually possible
for even the largest of applications to run on a single OS/390 or z/OS image. However, there
are frequently other reasons why the full power of a single image is not available to the IMS
application. The processor itself may not be configured as a single LPAR, or other work on
that LPAR prevents IMS from having as much processing power as it needs. Perhaps there is
sufficient processing power for most processing, but there are daily, weekly, monthly, or yearly
peaks during which you need additional capacity. In this case, data sharing offers a way to
utilize multiple OS/390 images on multiple LPARs to get the processing power needed. Once
the initial conversion to data sharing is accomplished, it is relatively easy to add additional
IMSs as needed for these periods of high demand, and then to remove them when the
demand slackens.
From a capacity planning perspective, you need to plan on how many IMS images you will
have and on how many LPARs they will execute. This requires knowing approximately what
your single image IMS currently requires. This should be the
requirement at its peak, not its average, since the final configuration must be able to handle
the peak load. You then need to add to that for growth (new applications, new workload,
opening it to the internet, and so forth) and then add something for data sharing. How much to
add for data sharing is really dependent on the applications themselves, but it is typically in
the range of 10-15 percent.
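As a purely illustrative example with hypothetical numbers: if your single image IMS requires
800 MIPS at its peak, you allow 25% for growth (1000 MIPS) and add 15% for data sharing
(1150 MIPS), then two IMSs on two LPARs of roughly 600 MIPS each would cover the peak,
and a third image would provide headroom for planned or unplanned outages.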
Planning considerations: Determine the number of IMS images needed to provide the
capacity required for your final configuration. Plan for multiple workloads - low, average,
peak. You may want more than you need for availability reasons, but you should
understand how many you must have.
Availability
Improved availability is often the overriding objective for implementing data sharing.
Businesses today are more dependent on computers for their day-to-day business operations
than ever before. It is rare that people can take over with a pencil and pad when the system is
down. When your IMS applications run on multiple independent LPARs, a single failure
affects only a portion of the applications’ availability.
Consider, however, that unless there is an outage, a data sharing environment is no more
available than non-data sharing. Therefore, in order to benefit from the improved availability
of data sharing, you must have solid procedures for what you will do when there is a failure.
1.2.3, “Define degraded mode environment” on page 8, identified part of the planning phase
as deciding what you will do when something fails, and documenting a plan and procedures
to handle each failure scenario. Without such a plan, you may still get some availability
benefit, but far less than data sharing can provide.
What to share?
The above paragraphs discuss your objectives and what you should plan for - availability and
capacity. However, you also need to plan for how you will configure your data sharing
environment. While it is ideal to have each IMS a clone of the other(s), enabling any work to
execute on any IMS image, technical or business requirements might dictate that this cannot
be done. For example, if some applications require resources which cannot be shared, then
those applications will have to be relegated to a single IMS. There is really only one IMS
database resource that cannot be shared - the Fast Path main storage database (MSDB).
However, IMS Version 6 provided an alternative to the MSDB that provides equivalent
function with near equivalent performance - the data entry database (DEDB) with the virtual
storage option (VSO). There may be other databases which you do not want to share,
perhaps because of performance problems related to access patterns that would be
particularly bad in a data sharing environment.
If there are some resources required by applications that cannot be shared, or that you do not
want to share, you may find yourself with a partitioned application and data sharing
environment rather than a cloned environment, or more likely, a hybrid environment where
some applications can run on any IMS (cloned) and others must run on just one (partitioned).
Applications which can run on only one (or a subset of the total) IMS are said to have an
affinity for that IMS. For example, you may have a few applications which require a database
which cannot be shared. Those applications are said to have an affinity for the system where
the database is available.
Planning considerations: Identify any database resources that cannot, or will not, be
shared. Document how you intend to manage these unshared resources, including how
you plan to route the work to the data.
Planning considerations: Identify all PSBs with PROCOPT=E PCBs and be sure you
understand the reasons. If a PSB really requires exclusive scheduling, then implement one of
the techniques described.
SERIAL transactions
This problem occurs for the same reason the PROCOPT=E problem occurs - one IMS
does not know what the other is doing. This is addressed again in the shared queues
chapter, but even without shared queues, if a transaction must truly be serially scheduled
within the IMSplex, then it must be limited to execution on a single IMS. There is no
cross-system serial scheduling.
One way to do this without shared queues would be to connect the data sharing systems
with MSC and define the transaction as local in only one IMS and as remote in all others.
This, of course, introduces an availability exposure, but no more than you had with just a
single IMS before data sharing. However, you should make sure that these transactions
really do need to be scheduled serially.
Performance
Generally speaking, applications which perform well in a non-data sharing environment will
perform well when moved to a data sharing environment. On the other hand, applications
which do not perform well in a non-sharing environment will not improve with data sharing
unless the reason for poor performance is capacity.
There are two primary causes of poor data sharing performance. They almost always go
together. That is, if you have a problem with one of them you will probably have a problem
with the other. Remember that we are talking about differences between the sharing and
non-sharing environments. Poor non-sharing performers will have the same bad
characteristics in data sharing.
Block lock contention
The main difference in locking between data sharing and non-data sharing environments
is when a database record is updated. In a non-sharing environment, a record lock is
acquired on the database record being updated, and this provides all the integrity needed.
No other program can access that same record, but other programs can access other
records in the same block. In the non-sharing case, there is only one copy of the block in
one buffer and only in that one IMS (database authorization prevents other IMSs from
accessing the same database). It is perfectly OK for multiple application programs on that
IMS to share the same buffer.
In a data sharing environment, IMS still gets a record lock on a segment which may be
updated. Other programs on other IMSs can still access other records even in the same
block. However, those other programs are not sharing the same buffer. When that other
program is running in another IMS, there is another copy of the same block in that other
IMS’s buffers. If each IMS were to update its record, and write its version of the block back
to DASD, then the update made by the last IMS to write the block would overlay the update
made by the first. To prevent this from happening, before any IMS can update a segment,
it must first acquire a block lock. This is a lock on the block or CI itself - not just on the
database record. This block lock may prevent another IMS that is trying to update a different
record in the same block from getting the lock. That other IMS would have to wait for the
update to be committed, the updated block or CI to be written to DASD, and the block lock
to be released before it could continue.
Buffer invalidation
As mentioned above, these two side effects generally go hand-in-hand. In a non-sharing
environment, the single IMS manages its own buffer pools and every write of blocks or CIs
in those buffer pools to DASD is a result of an update made by that IMS.
However, in a data sharing environment, there are multiple buffer pools in multiple IMSs
and a single block or CI could exist simultaneously in each IMS. Block locking prevents
multiple IMSs from updating different records in the same block so long as the block lock is
held. However, when the block lock is released (see description of block lock contention
above), the waiting IMS can continue. If it were to update its segment and write it out to
DASD, then that updated block would overlay the first update. This problem is resolved by
an XES service called buffer invalidation. After the write operation, but before releasing
the block lock, the first IMS will issue an XES cross-invalidate call. If the block that it has
updated is currently registered (according to the directory entry for that block in the cache
structure) to any other IMS, that other IMS’s buffer will be invalidated by setting its
assigned bit in the local cache vector (LCV) to invalid.
When that second IMS finally gets its block lock, before accessing the buffer to update it, it
first checks this LCV to see if the buffer is still valid. When it finds an invalidated buffer, it
must reread the block or CI from DASD (or, in some cases, from a Coupling Facility cache
structure).
Some examples of where block lock contention and buffer invalidation can cause problems in
data sharing are:
Small heavily updated databases
Small heavily updated databases tend to be a problem even in non-sharing environments.
It just gets worse when that database is shared. The probability that two (or more) IMSs
will try to update different records in the same block is higher than if the database were
large.
This, and the next few problem areas, are often difficult problems to address, especially for
existing applications and databases. The best approach, short of redesigning the
database and rewriting the application, is to make sure that each record is in its own block.
This may mean making the database a lot bigger than it needs to be just to hold the data,
but it will tend to spread the data out and reduce the chances of two records being in the
same block. You may also consider making the blocks or CIs smaller. This naturally tends
to put fewer records in the same block.
Database hot spots
Even though a database may be large, if update activity is concentrated in a small part of
that database, block locks could cause contention.
This is even more difficult to address. It may not be practical to just make the database a
lot bigger if it is already large. It may be practical to reduce the size of the block or CI.
HIDAM or PHIDAM databases with ascending keys and applications which insert at the
end of the database
Again, even if the database is large, inserts to the end of the database become a very hot
spot. An example of this is a HIDAM database where the key includes date and time in the
high order position. Every new record would go at the end of the data set - that is, in the
same block until it is full; then the next (new) block is allocated and used. This process
would be serialized across the IMSs.
This is an application and database design issue, and would be difficult to address with the
simple techniques described above. But this database probably already has performance
problems.
HIDAM or PHIDAM databases with no free space
This is similar to the above problem, except that ANY insert, when there is little or no free
space, will be put in the last block or CI of the data set. Since this would be the same block
for all sharing IMSs, there will be block lock contention for that overflow block.
The solution to this problem is to make sure that your database has plenty of free space
for inserts.
HDAM or PHDAM databases without enough free space
The common thread in each of the above is that the same block is being frequently updated by
multiple IMSs. This causes no problems in non-sharing environments (it might even be a
benefit) but it can be a severe problem when data sharing. For some of the above, if the
suggested techniques do not work, you may have to declare the database unshareable and
route all applications needing that database to the one IMS system where it is available. That
is, partition the applications which need these databases. This reduces some of the
availability benefits of data sharing, and it may not be practical if the database is used by a
large number of application programs.
For the free space issue, if your databases are already quite large, and adding more free
space will give you data sets larger than allowed, or larger than you want to support, you
might consider converting HDAM and HIDAM databases to the HALDB PHDAM and PHIDAM
databases (only available beginning with IMS Version 7). HALDB is a partitioned database
allowing you to define up to 1001 partitions for your data, allowing each data set to be smaller,
with plenty of free space.
Processing overhead
In addition to the potential problems caused by database design and application program
access patterns, there is also some additional overhead just in processing shared databases.
This overhead is the result of several requirements for data sharing that do not exist when
data is not shared.
IRLM versus program isolation
How much overhead each of the above causes depends on the application and database
design. Not all calls require locks, not all calls require I/O, and not all calls result in
cross-invalidation. An application which issues lots of GNP calls to databases which have lots
of dependent segments does not require as many locks as applications which issue lots of GU
calls to root segments. Applications which do not do a lot of read I/O do not have to register
as often with the cache structure. Applications which do not update much do not do as many
cross-invalidate calls. All of this activity is measured in terms of additional CPU utilization.
Even the requests to the Coupling Facility are synchronous requests, meaning that the
processor engine running the TCB remains busy until the request completes.
But it is reasonable to assume that the sum of all this overhead would require 10-15% more
CPU than an equivalent workload running on a single IMS without any data sharing
overhead. This type of overhead seldom leads to any perceived increase in response time or
elapsed time since it does not involve I/O - the major component of all transactions and batch
or BMP applications.
Because shared databases are allocated to and updated concurrently by multiple IMSs,
database backup and recovery procedures will be different. There are several considerations.
Image copies
There are two types of image copies you can take of IMS databases (three if you count
“user image copies”, which are not recorded in the RECONs). Together with log data, they
can be used by the recovery utilities or the IMS Online Recovery Service for z/OS (ORS)
product to recover databases.
Clean (non-concurrent) image copies
A clean image copy requires the database to not be allocated for update to any IMS while
the image copy is being taken. For batch IMS, you simply have to wait until the batch job
terminates. For online systems, this generally means issuing the /DBR command. In a
data sharing environment, the /DBR command must be entered from every online IMS
where the database is allocated.
Alternatively, a /DBR DB xxxx GLOBAL command can be issued from one IMS, which
notifies other IMSs in the data sharing group to DBR the database locally. The GLOBAL
option will turn on the “prohibit further authorization” flag in the RECONs for that database,
which would prevent even image copy jobs from getting authorization. When issuing this
command for purposes of offline image copy, you should include the NOPFA parameter on
the command as follows:
/DBR DB xxxxx GLOBAL NOPFA
With NOPFA, the command does NOT turn on this flag, allowing any requestor, including
image copy jobs and other IMSs, to get authorization.
This command (/DBR) establishes a recovery point from which, or to which, a database
can be recovered using standard IMS utilities or tools. The recovery point is a point of
consistency at which DBRC knows there are no uncommitted updates on the database,
and therefore, none in the image copy which might need to be backed out. Most users take
clean image copies on a regular basis either to speed up any required recovery process,
or to establish a recovery point at some specific time to which the database can be
recovered.
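As a sketch, the sequence for a clean image copy of a shared database might look like the
following, where xxxxx is a placeholder database name and the image copy JCL would
typically be generated with GENJCL.IC:
/DBR DB xxxxx GLOBAL NOPFA
(run the Database Image Copy or Image Copy 2 utility against xxxxx)
/STA DB xxxxx GLOBAL
The /STA DB command with the GLOBAL keyword makes the database available again to all
of the sharing IMSs when the image copy completes.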
Fuzzy (concurrent) image copies
Fuzzy image copies, also called concurrent image copies, do not require a recovery point.
The database can be allocated for update to multiple IMSs during the image copy. DBRC
also tracks these image copies and knows what logs are required to perform a recovery
from the time of the fuzzy image copy. The obvious benefit is that the database does not
have to be taken offline. The disadvantage is that it does not establish a recovery point to
which the database can be recovered using standard IMS utilities. It also does not allow
the use, by the recovery utility, of change accumulation data sets. Many users also take
fuzzy image copies between clean image copies to speed up recovery without having to
take the database offline to establish a recovery point.
Planning considerations: Review your current image copy procedures. You may discover
that fuzzy image copies can be substituted for clean image copies (at least partially) and
improve database availability. By using a tool such as IMS Online Recovery Service for
z/OS (ORS), you no longer have to DBR the database for a clean image copy to establish
a recovery point. When you do need a clean image copy, the IMS Image Copy 2 utility,
along with supporting hardware, can reduce the duration of an outage.
In a data sharing environment, because the recovery utilities cannot process multiple log
streams as input, a change accumulation of all the logs from all the sharing IMSs is required
for recovery. If a database problem occurs requiring full recovery, the database would have to
be DBRed from each IMS, and each batch job terminated, and change accumulation run one
final time to capture the latest updates up to the time the database was deallocated by each
IMS.
Database recovery
Each IMS (batch and online) which updates a shared database will generate its own set of
logs recording those updates. These logs will probably overlap. When this is the case, the
recovery process (for example, the database recovery utility) cannot use the logs directly.
They must first go through the change accumulation process which merges all the updates
and produces an output file that has only the latest updates from all the sharing IMSs. In this
case, uncommitted updates may also be “recovered” since they were on the change
accumulation data set. They would have to be backed out either by online dynamic backout,
and/or by running the Batch Backout Utility against each active PSB (online or batch).
It is usually faster to recover databases, even in non-sharing environments, if you first run
change accumulation. In data sharing, it is a requirement and may require a change in your
recovery procedures.
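As a hedged sketch, the following DBRC commands (run through the DBRC command
utility, DSPURX00) define a hypothetical change accumulation group and generate change
accumulation JCL; the group, database, and DD names are placeholders, and GENJCL.CA
also requires the usual skeletal JCL setup:
INIT.CAGRP GRPNAME(CAGRP01) GRPMAX(5) REUSE MEMBERS((DBXYZ01,DBXYZ1A))
GENJCL.CA GRPNAME(CAGRP01)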
Planning considerations: If you do not already have ORS, visit the Web site to determine
whether it can help in your environment.
There are two issues dealing with the failure of a data sharing IMS. One is the recovery (or
restart) of the IMS itself. The other is the redirection of the workload to an IMSplex member
that is still available. The following identifies what the impact of that failure is on the end-user,
how one might recover from that failure, and if the work can be redirected, how to do it.
But in a data sharing environment, there are other reasons to restart IMS as quickly as
possible.
One reason is to release any locks that are held by the IRLM on behalf of the failed IMS.
Any user on another still active IMS who requests one of these locks will get a lock reject condition
and abend with a U3303.
A second is to release the database authorizations still held by the failed IMS. No batch job
(or utility) can get authorization to a database if it is authorized to a failed subsystem, even
with data sharing enabled.
Planning considerations: Identify requirements for FDBR regions to track each active
IMS. Determine which LPAR each FDBR will run on (it should be on a different LPAR than
the IMS which it is tracking). It is OK to have both FDBR and ARM since FDBR will
complete its work before ARM can restart IMS.
Planning considerations: ARM should be active for all IMSplex address spaces which
support it. If you do not currently use it, you should learn about it and seriously consider
activating it. Review, or create, an ARM policy for all of the IMSplex components which can
register with ARM. Identify, and define, ARM restart groups as necessary to keep IMS and
CICS or DB2 systems together.
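The following is a minimal sketch of an ARM policy defined with the IXCMIAPU
administrative data utility. The policy, restart group, and element names are placeholders;
the element names actually used by IMS, CICS, and DB2 depend on the products and your
installation:
//ARMPOL   EXEC PGM=IXCMIAPU
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DATA TYPE(ARM)
  DEFINE POLICY NAME(ARMPOL01) REPLACE(YES)
    RESTART_GROUP(IMSDB2A)
      TARGET_SYSTEM(*)
      ELEMENT(IMS1ELEM)
        RESTART_ATTEMPTS(3)
        TERMTYPE(ELEMTERM)
      ELEMENT(DB2AELEM)
/*
Keeping the IMS element and its related DB2 or CICS elements in the same
RESTART_GROUP ensures that ARM restarts them together on the same system.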
Without DASD logging, or in the case of a system abend or non-pseudo abend, none of this
will happen. The databases will not be backed out, the locks will be retained, and the
databases will remain authorized to the batch job (in the SUBSYS record). Batch backout will
be required to do these functions which will be delayed much longer than any dynamic
backout would require.
Planning considerations: If you are sharing data with your batch applications, you should
consider using DASD logging to enable dynamic backout. This will require JCL changes
and procedures for archiving the batch SLDSs to tape.
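A minimal sketch of such a batch job, assuming the DLIBATCH procedure and placeholder
program, PSB, IRLM, and data set names; BKO=Y requests dynamic backout, which is
honored only when the log (IEFRDER) is on DASD:
//BATCH    EXEC DLIBATCH,MBR=MYPGM,PSB=MYPSB,DBRC=Y,
//         IRLM=Y,IRLMNM=IRLM1,BKO=Y
//IEFRDER  DD DSN=IMSP.MYPGM.SLDS(+1),DISP=(NEW,CATLG),
//         UNIT=SYSDA,SPACE=(CYL,(100,100),RLSE)
A subsequent step or job would then archive these batch SLDS data sets to tape.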
IRLM
When an IRLM fails, its data sharing clients can no longer request (or release) locks until that
IRLM is restarted. Any locks that were protecting updates, or potential updates, will have
been recorded in a record list entry in the lock structure. Other IRLMs in the data sharing
group will be notified about the failed IRLM by XCF and will read the failed IRLM’s record list
from the lock structure into a data space. These are referred to as retained locks. Any lock
request for a retained lock will be rejected and the requesting application program will abend
with a U3303, including BMPs and batch data sharing jobs.
Currently executing IMS online transactions and BMPs, as well as batch jobs with DASD
logging, will abend (U3303) and be dynamically backed out (we do not need the IRLM to do
this). Online transactions will be requeued (suspend queue). BMPs and batch jobs will have
to be restarted from the most recent extended checkpoint. If a batch job is not backed out
dynamically, then batch backout will have to be run.
The online IMS will issue a message indicating that it is quiescing and stop all activity. Note
that this includes not only shared databases, but unshared databases as well since the IRLM
is managing those locks too.
DFS2011I IRLM FAILURE - IMS QUIESCING
For full function databases, after IMS reconnects to the restarted IRLM, processing can begin
again. Full function databases will be available since they were not stopped, but this is not
true for the stopped DEDB areas which must be restarted by issuing the /START AREA
command for each area.
Lock structure
When a lock structure fails, the IRLMs are notified by XCF and immediately begin the rebuild
process. Each IRLM knows all of the locks held by its client IMSs and can use this information
to completely rebuild the structure on another Coupling Facility. This, of course, requires that
the lock structure be defined in the CFRM policy with multiple CFs in the preference list
(PREFLIST). All updating IMS batch data sharing jobs will abend. They may be dynamically
backed out if DASD logging was in use, or by using the IMS Batch Backout Utility. Online
systems continue processing. Applications requesting locks will just wait for the lock structure
to be rebuilt and for lock services to resume.
When connectivity from an IRLM to a lock structure fails, assuming that the
REBUILDPERCENT in the CFRM policy allows it, and that you have identified more than one
CF in the PREFLIST, the structure will be rebuilt on another candidate Coupling Facility after
which “disconnected” IRLMs continue processing. When this happens, there are no abends,
even for batch jobs. Their lock requests simply wait for the rebuild to complete.
Planning considerations: The IRLMs can rebuild a lock structure only if there is an
available CF to rebuild it on. Be sure that when you define your lock structure in the CFRM
policy, you define at least two CFs in the PREFLIST.
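A minimal sketch of the relevant part of a CFRM policy, defined with the IXCMIAPU utility;
the structure name, sizes, and CF names are placeholders and must match your IRLM and
Coupling Facility definitions:
  DATA TYPE(CFRM)
  DEFINE POLICY NAME(CFRMPOL1) REPLACE(YES)
    STRUCTURE NAME(IRLMLOCK1)
      SIZE(32768)
      INITSIZE(16384)
      REBUILDPERCENT(1)
      PREFLIST(CF01,CF02)
With two CFs in the PREFLIST and a low REBUILDPERCENT, a structure failure or loss of
connectivity allows the IRLMs to rebuild the lock structure on the alternate CF.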
The point here is that there are no user requirements for addressing the loss of an OSAM or
VSAM cache structure. You will get messages indicating their loss, and that data sharing has
continued after the structures were rebuilt.
Planning considerations: Like the lock structure, IMS can only rebuild these structures if
there is an available CF on which to rebuild them. Make sure your OSAM and VSAM cache
structure definitions have multiple CFs in their CFRM policy definitions.
Planning considerations: Although CF and structure failures are rare, the impact of a
failed shared VSO structure is high, requiring a full database recovery. Your plans should
include either Fast Path managed duplexing of the VSO structures, or, when available,
system managed duplexing of these structures.
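For Fast Path managed duplexing of a shared VSO area, both structure names are
registered with DBRC. A hedged sketch, with placeholder DBD, area, and structure names:
CHANGE.DBDS DBD(DEDBX01) AREA(AREA01) VSO PREOPEN CFSTR1(VSODEDB01A) CFSTR2(VSODEDB01B)
Defining both CFSTR1 and CFSTR2 causes IMS to maintain two copies of the area’s VSO
data in the Coupling Facility, so the loss of one structure does not force a full database
recovery.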
Online transactions
In a data sharing environment without shared queues, each online user is logged on to one
specific IMS and that IMS must process all of that user’s transactions. When that IMS fails,
you are no longer able to continue working. You have two choices.
Wait until IMS is restarted, log back on, and continue working. Without shared queues and
the sysplex terminal management (STM) function of IMS Version 8 running with the
Common Service Layer (CSL), this may be your only choice in order to keep some
significant status, such as continuing an IMS conversation. If you were to log on to another
IMS, even though the application and data is available on that other IMS, information
necessary to continue the conversation would not be available. If you logged on to the
original IMS later, that IMS would reestablish your conversation as it was at the time of
failure (could be very confusing).
Log on immediately to another IMS. This restores service to the user most quickly, but
without shared queues and STM, any significant status that you had at the time of the
failure would be lost. Note that if you are using the Rapid Network Reconnect (RNR)
function made available with IMS Version 7, a user remains logged on to IMS even after it
fails. Logging on to another IMS may not be an option. For this reason, RNR is not
recommended for environments where CSL and STM are active.
From a planning perspective, your users must know how to react when IMS fails. They must
know how long the failed IMS is likely to be down, and be able to evaluate the possible loss of
significant status versus the need to get back into service quickly. Each user may make these
decisions based on different criteria, and you may want some users to log on immediately to
another IMS while others should wait. This goes back to the identification of workload
priorities.
BMPs
When IMS fails, so do its BMPs. There are several issues to consider in your planning here:
One is that any inflight work (uncommitted updates) of the BMPs must be backed out
before the BMP can be restarted on ANY IMS. This can be done either by emergency
restarting the failed IMS, or by having a Fast Database Recovery (FDBR) region tracking
that IMS. When IMS fails, FDBR will dynamically back out all inflight units of work (UOWs)
that were executing on the failed IMS and release the IRLM locks held on behalf of the
failed IMS (this is always good, even if you do not intend to restart any work on another
IMS).
The second issue is that, to restart the BMP, the extended checkpoint log records must be
available. If the BMP is restarted on another IMS, that IMS will not have the BMP
checkpoint ID table (used for CHKPTID=LAST restart) nor the extended checkpoint log
records (x’18’) anywhere on its OLDS or SLDS. In this case, if it is really important to
restart that BMP before the failed IMS is restarted, you must provide the restart checkpoint
ID on the BMP restart JCL, and provide the x’18’ log records through the //IMSLOGR data
set. You must also be sure the BMP connects to the correct IMS. This can be done either
by specifying the IMSID on the restart JCL, or by using the IMSGROUP parameter in each
online IMS and having each BMP specify this IMSGROUP name as its IMSID.
As with the transaction workload, it is important that for each BMP, you know how
important it is to redirect the work to a surviving IMS.
Planning considerations: Determine whether you will restart your BMPs on the same
IMS after a failure, or on any available IMS. If any available IMS, then you need
procedures for making the BMP’s extended checkpoint records available for restart.
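A hedged sketch of restarting a BMP on a surviving IMS, using the IMSBATCH procedure
with placeholder names; the checkpoint ID is hypothetical, and //IMSLOGR points at the
failed IMS’s SLDS containing the BMP’s x’18’ extended checkpoint records:
//BMPRST   EXEC IMSBATCH,MBR=MYBMP,PSB=MYBMPPSB,
//         CKPTID=CHKP0005,IMSID=IMSG
//IMSLOGR  DD DSN=IMSP.IMS1.SLDS.G0123V00,DISP=SHR
If IMSGROUP=IMSG is specified in the startup parameters of each online IMS, the BMP
connects to whichever member of the group is running on its LPAR; otherwise, code the
specific IMSID of the IMS on which the BMP is to run.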
IMS in the Parallel Sysplex, Volume I: Reviewing the IMSplex Technology, SG24-6908,
contains a detailed review of IMS shared queues support. This volume and this chapter
identify some of the planning tasks necessary, either formally or informally, to implement
shared queues in your IMSplex environment.
This chapter addresses the first two activities in the planning process for shared queues:
Objectives and expectations
Functional planning
The next two activities are addressed in Chapter 7, “Putting it all together” on page 159:
Configuration planning
Security planning
Shared queues implementation and operations are addressed briefly in this chapter, but are
covered more fully in IMS in the Parallel Sysplex, Volume III: IMSplex Implementation and
Operations, SG24-6929.
Like all of the other planning chapters, the objective here is to ask questions or raise issues
which you must address when developing your plans. Each question or issue is followed by
some discussion of what you should consider when developing your migration plan.
For BMPs, and batch IMS, “distributing” the workload was fairly straightforward - just let JES
schedule it wherever it wanted, or direct it to a specific OS/390 (MVS) platform in any number
of ways. However, for transaction-driven applications, it was left to the user to determine how to
distribute the transaction workload. Each IMS has its own set of private message queues, and a
message is queued on whichever IMS the user is logged on to. Once a message is queued to one IMS, it is no longer
available to another, even if the first one fails, or if the first one is too busy to process it in a
timely manner.
Several techniques were available, some of them not particularly user friendly:
Tell each end-user which IMS to log on to. This requires the user to know the specific
APPLIDs of the IMSs in the IMSplex and to know which ones are available in case their
assigned IMS is not available. Once logged on to a particular IMS, all work must be done
by that IMS.
Use MSC to send some transactions to other “remote” IMSs within the same data sharing
group based on the transaction code. This is just a form of partitioning the IMSplex and
does not provide much benefit for availability. Each IMS can process only those
transactions defined as local.
Use VTAM Generic Resources (VGR) to allow the user to log on to a generic IMS APPLID
and then let VGR distribute the workload. This shields the users from having to know the
real IMS APPLID or which ones are currently available, but once a user is logged on to a
particular IMS, all work must be done by that IMS.
Use the IMS Workload Router product, which uses MSC and a user-specified algorithm to
“spread the workload” around the data sharing group. This technique works fairly well at
balancing the workload, but it does add the overhead of MSC to any “distributed”
transactions.
Use some external (to IMS) process, such as a session manager, to distribute the logons
across multiple IMSs. Like the others, this process makes a decision about which IMS to
establish a connection with and then sends the work to that IMS regardless of how busy
that IMS may be. Some of these techniques may do some form of load balancing.
The list could go on with other non-IMS products which can be used to distribute the
connections, but each one employs a “push-down” technique of workload distribution - send
the work to one particular IMS regardless of how busy that IMS is. Chapter 6, “Planning for
IMSplex connectivity” on page 131 addresses some of these techniques in more detail. One
of the benefits of shared queues is that it uses a “pull-down” technique of workload
distribution. The messages wait on a queue until an IMS is ready to schedule them. Whichever
one is available first gets the work. If one IMS is too busy, including the one where the user
was logged on and entered the transaction, then another can process the transaction.
Sysplex terminal management, a capability provided with the Common Service Layer in IMS
Version 8, requires shared queues and uses them to be able to shift a user’s status from one
IMS to another when IMS fails. Sysplex terminal management is described in IMS in the
Parallel Sysplex, Volume I: Reviewing the IMSplex Technology, SG24-6908, and in Chapter 5,
“Planning for the Common Service Layer” on page 105 of this book.
By now we know that IMS data sharing uses XES cache services with OSAM, VSAM, and
shared DEDB VSO cache structures, and the IRLM uses XES lock services with a lock
structure. Shared queues also uses sysplex services for the shared queues themselves, and
for logging updates to the shared queues. Both of these use XES list services with list
structures. Figure 4-1 shows the components of a shared queues environment with two IMSs
sharing the same message queues. As a reminder, following this figure is a brief description
of the components, but for a more detailed description, refer to IMS in the Parallel Sysplex,
Volume I: Reviewing the IMSplex Technology, SG24-6908.
Figure 4-1 Shared queues environment: two IMSs with their loggers, logger data spaces, staging and offload data sets, OLDS, and Coupling Facility list structures for MSGQ and EMHQ with their associated CHKPT and SRDS1 data sets
IMS supports two types of messages - those we call full function messages, and those we call
Fast Path expedited message handler (EMH) messages. Fast Path messages are stored in a
different set of buffers, and queued off different queue header control blocks. In a shared
queues environment, these message types are also treated separately with separate
structures.
IMS messages
Messages can be received by IMS from multiple sources, including a VTAM network, a
TCP/IP network, or from an application program running in IMS and inserting the message to
an alternate TP-PCB. When IMS receives a message from the network, it performs a process
called find destination (FINDDEST), which examines the beginning of the message for the
destination name. FINDDEST is also invoked when an application program issues a CHNG or
ISRT call to place a message on the message queue.
Handling of messages by IMS depends on the message destination type, and in some cases
on the message source. Shared queues affects the way IMS handles these messages,
including those with an unknown destination, in ways other than just putting them on the
shared queues. We will discuss the handling of these messages in 4.3 “Functional planning”
on page 86.
Coupling Facility
The Coupling Facility, not to be confused with XCF (cross-system coupling facility software
services), is hardware connected through Coupling Facility links to each processor in the
Parallel Sysplex. Users, such as the Common Queue Server, can use XES services to store
data in the Coupling Facility in blocks of CF memory called structures. In the IMS shared
queues environment, CQS uses list structures to store input and output messages. These
structures can be accessed by other CQSs, allowing them to share the message queues.
The list notification process used with these structures is somewhat similar to the buffer
invalidation process, where a bit in the local cache vector is turned on to indicate that a buffer
is invalid.
4.2.1 Objectives
Objectives for implementing any new technology generally fall into one or more of the
following categories:
Capacity
Performance
Availability
Recoverability
Operability
Functionality
You should define, in your migration plan, what your objectives are and why you are
implementing shared queues.
4.2.2 Expectations
Expectations are quite different from objectives. You may be implementing shared queues for
its functionality, availability, capacity, or recoverability, but you should document your
expectations for how (much) you expect these to improve, or degrade. You should also
document your expectations relative to the other characteristics that are not part of your
objectives.
How do you expect your IMS system capacity to change?
You should not expect the overall capacity of a data sharing environment with shared
queues to change as a result of implementing shared queues. The primary reason for a
capacity increase is because of the data sharing and the availability of multiple IMSs to
process the total workload. Shared queues just makes it easier to realize that capacity
improvement by distributing the workload to the first IMS that can process it. Shared
queues can, however, provide capacity relief for a temporarily over-utilized IMS by allowing
some of the workload to be processed by another less busy IMS.
Are you expecting your IMSplex to be more available than currently? Why?
Because messages on the shared queues are available to all IMSs from the shared queues
structures, the IMSplex itself becomes more available. You should expect that the
end-user’s perception of availability is greatly increased even though some individual IMSs
may be unavailable due to planned or unplanned outages. Shared queues also enables
sysplex terminal management, a component service of the IMS Version 8 Common
Service Layer. See 5.3.3 “Sysplex terminal management” on page 116 for the planning
considerations for sysplex terminal management.
Do you expect performance to decline when shared queues are implemented? By how
much? Is this acceptable? What if it is more?
Like data sharing, you should expect that, according to some measurements, applications
which share the message queues will not perform as well as those which do not share
message queues. These measurements are typically internal. For example, you may
measure processor consumption after shared queues implementation and realize that
more CPU is required in the shared queues environment. You may even measure
end-user response time and find that it went up a few percent. Like data sharing, there are
a number of reasons for this, many of which can be minimized through proper planning.
We will address these later in this chapter.
Do you expect message queue recovery to be easier, more difficult, or about the same?
Recoverability may be one of your objectives for shared queues. For example, messages
on non-shared queues will be lost if IMS is cold started. Some user status will be lost if the
user logs on to another IMS. Shared queues, especially when used in conjunction with
sysplex terminal management, can provide improvements in this area, but it will not
address every possible scenario. You should document what you expect in terms of
message recovery when shared queues are implemented.
Are you expecting your shared queues environment to be more difficult to manage or
operate than a non-shared queues environment? In what way?
System definition
The following system definition parameters are used by the IMS scheduling algorithms:
Application definition parameters
– SCHDTYP (serial or parallel) - determines if the PSB can be scheduled in multiple
dependent regions at one time
Transaction definition parameters
– MSGTYPE
• Message class - must match a dependent region message class
– PRTY (scheduling priority)
Planning considerations: Most of these have the same meanings with and without
shared queues. The exception is the scheduling priority (PRTY) and the parallel limit count
(PARLIM). These are discussed in “Global scheduling” on page 88. Some consideration
needs to be given to transactions defined as SERIAL also. This is discussed in “Scheduling
serial transactions” on page 90.
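As a sketch, the following system definition statements (with a hypothetical PSB name,
transaction code, and class, and split across lines for readability rather than showing exact
macro continuation rules) show where these parameters are coded:
APPLCTN  PSB=APOL1,SCHDTYP=PARALLEL
TRANSACT CODE=TRANA,MSGTYPE=(SNGLSEG,RESPONSE,4),
         PRTY=(7,10,100),PARLIM=2,MAXRGN=3
Here class 4 must match a class specified for at least one dependent region, and PRTY
specifies the normal priority, limit priority, and limit count (the last two of which are ignored
with shared queues, as discussed in “Global scheduling”).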
Transaction status
In addition to the scheduling-related definitions, the status of a transaction also determines
whether it can be scheduled.
Transaction and PSB status:
– IMS will not schedule transactions whose status is stopped, pstopped, or locked.
– IMS will not queue transactions whose status is stopped or locked.
– IMS will not schedule transactions whose message class is stopped.
– IMS will not schedule a transaction defined to a stopped program (PSB).
In a shared queues environment, if a transaction is stopped or locked in the front-end IMS,
that IMS will not queue it even if the transaction is not stopped on other IMSs. Pstopped
transactions can still be queued but will not be scheduled. A stopped PSB or CLASS does
not affect the queuing of transactions.
Global scheduling
Because messages are not queued locally, IMS does not know how many messages are
queued on the transaction queue in the CF, or even if ANY transactions are queued. Since
IMS does not connect directly to the shared queues structure itself, it relies on CQS to notify it
when something of interest arrives on the queue. Something of interest is something which
IMS is capable of processing. In this case, it is a transaction which is defined to IMS, and for
which the transaction code, the message class, and the PSB are not stopped, pstopped, or
locked. When these conditions exist, IMS registers interest in that transaction code (in that
transaction queue). It does this even if it has no dependent regions with the associated
message class (it will just never try to schedule it).
CQS must also be notified when something of interest is on the queue. It does this by
invoking a sysplex service called event queue (EVENTQ) monitoring - a service of XES and
the Coupling Facility Control Code (CFCC) requested when CQS connects to the structure.
CQS registers interest in specific queues (called sublists) representing transaction queues in
which its client IMS has registered interest.
When any CQS places a transaction on the shared message queue list structure, if that
message caused that queue (that sublist) to go from empty to non-empty, then the CFCC
notifies CQS that something has arrived. Notification is done by turning on a bit in the list
notification vector (LNV) for that CQS. This drives a CQS exit which then invokes another
XES service to determine what has arrived. CQS notifies its IMS client that a registered
transaction queue now has work on it. IMS turns on a flag in the corresponding SMB that says
there is work on the queue for that transaction and queues the SMB on the transaction class
table (TCT) queue according to its normal scheduling priority. This flag stays on, and the SMB
remains queued on the TCT, until IMS attempts to retrieve a message and is told by CQS that
the queue is empty. At this time, the flag is turned off and the SMB is dequeued from the TCT.
IMS will not try to schedule that transaction again until notified by CQS that there is work on
the queue.
When a dependent region becomes available, IMS scheduling algorithms determine which
transaction to schedule based on the scheduling priorities of SMBs on the TCT. IMS makes
sure the PSB is loaded, allocates space in the PSB work pool, and performs other scheduling
functions. It then issues a call to CQS to retrieve the transaction (equivalent to transaction
priming in a non-shared queues environment). CQS issues XES calls to retrieve the message
from the structure. If the message is still there, it is passed to IMS by CQS, control is given to
the dependent region, and the transaction is processed normally.
Scheduling priorities (PRTY)
As described above, IMS system definition macros allow you to specify a scheduling
priority on the PRTY macro. The scheduling priority determines the SMB’s position on the
TCT. The PRTY macro also allows you to specify a limit count and a limit priority. The limit
count is defined as the number of transactions on the queue. The limit priority is the
scheduling priority to raise to when the number of queued transactions equals or exceeds
the limit count. In a shared queues environment, IMS does not know how many messages
are on the shared queues. Therefore, these two parameters are ignored.
Parallel limit count (PARLIM) and maximum regions (MAXRGN)
Local scheduling
When a transaction arrives in IMS, there are conditions under which that transaction can be
scheduled immediately without putting it on the Transaction Ready Queue. Those conditions
are:
A dependent region is available to process the transaction and there are no higher priority
transactions waiting to be scheduled.
There are no transactions currently on the Transaction Ready Queue (TRQ).
The input message fits completely within a single Queue Buffer.
The transaction is not defined as SERIAL.
When local scheduling is performed, IMS puts the input transaction directly on the lock queue
(LOCKQ) instead of the TRQ. It keeps a copy of the message in its local QPOOL and
proceeds to schedule that message locally. Note that this saves several accesses to the
Coupling Facility, and may also save some false schedules in other IMSs trying to schedule
the same transaction. Local scheduling is a good thing, and part of your planning should
include how to configure your dependent regions and transaction classes to maximize local
scheduling by trying to always, or as frequently as possible, have a dependent region
available to process an incoming transaction.
In a shared queues environment, serial processing is only guaranteed within the front-end
IMS. Serial transactions are not made available to other IMSs in the shared queues group
(there are special queues for serial transactions). If a transaction is defined as serial in
multiple IMSs, each IMS will process transactions which it receives serially, but other IMSs
may be processing the same transaction at the same time.
If you have serial transactions defined, you should understand why they are defined that way.
If you really need them to be serial across the IMSplex, and be scheduled just as they are in a
non-shared queues environment, then you must do the following:
Remove the SERIAL=YES parameter from the TRANSACT macro (or code SERIAL=NO).
Give the transaction a message class which is unique within the environment.
Start a dependent region with that message class in only one IMS in the IMSplex. This is
the only IMS which will try to schedule it. Others may still queue it, but will not try to
schedule it since they will not have a dependent region with the right class.
Write a Non-discardable Message Exit (DFSNDMX0) which, when the transaction abends
with a U3303, will requeue the message and USTOP it (R15=12).
This will allow all IMSs to receive and queue the transaction, but only one IMS will ever try to
process it. You may even want to /PSTOP the transaction on the other IMSs to keep them
from registering interest and then being notified by CQS whenever there is work on the
queue. Do not /STOP the transaction on these IMSs; if you do, input messages for it will be
rejected and not placed on the TRQ.
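A hedged sketch of such a definition, using a hypothetical transaction code and assuming
that a dependent region with class 99 is started on only one IMS in the IMSplex:
TRANSACT CODE=SERTRANA,MSGTYPE=(SNGLSEG,RESPONSE,99),SERIAL=NO
With this definition every IMS can queue SERTRANA, but only the IMS with a class 99
region will attempt to schedule it, and the DFSNDMX0 exit (return code 12) requeues the
message and USTOPs the transaction if it abends with a U3303.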
Planning considerations: Identify all SERIAL transactions and determine whether they
must be executed serially across the IMSplex. If so, then implement the steps above. If not,
determine why they have been defined as SERIAL in the first place. Maybe it is not
necessary and the SERIAL attribute can be removed, allowing these transactions to
schedule in parallel and on any IMS in the shared queues group.
Planning considerations: When there are transactions on both the local queue and
the global queue, those on the local queue will process first, even those that may have
arrived later. If this does not present a problem for your applications, then local first is
the recommended sysplex processing code (and it is the default).
As described in IMS in the Parallel Sysplex, Volume I: Reviewing the IMSplex Technology,
SG24-6908, IMS systems in a shared queues group share MSC definitions. That is, when an
IMS joins the group, it notifies other IMSs of its MSC definitions, and receives notifications
from other IMSs of theirs. This notification process allows them to share MSC SYSIDs
(system IDs) and MSNAMEs (remote destination logical link names) so that even an IMS
which does not participate in a physical MSC connection can process input messages from,
and send output messages (through another IMS in the shared queues group) to, remote
destinations.
In a cloned environment, where all IMSs use the same system definition, then this is not an
issue. The MSC links will be defined to all IMSs since all IMSs are using the same system
definition. Note that it is OK for each IMS to define the same MSC logical and physical links.
Only one of them will be able to activate that link at any given time. If that IMS fails, then those
links can be activated on the surviving IMSs.
If the IMSs are not cloned, then as long as each IMS includes the three MSC macros, they
can all participate in the MSC environment. As each IMS joins the shared queues group, it
notifies other IMSs of its own MSC definitions. The result is that the SYSIDs and MSNAMEs
are shared by all IMSs in the group, allowing them to correctly process the MSC message
prefix and to queue input and output messages to the correct destination.
Planning considerations: Even if not all IMSs in the IMSplex will be connected through
an MSC link to a remote IMS, every IMS should include the MSC system definitions. If they
do not, then you cannot define remote transactions and LTERMs, and you cannot process
messages with these destinations.
If shared queues completely replace all MSC connections, then the MSC macros can be
completely removed from the system definition. But if there still remains at least one truly
remote MSC connection for at least one of the IMSs, then each IMS should continue to
include at least one of each MSC macro.
If you have applications which use directed routing (issuing CHNG and ISRT calls to an
MSNAME destination) to route messages to another IMS across an MSC link, and if those IMSs are
now in the same shared queues group, then the use of directed routing must be discontinued.
The MSNAME destination in the CHNG call must be replaced with a transaction or LTERM
destination. Instead of changing the application program, you can also code the program
routing entry point of the DFSMSCE0 exit to change the destination.
If you use DFSMSCE0 to do directed routing of input messages (the program routing entry
point), then the exit should be changed.
Planning considerations: If shared queues are replacing MSC links between IMSs, the
MSC definitions can be removed. Remember to change remote transactions and remote
LTERMs to local and to discontinue the use of directed routing.
As mentioned above, if there are any MSC connections anywhere in the shared queues
group, then every IMS should include the three MSC macros - MSPLINK, MSLINK, and
MSNAME. Even if the IMSs are not cloned, every IMS can have exactly the same MSC
definitions since they share their definitions when joining the group anyway.
Planning considerations: If there are no MSC connections outside the shared queues
group, remove the MSC definitions from your system definition. Be sure to change remote
transaction and LTERM definitions to local. If there are remote MSC connections, then all
IMSs must include the three MSC macros so that the MSC code will be included in the IMS
nucleus. All IMSs can have the same MSC definitions even if those IMSs are not cloned.
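A minimal sketch of the three macros with purely illustrative link names, partner ID, VTAM
APPLID, and SYSIDs; real definitions must, of course, match the remote IMS:
LNKPHYS1 MSPLINK TYPE=VTAM,NAME=APPLIMSR,SESSION=1,BUFSIZE=4096
         MSLINK  PARTNER=RA,MSPLINK=LNKPHYS1
IMSR01   MSNAME  SYSID=(5,1)
Because SYSIDs and MSNAMEs are shared when each IMS joins the shared queues group,
cloned definitions like these can be used unchanged on every IMS.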
Conversational processing
There are no changes to the concept of a conversation with shared queues. When a
conversational transaction is entered, the front-end (F-E) IMS creates a conversation control
block structure (CCB) and a scratch pad area (SPA) and queues the SPA as the first segment
in a multi-segment message with the actual input message as the second segment
(SPA+INMSG). Responses to conversational input are inserted by the application with the
SPA as the first segment inserted, and the output message as the second segment
(SPA+OUTMSG).
Also, while non-conversational messages, once they are queued on the shared queues
structure, are deleted from the QPOOL, conversational transaction messages are not
deleted from the QPOOL. Therefore, the QPOOL must have enough space allocated to
hold all active conversations.
IMS Version 8, when running with the Common Service Layer and a resource structure in the
Coupling Facility, can keep conversational status in the resource structure available to all
members of the shared queues group, allowing a user to move a conversation from one IMS
to another. This function is part of sysplex terminal management (STM). See IMS in the
Parallel Sysplex, Volume I: Reviewing the IMSplex Technology, SG24-6908, for a description
of this function, and 5.1 “The Common Service Layer” on page 106, for planning
considerations.
Planning considerations: When not running IMS Version 8 with sysplex terminal
management active, users should be discouraged from logging off one IMS and logging on
to another while still in conversational status, unless they are willing to lose their current
conversation.
Exiting a conversation
Conversations may be terminated in several ways (in addition to abending). Some work just
like non-shared queues, others work differently.
If an application terminates the conversation, then everything works as normal - nothing
different from non-shared queues (at least externally).
If a conversation is terminated by operator command (/EXIT), then the results depend on
what the current status of that conversation is. Because only the F-E even knows that the
If FINDDEST fails in a shared queues environment, the destination may still be valid. It may
be defined on another IMS (assuming the IMS systems are not cloned). The F-E IMS must
determine whether an unknown destination is valid. If so, then it may dynamically create a
control block structure to be used for enqueuing the message on the shared queues. The
dynamically defined destination can be a local or remote transaction or, if ETO is enabled, an
LTERM.
Dynamic transactions
Even without ETO, DFSINSX0 can be used to define a message destination as a transaction
for purposes of letting that IMS queue the message on the Transaction Ready Queue.
However, the exit must define all of the parameters of the transaction that would normally be
defined on the TRANSACT macro. If it does not do this, then the defaults will be used and it is
unlikely that they will be correct. The parameters the exit can define are documented in the
IMS Version 8: Customization Guide, SC27-1294 (or equivalent earlier documentation), and
include:
Local/remote/SYSID
Conversational/SPA length/truncated data option
Response mode
Fast Path
Because it would be difficult for an exit to know these attributes without prior knowledge, we
suggest that it not be used for defining dynamic transactions. A better solution would be to
add a static transaction definition using online change.
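If you take the online change approach, the attributes in the list above are the ones to code
explicitly on the static definition. A sketch follows; the PSB name, transaction code, and
attribute values are invented for illustration, and a conversational transaction would also
need an SPA length on the SPA= operand.
         APPLCTN  PSB=ORDERPSB,PGMTYPE=TP
         TRANSACT CODE=ORDRTRN,MSGTYPE=(SNGLSEG,RESPONSE,1),MODE=SNGL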
Dynamic LTERMs
To dynamically define an unknown destination as an LTERM, the extended terminal option
must be enabled. In this case, the ETO support with and without shared queues is the same.
That is, the Output Creation Exit (DFSINSX0) must define the characteristics of the logical
terminal.
In a shared queues environment, the potential problem area is the LTERM definitions. These
are the only ones which represent message destinations, and therefore which have queue
headers in the shared queues structure. Output messages are queued by LTERM name, and
then delivered by any IMS with interest in that LTERM name to whatever NODE that LTERM is
assigned to. If the same LTERM name is used by different IMSs for different NODEs, then
delivery of messages could get confusing, as both IMSs would register interest in that LTERM
name.
In a single IMS, that IMS will not allow an LTERM control block to be created more than once
with the same name. This is true even if the user is allowed to sign on multiple times. When a
user does sign on multiple times, the Signon Exit (DFSSGNX0) must assign unique LTERM
names within that IMS. One technique might be to use callable services to determine
whether an LTERM name already exists within that IMS. If a user is only allowed to sign on
once, then the LTERM name can be set to the user ID, guaranteeing uniqueness within that
IMS. Another technique, even when multiple signons are allowed, would be to assign an
LTERM name equal to the NODE name. Since a NODE can only log on once, the LTERM
names will be unique.
But with shared queues, each IMS (with ETO enabled) creates its own set of control blocks
unknown to other IMSs in the shared queues group. It would be possible to create the same
LTERM name in two IMSs assigned to two different NODEs. Before IMS Version 8, users
could not be prohibited from signing on to multiple IMSs with the same user ID. For statically
defined terminals, this is not usually a problem, unless the system definitions are wrong. But
when ETO is used, the method of assigning LTERM names must guarantee uniqueness
across the IMSplex. If only one LTERM per logged on NODE is required, then setting the
LTERM name equal to the NODE name would work, since most NODEs can log on only once
(the exception is parallel session ISC NODEs). But if multiple LTERMs are required per
NODE, then the Signon Exit must create unique LTERM names. Again, in a single IMS, this is
not difficult. But when multiple IMSs are part of a shared queues group, the Signon Exit
does not know what LTERM names other IMSs may have already assigned.
There are several possible solutions to this problem prior to IMS Version 8.
One might be to suffix the LTERM name with a character unique to the IMS on which the
exit is running. For example, all LTERMs on IMSA could end with the character A while all
LTERMs on IMSB could end with the character B. This may not be appropriate for all IMS
systems - especially if the applications are sensitive to the LTERM name (for example,
they have a CHNG/ISRT call to insert to a hard-coded LTERM name).
Set the LTERM name equal to the NODE name. This works if you only have a single
LTERM per NODE. It also requires that the applications not be sensitive to the LTERM
name.
Whatever technique you use, you must guarantee that LTERM names are unique across all
IMSs in the shared queues group.
In a shared queues environment, applications may execute on any IMS in the shared queues
group. If the same application, running on two different IMSs, were to issue the CHNG/ISRT
calls to an LTERM with the autologon NODE set to the same physical printer, each IMS would
attempt to acquire the session with the printer. One would get the session, and the second
would wait in a queue for the first to release the session. When all output messages are
drained from the first IMS, it releases the printer (closes the session), allowing the second
IMS to acquire the session. This can happen repeatedly, causing printer “thrashing” or “flip
flopping” between the two (or more) IMSs.
To address this problem, it is best to let just one of the IMSs handle the printers. Each IMS
should define the printers with OPTIONS=NORELRQ (meaning once the session is
established, do not release it when another IMS requests it). That one IMS would then send
all output to the printer. If that IMS terminates, the second IMS would acquire the session and
take over responsibility for sending output to that printer.
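A sketch of such a static printer definition follows. The node and LTERM names are invented,
and any other TERMINAL operands (component, features, and so on) would come from your
existing definitions; the point is simply the OPTIONS=(NORELRQ) specification.
         TYPE     UNITYPE=SLUTYPE1
         TERMINAL NAME=PRT001,OPTIONS=(NORELRQ)
         NAME     PRT001L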
Dead letter queue processing (the DLQT parameter) does not work in a shared queues
environment, since messages are not queued locally. They are on the shared queues
structure, and IMS does not know what or how many messages are out there, or how long
they have been there.
New parameters on the display command (QCNT and MSGAGE) request that IMS query the
structure for messages on a specified queue type which have been there for more than a
user-specified number of days. For example, specifying
/DIS QCNT LTERM MSGAGE 5
would identify all messages on the LTERM queue which have been there for more than five
days. The result is the total count of messages queued for any LTERM which has “aged”
messages, and the number which exceed the MSGAGE parameter. The timestamps of the
oldest and most recent message are also displayed. These messages can then be deleted
using the command:
/DEQ LTERM ltermname PURGE | PURGE1
When using shared queues, aged messages on the transaction queue can also be displayed
using the same commands:
/DIS QCNT TRANSACTION MSGAGE 3
/DEQ TRAN trancode PURGE | PURGE1
Planning considerations: The DLQT parameter will be ignored. You should change your
operating procedures to periodically check for messages which have been on a queue for
an extended period of time. If the destination is invalid, or no longer in service, dequeue the
unwanted messages.
Planning considerations: Include procedures for cold starting the shared queues in
your operational procedures.
Online change
The online change process is used to activate updates made offline to the ACBLIB, FMTLIB,
MODBLKS, and MATRIX data sets. When IMS systems are cloned, or even when they are
using many of the same definitions (for example, the database descriptors), then it is critical
that when resources are changed, the change be coordinated across all IMSs. Every effort
should be made to clone these IMS systems. Even if not every system uses every resource,
cloning reduces the chances of error for this as well as other processes.
The online change process itself requires preparing each system for online change, then
committing the change(s):
/MODIFY PREPARE xxxx
/MODIFY COMMIT
These commands are entered independently on each IMS. It is very possible that online
change could succeed on one IMS while failing on another. When this happens, the IMSs are
running on different sets of libraries. Your choice is to either correct the problem with the
failing IMS or return the successful IMS to the original set of libraries.
Planning considerations: Review your online change procedures and be sure that
procedures are in place to handle the case where online change is successful on one or
more IMSs and fails on one or more IMSs.
Each IMS logs the activity that pertains to that IMS, such as input messages, output
messages, enqueues, dequeues, sync points, and so forth. Each CQS logs activity that
pertains to the shared queues structures, such as putting a message on the shared queues,
reading a message and moving it from one queue to another, and deleting a message.
And remember, CQS does not have its own internal logger. Instead, CQS requests the
System Logger to write a log record to a log stream defined in the LOGR policy and in the
CQSSGxxx PROCLIB member. All CQSs, since they are updating a common shared queues
structure, write log records to a common log stream. See “System Logger (IXGLOGR)” on
page 47 for a description of the System Logger and how CQS uses its services.
The net result of this cooperative processing of an IMS input transaction and the output
response is that message queue log records will be found on the IMS log that received the
message, the CQS log stream, and the IMS log that processed it.
Each IMS and CQS message queue log record contains a 32-byte Unit of Work ID (UOWID).
The first 16 bytes identify the original input message. The second 16 bytes identify messages
generated as a result of that first input message, including program-to-program message
switches and other messages inserted to the IO-PCB and any ALT-PCB. By looking at all
message queue log records with the same first 16 bytes, you can trace a transaction and all
spawned transactions, with their outputs, through the entire IMSplex.
And, there is a new log record - x'6740'. This log record is created whenever an IMS cold
starts following a failure and CQS moves the inflight (indoubt) messages to the COLDQ.
These log records also contain the UOWID, which may be used for analysis to determine a
course of action.
Note that CQS maintains two separate log streams - one for MSGQ activity and one for
EMHQ activity. The log records created are somewhat different for each log stream.
Planning considerations: The System Logger will require a log structure and offload data
sets, and optionally, staging data sets. Sizing of each of these components is required.
Refer to the IMS chapter in the IBM Redbook Everything You Wanted to Know about the
System Logger, SG24-6898 for a discussion on CQS’s use of the System Logger and how
to size the logger structure and data sets.
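As a very rough illustration of what those definitions involve, an IXCMIAPU sketch for the
MSGQ log stream follows. The structure name, log stream name, high-level qualifier, and all
of the sizing and offload values are invented placeholders; take the real values from the
sizing exercise described in that redbook and from your CQSSGxxx member.
DATA TYPE(LOGR) REPORT(YES)
 DEFINE STRUCTURE NAME(MSGQLOG) LOGSNUM(1)
        AVGBUFSIZE(4096) MAXBUFSIZE(65272)
 DEFINE LOGSTREAM NAME(CQS.MSGQ.LOG) STRUCTNAME(MSGQLOG)
        STG_DUPLEX(YES) DUPLEXMODE(COND)
        HLQ(IMSPLEX) LS_SIZE(5000)
        HIGHOFFLOAD(80) LOWOFFLOAD(0)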
Extended recovery facility
When using extended recovery facility (XRF), it is still necessary to define local message
queue data sets for both the active and the alternate. Only the alternate uses them, but the
active will not come up with XRF enabled if these data sets are not defined. For you non-XRF
wizards, these are NOT the regular message queue data sets. The DDNAMEs end with the
character L:
//QBLKSL DD
//SHMSGL DD
//LGMSGL DD
At initialization, the XRF alternate will start a CQS address space, but does not register or
connect. The XRF CQS does not connect to the shared queues structures during tracking. At
takeover, the alternate (new active) will register and connect to CQS, resynchronize, and
merge its local messages with the global queues. Processing then continues with the new
active processing global transactions.
Planning considerations: Be sure to include the “L” suffixed data sets in your active and
alternate control regions. They will not be used by the active system, but are required
anyway. You also need a CQS address space in the OS/390 image where your XRF
alternate is running. If that OS/390 image also contains an active IMS, they can share the
same CQS.
It is possible, for a planned takeover, to recover the shared queues. This can be done by
taking a structure checkpoint when the last CQS shuts down, then sending that SRDS to the
remote site. When CQS is started at the remote site, it will discover an empty structure and
recover it using the SRDS.
Planning considerations: Your procedures at the remote site should recognize that the
shared queues will not be recovered in the event of an unplanned takeover. If you want to
be able to recover your message queues after a planned takeover, establish procedures for
taking a structure checkpoint after all message queue activity is quiesced and sending that
SRDS to the remote site and making it available to CQS.
The first two of these can be considered list structure overhead and will require a minimal
amount of storage in the list structure. Fortunately, you do not have to know these numbers.
The CFSIZER tool will factor these into its size estimates. What takes the most amount of
space are the list entries, consisting of list entry controls, adjunct areas, and one or more data
elements, and the space allocated for event monitor controls.
List entries
The largest component of any shared queues structure, at least in the sizing calculations, is
the list entries. There are one or more list entries for every IMS message on the shared queues,
plus some additional list entries for CQS, although these tend to be minor in terms of space
requirements. A list entry represents the contents of a single IMS queue buffer. If a message
requires more than one queue buffer, then there will be more than one list entry. Although the
size of the list entry controls may vary with the level of Coupling Facility Control Code, an
estimate of 200 bytes is reasonable. Each list entry also has a 64-byte adjunct area. The third
component, the data elements, are 512 bytes apiece.
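Purely to illustrate how these numbers combine (the message profile here is an assumption,
not a recommendation): a message that fits in one queue buffer and needs four 512-byte data
elements would occupy roughly 200 + 64 + (4 x 512) = 2,312 bytes in the structure, so 10,000
such messages would account for about 23 million bytes, before adding the event monitor
controls and other structure overhead. The CFSIZER tool, described next, remains the better
way to arrive at an estimate.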
The CFSIZER tool will do some basic calculations based on input provided by you. This tool
provides some help text and asks for input to be used in calculating an estimated size. It can
be found at:
http://www-1.ibm.com/servers/eserver/zseries/cfsizer/ims.html
It will ask you to provide numbers based on current IMS experience with local message
queue sizes. Since each buffer requires one list entry, and the number of data elements is
never more than what will fit into one of these buffers, the formula should provide a size that is
larger than would have been needed if both the short and long message high water marks
were reached at the same time. This information comes from the x’4502’ statistics log records
and from the system definition. While this does not produce an accurate calculation of the
size requirements of the shared queues structure, it does provide a “ballpark” number
(remember, there is no accurate number). You may want to make this larger or smaller,
depending on your estimates of how many messages may be on the queue at one time, and
how safe you want to be from a full message queue structure. A good approach may be to
use this size to set an INITSIZE for the primary structure, with a larger maximum SIZE
specified which will allow for altering the structure to a larger allocated size should the need
arise.
Overflow structure
This is a structure which you hope you will never have to use, but if you do, it should be large
enough to handle the overflow queues. To be safe, it is best to make it at least as large as the
primary structure.
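To illustrate the INITSIZE/SIZE approach described above, the corresponding CFRM policy
entries might look like the following sketch. The structure names, sizes, and Coupling Facility
names are invented placeholders; the real sizes come from CFSIZER and your own analysis,
and the names must match those in your CQS structure definitions.
 STRUCTURE NAME(IMSMSGQ01)
           INITSIZE(50000)
           SIZE(100000)
           PREFLIST(CF01,CF02)
 STRUCTURE NAME(IMSMSGQ01OF)
           INITSIZE(50000)
           SIZE(100000)
           PREFLIST(CF02,CF01)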
4.4 Configuration, security, implementation, and operations
Because these planning activities are so often closely related to the same activities for other
IMSplex components, they are discussed in their own chapter, Chapter 7, “Putting it all
together” on page 159.
This chapter identifies some of the planning tasks necessary, either formally or informally, to
implement the Common Service Layer architecture in your IMSplex environment. The
assumption here is that block level data sharing, shared queues, and end-user connectivity
have already been implemented.
This chapter addresses the first two activities in the planning process for CSL:
Objectives and expectations
Functional planning
The next two activities are addressed in Chapter 7, “Putting it all together” on page 159:
Configuration planning
Security planning
CSL implementation and operations are addressed briefly in this chapter, but are covered
more fully in IMS in the Parallel Sysplex, Volume III: IMSplex Implementation and Operations,
SG24-6929.
Like all of the other planning chapters, the objective here is to ask questions which you must
answer when developing your plans. Each question is followed by some discussion of what
you should consider when answering the question.
Figure 5-1 is a highly simplified look at the IMSplex by the end of IMS Version 7. Although
only two LPARs and three IMSs are shown, the complexity grows dramatically when these
numbers get higher. As many as 32 LPARs and 255 IMSs can be involved in the IMSplex.
But while the benefits of the IMSplex to the end-user and applications developers grew
significantly, so did the complexity of managing the environment. Figure 5-2 shows the
numerous methods that IMS operations personnel used for managing the IMSplex.
Figure 5-1 IMS exploits many Parallel Sysplex functions to share resources in an IMSplex: data sharing and shared queues, VTAM Generic Resources and multinode persistent sessions, automatic restart management, and XCF communications. (The figure shows IMS1, IMS2, and IMS3 with their DBRCs and CQSs, IMS Connect, the VTAM and TCP/IP networks, and the LOG, LOGR, SMQ, OSAM, VSAM, SVSO, LOCK, VGR, and MNPS structures in the Coupling Facility.)
Figure 5-2 Managing the IMSplex: E-MCS consoles, automation, and end-user command entry (for example, /STA TRAN) are directed at individual IMSs across the network.
Figure 5-3 shows this CSL architecture, including the three new address spaces, and the new
resource structure. This figure shows the configuration on a single LPAR, but as we have
already indicated, there may be as many as 32 LPARs in the IMSplex. Some of these
components may reside on multiple (or even every) LPAR, while others may exist only once in
the IMSplex. Part of the planning process is to define your configuration, making decisions
about which components reside where, and which CSL functions you need to meet your
business requirements.
Figure 5-3 The IMS Version 8 Common Service Layer on a single LPAR: the new CSL address spaces (Operations Manager, Resource Manager, and Structured Call Interface), the IMS control region, the Common Queue Server (CQS), online DBRC, batch and utility regions with DBRC, and the Coupling Facility structures (including the new resource structure alongside LOGR, SMQ, OSAM, VSAM, SVSO, IRLM, and VGR). New functions include the SPOC, automation, automatic RECON loss notification, global online change, and sysplex terminal management.
IMS in the Parallel Sysplex, Volume I: Reviewing the IMSplex Technology, SG24-6908,
describes in detail the architecture itself and the services provided by the three new address
spaces. This volume addresses the planning considerations for implementing the CSL
architecture and exploiting the services available.
Structured Call Interface
The Structured Call Interface (SCI) has two primary functions in the CSL:
Provide the entry point for IMSplex members to register as a member of the IMSplex.
Provide for a common communications interface for registered IMSplex members to
communicate.
In addition, the DB2 Admin Client includes an IMS Control Center capability which provides a
graphical user interface (GUI) to OM and IMS. The DB2 Admin Client is available from the
DB2 Web site for free:
http://www.ibm.com/db2
Resource Manager
The Resource Manager (RM) provides the infrastructure for IMSplex resource management
and global process coordination. RM clients, such as IMS, exploit RM services to provide:
Global process coordination supports a global online change process across all IMSs in
the IMSplex that request it through the OLC=GLOBAL parameter in DFSCGxxx (a sample
member is sketched after this list).
Sysplex terminal management provides resource type consistency checking, resource
name uniqueness, and resource status recovery.
Global callable services allows a user exit to receive global information about IMSplex
terminal resources.
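A minimal DFSCGxxx sketch for an IMS requesting global online change might look like the
following; the IMSplex name and OLCSTAT data set name are invented, and the remaining
parameters are left to default.
IMSPLEX=PLEX1
OLC=GLOBAL
OLCSTAT=IMSPLEX.OLCSTAT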
Resource structure
The resource structure is a list structure in the Coupling Facility, which is used by RM to
maintain IMSplex global and local information about its registered members, and to keep
sysplex terminal and user status information supporting the sysplex terminal management
function. This structure is optional, but is required for sysplex terminal management.
Performance
CSL should not be considered as a performance enhancement. Although performance
studies are not available at this time, only sysplex terminal management has significant
access to the Coupling Facility, and then only when SRM=GLOBAL (global status recovery
mode) is in effect. Relative to other Coupling Facility accesses for data sharing and shared
queues, this should not present significant additional overhead.
Capacity
Like performance, CSL should not be considered a capacity enhancement. Neither is it likely
to diminish the capacity of any IMSplex members. However, because of some of its other
features, like sysplex terminal management and resource status recovery, it may make
implementation of an IMSplex for capacity reasons more practical, or at least easier to
manage.
Availability
From a system perspective, CSL does not alter the availability of any IMSplex component. For
example, when an IMS fails, it still must be restarted and in-flight work backed out. The
automatic RECON loss notification (ARLN) function does enhance RECON availability by
notifying all connected users of a RECON reconfiguration, allowing a spare to be restored.
However, CSL’s greatest availability benefit is for the connected end-user when
SRM=GLOBAL is in effect. For example, if you are in an IMS conversation, and the IMS you
are connected to fails, then you can continue the conversation on another IMS without waiting
for the failed IMS to be restarted. For users entering Fast Path expedited message handler
(EMH) transactions, a transaction can be entered on one IMS and the response retrieved on
another. Similarly, if you are using a set and test sequence numbers (STSN) device, the
sequence numbers can be resumed on another IMS in the event that the IMS you are
connected to fails. Each of these availability functions is independently selectable.
Recoverability
As described above under availability, the recoverability benefits of CSL apply more to the
end-user than to any IMSplex component. For the end-user, recoverability of significant status
is provided by the sysplex terminal management function.
Systems manageability
As the IMS systems and IMSplexes become more complex, they also become more difficult
to manage. Just stopping a database on all IMSs in the data sharing group can be
cumbersome, and verifying that it is complete is not straightforward. Consistent system
definitions become more important when end-users are provided with a single image of an
IMS processing complex. Changing those system definitions in a consistent fashion is equally
important and even more difficult. As the number of IMS control regions grows, so does the
complexity of managing them.
Functionality
CSL functionality is primarily directed towards the improvement of the systems management
requirements of an IMSplex. It does not offer the application system any function it does not
already have - it just makes it easier to manage those systems.
The unique CSL functions are identified in the following question. It is perfectly valid to
implement CSL in an environment for just some of its capabilities, and not use, or disable
others. For example, a valid reason for CSL may be for the operations management
capabilities, including single point of control (SPOC), and not to enable sysplex terminal
management, or perhaps for the SPOC and global online change. Note that STM is not
available if the IMSplex is not also sharing the message queues. Other functions do not
require shared queues.
Your decision as to whether or not to implement ARLN may depend on your experience with
RECON failures or switches. There are no performance implications. However, once a set of
RECONs is enabled for ARLN, the IMSplex name is stored in the RECON header and
connection to those RECONs by DBRC instances which are not members of the same
IMSplex (or any IMSplex) becomes more difficult.
While the first two STM functions are always enabled, resource status recovery can be
tailored to your needs through execution parameters and the existence (or non-existence) of
a resource structure. Activity to the resource structure is the only STM activity that could have
any performance impact on your system, and you should understand what the overhead is
before deciding on this function.
Note that it is not necessary for every IMS in the IMSplex to use global online change. Some
may be global while others are local. If your IMS systems are not clones of each other, then it
is likely that you will not want them all changing libraries at the same time, and so would
continue using local online change. The biggest impact is that an IMS cold start is required to
either switch from local online change to global online change, or vice versa.
Operations management
Controlling an IMSplex environment can be very complex when multiple IMSs are sharing
databases and message queues. The traditional means of controlling a single IMS system is
through the master terminal operator (MTO) or automation (either user-written or vendor
provided). Unfortunately, these techniques apply only to a single IMS. When multiples are
involved, the same commands must be entered independently to each IMS. Verification that
they have executed successfully is also an independent action. IMS Version 6 provided
support for the entry of IMS commands from an E-MCS or MCS console to multiple IMSs with
the response from each IMS returned to the console. Use of this support was awkward at
best.
IMS Version 8 with the CSL provides a new interface through the Operations Manager. Users
can enter commands to any or all IMSs with the responses returned from each IMS within a
user specified time limit. The interface is documented and can be used for both
terminal-entered commands and for automation programs or execs. IMS Version 8 provides a
TSO Single Point of Control (SPOC) application which can be used to help manage the
IMSplex. The IMS Control Center is also available, which provides another
GUI-based entry point for IMS commands. Refer to “IMS Control Center” on page 122 for
more details.
You may choose to have multiple SPOCs, multiple IMS Control Centers, and of course
multiple automation execs active concurrently within the IMSplex.
When the first DBRC registers with SCI, the IMSplex name is placed in the RECON header.
Thereafter, unless permitted by exit DSPSCIX0, other IMS Version 8 DBRCs will not be able
to access the RECONs without using the correct IMSplex name. This restriction does not
apply to IMS Version 6 and IMS Version 7 DBRCs. They will not be denied access to the
RECONs even if the IMSplex name is in the header:
Will ARLN be used? For online? For batch? For utilities?
A decision must be made whether ARLN will be used and, if so, whether IMS batch jobs and
utilities which run with DBRC enabled will register with the correct IMSplex name. If they do
not, they will be denied access unless authorized by the exit. If any DBRC registers with
SCI, then all other DBRCs must either register with SCI and participate in ARLN, or have
the DSPSCIX0 exit, which must be in the DBRC, batch, or utility address space’s
STEPLIB, permit that address space to access the RECONs without registering. It is our
recommendation that, if any DBRC uses ARLN, then all should use it.
Will ARLN be used before all IMSs are at Version 8?
– Online control regions
– CSL address spaces (Operations Manager and Resource Manager)
– DBRC address spaces
– Batch and utility jobs with DBRC=Y
– IMS Connect when using the IMS Control Center
– TSO SPOC user IDs
– User- or vendor-written SPOCs
– Automated operator programs and execs
might be when a user with significant status goes from version 8 to version 7 (no status
recovery) then back to version 8. IMS Version 8 would recover the status that user had at
the time the user last terminated his/her session on IMS Version 8. If you do have a
mixture, you may want to consider using only IMS Version 8 as the front-end and using the
older releases strictly as back-end processors.
What types of terminals can connect to the IMSplex?
Different terminal types are treated differently by sysplex terminal management. Some
terminal types are not supported at all.
– VTAM? OTMA? BTAM?
STM supports only VTAM terminals. BTAM and OTMA are not supported.
– STSN?
STSN sequence numbers are considered significant status. This means that if
SRM=GLOBAL, and RCVYSTSN=YES, the resource structure will be updated for each
input and output message from STSN devices. Some users may decide that STSN
recovery is not required globally, and set SRM=LOCAL for STSN terminals. This must
be done in the Logon Exit (DFSLGNX0).
Other users may decide that STSN recovery is not required at all, in which case they
can set RCVYSTSN=NO in DFSDCxxx. This may be desirable when ETO is used for
STSN devices and the STSN significant status keeps the control block structure from
being deleted when the session is logged off.
– APPC?
There is only limited support for APPC by STM. The only support is for resource name
consistency between APPC descriptor names and other resource names including
transaction codes, LTERM names, and MSNAMEs. For planning purposes, you must
ensure that these APPC descriptor names do not conflict with other resource names.
This is probably not a problem you currently have if you are running on a single IMS. It
also probably is not a problem you would have if your IMSplex IMSs are clones. But if
they are not clones, it might be possible that these names conflict.
– ISC?
With single session ISC, the NODE is always unique within the IMSplex because the
STM uniqueness requirement does not allow the same NODE to be active on more
than one IMS.
Care must be used when providing STM support for parallel session ISC. STM does
not enforce uniqueness for NODE names since there are parallel sessions from the
same NODE. However, STM will enforce USER (SUBPOOL) and LTERM uniqueness
within the IMSplex. When IMSs are cloned, or at least have the same ISC definitions,
there is a definite possibility that, if the same parallel session NODE logs on to more
than one IMS, that each IMS may assign the same SUBPOOL and LTERM name to
that NODE. Since this would violate the uniqueness requirement, the second logon
would be rejected.
There are different solutions to this problem for statically and dynamically defined ISC sessions.
• For static ISC sessions, single or parallel, the user may use the IMS Initialization
Exit (DFSINTX0) to tell IMS that it does not want to maintain any information at all in
the resource structure. This would allow the same NODEs (including single session
NODEs), USERs, and LTERMs to be active in each IMS. However, if an LTERM is
active on multiple IMSs in a shared queues group, each IMS would register interest
in that LTERM and deliver a message to whichever ISC NODE that LTERM was
assigned to. This may or may not be a problem in your environment. If it really is the
same NODE, then perhaps you will not care.
while the user has end-user significant status, the RM affinity in the structure will prevent
that user from logging on to another IMS. If the original IMS is going to be down for an
extended period, the user must know how to force another IMS to accept the logon
request. This can be done by coding the Logon Exit (or Signon Exit for ETO) to recognize
user data in the logon request as a request to “steal” the NODE.
What will the system default for SRM and RCVYx be? Will the Logon or Signon Exit
override SRM and RCVYx for some terminals? Which ones? Will ETO descriptors specify
SRM and RCVYx?
IMS has its own defaults for SRM. You can specify a different default in DFSDCxxx. For
ETO, you can override the system default on the user descriptor. You can override these
defaults in the Logon Exit or Signon Exit if different terminals/users are to have different
values for SRM. These decisions need to be made, although it will probably be sufficient to
let the system default as specified in DFSDCxxx apply to all terminals. What might be
worth overriding is the RCVYx parameter, for example, to turn off STSN recovery
(RCVYSTSN=NO) in DFSDCxxx.
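For example, a DFSDCxxx fragment along the following lines (a sketch only; the values shown
illustrate the choices being discussed and are not recommendations) would let SRM default to
GLOBAL for everything while turning off STSN status recovery:
SRMDEF=GLOBAL
RCVYSTSN=NO
RCVYCONV=YES
RCVYFP=YES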
Will single signon be enforced within the IMSplex?
Sysplex terminal management can be used to enforce single signon for a user across the
entire IMSplex. Since single signon is the default, you do not need to do anything to get
this capability. When a user signs on, a USERID entry is made in the resource structure
which prevents that user from signing on again from another NODE. If you do NOT want
single signon, then you must code SGN=M (or other similar value) in DFSPBxxx. It is
important to note that the first IMS to join the IMSplex sets this value. Later IMSs, even if
their SGN value is different, cannot override it. It can only be changed by shutting down all
IMSs in the IMSplex and then starting up again with the correct value. So, be sure you get
it right the first time.
Is there an Output Creation Exit? If so, does it need to be updated? If not, is one needed?
When a message arrives in IMS, and the destination field of that message is not defined
locally, IMS will query the Resource Manager to determine whether that name is known
somewhere else in the IMSplex. If it is found to be an LTERM defined by some other IMS,
then the front-end IMS will enqueue it on the LTERM Ready Queue in the shared queues
structure. If it is found by RM to be a transaction, then the Output Creation Exit is called
with a flag indicating that this unknown destination is defined in the resource structure
as a transaction. The exit has several choices. It can allow the transaction to be queued
with the default characteristics (may not be correct), it can set transaction characteristics
(if it knows what they should be, but the exit probably does not), or it can reject the
transaction.
We think the last alternative is best. If a transaction is not defined locally, it should be
rejected. If it is important for it to be defined locally, then online change should be used to
add it with the proper attributes (for example, Fast Path, conversational, response mode,
single segment, etc.).
What will a user do if IMS fails while one is logged on? Wait for ERE? How long? Will the
user be allowed to override RM affinities? Under what circumstances? Does the Logon or
Signon Exit have to be updated to allow stealing?
This is related to the above question about understanding what kind of significant status
will exist within the IMSplex environment, and what users need to do to recover from
session or IMS failures. If SRM=LOCAL, a Logon or Signon Exit will probably be needed
to allow the user to override the RM affinity and let a new IMS “steal” the NODE.
How will the RM structure be defined? Duplexed? Autoalter? Size (max, init, min)?
Where? Are SMREBLD and SMDUPLEX allowed by the OS/390 or z/OS systems? Who can
make decisions about structure rebuild?
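One possible answer, purely as a sketch (the structure name, sizes, and Coupling Facility
names are invented), is a CFRM policy entry that enables autoalter and system-managed
duplexing for the resource structure:
 STRUCTURE NAME(IMSRSRC01)
           INITSIZE(8000)
           SIZE(16000)
           MINSIZE(8000)
           ALLOWAUTOALT(YES)
           FULLTHRESHOLD(80)
           DUPLEX(ENABLED)
           PREFLIST(CF01,CF02)
The same structure name would then be specified in the RM and CQS initialization members.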
callable services will query RM to see if it exists or is active somewhere else in the
IMSplex by looking in the resource structure. If it is found in the structure, then control
blocks are built representing that resource and passed to the exit. The default for this is
global, but the exit can request that only local resources be searched. You should examine
your exits which use callable services to determine whether or not they need to be
updated.
In one respect this may be thought of as a single point of control. In reality, in IMS Version 8 it
is a single point of command entry and response. At this time (IMS Version 8) it
cannot completely control the IMSplex since it does not receive messages originating within
IMS and reporting, for example, a problem. An example would be the DFS554A message
indicating that an application abend has occurred and that a PSB and transaction code have
been stopped. Many users would like to automatically restart the PSB (program) and
transaction code when this message is sent by IMS. Most automation tools available in the
IMS environment can do this. However, this message would not be sent to the OM client.
It is, however, possible to capture those messages through other means. For example,
NetView could detect the message and invoke a REXX exec which could then issue
commands to start the stopped resources. Many variations of this technique are possible.
During this functional planning phase, you should consider the ways in which you intend to
exploit this OM interface to the IMSplex. Note that you may decide at any time later whether to
add new OM clients, but you should probably plan on at least installing and using the TSO
SPOC program which comes with IMS Version 8.
IMS Control Center is a part of the DB2 Admin Client that you can download from the DB2
Web site for free:
http://www.ibm.com/db2
You can find the details for the prerequisites and how to set up the IMS Control Center from
the following Web site:
http://www.ibm.com/ims/imscc
In addition, IMS Version 8 provides REXX commands, functions, and variables to allow you to
write REXX execs which register with SCI and send commands to the Operations Manager.
These REXX execs can be executed as batch programs, they can execute under TSO, or they
can be invoked dynamically by a program such as NetView.
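As a rough sketch of the flow of such an exec (the host-command environment setup and
subcommand names shown are our recollection of the REXX SPOC API and should be
verified against IMS Version 8: Common Service Layer Guide and Reference, SC27-1293; the
IMSplex name and command are invented):
/* REXX sketch: send one command to OM for the whole IMSplex */
address LINK 'CSLULXSB'    /* enable the IMSSPOC environment */
address IMSSPOC
"IMS PLEX1"                /* identify the IMSplex           */
"CART TASK0001"            /* command and response token     */
"/DIS ACTIVE"              /* the command OM will route      */
"END"                      /* end the SPOC session           */
Retrieving the command responses uses the same API, keyed by the CART value; the details,
and the complete list of subcommands, are in the manual cited above.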
Will existing automation programs and execs continue to run in an IMSplex?
If you already have some IMS automation programs running in your system, you should
examine them to see if they will continue to work as needed in an IMSplex. If you are
already running in a Parallel Sysplex with data sharing and/or shared queues, then it is
likely that your automation programs will continue to run as you expect. If you are currently
running in a single IMS environment, then your automation programs need to be evaluated
to see if they still work as required.
Do any of them need to be converted to use the OM interface?
If you want these automation programs to have access to all the IMSs in the IMSplex, then
they will have to be converted to use the OM interface. If they are vendor products, you
should check with the vendors to see if they have plans to take advantage of this
environment.
Are any new automation execs needed? Who will write them? Will they be provided by
vendors?
Now that you are going to be running with a different environment, including three new
address space types and probably several of each, plus an optional new structure in the
Coupling Facility, you may want to consider additional automation. For example, you may
want an automation script to start up the CSL address spaces in the correct order, or to
detect the failure of one of the address spaces and restart it (although we recommend
ARM for this). When any of the address spaces are started, if they try to connect to one
that is not there, an error message will be issued. This could be the trigger for an
automation program to start the missing address space.
It is possible for you to have multiple IMSs which do not share data and do not share message
queues to define themselves as part of the same IMSplex. If this is done, then a SPOC or
other AOP program could join the same IMSplex and be used to control all of the IMSs that are
otherwise unrelated.
determine the number of SCI address spaces you will need. Each one will require
execution JCL and an initialization PROCLIB member (CSLSIxxx); sample initialization
members for the CSL address spaces are sketched at the end of this list.
– How many OMs will be started? Where?
While each LPAR requires an SCI address space, only one Operations Manager (OM)
is required in the IMSplex. At least two are recommended for availability, and there can
be one on every LPAR in the IMSplex. It is not likely that multiple OMs would be
required for performance reasons since it is not likely that command entry will
represent a very large workload. If you do not plan to have an OM address space on
every LPAR, then you need to decide which LPARs they will be started on.
– Will a resource structure be required?
The resource structure is a list structure in the Coupling Facility that is required to
support sysplex terminal management. It is not required for any other CSL function,
although global online change will make use of it for resource consistency checking if it
exists. At this stage of planning, you need only decide whether or not you will be using
STM and therefore require a resource structure. Note that if there is no structure, then
only one RM address space is allowed.
– How many RMs will be started? Where?
While each LPAR requires an SCI address space, only one Resource Manager (RM)
address space is required in the IMSplex. At least two are recommended for
availability. Since the RM may access the resource structure multiple times for each
transaction, additional RMs may be advisable for performance reasons as well. As
noted above, if there is no resource structure, then you can have only one RM in your
IMSplex.
– Will CQS be used to support both shared queues and resource management?
The IMS Version 8 Common Queue Server (CQS) can be used by the IMS Version 8
control region to support shared queues, and by the RM to support STM. Unlike RM,
there must be a CQS on each LPAR where there is either a control region using shared
queues, or a Resource Manager using the resource structure. If there is both a control
region and a RM on the same LPAR, then only one CQS is required on that LPAR,
although you may decide to have one for shared queues and one for RM.
The IMS Version 8 CQS does not, however, support previous versions of IMS for
shared queues. So, if you have an IMS Version 6 or IMS Version 7 control region on
any LPAR, you will need an IMS Version 7 CQS also on that LPAR to support shared
queues. You may have both an IMS Version 8 and an IMS Version 7 control region on
the same LPAR, in which case you would require both a version 7 and a version 8
CQS.
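For illustration, the sample initialization members promised above might contain no more
than the following; the member suffixes, IMSplex name, and structure name are invented,
and the full parameter lists and exact syntax are in IMS Version 8: Common Service Layer
Guide and Reference, SC27-1293.
In CSLSIxxx (SCI):
  IMSPLEX(NAME=PLEX1) ARMRST=Y
In CSLOIxxx (OM):
  IMSPLEX(NAME=PLEX1) ARMRST=Y CMDLANG=ENU CMDSEC=N
In CSLRIxxx (RM):
  IMSPLEX(NAME=PLEX1,RSRCSTRUCTURE(STRNAME=IMSRSRC01)) ARMRST=Y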
Will VTAM Generic Resources be used?
IMS support for VTAM Generic Resources (VGR) was introduced in IMS Version 6. The
IMSs could join a VTAM generic resource group to make logging on to the IMSplex easier
for the end-user by using a generic resource name instead of the IMS APPLID. In your
initial implementation, you may choose to use VTAM Generic Resources as a means of
distributing the online workload to multiple IMSs in the IMSplex before implementing
shared queues. If this is the case, then there are no CSL issues related to VGR. When
shared queues are used with CSL, then the use of VGR does have some impact on
sysplex terminal management.
Beginning with z/OS 1.2, VTAM supports session level affinities which improve VTAM
generic resource affinity management in the IMSplex by allowing IMS to set the affinity
management to either IMS or VTAM for each VTAM session. If you are running at less
than z/OS 1.2, you will need to specify an IMS parameter (GRAFFIN) to determine who
will manage the affinities.
either specify the IMSPLEX name as an execution parameter, or that the DBRC SCI
Registration Exit (DSPSCIX0) be coded and available to DBRC.
• Sysplex terminal management is the only function that requires both shared queues
and a resource structure. You may want to consider easing into CSL by
implementing the other functions before defining a resource structure. Note,
however, that adding a resource structure later on requires an IMS cold start.
Will BPE PROCLIB members be unique or shared?
Four address space types run on top of the Base Primitive Environment (BPE). These
are CQS, SCI, OM, and RM. Two PROCLIB members define the BPE environment for
each of these address space types. Each of these two members can be either unique
to the address space, or shared across all address spaces.
– BPE configuration (BPECFG)
Each of these address spaces has a BPE configuration PROCLIB member that
describes the BPE component of the address space. It identifies, for example, trace
levels and user exits. All parameters have defaults and it is not even necessary to have
a BPE configuration member (a sample member is sketched after this list).
– BPE user exit list (EXITMBR in BPECFG)
The BPE configuration member identifies for each component, a user exit list member.
The user exit list identifies user exits that are to be invoked by BPE on behalf of that
component (BPE, CQS, SCI, OM, RM).
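For illustration, a BPE configuration member shared by all four address space types might
contain only a language setting, default trace levels, and pointers to the exit list members.
The member names and trace levels below are invented examples, and the exact statement
syntax is in IMS Version 8: Base Primitive Environment Guide and Reference, SC27-1290.
LANG=ENU
TRCLEV=(*,LOW,BPE)
TRCLEV=(*,LOW,CQS)
EXITMBR=(BPEEXIT0,BPE)
EXITMBR=(OMEXIT00,OM)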
Are any user exits needed?
If user exits are required, they must be included in the BPE user exit list PROCLIB
member (the BPE configuration PROCLIB member points to the exit list member). None of
the BPE exits are required, although the user may elect to code one or more of them
based on the installation’s requirements. CSL exit interfaces are documented in the IMS
Version 8: Common Service Layer Guide and Reference, SC27-1293.
Two of the exits that may be defined in the BPE user exit list member are BPE exits. The
use of these exits, and the interface, is documented in IMS Version 8: Base Primitive
Environment Guide and Reference, SC27-1290. You should review these exits and decide
whether you need them. Our guess is that you will not.
– BPE Initialization and Termination Exit?
These exits get invoked late during BPE initialization in the address space and early
during BPE termination. The termination exit does not get invoked if the address space
abends. It is not clear what the user would use these exits for, but the initialization exit
may, perhaps, be used to load a table that would be used later by other exits, such as
the security exit. The termination exit could then delete the table.
– BPE Statistics Exit?
This exit, when it exists, can be used to gather BPE statistics. The exit gets driven
based on a user specified statistics control interval.
Note that each component (OM, RM, and SCI) also supports these two exits. Other user
exits apply to particular components. For example, the Input Exit applies only to the
Operations Manager. These other user exits are documented in IMS Version 8: Common
Service Layer Guide and Reference, SC27-1293. Review each of these exits and decide
whether or not you need them. Of all possible exits, the ones most likely to be useful are
OM exits.
– OM Security Exits
The Security Exit is similar to the IMS Command Authorization Exit (DFSCCMD0) and
gets invoked on all commands entered through the OM interface. It gets invoked after
• Any attempt to update the structure will fail if either SCI or RM are not available.
• Logons and signons will be rejected. Note that this applies even if SRM is not set to
GLOBAL; resource entries are always created for command status.
• Normal activities (for example, conversations, STSN, and Fast Path transaction
processing) will continue with the status being maintained locally. When the failed
component is again available, the status will be updated in the resource structure
(as required).
• Logoffs and signoffs will wait until SCI, RM, and the resource structure are
available.
Your IMSplex configuration must contain at least one of each of these address spaces,
probably (should) contain at least two of each, and may contain one on each OS/390 or
z/OS image. Since each of these is supportable by ARM, it is highly recommended
that they all be defined with the ARMRST=Y parameter (the default). There are some
caveats:
• If there are multiple OMs and/or RMs in the IMSplex, then cross-system restart
should not be enabled. There is no advantage to having multiple instances on a
single LPAR. The CSL instance on another LPAR can take over the work for a failed
CSL member in the IMSplex.
• If there is only one OM or RM in the IMSplex, then both local and cross-system
restart should be enabled. This would allow IMS access to OM and RM as quickly
as possible. Make sure that the candidate system is not the same one that already
has an OM or RM.
• Since there is already an SCI on every OS/390 or z/OS image, cross-system restart
should not be enabled for SCI (unless it is being restarted on an OS/390 or z/OS
image that does not currently have any IMSplex components). Wherever you move
IMS, there is probably already an SCI address space.
– RM structure
The resource structure is optional, but if you have one, it will be used for both command
significant status and, optionally, for end-user significant status. As mentioned above, if
you have a resource structure defined, then if it is not available, it may prevent some
activities from occurring. Although status will be kept locally while the resource
structure is unavailable, if IMS were to fail, that status could not be recovered globally.
You may want to consider duplexing the resource structure using system-managed
duplexing. For more information about the system-managed duplexing, refer to 2.7,
“Other connection services” on page 42.
To some extent, this is also true in a Parallel Sysplex. Users still connect to a single IMS, but it
may be any one of several IMSs in the IMSplex. Whether or not that IMS can process any
user-submitted transaction depends on the configuration and definitions of the individual
IMSs. Because of this, there is a need to plan for a much more detailed and robust
configuration for connectivity. The easiest configuration to manage is one where all of the
IMSs are clones and all of the databases are shared. Any transaction can execute on any
IMS, any BMP can run on any IMS, and any DRA or ODBA client can run anywhere there is
an available IMS. When applications are partitioned, or when databases are not shared, then
the work must be routed to an IMS where the applications and databases are available.
In this chapter we will address some of the considerations for connectivity to an IMSplex, and
some of the products and tools available to you, including:
VTAM network connectivity
TCP/IP network connectivity
Database connectivity
BMP routing and scheduling
In this chapter we examine some of the major elements that provide the foundations of the
IMSplex communications infrastructure. In this context, we are talking about users submitting
transactions to an IMS in the IMSplex. There are two types of networks that end-users may
use to submit message traffic to, and receive messages from, an IMS in the IMSplex. In some
cases, both types may be involved in the delivery of the message. Messages can also be
submitted directly by application programs running in IMS dependent regions.
VTAM SNA network
This is the traditional method for sending messages to the IMS DC component. IMS itself
is a VTAM logical unit (LU). Logon requests are routed to one IMS and all subsequent
message traffic is routed through VTAM directly to IMS. End-users may be terminals, such
as a 3270 (LU2) terminal or SNA printer (LU1), or programs themselves, such as a
FINANCE (LU0) terminal or an ISC client (LU6.1). Whatever the end-user LU type is, that
user is in session with (logged on to) a single IMS and every message goes to and from
that IMS.
– SLUTYPE2 (LU2, 3270, 3270 emulators)
These are 3270 terminals or 3270 emulators and have been the most common means
of accessing IMS through a VTAM network. They are sometimes called “dumb”
terminals since they have no internal programs to make decisions. The user typically
just logs on to whatever IMS APPLID they want to be connected to, and then works
from there.
– SLUTYPE1 (LU1, printers)
These are SNA printers which do not (usually) log on to IMS. Instead IMS acquires a
session with the printer either by the MTO issuing an /OPNDST NODE command or
IMS itself issuing a VTAM SIMLOGON command when output is queued. The subject
of printers is addressed later in this chapter.
– SLUTYPEP and FINANCE (LU0, ATMs)
These terminals are not dumb - they are intelligent. Most ATMs are of this type. They
can be programmed to make decisions about what IMS to log on to. They also utilize
VTAM set and test sequence numbers (STSN) command processing to synchronize
input and output message traffic during session initialization.
– ISC (LU6.1)
Intersystem Communication (ISC) supports communications between IMS and any
other application using VTAM LU6.1 protocols. Typically, these “other” applications are
CICS regions (more common) or other IMS regions (less common). They may be any
user-written or vendor-written applications which obey the LU6.1 protocols. This is the
only VTAM protocol that supports parallel sessions between the remote node and IMS.
– APPC (LU6.2)
APPC clients are programs running on any platform, including S/390, with VTAM
network connectivity to the IMS host. When APPC protocols are used, the end-user is
in session with APPC/MVS - not IMS. APPC/MVS and APPC/IMS (IMS’s support for
APPC communications) use XCF for message traffic between them.
– Multiple systems coupling (MSC)
MSC is used strictly for IMS-to-IMS communications. It is supported by a private VTAM
protocol or a channel-to-channel (CTC) protocol between IMSs. End-users are in
session with one IMS, which may then use MSC links to send that message to another
(remote) IMS and receive a response.
TCP/IP network
Open transaction manager access (OTMA) was introduced in IMS Version 5 to process
messages destined for IMS from non-VTAM sources. The most common of these is a
TCP/IP client which may be part of the installation’s private network, a public network such
as the internet, or a combination of private and public networks. Rather than being in
session with a single IMS as with VTAM, the end-user is connected to an IMS OTMA
client, such as IMS Connect or WebSphere MQ. Like APPC, these OTMA clients use XCF
for message traffic to and from IMS. There are several significant differences, however,
between the OTMA connection and the APPC connection:
– IMS does not have to be on the same OS/390 image as the OTMA client.
– There is no concept of logging on to IMS. Depending on the network hardware and
software, and the OTMA client itself, each individual message may be routed to any of
the available IMSs in the IMSplex.
Other IMS components and services which participate in IMS message processing include
the following. This is not intended to be a rigorous description of the actual IMS components -
it is largely conceptual. Many of these interact with each other, and with other unnamed
components and services, to accomplish the total processing requirements for a message:
IMS DC component
This is the component of the IMS control region which manages network traffic. In the
case of VTAM, it consists of device dependent modules, which are specific to each of the
supported SNA protocols (LU0, LU1, LU2, or LU6.1), and APPC/IMS which supports the
XCF communications with APPC/MVS. For non-VTAM network traffic, it includes OTMA
which supports the XCF communications with the OTMA client.
IMS queue manager
The queue manager processes messages received by the DC component from the
network or from local application programs inserting messages to a TP-PCB. Its function
is to queue the message on a final destination such as a transaction (program) or a
network destination. These destination queues can be local or, if shared queues are
enabled, they may be global, physically residing in the shared queues list structures in a
Coupling Facility.
IMS has two queue managers: one for full function messages and one for Fast Path
expedited message handler (EMH) messages. EMH messages are much less common
than full function messages, and use different queuing techniques and different
structures in a shared queues environment.
IMS scheduler
The IMS scheduler is responsible for scheduling IMS dependent regions to process
messages which are queued on either local or global transaction queues. Like the queue
managers, there are two schedulers: one for full function transactions and another for Fast
Path EMH transactions.
Common Queue Server (CQS)
When shared queues are enabled, CQS acts as a server to IMS to manage the inserting
and retrieving of messages to and from the shared queues structures. It supports both full
function and Fast Path EMH messages. CQS also acts as a server to the Resource
Manager (RM) for access to the resource structure in the Coupling Facility.
Outboard of IMS are numerous hardware and software components and services which play
a role in routing messages to and from IMS. Each of these will be described in varying
amounts of detail later in this chapter, and include:
Rapid Network Reconnect (RNR)
IMS’s RNR function invokes VTAM single-node or multi-node persistent session services
to save VTAM session information either in a local data space (SNPS), or in a structure in
the Coupling Facility (MNPS). It is used by IMS and VTAM to rapidly and automatically
restore network connectivity following an IMS failure and emergency restart.
VTAM Generic Resources (VGR)
This VTAM service allows a VTAM user to use a generic name when logging on to the
IMSplex. Use of the generic name allows VTAM to route the logon request to any available
IMS in the generic resource group, relieving the end-user of the need to know specific IMS
APPLIDs or to know which IMSs might be available at any given time.
Network Dispatcher - WebSphere Edge Server
This software runs in a server on the “edge” of the TCP/IP network. It is conceptually the
last component before the TCP/IP stack (in the TCP/IP address space) on an OS/390
host. It may route an inbound message to any available TCP/IP server application.
Sysplex Distributor
This component runs as one of the TCP/IP stacks in the sysplex and performs a function
similar to that of the Network Dispatcher - that is, it can route an inbound message to any
available TCP/IP server application.
Telnet Server
This program can run in the OS/390 host or on a distributed platform in the TCP/IP
network. It is a TCP/IP standard application which receives inbound messages from a
TCP/IP client and forwards the message to a VTAM application (for example, IMS) across
a VTAM LU2 (3270) session.
IMS Connect
This TCP/IP server application runs in the Parallel Sysplex and is the destination
component to which TCP/IP end-user clients can route input messages for IMS. IMS
Connect can route incoming traffic to any IMS in the IMSplex which is a member of the
same OTMA XCF group.
WebSphere MQ
This program connects to the network as an APPC (LU6.2) partner TP program or TCP/IP
application. A message routed to WebSphere MQ uses XCF to queue the message for
any MQ client (for example, IMS) in the Parallel Sysplex. MQ has two options for delivering
messages to IMS. The MQ Bridge uses XCF and OTMA to put messages directly on the
IMS message queues. The MQ Adapter uses an IMS “listener BMP” to trigger a
transaction in IMS which then issues an MQ_GET call to retrieve the message from MQ,
bypassing the IMS message queues.
The remainder of this section will address some of the planning considerations both for the
native IMS network support and for the outboard connectivity components and services which
you may want, or need, to include in your migration and implementation plans.
As with all users, the 3270 user will have to be trained on what action to take when the IMS
they are logged on to fails or is shut down. This may depend to some extent on that user’s
status at the time of the failure, and whether or not shared queues and sysplex terminal
management are active.
Shared printers
Statically defined printers can be shared by multiple IMSs in the IMSplex by specifying:
TERMINAL .....,OPTIONS=(SHARE,RELRQ)
ETO printers can be shared by coding the corresponding parameters on the logon and user
descriptors:
L PRTNODEA ...,OPTIONS=(RELRQ) <logon descriptor>
U PRINTERA ...,AUTLGN=PRTNODEA <user descriptor>
With these options set, each IMS will attempt to log on to a printer whenever there is output
queued. The RELRQ option tells IMS to release the printer (close the session) if another IMS
requests it. The potential problem here is obvious - printer thrashing. You could spend a lot of
time opening and closing printer sessions as they bounce back and forth between IMSs. A
better option to consider might be to not share the printers. Then the issue becomes how to
get the output messages to the IMS which owns the printer.
The STSN terminal uses this information to determine if it has to resend the last input and to
inform IMS whether IMS should resend the last recoverable output message. This STSN flow
occurs whenever a STSN session is established with IMS, even if the session had been
normally terminated the last time the terminal was connected to IMS.
Without IMS Version 8 running with the Common Service Layer, sysplex terminal
management (STM), and global status recovery mode (SRM=GLOBAL) specified for the
STSN device, once a STSN device is logged on to a particular IMS, only that IMS knows the
sequence numbers. If it is important to the device to warm start the session (that is, to know
the last sequence numbers), then that device must continue to log on to the same IMS with
which its initial session was established. If it logs on to another IMS, that IMS knows nothing
about the sequence numbers and will start the session cold with the sequence numbers set
at zero.
With STM and SRM=GLOBAL, however, the sequence numbers are maintained in the
resource structure in the Coupling Facility and are therefore available to any IMS for session
initiation, allowing the user to log on to any IMS and continue with the same sequence
numbers as of the prior session termination.
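For illustration, with IMS Version 8 the status recovery mode default and STSN recoverability can be requested in the DFSDCxxx PROCLIB member; a minimal sketch (the values shown are chosen for this example and assume the CSL, a resource structure, and sysplex terminal management are in place) is:

   SRMDEF=GLOBAL
   RCVYSTSN=YES

SRMDEF=GLOBAL makes GLOBAL the default status recovery mode for terminals logging on to this IMS, and RCVYSTSN=YES requests that STSN status, including the sequence numbers, be recoverable across the IMSplex.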
ISC in an IMSplex
Prior to the IMSplex environment, ISC sessions were always with a single IMS. In an
IMSplex, you can continue to establish the ISC session(s) between CICS (for example) and
the same IMS that it was connected to before. However, if that IMS is not available, CICS has
no access to the IMSplex. Therefore, it may be, and probably is, desirable to establish the ISC
sessions between CICS and multiple IMSs, each with the capability to process the
transaction. If one IMS is down, the others may still be available to process the workload.
If the original SANJOSE ISC definitions to the BOSTON CICS system are cloned, and the
BOSTON CICS system definitions are not changed, only the SANJOSE IMS subsystem can
have a connection to the BOSTON CICS system. This is true because the BOSTON CICS
system has no definitions that point to SANJOSE2. BOSTON CICS would never initiate a
session with SANJOSE2, and an attempt to initiate it from SANJOSE2 would fail because it is
not defined in BOSTON. Therefore, one alternative is to make no changes to the BOSTON
CICS system. That is, the BOSTON and SANJOSE systems have the active set of parallel
sessions between them.
One can consider defining and implementing another set of parallel ISC sessions between
BOSTON and SANJOSE2 to remove the SANJOSE to BOSTON connection as a single point
of failure. To do this for cloned IMS systems in the IMSplex:
On the CICS side, an additional DEFINE CONNECTION must be coded specifying
SANJOSE2 as the NETNAME. Additional DEFINE SESSIONS must be coded to define
the individual sessions. These should have SESSNAMEs and NETNAMEQs that are different
from those used for the connections to SANJOSE, to avoid duplicate names being used.
DEFINE CONNECTION
NETNAME(SANJOSE)
DEFINE SESSIONS
SESSNAME(S1)
NETNAMEQ(SP1)
DEFINE SESSIONS
SESSNAME(S2)
NETNAMEQ(SP2)
DEFINE CONNECTION
NETNAME(SANJOSE2)
DEFINE SESSIONS
SESSNAME(S3)
NETNAMEQ(SP3)
DEFINE SESSIONS
SESSNAME(S4)
NETNAMEQ(SP4)
On the IMS side, assuming cloned definitions, each IMS will require that the active
SUBPOOLs and LTERMs be unique, but all SUBPOOLs and LTERMs can be defined to
both IMSs. Of course, only some of them will be used by each IMS. Additional SUBPOOL
and NAME macros should be coded in the common system definition. The SUBPOOL
names should match the NETNAMEQ names defined in the BOSTON CICS.
TERMINAL NAME(BOSTON)
VTAMPOOL
SUBPOOL NAME=(SP1)
NAME LT1
SUBPOOL NAME=(SP2)
NAME LT2
SUBPOOL NAME=(SP3)
NAME LT3
SUBPOOL NAME=(SP4)
NAME LT4
The term APPC/IMS refers to the support in IMS for communications with APPC/MVS and for
APPC conversations between IMS and the remote APPC program. The term APPC/IMS ACB
refers to the VTAM ACB in APPC/MVS that is associated with a particular IMS. This is defined
in SYS1.PARMLIB member APPCPMxx by associating an IMSID with one or more
APPC/IMS ACBs. When the /START APPC command is entered to IMS, IMS notifies
APPC/MVS to open these defined ACBs and begin accepting session initiation requests.
After establishing the LU6.2 VTAM session (the sessions can be single session or parallel
sessions), the remote APPC program can initiate an APPC conversation with the IMS
associated with that session (only one active APPC conversation per session is allowed).
Communications between IMS and APPC/MVS are supported by XCF communication
services.
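As an illustration only (the LU names, IMSID, and TP profile data set are assumptions), the association between an IMSID and its APPC/IMS ACBs might be coded in the APPCPMxx member along these lines:

   LUADD ACBNAME(LU62IMS1) SCHED(IMSA) BASE TPDATA(SYS1.APPCTP)
   LUADD ACBNAME(LU62IMS2) SCHED(IMSA) TPDATA(SYS1.APPCTP)

Here LU62IMS1 is the base LU for IMSA and LU62IMS2 is an additional LU; both are opened when /START APPC is entered on IMSA, as described above.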
There are only a few issues related to APPC and an IMSplex. They are similar to those of any
IMS VTAM terminal.
The APPC program session must be with an APPC/IMS which is on the same OS/390
image as the IMS with which it wants to initiate a conversation.
If that IMS fails, or is not available for any other reason, the remote APPC program must
either wait for IMS to be restarted, or establish a session with another available
APPC/MVS. Like the 3270 user, training will be required to determine how (and if) to
establish another session.
MSC workload balancing attempts to balance the workload among the IMS systems in the
IMSplex by distributing the work to the various systems through MSC definitions and/or MSC
exits. For example, one could have all of the end-users logged on to one of the IMS systems
and have that system route some percentage of the transactions to another system or
systems within the IMSplex. Let us assume, for example, that all of the end-users are logged
on to IMS1. In addition, we want 40 percent of the transaction workload to process on IMS1
and the other 60 percent to process on IMS2.
Routing based on specific transaction codes (a possible solution when systems are
partitioned by application) can be accomplished through the use of IMS system definition
specifications. Routing based on percentages (a possible solution when systems are cloned)
requires the use of the TM and MSC Message Routing and Control User Exit (DFSMSCE0).
For more information on planning for MSC use within your IMS Parallel Sysplex environment
please refer to IMS/ESA Multiple Systems Coupling in a Parallel Sysplex, SG24-4750.
Automatically recognizes and avoids routing transactions to unavailable server
system/MSC links
Provides for automatic workload reconfiguration in the event of both planned and
unplanned outages
Provides an online/real-time administrator interface for monitoring and dynamically
updating the Workload Router configuration
Although the basic functions of MSC can be used to distribute application workload through
the IMSplex, with this tool, you have more control over the execution of the MSC environment.
The Workload Router supports IMS Versions 7 and 8.
Until that IMS is restarted, that terminal is considered to be in session with the failed IMS and
cannot log on to another IMS. Because of this, when running IMS in an IMSplex where the
user would want to log on to another IMS without waiting for the failing IMS to restart, this
parameter should be specified as RNR=NONE. Do not specify RNR=NRNR, as this holds the
session active until IMS is restarted and then terminates it, giving you the worst of both
worlds.
VTAM routes a logon request to one of the IMSs in the generic resource group based upon
the following algorithm:
If the terminal has an existing affinity for a particular IMS, the session request is routed to
that IMS (see “IMS and VGR affinities” below). When this is the case, the following steps
are not invoked.
The preceding algorithm does have one exception. If an end-user logs on to a specific IMS
APPLID, that request will be honored and bypass the algorithm entirely, including the exit.
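For illustration (the generic and specific names are assumptions), each IMS in the group might specify the same generic resource name with the GRSNAME= parameter, for example in its DFSDCxxx member, while keeping its own unique APPLID:

   GRSNAME=IMSPROD

End-users then log on to IMSPROD and let VTAM choose a member of the group, while a logon to a specific APPLID such as IMSA bypasses the selection algorithm as described above.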
deletion for each individual session according to a set of rules which generally result in
VTAM managing the affinity any time SRM=GLOBAL.
– Refer to IMS Version 8 Implementation Guide, A Technical Overview of the New
Features, SG24-6594, for a complete description of sysplex terminal management and
VTAM Generic Resources.
User training
It may be necessary for the user to re-connect after a failure. Ideally, this would be to the
same VGR group name as used before, but this might fail. Logging on directly to another IMS
system’s APPLID may result in an affinity remaining on the first IMS. The user would need to
be made aware that reconnecting to the first IMS (after an unknown period) may result in
re-establishing the significant status that existed at the time of failure.
Another possibility, following an IMS failure, would be for the user to retrieve messages
asynchronously. This could happen in both a shared queues and non-shared queues
environment.
In all the above scenarios, the end-user should be trained to handle session reconnect as well
as affinity management.
A single remote APPC node can establish multiple sets of single or parallel sessions with one
or more APPC/IMS ACBs. For example, program APPCR1 can establish parallel sessions
with LU62IMS1 and another set of parallel sessions with LU62IMS2. When using generic
resource support, VTAM will select the ACB with which to establish the first session. Each
additional parallel session will be routed to the same APPC/IMS ACB. Different sets of
parallel sessions from the same remote APPC program (determined by using different mode
tables in the session initiation request) can be routed to different APPC/IMS ACBs even when
using the APPC generic resource name (GRNAME).
Session managers can negate many of the advantages of VTAM Generic Resources. VGR’s
algorithm for distributing logon requests is first to try to balance the logons equally across the
IMSs in the generic resource group. However, if the LU making the logon request is local to
that VTAM, and if there is a local IMS, then the logon request will be sent to that IMS
regardless of the number of logons already active there. If there were only a single session
manager, then all logons would be to the same local IMS. There are a couple of possible
solutions to this problem:
Make sure the session manager is running on a different LPAR than any of the members
of the IMS VTAM generic resource group. When this is the case, the session manager LUs
are not local and normal VTAM resource resolution algorithms apply.
VTAM provides the ability to write a VTAM generic resource resolution exit
(ISTEXCGR) which can override VTAM’s decision about the IMS to which to route the
logon request. This exit can be used to distribute logon requests from all LUs, including
local session managers.
Working from the inside out, we will discuss the following products, programs, and services
that utilize XCF and OTMA to enter transactions. There is also the special case of the
WebSphere MQ Adapter. Again, we are interested primarily in how their implementation
might change from a non-IMSplex to an IMSplex environment.
User applications written in C/C++ and using the OTMA callable interface documented in
IMS Version 8: Open Transaction Manager Access Guide, SC27-1303.
Figure 6-1 shows a client in a TCP/IP network, such as the Internet, entering a
transaction through one of two IMS Connects to an IMS in the IMSplex. The physical
connection to the TCP/IP stack is discussed later in this chapter. In this
configuration, that user may be connected to either of the two IMS Connect TCP/IP server
applications. Whichever IMS Connect receives the message can route it to any IMS in the
same OTMA XCF group to which IMS Connect itself belongs.
[Figure 6-1: Remote TCP/IP clients connect to either of two IMS Connect address spaces in the sysplex; each IMS Connect uses XCF and OTMA to deliver messages to IMSA or IMSB.]
IMS Connect maintains a DATASTORE Table which identifies the status of each IMS
connected to the same XCF group. A User Message Exit in IMS Connect can use this table to
determine which IMSs are available and to decide to which of the available IMSs in the
IMSplex to route the message. In this way, if IMSA (for example) is unavailable, inbound
messages can be routed to IMSB.
Once a message is received by IMS, it will queue that message on a local or shared queue
where it will be scheduled for processing in a dependent region. The response to the input is
then sent back to the remote client through OTMA and IMS Connect. The TPIPE name in the
input message prefix is included with the output response, telling IMS Connect where to send
the response. IMS does not know the ultimate destination of the message.
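For illustration, an IMS Connect configuration member (HWSCFGxx) supporting the configuration in Figure 6-1 might contain statements like the following; the IDs, XCF group name, port number, and exit name are assumptions for this sketch:

   HWS (ID=HWS1,RACF=N)
   TCPIP (HOSTNAME=TCPIP,PORTID=(5000),MAXSOC=50,EXIT=(HWSSMPL0))
   DATASTORE (ID=IMSA,GROUP=IMSXCF,MEMBER=HWS1,TMEMBER=IMSA)
   DATASTORE (ID=IMSB,GROUP=IMSXCF,MEMBER=HWS1,TMEMBER=IMSB)

Each DATASTORE statement names an IMS in the same OTMA XCF group, and these entries populate the DATASTORE table that the user message exit can interrogate when choosing where to route a message.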
6.3.3 WebSphere MQ
WebSphere MQ can connect to the network as a VTAM APPC (LU6.2) partner TP program or
as a TCP/IP server application. Once a message is received and queued by MQ, there are
three methods by which it can deliver these messages to IMS:
IMS Bridge
IMS Adapter
Native MQ API call from IMS application program
[Figure: WebSphere MQ queue managers MQA and MQB in the sysplex, connected to the network by TCP/IP and APPC, deliver messages to IMSA and IMSB through XCF and OTMA.]
Only a single IMS Bridge task can run against the shared queue at a given time. However, if
the queue manager where the task is running abnormally terminates, the other queue
managers in the queue sharing group will be notified and a race will occur to start another
instance of the IMS Bridge task to serve the shared queue.
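As a sketch (the storage class, queue, XCF group, and CF structure names are assumptions), the association between a bridge queue and OTMA is made through a storage class that names the XCF group and the target IMS member, for example:

   DEFINE STGCLASS(IMSABRDG) XCFGNAME(IMSXCF) XCFMNAME(IMSA)
   DEFINE QLOCAL(IMS.BRIDGE.QUEUE) QSGDISP(SHARED) CFSTRUCT(APPL1) STGCLASS(IMSABRDG)

Messages put to IMS.BRIDGE.QUEUE are then passed by the IMS Bridge task, through XCF and OTMA, to the IMS identified by XCFMNAME.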
Figure 6-3 shows two WebSphere MQ address spaces, one on each of two systems in the
Parallel Sysplex. One is TCP/IP connected and the other is APPC connected. In this
configuration, IMS and MQ must be on the same OS/390 image. If IMSB, for example, fails,
then MQB would not be able to route transactions to IMSA.
[Figure 6-3: WebSphere MQ queue managers MQA (TCP/IP connected) and MQB (APPC connected) in the sysplex, each on the same OS/390 image as the IMS it serves.]
If you are planning to create a QSG with cloned queue managers, then you should ideally
have ‘cloned’ transaction managers that connect to each of these queue managers. So, for
example, it should be possible to run instances of your transactions on each transaction
manager. Additionally, if you are using triggering, then you should have trigger monitors
started on each transaction manager.
For a more detailed description of using the WebSphere MQ queue sharing group, refer to the
IBM Redpaper WebSphere MQ Queue Sharing Group in a Parallel Sysplex environment,
REDP3636.
Note that the client application references the target server using the VIPA address or a
hostname that resolves to the VIPA address. In this example it is 10.0.3.5. The real network
interfaces 10.0.1.1 and 10.0.1.2 appear to be intermediate hops. When the connection
request is made, one of the network interfaces (for example, 10.0.1.1) is chosen for the
connection and all subsequent data transfer.
The definition and activation of VIPA is done through configuration statements in the
hlq.profile.tcpip file. IMS Connect is unaware of VIPA. It simply connects to a TCP/IP stack
and relies on the stack to perform the appropriate network function.
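For example, a static VIPA of 10.0.3.5 (the address used in this discussion; the device and link names are arbitrary) could be defined in hlq.profile.tcpip with a virtual device and link, roughly as follows:

   DEVICE VDEV1  VIRTUAL 0
   LINK   VLINK1 VIRTUAL 0 VDEV1
   HOME
      10.0.3.5   VLINK1

Because the device is virtual, the VIPA remains reachable as long as at least one of the real interfaces (10.0.1.1 or 10.0.1.2 in the example) is active.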
Dynamic VIPA
Beginning with OS/390 V2R8 the VIPA concept was extended to be more dynamic and to
support recovery of a failed TCP/IP stack. Dynamic VIPA takeover was introduced to
automate the movement of the VIPA to a surviving backup stack. The use of this capability
presumes that server application instances (for example, IMS Connect instances), exist on
the backup stack and can serve the clients formerly connected to the failed stack.
In the example shown in Figure 6-5 on page 151, the Dynamic VIPA IP address 10.0.3.5 is
defined as having a home stack on TCPIPA and a backup on TCPIPB.
[Figure 6-5: Automatic VIPA takeover. Other TCP/IP stacks can act as backup for a VIPA address, allowing an active stack to assume the load of a failing stack; the stacks share information using OS/390 XCF messaging, and an IMS Connect listening on port 5000 is required on each stack. A remote client connects to port 5000 at 10.0.3.5. TCPIPA (real interfaces 10.0.1.1 and 10.0.1.2) defines 10.0.3.5 and backs up 10.0.3.8; TCPIPB (real interfaces 10.0.1.3 and 10.0.1.4) defines 10.0.3.8 and backs up 10.0.3.5; the routers perform dynamic route updating when a VIPA moves.]
Likewise, TCPIPB is defined as the primary for 10.0.3.8 and TCPIPA as the backup. Both
stacks share information regarding the Dynamic VIPAs through the use of XCF messaging
services. Each TCP/IP stack, therefore, is aware of all the Dynamic VIPA addresses and the
associated primary and backup order.
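A sketch of the corresponding profile statements for the two stacks, using the addresses from Figure 6-5 (semicolons introduce comments), might be:

   ; TCPIPA - hlq.profile.tcpip
   VIPADYNAMIC
     VIPADEFINE 255.255.255.0 10.0.3.5   ; primary owner of 10.0.3.5
     VIPABACKUP 1 10.0.3.8               ; rank-1 backup for 10.0.3.8
   ENDVIPADYNAMIC

   ; TCPIPB - hlq.profile.tcpip
   VIPADYNAMIC
     VIPADEFINE 255.255.255.0 10.0.3.8   ; primary owner of 10.0.3.8
     VIPABACKUP 1 10.0.3.5               ; rank-1 backup for 10.0.3.5
   ENDVIPADYNAMIC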
If a stack or its underlying OS/390 or z/OS fails, all other stacks in the sysplex are informed of
the failure. The VIPA address is automatically moved to the backup stack which receives
information regarding the connections from the original stack. All new connection requests to
the VIPA address are processed by the backup, which becomes the new active stack. Instances of
the server applications (for example, IMS Connect systems), listening on the same ports
should be automatically started if they are not already active on the backup. Additionally, the
network routers are informed about the change. From a remote client application perspective,
a connection failure is received when the primary stack fails but the client can immediately
resubmit a new connection request which will be processed by the backup (new active) stack.
The Network Dispatcher is a function that intercepts connection requests and attempts to
balance traffic by choosing and then forwarding the request to a specific server in the sysplex.
This helps maximize the potential of a sysplex because it provides a solution that can spread
client connections across all of the available servers in the sysplex.
The Network Dispatcher function was originally implemented and delivered in IBM networking
hardware such as the 2216 and the 3745 MAE. More recently, it has been delivered as an
integrated component of the IBM WebSphere Edge Server, Network Deployment Edition,
which can be implemented on platforms such as AIX®, Windows NT, Windows 2000, Sun
Solaris, and Linux. Figure 6-6 identifies some of its characteristics and capabilities.
[Figure 6-6: Network Dispatcher (WebSphere Edge Server, Network Deployment Edition) characteristics: establishes a session with MVS WLM if the servers are OS/390, balances workload based on workload goals, and never selects an unavailable server.]
In this environment, clients send their connection requests and data to a special IP address
that is defined as a cluster address to the Network Dispatcher. This same address is further
defined as a loop back alias address on all the sysplex IP stacks that contain copies of the
target application server, IMS Connect in this case. When a request resolves to this special
address, the Network Dispatcher selects one of the backend servers and forwards the packet
to the appropriate port and server.
The selection of a server for load balancing requirements can be controlled through several
mechanisms. It can be based on simple round-robin scheduling to available servers, or it can
use more sophisticated techniques. These can be based on the type of request (HTTP, FTP,
Telnet, and so on...), by analysis of the load on the servers through information obtained from
the MVS Workload Manager (WLM), or even through an algorithm based on weights
assigned to each server. Figure 6-7 shows that a remote TCP/IP client sends a message to
the IP address of the Network Dispatcher, which then selects a host TCP/IP defined with the
same Loopback Alias.
Figure 6-7 Network Dispatcher controls workload distribution and load balancing
Once a server is selected, the connection request and all subsequent data packets on that
connection are routed to that server. Since IP packets contain the originating IP address of
the requesting client, the server can reply directly to the client without sending the output back
through the Network Dispatcher.
As a sysplex function, the Sysplex Distributor removes the configuration limitations associated with the Network
Dispatcher (XCF links rather than LAN connections are used between the distributing stack
and the target servers) and further removes the requirement of specific hardware in the wide
area network (WAN). It also enhances the Dynamic VIPA capability and takes advantage of
the takeover/takeback support for its own distributing and backup stacks. Figure 6-8 illustrates
the concept.
[Figure 6-8: Sysplex Distributor. A remote client connects to port 5000 at DVIPA 10.1.9.9. H1 is the distributing (routing) stack with the VIPA defined as primary, and H2 is the backup stack; the target stacks on H3, H4, and H5 define the VIPA as hidden and each runs an IMS Connect listening on port 5000. The Sysplex Distributor keeps track of connections, keeps track of target stacks and applications, and invokes WLM; the stacks communicate over XCF within the sysplex.]
As part of the implementation, the Sysplex Distributor is configured with a distributing stack in
one of the sysplex images (for example, H1) and a backup stack on another image (H2). The
other images (H3-H5) can be configured as secondary backups. Only the active distributing
stack takes the responsibility of advertising the dynamic VIPA (DVIPA) outside the sysplex.
The stacks in the sysplex communicate with each other using XCF services. H1 detects
which stacks in the sysplex have active listening PORTs (for example, Port 5000 for IMS
Connect) that are defined as part of the DVIPA environment. The distributing stack builds a
table to keep track of server information and also of all connection requests associated with
the DVIPA.
When an inbound connection request arrives, the distributing stack selects an available target
server with a listening socket and uses XCF services to send the connection to the selected
stack. The connection is actually established between the remote client and the target server,
in this case H3. The distributing stack on H1 updates its connection table with the information.
This allows H1 to know where to route subsequent data packets that are sent on that
connection. When the connection terminates, H3 notifies H1 that the connection no longer
exists so that H1 can update its table.
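For illustration, the distributing stack on H1 might define and distribute the DVIPA with statements along these lines (a sketch; the mask and the DESTIP ALL choice are assumptions for this example):

   VIPADYNAMIC
     VIPADEFINE 255.255.255.0 10.1.9.9
     VIPADISTRIBUTE DEFINE 10.1.9.9 PORT 5000 DESTIP ALL
   ENDVIPADYNAMIC

The backup stack on H2 would code a corresponding VIPABACKUP for 10.1.9.9 so that it can take over the distributing role if H1 fails.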
CICS access
ODBA access
If either of the preceding restrictions is encountered, IMS will ignore the statement that
caused the MVS call (no messages will be issued).
The advantage of this feature is that it allows easy movement of BMPs between IMSplex
members. This results in the ability of applications running in BMPs to continue to access
DB2 data using any available DB2 defined within the DB2 group. This increases end-user
access to data and makes workload balancing easier when necessary.
Figure 6-9 shows an example of a BMP which might be scheduled on either of two OS/390
images with data sharing IMSs and DB2s.
In this example, DB2A and DB2B are defined with the group attach name DB2X, and IMSA and IMSB are executed with IMSGROUP=IMSX.
Figure 6-9 BMP connectivity to IMS and DB2 data sharing environment
The key recommendation is that you include connectivity as an integral part of your IMSplex migration plan. All of the
following can play a role:
Data sharing
Shared queues
Common service layer resource manager
IMS Connect
WebSphere MQ
Dynamic VIPA
Sysplex Distributor
WebSphere Edge Server - Network Dispatcher
IMS and DB2 group attach
This chapter addresses the next two activities in migration planning and includes activities
that apply to the entire IMSplex as well as to individual functional components such as data
sharing, shared queues, and the Common Service Layer. Many of the planning activities
apply to non-IMSplex implementation as well as IMSplex migration, but we will try to identify
where you may see some differences:
Configuration planning
What will the intermediate and final configuration of your IMSplex look like? What are the
IMSplex components and how will users be connected to the IMSplex?
Security planning
How will you provide access security to the IMSplex? What new resources will require
protection, and how will existing resource security requirements change?
The final two activities are addressed in this volume in only a cursory fashion. The details of
these activities can be found in IMS in the Parallel Sysplex, Volume III: IMSplex
Implementation and Operations, SG24-6929:
Implementation planning
What tasks are required to implement an IMSplex? What are the requirements of the new
IMSplex components and what are the new requirements of existing IMSplex
components?
Operational planning
What are the operational considerations for execution in an IMSplex environment? How do
operations change? How are new operational components such as the TSO Single Point
of Control, or the IMS Control Center going to be used?
You should also include a plan for degraded mode processing if not all components are
available. For example, if the LPAR containing one of your IMS control regions is unavailable
for an extended period, what will you do? Will applications have to be moved, will end-users
wait or log back on to another IMS? How will those end-users reestablish their status on the
new IMS?
BMP regions
With data sharing, BMPs can be scheduled by any IMS in the data sharing group with access
to the data required by the BMP. The IMS control regions must specify an IMSGROUP
parameter, and the BMPs must use this as the IMSID. If these BMPs are accessing DB2, they
should code the DB2 group attach name in their SSM PROCLIB members.
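A sketch of the pieces involved, using the IMSX and DB2X names from the example in Figure 6-9 (the SSM entry shown is an assumed illustration of the positional entry format; verify the format for your release):

   IMS control regions:   ...,IMSGROUP=IMSX,...      (IMSA and IMSB)
   BMP regions:           ...,IMSID=IMSX,...
   SSM PROCLIB member:    DB2X,SYS1,DSNMIN10,,R,-

With these settings a BMP job can be submitted on either image; it attaches to whichever IMS on that image specifies IMSGROUP=IMSX, and the DB2X group attach name lets it connect to the local member of the DB2 data sharing group.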
Batch regions
If you have batch jobs for which, prior to data sharing, you DBRed the databases from the
non-sharing online subsystem, you may want to consider converting them to BMPs. BMPs
are much easier from both a management perspective and from an availability perspective.
They share the same logs as the other online IMS applications, and no matter how the BMP
might fail, the IMS control region can dynamically back out the databases (not always true for
batch jobs). If you decide to keep them as batch jobs, then you should identify on which
LPARs they will run and be sure that there is an IRLM available on that LPAR. Note, however,
that CPU time charged to the BMP will probably be higher than that charged to a non-sharing
(or even a sharing) batch region and may affect any CPU-based “charge-back” algorithms
you might be using with your clients.
Shared databases
It may be that not all databases are shared. The section titled “What to share?” on page 63
identified some reasons why some databases will not be shared. DBRC registration should
be done based on how these databases will or will not be shared. If not all databases are
shared, then special attention must be given to connectivity and BMP scheduling. If a
database is not shared at the block level, register it in the RECONs at SHARELVL(1) to allow
non-update access, such as concurrent image copy, to run while the database is online.
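For example, database registration at the appropriate share level is done with DBRC commands such as the following (the database names are placeholders):

   INIT.DB DBD(ACCTDB) SHARELVL(3)
   INIT.DB DBD(HISTDB) SHARELVL(1)

SHARELVL(3) allows block-level sharing by multiple IMSs, while SHARELVL(1) allows non-update access, such as concurrent image copy, to run while the database is online, as described above. CHANGE.DB can be used to alter the share level of a database that is already registered.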
IRLMs
For each data sharing group, you should plan to have one, and only one, IMS IRLM (IMS and
DB2 cannot share the same IRLM) on each LPAR where there is a data sharing IMS in the
same data sharing group. Each IRLM requires a different IRLMID, but the IRLM name
(IRLMNM parameter) should be the same for all IRLMs. This allows all IMSs that are running
on the same LPAR to connect to the local IRLM. This is particularly important when IMS, after
an LPAR failure, is being restarted (perhaps by ARM) on another LPAR where there is already
an IMS and an IRLM. In the event of a system failure, you should not let ARM perform a
cross-system restart of the IRLM on a system which already has an IRLM in the same data
sharing group.
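As a sketch (the procedure name and group name are assumptions), two IRLMs in the same data sharing group, one per LPAR, might be started with the same IRLM name but different IRLM IDs:

   //IRLM1 EXEC DXRJPROC,IRLMNM=IRLM,IRLMID=1,SCOPE=GLOBAL,IRLMGRP=IRLMGRP1
   //IRLM2 EXEC DXRJPROC,IRLMNM=IRLM,IRLMID=2,SCOPE=GLOBAL,IRLMGRP=IRLMGRP1

Because both register with the same IRLMNM, an IMS that is restarted on either LPAR (for example, by ARM after a failure) simply connects to whichever IRLM is local.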
The transactions may be from a VTAM network, a TCP/IP network (via IMS Connect and
OTMA), or from the MQSeries® bridge (via OTMA). IMS is sharing the databases (and the
RECONs) with IMS batch jobs and with other IMS data sharing configurations on other
OS/390 images. Also shown (partially) is a Fast Database Recovery (FDBR) address space
that would be supporting an IMS running on one of the other images. LCV is the local cache
vector used for buffer invalidation notification.
[Figure: data sharing configuration. Multiple OS/390 images (1 through n) connect to Coupling Facilities containing VSO and the other data sharing structures. Shown are an IMS online system (IMS1) with its local cache vector (LCV), IMS batch jobs, an ODBA client, an FDBR region (FDBR2), PSB scheduling of incoming transactions, the shared databases, and the shared RECONs; transactions arrive over the VTAM or TCP/IP network.]
7.1.2 Shared queues configuration
Shared queues makes it easier to distribute the workload across the IMSplex and data
sharing group, and makes the issues of connectivity easier to address, although careful
planning is still important. Shared queues adds its own components to the IMSplex
configuration.
You may also define any of these structures as duplexed (requires z/OS 1.2 or later).
However, since they are recoverable, and there is some additional overhead for duplexing,
this may not be a good choice.
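As a sketch of the CFRM policy definitions involved (the policy, structure, size, and CF names are assumptions), a full function shared queues structure might be defined, with duplexing allowed but not enabled, as follows:

   //DEFCFRM  EXEC PGM=IXCMIAPU
   //SYSPRINT DD SYSOUT=*
   //SYSIN    DD *
     DATA TYPE(CFRM) REPORT(YES)
     DEFINE POLICY NAME(CFRM01) REPLACE(YES)
       STRUCTURE NAME(MSGQ01)
                 SIZE(64000)
                 INITSIZE(32000)
                 PREFLIST(CF01,CF02)
                 DUPLEX(ALLOWED)
   /*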
System Logger
There must be a System Logger (IXGLOGR) address space on each OS/390 image where
there is a CQS address space supporting shared queues. If you have a Parallel Sysplex, this
address space probably already exists. A single System Logger address space provides
logger services to all users on the same OS/390 image. System Logger users such as CQS
write log records to a log stream.
Log streams
CQS requires a log stream for updates to the full function shared queues structures, and a
second log stream for Fast Path EMH if an EMH shared queues structure is defined. The log
streams must be defined in the LOGR policy.
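A sketch of the corresponding LOGR policy definitions for the full function log stream (the structure, log stream, size, and high-level qualifier names are assumptions) might be:

   //DEFLOGR  EXEC PGM=IXCMIAPU
   //SYSPRINT DD SYSOUT=*
   //SYSIN    DD *
     DATA TYPE(LOGR) REPORT(YES)
     DEFINE STRUCTURE NAME(MSGQLOG01)
            LOGSNUM(1) AVGBUFSIZE(4096) MAXBUFSIZE(65276)
     DEFINE LOGSTREAM NAME(CQS.MSGQ.LOG)
            STRUCTNAME(MSGQLOG01)
            LS_SIZE(5000)
            HLQ(IMSPLEX1)
   /*

A second, similar log stream and structure would be defined for Fast Path EMH if an EMH shared queues structure is used.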
Logger structures
One or two structures must be defined in the active CFRM policy for use by the System
Logger for CQS’s log stream(s). The LOGR policy associates the log streams with the logger
structures. These structures are:
Full function logger structure
CQS requires a log stream for updates to the full function shared queues structures. The
System Logger requires a logger structure for the log stream.
Fast Path EMH logger structure
CQS requires a log stream for updates to the Fast Path EMH shared queues structures (if
Fast Path is enabled). The System Logger requires a logger structure for the log stream.
Like the shared queues structures, these can be duplexed. However, the System Logger does
its own duplexing and structure duplexing should not be required.
[Figure: shared queues logging. Each image runs a System Logger with a logger data space and logger staging data sets; CQS log data is written to the logger structure in the Coupling Facility and offloaded to logger offload data sets, while each IMS continues to write its own OLDS. The Coupling Facility also holds the queued transactions and the list notification vectors (LNVs).]
Resource Manager
Only one Resource Manager (RM) address space is required in the IMSplex. If you do not
define a resource structure, only one RM is allowed in the IMSplex. If you do have a resource
structure, then at least two RMs are recommended for availability and performance. They can
be on any image where an SCI exists.
[Figure: Common Service Layer components and clients. SPOC and automation programs communicate with the IMSplex through the Operations Manager (OM); the Resource Manager (RM) uses the Common Queue Server (CQS) to access the resource structure in the Coupling Facility; and the Structured Call Interface (SCI) connects OM, RM, CQS, the IMS control region, online DBRC, DBRC batch and utility jobs with DBRC, IMS Connect, and the IMS Control Center. CSL services include automatic RECON loss notification, global online change, and sysplex terminal management; the master terminal, TSO SPOC, and IMS Control Center provide operator interfaces.]
Database access using the database resource adapter (DRA) to schedule a PSB
CICS Transaction Server to DB Control
When a transaction is entered at the front-end, authorization is done before it is put on the
shared queues. When the back-end processes it, it assumes it is authorized.
There are some cases, however, where the back-end must perform some type of security
checking. Specifically, when an application issues a call that would require additional security,
such as:
CHNG call to a transaction destination
ISRT SPA call with a different transaction code (deferred conversational
program-to-program switch)
AUTH call
How these cases are treated depends on whether or not you have coded a Build Security
Environment Exit (DFSBSEX0). The exit gets invoked during transaction scheduling and
determines how the above described CHNG, ISRT, and AUTH calls are handled in a
back-end IMS. The exit has several options:
Do not build an ACEE in the dependent region during scheduling. Build an end-user ACEE
in the dependent region later if needed for CHNG, ISRT, or AUTH calls. Use the ACEE for
authorization processing.
Build an end-user ACEE in the dependent region before the message is given to the
application. Use the ACEE for authorization processing.
Do not build an end-user ACEE at all. Call SAF for transaction authorization if required,
using the dependent region ACEE (if one exists) or the control region ACEE (if not); the
end-user ACEE is not used. Call the security exits (DFSCTRN0 and DFSCTSE0) if they exist.
Do not build an ACEE at all. Bypass the call to SAF. Call the security exits if they exist.
Do not build an ACEE at all. Bypass all security, including the security exits.
Do not build an ACEE at all. Call SAF for transaction authorization if needed, using the
dependent region ACEE (if one exists) or the control region ACEE (if not); the end-user
ACEE is not used. Bypass the calls to the security exits.
When the DFSBSEX0 exit is not coded, then when security is required in the back-end, an
ACEE will be built for each call, and then it will be deleted. If a transaction on the back-end
issues five CHNG calls, then the ACEE will be built and deleted five times.
If DFSINSX0 dynamically defines a transaction in the front-end, RACF security checking will be
performed, but no SMU security checking will be done - either in the front-end or in the back-end.
Structured Call Interface
The Structured Call Interface (SCI) address spaces determine who is allowed to join the
IMSplex and participate in the CSL services. You may define the IMSplex-name in the RACF
FACILITY class and then control who (what user IDs) are allowed to register with SCI and
thereby join the IMSplex. This process is described in IMS in the Parallel Sysplex, Volume III:
IMSplex Implementation and Operations, SG24-6929. The following is an example:
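A sketch, assuming an IMSplex name of PLEX1 (giving the full name CSLPLEX1, since the CSL address spaces prefix the name with CSL) and a RACF group IMSPLEX1 that contains the user IDs of the authorized address spaces:

   RDEFINE FACILITY CSLPLEX1 UACC(NONE)
   PERMIT CSLPLEX1 CLASS(FACILITY) ID(IMSPLEX1) ACCESS(UPDATE)
   SETROPTS CLASSACT(FACILITY) RACLIST(FACILITY) REFRESH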
You can, of course, identify all users as members of a group, and then just PERMIT the group
access to the IMSplex. Note that UPDATE access is required.
Do you want registration security for SCI?
If the answer to this is no, skip to the next section. Remember, however, that without
registration security, any program can register with SCI as an AOP and issue commands
to the IMSplex. While the unwanted user may still be thwarted by command security, it is
probably best not to let that user join in the first place.
What are the user IDs of the address spaces that will register with SCI?
If you do want registration security, then you have to define the IMSplex name in the RACF
FACILITY class and then PERMIT all valid users to register with that IMSplex name. Or
add all users to a RACF group and PERMIT the group. Every IMS control region, CQS,
OM, RM, and AOP user must be permitted.
– Are you going to implement automatic RECON loss notification?
If you plan to implement ARLN, then you must permit the user IDs of all the DBRC
instances that will be registering with SCI access to the IMSplex name.
– Will you be using the IMS Control Center?
If you are, then you must permit the user IDs of all the IMS Connect address spaces
that the IMS Control Center will be using to register with the IMSplex.
– Will you be writing your own automation programs?
If you do, then these programs must pass a user ID to SCI when they register, and that
user ID must be permitted access to the IMSplex in RACF. For the TSO SPOC, this will
be the user ID of the TSO user.
Another advantage is that it is more efficient to do it in OM rather than send the command
across the SCI link to be done by IMS. This is true especially since some commands may
be routed to multiple IMSs. If security is done in OM, it need be done only once.
What type of command security will be used in OM? SAF (RACF)? Exit? Both?
Here you have the same kind of choice you have with command authorization in IMS. You
can use SAF (RACF or equivalent), a command authorization exit, or both. When you
choose both, the RACF security will be done before the exit. The exit can override the
decision made by RACF.
The type of security you want must be defined in the OM initialization PROCLIB member
(CSLOIxxx) with the following parameter:
CMDSEC=N|E|R|A (the default is none)
If you decide to use a command authorization exit in OM (E or A), then you must define
that exit in the user exit list PROCLIB member identified in the BPECFG PROCLIB
member in use by OM. For example:
BPECFG PROCLIB MEMBER
EXITMBR=(SHREXIT0,OM)
SHREXIT0 PROCLIB MEMBER
EXITDEF=(TYPE=SECURITY,EXITS=(OMSECX),COMP=OM)
You then must write the exit (OMSECX in this example) and put it in the OM STEPLIB. If
you have multiple OMs, do not forget to put it in all STEPLIBs.
What user IDs will be authorized for command security in OM?
Since RACF security, and probably any exit security, is based on user ID, then you need to
determine what user IDs can issue what commands and define the commands and permit
the users in RACF, as shown above.
What OM user IDs must be authorized to join the IMSplex (register with SCI)?
If you have decided to use SCI registration security, then the user ID of each OM address
space must be permitted to access the IMSplex.
Resource Manager
Like the Operations Manager, the Resource Manager (RM) must be authorized to join the
IMSplex if the decision was to require authorization.
Is authorization required to connect to SCI?
If the answer is no, skip to the next section.
What RM user IDs must be authorized to join the IMSplex (register with SCI)?
You must identify the user IDs for each RM that is in the IMSplex and either add that user
ID to the IMSplex GROUP, or permit that user to access the IMSplex.
Resource structure
The resource structure is optional, but required if you want to enable sysplex terminal
management. If you do not have a resource structure, skip to the next section.
Will connection to the resource structure be protected by RACF?
Connection to the resource structure can be protected by defining the structure in the
RACF FACILITY class. Access to the resource structure can be permitted either by
individual user ID or by group. The required access is UPDATE.
What RM user IDs will be connecting to the resource structure?
The resource manager does not connect directly to the resource structure. RM uses CQS
to access the structure much the same way IMS uses CQS to access the shared queues
structures. The same CQSs that support IMS access to the shared queues can support
RM access to the resource structure.
You must determine the user IDs of the RMs that will connect to the resource structure
and either add them to a group which is authorized access to the structure, or permit each
user ID access to the structure.
Besides RM, what other applications (for example, vendor products) must be authorized to
connect to the structure?
CQS, acting as a server for RM, is the only address space in the IMSplex requiring access
to the resource structure. However, since the interface is open, any user or vendor can
write a program which accesses the resource structure through CQS. The user IDs of
these address spaces must also be defined to RACF and permitted access.
CQS
CQS can be a server to IMS for shared queues and a server to RM for the resource structure.
You must have a CQS on every system where there is an RM requiring access to the
resource structure. You must also have a CQS on every system where there is an IMS
requiring access to the shared queues. These can be the same CQSs.
What CQS user IDs must be authorized to join the IMSplex (register with SCI)?
Identify the CQS user IDs and permit them access to the IMSplex.
Is resource structure connection security required?
Identify the CQS user IDs and permit them access to the resource structure (See 7.2.4,
“Structure security” on page 174).
To do this, you must define the structure resource in the FACILITY class and then PERMIT
the connectors to access the structure. The resource name, in this case, has a high-level
qualifier of IXLSTR instead of CQSSTR. Connectors to be authorized must include all of the
CQSs in the IMSplex plus any other programs which will be connecting to the structure
directly (not through CQS); for example, IMS Queue Control Facility (QCF) product.
Example 7-5 shows how you might provide access security for the primary and overflow full
function shared queues structures.
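A sketch of the kind of definitions Example 7-5 illustrates, using assumed structure names (QMSGIMS01 and its overflow QMSGIMS01OFLW) and a RACF group CQSGRP for the connectors (the access level needed for a direct structure connection is typically ALTER; verify the requirement for your environment):

   RDEFINE FACILITY IXLSTR.QMSGIMS01     UACC(NONE)
   RDEFINE FACILITY IXLSTR.QMSGIMS01OFLW UACC(NONE)
   PERMIT IXLSTR.QMSGIMS01     CLASS(FACILITY) ID(CQSGRP) ACCESS(ALTER)
   PERMIT IXLSTR.QMSGIMS01OFLW CLASS(FACILITY) ID(CQSGRP) ACCESS(ALTER)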
Similar security can be defined for the other structures, such as the OSAM, VSAM, VSO,
IRLM, and resource structures.
RDEFINE STARTED *.OMAPROC STDATA(USER(OMA) GROUP(IMSPLEX1) PRIVILEGED(NO) TRUSTED(NO))
RDEFINE STARTED ** STDATA(USER(=MEMBER) GROUP(SYSPLEX1) PRIVILEGED(NO) TRUSTED(NO))
The last statement defines a generic started procedure. This is used when a started
procedure is not defined in the STARTED class (or in the RACF started procedure table) and
says to use the member name as the user ID.
Implementation planning should include both preparing the environment for the IMSplex, and
the actual implementation of the IMSplex in that environment to operational status.
The preparation phase is where you get the S/390, OS/390 or z/OS, and IMS environments
ready for cutover to operational status. This includes hardware, software, libraries, JCL
procedures, operational procedures, and any changes required to applications and
databases. This goes on while the current environment is still operational, and it can become
very confusing to keep libraries straight, install new hardware and software, prepare changes
to procedures, change applications, and so on. A well documented plan for how this will
occur can save a lot of problems later.
Define the tasks to implement your configuration (including any intermediate configurations)
through all phases of the migration. Plans for the implementation phase must include (if
necessary) termination of the existing environment and initiation of the new environment (for
example, IMS shutdown and a cold start). Do not forget to include the end-users in your
implementation plan.
Although this document does not prescribe a format for the implementation plan, the following
elements should be included:
Tasks What must be done?
Responsibility Who is responsible to see that it gets done?
Schedule When must it be done - start/complete/drop-dead dates?
Dependencies Is any task a prerequisite to another task?
Duration How long should each task take?
Status How will progress be monitored?
In the following list are some of the tasks required to prepare your environment for IMSplex
implementation and operation. This is only a selection of tasks, some of which (but not all) are
described in more detail below, and all of which are described in IMS in the Parallel Sysplex,
Volume III: IMSplex Implementation and Operations, SG24-6929.
Prepare the OS/390 or z/OS environment for the IMSplex:
– Program properties table
– Couple data sets
– SYS1.PROCLIB (and concatenations)
Identify new and existing procedures and parameters which must be created or updated
for the IMSplex:
– IMS control region environment
• Control region (CTL, DBC, DCC), DBRC, DLISAS, dependent regions (MPP, IFP,
BMP, JMP, JBP), IMSRDR, IMSWTxxx, FDBR, XRF alternate
– IRLM
– CQS
– CSL (SCI, OM, RM, TSO SPOC, other AOPs)
– IMS Control Center and IMS Connect
– IMS batch
– IMS utilities
Identify new and existing PROCLIB members and/or execution parameters that must be
created or updated:
– IMS control regions
• DFSPBxxx, DFSDCxxx, DFSVSMxx, DFSFDRxx, DFSSQxxx, DFSCGxxx
– IRLM
• Execution parameters
– Base Primitive Environment (BPE)
• BPE configuration
• BPE user exit list
– CSL address spaces (SCI, OM, RM)
• BPECFG, CSLSIxxx, CSLOIxxx, CSLRIxxx
– Common Queue Servers (CQS)
• BPECFG, CQSIPxxx, CQSSLxxx, CQSSGxxx
Identify new and existing exits that must be coded or modified:
– IMS control region exits
• IMS Initialization, Logon/Logoff, Signon/Signoff, Command Authorization, Output
Creation, Conversational Abnormal Termination, others
• Identify existing exits which use callable control block services for FIND or SCAN.
Can they handle the GLOBAL default when sysplex terminal management is
active?
– BPE user exits
• Initialization/Termination, Statistics
– SCI user exits
• Client Connection, Initialization/Termination
– OM user exits
• Client Connection, Initialization/Termination, Input, Output, Security
– RM user exits
• Client Connection, Initialization/Termination
– DBRC SCI Registration Exit (DSPSCIX0)
– IMS Connect
• User message exits (HWSxxxx)
Define CFRM policy parameters for the structures
– Define OSAM, VSAM, lock, shared queues, logger, and resource structures:
• Name, sizes (max, init, min), duplexing, autoalter, CF preference list
• Use CFSIZER to help with sizing structure
Create or update ARM policy
– Define restart groups, restart methods, restart attempts
Update the LOGR policy
– Define shared queues log streams
Review and update database procedures
– Backup and recovery procedures
• DBRing and starting databases and DEDB areas
• Making image copies (standard, concurrent, Image Copy 2, other)
• Running change accumulation (required to recover shared database)
• Using IMS Online Recovery Service
– Database monitoring and reorganization
Review and update operational procedures
– Starting and stopping IMSplex components
– Monitoring and controlling the IMSplex environment
• MTO, TSO SPOC, IMS Control Center
– Recovery from component failure
Review and update automated operations
– NetView execs
– New OM interface AO programs
– Vendor-supplied programs
Review and update end-user connectivity procedures
– Connecting cold
– Reconnecting after failure
– Disconnecting
Review, update, and/or create structure management procedures
– Structure rebuild procedures
– Structure recovery procedures
– Structure monitoring (size and performance)
The following sections address the OS/390 and IMS subsystem environment in more detail.
IMSplex environment
This section describes the IMSplex environment, including the control region, subordinate
address spaces, dependent regions, and other related address spaces.
If you are running with shared queues, and any IMS in the shared queues group has MSC
links to a remote IMS, then every IMS must be generated with at least one each of the
three MSC-related macros - MSPLINK, MSLINK, and MSNAME. This is true even if not all
IMSs have physical MSC links to remote systems. Coding the macros includes the
necessary MSC code to handle the MSC prefix found on incoming messages, and to
queue the output messages on the remote queue.
If your IMSs are clones, then this is not an issue. Each IMS can be defined with the same
physical and logical links even though only one may actually start those links.
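A minimal sketch of the three macros as they might appear in the common system definition (the link name, partner ID, node name, and SYSIDs are assumptions) is:

   LNKBOS   MSPLINK NAME=IMSBOS,TYPE=VTAM,SESSION=1,BUFSIZE=4096
            MSLINK  PARTNER=AB,MSPLINK=LNKBOS
            MSNAME  SYSID=(2,1)

Even an IMS that never starts this link needs the macros, so that the MSC prefix handling and remote queueing code described above is included.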
Network definitions
If your IMSs are clones, then you have only one issue here, and that is the names of the
master and secondary master NODEs and LTERMs. They must be unique for each IMS.
However, beginning with IMS Version 6, each IMS can be generated with the same names
and then made unique by coding overrides in DFSDCxxx.
If you decide that all your IMS systems will be clones, then you only need a single IMS system
definition. Everything that needs to be unique can be overridden in the various PROCLIB
members.
Each of the following address spaces, whether started procedures or JOBs, should be
examined for execution in an IMSplex. Each will have to be tailored, to some extent, to the
particular IMS image that they represent:
IMS control region (CTL, DBC, DCC)
Some data sets must be shared by all members of the IMSplex. Others must be unique to
each IMS. Still others may either be unique or shared. Some may be dynamically allocated
while others must be defined in the JCL. If they are dynamically allocated, they should not
be specified in the procedure JCL.
– Must be shared
These libraries must be shared by all IMSs in the IMSplex:
• Fast Path area data sets
DEDB ADSs belong to the control region, not the DLI address space. These data
sets are usually dynamically allocated from their DBRC definitions (INIT.ADS).
• OLCSTAT
This data set is used to keep track of the global online change status for each IMS
with OLC=GLOBAL coded in DFSCGxxx. Not all IMSs in the IMSplex have to have
the same OLC= value - some can be global while others are local. This data set is
dynamically allocated using the data set name found in DFSCGxxx.
– Must be unique
Each IMS must have its own copy of these libraries. Data set naming conventions
should identify the IMS to which the data set belongs:
• DFSOLPnn and DFSOLSnn (OLDS)
Every IMS must have its own set of log data sets. These can be dynamically
allocated using the corresponding DFSMDA member in STEPLIB.
• MODSTAT
This data set is used to keep track of the online change status for an IMS that is
running with local online change (OLC=LOCAL in DFSCGxxx).
• IMSMON
The IMS monitor data set must be unique for each IMS. It can be dynamically
allocated using the corresponding DFSMDA member in STEPLIB.
• MSDB data sets
These data sets, which include the MSDB checkpoint, dump, and initialization data
sets, are required only if IMS is using Fast Path main storage databases (MSDBs).
Since these databases cannot be shared, these data sets must be unique.
• DFSTRA0n
The external trace output data sets must be unique to each IMS.
– May be shared or unique
You can decide whether to share these libraries, make them unique to each IMS, or
some combination of both. For example, you will probably want two dynamic allocation
libraries: one for data sets that must be unique (OLDS, WADS, IMSMON) and another
for data sets that must be shared (RECONs, database data sets, OLCSTAT).
• SDFSRESL
This library (formerly, and still sometimes, called RESLIB) contains most of the IMS
executable code. Each IMS system has its own nucleus module which is identified
by a unique suffix. Since most of the code is loaded at initialization time,
SDFSRESL is not accessed often during normal executions. It is unlikely that
unique copies would be required for performance reasons. However, if one
SDFSRESL is shared, all IMS systems will be at the same maintenance level and
have the same defined features.
If an installation wants to run different IMS subsystems at different maintenance levels,
it must have different SDFSRESLs. This is especially useful when new maintenance
levels are introduced, because the new level can be implemented on one system at a
time rather than on all systems simultaneously. For installations with continuous
availability (24x7) requirements, this allows new levels to be introduced without
bringing down all systems at the same time.
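One way to stage maintenance, for example, is to give each IMS its own SDFSRESL
concatenated ahead of a common one, so that a new level can be promoted one system at a
time (data set names are illustrative):
//STEPLIB DD DSN=IMSPLEX1.IMSA.SDFSRESL,DISP=SHR
//        DD DSN=IMSPLEX1.IMS.SDFSRESL,DISP=SHR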
• Dynamic allocation libraries
Dynamic allocation members for an IMS subsystem reside in libraries which are
part of the STEPLIB concatenation. Some of the data sets defined in these
members will be shared. Others must be unique. Therefore, it will probably be
necessary to have two sets of dynamic allocation libraries, one common library for
shared data sets, and one which is unique to each IMS for unique data sets. The
STEPLIB would be similar to the following:
//STEPLIB DD DSN=IMSPLEX1.IMSA.DYNALLOC,DISP=SHR
// DD DSN=IMSPLEX1.IMS.DYNALLOC,DISP=SHR
//...
If multiple libraries are used, then be sure to update the DFSMDA utility JCL
("//SYSLMOD DD") to put these members in the correct data set.
• Exit libraries
Exit routines, whether written by the user or provided by IMS, must be in libraries in
the //STEPLIB concatenation for the address space that invokes them. This can be
the control region, the DLISAS region, or even a dependent region (for example, the
since it is used only to locate a library containing the Archive utility program (DFSUARC0), it
probably does not matter which SDFSRESL is identified. And, since the data sets have
DSNs which include .%SSID. qualifiers, this library could be shared by all DBRCs:
//JCLPDS DD DSN=IMSPLEX1.JCLPDS,DISP=SHR
– JCLOUT
This data set, allocated in the DBRC JCL, holds the output of a GENJCL command. If the
DD statement specifies the internal reader, the job is submitted as soon as the JCL is
generated. If it specifies a data set, then the job is placed in that data set for later
submission (for example, from a TSO ISPF session). When a data set is used, it must
be unique for each IMS subsystem (each DBRC instance).
//JCLOUT DD SYSOUT=(A,INTRDR)
- or -
//JCLOUT DD DSN=IMSPLEX1.IMSA.JCLOUT,DISP=SHR
– Automatic RECON loss notification (ARLN)
IMS Version 8 supports automatic RECON loss notification (ARLN) when running with
the Common Service Layer. To enable this function, you must identify the IMSplex
which this DBRC is to join. Note that the first time this is done, the RECON header is
updated with the IMSplex name, and the RECONs can then be used only within that
IMSplex. ARLN can be enabled by:
• Coding an IMSPLEX= parameter in the DBRC JCL
• Coding a DBRC SCI Registration Exit (DSPSCIX0) and including that exit in STEPLIB
The exit is recommended since it can be used for all DBRC instances (online, batch,
and utility) without having to change any JCL.
Dependent regions
The region types include MPP, IFP, BMP, JMP, and JBP regions. There are two planning
considerations for dependent regions.
– Starting dependent regions
They are typically started as JOBs using the IMS command:
/START REGION <member-name>
where <member-name> is a member of a library defined on the //IEFRDER DD
statement of the IMSRDR procedure:
//IEFRDER DD DSN=IMSPLEX1.JOBS,DISP=SHR
This library would contain JCL for each dependent region that you want to start,
although a single member can contain the JCL for multiple dependent region jobs to be
started with a single command. With a single IMS environment, the IMSID can be
coded in the JCL. With multiple IMSs, you have choices.
• Use a different IMSRDR procedure for each IMS. The default procedure name is
IMSRDR, but it can be changed for each IMS by coding the PRDR parameter in
DFSPBxxx (or on the IMSCTF macro). Each procedure should define a different JOBS
library where the dependent region JCL can be found:
//IEFRDER DD DSN=IMSPLEX1.IMSA.JOBS
• Use the same IMSRDR procedure and JOBS library and start the regions using the
LOCAL keyword on the command:
/START REGION IMSMSG01 LOCAL
However, since these JOBs have job names, and job names must be unique within an
OS/390 environment, using the same JOBS library when multiple IMSs are executing
requires that the members started by the different IMSs use different job names.
PROCLIB members
There are several new PROCLIB members when running in an IMSplex and several new or
changed parameters required in existing members. Each IMS must be started with an
RGSUF=xxx parameter either in the IMS procedure JCL or on the S IMS command. RGSUF
identifies the DFSPBxxx member where other parameters can be found, and where other
PROCLIB members are identified. Because of this, there can be a single IMS PROCLIB for
the IMSplex containing multiple members tailored to each IMS.
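For example, two cloned control regions could be started from the same PROCLIB with
commands similar to the following (procedure names and suffixes are illustrative):
S IMSA,RGSUF=01A
S IMSB,RGSUF=01B
where members DFSPB01A and DFSPB01B are tailored to IMSA and IMSB respectively.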
The DFSPBxxx member defines all other PROCLIB members for IMS. It is identified by the
RGSUF=xxx execution parameter on the IMS control region JCL or start command. It is the
only execution parameter that is required on the JCL or start command. All others can be
specified or overridden in this member, or other members pointed to by this member.
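As a simple illustration, a DFSPB01A member tailored to one IMS might include entries such
as the following (the values are examples only; PRDR= names the dependent region reader
procedure discussed under dependent regions above):
PRDR=IMSARDR,
ARMRST=Y,
AUTO=Y
The remaining IMSplex-related parameters are typically placed in the other PROCLIB
members that DFSPBxxx identifies.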
Each of these PROCLIB members is either unique to an IMSplex environment, has
parameters that may differ in an IMSplex, or has parameters that apply only to an IMSplex
environment. They are identified and described in detail in IMS in the Parallel Sysplex,
Volume III: IMSplex Implementation and Operations, SG24-6929.
Exits
Special attention should be given to exits that depend either on the particular IMS on which
they run or on the IMSplex environment. Most
(but not all) of these are related to the communications environment. For example, the Signon
Exit (DFSSGNX0) in an ETO environment with shared queues may need to verify that
generated LTERM names are unique within the IMSplex. Some of the exits to pay particular
attention to are:
Initialization Exit (DFSINTX0)
Logon Exit (DFSLGNX0)
Logoff Exit (DFSLGFX0)
Signon Exit (DFSSGNX0)
Signoff Exit (DFSSGFX0)
Conversational Abnormal Termination Exit (DFSCONE0)
Output Creation Exit (DFSINSX0)
DBRC SCI Registration Exit (DSPSCIX0)
Build Security Environment Exit (DFSBSEX0)
Fast Path Input Edit/Routing Exit (DBFHAGU0)
TM and MSC Message Routing and Control User Exit (DFSMSCE0)
Queue Space Notification Exit (DFSQSPC0)
BPE user exits (the exits that you define in the BPE user exit list PROCLIB member)
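The BPE user exits, for example, are not part of the IMS system definition at all; they are
named in a BPE user exit list PROCLIB member with EXITDEF statements of this general
form (the exit name here is hypothetical; see IMS Version 8: Base Primitive Environment
Guide and Reference, SC27-1290, for the exact syntax):
EXITDEF(TYPE(INITTERM),EXITS(MYINIT0),COMP(CQS))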
Related publications
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this redbook.
IBM Redbooks
For information on ordering these publications, see “How to get IBM Redbooks” on page 190:
IMS Primer, SG24-5352
IMS/ESA V6 Parallel Sysplex Migration Planning Guide for IMS TM and DBCTL,
SG24-5461
IMS Version 7 High Availability Large Database Guide, SG24-5751
IMS Version 7 Release Guide, SG24-5753
A DBA’s View of IMS Online Recovery Service, SG24-6112
IMS Version 7 Performance Monitoring and Tuning Update, SG24-6404
IMS e-business Connectors: A Guide to IMS Connectivity, SG24-6514
IMS Version 7 Java Update, SG24-6536
Ensuring IMS Data Integrity Using IMS Tools, SG24-6533
IMS Installation and Maintenance Processes, SG24-6574
IMS Version 8 Implementation Guide - A Technical Introduction of the New Features,
SG24-6594
IMS DataPropagator Implementation Guide, SG24-6838
Using IMS Data Management Tools for Fast Path Databases, SG24-6866
Everything You Wanted to Know about the System Logger, SG24-6898
IMS in the Parallel Sysplex, Volume I: Reviewing the IMSplex Technology, SG24-6908
IMS in the Parallel Sysplex, Volume III: IMSplex Implementation and Operations, SG24-6929
The Complete IMS HALDB Guide, All You Need to Know to Manage HALDBs, SG24-6945
Other resources
These publications are also relevant as further information sources:
IMS Version 8: Base Primitive Environment Guide and Reference, SC27-1290
IMS Version 8: Command Reference, SC27-1291
IMS Version 8: Release Planning Guide, GC27-1305
Parallel Sysplex Overview: Introducing Data Sharing and Parallelism in a Sysplex,
SA22-7661
z/OS MVS Programming: Sysplex Services Guide, SA22-7617
z/OS MVS Programming: Sysplex Services Reference, SA22-7618
z/OS MVS Setting Up a Sysplex, SA22-7625
OS/390 V1R3.0 OS/390 System Commands, GC28-1781
DFSMS/MVS V1R5 DFSMSdss Storage Administration Guide, SC26-4930
DFSMS/MVS V1R5 DFSMSdss Storage Administration Reference, SC26-4929
IMS Connect Guide and Reference, SC27-0946
You can also download additional materials (code samples or diskette/CD-ROM images) from
the IBM Redbooks Web site.