ISPE GAMP® RDI Good Practice Guide: Data Integrity by Design
Records and Data Integrity
ISPE, the Developers of GAMP®
Disclaimer:
The ISPE GAMP® RDI Good Practice Guide: Data Integrity by Design provides practical guidance to help
pharmaceutical organizations achieve good data governance together with the efficient and effective implementation
and operation of compliant GxP computerized systems. This Guide is solely created and owned by ISPE. It is not
a regulation, standard or regulatory guideline document. ISPE cannot ensure and does not warrant that a system
managed in accordance with this Guide will be acceptable to regulatory authorities. Further, this Guide does not
replace the need for hiring professional engineers or technicians.
Limitation of Liability
In no event shall ISPE or any of its affiliates, or the officers, directors, employees, members, or agents of each
of them, or the authors, be liable for any damages of any kind, including without limitation any special, incidental,
indirect, or consequential damages, whether or not advised of the possibility of such damages, and on any theory of
liability whatsoever, arising out of or in connection with the use of this information.
All rights reserved. No part of this document may be reproduced or copied in any form or by any means – graphic,
electronic, or mechanical, including photocopying, taping, or information storage and retrieval systems – without
written permission of ISPE.
ISBN 978-1-946964-34-2
Preface
Data Integrity by Design is the concept that data integrity must be incorporated from the initial planning of a business
process through to the implementation, operation, and retirement of computerized systems supporting that business
process. It promotes the application of critical thinking to identify how data flows through the business process, and
to proactively assess and mitigate risks across both the system and data lifecycles. It emphasizes data integrity as
foundational to protecting patient safety and product quality.
This ISPE GAMP® RDI Good Practice Guide: Data Integrity by Design supports organizations as they embrace
and implement a holistic approach by leveraging data governance and knowledge management activities to drive
continual improvement in data integrity. The Guide fosters a patient-centric mindset, focusing resources and
management attention on quality best practices that inherently facilitate meeting regulatory compliance requirements.
This Guide provides a bridge between the system lifecycle approach defined in ISPE GAMP® 5: A Risk-Based
Approach to Compliant GxP Computerized Systems, and the data lifecycle approach in the ISPE GAMP® Guide:
Records and Data Integrity. Data integrity can only be achieved when both lifecycle approaches are adopted,
understood, and actively managed.
Part of the ISPE GAMP® Guide: Records and Data Integrity series, this Guide replaces and significantly expands
upon the ISPE GAMP® Good Practice Guide: Electronic Data Archiving from 2007.
Acknowledgements
The Guide was produced by a Task Team led by James Henderson (Eli Lilly and Company, USA), Lorrie Vuolo-
Schuessler (Syneos Health, USA) and Charlie Wakeham (Waters Corporation, Australia). The work was supported by
the ISPE GAMP Community of Practice (CoP) and sponsored by Michael Rutherford (Syneos Health, USA).
Core Team
The following individuals took lead roles in the preparation of this Guide:
The Leads wish to thank the following individuals for their valuable contribution during the preparation of this Guide.
The Team Leads wish to thank the following members of the CSA Industry Pilot Team for their exceptional efforts in
producing the CSA Appendix.
The Team Leads wish to thank the following members for their hard work on the Special Interest Appendix on Artificial
Intelligence: Machine Learning.
The Team Leads wish to thank the following individuals for their expert contribution to the document.
Particular thanks go to the following for their review and comments on this Guide:
Special Thanks
The Leads would like to give particular thanks to Chris Clark (TenTenTen Consulting Limited, United Kingdom), and
ISPE Technical Advisor, Sion Wyn (Conformity Ltd., United Kingdom) for their efforts during the creation process
of this Guide. The Team would also like to thank ISPE for technical writing and editing support by Jeanne Perez
(ISPE Guidance Documents Technical Writer/Editor) and production support by Lynda Goldbach (ISPE Guidance
Documents Manager).
The Team Leads would like to express their grateful thanks to the many individuals and companies from around the
world who reviewed and provided comments during the preparation of this Guide; although they are too numerous to
list here, their input is greatly appreciated.
Table of Contents
1 Introduction ..................................................................................................................... 9
1.1 Background .................................................................................................................................................. 9
1.2 Case for Quality Program ............................................................................................................................ 9
1.3 Purpose ...................................................................................................................................................... 10
1.4 Scope ......................................................................................................................................................... 10
1.5 Structure of the Guide ................................................................................................................................ 11
1.6 Key Terms .................................................................................................................................................. 11
1.7 Key Roles and Responsibilities .................................................................................................................. 14
2.1 Governance to Achieve Data Integrity by Design ...................................................................................... 15
2.2 Data Ownership ......................................................................................................................................... 19
3 Retention Strategy ........................................................................................................ 23
3.1 Retention Periods ...................................................................................................................................... 24
3.2 Readability ................................................................................................................................................. 25
3.3 Availability .................................................................................................................................................. 28
3.4 Access ....................................................................................................................................................... 30
3.5 Protecting Records and Data ..................................................................................................................... 31
3.6 Managing System Retirement .................................................................................................................... 36
3.7 Records Management and Retention through Mergers, Acquisitions, and Divestments ........................... 41
4 Implementing Data Integrity by Design .................................................................... 43
4.1 A Process to Achieve Data Integrity by Design .......................................................................................... 43
4.2 Business Process ...................................................................................................................................... 47
4.3 Data Flow Diagrams .................................................................................................................................. 48
4.4 Data Classification and the Intended Use of the Data ............................................................................... 49
4.5 Business Process Risk Assessment .......................................................................................................... 50
4.6 Data Lifecycle ............................................................................................................................................ 51
4.7 Data Nomenclature .................................................................................................................................... 55
5 System Planning ............................................................................................................ 59
5.1 Planning Computerized Systems to Efficiently Support the Optimized Business Process ........................ 59
5.2 Addressing Individual Systems .................................................................................................................. 61
5.3 System Risk Assessment ........................................................................................................................... 63
6 Active Records ............................................................................................................... 67
6.1 Creation ..................................................................................................................................................... 67
6.2 Processing ................................................................................................................................................. 69
6.3 Review, Reporting, and Use ...................................................................................................................... 75
7 Semi-active and Inactive Records ............................................................................... 81
7.1 Semi-active Records .................................................................................................................................. 81
7.2 Retention of Inactive Records .................................................................................................................... 81
7.3 Return to Active State (Retrieval) ............................................................................................................... 86
7.4 Destruction ................................................................................................................................................. 87
Management Appendices
8.1 Introduction ................................................................................................................................................ 89
8.2 Key Concepts ............................................................................................................................................. 89
8.3 Managing Knowledge ................................................................................................................................ 92
8.4 Mindsets and Behaviors ............................................................................................................................. 94
8.5 Conclusion ................................................................................................................................................. 95
9 Appendix M2 – Understanding Data Integrity Compared to Data Quality .......... 97
10 Appendix M3 – Third-Party Data ................................................................................ 99
10.1 Introduction ................................................................................................................................................ 99
10.2 Assessments and Responsibilities ............................................................................................................. 99
10.3 Data Governance for CxO ......................................................................................................................... 99
10.4 Data Storage Off-Premise ........................................................................................................................ 100
10.5 Conclusion ............................................................................................................................................... 101
Development Appendices
12 Appendix D2 – Instrument Devices with Electronic Record Storage ...................105
12.1 Introduction .............................................................................................................................................. 105
12.2 Background .............................................................................................................................................. 105
12.3 Instrument Use ......................................................................................................................................... 106
12.4 Accounting for Original Data .................................................................................................................... 106
12.5 Challenges ............................................................................................................................................... 107
12.6 Remediation Strategies ............................................................................................................................ 107
12.7 Risk Register ............................................................................................................................................ 109
12.8 Example Data Integrity Risks, Interim Controls, and Actions to Consider ............................................... 109
12.9 Conclusion ............................................................................................................................................... 110
Operation Appendices
16 Appendix O4 – Example Retention Periods and Requirements ...........................125
17 Appendix O5 – Maintaining Legacy Software .........................................................139
17.1 Introduction .............................................................................................................................................. 139
17.2 Non-Disposal of Retired Systems ............................................................................................................ 139
17.3 Compatibility to Modern Operating Systems ........................................................................................... 139
17.4 Virtual Machine Solution .......................................................................................................................... 139
17.5 The Hardware Museum ........................................................................................................................... 140
17.6 Conclusion ............................................................................................................................................... 140
Special Interest Topics Appendices
18.1 Introduction .............................................................................................................................................. 141
18.2 Background .............................................................................................................................................. 141
18.3 Scope ....................................................................................................................................................... 141
18.4 Data Lifecycle (Iterative, Autonomous, and Adaptive) ............................................................................. 142
18.5 Concept Phase (Understanding the Business Case) ............................................................................... 142
18.6 Project Phase (Data Modeling and Evaluation) ....................................................................................... 144
18.7 Operation Phase (Deployment and Monitoring) ....................................................................................... 145
18.8 Further Reading ....................................................................................................................................... 146
19 Appendix S2 – Computer Software Assurance ........................................................147
19.1 Introduction .............................................................................................................................................. 147
19.2 Establishing a Lifecycle-based Approach and Basic Assurance .............................................................. 152
19.3 Risk-based Assurance ............................................................................................................................. 153
19.4 Example: Applying Risk-based Approach from ISPE GAMP® 5 and CSA ............................................... 157
19.5 Example: Applying ISPE GAMP® 5 and CSA Using Direct Leveraging of Testing Throughout the System Lifecycle ............ 160
19.6 Conclusion ............................................................................................................................................... 162
General Appendices
1 Introduction
1.1 Background
To date, many regulated life science organizations have been addressing data integrity on a system-by-system
basis. The initial focus has often been on achieving good data integrity outcomes by the remediation of technical
gaps in existing systems, and by developing good practice cultures and behaviors to ensure that compliance is
being achieved “in real time.” This focus may have resulted in missed opportunities to work with suppliers on the
development of enhanced functionalities supporting data integrity, for both existing and new systems.
The requirement to achieve data integrity is not new; data integrity expectations and holistic approaches have been
implied within regulations globally for some time, for example:
“(i) Product realisation is achieved by designing, planning, implementing, maintaining and continuously improving
a system that allows the consistent delivery of products with appropriate quality attributes”
“(xi) Continual improvement is facilitated through the implementation of quality improvements appropriate to the
current level of process and product knowledge.”
“Laboratory controls shall include the establishment of scientifically sound and appropriate specifications,
standards, sampling plans, and test procedures designed to assure that components, drug product containers,
closures, in-process materials, labeling, and drug products conform to appropriate standards of identity, strength,
quality, and purity.”
“Laboratory records shall include complete data derived from all tests necessary to assure compliance with
established specifications and standards, including examinations and assays”
Achieving data integrity requires a holistic approach across the entire organization, leveraging critical thinking, and
implementing data governance across all regulated business processes.
1.2 Case for Quality Program
The US FDA Center for Devices and Radiological Health (CDRH) Case for Quality program [4] promotes a risk-
based, product quality-focused, and patient-centric approach. This initiative has been endorsed by the ISPE GAMP®
CoP (Community of Practice) Leadership [5] who believe that:
“such an approach is appropriate throughout the regulated life science industries, including pharmaceutical,
biological, and medical devices, and throughout the complete product life cycle, regardless of the specific
applicable predicate regulation. GAMP® strongly supports the adoption of new and innovative computerized
technologies and approaches throughout the product life cycle to support product quality, patient safety, and
public health.”
Computer Software Assurance was born from CDRH’s Case for Quality [4], which enhances and incentivizes the
adoption of practices and behaviors to improve medical safety, responsiveness, and how patients experience
products. As part of that effort, Computerized System Validation (CSV) was identified as a barrier to new technologies
due to the lack of clarity on risk-based effort, compliance-focused approaches, and perceived regulatory burden.
In actuality, the FDA supports and encourages the use of automation, information technology, and data solutions
throughout the product lifecycle in the design, manufacturing, service, and support of life sciences [6]. An industry team
was formed and work began on the development of an FDA draft guidance on Computer Software Assurance (CSA).
Appendix S2 contains a detailed discussion of CSA, and a case study based around a data integrity technical control
demonstrating the application of CSA in a real-life situation.
1.3 Purpose
This ISPE GAMP® RDI Good Practice Guide: Data Integrity by Design builds on the guidance contained in the various
ISPE GAMP® Good Practice Guide Series [7], including the ISPE GAMP® Guide: Records and Data Integrity [8], and
ISPE GAMP® 5: A Risk-Based Approach to Compliant GxP Computerized Systems [9].
The focus of this new Good Practice Guide is on unifying system and data lifecycles, from data creation to
destruction. The purpose of this new Guide is to provide a practical “bridge” between achieving good data
governance for regulated business processes and the efficient and effective implementation and operation of
compliant GxP computerized systems.
This Guide emphasizes the interrelationships between these operational imperatives and presents a common
approach, providing a harmonized framework that will help organizations achieve/enhance product quality and patient
safety.
1.4 Scope
This Guide is intended to encourage organizations to adopt a risk-based approach to ensuring data integrity in
support of patient safety and product quality.
The scope of this Guide is as broad as the scope of ISPE GAMP® 5 [9] and includes all computerized systems used
in support of GxP regulated business processes within the life science industries. While computerized systems
feature strongly within this Guide, as these systems provide technical controls for data integrity, data governance
also requires procedural and behavioral controls supported by a quality culture that leverages critical thinking and a
continual improvement mindset.
1.5 Structure of the Guide
This Guide is divided into seven chapters and related sub-sections, plus a set of appendices, as shown in Figure 1.1.
1.6 Key Terms
1.6.1 ALCOA+
ALCOA+ is the acronym for the key concepts that can help to support record and data integrity [8]. See Table 1.1.
Original • Original data is the first recording of data, or a “true copy” which preserves content or meaning
Complete • All data, and relevant metadata, including any repeat or re-analysis performed
Available • Available and accessible for review, audit, or inspection throughout the retention period
In this document the term audit trail refers to a data audit trail of operator entries and actions that create, modify,
or delete regulated records, as required by 21 CFR Part 11 [10] and EU Annex 11 [11], as distinguished from other
system and technical logs.
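To illustrate the distinction, a data audit trail entry captures who did what to which regulated record, when, and (where required) why. The sketch below is a minimal illustration only; the field names and structure are assumptions chosen for this example and are not prescribed by 21 CFR Part 11 [10] or EU Annex 11 [11].

```python
# Minimal illustrative sketch of a data audit trail entry.
# Field names are assumptions for illustration only; they are not mandated
# by 21 CFR Part 11 or EU Annex 11.
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass(frozen=True)
class AuditTrailEntry:
    timestamp_utc: datetime    # when the action occurred (secure, system-generated)
    user_id: str               # who performed the action (attributable)
    action: str                # "create", "modify", or "delete"
    record_id: str             # which regulated record was affected
    field: Optional[str]       # which field changed, if applicable
    old_value: Optional[str]   # previous value (None for creation)
    new_value: Optional[str]   # new value (None for deletion)
    reason: Optional[str]      # reason for change, where required by procedure

# Example: capturing a modification to a result entry (values are hypothetical)
entry = AuditTrailEntry(
    timestamp_utc=datetime.now(timezone.utc),
    user_id="jsmith",
    action="modify",
    record_id="BATCH-2020-0042/assay-01",
    field="result",
    old_value="98.2",
    new_value="99.1",
    reason="Transcription error corrected per SOP",
)
print(entry)
```

In practice such entries are generated automatically by the computerized system, with a secure, system-generated timestamp that cannot be altered by the user performing the action.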
1.6.3 GxP
The term “GxP” is used within this Guide to represent the encompassing regulations (Good Practices) to which
different aspects of regulated companies must adhere. It is not intended to imply that all regulatory requirements
are the same across Good Manufacturing Practice (GMP), Good Clinical Practice (GCP), Good Laboratory Practice
(GLP), Good Distribution Practice (GDP), and Good Pharmacovigilance Practice (GVP, also known as GPvP), etc.
Data integrity and data quality are sometimes used interchangeably, but the two terms have distinct meanings.
Awareness of the similarities and differences between data integrity and data quality is less widespread, and therefore
a number of examples are provided in Appendix M2 Table 9.1 to aid in this clarification.
The need for data integrity is well established through predicate rules, through data integrity guidance, and through
industry acceptance of data integrity as an essential component to protecting patient safety and ensuring product quality.
Data integrity is the assurance that the data is original and trustworthy, and that this assurance has been maintained
throughout the data lifecycle. The requirements for data integrity are discussed comprehensively in the ISPE GAMP®
Guide: Records and Data Integrity [8].
The ISPE GAMP® Guide: Records and Data Integrity [8] states that:
“Data quality relates to the data’s fitness to serve its intended purpose in a given context within a specified
business or regulatory process. Data quality management activities address aspects including accuracy,
completeness, relevance, consistency, reliability, and accessibility.”
“The assurance that data produced is exactly what was intended to be produced and fit for its intended purpose.
This incorporates ALCOA.”
The Organisation for Economic Co-operation and Development (OECD) [13] defines data quality as:
“Data quality is the assurance that the data produced are generated according to applicable standards and fit for
intended purpose in regard to the meaning of the data and the context that supports it. Data quality affects the
value and overall acceptability of the data in regard to decision-making or onward use.”
Data quality requires that the data is organized and able to be accessed, sorted, and searched to enable the business
to effectively use the data, and is reflected in the list below:
• Data Accuracy: The extent to which the data is free of identifiable errors
• Data Accessibility: The level of ease and efficiency at which data is legally obtainable, within a well-protected
and controlled environment
• Data Comprehensiveness: The extent to which all required data within the entire scope are collected,
documenting intended exclusions
• Data Consistency: The extent to which the data is reliable, identical, and reproducible by different users across
applications
• Data Currency: The extent to which data is up-to-date; a datum value is up-to-date if it is current for a specific
point in time, and it is outdated if it was current at a preceding time but incorrect at a later time
• Data Nomenclature: A consistent approach to metadata entry that facilitates identification of data relating to the
same product or process for use in trending and/or data analytics, for example, a data lake. Discussed more fully
in Section 4.7
• Data Granularity: The level of detail at which the attributes and characteristics of data quality are defined
• Data Precision: The degree to which measures support their purpose, and/or the closeness of two or more
measures to each other
• Data Relevancy: The extent to which data is useful for the purposes for which it was collected
• Data Timeliness: The availability of up-to-date data within the useful, operative, or indicated time
Note: Items in bold underline are reasonably addressed in ALCOA+ as part of data integrity attributes.
Data quality is defined by the Data Management Body of Knowledge (DMBOK) [14] as:
“…the planning, implementation, and control of activities that apply quality management techniques to data, in
order to assure it is fit for consumption and meet the needs of data consumers.”
See Figure 8.2 in Appendix M1 for the concept of data producers and consumers within knowledge and data
management.
In this Guide, data is classified as regulated, operational, or unnecessary, as originally defined in ISPE GAMP® RDI
Good Practice Guide: Data Integrity – Manufacturing Records [15]:
• “Regulated: used for a regulated decision or to support a regulated process, that is, data as required by, or in
support of, the predicate rules – what we have to keep
• Operational: non-regulated data used for business process decisions such as performance analysis and
management of maintenance schedules – what we want to keep
• Unnecessary: data not needed due to either the circumstances of its creation (e.g., during a non-regulated
activity or process) or because that data does not provide additional context, metadata, or meaning for the
activity or process – what we do not need”
Note that some data may need to be retained for financial, health and safety, or other non-life science regulations.
1.7 Key Roles and Responsibilities
The key roles recommended for data integrity by design are those identified in the following ISPE GAMP® publications:
• ISPE GAMP® 5: A Risk-Based Approach to Compliant GxP Computerized Systems (GAMP 5) [9]
• ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts (DI-KC) [16]
The definitions of these roles and their assigned responsibilities are not repeated here; however, the roles, and where
they are described, are listed below:
• Supplier (GAMP 5)
See Figure 2.3 for a schematic representation of the relationship between various owner roles, data lifecycle, and
record phase.
2.1 Governance to Achieve Data Integrity by Design
Data integrity by design can be achieved through a governance framework combined with Quality Risk Management
(QRM) and Knowledge Management (KM).
The data governance framework provides the controls for data integrity and quality assurance. QRM is collectively
applied to product and process understanding to achieve patient safety, product quality, and data integrity, resulting in
high-quality foundational data. This is an input to KM to drive continual improvement of organizational and business
processes. Throughout all activities, critical thinking must be leveraged within a quality culture to strive for operational
excellence. This all-encompassing approach is shown in Figure 2.1. KM is explained in detail in Appendix M1.
Governance to achieve data integrity by design requires the following elements to be established:
• A long-term vision
Figure 2.2 expands on the data governance framework with tools to aid in the understanding of the process and to
optimize system planning to achieve data integrity by design. These tools are discussed in detail in Chapter 4.
Some components of this governance approach may already be in place within the regulated organization; however,
these should be reviewed and expanded as necessary to achieve a fully comprehensive governance framework.
Effective data governance is the ultimate aim of any data integrity assurance program.
In order to achieve an effective data integrity by design strategy, a high-level vision for record and data management
(“what this might look like”) needs to be established at an organizational level.
The high-level vision needs to leverage critical thinking across all data governance activities, including developing the
data flow through the business process and identifying critical-to-quality data and its owner(s). It also needs to apply a
risk-based approach to identifying and mitigating the risks and vulnerabilities of manual intervention, which otherwise
could result in inadvertent or unauthorized data manipulation.
Without an organization-wide vision for data integrity by design, there is the risk that important decisions relating to
the introduction or modification of processes and systems may be taken without proper consideration of the data
integrity implications. This may result in the organization adopting suboptimal solutions that prevent or inhibit the
ability to achieve and demonstrate good data integrity practices.
For example, a strong company vision with respect to data integrity could be that:

As part of our commitment to support public health, patient safety, and product quality, our five-year vision
is that all regulated records will be created and maintained in electronic format according to good practice
ALCOA+ principles throughout their lifecycle (creation to destruction).
This vision should form a strategic approach to defining an appropriate data architecture and corresponding data
lifecycle that are considered when acquiring new systems and remediating existing record and data lifecycle
processes. This strategic approach sets the expectation of eventually eliminating all paper and hybrid records within
the organization, which are difficult to maintain in a compliant state throughout their retention period.
The vision can be translated into annual goals and continual improvement objectives that funnel down to individual
contributors (who are the data generators) to provide assurance that integrity and quality are built into the process.
Governance can take place at many levels within an organization, for example, global, regional, group, national, site
wide, departmental, process, project.
From a governance point of view, “data integrity by design” can be restated as “data integrity by intent.” There needs
to be a clear mandate at every level of governance that achieving data integrity through intentional design is a
business-critical requirement, one that will be supported, endorsed, and resourced. This should be a commitment not
just for the quality unit, but for all functions involved in the design, implementation, validation, and operation of the
business process.
Governance to achieve data integrity and governance for the validation of computerized systems within a regulated
organization have been discussed in the ISPE GAMP® Guide: Records and Data Integrity [8] and ISPE GAMP® 5 [9]
respectively.
Validation of computerized systems for intended use is necessary to achieve data integrity, while data integrity
requirements must be considered as part of each validation effort in order to ensure that ALCOA+ imperatives are
delivered.
When determining and establishing governance approaches to achieve data integrity by design, there is an integral
relationship between data integrity and computerized system validation that must be recognized in the corporate
Quality Management System (QMS). The data needs will in turn create the risk scenarios that ultimately drive the
configuration and validation activities for the computerized system. Data, and the regulated use of the data, are the
drivers for many project activities within the system lifecycle. The system lifecycle ensures that ongoing controls
are established and verified to make certain the data is appropriately and effectively managed throughout the data
lifecycle.
Table 2.1 shows a list of practices that are shared by good governance frameworks for both data integrity and
computerized system validation.
A company-wide data integrity by design strategy should be established as early as possible in the development of
any significant data integrity initiative. It may also be appropriate to determine a data integrity by design strategy at a
lower level, such as for a specific business process or project.
Effective data integrity by design should follow a risk-based approach that encompasses the full business process
lifecycle. This lifecycle addresses data (creation, processing, reporting and use, retention and destruction) and, by
implication, the lifecycle phases of all associated computerized systems (concept, project, operation, retirement).
For a regulated organization, the scope of governance activities to achieve data integrity by design should cover
GxP processes, data lifecycles, and computerized systems. The same governance elements are required whether
the component(s) of the computerized system are on-premise or leverage a service-based architecture (e.g.,
Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS)).
This governance scope is shown schematically in Figure 2.3. This diagram is an abstract representation of the need
for, and relationship between, the data lifecycle and system lifecycle for the computerized systems involved in the
creation, processing, review, reporting and use, retention and destruction of the data within a business process. The
data lifecycle should be considered both within the context of the system lifecycle and separate to it. The lifecycles
run on independent timelines in that the data may outlive the system and need alternative arrangements for retention
after system retirement, or conversely, the system may continue in operation long after the retention period has
expired for a specific data set.
Within the operational phase of the system lifecycle, change control and configuration management are essential to
maintain the validated state of the system.
The roles and responsibilities depicted at the top and bottom of the diagram in Figure 2.3 need to be defined and
assigned to the appropriate individuals, and may only be meaningful through specific phases of the two lifecycles.
Of these roles, data ownership is a critical concept to ensure data integrity throughout the data lifecycle. Regardless
of where the data resides (on- or off-premise) or where it is generated (Contract Manufacturing Organization (CMO),
Contract Research Organization (CRO), etc.), data ownership remains with the regulated company.
Figure 2.3 shows that processes to manage records and data throughout the data lifecycle must be considered in
the initial data integrity by design phase. Data lifecycles can transcend multiple systems, with data flow between the
systems. Overall, a data flow diagram of the complete business process is an essential requirement for data integrity
by design.
To consistently achieve an effective design that delivers compliance throughout the data lifecycle, the retention
requirements must be understood when defining the business process specification.
In many circumstances it may be ineffective to consider only the lifecycle of the data within a single system because
data may be transferred across multiple systems (computerized or paper-based) as part of the data flows required
to construct the regulated record. Where copies of the data exist in more than one system (e.g., source and target
systems), it is important to understand:
• Which record will be used to make any decision under the predicate rules
It is also possible that a single system supports multiple business processes, for example, a document management
system. Consequently, the retention and other data integrity aspects from several processes may need to be
considered in the system design and implementation. A holistic approach to data integrity by design is therefore
required over the full data lifecycle(s) for all impacted regulated data.
Consider a business process that is not computerized. The (business) process owner is accountable for all aspects of
the process.
Once the business process is supported by a computerized system, accountabilities are divided into two separate
aspects: process ownership and system ownership (responsible for the computerized system, as discussed below).
Those activities associated with the performance of the business process remain with the process owner; the
separate role of system owner has accountability for those activities related to the technical support and maintenance
of the computerized system.
The process owner has accountability for the quality and integrity of the entire business process including the process
data throughout its data lifecycle, but may delegate data responsibility to data owner(s) at specified phases of the
data lifecycle.
As a further level of complexity, the business process itself may be divided into subprocesses. The process owner
may delegate some or all of their responsibilities to sub-process owners, which may also include delegation of system
and data ownership. A hierarchy of process, system, and data owners is thus established, as depicted in Figure 2.4.
The subprocesses can each contain one or more computerized systems.
Figure 2.4: Simple Example Showing Business Process Divided into Two Subprocesses
Data ownership and system ownership create a matrix of responsibilities, as delegated data ownership may be
aligned with, for example, specific clinical trials, sites, or products, while the data may be processed in multiple
systems.
Data should never be without a designated owner at any point in the data lifecycle; effective handover of data
ownership responsibilities between individuals transitioning to/from an organization role (with designated data
ownership responsibility) is critical to successful maintenance of data integrity and data quality.
When considering the computerization of a complex process, the overall process owner should understand the
intended scope of the computerization and ensure the involvement of all subsidiary process, system, and data
owners.
It is important for an organization to establish and maintain a culture that supports data integrity and to have clear
policies for data governance. Data governance should address data ownership throughout the data lifecycle to make
certain data integrity risks are addressed throughout the data lifecycle. Electronic systems have become integral to
most business processes within the life science industry, and with this evolution comes the need to understand what
data is captured and who owns it. The concept of data ownership is not new and applies equally to paper data and
electronic data. The ease of movement of electronic data makes data ownership more complex.
Data ownership is often defined at a company level, but is then refined to the system level (documented in the
roles and responsibilities section of the system Standard Operating Procedures (SOPs)) or even to a specific
project (documented in the quality and project plan) or clinical trial (documented in the trial master file or other trial
configuration documentation). The data owner must understand the business value of the data and often has a
position of responsibility within the business unit generating the data. Individual organizations should allocate roles
and responsibilities based on organizational structure and the specific system involved. The organization needs to
establish policies and procedures to support the role of data owner and ensure it is appropriately resourced for each
business process or system.
Data ownership can be a subset responsibility of the process owner (see ISPE GAMP® RDI Good Practice Guide:
Data Integrity – Key Concepts [16]). As defined within ISPE GAMP® 5 [9], the process owner is:
“the person ultimately responsible for the business process or processes being managed. This person is usually
the head of the functional unit or department using that system, although the role should be based on specific
knowledge of the process rather than position in the organization. The process owner is responsible for ensuring
that the computerized system and its operation is in compliance and fit for intended use in accordance with
applicable Standard Operating Procedures (SOPs) throughout its useful life….Ownership of the data held on a
system should be defined and typically belongs to the process owner.”
The data owner is responsible for the integrity of the data, and for defining the quality requirements for the data and
how the data is to be used. Because electronic data is easily shared, there is a need to make sure that the data
owner maintains awareness of the use of the data. In the case of clinical trial data and medical records, the data
owner must protect the interests of the patient and their confidential data. Non-GxP regulations such as GDPR [17]
and HIPAA [18] may also impact patient data.
The data owner must be knowledgeable about how the data is used throughout the data lifecycle (data that may span
the organization), and how and where it is retained in a secure archive for the required period. Data ownership is
typically associated with a particular job title or function in the organization, and each new person to that job inherits
the data ownership. Where an internal reorganization impacts job roles and responsibilities, special attention should
be paid to ensure that data ownership is not lost or overlooked.
Where data crosses organizational boundaries within the process, data ownership may be transferred to a new
data owner, although the original data owner may always remain accountable for the original data generated within
their scope. For example, analytical data owned by the Quality Control (QC) laboratory may be incorporated into a
manufacturing batch record, which has its own data owner. In the event of an investigation or audit, any queries about
the analytical data within the batch record require assistance from the QC laboratory data owner.
Uncertainty in data ownership can be introduced during mergers and acquisitions and should be resolved as part of
the integration process; see Section 3.7.
It is important to separate the data owner from the system owner, who is responsible for system maintenance roles
(for example, IT groups responsible for the operating system, application, and database), to prevent a conflict of
interest. With current working practices, interest in the data often spans different departments (e.g., IT security is
concerned about safeguarding the data, and the legal department may be concerned with data privacy). Data owners
have a direct interest in the data and therefore should not have system administration and/or privileged accounts.
Many organizations create a data steward role for the technical people with day-to-day responsibility for the data. Data
stewardship activities may be embedded in the responsibilities of other roles, rather than being a new and specific
individual role. The Data Steward role is defined in the ISPE GAMP® Guide: Records and Data Integrity [8] as:
“A person with specific tactical coordination and implementation responsibilities for data integrity, responsible for
carrying out data usage, management, and security policies as determined by wider data governance initiatives,
such as acting as a liaison between the IT department and the business. They are typically members of the
operational unit or department creating, maintaining, or using the data, for example personnel on the shop floor
or in the laboratories who actually generate, manage, and handle the data.”
In IT Service Management (ITSM), a platform change (e.g., database migration or cloud hosting provider migration)
may often be considered an infrastructure change, which involves the system owner but not necessarily the business
process owner. At a minimum, the data owner (as delegated by the business process owner) should be consulted
before any change occurs and be kept informed through its implementation.
At all times, the data owner should know where the data resides and the process followed to verify any data
migrations, considering the mechanism used, reliance on network bandwidth during migration, error reporting tools
used, and any data transformation, such as non-Unicode™ to Unicode™ conversion or database changes from
SQL to Oracle®. The level of verification should be commensurate with the risks of the infrastructure change and any
migration involved.
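As an illustration of such verification, the minimal sketch below compares record counts and row-level checksums between a source table and a target table after a migration. The table and column names (source_results, target_results, id, value) are assumptions for this example only; the actual verification approach and acceptance criteria would be defined and approved in the migration plan.

```python
# Minimal sketch of post-migration verification: compare record counts and
# row-level checksums between source and target tables. Table and column
# names are assumptions for illustration only.
import hashlib
import sqlite3

def row_checksum(row: tuple) -> str:
    """Deterministic checksum of a row's values."""
    joined = "|".join("" if v is None else str(v) for v in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def verify_migration(conn: sqlite3.Connection, source: str, target: str) -> list[str]:
    """Return a list of discrepancies found between source and target."""
    issues = []
    cur = conn.cursor()

    # 1. Record counts must match.
    src_count = cur.execute(f"SELECT COUNT(*) FROM {source}").fetchone()[0]
    tgt_count = cur.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    if src_count != tgt_count:
        issues.append(f"Record count mismatch: {src_count} vs {tgt_count}")

    # 2. Row-level checksums must match, keyed on the record identifier.
    src_rows = {r[0]: row_checksum(r) for r in cur.execute(f"SELECT id, value FROM {source}")}
    tgt_rows = {r[0]: row_checksum(r) for r in cur.execute(f"SELECT id, value FROM {target}")}
    for rec_id, checksum in src_rows.items():
        if rec_id not in tgt_rows:
            issues.append(f"Record {rec_id} missing from target")
        elif tgt_rows[rec_id] != checksum:
            issues.append(f"Record {rec_id} altered during migration")
    return issues

if __name__ == "__main__":
    # Self-contained demonstration using an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE source_results (id TEXT PRIMARY KEY, value REAL);
        CREATE TABLE target_results (id TEXT PRIMARY KEY, value REAL);
        INSERT INTO source_results VALUES ('S-001', 98.2), ('S-002', 99.1);
        INSERT INTO target_results VALUES ('S-001', 98.2), ('S-002', 99.1);
    """)
    print(verify_migration(conn, "source_results", "target_results") or "No discrepancies found")
```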
Data ownership may be simple when a single organization owns and uses the data throughout the data lifecycle, but
when the creation, processing, and/or reporting of data is contracted to a second organization, the concept of data
ownership becomes more complex. The contract giver or sponsor company legally owns the data (even if they only
use data in summary form), while the contract acceptor may be the entity responsible for the creation, processing,
and reporting of the data.
The contract acceptor should have a data steward to address the day-to-day data responsibilities within the contract
organization, and report all concerns and risks to the data owner within the contract giver company who is ultimately
responsible for data ownership. The contract acceptor may be responsible for the data throughout the data lifecycle.
All of these responsibilities and expectations must be clearly defined within the quality contract and/or technical
agreement and verified through the vendor qualification process. In an environment involving multiple organizations,
a robust communication plan is necessary to ensure potential data integrity issues are reported to the “legal” data
owner. In this context, potential data integrity issues are not limited to instances of suspected data manipulation but
also include cyberattacks or security breaches, and the report should address the impact of these events.
3 Retention Strategy
A strategy should be developed to ensure the long-term retention of regulated or operational data output by a
business process. The retention strategy should focus on the management of the data throughout its lifecycle, not just
on the initial creation of data within the process.
Use of electronic records as the original/official records supporting business processes continues to increase.
Therefore, it is imperative to apply critical thinking to develop the retention strategy in the planning phase for a new
process or system. This will help ensure data is maintained in its dynamic format throughout the data lifecycle as long
as reasonably possible.
The risk is that as technology advances, old file types become obsolete, driving companies to:
• Maintain obsolete technology to continue to read and access the files, which can increase data integrity risks
• Migrate the data so that new technology can continue to read the record
There are risks associated with any strategy, and assessing those risks and planning a broad methodology for record
and data retention at the beginning of a system’s lifecycle allows the plan to be adjusted as technology changes or
as systems go through their lifecycle. As such, it is imperative for a company to have a strategy in place that will lead
their employees through the process; without this, the company increases its risk as the system moves toward its
end of life. The longer a system goes without a retention strategy, the greater the likelihood the company will lose the
ability to migrate or maintain records to the end of their retention period. Typically, as data ages, the need for it to be
processible decreases until at some point, a risk-based decision to change to a static format may be appropriate. This
is discussed in Section 7.2.
In order for data integrity to be ensured within the system and data lifecycles, a retention strategy should be crafted
during the design phase and ensure that data remains secure and protected from deliberate or accidental changes,
manipulations, or deletions throughout the retention period. The strategy for archiving GxP data also needs to be
planned, whether it is archived electronically or using a paper-based system.
An electronic archiving system, both hardware and software, must be designated as a computerized system and
validated to ensure the integrity of the data, and make certain that it is accessible and readable. See Appendices O2
and O3 for more details on archiving requirements. The data governance for the records in any new computerized
system and the archive solution should be aligned early. Where problems with long-term access to data are
envisaged, or when computerized systems must be retired, mitigation strategies for ensuring continued readability
of the data should be established. These may include virtualization of the original computerized systems, which is
discussed in Appendix O5.
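One supporting technical control for such an archive is a checksum manifest created at the time of archiving and re-verified periodically and at retrieval, so that any alteration or corruption of archived files is detectable. The sketch below is a minimal illustration under assumed file and manifest locations; it supplements, and does not replace, a validated archiving solution.

```python
# Minimal sketch of a checksum manifest for an electronic archive:
# hashes are recorded at archiving time and re-verified at retrieval.
# The folder layout and manifest format are assumptions for illustration.
import hashlib
import json
from pathlib import Path

def file_sha256(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(archive_dir: Path, manifest_path: Path) -> None:
    """Record a checksum for every file in the archive at archiving time."""
    manifest = {
        str(p.relative_to(archive_dir)): file_sha256(p)
        for p in sorted(archive_dir.rglob("*")) if p.is_file()
    }
    manifest_path.write_text(json.dumps(manifest, indent=2))

def verify_manifest(archive_dir: Path, manifest_path: Path) -> list[str]:
    """Return discrepancies between the manifest and the archive contents."""
    recorded = json.loads(manifest_path.read_text())
    issues = []
    for rel_path, expected in recorded.items():
        target = archive_dir / rel_path
        if not target.exists():
            issues.append(f"Missing archived file: {rel_path}")
        elif file_sha256(target) != expected:
            issues.append(f"Checksum mismatch (possible alteration): {rel_path}")
    return issues

# Example usage (paths are hypothetical):
# build_manifest(Path("/archive/2020/batch_records"), Path("/archive/2020/manifest.json"))
# print(verify_manifest(Path("/archive/2020/batch_records"), Path("/archive/2020/manifest.json")))
```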
As regulated companies increasingly utilize cloud-based services, a SaaS provider may need to offer appropriate
strategies to ensure adequate controls are in place for data retention. See ISPE GAMP® Good Practice Guide:
IT Infrastructure Control and Compliance (Second Edition) [19] for additional considerations around cloud-based
services and data retention.
The high-level elements of the strategy should be defined in a corporate data governance document as discussed
in ISPE GAMP® Guide: Records and Data Integrity Section 3.3.6 Policies and Standards [8], listing the necessary
elements, plus ones to consider, for the record retention of a system or project. Refer to Appendix O1 – Retention,
Archiving, and Migration [8] of the same Guide for information on how to document the retention strategy for a
particular system.
The key points to a retention strategy are knowledge and consideration of data classification and intended use,
readability, availability, access, and data ownership. A good place to start when developing a strategy is to document
this information in the form of a risk assessment with the goal of documenting the business needs, associated risks,
and current mitigation plans. As new information becomes available over the system lifecycle, the assessment can
be updated easily to reflect such changes. See Figure 4.1 Data Integrity by Design Process Flow Diagram for an
overview of the assessment during the system’s lifecycle. This approach keeps the retention strategy up-to-date
and representative of the current needs of the system, balanced against the capability and feasibility of available technology. The following
sections provide a more thorough breakdown of the details to consider.
3.1 Retention Periods
A regulated company generates many records with varied retention requirements during the course of its business
processes, for example:
• Regulated records retained beyond the minimum for business reasons (see Section 7.4.1)
• Records of scientific innovation or invention (including drug development data) that may be needed in perpetuity
for legal reasons
• Financial records falling under the Sarbanes-Oxley Act 2002 (US) [20] (or similar legislation in other countries)
• Records containing personally identifiable information (for example, human resources records and patient
enrollment records for clinical trials)
• Operational records needed solely for process improvements and business decisions (see Section 1.6.5).
Each of these record types is required to be available for a different duration depending on what, if any, legislation applies to that data classification (see Sections 1.6.5 and 4.4 for a simple classification, and the more detailed example
in ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts, Appendix 8 [16]). Even within the regulated
records category, different retention periods are specified. Additionally, there may be legal and business reasons for
extending retention periods, including possible future use of the record.
It is therefore important to identify which records fall under which legislation, and the applicable retention periods. Where different records have drastically different retention periods (they can vary from 6 months to 30 years), it is important to segregate their storage. Otherwise, for example, a database containing a mixture of data (GLP, GCP, GMP, GDP, and business records) needs to be kept for the duration of the longest retention period of any of its records, giving rise to long-term storage of an excessively large amount of data, much of which is no longer required.
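As a simple, hypothetical illustration of this retention logic (the classifications and retention periods shown are placeholders, not regulatory values), the following sketch derives the earliest permissible destruction date for each record from its classification:

```python
from datetime import date

# Placeholder retention periods in years per data classification
# (illustrative only - actual periods come from applicable legislation and policy)
RETENTION_YEARS = {
    "GMP batch record": 5,
    "GLP study data": 10,
    "Clinical trial record": 25,
    "Business record": 2,
}

def earliest_destruction_date(created: date, classification: str) -> date:
    """Earliest date the record may be destroyed, absent legal hold or business need."""
    years = RETENTION_YEARS[classification]
    try:
        return created.replace(year=created.year + years)
    except ValueError:          # record created on 29 February
        return created.replace(year=created.year + years, day=28)

# Segregating storage by classification avoids keeping everything for the
# longest applicable period
records = [
    {"id": "R-001", "classification": "Business record", "created": date(2020, 3, 1)},
    {"id": "R-002", "classification": "Clinical trial record", "created": date(2020, 3, 1)},
]
for rec in records:
    rec["destroy_after"] = earliest_destruction_date(rec["created"], rec["classification"])
    print(rec["id"], rec["classification"], "retain until", rec["destroy_after"])
```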
Where an organization is creating data that is relevant to a number of other entities (e.g., training records within
a Contract “x” Organization (CxO)), it is important that the organization has a clear approach for identifying and
managing the data – both for operational use and when archiving/destroying data – for the individual entities.
Implications concerning third-party data are discussed in Appendix M3.
The retention period, if not specified, should be sufficient to support any challenges to data integrity. In the absence of
a required retention period, the final disposition should be documented.
Records should be protected to ensure they are accurate and readily retrievable throughout the retention period.
Readability is implied by the retention period and is discussed in Section 3.2. Readability of audit trails and other metadata
is required under predicate rules to preserve the GxP content and meaning of the record, and should be retained and
available for review and copying for at least as long as the original record. During system planning and selection, it is
essential to include a user requirement for the ability to archive the complete data including audit trails and metadata.
Failure to meet this requirement could result in either an inability to include the audit trails and metadata in the
archive, or a need to preserve the entire original database to allow recreation of the audit trails.
There may be business needs that require data to be retained beyond the GxP-mandated retention period; these
business needs are discussed in Section 7.4.1.
Appendix O4 Tables 16.1 and 16.2 contain examples of retention requirements and periods. Appendix M3 discusses
the important considerations around managing data generated by third parties, for example, CROs.
3.2 Readability
Data should remain readable during the complete data lifecycle, that is:
• For dynamic data, retained in such a way that it can be further interacted with if required by the intended use
and/or specific predicate rule requirements (see Section 4.4 on intended use)
With paper records, no special tools are required to read the data. If the paper record has endured (survived), it can
be read as a stand-alone complete record. It can be converted to a static electronic image, that is, a PDF or JPG
file by a scanning process, and easily and reliably read on any system having PDF or image viewing software. This
should be a verified process to ensure the electronic copy is a true copy, and is discussed in detail in Section 7.2.1.
In a computerized system, original records generated electronically often require additional resources such as
databases to supply the record content, along with associated metadata to provide the context and meaning of the
record. Typically, data created in an electronic system is readable by that same system, and often readable only by that system, due to vendor-proprietary formats and database structures.
If the record is static in nature (i.e., does not require user intervention or processing to be meaningful), then it can be
converted to a PDF or similar format by a verified process to provide ease of readability. Section 7.2.1 discusses the
differences between static and dynamic data and implication thereof.
Changes to the software application such as a software upgrade or database structure modifications can result in a
loss of readability for data created in earlier software versions. For readability considerations around inactive data
stored in a dedicated archiving system, see Section 7.2.2.
It is recommended to ensure software is backward compatible prior to the upgrade. If the software does not provide
backward compatibility, then confirm that the data can be migrated as and when required. Controls around data
migration are discussed in Section 3.5.3.
Changes to a GxP computerized system should be implemented under formal change control and configuration
management, which is detailed fully in ISPE GAMP® Good Practice Guide: A Risk-Based Approach to Operation of
GxP Computerized Systems Chapter 10 [21]. When reviewing a change proposal for a computerized system, it is
important to review vendor release notes for:
• Changes to the data structure or file format in the new version, and vendor claims around record compatibility
between old and new versions
• Changes to the storage approach, e.g., change in the database version or even to the database type (Oracle®
versus mySQL™, etc.)
Where there will be an impact on the readability of the existing data in applying the software upgrade, remediation
activities are needed, such as:
• Using vendor tools (if provided) to migrate the existing data into a newer format readable by the upgraded
software, and verifying the migrated data remains complete (including metadata), accurate, and consistent
compared to the original records. Data migration testing, including advice on statistical sampling, is covered in
detail in the ISPE GAMP® Good Practice Guide: A Risk-Based Approach to Testing of GxP Systems (Second
Edition) Appendix T9 [22].
• If a vendor tool is not available, determine if there is a commercially available tool that can provide the needed
data migration.
Even where there is no explicit vendor statement on the impact to the data readability, it is still recommended to verify
that a sample of existing data is readable (and, for dynamic data, can be interacted with) after the upgrade.
Data sampling should be risk-based and leverage statistical approaches to ensure sufficient data is sampled to
identify (and correct) any missing or corrupt data. This is in addition to, and separate from, validating the change to
the computerized system.
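A minimal sketch of such a risk-based sampling check is shown below (Python; the stored pre-upgrade checksums, sample size, and record-reading interface are assumptions):

```python
import hashlib
import random

def checksum(payload: bytes) -> str:
    """Stable fingerprint of a record's exported content."""
    return hashlib.sha256(payload).hexdigest()

def sample_readability_check(pre_upgrade_checksums, read_after, sample_size=50, seed=1):
    """Verify a random sample of records is still readable and unchanged after an upgrade.

    pre_upgrade_checksums - dict of record id -> checksum captured before the upgrade
    read_after            - callable returning the record content (bytes) after the upgrade
    """
    rng = random.Random(seed)                      # reproducible, documented sample
    ids = list(pre_upgrade_checksums)
    sample = rng.sample(ids, min(sample_size, len(ids)))
    failures = []
    for rid in sample:
        try:
            if checksum(read_after(rid)) != pre_upgrade_checksums[rid]:
                failures.append((rid, "content mismatch"))
        except Exception as exc:                   # record unreadable after upgrade
            failures.append((rid, f"read error: {exc}"))
    return failures
```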
Where there is an upgrade to a computerized system containing master data (e.g., manufacturing recipes, customer
contact details), formal verification may be required to confirm that the data is unaffected and remains aligned across
integrated systems.
In the event of a database change, an extensive software upgrade with no migration tool or backward compatibility,
or at system retirement, an evaluation should be made of the feasibility of maintaining a functional copy of the original
computerized system (either as a physical or virtual machine) that will be used to read and interact with existing data
throughout the remaining retention period for that data.
• Newer Operating Systems (OS) have better backward compatibility features that may be able to run legacy
software without much trouble if it was produced in the last several years and if the data storage remains in the
same structure (e.g., flat file throughout or the same relational database).
• For laboratory and process control software, it is important to determine if the software application can be run
without connection to instrumentation and process equipment.
• The practical implications and limitations of maintaining a functional copy of an obsolete system using virtual machine technology are covered in detail in Appendix O5.
This should be combined with an assessment of the risk of not keeping the records in a dynamic format (see Section
7.2.1). At some point, as discussed in Section 7.2, maintaining readability of the old dynamic records may become
unfeasible and conversion to static format is inevitable.
One risk associated with long-term readability of records is personnel turnover. The personnel who generated and interacted with the data may no longer be with the company, and operation of the legacy system may be challenging for inexperienced operators. It is advisable to archive training materials with the software so that inexperienced personnel accessing the old data have at least some information available. Any information that can be provided to a new user about the data also helps the data owner answer questions during a regulatory review.
When selecting a new system for use with GxP data, incorporate the retention and archiving requirements into the user requirements, for example:
• The available functions to export and report data out of the system for archiving purposes
• The system’s ability to mark data as “Archived” and how archived data will be removed from the originating
system
• How existing data is handled when the system is upgraded:
- Is the data automatically upgraded as needed to suit the new software version as part of the system upgrade?
- Does the data have to be revised or modernized by a separate step to keep it readable?
• The history of earlier versions of the system to understand the depth and frequency of changes to data structure,
file formats and databases, and also to assess the level of backward compatibility routinely maintained in the
system.
These could be addressed in a vendor evaluation; minimally, there should be a discussion with the vendor to
understand their vision and plans for the product going forward.
When a system is upgraded (upgrade of the application or operating system, or implementation of a later-model
system), the existing data may be automatically revised to be readable within the system, either during the upgrade
or when transferred into the upgraded system. This relies on the capability of the new system to read data created in
the older version, possibly by converting it into the newer format. This type of transfer is the simplest form of migration
but still requires a risk-based level of verification to confirm the integrity of the data. Special attention needs to be paid
to any changes in the processing algorithms that could produce a different result.
A system that internally manages its data revision during an upgrade reduces the effort required to keep data
readable in the system through future system upgrades during its operational life. A system with the inherent ability to
read data created in all earlier versions of the system, including data restored from the archives, greatly simplifies the
ongoing data retention burden.
The capability to export data for archiving and retention is critical when using SaaS solutions hosted off-premise.
There needs to be a mechanism to extract the data if and when the regulated company moves away from that
solution; this requirement should be defined in the Service Level Agreement with the SaaS provider. Many of the
considerations discussed in Appendix M3 – Third-Party Data may also be relevant for SaaS solutions.
3.3 Availability
The need for the availability of records is determined by the current state of the records: active, semi-active, or
inactive.
Active records start from the first creation of data. Generally the record remains active through the processing
and review, reporting and use phases of the data lifecycle, where access is needed frequently to progress
the data to complete its business purpose, that is, its use. Active records are introduced in Section 3.3.1 and
discussed in detail in Chapter 6.
Semi-active describes the situation where the record is no longer routinely accessed, but may be needed
periodically such as for use in trending and control charts, and during annual product review or complaints
investigations. Semi-active records are introduced in Section 3.3.2 and discussed in detail in Section 7.1.
Inactive records are records no longer expected to be accessed other than for exceptional circumstances, for
example during an inspection or investigation. At this point in the record lifecycle, the records will be infrequently
or minimally accessed for the remainder of their retention period. Inactive records are introduced in Section 3.3.3
and discussed in detail in Section 7.2.
At the end of the retention period, the inactive record can be destroyed as long as there is no legal hold or
business reason to keep the data. Destruction is covered in Section 7.4.
There are no hard-and-fast rules as to when a record should transition from active to semi-active, or from semi-active to inactive; it is a spectrum of decreasing need to access the data frequently, balanced against the desire to maintain performance in the live system(s). The transition points between these states can move based on the regulated company's choices.
It should be noted that a record entering the semi-active or inactive state should not impact its static or dynamic state
– the regulatory expectations to maintain dynamic data in dynamic format remain.
Data stored in the computerized system that originally generated the data (the “live system”) is immediately available
to authorized users of that computerized system (although there may be restrictions on which users can access what
data – see Section 3.4). This immediate availability is efficient where data is active (being generated, processed,
reported, reviewed, and/or used) and being used by a range of interested parties (production operators generating
data, production supervisors collating and reporting data, the quality unit reviewing data, and the authorized person releasing
the batch based on the data). Active records are discussed in detail in Chapter 6.
As it moves through the lifecycle, data may be needed less frequently; for example, batch release data from the
previous month may not be needed until the annual product review when it will be trended with other release data. In
this semi-active phase, where the data is needed infrequently, the options available to an organization are to:
• Leave the data in the live system but prevent further changes by segregating the semi-active (or inactive) data
from the active records, that is, the data is restricted as read-only or is rendered inaccessible to the majority of
users
• Look at off-line storage solutions, for example, electronic archives as discussed below and accept the
consequently slower access to the data inherent in moving it out of the live system. This should be balanced
against the need for data to be rapidly available in the case of a recall scenario.
Based on the business process and regulatory requirements, a read-only or reduced-access approach may be used to limit the potential for data changes, in which case ideally the system should offer incremental levels of control that can be implemented over time, such as:
1. Preventing new data from being written into the data folder, while still allowing users to work with existing data
2. Later, preventing further processing or editing of the data, while keeping it available for viewing and approval
3. Finally, after the business process cycle time is complete, moving the data to fully read-only
There may even be a “no access” level of control whereby the data can no longer be viewed, pending its transfer to
the archiving system.
It is important to limit which users can initiate the restrictions, with potentially a smaller subset of users able to
reverse the restrictions in case the data is needed for an investigation or audit. This is particularly important for
chromatography data where an inspector can request the data to be reprocessed to assess the impact of alternate
processing parameters.
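A minimal sketch of such incremental restriction levels, including restricting who may tighten or reverse them, is shown below (Python; the level names, role names, and one-way tightening rule are illustrative assumptions, not a prescribed model):

```python
from enum import IntEnum

class AccessLevel(IntEnum):
    """Illustrative incremental restriction levels for semi-active data."""
    ACTIVE = 0        # new data may be written, existing data may be edited
    NO_NEW_DATA = 1   # existing data may still be processed and edited
    NO_EDITS = 2      # viewing and approval only
    READ_ONLY = 3     # fully read-only
    NO_ACCESS = 4     # hidden pending transfer to the archive

class DataFolder:
    def __init__(self, name: str):
        self.name = name
        self.level = AccessLevel.ACTIVE

    def restrict(self, new_level: AccessLevel, actor_roles: set):
        """Tighten restrictions; only designated roles may do so."""
        if "data_steward" not in actor_roles:
            raise PermissionError("only a data steward may tighten restrictions")
        if new_level <= self.level:
            raise ValueError("restrictions may only be tightened here")
        self.level = new_level

    def relax(self, new_level: AccessLevel, actor_roles: set):
        """Reverse a restriction, e.g., for an investigation or audit."""
        if "archivist" not in actor_roles:      # smaller subset of users
            raise PermissionError("only the archivist role may relax restrictions")
        self.level = new_level

folder = DataFolder("2023-Q4-batch-data")
folder.restrict(AccessLevel.NO_NEW_DATA, {"data_steward"})
folder.restrict(AccessLevel.READ_ONLY, {"data_steward"})
```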
When an organization determines that the data is no longer needed to be readily available for instant access, or
the amount of data in the live system is degrading system performance, moving the data from the live system to an
electronic archive is recommended. This brings multiple advantages:
• The amount of data in the live system does not become sufficiently large to impact system performance (e.g.,
long search times, delays in reading data from a database)
• The time required to back up the live system remains manageable (e.g., less than 24 hours, thus allowing daily
backups)
• The storage space required for backups of the live system is minimized
• The archived data is only accessible to a limited number of independent personnel and no longer available to the
originating users, thus reducing the risk of alteration during the retention period [23]; this is a specific regulatory
requirement for some areas. (See Appendix O4 for examples of requirements for record retention.)
There are more general regulatory requirements (listed in Appendices O2 and O3) for the use of archives in support
of long-term data retention. In its simplest form, archiving electronic data requires the creation of a complete and
accurate copy of the data (including metadata to preserve GxP content and meaning) in an off-line storage location.
Once the off-line copy has been verified as complete and capable of being restored into the live system when
needed, the original record can be deleted from the live system. As with any other regulated data, provision should be
made for backing up the archived data to guard against data loss.
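The verification step described above might be sketched as follows (illustrative only; the file-based layout and checksum comparison are assumptions, and a trial restore would still be needed before deleting the originals):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Checksum of a file, read in chunks to handle large records."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_archive_copy(live_dir: Path, archive_dir: Path) -> list:
    """Return a list of problems; an empty list means the copy is complete and identical."""
    problems = []
    for original in live_dir.rglob("*"):
        if not original.is_file():
            continue
        copy = archive_dir / original.relative_to(live_dir)
        if not copy.exists():
            problems.append(f"missing in archive: {original}")
        elif sha256_of(original) != sha256_of(copy):
            problems.append(f"checksum mismatch: {original}")
    return problems

# Only when verify_archive_copy(...) returns no problems (and a trial restore has
# succeeded) should the originals be removed from the live system.
```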
Where validated automated archiving processes cannot be provided, manual archiving may be required. Manual
archiving can only be controlled by procedure/protocol and requires verification that all the required records are
archived. Manually administering the electronic archiving process, especially when dealing with flat files, carries an
inherent risk that more data may be deleted than was copied to the archive location, for example by choosing to copy
only the passing results for long-term retention. For this reason, validated automated electronic archiving systems are
preferred.
See Section 7.2.2 for a discussion of the important features for an electronic archiving system, including advanced
indexing capabilities to facilitate record searching. Archive requirements are included in Appendices O2, O3, and O4.
If a manual archive process is all that is available for a system, an archive protocol must be written to ensure all data
is appropriately archived.
Some companies choose to maintain the data in a secured area of the live system in a “no access” state for the
full retention period. If that approach is used, the data may be secure, but the advantages listed above will not be
achieved.
3.4 Access
Inherent in considerations regarding the availability of the data is the consideration of who should be able to access
the data during its lifecycle. Access controls should ensure that only authorized personnel can access data. This may
be controlled by business role and organization.
A detailed discussion of security management for computerized systems is contained in ISPE GAMP® Good Practice
Guide: A Risk-Based Approach to Operation of GxP Computerized Systems Chapter 15 [21] and in ISPE GAMP®
Good Practice Guide: IT Infrastructure Control and Compliance (Second Edition) Appendix 5 [19] for infrastructure
security.
When active records are stored in the live system, access controls in that system are based on the principles of:
• Segregation of Duties: This is the general concept of having more than one person required to complete a
process or workflow as an internal control intended to prevent fraud and error. For example, segregation of
duties requires that authorization to generate data should be separate from authorization to verify data (e.g., a user
can review their peers’ data but not their own), and that users conducting normal work tasks in the system should
have access rights to do so while users with elevated access rights (e.g., administrator privileges, engineer roles)
should not conduct normal work tasks on the system [24].
• Least Privileges: A further aspect of assigning access to functionality is the principle of least privileges, whereby
each user is given sufficient access for their routine tasks and no more. Senior users may be given access
to higher risk functionality (e.g., creating new data storage folders or adding new users), with the highest risk
functionality (e.g., system administration, database administration or other enhanced access) reserved for
users outside of the department’s reporting structure (for example, members of the corporate IT department).
Careful consideration should be given to how and where passwords are managed, including resetting forgotten
passwords; there is potential for fraud if someone in the business process can reset a password and then
conceivably execute tasks under someone else’s account.
Assigning access controls, including database administrator access, is discussed in detail in ISPE GAMP® RDI Good
Practice Guide: Data Integrity – Key Concepts Section 4.5 [16].
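A minimal sketch of how segregation of duties and least privileges might be enforced in an application's permission checks is shown below (Python; the role names and permission sets are hypothetical):

```python
# Hypothetical role-to-permission mapping illustrating least privileges:
# each role receives only what its routine tasks require.
ROLE_PERMISSIONS = {
    "operator":   {"create_record"},
    "reviewer":   {"review_record"},
    "supervisor": {"create_folder", "review_record"},
    "it_admin":   {"manage_users", "manage_system"},   # no routine work tasks
}

def can(user_roles: set, permission: str) -> bool:
    """Least privileges: a user holds only the permissions of their assigned roles."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)

def approve_review(record: dict, reviewer: str, reviewer_roles: set):
    """Segregation of duties: the creator of a record may not verify it."""
    if not can(reviewer_roles, "review_record"):
        raise PermissionError("role lacks review permission")
    if record["created_by"] == reviewer:
        raise PermissionError("a user may not review their own data")
    record["reviewed_by"] = reviewer
    return record

record = {"id": "BR-042", "created_by": "alice"}
approve_review(record, reviewer="bob", reviewer_roles={"reviewer"})
```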
In addition to controlling which functionality in the system is accessible to a particular user, there may also be a need
to control what data can be accessed by that user. This is especially relevant where a computerized system is shared
across multiple business processes and/or organizations, for example sharing a chromatography data system across
both the research and development and QC laboratories, or where a system contains confidential patient information.
In an electronic archiving system, access to the archived data can be restricted to a limited number of data management personnel so that it is no longer available to the originating users, thus reducing the risk of data alteration during the retention period [23].
If there is a need to restore the archived data to the original system or a functional copy thereof, special care must be
taken to manage any access to or editing of the restored data. This is discussed in Section 7.3.
System lifecycle considerations for archive systems are covered in Section 7.2.2; archiving requirements are
discussed in more detail in Appendices O2 and O3.
Logical security (system access controls) alone is not sufficient [11, 25]. Physical access controls such as perimeter
security, building and facility access control should be in place in addition to logical security. The storage location
must be physically secured against unauthorized access, for example, by instigating key card door controls on the IT
server room such that only authorized IT personnel can gain access to the system server or archive storage media.
It is important to protect records supporting GxP decisions to ensure that the information is secure from unauthorized
modification or deletion. The procedures and processes implemented to address risks to active records should be
commensurate with the importance of the data, the process, and the complexity of the system. All data must be in a format that permits secure storage and remains available for review and reporting.
As discussed in the ISPE GAMP® Good Practice Guide: IT Infrastructure Control and Compliance (Second Edition) [19]:
“Lack of security may compromise availability of applications and services, record integrity and confidentiality,
reputation with stakeholders, and may lead to unauthorized use of systems that would ultimately impact product
quality.
• Availability: ensuring that authorized users have access to information and associated assets when required
• Integrity: safeguarding the accuracy and completeness of information and processing methods
• Confidentiality: ensuring that information is accessible only to those persons authorized to have access.”
There are processes, controls, and procedures necessary to protect the safety, confidentiality, integrity, and availability of the data in the operational environment. Such safeguards include (but are not limited to) the backup and restore, disaster recovery, and business continuity controls discussed below.
It is important to develop the backup approach and disaster recovery strategy based upon the risk of the system
and the criticality of the process and data. Backup and restore, and disaster recovery as part of business continuity
management are discussed in detail in Chapters 13 and 14 of the ISPE GAMP® Good Practice Guide: A Risk-Based
Approach to Operation of GxP Computerized Systems [21].
Backup and restore procedures ensure the accurate and reproducible copying of digital assets, including the data and software, so that if the original data is lost, for example due to a disaster, it can be restored.
Conventional backup processes typically involve creating a duplicate copy of system data on a fixed-time basis, such
as a nightly scheduled backup. Other controls that can be implemented utilize sophisticated network architectures
including database mirroring and redundancy across diverse and geographically dispersed datacenters, which can
result in close to 100% uptime. Synchronous replication – where data is written to the primary storage and a remote replica simultaneously, ensuring that both versions are always identical and that one replica is always available – is another approach to lessen the risk of data loss. The data transfer speed should be sufficient for replication to keep pace with data generation, preventing degradation or loss.
The Business Continuity Plan and the Disaster Recovery Plan collectively address how to continue or resume
operation after a disaster. A disaster is defined within the ISPE GAMP® Good Practice Guide: IT Infrastructure Control
and Compliance (Second Edition) [19], as:
“Any event (i.e., fire, earthquake, power failure, etc.) which could have a detrimental effect upon an automated
system or its associated information.”
The Business Continuity Plan (BCP) describes the steps the business must follow to restore the critical business
process following a disruption and addresses what the business must do to continue without the computerized
system(s).
A Disaster Recovery Plan (DRP) is defined by the ISPE GAMP® Good Practice Guide: A Risk-Based Approach to
Operation of GxP Computerized Systems [21] as:
“A sub-set of Business Continuity Management that focuses on regaining access to an IT system, including
software, hardware, and data following a disaster.”
Disaster recovery must address the provision for replacing or restoring the computerized system. The DRP
must provide details of how to restore the complete computerized system including the infrastructure, hardware,
application, and database. The recovery site is usually chosen at a geographic distance from the main site in case
the disruption is due to a natural occurrence such as an earthquake or flooding. If there is a need to restore specific
instruments, as in the case of laboratory systems, the DRP should contain procedures for obtaining replacement
instruments.
It is important to document how much downtime is acceptable for a specific computerized system/business process.
The process owner must define the allowed maximum system downtime, expressed as the Recovery Time Objective (RTO), and the maximum loss of data that can be tolerated by the business process, defined as the Recovery
Point Objective (RPO). This information is used by the IT specialist to design the necessary backup process and
architecture to support these requirements. Companies must ensure response times are described within their
internal or external Service Level Agreements (SLAs).
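As a simple illustration of how a proposed backup design might be checked against the RPO and RTO (a sketch only; the figures and the linear restore-time estimate are assumptions):

```python
from dataclasses import dataclass

@dataclass
class BackupDesign:
    backup_interval_hours: float      # time between scheduled backups
    restore_hours_per_tb: float       # measured from trial restores
    data_volume_tb: float

@dataclass
class BusinessRequirement:
    rpo_hours: float    # maximum tolerable data loss
    rto_hours: float    # maximum tolerable downtime

def meets_requirements(design: BackupDesign, req: BusinessRequirement) -> dict:
    """Worst-case data loss equals the backup interval; restore time scales with volume."""
    worst_case_loss = design.backup_interval_hours
    estimated_restore = design.restore_hours_per_tb * design.data_volume_tb
    return {
        "rpo_met": worst_case_loss <= req.rpo_hours,
        "rto_met": estimated_restore <= req.rto_hours,
        "worst_case_loss_hours": worst_case_loss,
        "estimated_restore_hours": estimated_restore,
    }

# Example: nightly backups of 2 TB at 3 h/TB restore, against a 24 h RPO and 8 h RTO
print(meets_requirements(BackupDesign(24, 3, 2), BusinessRequirement(24, 8)))
```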
All of these processes, controls, and procedures should be periodically verified and may be assessed during periodic
review to provide assurance that they still meet organizational and regulatory expectations, including doing trial
restores of the backup data. Consideration should be given to perform walk-throughs and mock-executions of the
DRP and BCP.
The same data integrity requirements (including confidentiality and security) apply to systems managed internally by
the regulated company or externally. The expectations from the regulators are the same whether organizations use
physical infrastructure on-premise, virtualized servers on-premise, hosting at datacenters, or leveraging off-premise
cloud-based services. The regulated company has the ultimate responsibility for patient safety, product quality, and
data integrity, wherever and however the data is stored.
In cases where the system is supplied using cloud-based infrastructure (SaaS, PaaS, or IaaS), the responsibility for the procedures related to disaster recovery is shifted in whole or in part to the cloud provider. One of the benefits of a cloud-based infrastructure is its resiliency, that is, the ability of the service to respond to issues, although the benefit gained in resilience is often offset by a decrease in security controls. Another advantage of using a cloud-based infrastructure is system uptime, which providers often report as 99% to 100% [26].
The ISPE GAMP® Good Practice Guide: IT Infrastructure Control and Compliance (Second Edition) [19] provides
some insight into the advantages and risks of using various infrastructure services.
Among the advantages highlighted is high availability. However, as that Guide notes:
“Cloud computing introduces a flexibility in resource capacity, but also introduces new risks to regulated companies.”
The use of infrastructure outsourcing has also presented compliance challenges including security, availability, integrity, and confidentiality, and, should the contract be terminated, data deletion or an inability to access the data.
The two critical activities/considerations for the users of cloud-based infrastructure services are the supplier
assessment and the contract or Service Level Agreement (SLA).
As with any supplier assessment, it is critical to assess the services delivered by the cloud provider and the controls
and procedures established to ensure the operations. It is important to define the expectations of the regulated
company and the corresponding responsibilities of the provider. One of the considerations is the notification of
changes to the hosted service, including upgrades to infrastructure, platform, or software application. At a minimum,
the notification should include details of the change to allow the regulated company to evaluate any potential impact
to their systems and data.
As part of this assessment, it is important to evaluate the disaster recovery process at the supplier including
review of the DRP and the frequency and conditions of the disaster recovery testing. The DRP should address the
communication strategy during a disaster. It is important to document how a disaster at the cloud provider will be
communicated to the system users.
It is also important to determine if the cloud-service provider subcontracts any part of their service. If this is the case,
such as a SaaS provider using a third-party PaaS or IaaS provider, it is important to understand how the cloud-
service provider evaluated their subcontractors and how problems with service at the subcontractors will be prioritized
and communicated.
The ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts [16] states:
“Even if the third party is not subject to healthcare regulations when they enter into a Quality Contract/Technical
Agreement with a regulated company, they must be aware of the requirements of the environment in which they
are working.”
3.5.3 Migration
The MHRA ‘GXP’ Data Integrity Guidance and Definitions [12] provides significant guidance on controls required
for data transfer and data migration. Fundamentally, any transfer or migration process must be validated and must
preserve the GxP content and meaning of the data.
The ISPE GAMP® Guide: Records and Data Integrity Appendix O1 [8] discusses the option of storing electronic records in formats other than the original record format, and lists appropriate points to consider when planning record transfer or migration.
The decision to convert electronic records into a different format (an alternative file type, or dynamic to static) should
be based upon a documented risk assessment considering the requirements for record retention, access, and use.
ISPE GAMP® Guide: Records and Data Integrity Appendix O1 [8] includes a comprehensive table, Table 18.1, that
describes multiple risk factors for the conversion of electronic records to an alternative format.
Transferring data between different applications will also likely require a conversion of format. When this is needed
for active data, discussions with the vendor of the target system will determine if the entire active data set can be
converted directly:
• Via an export from the current system and import into the new application
One challenge with retention of records is that the records may be in a format proprietary to an individual vendor and
therefore the data may only be read in the originating application.
The conversion of inactive records to a vendor-neutral format was introduced in the ISPE GAMP® RDI Good Practice
Guide: Data Integrity – Key Concepts Section 4.3.6.3 [16]. The vision of vendor-neutral formats has been a desire
for industry for many years, especially in the realm of analytical laboratory data. Leveraging a vendor-neutral format
could help to break the reliance between the data lifecycle and the system lifecycle. Data would no longer be
exported from one system and imported into another, but exist in file structures that live in a common data lake with
the ability of various applications to open, view, interrogate, and mine the data for new scientific insights, as well as
provide the ability to easily share data between collaborating companies. This is an ongoing industry initiative and not
yet fully implemented or available.
Any “temporary” storage locations of data need to be tightly secured and controlled. There should be a record of
transferring data between storage types, formats, or computerized systems. A Data Migration Plan should define
the scope of the migration, the personnel and data involved, and the strategy to be applied. The ISPE GAMP® Good
Practice Guide: A Risk-Based Approach to Operation of GxP Computerized Systems Chapter 17 [21] examines
migration in detail and offers guidance on creating a Data Migration Plan. Migrating data of questionable integrity
does not improve the integrity of the data but may protect the data from further risks going forward.
As with all migration/conversion activities, it is essential to consider if all data, metadata (including audit trails and
review and approval signatures), and relational traceability between records is preserved after migration to the new
application. Care must be taken when migrating IT systems with databases to ensure that the associations are not
disrupted and that the data viewing capabilities remain. The risks arising from any changes in data or metadata should be assessed, managed, and documented. It is important that migration or data transfer processes are
designed and validated to confirm that data integrity is preserved (the accuracy, completeness, content, and meaning
of the data) following a risk-based approach.
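A sketch of such a migration verification is shown below (Python; the record structure, metadata field names, and parent-child link are illustrative assumptions); it checks completeness, field-level accuracy including audit trail metadata, and preservation of relational traceability:

```python
def verify_migration(source: dict, target: dict, fields=("content", "audit_trail", "signatures")):
    """Compare source and target record sets (keyed by record id) after migration.

    Returns a list of findings; an empty list indicates the checks passed.
    """
    findings = []

    # Completeness: every source record must exist in the target
    missing = set(source) - set(target)
    findings += [f"missing record: {rid}" for rid in sorted(missing)]

    for rid, src in source.items():
        tgt = target.get(rid)
        if tgt is None:
            continue
        # Accuracy: data and metadata (audit trail, signatures) must match
        for field_name in fields:
            if src.get(field_name) != tgt.get(field_name):
                findings.append(f"{rid}: field '{field_name}' differs after migration")
        # Relational traceability: parent/child links must still resolve
        parent = src.get("parent_id")
        if parent and tgt.get("parent_id") != parent:
            findings.append(f"{rid}: link to parent record {parent} not preserved")

    return findings

# Example usage with two tiny in-memory record sets
source = {"S-1": {"content": "result A", "audit_trail": ["created"], "parent_id": None}}
target = {"S-1": {"content": "result A", "audit_trail": ["created"], "parent_id": None}}
print(verify_migration(source, target))   # -> []
```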
When the migration/transfer involves cloud-based solutions, there are additional considerations required as described
in the ISPE GAMP® Guide: Records and Data Integrity [8]:
• “Understand and accept which aspects of control are being delegated to a provider
• Agree on the need for supplier support during regulatory inspections, depending on the architecture and services provided”
With PaaS or SaaS solutions, system upgrades are often driven by the service provider rather than the regulated company. If an upgrade forces the need for data migration, a risk-based approach should be taken to managing the migration. Appendix O5 offers additional considerations around off-premise solutions.
As stated in the ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts [16], system retirement:
“involves decisions about data retention, migration, or destruction, and the management of these processes.
Retiring a computerized system has major implications about how the data created in the system during its
operational life remains available, enduring, and readable in the remaining phases of the data lifecycle.”
Key questions to consider when retiring a system include:
• What data in the system needs to be retained after the system retirement (administrator access may be needed to view all data within the system)?
• What data is still within its retention period and needs the system to read and display it?
For dynamic data, a functional copy of the legacy software may be needed to read or interact with the data if the data
has not been migrated to a format readable in a current system. This is discussed more in Section 3.2.2, with details
around maintaining legacy software covered in Appendix O5.
Appendix M10 of ISPE GAMP® 5 on System Retirement [9] presents an overview of the system retirement process
including data integrity considerations.
Managing system retirement is comprehensively discussed in the ISPE GAMP® Good Practice Guide: A Risk-
Based Approach to Operation of GxP Computerized Systems, Chapter 18: System Retirement, Decommissioning,
and Disposal [21], which contains a detailed process flow diagram showing the critical activities and records to be
produced (see Figure 3.2), plus a RACI matrix (Responsible/Accountable/Consulted/Informed) of the roles and
responsibilities involved.
The primary consideration during system retirement is to ensure that data and metadata generated and stored in
a system are retained and readily available and readable as defined in the system retirement plan. The system
retirement process should have been considered during the creation of the organization’s retention strategy. The
strategy should be based on risk, taking into account regulatory expectations and business requirements, data
criticality, and the complexity of the computerized system. The strategy should require the identification of record type
(GMP, GLP, GCP, etc.), record retention period, and archival and retrieval processes for each system.
The 2010 ISPE GAMP® Good Practice Guide: A Risk-Based Approach to Operation of GxP Computerized Systems
[21] remains valid in every respect, but consideration of subsequent regulations and guidance on data integrity,
specifically expanding roles and responsibilities to include data owner and data steward, generates a slightly modified
RACI matrix as presented in Table 3.1.
Data Owner (R)
• Responsible for preserving the quality and integrity of data and documentation related to the system being retired and ensuring compliance with relevant retention policies
• Consulted on content of the Retirement Plan

Quality Unit (C)
• Consulted on content of the Retirement Plan for regulatory and compliance aspects
• Responsible for approving the Retirement Plan and Report

Archivist (SME) (R)
• Consulted on content of the Retirement Plan for organization records retention policies
• Responsible for executing the archive aspects of the Retirement Plan
• Acts as Data Steward for archived data
• Responsible for preserving the quality and integrity of data and documentation related to the retired system and ensuring compliance with relevant retention policies

Data Steward (R)
• Responsible for ensuring that good data integrity practices are followed during the planning and execution of the retirement, decommissioning, and disposal phases
Prior to beginning the retirement process, there needs to be a documented risk assessment to understand the
concerns and difficulties to be addressed. The retirement process should be defined, and any transfer or migration
process to be used validated, to ensure that no data or metadata is lost or compromised during the process. The
records should be copied and verified as complete and accurate before the destruction of the original equipment. It is
essential to ensure that the metadata, such as the audit trail or other information has transferred completely, as loss
of such information permanently compromises the integrity of the copy of the record.
Transferring inactive records that have already been archived from the system into a secure off-line archive location
may require additional steps, such as restoration back to the original application prior to conversion and transfer. This
is common when the inactive records have been archived in the original existing format, which is likely to be vendor
dependent. Therefore, it is critical to plan for this migration before decommissioning the original computerized system
application.
If some of the data has reached the end of its retention period and can be discarded, there should be a clearly defined process requiring data owner, process owner, and quality (and possibly legal) approvals prior to disposal. This requires the ability to segregate such data from the rest of the data without creating issues, and that the date of data creation is stored with the record. Naming conventions that include the date of creation may be beneficial because, when data is migrated, the file creation dates may be reset to the date of migration. Data destruction is discussed further in Section 7.4.
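A small sketch of such a naming convention is shown below (Python; the name pattern is a hypothetical example, not a prescribed format); embedding the creation date in the name keeps it recoverable even if file-system dates are reset during migration:

```python
from datetime import date
import re

def archive_name(record_id: str, created: date, extension: str = "dat") -> str:
    """Build a file name that carries the record's creation date with it."""
    return f"{created:%Y%m%d}_{record_id}.{extension}"

def creation_date_from_name(name: str) -> date:
    """Recover the creation date from the file name, independent of file-system dates."""
    match = re.match(r"(\d{4})(\d{2})(\d{2})_", name)
    if not match:
        raise ValueError(f"no creation date encoded in '{name}'")
    year, month, day = (int(g) for g in match.groups())
    return date(year, month, day)

name = archive_name("BR-042", date(2015, 6, 30))   # '20150630_BR-042.dat'
print(creation_date_from_name(name))               # 2015-06-30
```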
When a system reaches the end of its life and is to be decommissioned, a retention strategy is needed for data that has not reached the end of its retention period. As discussed in Section 3.2.2, ideally dynamic data should be maintained in a dynamic format, but sometimes this is not possible. If data is required to be retained for decades (e.g., traceability records for tissue donors, as contained in Appendix O4), there may come a point where a risk-based decision to save it in a static format is justifiable (i.e., the need to sort or reprocess it is very unlikely).
For example, a risk assessment may find that there is little or no foreseeable need for reprocessing after 10 years,
while retaining a computer (or even a virtual environment) with an unsupported operating system and application becomes
increasingly problematic. It is important to consider and document the risk of creating static records from the dynamic
records [12] and to ensure that the records are still able to support the reconstruction of the activities performed and
decisions made. This approach should not be considered for active records. This is further discussed in Section 7.2.
Information stored in PDF files may be substituted or transformed when transferred to a new server if the fonts were not embedded when the PDF was generated. Such substitution or transformation can permanently alter the record and may alter its meaning, such as when symbols are transformed or changed to standard fonts, rendering the information useless if it becomes unreadable. It is important to consider this during the initial transfer planning.
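One way to screen PDF files for non-embedded fonts before a transfer is sketched below (assuming the third-party pypdf library is available; composite Type0 fonts, whose descriptors sit under /DescendantFonts, are simply flagged for manual review here):

```python
from pypdf import PdfReader  # assumption: the third-party pypdf package is installed

EMBED_KEYS = ("/FontFile", "/FontFile2", "/FontFile3")

def fonts_without_embedding(pdf_path: str) -> set:
    """Return base font names that appear to lack an embedded font program."""
    suspect = set()
    for page in PdfReader(pdf_path).pages:
        resources = page.get("/Resources")
        if resources is None:
            continue
        fonts = resources.get_object().get("/Font")
        if fonts is None:
            continue
        for font_ref in fonts.get_object().values():
            font = font_ref.get_object()
            name = str(font.get("/BaseFont", "unknown"))
            descriptor = font.get("/FontDescriptor")
            # Simplification: fonts without a top-level descriptor (standard base-14
            # fonts or composite Type0 fonts) are also flagged for manual review.
            if descriptor is None or not any(
                key in descriptor.get_object() for key in EMBED_KEYS
            ):
                suspect.add(name)
    return suspect

# print(fonts_without_embedding("batch_report.pdf"))
```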
Another significant problem with PDF files is that they are limited to a static representation of the data and may not
contain the complete and accurate record of what was originally recorded by an electronic system; this is extensively
discussed in Section 7.2.1. Finally, PDF files also have very limited searching and indexing capabilities, and indexing
within a document management system is needed to ensure the records can be found when needed.
Readability of the retained data depends on the format of the data. If the data has remained in a vendor-specific
format, a functional copy of the legacy software may need to be retained to allow for data readability (as discussed in
Section 3.2.2 and Appendix O5). Training and operating materials should be archived along with the legacy system to
assist access by inexperienced personnel.
The scope of the system retirement strategy should include everything from simple devices and non-networked
systems through to and including enterprise systems. Different scenarios relating to system retirement are described
below. Note that for Step 1 these are broken down by system types, with common activities listed first, and system-
specific activities listed under system type.
The system is removed from active operations, normal end user access is withdrawn, and interfaces with other
systems are deactivated. SMEs involved in decommissioning activities are allowed access and permissions sufficient
to perform their allotted tasks. During the development of the retirement plan, it is essential to consider all associated
records, qualification data, calibration records, etc., as well as the regulated data and metadata required to be retained.
System Types
System not connected to a computer: This system generates electronic data but does not permanently store the
data (titrator, osmometer, etc.). The retirement plan should document the archival and retrieval of the temporary data
stored on the system. When retiring this system, no additional data will be generated from the system. Data typically
generated from these systems is static in nature and therefore may be retained in printed or electronic format. These
files are often saved in a read-only format (e.g., PDF) for ease of archival. Appendix D2 discusses some of the
implications of managing the data during the use of these systems.
System connected to a non-networked computer: Systems are connected to the computer for system control
and data acquisition. Data from these systems is transferred via temporary media to a more permanent, secure location. When retiring such a computerized system, no additional data will be generated from the system. Data
typically generated from these systems is dynamic in nature. The retirement plan should document the archival
and retrieval of the data stored on the system and permanent storage location. This plan should allow for data to
be maintained in its original dynamic format. In most situations, it is necessary to retain a copy of the application
software to retrieve data in its dynamic format.
System connected to a networked computer: These are systems connected to a computer for system control and
data acquisition, where the computer is connected to a network allowing direct data transfer to a secure location.
When retiring such a system, no additional data will be generated from the system. Data typically generated from
these systems is dynamic in nature. The retirement plan should document the archival and retrieval of the data stored
on the system. This plan should allow for data to be maintained in its original dynamic format. In most situations, it is
necessary to retain a copy of the application software to retrieve data in its dynamic format should this be required. A
virtualized copy of the system can be retained for this purpose until the end of the retention period is reached – see
Section 3.2.2 and Appendix O5.
Enterprise Systems: Enterprise systems are typically used to meet the needs of multiple sites. When retiring an
enterprise system, no additional data is added to the system from the specified retirement date. An enterprise system
may be retired at a single site or across all sites:
• If only a specific site is retiring the system and it will remain in operational use at other sites, then data availability
is simplified since the system is still active.
• It is more complicated if the whole enterprise system is to be retired from all sites as alternative arrangements
will be needed to maintain data availability and readability.
• When one site pilots a new system, there may be an interim negative impact on data quality and
nomenclature across the organization, as the new system may store data differently and with an enhanced
nomenclature compared to the existing system. This incompatibility continues until all sites have migrated to the
new system. See Sections 1.6.4.2, 4.7, and Appendix M2 for further explanation of data quality and nomenclature.
The retirement plan should allow site data to be retrievable in its dynamic format with the permission of the
designated data owner. An enterprise system typically remains in use until data is migrated to a new enterprise
system or to a long-term archive. When retiring a complete enterprise system to move to a new one, no additional
data is added to the system from this point forward. The retired system typically remains in use until the data stored is
either migrated to the new system or has reached its record retention requirement.
Retirement of the Archive System: When an archive system is due for replacement (reaches the end of its lifecycle
or an improved archiving solution is found), special care must be taken to preserve the integrity of the data throughout
the process. If the archive system is connected to other systems, replacement interfaces need to be created to link
to the new archive and validated to ensure they are transferring complete data. Retirement of an archive system
requires all of the same considerations as retirement of any other type of system but is complicated by the diversity
of the records and the possibility that the new archive system does not support all of the file formats. This may be the
time to convert dynamic data to static data based on unsupported file formats. Refer to Section 7.2.1 for details about
this conversion.
Decommissioning is the controlled shutdown of a retired system. A system may be stored if required to be reactivated
at a later date, such as to retrieve regulatory data or results.
Data, documentation, software, or hardware can be permanently destroyed. Each may reach this stage at a different
time. Data and documentation should not be disposed of until they have reached the end of the record retention
period as specified in the record retention policy.
During mergers, acquisitions, and divestments, it is essential to plan and define the approach to managing the
impacted systems and data. A fixed timeframe may be imposed by legal contracts to review associated and impacted
data as part of the transition period.
Key questions to consider include:
• What systems are involved, which will be kept, and which will be retired?
• Is a new system purchase required as part of consolidation or divestment? If so, this will need validation for
intended use before data can be imported.
• Can data generated by the same business process across the two companies be transferred into a single
system, e.g., all batch record data can be merged within the chosen Manufacturing Execution System (MES)?
• How will both the active data from the live system and the inactive data from an archive system be managed?
• What legacy systems are needed to read the archived data and how will they be managed?
• Who will be the data owners at the end of the transition phase?
• There may be changes in the underlying business processes during the transition period; how will these impact
the data flow ongoing?
For divestments, the transition period should be used to identify how to segregate data between the companies. An independent party may perform the data segregation process and provide a data set to the recipient. In that situation, it is vital to identify the data requirements early and to ensure that data segregation is performed at least twice, so there is an opportunity to rehearse the segregation, verify the resulting data set, and address any issues before the final cut of data is taken.
This section presents an approach to gaining a more systematic and enhanced understanding of the data set/data
system under development with respect to effectively managing the data lifecycle for regulated records.
A process to achieve data integrity by design throughout the data lifecycle is presented schematically in Figure 4.1.
This process supports the development, maintenance, and retirement of compliant data sets and data systems.
The data integrity by design process facilitates delivery of a design that ensures the computerized system(s) is/are fit
for intended use with respect to the data lifecycle and the business process.
The key development related activities for a process to be computerized are described or referenced in more detail
below.
• Define Existing Business Process: In addition to the process workflow it is critical to capture the business
process and end to end data flows (incorporating data transfer across interfaces) in order to provide assurance
that the full data lifecycle is understood, including high-risk areas and compliance gaps.
• Scope “To Be” Business Process: Use critical thinking to identify opportunities for end to end business
process improvement, with a focus on prevention and detection opportunities to minimize existing risks to data
integrity and data quality for regulated and other business-essential records.
Business process mapping is explained in detail in Section 4.2, while Section 4.3 introduces data flow diagrams.
Once the business process map and data flow diagrams have been generated, a business process risk
assessment (see Section 4.5) can be used to identify potential data integrity risks inherent in the process that
must be addressed as far as possible by the computerized systems selected.
• Develop Requirements: Critical thinking should be applied to analyze the regulatory requirements for data
integrity, and determine the necessary and most effective controls to meet the intended use. The intended use
inherently impacts the controls needed; for example, a system intended for use in early stage drug discovery
may have a somewhat different set of data integrity requirements than one intended for use in product release
testing with potential direct impact on patient safety.
It is also important to realize that the computerized system is comprised of people, process, and technology,
and therefore the requirements should inherently address not just the system technical controls but also the
applicable data governance aspects (e.g., user access controls and segregation of duties) and the ability of the
system to be configured to meet the intended use.
Sections 4.4, 4.6, and 4.7 provide guidance around defining the intended use of the data, the data lifecycle, and
the importance of creating data nomenclature to be used throughout the process or organization.
Establish high-level functionality and data requirements for the “to be” business process and data lifecycle (see
the ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts Appendix 6 [16]). In purchased
systems, the user requirements may be a combination of the business process requirements and the vendor
specification.
The historical focus of a typical User Requirements Specification (URS) has been on the required functionality
of the system to support the business process, including the interfaces to other systems and the ability to collate
data from multiple systems into a cohesive data set. Data and data lifecycle aspects should be considered carefully and specified with equal rigor in order to ensure that good practice expectations for data quality and data integrity can be incorporated into the design of a new system at the earliest opportunity.
To properly develop the design, a more detailed process map and data flow diagrams may need to be developed
at this stage in order to facilitate the identification of data integrity requirements. As part of data integrity by
design, how the system is designed to be available to a user in their environment should be considered. For
example, identifying the physical location of terminals and determining if they are sufficiently robust (e.g., to
cleaning processes in the operating environment) and maintained so that the terminal is available to an operator
for recording entries contemporaneously.
The preparation of the URS for a computerized system should indicate the priority of each requirement, for
example, “must have,” “should have,” and “could have.” As part of this process, data integrity-specific requirements
should be prioritized as a part of an iterative process of QRM (according to ICH Q9 (pharmaceuticals) [27], ISO
14971 (medical devices) [28], and ISPE GAMP® 5 [9]) in order to assess each requirement’s potential to impact the
quality and integrity of data within the system. Known data integrity trouble spots based on industry and company
experience (e.g., interfaces) should be explicitly examined for potential risks.
Data-related requirements should be clearly cross-referenced to process step activities and data flow
considerations, and to risks to public health, patient safety, and product quality. Requirements must be clear,
correct, and unambiguous. Cross-references to specific regulatory requirements (such as the predicate rules of
US GMP [29] as well as US FDA Guidance for Industry: Part 11 [30] and EU EudraLex Chapter 4 [31] and Annex
11 [11]) may also be helpful. An example of regulated record retention periods is contained in Appendix O4.
For computerized systems, data integrity requirements address availability, performance, resilience, security,
deployment model/architecture, and stability. These requirements are technical or procedural and include specific
controls such as second person verification, secure time stamp, compliant electronic signature, electronic audit
trail, backup and restore, etc. There may be other data integrity related considerations for a particular system,
for example, that static1 records are searchable or that dynamic records are able to be reprocessed by duly
authorized users.
At a minimum, systems that generate static data in a flat-file format may need to be networked with centralized data
storage to protect the data. An automated and validated mechanism to capture the flat files into a database
system for increased security and traceability is desirable.
• Develop Design: Identify and evaluate any commercially available systems with the potential to meet the
requirements. Where there is no commercially available solution, establish a design project to develop a system.
The initial design of a purchased, commercially available system may involve configuring a single system with a workstation, or may involve combining multiple systems with a workstation to automate a process. The automated equipment could come from different vendors but be configured to work with the same software, for example, connecting a dissolution bath, fraction collector, and a UV/Vis spectrophotometer. (Chapter 5 offers guidance on planning computerized systems.)
This design activity may be performed with help from the supplier or using internal support, but should be presented to the system owner’s Subject Matter Experts (SMEs) and quality representatives for review.
• Perform Risk Assessment: Data integrity focused risk assessment should be performed early in the project
phase and repeated as more information becomes available and greater knowledge is obtained (see ISPE
GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts, Chapter 4 – Risk Management Approaches
[16] for more information). System level risk assessments are covered in Section 5.3.
• Identify High-Risk Functionality: (see Section 5.3) – A “must have” requirement for data integrity is typically a
property or characteristic of the developed system that assures data set/data system quality and data integrity
within the business process. “Must have” requirements for data integrity are not automatically high risk. While
“must have” status may infer a high severity of harm if the requirement is not met, the additional factors of
likelihood of occurrence and probability of detection in the ISPE GAMP® 5 [9] risk assessment methodology may
result in an overall medium or low risk priority for the requirement.
Having identified the high-risk functionality, QRM can be used to prioritize subsequent activities.
• Develop Strategy to Control Risk: A strategy to control end to end process and data flow ensures that a data
set of the required quality will be produced consistently. (This is discussed throughout the system lifecycle
considerations in Chapters 6 and 7.)
1 Static and dynamic records are defined and discussed in detail in Section 7.2.1.
The elements of the control strategy should describe and justify how the chosen configuration and technical
and procedural controls contribute to the final quality and integrity of the data output. These controls should
be based on product and process understanding as defined in the key concepts of ISPE GAMP® 5 [9]. A
detailed understanding of the system should be supported by quality system processes, such as incident management. The quality system requires adequate tools to support critical thinking when assessing the impact of such events, and it evolves as more knowledge of the system is gained.
Sources of process variability, such as manual intervention, that can impact patient safety, product quality, and
data integrity should be identified as risks, appropriately understood, and subsequently controlled. Understanding
sources of variability and their impact on the data within the system can provide an opportunity to implement
additional or alternative controls that reduce the risks. Science-based product and process understanding, in
combination with QRM, supports the control of data sets and data systems such that any variability can be
compensated for in an adaptable manner to deliver consistent data quality.
The adoption and implementation of innovative technologies and control paradigms, for example, process
analytical technology, may allow the design of an adaptive process step (a step that is responsive to input) with
appropriate process controls to ensure consistent product quality and data integrity.
Enhanced understanding of the data system and data set performance can justify the use of alternative
approaches to determine that data is fit for purpose.
• Build Business Process Solution: It is now uncommon for systems to be developed as custom solutions. Building the business process solution more typically means applying a selected system configuration to meet the intended use and providing supporting procedures for that use. For some SaaS solutions, configuration may not be available, in which case building the solution is limited to the implementation of use procedures.
- Supplier and System Selection (see ISPE GAMP® 5, Appendix M2 – Supplier Assessment [9])
With the move to more purchased and built solutions, including SaaS, benefits can be gained by collaboratively
working with the supplier to improve the technical controls built into the system to prevent and detect data
integrity issues.
- Managing Continual Risk Assessment and Process Improvement throughout the Data Lifecycle: (see
Figure 4.1) Throughout the data lifecycle, companies have opportunities to evaluate innovative approaches
to improve data quality and integrity (see ICH Q10 [32]). This includes re-evaluating controls to determine
if additional controls or adjustments to those controls are needed. Even after risk mitigation there may be a
level of residual risk that should be periodically reviewed and reassessed.
In the operational phase of the system, process performance can be monitored to ensure that it is working as
anticipated to maintain data integrity as expected. This monitoring includes:
• Critical aspects of the process impacting patient safety, product quality, and data integrity
• Process step parameters, including changes and the cause of those changes
Upon gaining additional process knowledge and performance monitoring data, the data integrity-specific configuration
of the computerized system and/or control elements can be revised accordingly.
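The risk-based prioritization of data integrity requirements described above (severity of harm, likelihood of occurrence, and probability of detection) can be illustrated with a small sketch. The numeric scales, the multiplication into a single score, and the example requirements are simplifying assumptions for illustration only; actual assessments should follow the organization’s documented QRM procedure and the ISPE GAMP® 5 methodology.

```python
from dataclasses import dataclass

# Simplified qualitative scales; real assessments follow the documented QRM
# procedure rather than this illustrative mapping.
SEVERITY = {"low": 1, "medium": 2, "high": 3}
LIKELIHOOD = {"low": 1, "medium": 2, "high": 3}
DETECTABILITY = {"high": 1, "medium": 2, "low": 3}  # low detectability raises the priority

@dataclass
class Requirement:
    text: str
    priority: str          # "must have", "should have", "could have"
    severity: str
    likelihood: str
    detectability: str

    def risk_score(self) -> int:
        """Illustrative score: higher means earlier attention in design and verification."""
        return (SEVERITY[self.severity]
                * LIKELIHOOD[self.likelihood]
                * DETECTABILITY[self.detectability])

requirements = [
    Requirement("Audit trail on result changes", "must have", "high", "medium", "low"),
    Requirement("Interface transfers verified by checksum", "must have", "high", "high", "medium"),
    Requirement("Configurable report layout", "could have", "low", "medium", "high"),
]

# Rank requirements so that high-risk functionality drives subsequent lifecycle activities.
for req in sorted(requirements, key=Requirement.risk_score, reverse=True):
    print(f"{req.risk_score():>2}  {req.priority:<11}  {req.text}")
```

Note that in such a scheme a “must have” requirement does not automatically receive the highest score; likelihood and detectability can reduce its overall priority, consistent with the discussion above.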
It should be noted that each of the data lifecycle processes may require a different focus of design such that
appropriate controls are in place. These controls may vary for different records based on data classification and
intended use. For example, in-process checks as part of a batch record need to be retained until the batch expiration
date plus 1 year, compared with biological tissue traceability records [33, 34], which need to be retained for at least
30 years. The design of the retention solution for these two examples is likely to be very different. A more detailed
listing of retention periods is contained in Appendix O4.
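To make the contrast concrete, the following minimal sketch computes retention end dates for the two record types mentioned above. The helper function and dates are illustrative only; actual retention rules must come from the applicable regulations and Appendix O4.

```python
from datetime import date

def add_years(d: date, years: int) -> date:
    """Add whole years to a date, falling back to 28 February for 29 February origins."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

# Retention rules from the examples above: in-process checks kept until batch
# expiry plus one year; tissue traceability records kept for at least 30 years.
def ipc_retention_end(batch_expiry: date) -> date:
    return add_years(batch_expiry, 1)

def tissue_record_retention_end(record_created: date) -> date:
    return add_years(record_created, 30)

print(ipc_retention_end(date(2026, 6, 30)))            # 2027-06-30
print(tissue_record_retention_end(date(2020, 1, 15)))  # 2050-01-15
```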
The ability to map the business process is important to enable understanding of the business activities and decision points in a business process. The business process mapping or modeling may be depicted as flowcharts, tables, or a combination of the two. If a table is used to describe the business process, the location (where), the responsible person/
role (who), the proper time to perform the action (when), and the output(s) of the action (what) should be included,
depending upon the needs [8]. Business process mapping illustrates the steps performed to fulfill the business
purpose, and should include both manual and computerized system aspects of the process.
As shown in the ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts, Section 3.2 [16], a business
process can be described in very simple terms using a block diagram. A single block can involve one or many manual
operation(s) or computerized system(s). The level of granularity of the process map is dependent upon the level
required for the business to identify the associated risks to data integrity, product quality, and patient safety.
Business process maps are useful to help the business identify the risks associated with the use of the system
including data integrity risks. Depending upon the needs, the process map may include details such as:
• Critical decision points, including actions (e.g., review/approval/disposition) mandated under the predicate rules
• System functions
• Responsible person/roles
• System interfaces
If any parts of the process are outsourced, it is important to include this information in the process map. Data created
by an outsourced facility remains the responsibility of the marketing authorization company, and, as described in the
ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts [16], needs to be available for review, such
as during a clinical trial. A process map used to help IT support the system should contain information about the
hardware and servers supporting the system.
The process map can help with the creation of SOPs, work instructions, and training materials before the operational
phase. It is also an essential basis for the end to end testing of the business process before the system is released
for use.
It is essential for a data flow diagram to capture the level of detail needed to identify all data activities and, subsequently, their potential risks. For example, a data flow diagram incorporating ancillary systems could identify the use of an unvalidated and uncontrolled spreadsheet to evaluate specification limits. However, a separate data flow diagram may not be needed for a very simple business process.
Depending on the complexity of the computerized system involved and/or the business model (e.g., outsourcing
services or systems), the data flow could be simple or very complex. It is crucial not to overcomplicate the data flow
as this can inhibit the clarity and understanding of the diagram. The more controls and interfaces in place, the more
complex the data flow diagram becomes.
A data flow diagram can be created by identifying, for each step in the business process map (see also the sketch following this list):
• From where will information come into the process step (e.g., sample or batch ID, limits and specifications) and
how (e.g., electronic interface/manual transcription)
• What data will be generated during that step of the process and how is it captured? What calculations or
processing must be performed on the data and how and where will those be done? What metadata is needed to
provide the GxP content and meaning to the data?
• For manufacturing processes, what are the Critical Process Parameters (CPPs) and Critical Quality Attributes
(CQAs), and how are they managed and fed into the process control?
• Where does the data need to go next (e.g., to the next system used in the process, or to a laboratory information
system or Enterprise Resource Planning (ERP) system, or both)? How will it be transferred? How is the transfer
verified? What data cannot be transferred? How and where is it stored?
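The questions above can also be captured in a simple, machine-readable form so that manual transcription steps and unverified transfers stand out during review. The structure and field names below are illustrative assumptions rather than a prescribed format.

```python
# Each entry describes one step in the business process map: where its inputs
# come from, what data it generates, and how data moves to the next step.
data_flow = [
    {
        "step": "Sample weighing",
        "inputs": [{"data": "sample ID", "source": "LIMS", "method": "barcode scan"}],
        "generates": ["sample weight", "balance ID", "timestamp"],
        "transfer_out": {"to": "Chromatography Data System", "method": "manual transcription",
                         "verified": False},
    },
    {
        "step": "Chromatographic analysis",
        "inputs": [{"data": "sample weight", "source": "balance", "method": "manual transcription"}],
        "generates": ["raw chromatogram", "processing method", "audit trail"],
        "transfer_out": {"to": "LIMS", "method": "validated interface", "verified": True},
    },
]

# Flag steps where data is keyed in by hand or moved without verification --
# typical starting points for the business process risk assessment.
for step in data_flow:
    manual_inputs = [i["data"] for i in step["inputs"] if i["method"] == "manual transcription"]
    if manual_inputs:
        print(f"{step['step']}: manual entry of {', '.join(manual_inputs)}")
    if not step["transfer_out"]["verified"]:
        print(f"{step['step']}: unverified transfer to {step['transfer_out']['to']}")
```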
Figure 4.2 shows a simplified outline from which to build a data flow diagram. Split out all the data sources and
downstream systems/processes, and add detail on the data being transferred.
Figure 4.2: A Simplified Outline from which to Build a Data Flow Diagram
The data flow diagram should graphically illustrate the creation, use, and movement of data elements throughout a
business process: the “data” view of activities. The data flow diagrams can aid understanding and identification of:
• General and specific data integrity risks, e.g., time pressures, equipment limitations, resource constraints
The above process can help ensure that the system remains in a compliant state.
Fundamental to managing data is to determine what data is needed and for what use. It should be noted that there
are non-GxP requirements for data classification and retention, e.g., GDPR [17], HIPAA [18], and legal hold; however,
this Good Practice Guide focuses exclusively on the GxP requirements.
Taxonomy is mentioned in Appendix M1 Section 8.3 as an essential component of KM; it is the organization of data into related groups. In this Good Practice Guide, the taxonomy is based on classifying data as regulated, operational, or unnecessary, as defined in Section 1.6.5.
Within the regulated data (i.e., data that must be kept), it is important to understand the intended use of the data now
(e.g., for making quality-critical decisions on a batch) and in the future (e.g., for trending and data analytics). Section
4.7 discusses the need to set business rules to define the data nomenclature for data used in ongoing collation and
data analytics, such as trending CQAs and compiling quality metrics.
Once the data classification is identified, the following considerations should be made to complete the understanding of the data lifecycle (these are brought together in the sketch after this list):
• Identify the intended use and criticality of the data: data with high criticality has the greatest potential impact on product quality and patient safety. Such data needs to be retained and archived.
• Record type: original records and true copies (see Section 7.2.1 for data lifecycle considerations)
• Essential controls: access control and data management (see Section 3.4)
• Retention state, including static versus dynamic state to support the intended use (see Section 7.2)
Note: As experience of a process is gained, data classification may change, for example, something that was initially
considered to be unnecessary becomes useful operational data.
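The considerations listed above can be brought together for each data element in a simple structure, for example as sketched below. The field names and example values are illustrative assumptions only.

```python
from dataclasses import dataclass

@dataclass
class DataElement:
    name: str
    classification: str      # "regulated", "operational", or "unnecessary"
    intended_use: str
    criticality: str         # e.g., "high" where product quality / patient safety impact exists
    record_type: str         # "original" or "true copy"
    retention_state: str     # "dynamic" or "static", chosen to support the intended use
    retention_years: int     # illustrative value; actual periods come from applicable regulations
    access_controlled: bool

potency_result = DataElement(
    name="Assay potency result",
    classification="regulated",
    intended_use="batch disposition and CQA trending",
    criticality="high",
    record_type="original",
    retention_state="dynamic",   # keep reprocessable for duly authorized users
    retention_years=11,
    access_controlled=True,
)

print(potency_result)
```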
A business process risk assessment is a non-system-specific high-level assessment of the business process and
data flow, and the potential data integrity risks inherent in the process. It is aimed at identifying key process-level risks
to patient safety, product quality, and data integrity, and identifying the essential controls to manage these risks. As
with any assessment, it requires the application of critical thinking by knowledgeable and experienced SMEs.
The business process map should be created early so that it can be used to drive the identification and assessment
of risks to the process and subsequently act as a feeder to the system planning and lifecycle. (See Chapter 5.)
The ISPE GAMP® Guide: Records and Data Integrity [8] represents the data lifecycle in five phases, and is shown in
Figure 4.3.
• Creation: Data capture or recording should ensure that data of appropriate accuracy, completeness, content,
and meaning is collected and retained for its intended use. This could include manual data entry as well as
automated capture.
• Processing: Data is processed to obtain and present information in the required format. Processing should
occur in accordance with defined and verified processes (e.g., specified and tested calculations and algorithms),
and approved procedures.
• Review, Reporting, and Use: Data is used for informed decision-making. Data review, reporting, and use
should be performed in accordance with defined and verified processes and approved procedures. Data review
and reporting is typically concerned with record/report type documents. Second person reviews2 should focus on
the overall process from data creation to the calculation of reportable results, including the metadata required to
support the GxP content and meaning of the data. Such reviews may cross system boundaries and include the
associated external records and may include verification of any calculations used. The data reporting procedures should define the complete data set and the data handling steps, and ensure the consistency and integrity of the results. If automated controls are available, for example, a validated exception-reporting process, the extent of second person review may be reduced.
2 Second person reviews for laboratory data are explicitly required and discussed in the FDA Guidance for Industry: Data Integrity and Compliance with Drug CGMP Questions and Answers [35] and PIC/S PI 041-1 (Draft 3) Good Practices for Data Management and Integrity in Regulated GMP/GDP Environments [24].
• Retention and Retrieval: Data should be retained securely. Data should be readily available through the defined
retention period in accordance with defined and verified processes and approved procedures. Retention periods
vary by data classification, intended use and applicable regulation, and some records, e.g., validation and
qualification records, need to be retained for the life of the system or process.
• Destruction: The data destruction phase involves ensuring that the correct original data is disposed of after the
required retention period in accordance with a defined process and approved procedures.
The middle phases (those between Creation and Destruction) may fall in any order and may repeat.
A detailed discussion of the data lifecycle is available in the ISPE GAMP® Guide: Records and Data Integrity, Chapter
4 – Data Life Cycle [8].
Where validated technical controls and/or automated tools ensure the integrity of the results, it may be possible to
justify a reduced rigor of routine data review. Where the technical controls alone cannot ensure the integrity of the
results, procedural controls including enhanced data review may be required to address the deficiencies based on risk.
This section describes the data lifecycle with a focus on regulated records. Data integrity of a regulated record
should be ensured from its creation to the end of its retention period and resultant destruction. A regulated record
is a collection of regulated data (and any metadata necessary to provide meaning and context) with a specific GxP
purpose, content, and meaning, and required by GxP regulations. Records include instructions as well as data and
reports, as defined in Section 21.2 of the ISPE GAMP® Guide: Records and Data Integrity [8]. As a regulated record
progresses through its data lifecycle, critical thinking can be used to identify areas that may impact data integrity, such as:
• Selecting metadata that accompanies a record (e.g., thermocouple identification and location, associated audit
trail entries)
• Any alarm information or annotations relating to possible issues with the data from earlier in the data lifecycle
• Evaluating the use of human interaction, such as manual tasks and manual processing of data
• Converting data into a different electronic format (e.g., native formats to PDF)
• Migration3 of records between media (paper, electronic) or to other computer systems (e.g., application program
interface, data migration)
• Transfer of records to cloud servers, or to servers hosted by different organizations within the regulated company or by third parties
3 See Section 6.8 of the MHRA GxP Data Integrity Guidance and Definitions [12] for the definition of data transfer and data migration.
The data flows discussed in Section 4.3 can be broken down into component parts that form the steps of the data lifecycle. A data lifecycle approach to ensuring data integrity is recommended, as it has been found to improve the effectiveness of data integrity controls and to result in a more easily managed and understood program. The data
flow diagram facilitates the abstraction of risks from systems to data lifecycle phases, allowing standardized/modular
controls to be developed for each lifecycle stage. These modular controls can then be used any time a system
performs the associated lifecycle stage.
In order to mitigate data integrity risks, and based on critical thinking, different controls may be implemented for different areas of the lifecycle.
The data lifecycle is not necessarily a system-centric view. Data may cross systems and organizations throughout the data lifecycle, which must be considered, especially when addressing data integrity controls including security and audit
trail. Where a record is needed to support multiple processes or operations, it is better from a data integrity and data
quality point of view to retain a single copy of the record and link to it rather than having multiple copies in different
locations.
A record’s data flow derived from understanding the business process (as described in Section 4.2) provides a
structure to identify, assess, mitigate, and communicate potential data integrity issues/risks associated with each
data lifecycle stage. Mapping the record’s data flow helps in understanding the data lifecycle of
the regulated record. Section 3.2 of the ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts [16]
presents detailed examples of this for manufacturing, QC laboratory, and electronic Case Report Form (eCRF) data.
Manufacturing Example
In the manufacturing area, In-Process Checks (IPC) are performed to confirm optimal process setup and for
monitoring during production to ensure that the in-process samples conform to specifications. Such IPC data should
be available in the manufacturing record, and available for quality review as part of final disposition.
Additional examples for the manufacturing data lifecycle are available in the ISPE GAMP® Good Practice Guide: Data
Integrity – Manufacturing Records [15].
QC Laboratory Example
In the QC laboratory, raw materials arriving on site are tested to verify that the quality of the raw materials meets
specifications. An ERP system may be used to manage and track raw materials from receipt through use. Challenges
to data integrity may occur during the capture of incoming material data, associating the data with the material lot,
and ensuring it is considered when dispensing inventory for manufacturing (within expiry).
Clinical Example
In a clinical study business process, data may be entered into an eCRF system from the patients’ Electronic Health Record (EHR) or a paper record at the investigator site. Consideration should be given to mitigating data integrity risks when verifying the source data entered in the eCRF and when reviewing and querying the data, to provide assurance that the data is complete, accurate, and consistent with protocol and clinical expectations before conducting statistical analysis and reporting the outcome. Data integrity controls, including data reviews, should be incorporated into the business processes and optimized via critical thinking at the individual study level.
Data integrity risks within the business process activities should be considered when mapping a record’s data
lifecycle; issues arising in the early phases of the data flow and lifecycle are carried forward across different
computer systems and processes, compromising the integrity of the regulated record. It is also important to document
and define the records in the lifecycle that will support GxP decision-making.
Security of clinical records is a particular concern in view of data privacy rules for medical records (HIPAA [18], GDPR
[17]) and must be addressed by the data integrity controls.
Additional examples for clinical data are available in the ISPE GAMP® Good Practice Guide: Validation and
Compliance of Computerized GCP Systems and Data (Good eClinical Practice) [36].
Alignment between manual tasks and multiple computerized systems is needed to support the data integrity of the records throughout the lifecycle. System lifecycle considerations are discussed in the next section.
Data nomenclature is a critical enabler of data quality and especially data analytics. In simplest terms, it is a
framework to ensure that data is captured and named consistently from the start to the end of the business process,
that is, “lot number” in one system is not labeled as “batch number” in another system. Inconsistent identification
results in using aliases to relate the data by its different names. Standardized nomenclature means the metadata
is coherent and relatable, ensuring that the data is usable for reporting, trending, data analytics, and even
machine learning/Artificial Intelligence (AI) applications. Defined after the business process has been mapped, the
nomenclature framework is applied throughout the data flow diagram so that all systems supporting the business
process have nomenclature uniformity.
Without standardized nomenclature, each site in an organization may develop its own data nomenclature for materials. Consider a firm that manufactures a single material, aspirin: Site A calls it “Aspirin,” Site B names it “ASA,” Site C uses “Acetylsalicylic Acid,” and at Site D the material name is “Acid, Acetylsalicylic.” All sites use the same electronic batch management system, which manages each batch using the material name. When conducting a review of 5-year manufacturing trends for this material across multiple sites, the comparison is neither easy nor fast because of the lack of a nomenclature framework across the sites. A data dictionary could be implemented to define all the aliases for the material name; however, it is simpler and more cost-effective to start with a defined nomenclature framework when designing the business process.
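Where legacy aliases such as those in the example above already exist, a data dictionary can relate them so that cross-site trending still covers every batch. The alias table, batch records, and yield figures below are illustrative assumptions.

```python
# Data dictionary relating legacy site-specific aliases to one canonical material name.
ALIASES = {
    "Aspirin": "Acetylsalicylic Acid",
    "ASA": "Acetylsalicylic Acid",
    "Acid, Acetylsalicylic": "Acetylsalicylic Acid",
    "Acetylsalicylic Acid": "Acetylsalicylic Acid",
}

batches = [
    {"site": "A", "material": "Aspirin", "yield_pct": 97.1},
    {"site": "B", "material": "ASA", "yield_pct": 95.8},
    {"site": "C", "material": "Acetylsalicylic Acid", "yield_pct": 96.4},
    {"site": "D", "material": "Acid, Acetylsalicylic", "yield_pct": 98.0},
]

# Resolve aliases so that a cross-site trend covers every batch of the same material.
canonical = "Acetylsalicylic Acid"
yields = [b["yield_pct"] for b in batches if ALIASES.get(b["material"]) == canonical]
print(f"{canonical}: {len(yields)} batches, mean yield {sum(yields) / len(yields):.1f}%")
```

As noted above, starting from a single agreed material name avoids having to maintain such a dictionary at all.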
Nomenclature is distinct and separate from normalization, as used in data analytics. Normalization is a way to eliminate redundancy and rescale disparate data sets such that a like-for-like comparison can be achieved. For example,
comparing sample throughput between two laboratories of unequal size is meaningless, but normalizing the data as
sample throughput per capita enables an evaluation of laboratory efficiency.
Nomenclature is not inherently part of the data entered, but it is an essential basis for ensuring coherent, consistent
entry and syntax of metadata to support the data.
A good nomenclature program starts from the end of the process and works backwards: “begin with the end in mind.”
During the process/system concept and delivery phases, business process owners must develop requirements to
describe how they need to access business information: the speed of access, depth of access, breadth of access,
and metadata to enable queries of the business data. For example, are most queries site-centric or are they global for
a particular activity or operation?
Nomenclature is an activity that develops and applies business rules to various data and metadata elements (objects)
to standardize data and metadata entry and syntax as it is recorded to ensure consistency; such consistency
enables queries that will retrieve all requested data. The framework must be applied consistently across the entire
organization or its value is greatly diminished.
To create consistency in metadata, it is necessary to start with business rules that guide the creation of new metadata
values when they are requested. Some business rules will be universally applied, such as there should not be two
values that describe the same real-world article, while other business rules will be unique to a specific data element.
Table 4.1 provides an example list of business rules to aid understanding. These are only examples: the technology limitations and required data elements will lead each organization to create its own set of business rules. Business rules must be shared not only with all internal stakeholders but also with suppliers or contract organizations that need to follow these rules in their system design, configuration, or operations.
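Business rules of the kind listed in Table 4.1 can be enforced at the point where a new metadata value is requested, as in the following sketch. The approved list, the character limit, and the rules themselves are illustrative assumptions; an organization’s actual rules will differ.

```python
APPROVED_MATERIALS = {"Acetylsalicylic Acid", "Microcrystalline Cellulose"}

def normalize(value: str) -> str:
    """Collapse case and surplus whitespace before comparing candidate values."""
    return " ".join(value.strip().lower().split())

def review_new_material_name(candidate: str) -> list[str]:
    """Apply example business rules to a requested metadata value.

    Returns the list of rule violations; an empty list means the request can
    proceed to the Data Governance Steward for approval.
    """
    violations = []
    if any(normalize(candidate) == normalize(existing) for existing in APPROVED_MATERIALS):
        violations.append("a value for this real-world article already exists")
    if len(candidate) > 60:
        violations.append("name exceeds the 60-character limit of the receiving system")
    if "," in candidate:
        violations.append("inverted names (e.g., 'Acid, Acetylsalicylic') are not permitted")
    return violations

print(review_new_material_name("acetylsalicylic  acid"))  # flagged as a duplicate of an approved value
print(review_new_material_name("Ibuprofen"))              # no violations
```

Applying such checks consistently across the organization, and sharing them with suppliers and contract organizations, is what preserves the value of the framework.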
Business rules must be developed by a team of people with varied backgrounds and skill sets, as the activity of rule
creation demands knowledge of the automated systems involved (and their technical limitations) and the data used
by a business area under discussion. The team also needs someone who has strong analytical capabilities to create
a systematic approach to building data rules. It is desirable to create rules for a number of metadata entities, and then
apply them in a test environment to test the robustness of the rules, as errors and gaps will frequently be found in
early tests.
Process Owners
Owners need to have the business connections and resources to identify and educate uniquely qualified personnel
to become Data Stewards, or leaders of a data steward unit. By reason of prior experience, the process owners
may serve as business experts in crafting business rules for data entities, but their primary role is accountability to
senior level personnel for Data Governance Steward activities, championing the value of data stewardship within the
organization, clearing political obstacles, providing resources and systems, and enabling Data Governance Stewards
to be successful in their work.
Data Governance Stewards
Once rules are developed, they must be maintained and disseminated to the individuals who will maintain and enforce the rules. This is a critical and difficult role, because there will be tremendous organizational pressure to bypass the rules, e.g., (1) “It takes too long”; (2) “But we always call it ‘XYZ’”; (3) “Those people are too controlling”; (4) “Our site head says we do not have to follow those rules,” etc.
The Data Governance Steward must guard the data and associated metadata so data mining and data analytics will
be possible in the future. If an organization elects to go with a dictionary approach, the Data Governance Steward
should own the dictionary and be responsible for its maintenance. The Data Governance Steward will often be forced
to refuse a request because the requested entity already exists under another name. They must research every request
to prevent duplicate entities. In addition, Data Governance Stewards working across multiple business sites must
constantly coordinate to keep business rules and approved lists aligned, ideally with tools that span all organizational
sites where their entries will be used. Centralization and control give Data Governance Stewards the capability they
require to create consistent metadata entries. In turn, these bring tremendous value to the organization as global data
searches are simplified to empower the “big data” views that global organizations desire.
Data Experts
Data experts are people with deep business and/or systems knowledge in a specific area of operations, such as
Manufacturing, Support Services, Calibration, Automation, Reference Standards, Stability Management, and so
forth. They assist Data Governance Stewards by providing examples of the various ways in which data is received,
managed, and reported in parts of the organization. Their deep knowledge of the various areas helps the Data
Governance Steward develop business rules robust enough to cope with infrequent and unusual scenarios.
5 System Planning
Organizations should approach data integrity using a holistic top-down view, starting from the perspective of the high-
level business process, then looking at the subprocesses before looking at the individual computerized systems and
activities. This approach is depicted in Figure 5.1 and provides assurance that data integrity truly is designed in at the
business process level.
Confidence in data integrity starts with the capability of the system to provide technical controls for data integrity.
The relationship the customer has with the supplier is essential when determining which supplier to use and system
to purchase. It is important to select a supplier that has an established quality system and hardware/software
development process, as determined through an appropriate vendor qualification process.
It is essential to understand the business process, to understand and determine how the data will flow through the
different business processes, and to determine what interactions (i.e., transfer of data) are utilized by the existing
business process (manual/automated) within the data lifecycle. Only after understanding the business processes can
improvements be made to the existing processes as needed and as new technology becomes available. An example
of this can be seen in the data archiving process: switching from an internally developed script-based backup process
to a backup process leveraging a commercial utility.
The diagram in Figure 5.2 is a high-level representation of a new process along with identification of existing
processes that may impact the new process. When developing a process flow, greater details are needed to clearly
identify each process step and the complexity of those processes. Once there is an understanding of this, it is
possible to examine the data directly involved with the critical processes to understand where it resides in each
process stage and what equipment/computerized system supports that data.
Within the pharmaceutical industry, there will be a range of business processes used to execute different activities.
Additionally, each process is supported by various computerized systems. The size and complexity of the business
process determines the initial scope of the user requirements for these computerized system(s).
Appendix D1 of the ISPE GAMP® Guide: Records and Data Integrity [8] gives examples of business processes and
how they can lead to the generation of user requirements. The requirements can be developed manually from the business
process or automatically derived using tools such as Business Process Model and Notation.
Figure 5.3 is a graphical representation of the relationship between the system and data lifecycles, showing how
a combination of manual tasks and/or multiple computerized systems may support a single data lifecycle. Only
the project stage of the system lifecycle is shown, purely to denote the need for validated technical controls. Data
destruction and system retirement occur independently of each other and will not necessarily happen at the same time.
A complete system may consist of several components/processes. There are multiple considerations when designing
a new system. For example, what phases within the process are well established and what phases need to be
improved or established?
A well-defined, consistent system with no manual intervention carries less risk than a system with manual input.
Considerations for a system should include system access control and data management as both capabilities vary
by system. The system should be chosen and/or defined to mitigate the risks to data integrity by providing robust
technical controls that can be validated as fit for purpose. Existing constraints may dictate selecting a particular
system solution over other commonly used options. An example of this is a relational database that provides strong
traceability and audit trailing for active data compared to flat-file storage; however, if the archiving system can only
manage flat files, then the database solution becomes less desirable.
When evaluating the purchase of a new system, it is important to assess the new system’s capability to meet the
business needs and requirements, including the regulatory requirements based on the environment in which the
system operates. Additionally, it is imperative that the appropriate personnel are involved in the planning process
(e.g., business technical resource, IT, engineering, etc.).
Prior to planning and purchasing a new system the high-level business process, GxP data elements, and high-level
data flows between the different process steps or activities should be established. This includes the need for the
system to operate and integrate with existing systems that support the business process.
Figure 5.4 is an example Ishikawa diagram (fishbone analysis). Although Ishikawa diagrams are typically used as a
root cause analysis tool, here one is used as a means to determine which data is directly involved in critical processes.
The use of Ishikawa diagrams is just one example of a simple tool to help understand where data may be found.
Organizations should select a tool or approach that best suits their needs.
As determined from the Ishikawa diagram, the systems that print the data, store e-data, and apply e-signatures need
focus.
There may be existing constraints that must be accommodated, for example, if the organization utilizes an archiving
system that works with flat files only, it is necessary to ensure that any system proposed supports this process. This
could involve forfeiting the security of storing the live data in a relational database before archiving or even converting
the data to flat-file format to facilitate the archival process. Converting to flat files for archival purposes must not
compromise the integrity of the data.
The GxP data lifecycle needs to be planned along with the identification of the business processes and data elements
directly involving critical processes. The GxP data needs to be mapped to the systems in which it resides, and the data lifecycle and controls required to maintain the integrity of the GxP data need to be documented.
Business process mapping brings an understanding of the use and governance of data. From this mapping and the
application of ISPE GAMP® 5 [9] principles, organizations can then plan what activities based on risk are required
for the computerized system(s) that will support the business process. Many vendors provide software systems that
enable full control and monitoring of a process. Therefore, the relationship between the regulated company and the
computer system supplier is the foundation towards achieving a successful, on time/within budget project that meets
all the business and applicable regulatory needs. See ISPE GAMP® 5, Appendix M2 – Supplier Assessment [9] for
more details.
When performing an analysis of a process and its data, there must be adequate representation from the business to provide the expertise of personnel involved in the process being analyzed and the associated equipment and computerized system(s). For example, when planning the batch release process, some or all of the following roles may need to be involved: process owners, data owners and stewards, technical and process SMEs, IT, Engineering, QC, Quality Assurance (QA), and computer system QA personnel. It may be valuable to define the RACI (Responsible, Accountable, Consulted, Informed) responsibilities for each role.
The system risk assessment is a foundation to QRM and requires critical thinking to identify the potential risks to
patient safety, product quality, and data integrity. Even after risk mitigation there may be a level of risk remaining that
should be periodically reviewed and reassessed.
For the regulated and operational data at each step in the data flow diagram, consider:
• Is the data and metadata at risk of unauthorized alteration by manual interventions or by systemic/technology
issues?
• Is the data and metadata stored in a secure location immediately after creation?
• How will the data be transferred from one system to the next system in the process?
• Can and will all required metadata (including audit trails) be transferred with the data?
Such an understanding of the types of data involved, the essential metadata, the transfer and use of the data, and the
risks involved, facilitates the design and specification of systems and processes in support of the business process
and the integrity of the data therein.
When identifying system lifecycle risks, it is recommended to map how system access is configured, and how data
is managed and stored. During the evaluation process, document the controls in place, risks present, and required
mitigation. Always strive to implement a system with the strongest (preferred) controls that reduce the risk as low as
reasonably possible.
Table 5.1 shows the key considerations to help identify risks and suggests corresponding mitigations of varying
effectiveness that a computerized system can provide. It does not address behavioral issues such as writing down
passwords, or sharing passwords due to a lack of user licenses.
Table 5.1: Key Considerations for Risks and System Level Mitigations

• How is the user authenticated before the capture can occur? Basic control: role-based ID/password. Preferred control: unique ID/password.
• How does the system reduce the risk of another user accessing the system with someone else’s credentials? Basic control: simple password. Preferred control: robust, sophisticated password enforced.
• How does the system deal with multiple concurrent users? Basic control: user must log off to allow another user to access the system. Preferred control: system is able to manage access to multiple application windows concurrently.
• How are new devices (e.g., balances, etc.) authorized on the system? Basic control: new devices can only be added by an application user. Preferred control: new devices can only be added by individuals with elevated privileges.
• How does the system prevent modification of parameters that could influence the result or process? Basic control: parameters modifiable by authorized users only. Preferred control: parameters locked, changeable only with elevated access via formal change control.
• How is the data transferred from the data creation system to the next system in the business process workflow? Basic control: manual data transfer to the next system by selecting files to transfer, possibly even using USB flash drives. Preferred control: automated data transfer pulled by the next system, leveraging checksums and handshakes around the connection and data transfer.
• Where is the data stored? Basic control: initial data storage on a local drive. Preferred control: initial data storage on a secure server.
• Is the data stored to a durable medium? Basic control: temporary data storage. Preferred control: permanent data storage.
• For duplicate readings, where does the system store the duplicates, and how is it documented which one is used? Basic control: the separate audit trail documents the multiple readings. Preferred control: all readings are contained and audit trailed within the electronic form/worksheet.
• Data is not secured at the initial location (local drive). Basic control: deletion privileges removed from all storage folders. Preferred control: data is automatically captured into a secure server at the time of creation (third-party software can be used to do this).
• Is every calculated result stored? Basic control: procedural controls require manual saving of all results. Preferred control: the system automatically stores a new result every time the calculation is run, and the result shows the version number (e.g., the fifth calculation shows result V5).
• How does the system reduce the risks of error in manual data entry? Basic control: field mandatory blockers (e.g., red asterisks that mandate the field must be populated), range calculators (low and high limits) that dictate the acceptable range for a certain field. Preferred control: manual data entry replaced with an automated, validated interface to capture the data (e.g., bar code scanning or a serial interface).
• How does the system avoid data or files being overwritten? Basic control: procedural controls require all edited files to be saved with an incremental version number. Preferred control: the system automatically saves all edits under a file name with an incremental version number and an audit trail of changes.
QRM principles can then be applied to implement and verify controls as part of the computerized system validation.
A periodic review strategy, including frequency of review, should be developed to identify the processes monitored
to confirm that the system is maintained in a validated state during its operational phase. This includes a review of
procedural controls needed to maintain data integrity to ensure unacceptable risks have not arisen. The amount of
residual risk should be minimized and reviewed within the periodic review. The rationale should be documented to
ensure undue risks are not accepted and that appropriate mitigations are in place. In addition, the strategy should
also contain a review of:
• Deviation records
• Incidents
• Problems
• Upgrade history
• Performance
• Reliability
• Associated training
The periodic review timing should be defined according to local procedures and based on the system’s impact on the
data.
6 Active Records
As shown in Figure 3.1, records become active at data creation and generally remain active through the Processing and the Review, Reporting, and Use phases of the data lifecycle. Active records are typically characterized by the need for frequent access by a range of personnel as part of the business process, and the systems supporting the process
must provide effective controls around such access and activity.
Successful data integrity assurance programs commonly evaluate data integrity risks from two different perspectives:
system lifecycle and data lifecycle. With respect to data integrity, system lifecycle considerations ensure data integrity
on an individual system level, while data lifecycle considerations provide cross-system assurance. This approach has
been recognized to provide three key benefits:
• It promotes more consistent use of risk mitigating controls across systems within a data lifecycle
Starting from the business process and data flow diagram discussed in Sections 4.2 and 4.3, it is recommended to
use critical thinking to assess the path the data will travel through the data lifecycle and whether it will pass through
one or multiple systems during its lifecycle:
• If the data will remain within a single system, such as a well-configured and validated document management
system that takes and maintains the data, then this reduces the opportunities for data issues.
• If the data will travel through multiple systems, then there needs to be appropriate measures taken to define what
data will transfer between which systems. For example, a clinical trial is an area where large volumes of records and data are recorded by the CRO and trial sites, and then aggregated before going to the sponsor. In this process
there are increased risks that should be documented, as should the mitigation strategies intended to protect the
records from corruption and/or loss.
The business process risk assessment (see Section 4.5) focuses attention on the underlying process, to facilitate
critical thinking through the more detailed data and system assessments. When assessing risks from the data
lifecycle and then from the system lifecycle, there may be common areas of risk identified. Instead of addressing
them separately, a holistic approach can be applied. Password control is an example of a risk mitigation strategy from
the data lifecycle applied across multiple systems.
6.1 Creation
Creation is the initial phase in the overall data lifecycle and therefore is the first place that data integrity may be
compromised. If data integrity is not ensured at this phase, there is no way to later regain that integrity. It is essential
to consider and document the risks and necessary controls to ensure data is accurate, complete, and fit for purpose.
It is important to understand the data lifecycle (such as intended use, review and reporting, and retention needs) and
how those lifecycle elements can impact data creation. These items are mainly pulled from a process flow or risk
assessment, as stated in Sections 4.3 and 4.5.
The data lifecycle should be considered both within the context of a system lifecycle and separate from it, as data is fluid
and moves from location to location. It is important to consider:
• What ability users have to adjust data once it has been recorded and how that adjustment is captured (such as
audit trail, change control, etc.)
The data lifecycle starts with the first creation of data. Creation dictates the initial and fundamental quality and
accuracy of the data as it moves through the lifecycle. Integrity of the data cannot be improved as the data moves
through the lifecycle, that is, the integrity is only as good as that of the original record at the time of creation. It can
degrade during the lifecycle if proper controls are lacking. It is important to have well-defined, documented business processes in order to assess the risks to data integrity during creation.
In the creation phase of the data lifecycle, data is captured or recorded from an instrument, device, another system,
or manually. It is essential to ensure the data (file) naming convention and storage location are documented and
secured from unauthorized modification. It is possible that data is stored in multiple different locations (e.g., local
drive, secure network server, or LIMS). Therefore, the data or system owner should ensure that data is secure at
each location with the identification of which record takes precedence for decision-making.
If data must be transferred to a different location, automated transfer is recommended. Automated data transfer can
be validated proactively to ensure consistent, repeatable, reliable transfer, whereas manual transfer processes can
only be verified by human review after the event (by confirming the number and size of files transferred, and possibly
only reviewing a subset of the files). If the transfer process includes the subsequent removal of the data from the
original location, then there must be confirmation of successful data arrival into the target location before it is deleted
from the original location. This can be achieved by validation of the automated data transfer process.
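A minimal sketch of the verify-before-delete pattern described above is shown below, using a checksum comparison to confirm arrival at the target before the original is removed. The paths and helper names are illustrative, and a validated production interface would additionally handle retries, logging, and audit trail entries.

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream the file so large raw data files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def transfer_then_remove(source: Path, target_dir: Path) -> Path:
    """Copy a file to the target location and delete the original only after
    the copy is confirmed byte-for-byte identical via its checksum."""
    target = target_dir / source.name
    expected = sha256_of(source)
    shutil.copy2(source, target)            # copy data and timestamps
    if sha256_of(target) != expected:
        target.unlink(missing_ok=True)      # discard the bad copy; keep the original
        raise IOError(f"checksum mismatch transferring {source.name}")
    source.unlink()                         # safe to remove only after verification
    return target

# Example usage (placeholder paths):
# transfer_then_remove(Path("C:/instrument/run_0425.dat"), Path("//server/secure_raw_data"))
```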
When identifying data lifecycle risks, it is recommended to map how and where original data is created and
subsequently transferred to the next location or system. During the evaluation process, document the controls in
place, risks present, and required mitigation.
• Data is recorded manually, lacking an independent audit trail to capture repeat measurements or data, and is
vulnerable to deliberate or accidental transcription errors.
Suggested mitigation: replace manual data recording with electronic data capture
• When creating data by calculation, in Excel for example, the user can generate an average sample weight by entering an averaging formula and pressing Enter. This calculates the average of the values, but the average is not automatically stored. The user can then keep editing the individual sample weights until they get an average they like, and then hit Save.
Suggested mitigation: ensure that the system enforces automatic data saving at the time of creation. (Guidance
concerning the validation, control, and use of spreadsheets is discussed in ISPE GAMP® Guide: Records and
Data Integrity, Appendix D5 – Data Integrity for End-User Applications [8].)
A computerized system involved in the creation or first capture of data needs all of the essential data integrity
technical controls (e.g., access controls, privileges, data management, etc.) plus the following abilities specific to
creation, to:
• Support automated data collection/capture of the measured or calculated value to eliminate manual transcription
• Generate a complete record with all of the metadata integral to the record
• Retain sufficient information (who, how, using what equipment, etc.) to allow later reconstruction of the activity
performed when creating the data
• Store multiple or repeated readings with no overwriting and full audit trail on all values
• Transfer the record to the next system in the business process using a validated automated interface
Integrity lost at the time of data creation cannot be subsequently restored so it is essential that any system used in
data creation provides strong and effective technical controls during data creation activities.
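A minimal sketch of a creation record whose metadata is integral to the record, and where repeat readings are appended
rather than overwritten, is shown below; the field names and structure are assumptions for illustration, not a
prescribed record format.

```python
import json
from datetime import datetime, timezone

def new_reading_record(value, units, operator, instrument_id, method_version):
    """Assemble a creation record whose metadata travels with the value so the
    activity can be reconstructed later (who, how, with what equipment)."""
    return {
        "value": value,
        "units": units,
        "operator": operator,
        "instrument_id": instrument_id,
        "method_version": method_version,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "audit_trail": [],   # subsequent readings/changes are appended, never overwritten
    }

def add_repeat_reading(record, value, operator, reason):
    """Store a repeat reading as an additional audit trail entry; the earlier
    value remains in the record."""
    record["audit_trail"].append({
        "action": "repeat_reading",
        "value": value,
        "operator": operator,
        "reason": reason,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

record = new_reading_record(7.2, "pH", "analyst1", "PH-METER-03", "M-001 v2")
add_repeat_reading(record, 7.3, "analyst1", "probe re-equilibrated")
print(json.dumps(record, indent=2))
```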
6.2 Processing
Often the data created and stored needs to be processed to convert it into a meaningful value. How the
processing is applied, how much processing is applied, and whether any repeated reprocessing occurred can all impact the
value generated; processing is therefore an inherent source of data integrity risk.
A validated computerized system typically reduces data integrity risk; however, where there is a possibility for manual
intervention, the risk increases [12]. This has been amply demonstrated by the many US FDA warning letters [40]
regarding reprocessing chromatography data, where:
• The decision that the initial automated integration is in some way deficient has been a subjective one, made by the
analyst based on personal experience, without any defined good/bad criteria for assessing the integration.
• The analyst has been able to manually intervene, forcing changes to the integration parameters and/or baseline
compared to the initial parameters used by the validated chromatography data system and thus generating a
different result.
The impact of this combination of subjective interpretation and manual intervention has resulted in multiple citations
for data manipulation. This risk factor is not restricted to chromatography data but rather is prevalent wherever the
acceptance criteria are not clearly defined with examples or numerical limits, and where a human decision can
override the validated controls.
As shown in Figure 6.1, the data integrity risk from processing is impacted by two factors: the consistency of the
processing and the degree of subjectivity involved. Table 6.1 gives examples spanning these factors.
Table 6.1: Examples of Data Integrity Risk as Impacted by the Consistency and Subjectivity of Processing
• Laboratory: Method development for a new drug candidate, with co-eluting peaks and entirely manual integration;
  Manufacturing: Experimental PAT system using Near Infrared (NIR) to predict when to manually stop the mixing process;
  Clinical: Blood pressure is taken with a manual cuff (aneroid sphygmomanometer) and typed into the patient history
• Laboratory: Sample weight manually transcribed from the manually recorded weight in the laboratory book into the
  Chromatography Data System (CDS);
  Manufacturing: Ingredient weighed manually and the weight transcribed into the control system for storage;
  Clinical: Blood pressure is taken with an automated monitor and typed into the patient history
Depending on the types of data and processing, these risks need to be remediated in either the data lifecycle or the
system lifecycle or a combination of both working together to reduce the residual risk.
Consideration of processing risks from a data lifecycle perspective is important because it helps identify risks not
seen by considering computerized systems individually. It is particularly important when processing activities involve
multiple systems or applications. While the system lifecycle may adequately identify processing risks inherent to
a single system, it is not well suited for identifying risks to processing that involve multiple systems or applications
working together.
A significant benefit of using the data lifecycle perspective is that it encourages the use of consistent risk mitigating
controls across the data lifecycle.
In the processing phase of the data lifecycle, data is processed to obtain and present information in the required
format. While all critical data processing considered in the system lifecycle and the data lifecycle should occur in
accordance with defined and verified processes (e.g., specified and tested calculations and algorithms) and approved
procedures, data lifecycle processing considers modes of data processing that fall outside individual system
boundaries. Additionally, it considers cross-system boundary effects that may impact data quality. Examples include:
• Record/report documents are electronically compiled with processed data from separate electronic sources
• Statistical data processing within an application that relies on an Application Program Interface (API) to a
separate application to perform all or part of the processing
It is recommended to start with a business process map and data flow diagram (see Chapter 4). Map the data
lifecycle for the subject data values (e.g., conductivity, weight). Include the boundaries for the systems, applications,
and procedures that perform the processing. Share this map with a cross-disciplinary team to identify risks. Risks
identified can then be addressed using a mitigation strategy based on risk priority. Identifying and planning mitigation
of processing risks from a data lifecycle perspective relies on many of the same methodologies and critical thinking
used to identify risks at a system level.
While it is possible to evaluate the lifecycle steps of individual data values without creating a data flow map, this is
not recommended. The data flow map visualization is one of the best tools for promoting a team’s critical thinking
activities. It helps teams start from a common point of understanding, understand and verify the flow of data, and
break down the lifecycle steps to identify and mitigate risk.
While this section focuses on the processing stage of the data lifecycle, it is not suggested that a map be made for
each stage in the data lifecycle; rather, one lifecycle map should be created to include all the phases a data value
encounters.
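A data flow map can also be captured in a simple machine-readable form to support the team discussion; the following
sketch, with hypothetical system names and transfer modes, illustrates how manual or unvalidated transfers might be
flagged for the cross-disciplinary risk review.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DataTransfer:
    source: str
    target: str
    mode: str          # "automated_validated", "automated_unvalidated", or "manual"
    data_items: List[str]

# Hypothetical data flow for a single data value (conductivity) across systems
flow = [
    DataTransfer("Conductivity sensor", "PCS", "automated_validated", ["conductivity", "units"]),
    DataTransfer("PCS", "Historian", "automated_unvalidated", ["conductivity"]),
    DataTransfer("Historian", "Batch report (Excel)", "manual", ["conductivity"]),
]

def flag_transfer_risks(flow):
    """Simple screen of the mapped flow: manual or unvalidated transfers are
    candidates for the cross-disciplinary risk discussion."""
    return [t for t in flow if t.mode != "automated_validated"]

for transfer in flag_transfer_risks(flow):
    print(f"Review: {transfer.source} -> {transfer.target} ({transfer.mode})")
```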
Section 4.3 of the ISPE GAMP® Guide: Records and Data Integrity [8] offers a list of topics for consideration during
risk assessment in the processing phase.
Mitigating processing risks, as with any risks, should be based on the risk priority (Refer to ISPE GAMP® 5, Chapter
5 – Quality Risk Management [9]). Separate from the process of assessing and choosing commensurate controls to
manage risk, common controls can be used when mitigating risks identified using a data lifecycle perspective.
As discussed, a data lifecycle perspective helps identify risks that exist between and outside of systems (e.g., manual
processing, multisystem/application processing), as system risks are commonly managed using the system lifecycle.
It therefore makes sense that processing risks are commonly mitigated with controls that are not system specific.
However, this is not always the case as can be seen in the following examples:
• Validation/qualification testing
Where there is a business process need to allow manual processing, it must be recognized that this presents an
ongoing residual risk to data integrity within that process. Increased rigor of review for manually processed data
provides limited mitigation but cannot wholly eliminate that residual risk. Chromatography data is the classic example
where manual processing is needed if the automated method is unable to process an atypical chromatogram.
Continual improvements to the analytical method and ongoing user training should be employed to reduce the need
for manual processing.
Figure 6.2 details examples of ways to mitigate two common processing risks. As shown, mitigation of risk (assurance
of quality) is rarely achieved through only one control. Controls working together have a cumulative impact towards
mitigating risk, but some level of residual risk may always remain.
Note: While Figure 6.2 shows four mitigating controls for human and system data processing, it is common that
process activities executed by humans require more mitigating controls due to variability. Procedural controls alone
are typically not sufficient to mitigate data integrity risks as they rely on human actions.
Even after risk mitigation there may be a level of residual risk that should be periodically reviewed and reassessed.
Each regulated company should determine what level of residual risk is acceptable within their organization.
When using system-specific process controls, consider if these controls should be a standard requirement for
all systems. Standardizing controls in this way is significantly more efficient as it prevents the need to develop
new controls (for a given risk) for each piece of new equipment. For example, standardizing the use of network
permissions (e.g., Lightweight Directory Access Protocol) on all systems eliminates the need to develop a
permissions strategy for each new system.
The ISPE GAMP® Good Practice Guide: Data Integrity – Manufacturing Records [15] states that:
“Where a process is well defined (‘we know exactly how to do this’) and consistent (‘if we do it like this, we
always end up with the correct result’), has no manual intervention (‘it all happens automatically’) and an
objective output (‘we all agree on the result’), data integrity controls can be achieved by validating the system
and maintaining it in a validated state.”
For processed values such as those described in the Point examples in Table 6.1, well-applied and managed
computerized system validation and access controls restricting changes to the preset limits are sufficient to mitigate
the (low) data integrity risks associated with the processing step.
For all other processed values (e.g., Points to in Table 6.1), the risks need a combination of controls involving
people, process, and technology for mitigation. As discussed in Section 6.2.1.3, some of the mitigating actions need
to address risks at the level of the data lifecycle and business process rather than at the individual system level.
Before writing requirements for technical controls around processing, it is important to first assess what can
impact the validity of a processed value. Use critical thinking to identify likely sources of impact on the processed
values (and considerations for essential data integrity controls around these), such as the following (a sketch after
this list illustrates some of these checks):
• Initially acquired data: Where does the initial data come from? Could the initial data be impacted, e.g., by
electrical interference or signal loss? Is there a transfer of the data to a separate system for processing? If so,
are the units and other metadata transferred with the data or reapplied in the target system?
• Calibration data and scaling: Many sensors initially generate an analog electrical signal which is converted to
a digital signal and scaled using calibration factors. Where does the calibration data come from? Who is able
to change it? Are the previous calibration values available or are they overwritten? Can calibration points be
excluded?
• Averaging results: Where multiple readings are taken automatically or manually, the average may be within
limits while individual readings are outside the limits; this may be acceptable for a rolling average of a pressure
transducer reading in a Process Control System (PCS) to dampen out momentary pressure spikes, but would not
be acceptable for analytical results in a QC laboratory (an Out of Specification investigation would be required).
• Manually entered values: Are privileges available at a sufficiently granular level to control who can enter
and who can review? Would the system flag a value entered out of range? Can the system enforce a second-
person review of an entered value?
• Changes to calculations (formulae), integration parameters, or methods: Who can make changes? What
record is there of the change? Is it possible to require an approval step before the system allows the change?
• Changes to processed result: Are all versions of the result stored? Can a comparison be made between
the first and last processed results to see how the value changed? Where a change has been made, does the
system provide tools to highlight or flag human intervention (refer to Sections 6.3.1 and 6.3.2.1), for example as
part of exception reporting?
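The sketch below illustrates two of the checks from the list above, comparing the first and last processed results and
flagging manual interventions for reviewer attention; the data structure and tolerance are assumptions for illustration
only.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ProcessedResult:
    version: int
    value: float
    processed_by: str
    manual_intervention: bool
    change_reason: Optional[str] = None

def flag_for_review(results: List[ProcessedResult], tolerance: float) -> List[str]:
    """Compare the first and last processed results and flag manual interventions,
    so a human reviewer can focus on values that changed or were reprocessed."""
    if not results:
        return ["no processed result stored"]
    flags = []
    first, last = results[0], results[-1]
    if abs(last.value - first.value) > tolerance:
        flags.append(f"result changed from {first.value} to {last.value} over {len(results)} versions")
    for r in results:
        if r.manual_intervention:
            flags.append(f"manual intervention in version {r.version} by {r.processed_by}: {r.change_reason}")
    return flags

history = [
    ProcessedResult(1, 98.7, "cds_auto", False),
    ProcessedResult(2, 99.6, "analyst1", True, "baseline adjusted for co-eluting peak"),
]
print(flag_for_review(history, tolerance=0.5))
```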
Fundamental to the consideration of processing is defining the controls for human intervention. The extent of
subjectivity in a manual interpretation can be reduced by using an SOP to define when it is appropriate to intervene,
how, and what additional activities and reviews are required after the intervention; this in turn may increase the
consistency of the processing. This was mentioned in Graph Point , laboratory example, in Table 6.1.
Physical security may be the only available way to reduce a risk; for example, PID controllers that lack logical access
controls can only be secured in a locked control cabinet. Where there is a residual risk that cannot be addressed
by system technical controls, consider what, if any, procedural controls could reduce this risk (while recognizing that
procedural controls rely on the operator following them and therefore cannot offer the same level of assurance as
technical controls).
At system periodic review, evaluate the residual risk to assess whether, for example, a newer software version can
provide the appropriate technical control.
Validating the system technical controls should give assurance that the controls will consistently perform as intended
to reduce the data integrity risk associated with automated processing.
In the case of human intervention in the processing activity, additional human review may be required to evaluate the
justification and validity of the intervention and consequent process result. Validation of the computerized system has
no impact on the requirement to review human intervention, as the user is able to impact the processed result.
Validation of computerized systems is well documented in ISPE GAMP® 5 [9], with detailed application of those
principles to particular system types discussed in the relevant ISPE Good Practice Guides [41, 42, 43]. For this
reason, extensive guidance on validation is not reproduced here.
As described in Appendix S2, the most recent iteration in risk-based approaches to validation, Computer Software
Assurance (CSA), relies on critical thinking and a thorough understanding of the risks associated with the business
process to define the approach that should be used to validate the system for its intended use.
The validated system must be placed under change control and configuration management to maintain the validated
state throughout the operational life of the system, especially during upgrades or changes of use. The processes and
procedures for the operational phase are covered in the ISPE GAMP® Good Practice Guide: A Risk-Based Approach
to Operation of GxP Computerized Systems [21].
Review, reporting, and use of data are the phases in the data lifecycle where the data generated fulfills its purpose
in the business process. Before the data can be used there must be a review step to assess the integrity of the data
and to determine whether specifications, targets, limits, or criteria have been met. Data review is essential to evaluate
if the data is fit for its intended use in a regulated environment and representative of the actual process or product
state. Data review and the associated audit trail review are detailed in ISPE GAMP® Guide: Records and Data
Integrity, Section 4.4 and Appendix M4 [8].
Creating strategies for reporting, review of those reports, and outlining the use of that information provides
increased assurance of data integrity and is an essential part of data management. See Figure 2.3 for a schematic
representation of the correlation between data and system lifecycles.
It is essential to apply a risk-based approach when establishing a data review process. The risk is based on the
complexity of a system, the controls that are in place within the system and data lifecycles, and the criticality of the
decision that the data will be used to make.
The scope and rigor of routine data review may be much narrower than a review conducted as part of an investigation
where, for example, a review of training and site attendance records may be included to verify that personnel were
onsite and trained to complete an activity (as discussed in ISPE GAMP® RDI Good Practice Guide: Data Integrity –
Key Concepts Appendix 18 [16]).
During routine data review, the scope of the supporting data records reviewed may be limited to records of
preceding activities, for example, laboratory notebooks reviewed as part of evaluating a chromatographic result. If a
chromatography method is locked to only allow the analyst to select the method, then the reviewer verifies the data
and correct method selection. In contrast, if the method is not locked down, the reviewer needs to ensure that all
method parameters and associated data are set as defined in the method.
Reviewing data is not limited to the result alone – the metadata, including a history of the data (any audit trail of
operator actions that create, modify or delete regulated data), must also be included in the review as it is inherent in
providing the context and meaning of the GxP data. Where data is transferred to a higher-level system, such as CDS
to LIMS or PCS to MES, it is not always possible to transmit all of the metadata, in which case some level of review
may need to occur in the source system.
Data cannot be reviewed effectively in isolation from its metadata, including an audit trail of changes to that data. As
defined in ISPE GAMP® Guide: Records and Data Integrity Appendix M4 [8], there are three main types of audit trail
review:
• Review of audit trails as part of normal operational data review and verification:
- For audit trails that capture changes to data, a review of each audit trail record must occur prior to data
release. Where a system lacks necessary audit trails, a risk assessment should be performed to establish
appropriate mitigation. These mitigations may include improved procedural controls, real-time verification, etc.
• Review of audit trails for a specific data set during an investigation (e.g., deviations or data discrepancies):
- During an investigation the review may not be limited to audit trails, but may include a review of the system
logs, such as system configuration and operational events.
• Review and verification of effective audit trail functionality (e.g., verification of audit trail configuration as part of
periodic review).
Reviews can be done 100% manually; however, this is time-consuming and may not detect all of the issues. A system
that offers functionality, such as automated reporting tools and alarm or exception reports, may reduce the checks
needed by a human reviewer while improving the overall detection rate of data integrity issues and errors [12].
The use of electronic signatures in reporting and approval allows a direct linkage between the signature and the data
to which it is applied, and can streamline the reporting process to reduce the time to release data and/or product.
Human reviewers, in contrast, excel at tasks such as:
• Spotting patterns, having instincts and "gut feelings" (the feeling of "oh, there's something not quite right here")
Human reviewer time is a finite resource and is best saved for investigating suspect data in careful detail to decide if
there is evidence of a data integrity issue.
The most effective data review approach is to use computers and software for general screening of all results to
identify suspect data, and then to use people to investigate in careful detail data flagged by the computer as “suspect
data” or high-risk samples to decide if it is evidence of a data integrity issue. This is the principle of review by
exception.
“An ‘exception report’ is a validated search tool that identifies and documents predetermined ‘abnormal’ data or
actions, which require further attention or investigation by the data reviewer.” [24]
Effective review by exception typically relies on the following (a minimal sketch of the automated screening step
follows this list):
• A computerized system that offers exception-reporting capabilities, with the ability to configure the limits and
specifications of the exception report
• Detailed product and process understanding feeding into a detailed data integrity risk assessment to identify
what should be automatically evaluated in an exception report
• Robust verification of any exception-reporting tool, including positive and negative case testing
• An audit trail of changes to the exception-reporting tool or limits, available for review as part of the data review
process
• A documented review process identifying the actions to be taken by the human reviewer in response to suspect
data flagged in the exception report
• Training for all review personnel on how to leverage review by exception, including training on the system
features and functionality used for a detailed review of the original data
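A minimal sketch of the automated screening step referenced above is shown below; the suspect-data criteria
(specification limits, reprocessing, audit trail volume) are illustrative assumptions, and any real exception-reporting
tool would be specified and validated as described in the list above.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Result:
    sample_id: str
    value: float
    reprocessed: bool
    audit_trail_entries: int

def exception_report(results: List[Result], low: float, high: float, max_audit_entries: int) -> List[dict]:
    """Screen all results against predetermined criteria and return only the
    exceptions that need detailed human review (review by exception)."""
    exceptions = []
    for r in results:
        reasons = []
        if not (low <= r.value <= high):
            reasons.append("value outside specification")
        if r.reprocessed:
            reasons.append("result was reprocessed")
        if r.audit_trail_entries > max_audit_entries:
            reasons.append("unusually high number of audit trail entries")
        if reasons:
            exceptions.append({"sample_id": r.sample_id, "reasons": reasons})
    return exceptions

batch = [
    Result("S-001", 99.1, False, 2),
    Result("S-002", 101.8, True, 9),
]
print(exception_report(batch, low=95.0, high=105.0, max_audit_entries=5))
```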
Dynamic data must be reviewed electronically because a paper printout cannot represent the original record [23].
The data reviewer must be skilled and trained in both the business process and the software application to
competently review the data in electronic format.
Significant gains in efficiency can be achieved through the use of reports or views for consistent review of data.
Such reports/views need to be validated and secured against unauthorized alteration to ensure they represent the
complete data set needed for review.
Many software applications have important metadata stored separately to the data, for example, in a separate
audit trail documenting the operator actions that have created, modified, or deleted data within the system. Some
systems may store such metadata across multiple locations, not all of which are named audit trails. There may also
be versioning information with details on changes to methods or recipes stored as part of the method or recipe. It is
essential that the metadata is reviewed in conjunction with the data itself to provide the GxP context and meaning of
the data.
Key questions to consider include:
• What information or metadata is relevant to assess the integrity of the data under review (irrespective of whether
or not it is stored in something called an audit trail)?
• How and when should the rigor of the review be escalated, e.g., what constitutes suspect data requiring
additional review effort?
• Where such processes are used, is the relevant metadata included in that review by exception? What are the
triggers for additional levels of human review?
One way to mitigate the risk of reviewing audit trails independently of the data is to have a record with its associated
metadata embedded or viewable within the data review window, rather than reviewing a separate audit trail log of
actions. With the increased focus on patient safety through data integrity, it is becoming increasingly important to
provide easy visibility and access to the metadata with the data for review purposes.
The data and metadata should be reviewed together by someone who understands the business process that the data
supports, e.g., sample analysis in a QC laboratory or a manufacturing batch operation. Where related or supporting
data is located in a separate system, a person knowledgeable in
that system should evaluate that data. For example, the validity of a chromatography result in the CDS is dependent
upon the sample preparation data recorded in a paper laboratory book or ELN. Excluded data should be included in
the review process, such that the reviewer makes an independent decision as to whether or not the exclusion was
scientifically justified, e.g., the system suitability failures before the samples were introduced. A detailed SOP should
outline the review process, and when and how to scale the scope and rigor of the review.
In the same way that not all relevant metadata is contained in something called an audit trail, not all metadata in the
audit trail needs to be part of routine data review. The term “audit trail” is frequently misused and misunderstood, and
a formal assessment should be made of what metadata needs to be reviewed, where it can be found, and how and
when that review will occur. Records of changes to system configuration, user accounts, etc., need to be considered
for periodic monitoring of the ongoing system controls and compliance.
Further guidance on data review and audit trails is found in the ISPE GAMP® Guide: Records and Data Integrity
Appendix M4 [8].
It is important to minimize the number of systems that a user needs to access to complete the data review. In an ideal
world all data, including metadata, can be transferred to the highest-level system in an organization (e.g., an ERP for
manufacturing) and reviewed and approved electronically in that single system. When transferring data between
systems, there should be an entry in the audit trail that data has been exported from the sending system and a
corresponding entry in the receiving system’s audit trail. Detail on the design and validation of interfaces is covered in
ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts Section 4.4 [16].
In reality, the review is often performed in the source system, with only the final results and electronic signatures
passing to higher-level systems. At the enterprise level, it may only be a summary report that gains final approval
for batch release, so it is critical to patient safety and product quality that the data flow and relevant metadata are
understood throughout the business process and reviewed at each level.
With manufacturing data, it is common to need to reapply elements of the metadata (e.g., units) to the data when it is
received in the target system. This requires an additional level of verification to ensure the metadata is consistently
and correctly reapplied. Metadata transmission in manufacturing systems is covered in detail in the ISPE GAMP®
Good Practice Guide: Data Integrity – Manufacturing Records Section 3.3.2 [15].
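The verification of reapplied metadata can be illustrated with a minimal sketch; the tag names and units are
hypothetical, and the example simply compares the units reapplied in the target system against the unit definitions
held for the source system.

```python
# Hypothetical mapping of tags to their defined units in the source system;
# tag names and units are illustrative assumptions.
SOURCE_UNITS = {"TT-101": "degC", "PT-204": "bar"}

def verify_reapplied_units(target_records):
    """Check that the units reapplied in the target system match the units
    defined for each tag in the source system; mismatches need investigation
    before the data is used."""
    mismatches = []
    for record in target_records:
        expected = SOURCE_UNITS.get(record["tag"])
        if expected is None:
            mismatches.append(f"{record['tag']}: no unit definition found in source")
        elif record["units"] != expected:
            mismatches.append(f"{record['tag']}: expected {expected}, found {record['units']}")
    return mismatches

received = [
    {"tag": "TT-101", "value": 72.4, "units": "degC"},
    {"tag": "PT-204", "value": 1.8, "units": "psi"},
]
print(verify_reapplied_units(received))   # flags the PT-204 unit mismatch
```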
It is recommended to analyze the metadata elements in the source system, whether they can be accepted by the
higher-level systems (and how the fields are represented), and the purpose of those metadata points (what value they
bring). For each business process, the business process mapping and data flow (see Chapter 4) help identify:
• Whether or not it is possible to transmit all required metadata with the regulated record to the next-level system,
e.g., from CDS to LIMS or from PCS to MES
If it is not possible to transfer all of the required metadata, the regulated record needs to be reviewed in the source
system.
The specific requirements for the use of electronic signatures with regulated records have been documented since
1997 [10] and include signature metadata such as the printed name of the signer, the date and time the signature was
executed, and the meaning associated with the signature (e.g., review, approval, responsibility, or authorship).
When establishing a process for the use of electronic signatures, it is important to consider the distinction between
a signature required under the predicate rules versus a need to authenticate on the computerized system to confirm
identity and access.
With the current focus on managing data electronically it is important that the electronic signature [12] is:
Where Single-Sign-On technology is used to access a GxP application, it does not obviate the need for the entry of at
least one electronic signature component (typically this is the user’s password) before an electronic signature can be
generated.
Where there is the possibility to later edit a signed record, it must be clear that the signature was associated with an
earlier version of the record and does not apply to the edited record. Signatures should be permanently linked to the
version of the record signed, with an obvious representation in the original signed record and absent in the edited,
unsigned version of the record. This can be achieved by a visible signature manifestation or by applying a state
change (e.g., from “open” to “approved” on signing, and then back to “open” on later editing) to an electronic record.
Users of the system and data must be trained and understand the implications of such manifestations and state
changes, to ensure that any decision-making only occurs on approved data.
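A minimal sketch of such a state change, assuming a simple versioned record where each signature remains linked to the
version it signed and editing returns the record to an unsigned "open" state, is shown below; the class and state
names are illustrative only.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class RecordVersion:
    version: int
    content: str
    signature: Optional[str] = None   # e.g., "J. Smith, Approved, 2020-06-15T09:30Z"

@dataclass
class SignedRecord:
    """Each signature stays linked to the exact version it was applied to;
    editing creates a new, unsigned version and returns the state to 'open'."""
    versions: List[RecordVersion] = field(default_factory=list)
    state: str = "open"

    def sign(self, signature: str) -> None:
        self.versions[-1].signature = signature
        self.state = "approved"

    def edit(self, new_content: str) -> None:
        new_version = RecordVersion(len(self.versions) + 1, new_content)  # no signature carried over
        self.versions.append(new_version)
        self.state = "open"   # decision-making should only occur on approved data

record = SignedRecord([RecordVersion(1, "assay result 99.2%")])
record.sign("J. Smith, Approved")
record.edit("assay result 99.4%")
print(record.state, record.versions[-1].signature)   # open None
```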
Some software applications apply signatures as proof of user authentication at steps in the process where no
predicate rule requirement for a signature exists, for example, requiring entry of a username and password to begin
an analysis and recording a signature against the start of the analysis. The predicate rule requirement is for attribution
of the analysis start to an individual operator, not for a signature; the signature is only required on the final result,
as accountability for having generated that result. Such excessive application of electronic signatures brings no
additional compliance or patient safety benefits and should be avoided.
Below, batch release and eConsent are presented as examples with very specific requirements for electronic
signatures. It is important to identify specific requirements that are applicable to an organization and the business
process.
Batch Release
Where an electronic signature is used for batch release, only a specifically authorized person (or in Europe, Qualified
Person) should have the access to apply such a signature [11, 25].
eConsent
The use of electronic consent (eConsent) in clinical trials is approached quite differently from electronic signatures
used with other regulated GxP records, and may include any of the following as electronic signatures, as stated in the
MHRA/HRA Joint Statement on seeking consent by electronic methods [44]:
• Typewritten
• Scanned
The MHRA/HRA Joint Statement [44] places the onus on the study personnel to justify the method of applying the
electronic consent signature based on:
• The risks, burdens and potential benefits (to the participants and/or society); and
Semi-active data is needed only periodically, having fulfilled its primary use in support of decision-making while it was
active. It is an internal company decision defined within a documented policy/procedure as to when records become
semi-active and inactive.
Some organizations keep semi-active records in the originating system (ideally in a restricted access or read-only
state) until all periodic use of the record has passed, for example, the annual product review has been completed and
the data will not be included in future trending or control charts. As the records transition to inactive, this could be the
trigger to transfer the data into a separate archive system to serve out the remainder of its retention time.
Other organizations choose to transfer records to the archive as soon as they transition from active.
The advantages and disadvantages of keeping data in the originating system versus a dedicated archive system are
discussed in Section 3.3.
Records need to be retained for the ongoing support of patient safety and product quality: pre-clinical toxicological
and pharmacological data, clinical trial data, adverse event reporting, batch release data, donor data for blood and
tissue, for example. The data should be available to reconstruct the activities and decisions used to support the
development, manufacture, and release of drugs. The regulatory requirements for retention are intended to ensure
the data is available for at least a minimum period commensurate with the use of the data.
Regulatory requirements define the need for electronic records retention over a protracted period of time for safety
and efficacy data, and vary by country. (See Appendix O4 for examples of requirements for record retention.) The
expectation is that data be maintained in a human-readable form so that it may be reviewed in the same fashion as it
was when first used to make critical decisions.
Implementation of business solutions to meet the regulatory requirements of preserving data integrity and maintaining
the records in human-readable form may prove challenging for organizations when extensive time periods are
involved. There often comes a time in the lifecycle of data where maintaining the record in its dynamic format is no
longer realistic and a company will need to convert the dynamic record into a static format that captures as much
content and meaning as possible given the static nature of the record. Any loss of content and meaning should be
captured in a documented risk assessment. The implications of static and dynamic data formats are discussed later in this
section. See ISPE GAMP® Guide: Records and Data Integrity [8] for further details including consideration of the
prospective use of the data.
As described in Chapter 3, it is important to plan the retention strategy for a system in the beginning of its lifecycle.
An organization must ensure that the accuracy, content, and meaning of records is maintained throughout the
retention period. Record retention and archiving are different concepts, but archiving can be used to meet electronic
records retention requirements.
Data must be readily available throughout the retention period, based on organizational and regulatory needs.
Wherever data is archived, procedures must be established to ensure that the archived data is protected against
disaster and loss. Data must remain accessible and readable.
One traditional view of archiving is to move data from the online production server to off-line or near-line storage
in a separate application or read-only database. Another approach is to retain the records and data in the original
application, provided that they are logically separated from live data and are set as read-only. These approaches are
discussed in Section 3.3. There may be a change in data stewardship when the data is transferred to an archive,
however, data ownership remains unchanged as discussed in Section 2.2.
The rest of this section goes into further detail on aspects of record retention within the data lifecycle and the system
lifecycle, without repeating information presented in the ISPE GAMP® Guide: Records and Data Integrity [8] and in the
ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts [16].
Based on the data flow diagrams (described in Section 4.3), the regulated data identified therein needs to be retained
and readable for the mandated retention period as defined in Appendix O4.
Data may not be limited to records in one application alone, and ensuring that the data is readable may depend on
multiple applications interfaced together, each with version dependencies. For example, a sample result may depend
on a chain of data that runs through a LIMS, including imported data, methods, analyzed data, and summarized data. It
must be ensured that none of these records has been adversely affected by changes to the LIMS or the source
system. Additionally, metadata such as the audit trail should be checked, because if the audit trail is negatively
affected, then data integrity could be permanently lost.
Best practices for general archiving are presented in Appendix O2, and specific considerations for GLP archiving are
contained in Appendix O3.
It was discussed in detail in the ISPE GAMP® RDI Good Practice Guide: Data Integrity – Manufacturing Records
Section 3.2 [15] that the objectivity and consistency around how the regulated record is produced may impact
whether the initially acquired data points need to be kept. With validated controls and configuration management
regarding the scaling and calibration settings, the processed data values (e.g., temperature in °C) may be sufficient
and the initially acquired values (e.g., analog signal in mA) unnecessary. This is very different to the situation with
chromatography data where the initially acquired values (e.g., channel data from the detector) must be kept because
the value of the processed result can be significantly impacted by the integration parameters applied by the analyst.
Ideally, all dynamic data is retained in a dynamic format throughout the retention period; however, this may become
unfeasible over time (see Section 7.2.1). For the regulated data to be retained, it is therefore important to consider:
• The ongoing intended use of the data when it is semi-active and inactive, i.e., is it retained purely for regulatory
and audit requirements?
• The likelihood of a prospective need for the data to be reprocessed (and how much will this likelihood decrease
over time)
These considerations, addressed in a risk assessment, impact how and when the data is to be archived; the aim
must always be to preserve the GxP content and meaning of the regulated data throughout the retention period.
From a business perspective, the importance of the stored data is likely to decline over the retention period. Clinical
trial data may be used as an example. Drug recalls are more likely to occur early in the licensed use of the drug; thus
the clinical trial data represents the major drug safety and efficacy data available during this time. After a long period
of use of the drug, the clinical trial data maintained as dynamic records becomes less significant as data from patient
use of the drug provides more extensive information on a wider scale. It should be noted that if existing clinical data is
re-used for a new submission, the retention period is restarted, and it may be necessary to continue maintaining the
data in a dynamic format.
During corporate acquisitions or outsourcing activities, there needs to be a process that verifies that the data, the
metadata, and the ability to open records in their dynamic format for review are maintained throughout the transition
between companies. See Section 3.7 for further information on records management through mergers, acquisitions, and
divestments.
Inactive, archived data is unlikely to be routinely accessed by the generating departments. Therefore, it is highly
recommended that the data is protected from access, specifically access to edit, once it is deemed to be inactive and
archived. This might include a process to migrate the data away from the department or division that initially collected
and stored the information [23], unless that data is required for a specific review project, regulatory submission, or
investigation.
It is likely that a record may need to be migrated at least once during its retention period. Each migration carries a
risk to the content and structure of a record and potential loss of metadata. Taking the concept of reduced usefulness
over time into account enables the migration risk to be balanced against the relevance of the record, perhaps making
rendition to a static format (e.g., PDF) more acceptable. If data is to be migrated, it should be done following an
approved process, procedure, or protocol. Data migration is discussed in Section 3.5.3.
Many organizations use a data lake (sometimes loosely grouped with related concepts such as data fabric, data mesh,
or, when poorly governed, a data swamp) as a storage repository for all structured, semi-structured, and unstructured
enterprise data to enable them to visualize, report, and perform data analytics. A data lake allows an organization to
store all of its data without requiring the creation of data silos, and is heavily reliant on structured nomenclature
(see Section 4.7). The use of a data lake is further discussed in
Appendix S1.
Static data has been defined within guidances as listed in the ISPE GAMP® RDI Good Practice Guide: Data Integrity
– Key Concepts Appendix 5 [16].
“Static is used to indicate a fixed-data record such as a paper record or an electronic image.”
“A static record format, such as a paper or electronic record, is one that is fixed and allows little or no interaction
between the user and the record content. For example, once printed or converted to static electronic format
chromatography records lose the capability of being reprocessed or enabling more detailed viewing of baselines.”
From these definitions it can be seen that some original records may be created in a static record format: a completed
paper form, a photograph, a printout from an instrument that does not keep electronic records where the printout is
the only record created. A true copy of static paper records can be created as a static electronic record via a scanning
process with verification that all information has been captured with no loss of color or pages. The amount and types
of quality checks performed on the scanned records should be based upon risk.
There should be a documented policy with respect to the destruction of original paper records. For example, if there
is potential for litigation around the records, then it may be inappropriate for the company to destroy original records.
Other static records might be generated as copies of original records that were created in a dynamic electronic format
(note that a static record cannot be a true copy of dynamic data). Static records of this type need to be considered as
one of two distinct kinds of records:
• Copies containing as much meaning and context of the original electronic record as can be converted into a
static format but containing, at a minimum, the final or reported data
• Summary reports used to present data into a meaningful format to make further quality decisions. (The use of
summary reports is discussed later in this section.)
Static electronic records are commonly stored in PDF format. It is assumed that PDF records are static as they cannot
provide the kind of user examination of, or interaction with, the record that a dynamic record offers. Equally, there
is an expectation that static PDF records cannot be modified or altered by users, and that there is therefore no
requirement for levels of access control or audit trails for PDF records. However, as the capabilities of applications
that read or display PDF documents increase, it is essential to challenge the integrity of static PDF records. The
abilities to edit, delete pages, reorder pages, sign, un-sign, or re-sign PDF files, or to convert them to editable
formats such as Word and then back to PDF, without the technical controls expected for electronic GxP records, need to
be considered in a risk assessment when relying on PDF record formats.
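One possible technical control, not prescribed by this Guide, is to record a checksum when the static PDF is generated
and to re-verify it at review or periodic check time; a minimal sketch follows.

```python
import hashlib
from pathlib import Path

def file_checksum(path: Path) -> str:
    """SHA-256 of the stored file; recorded when the static record is created."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_static_record(path: Path, recorded_checksum: str) -> bool:
    """Re-compute the checksum and compare it with the value recorded when the
    PDF was first generated; any difference indicates the 'static' record has
    been altered since creation."""
    return file_checksum(path) == recorded_checksum
```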
Dynamic data has been defined in the ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts,
Chapter 3.1.4 – Data Requirements [16], under the discussion of “true copies:”
“dynamic data requires additional processing to derive a meaningful value (reportable result) that can be used for
decision making.”
An extensive suite of regulatory definitions for dynamic data can be found in the ISPE GAMP® RDI Good Practice
Guide: Data Integrity – Key Concepts [16].
When archiving inactive electronic records, the possible future requirements for examination, review, and
reprocessing need to be considered before deciding on a format or transformation of the data.
In most cases, the simplest and most successful format to ensure future processability is the original vendor format,
with an intent to restore the original electronic dynamic data into the same or updated version of the application. This
should make certain that all metadata, audit trails, and traceability between metadata, as well as the dynamic nature
of the record, are preserved. It should most easily allow reconstruction of the data as it was reviewed at the time of
the quality decision it supported. The difficulty with this approach is that it requires that the original application is still
available and capable of reading older records.
One key situation where summary records may be used is where the original records are held and maintained by a
third-party contracting company, while the regulated company is only provided verified summary data.
Summary documents do not purport to be true copies of the original record, but they must be verified [23] against the
original complete record to confirm that they include only data or results that have been reviewed and approved, and
that the key data is recorded accurately in the summary. Typically, the summary is verified as an accurate (but not a
true and complete) representation of the original data. One example is a Certificate of Analysis, which summarizes
results from multiple analytical tests whose original records exist in several different GxP systems.
Key features in a dedicated archive system could include (but are not limited to) the ability to:
• Capture and manage data from multiple sources and in multiple formats, including both flat files and database
files
• Capture data based on user-defined criteria (time or event based) and/or manual tagging of the files as ready
for archiving
• Generate a secure log entry of the capture of the archive files from the originating system
• Restore archived files to designated locations for further viewing or reprocessing in the native applications, and
document such activities in a secure log
• Index and tag records and record content for ease of retrieval
• Set a minimum retention period for individual records or folders (including an option for legal holds to indefinitely
extend the retention period if required). (Retention periods and legal holds are discussed in Section 7.4.2.)
Retention is a challenge especially for electronic records with the constant changes in technology leading to system
upgrades or system replacement. When a computerized system undergoes an upgrade, it must be determined if
the new software can restore and read data archived from previous versions of the system, as well as all supporting
information to enable the reconstruction of the activities.
Verification should be sought from the vendor of the system that historical dynamic, previously archived data can be
restored in any future installed or upgraded version of the application, noting that:
• There may be a need for an intermediate transformation, or a bulk migration process of previously archived data
• There may be a dependency to restore records from the archive to the live system, and then re-archive from the
newer application version
• There may be a process to update the archived records, which requires validation to ensure the records can be
retrieved, but which is only carried out if archived records are specifically required to be restored
Another challenge to records retention is that there are no common data standards, as discussed in Section 3.5.3. If
the upgraded version of the computerized system cannot restore previous versions of data or if the decision is made
to retire the current system and obtain a different computerized system, a process needs to be determined to ensure
the continued retention and readability of previously acquired data. This can be extremely challenging for a company
when system upgrades are sometimes not optional. It is important to understand the impact of the system changes
on the data and, if possible, plan for them.
Throughout the system lifecycle, there are many system changes and upgrades requiring assessment as to the
readability of the inactive records as well as any archived records. As part of system periodic review, a representative
sample of archived inactive data may need to be retrieved and checked to determine that it is still in human-
readable form through the original application. This is especially important when upgrades to operating system and
application software are occurring and the review frequency should be aligned with those changes. Data may have
been archived from a much earlier version of the application.
However, the ability to restore the data to its original or updated application depends on that system still being in its
active lifecycle state, i.e., that the application is still available and validated for use, either still generating new records,
or for reading historically archived data. See Appendix O5 for a discussion of managing legacy software.
To avoid the possibility that archived data cannot be restored into the latest version of the application, it may be
necessary to keep all data live in the environment of the most recent version of the application. A second instance of
the application can be deployed, specifically to house inactive archived data. This second instance must be updated to
the same version as the production system, and the complete data it contains migrated at each upgrade, ensuring it can
be read in the most recent version.
At any point during the retention period, data may need to be retrieved for further viewing, and in the case of dynamic
data, reprocessing. In effect, the inactive data must be restored to an active state, usually within the originating
system or a copy thereof.
A documented restoration process should:
• Define the scope of the archived data to be restored and the justification for doing so
• Authorize the restoration of archived data into the originating system or compatible application
• Ensure the restored data is highlighted, and where possible segregated (at least by date) from the current data, if the
originating system into which it is restored is still in operational use; this is to prevent confusion with current data
• Reprocess the restored data to obtain new and alternative results if required
• Manage the retention of new results including maintaining data integrity and archiving of those results in
compliance with current active record policies, procedures, and controls
• Assess and document the subsequent impact those results may have on previously approved decisions (e.g.,
batch release), trending, and annual product reviews
• Ensure the restored data is deleted from the originating system after the investigation is completed, with the
deletion formally approved and documented
7.4 Destruction
7.4.1 Data
Data destruction is the final stage in the data lifecycle and often the most difficult to manage. Because this process
is, in many cases, irrevocable and involves the deletion of GxP critical data, it warrants special consideration.
There are several driving forces for deleting data at the end of the data lifecycle:
• A contractual requirement between a sponsor company and their CRO/CMO requires deletion of the sponsor
data after a specified time
• To remove the management burden of records that no longer need to be retained. The cost of discovering each
electronic record during litigation processes is significant, and increases in proportion to the volume of records
stored. Therefore, limiting the volume of records in a controlled manner saves considerable cost, and also
optimizes the time taken to retrieve records.
• To remove confidential or financial records that are no longer required for regulatory purposes and that present a
business risk to the organization if retained and accessed by unauthorized individuals
Irrespective of the drivers for deletion, the process used should take account of the following factors:
• A deletion method with a defined reversal step is recommended, to enable the deletion to be reversed in the case of
serious failure.
• Irrespective of whether the deletion process is automated or manual, it is essential to ensure that only the
appropriate records have been deleted. Where an organization operates a fully automated deletion process,
stringent validation of the process should be carried out. In addition, the automated process should contain in-
built verification of the correct operation of the process with error notification. The performance of the deletion
process should further be regularly reviewed for correct operation.
• Any litigation or legal holds must prevent the data from being deleted, and a check of any legal hold status must
be part of the process for the deletion of data.
• The data owner is responsible for ensuring that the correct data is deleted, i.e., that data still required by
regulation or the business is not deleted. This means that the data owner has a central role to play in the deletion
of data.
Due to the criticality of the process, it is recommended that in addition to approval by the data owner, there is process
owner and QA approval for the deletion of GxP data. In a contract situation (CRO/CMO/contract testing laboratories,
etc.) the data is owned by the contract giver (e.g., sponsor company), and explicit approval for the deletion should be
obtained in advance from that contract giver. Additional approval by the legal department may also be put in place.
Such approvals should be at a stage in the deletion process before the data is irrevocably lost.
There may be occasions when the business chooses to retain the data after the mandated retention period. For
example, based on ICH E6 (R2) [45] GCP requirements, clinical records could be destroyed 2 years after the last
approval of a marketing application, but in reality that same data may be required if the company adds new
formulations and uses the same data in a new submission.
Note that it is sometimes necessary to retain data beyond its retention period for a number of valid business reasons,
including where there is a possibility of it being required in connection with legal proceedings; consideration should be
given to a legal approval step prior to deletion. (Product liability and patent defense considerations are out of scope
for this Guide.) Time-based data destruction may not be supported in all systems, again leading to data surviving
beyond the retention period.
Data destruction is performed when all legal, regulatory, and business requirements have expired, for example, 15
years after the record is created. The destruction phase ensures the correct original record, any true copies, and any
uncontrolled copies are appropriately destroyed. Consideration should be given to managing the disposal of records
even from the backup copies of the data. The disposal of data should be performed following an approved disposal
protocol where verification of the appropriate records is performed prior to final destruction. Section 4.6 in the ISPE
GAMP® Guide: Records and Data Integrity [8] provides procedural requirements and additional information related to
data destruction.
7.4.2 System
As discussed in Section 7.4.1, there is a reluctance within the regulated industry to actively destroy data even at the
end of the mandated retention period. Once the decision has been made to proceed with data destruction as the
final step in the data lifecycle, there are systems available that provide a consistent, validated mechanism for the
controlled and automated deletion of records based on a data retention policy, making the destruction process simple
and well documented.
When choosing an archive system with the intent to use automated data deletion, it is essential that the system
includes the ability for an authorized user to:
• Set a project retention policy defining that the data is needed for a specified time after archiving, such that
the system automatically retains the data in that project throughout the specified retention period before
automatically deleting it
• Set different retention policies for different data folders, projects, or studies, including a “never delete” policy
(e.g., as shown in Table 16.1 in Appendix O4, different retention periods may apply to batch data versus
validation data versus clinical data)
• Configure the system to exclude specific data or records from the retention policy within a folder, project, or study
to prevent their automatic deletion (e.g., for records required to be kept as part of a legal hold)
• Generate a system log of data deletion detailing the records deleted, from which folder, project, or study and
when (note that this is not called an audit trail as the deletion is not an operator action but an automated system
action)
• Control, by granular access privileges, who is able to set or amend project retention policies and record legal
holds
• Capture in an audit trail the setting of, and changes to, retention policies and legal holds, including the option to
enter a reason for the change
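The retention-policy and legal-hold logic described in the capabilities above can be illustrated with a minimal sketch. The project names, retention periods, and function below are hypothetical and serve only to make the decision logic concrete; an archive system would implement equivalent, validated functionality together with the deletion logging and audit trail capture already listed.

from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class ArchivedRecord:
    record_id: str
    project: str
    archived_on: date
    legal_hold: bool = False            # set immediately upon notification of a legal need
    excluded_from_policy: bool = False  # record individually exempted from automatic deletion

# Hypothetical retention policies per project or study (None means "never delete")
RETENTION_POLICIES: dict[str, Optional[timedelta]] = {
    "batch-data": timedelta(days=365 * 15),
    "validation-data": None,
    "clinical-study-001": timedelta(days=365 * 25),
}

def eligible_for_deletion(rec: ArchivedRecord, today: date) -> bool:
    """Return True only if every retention condition allows automatic deletion."""
    if rec.legal_hold or rec.excluded_from_policy:
        return False                    # legal holds and exclusions always block deletion
    policy = RETENTION_POLICIES.get(rec.project)
    if policy is None:
        return False                    # unknown project or a "never delete" policy
    return today >= rec.archived_on + policy

# Example: a batch record archived in 2006 is past its 15-year retention period,
# but applying a legal hold immediately blocks deletion.
record = ArchivedRecord("B-1234", "batch-data", date(2006, 5, 1))
assert eligible_for_deletion(record, date(2024, 1, 1)) is True
record.legal_hold = True
assert eligible_for_deletion(record, date(2024, 1, 1)) is False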
The data destruction features listed above should be risk assessed and validated based on their risk priority. It is
essential that legal holds are applied immediately upon notification of a possible need for the data for legal purposes.
Critical thinking should be applied when assessing and evaluating the use of automatic deletion as well as the
controls needed around such an activity, as it may not be appropriate for all data, situations, or organizations.
Data retention is not only governed by life sciences regulations, but also by business, financial, safety, and
legal requirements, etc. It is important to ensure that the organization’s destruction policy involves all impacted
stakeholders.
Appendix M1
8.1 Introduction
With the vast amount of data generated from the development, manufacturing, and marketing of products in the life
science industry, KM enables organizations to make decisions more efficiently and encourages an organizational
culture of learning. It drives and facilitates continual improvements to data integrity and product quality, resulting in
increased patient safety. This appendix presents the concepts and discusses tools to help organizations increase use
and advance their maturity level in KM.
KM and QRM are identified as the two enablers of the Pharmaceutical Quality System in ICH Q10 [32], which defines KM as a:
“Systematic approach to acquiring, analysing, storing, and disseminating information related to products,
manufacturing processes and components.”
It further notes:
“Sources of knowledge include, but are not limited to, prior knowledge (public domain or internally documented);
pharmaceutical development studies; technology transfer activities; process validation studies over the product
lifecycle; manufacturing experience; innovation; continual improvement; and change management activities.”
QRM has been extensively discussed in ICH Q9 [27] and ISPE GAMP® 5 [9].
The DIKW (Data, Information, Knowledge, Wisdom) pyramid is a commonly used model in data science to describe
the relationship and hierarchy between data, information, and knowledge. The DIKW hierarchy is considered
foundational in many information science curricula and is commonly represented as a pyramid with the foundational
base of data [46]. An example is shown in Figure 8.1.
In the context of the DIKW hierarchy, the following definitions apply:
• Data: Symbols that represent the properties of objects and events [47]
• Information: Processed data, where the processing is directed at increasing its usefulness [47], e.g., data
with context
• Knowledge: As defined by the Cambridge Dictionary [48], knowledge can be described as awareness,
understanding, or information that has been obtained by experience or study, and that is either in a person’s
mind or possessed by people. In the context of an organization, however, knowledge can be a combination of
content (explicit knowledge), information, and tacit knowledge.
• Wisdom: Wisdom is the ability to act quickly or practically in any given situation [49]
An alternative to the DIKW triangle, replacing wisdom with insights, has been proposed as a reflection of current
technology tools and approaches [49]. Insights may be the more fitting term today, as wisdom is widely agreed to
be a “uniquely human” characteristic. Insights may be derived by people with knowledge and experience;
however, new trends suggest that insights may also be derived by new computing or AI models that identify trends
and correlations previously not possible to see with experience alone [49]. Figure 8.2 uses this concept to view the
relationship between the producers and the consumers of data and information. This notion is driven by a strong
foundation of data that enables information, knowledge, and insights.
Data transforms into information by assigning a meaning or context to data. Furthermore, the accumulation of a
data bundle or the linking of various data can also represent information. The moment the data is processed, linked,
and stored, whether by a machine or a human being, it becomes information. The ability to apply that information
appropriately translates into knowledge. Prospective utilization of information and knowledge leads to insight. Context
is essential throughout the information, knowledge, and insight stages.
One process’s information may be another process’s data, e.g., bioanalytical results are data inputs for
pharmacokinetic analysis.
The foundations of ICH Q8 [50], Q9 [27], Q10 [32], Q11 [51], and Q12 [52] build upon science, application of risk-
based approaches, and utilization of prior knowledge. The ability to capture, store, and provide visibility of product
knowledge is critical to enable the development, application, and manufacture of medicinal products as well as to
support continual improvement and post-approval changes. Product knowledge is typically classified as explicit
(documented or codified) or tacit (intuitive) and may be developed or acquired from data generated during research,
commercialization, manufacturing, and continual improvement activities.
Figure 8.3 provides a representation of how the data governance framework provides controls for data integrity/data
quality and links with QRM, KM, and data management. Leveraging the data integrity governance framework
(inclusive of the quality-culture mindsets and behaviors discussed in Section 8.4) and actively utilizing QRM (middle
of diagram) provides high-quality foundational data that an organization can use to create information, knowledge,
and insights.
It is important to remember that achieving insights is not the ultimate destination but a waypoint on the journey.
Proactive KM provides feedback into the data governance framework, allowing for optimization of the organization
and driving continual improvements to the business processes.
This is reflected in EU GMP Part 1, Chapter 1, Section 1.4 [1], which states:
“A Pharmaceutical Quality System appropriate for the manufacture of medicinal products should ensure that:
(ii) Product and process knowledge is managed throughout all lifecycle stages; …
(xi) Continual improvement is facilitated through the implementation of quality improvements appropriate to the
current level of process and product knowledge.”
Reflecting on the DIKW/I model [46], this underscores the importance of data that is fit for purpose with both data
integrity and data quality, and can therefore contribute to product and process knowledge. As discussed in Section
1.6.4, data quality is defined by the OECD [13] as:
“Data quality is the assurance that the data produced are generated according to applicable standards and fit for
intended purpose in regard to the meaning of the data and the context that supports it. Data quality affects the
value and overall acceptability of the data in regard to decision-making or onward use.”
It is important to remember that data quality is not synonymous with data integrity and the controls associated with
data integrity (ALCOA+) do not ensure the quality of the data generated. This is further discussed in Section 1.6.4,
and Appendix M2 shows examples of a lack of integrity and/or quality.
Knowledge, like data, has a lifecycle as depicted in Figure 8.4. The knowledge lifecycle can be described as:
“A continual cycle that describes how knowledge moves through an organization.” [53]
The knowledge lifecycle is similar to the data lifecycle (Figure 8.5). It is interesting to note that the APQC
(American Productivity & Quality Center) [54] knowledge flow process deals with the creation, processing, review,
reporting, and use aspects of the data lifecycle. However, in the regulated industry it is important to ensure the
integrity of the data throughout the mandated retention period. It is also essential during the retention period that
the data is retrievable for review in support of regulated processes or for regulatory inspection.
The ICH Q10 [32] KM definition suggests a systematic approach. Case studies within and outside of the
biopharmaceutical industry show that multiple approaches, for example, content management guidance, the
application of data taxonomy (see Section 4.4 on data classification and Section 4.7 on data nomenclature for
further details), lessons learned, communities of practice, etc., are often warranted to help manage knowledge
across the pharmaceutical product lifecycle and the supporting business processes. Organizations with best-in-class
KM capabilities often measure KM maturity and take a programmatic approach to managing knowledge. APQC⁴
developed their industry-neutral Knowledge Management Capability Assessment Tool (KM-CAT™) [54] in 2007,
which measures KM capabilities and their respective levels of maturity. The KM-CAT™ measures five levels of
maturity over four categories, comprising 12 subcategories with 146 questions.
A key outcome of KM is facilitating knowledge flow. A strong foundation of data, information, and knowledge is
required to enable the objectives of ICH Q10 [32]: achieve product realization, establish and maintain a state of
control, and facilitate continual improvement.
4. APQC [54] is a not-for-profit research organization that is the recognized leader in the practice of knowledge management.
Organizations should proactively consider how knowledge associated with products and pharmaceutical processes
will be developed, curated, and used, and develop standard approaches to managing such knowledge. Standard KM
tools and processes may be utilized to enable knowledge flow. Examples include:
• Taxonomy
• Lessons learned
• Expertise location
• Forums and processes to connect people to expertise and tacit knowledge
With advances in technology and innovation, a strong foundation of quality data and information may also
facilitate the use of advanced computer systems leveraging AI/machine learning/deep learning to generate predictive
insights. Appendix S1 on AI and ML discusses such systems.
Quality data is required as a foundation, so that data can ascend the DIKW/I hierarchy (Figures 8.1 and 8.2) and
generate additional value to the organization and patients by feeding back to drive continual improvements to data
governance. Fundamentally, data, information, and knowledge are organizational assets and must be appropriately
managed to protect and ensure the availability of such assets. Data management has been defined as:
“The development, execution, and supervision of plans, policies, programs, and practices that deliver, control,
protect, and enhance the value of data and information assets throughout their lifecycles.”
One of the objectives of data management is for the organization to control its data resources, i.e., data governance.
An important element for both data governance and KM is a quality culture. The ISPE Cultural Excellence Report [55]
explores the term “cultural excellence,” proposing that within any given organization, there is not a separate quality
culture, safety culture, data integrity culture, etc. Rather, one primary corporate or organizational culture exists that
influences the behaviors and actions of personnel giving rise to quality, data integrity, and safety outcomes that matter
to the patient and the business. The six dimensions of the Cultural Excellence Framework [55] are:
• Gemba Walk
• Cultural Enablers
The ISPE Cultural Excellence Report [55] recommends moving from a culture of compliance towards a culture of
excellence, where:
“there is deep understanding throughout an organization of the elements critical to product quality.”
Quality culture is a foundational mindset required to create data of high quality, controlled throughout its lifecycle
to ensure integrity, which in turn enables the organization to use that data to create information, knowledge, and
insights. The quality-culture mindset, together with the mindset of treating data and knowledge as assets
comparable to physical assets in manufacturing or laboratories, is necessary to deliver value to the business and
ultimately the patient.
Knowledge and insights, whether from people or systems, may be lost or impeded during mergers, acquisitions, and
divestments. Particular care should be taken to evaluate the respective corporate systems and to ensure that not
only the governance and procedures are reviewed but also the tacit knowledge associated with such systems. An
acquired company may be forced to align to the corporate QMS (including data governance and KM) of the
purchasing company, even if the purchasing company has weaker processes.
8.5 Conclusion
In summary, ICH Q10 [32] recognizes QRM and KM as two enablers of a pharmaceutical quality system. The role of
quality foundational data is critically important for QRM and KM. Data is used in organizations to create information,
knowledge, and insights. The data, information, knowledge, and insights must be available to the organization so
it can use them to contribute to the objectives in ICH Q10 [32] (as reproduced in Section 8.1) and go beyond
compliance to deliver value to the business and the patient.
Appendix M2
Data Integrity Compared to Data Quality
This appendix contains examples to aid in understanding the differences between data integrity and data quality, and
the need for both.
Table 9.1: Examples of Data Integrity and Data Quality (or Lack Thereof)
Manufacturing Bill of Materials (BoM): The BoM is approved by the right persons, is secure and unalterable
with a formal change control process, which includes formal revision and review/re-approval, and an audit trail is in
place to capture before and after data values. The component parts and quantities listed are correct; there is only
one valid version available to be added to a process order at any one time.

Vendor Management in an ERP System: There are security and approval processes in place to ensure duplicate
vendors cannot be added to circumvent a vendor listed as blocked on the ERP system. There is only one valid entry
for a given approved vendor on the ERP system.

Analytical Laboratory [56]: The laboratory has new equipment, great training, and an excellent quality culture.
Independent audits give them glowing reports. They create data with a high amount of integrity that can absolutely
be trusted. In contrast, each manufacturing site has its own Electronic Batch Record System (EBRS). They each
have a different standard for describing materials, procedures, methods, qualifications, and the like. No two sites
describe their processes identically for the same product, even though all groups electronically submit data to the
same laboratory. Quality would like to assess the manufacturing capability of product XY across the nine global
sites where it is manufactured. Due to site-centric data descriptions, IT has to create a different query at each site
and then combine them to provide the quality unit with the data required for the assessment.

Contract Laboratory [56]: The manufacturer conducts a cursory review of SOPs and deviations at the contract
laboratory every two years. If any deviations occur in the contract laboratory, the contract laboratory is responsible
for investigating and closing them, but receives no payment for deviation activities. They are paid for the number of
test results they provide to the manufacturer. This business scenario provides ample motivation for the contract firm
to take shortcuts in practices, conduct superficial investigations, and release test results with inadequate review,
with few chances to detect the poor integrity of the underlying data. The manufacturer can create the necessary
reports quickly and efficiently, but the data in the reports could lead to incorrect conclusions about capability,
because the data in the reports cannot be trusted.
The manufacturer has a single, global EBRS installation with strict data management practices that ensure
database attributes are defined only once. Validated reports are available for routine operations. However, each of
the nine manufacturing sites uses a local contract laboratory to conduct in-process and release testing. These
contract laboratories keep data on their local systems, entering the final reportable value in the global EBRS system
using a secured network connection in the laboratory.
The analytical laboratory example above demonstrates why data quality is needed, how data nomenclature
(see Section 4.7) helps with quality, and how data analytics (see Appendix O1) then relies on data standardization,
definitions, and libraries. A laboratory can have excellent data integrity, but if the analytical information used to make
a decision is delivered late, or not at all, then the data is of little use, i.e., the data quality is poor.
“Data integrity’s focus is providing a value that can be trusted by users. Data quality’s focus is providing attributes
around data values (context, metadata) so values can be sorted, searched, and filtered in an efficient and timely
manner, confident that the complete data set is included.” [56]
It is important to remember that it is possible to have integrity without quality and quality without integrity, but that it is
essential to have both quality and integrity.
Appendix M3
10.1 Introduction
There are multiple scenarios where regulated data is under the control of a third party (the contract acceptor). It is
important for the marketing authorization holder or sponsor company (the contract giver) to understand and document
the risks and controls to ensure that the roles and responsibilities are clearly articulated in support of maintaining data
integrity throughout the data lifecycle. This appendix presents some of the risks and data concerns associated with
the use of contract organizations (such as CRO, CMO, or other variants; collectively referred to as CxO or contract
acceptors) or cloud providers. The use of any of these third-party services results in similar data integrity concerns.
Key to the use of third parties is the need to appropriately assess the provider and clearly define the responsibilities.
“The responsibility for the quality of IT software and services will always reside with the life sciences company
that uses them. Having a vendor or even an independent third party produce an independent attestation
regarding the control environment’s effectiveness does not affect that obligation. However, with the expanding
use of such services, the need to maximize the efficiency of quality assessments has become a more significant
challenge. In addition, suppliers are starting to offer services with significant GxP risk, such as laboratory
information management systems (LIMS) as an SaaS application. The use of such high-risk services is a driver
for a structured and controlled approach to supplier assessment.” [57]
Assessment can be done by direct vendor audit or postal questionnaire, and/or by performing due diligence on the
available documentation provided by the third party, such as SOC2+ reports [57] and GxP supporting information.
It is important to ensure that the requirements for long-term archival storage and viewing in human-readable
form throughout the retention period are included in the CxO or cloud provider assessment. It is also important
that contract givers include SMEs with detailed understanding of both IT and data integrity requirements in the
assessment and ongoing vendor management process.
The most important point to understand when regulated activities have been outsourced to a CxO is that the
marketing authorization holder is no longer in direct control of their data; it is therefore crucial that they appropriately
manage the contract acceptors. Data created by a contract acceptor remains the responsibility of the contract giver
and, as described in the ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts [16], needs to be
available for review throughout the record retention period.
The principles for data governance, and details of data ownership and access, should be outlined in the quality
contract and/or technical agreement with the CxO [24]. This should include not only access to and ownership of live
data but also inactive archived data. This is further discussed in Section 4.2 in the ISPE GAMP® RDI Good Practice
Guide: Data Integrity – Key Concepts [16]. Retention requirements for such data, and even deletion conditions for the
same, must be covered in the quality contract and/or technical agreement.
The CxO may or may not have the expertise to maintain the data in a long-term human-readable format and this
should be evaluated when selecting a third party. One outcome of the assessment discussed in Section 10.2 is an
understanding of the capabilities and weaknesses associated with the services offered. It is important that these are
addressed in the quality contract and/or technical agreement and monitored through a robust vendor management
program.
Often, summaries of outsourced activities are presented to the marketing authorization holder in a static format,
with the original electronic records secured and archived by the CxO. It is important that the summary data has
been verified against the original data to ensure it is complete and accurate.
Where a CxO stores the contract giver’s data in the cloud, there is a risk that any interruption in the cloud agreement
could cause a loss of that data. The contract giver should consider storing the data from the CxO on their own
servers or in their own cloud provider’s space rather than leaving it entirely with the CxO; this is especially important
if the contract giver decides to terminate the contract. There may be decisions to make about whether archived
records are returned to the contract giver, in what format, and how to make certain that the contract giver has the
software available to read them.
If transferring these electronic records is required, potentially with a change in format, the process should be validated
to ensure the integrity and security of the data during transfer, particularly if the data includes confidential or private
information. If the transfer is completed successfully, the data should only be deleted by the CxO, once the marketing
authorization holder has given explicit documented instructions to do so.
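Where such a transfer is performed, one common technical control is a checksum (hash) manifest generated at the source and verified at the destination. The following is a minimal sketch of that idea only; the paths are hypothetical, and a validated transfer process would also address completeness, secure transport, confidentiality, and documented approval before any deletion by the CxO.

import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(source_dir: Path) -> dict[str, str]:
    """Map each relative file path to its checksum, captured before the transfer."""
    return {str(p.relative_to(source_dir)): sha256_of(p)
            for p in sorted(source_dir.rglob("*")) if p.is_file()}

def verify_transfer(manifest: dict[str, str], dest_dir: Path) -> list[str]:
    """Return a list of discrepancies (missing or altered files) after the transfer."""
    issues = []
    for rel_path, expected in manifest.items():
        target = dest_dir / rel_path
        if not target.is_file():
            issues.append(f"missing: {rel_path}")
        elif sha256_of(target) != expected:
            issues.append(f"checksum mismatch: {rel_path}")
    return issues

# Example usage (hypothetical locations):
# manifest = build_manifest(Path("/cxo/export/study_123"))
# problems = verify_transfer(manifest, Path("/sponsor/archive/study_123"))
# An empty list of problems supports, but does not by itself constitute, transfer verification.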
Data can be stored off-premise either in a commercial datacenter or by hosting in the cloud. For both of these
options, ISO/IEC 27001:2013 [58] applies. This standard provides the requirements for information security
management systems and addresses people, processes, and technology to ensure confidentiality, integrity, and
availability. It includes requirements for areas of governance, risk management, and compliance, such as:
• Physical security
• Network security
• Logical security
• Cyber security
It remains the responsibility of the marketing authorization holder to ensure controls are in place and adequate to
protect the integrity, security, and availability of their regulated data throughout the mandated retention period. Such
requirements should be defined in an SLA before the hosting service begins, and the ongoing performance of the
provider monitored.
If the hosting services are IaaS, it should be straightforward to retrieve any data when needed. Where SaaS is used,
it is important to consider:
• What data governance is needed to ensure data integrity, availability, security, confidentiality, and privacy?
• Can the data be read outside of the SaaS application, or will migration/conversion be required?
• Will the SaaS provider archive the data for the marketing authorization holder, and if so, how will any data
destruction be managed based on predicate rule requirements and any business reasons to extend the retention
period?
• If the SaaS provider will not be responsible for archiving the data, how does the marketing authorization holder
obtain the data from the SaaS provider?
• What will happen if the relationship with the SaaS provider breaks down?
• At the end of the contract, how will it be ensured that all data associated with the SaaS solution is completely
removed from all instances, with evidence of destruction from each instance provided (if applicable)?
10.5 Conclusion
There should be no resultant increase in data integrity risks arising from the use of third parties. Data governance
must be in place and effective at all levels, with data ownership and final responsibility for the data remaining with the
marketing authorization holder.
It is important to identify any risks associated with the use of third parties who are creating, processing, or storing
regulated data, and establish appropriate mitigation controls.
Appendix D1
Mapping: Laboratory System
This appendix contains an example of mapping a business process. This example is specific to a laboratory process;
however, it is presented to illustrate how a high-level business process provides the foundation for understanding the
detailed activities and user interactions.
In this example, a block diagram (Figure 11.1) shows the steps, a flowchart (Figure 11.2) is used to represent the
activities in the process, and from this a user-centric view is derived (Figure 11.3) identifying which users will interact
with the data.
Although this block diagram (Figure 11.1) depicts the various steps in the analysis of samples, it does not provide
enough detail to understand areas of risk. Additional detail is required to understand the process, for example:
• What system will be used for sample analysis – complex (NIR), simple (titration temporarily stores data), or
enterprise (HPLC within a CDS)?
Expanding the sample analysis process enables a more thorough understanding of the individual activities (Figure 11.2).
Mapping the user interaction within the process flowchart aids an understanding of the user roles needed for the
different computerized systems supporting the process. In this example, the laboratory users are primarily split into
basic users (run analyses and report results) and power users (able to create storage folders, analytical methods,
etc.). Outside of (and independent from) the laboratory structure, an administrative account is needed to manage the
systems.
From the detailed process flowchart in Figure 11.2, the detailed data flows can be generated, as discussed in
Section 4.3.
Appendix D2
Electronic Record Storage
12.1 Introduction
This appendix discusses instrument devices that have moved on from a simple display of the measured value to
become increasingly sophisticated, and that now temporarily store electronic records which must be managed and
reviewed. The appendix does not discuss devices limited to a display or printout capability, because these will be
managed as manually recorded data.
12.2 Background
Testing instrument devices such as filter integrity testers, particle counters, glucose concentration measurement
systems, balances, and pH meters have historically been considered simple systems. These systems would only
display data or display and temporarily store data. These systems have become more and more advanced over the
past few years. This trend is a good example of the technology enhancement continuum, as vendors add additional
features to make device use more convenient, but at the same time, the data integrity controls continuum must keep
up. These more functional devices should not be run in “simple” mode – where they are treated as their previous
models with only a display or a printout – as this does not address the electronic records generated.
The regulatory agencies are becoming increasingly concerned about the risks posed by such devices. In their 2018
guidance, the MHRA [12] stated:
“Where the basic electronic equipment does store electronic data permanently and only holds a certain volume
before overwriting; this data should be periodically reviewed and where necessary reconciled against paper
records and extracted as electronic data where this is supported by the equipment itself.”
Functions that enhance data integrity should be used where possible. Appropriate data integrity controls must be
considered and applied.
Many of these types of instruments lack the necessary data integrity capabilities; they therefore require additional
procedural controls and/or enhanced second person verification.
The use of an instrument may vary based on instrument type, business process, and data criticality. Below is an
example of a routine use approach. Critical thinking should be applied to determine which elements of this approach
are appropriate and what additional steps may be needed.
3. Log use in a paper logbook (attach, initial and date printout) or electronic logbook (include passed, failed, or
aborted tests).
Note: A unique identifier such as a filename (if unique) or batch number is needed to ensure traceability between
the data in the system, the logbook, and GxP record.
4. Transcribe result value(s) to the GxP record (e.g., laboratory worksheet or batch record).
• GxP record
• Logbook
• Original electronic data on the device – including the result and a review for data not accounted for in the
logbook
All passed, failed, and aborted testing must be accounted for and distinguishable from other valid data generated
during calibration, maintenance, and training activities, in order to detect any “orphan” test results that were not
reported. The typical way to account for the data at the point/time of use is with a logbook, creating a meaningful
chronological use record. These logbook entries can then be used to assist the second person review of the original
(electronic) data as needed.
This can be considered as a data review “triangle” to review specific results and detect orphan data. See Figure 12.1.
Below is an example approach to reviewing data. Critical thinking should be applied to determine which elements
of this approach are appropriate to a particular instrument and business process, and what additional controls may
be needed:
1. Identify original data, metadata, and orphan data (unaccounted for data). They may be included in an equipment
use log, test run log, or individual files temporarily stored on the device.
2. Review the logbook (or electronic equivalent) to ensure all test data/files are accounted for and reported, for
example, an equipment use log.
3. Compare the GxP record under review with the associated logbook(s) to ensure all relevant data is reported, i.e.,
passed, failed, and aborted tests.
4. If additional tests or files are identified that cannot be accounted for via the original data or logbook, initiate an
investigation.
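The reconciliation in steps 1 to 4 can be thought of as a set comparison between the three corners of the data review triangle: results stored on the device, logbook entries, and results reported in the GxP record. The sketch below is illustrative only; the identifiers and data structures are hypothetical, and in practice the comparison may be performed manually or by an electronic logbook system.

def reconcile(device_results: set[str],
              logbook_entries: set[str],
              gxp_record_results: set[str]) -> dict[str, set[str]]:
    """Compare the three corners of the data review triangle by unique identifier
    (e.g., filename or batch number) and return discrepancies for follow-up."""
    return {
        # Results on the device with no corresponding logbook entry ("orphan" data)
        "orphan_on_device": device_results - logbook_entries,
        # Logbook entries (passed, failed, or aborted) not addressed in the GxP record
        "not_in_gxp_record": logbook_entries - gxp_record_results,
        # Reported results with no supporting original data remaining on the device
        "unsupported_in_record": gxp_record_results - device_results,
    }

# Example: an aborted test is on the device and in the logbook but is not
# addressed in the GxP record, so it is flagged for review/investigation (step 4).
issues = reconcile(device_results={"FIT-001", "FIT-002", "FIT-003"},
                   logbook_entries={"FIT-001", "FIT-002", "FIT-003"},
                   gxp_record_results={"FIT-001", "FIT-002"})
assert issues["not_in_gxp_record"] == {"FIT-003"}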
12.5 Challenges
The process described above is labor intensive for one device. Where multiple devices are utilized within the
organization, it becomes very challenging for reasons such as:
• Locating mobile units (e.g., filter integrity testers) to perform reviews and manual backups
• Sheer volume of data to review and account for on a small screen in busy factories
• Storage of large amounts of data from manual backups having to be curated manually
This section discusses the benefits of implementing systems with stronger technical controls and better interface
options to improve and simplify the review process.
As with all large volumes of data, getting it into a database or secured flat-file format is the best starting point for
managing it compliantly. This is not always possible of course, but here are some possibilities to consider:
• Select secured Wi-Fi enabled devices to enable backups without having to physically locate the devices
• Select devices with browser-based management software if available. This can enable remote data review and
backup.
• Use electronic logbooks to create the chronological usage records for data reconciliation
The worst scenario is a device that is improved halfway, with enough features to require controls but not enough
electronic capability to manage the added complexity; it is therefore better to select devices that include connectivity,
configuration management, and the features found in complex devices.
When deliberating the replacement/upgrade of a device, consider the following capabilities for improved data integrity
controls:
• Temporary data storage: 20,000 data points, 250 analyses, system administrator access required for deletion
• Auto-read capable: limit the ability to select when to take the reading
The most costly but effective scenario is the use of middleware (i.e., software that enables communication and
data management distributed across various instruments) to connect an LES and an instrument, providing
instrument control and data management. When selecting vendors for middleware, assess them against the
following:
• Compliant QMS
• Number and types of systems it is able to control (e.g., balance, pH meter, conductivity meter, melting point,
UV/Vis spectrophotometers)
Depending on the age and severity of technical deficiencies in the device, it may not make sense to replace it
immediately with the latest, most connected version available. Connectivity will certainly enable more efficient
electronic data review and data backup processes, but subject to a documented and justified risk assessment, it may
still be possible to use such devices in a compliant manner with manual processes. Some of the necessary manual
processes could include:
• Backing up, either automated or manual, via direct USB cable or Ethernet connection to a networked location
Note 1: USB ports should be disabled by default on computers creating, processing, or storing GxP data. Under
controlled conditions, a single port may be opened for the duration of transfer of data from a secure and virus-
scanned USB flash drive (see also Note 2).
Note 2: USB flash drives should be used as temporary storage only, prior to transferring data to a networked location
that is backed up automatically to preserve the data. An SOP covering USB controls and data transfer and accounting
should be in place for this manual process. It is a best practice to document, at a minimum, each use of temporary
media to transfer data.
Inevitably there is some risk in remaining on existing but aging technology. Keeping track of these risks and
their mitigations in a risk register is a good way to revisit the device’s compliance during its lifecycle and make an
informed decision as to the optimal time to upgrade or replace. The risk register should be integrated with other QMS
processes to ensure regular review, either periodically or as a result of an incident.
12.8 Example Data Integrity Risks, Interim Controls, and Actions to Consider
Table 12.1 identifies typical risks and suggested mitigations for various instruments such as filter integrity testers,
particle counters, pH meters, balances, etc. The best mitigation may be the implementation of middleware, depending
on the type of testing (lot release versus buffer preparation), the number of systems, and the resulting increase in
data integrity. See Section 12.6 for additional information.
Table 12.1: Data Integrity Risks and Suggested Mitigations for Instrument Devices with Electronic Record
Storage
Risk: Lacks individual login (Attributable)
Interim control: Record identity of user in GxP record (e.g., ELN) and attach printout (initial and date) as available,
or document in a LIMS system.
Action: Replace with a more sophisticated instrument based on risk and availability.

Risk: Lacks test log/register (audit trail) or does not store result data (printout only) (Original/Accurate)
Interim control: Record results on GMP record or attach printout (initial and date). Add a printer if not currently
present where possible.
Action: Replace with a more sophisticated instrument based on risk and availability.

Risk: Backup is manual to USB (Original/Legible)
Interim control: Regular backups to USB per procedure; move from USB to a network location and manually log data
in the location.
Action: Replace with a more sophisticated instrument based on risk and availability (e.g., a Wi-Fi enabled device).

Risk: Large numbers of devices in an organization generating a significant amount of data on a regular basis,
requiring extensive effort for electronic data review and reconciliation
Interim control: Review and reconcile per Figure 12.1 Data Review Triangle, on each unit individually.
Action: Consider middleware or a networked solution over Wi-Fi with a database to enable metadata entry and
automated backups. LDAP integration to simplify password management for users. Perform data review remote
from the devices. This approach can also reduce the number of personnel entering graded areas.
12.9 Conclusion
Based on risk, it is important to evaluate the instrument devices to understand their capabilities and design the most
effective mitigation possible. This may involve procedural controls, integration via middleware, or even replacement
of instruments with new models offering more data integrity technical controls. Care should be taken when generating
user requirement or purchase specifications for instrument devices that generate electronic records to ensure that
new instruments have the best available controls for data integrity.
Appendix O1
Technical Solutions Supporting Data Integrity
13.1 Introduction
Increased use of technology has increased both the opportunity for and the visibility of data integrity incidents, often
caused by human factors, with issues being:
• Intentional: caused by lapses in personal integrity, such as sharing user names and passwords, or falsification of
documentation for more favorable results or to cover mistakes, etc.
This appendix focuses on utilizing technology to detect or ideally prevent data integrity issues, and offers thoughts
on ways to design and use systems to increase the visibility of data integrity incidents or concerns, and consequently
promote improved data integrity.
Historically, much of the design and review of systems for data integrity has been verified through the system’s
initial validation, followed by batch data review, manual review of audit trails as part of routine processes or
deviations, or by inspection (internal or regulatory). Human review of this vast amount of data is time-consuming
and far from exhaustive. An automated review, based on tacit knowledge of issues and the design of bots, is much
more comprehensive; however, it is reactive and may not discover willful data integrity breaches. A well-designed
system can provide a proactive look at critical steps in a system’s process to prevent issues.
There is one special area of concern for the prevention of data integrity issues: failing to permanently record the
initial value before reporting the issue and waiting for a correction. Failure to record each value permits unlimited
changes until a desired data value is obtained, which is a specific variation of testing into compliance. Enforced
saving of data at the time of entry provides mitigation for this.
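A minimal sketch of such an enforced-save control follows: every entered value is committed to an append-only history before any correction is accepted, so the initial value and all subsequent changes remain visible for review. The class and field names are hypothetical and illustrate the principle only; a real system would enforce this within its database and audit trail functionality.

from datetime import datetime, timezone

class EnforcedEntryField:
    """A data entry field that permanently records every value at the time of entry."""

    def __init__(self, field_name: str):
        self.field_name = field_name
        self._history = []  # append-only list of (timestamp, user, value, reason)

    def enter(self, user: str, value: str, reason: str = "initial entry") -> None:
        # The value is committed immediately; corrections never overwrite history.
        self._history.append((datetime.now(timezone.utc), user, value, reason))

    def current_value(self):
        return self._history[-1][2] if self._history else None

    def audit_trail(self):
        """Every value ever entered, including the initial one, with who/when/why."""
        return list(self._history)

# Example: the initial value remains on record after a correction.
field = EnforcedEntryField("assay_result")
field.enter("analyst1", "98.2")
field.enter("analyst1", "99.1", reason="transcription error corrected")
assert len(field.audit_trail()) == 2  # both the initial and corrected values are retained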
During system design or upgrades to legacy systems, a review of the process and workflows can identify key areas
where critical information is entered, so as to identify technical controls that can be embedded into the system’s
operation. Additionally, the means to control access to the software (either electronic or physical) should be
investigated to restrict the opportunity to manipulate data. Examples include:
• The time required for personnel to properly train on a procedure in a learning management system (Can “read
and understand” training on a 50-page procedure be completed in two minutes? Perhaps, by exception
(e.g., the procedure author documenting their own training), but a flag to the employee’s manager can ensure a
proper review.)
• Repeated consistent or identical results entered when variable data is being reported
These are simple examples, and this appendix investigates more detailed and practical applications and thought
processes to leverage technology to improve data integrity and minimize human factors on the data.
Before providing examples of business rules that can detect potential issues with data integrity, it is necessary to
understand the limitations of this approach, so it can be put into context with other data integrity efforts, such as
training and well-established quality-culture advancements. Some limitations include:
• Some business rule failures cannot be detected with current technology. For example, simple instruments (e.g.,
pH meters, balances) permit a person to remeasure a sample multiple times before forwarding a data value for
retention.
• Business rules must be automated to be practical. Hybrid processes are difficult, if not impossible, to implement
in an efficient manner.
• Every business rule violation must be investigated to assess its merit. Often, a suspect entry has a valid reason
behind it, but investigation is still necessary. This investment in time is critical to understand: it means that
business rule queries will reach a point of diminishing returns. This makes it imperative to monitor the use (and
effectiveness) of queries and stop using those that are not providing value in detecting “real” issues in the
organization or process. Not all rules are equal.
One additional factor to consider is the timing of application: business rules can be applied either for prevention or
for detection. As prevention, the business rule is applied at the time data is collected and stored in a permanent
medium. This provides feedback to the person performing the activity and permits immediate remedial action:
reprocess, re-collect, re-enter. Generally, prevention of an issue is preferable to detection after the fact.
As manufacturing and laboratories continue the drive to automate operations, operators/analysts and reviewers will
encounter a growing number of audit trails to enter or review as a routine part of work. If required to manually review
these audit records, firms will find that the review requires more time than performing the actual work under review.
Additionally, greater knowledge of processes and how they can be improperly manipulated leads to an increasing
need to look for data that is suspicious due to its content or the sequence of events that occurred during the conduct
of the activity. Again, it is easy to overwhelm reviewers with a huge number of files and records to be reviewed for
potential issues.
Increasing adoption of technology has created both these issues, and technology is also the solution. Developing
automated business rules for data integrity enables personnel to look for potential issues in a time-sensitive manner.
Automated searches through electronic data offer several advantages:
• Reduce the review to a small set of records (only records matching the criteria)
Business rules should be generated taking into account the business process mapping, data flow diagrams, and the
applicable regulations.
• Regulatory enforcement actions (Notice of Concern, Warning Letters, 483 Documents) can provide a wealth of
knowledge about the places and manners in which data can be improperly manipulated or deleted. For each
specific observation, add the question, “How would we prevent or detect that?” and a new business rule is
developed.
• CAPA and deviation reports can provide an opportunity to prevent recurrence of data integrity issues using
business rules to prevent or detect aberrant data. In addition, these reports are internal, so they can point to
behaviors within the organization that deserve special attention.
• Lifecycle status reviews: look at the statuses of materials as they move through the business process and from
system to system. By looking at combinations of these statuses, discrepancies can be identified. For instance,
an approved batch will have a status of “Approved” in the batch record application, and all the tests will be
“Released” in the LIMS. Any status other than these indicates something unusual. Status reviews provide
a simple, high-level technique for checking consistency in records (a minimal sketch of such a check follows
this list).
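The sketch below illustrates the lifecycle status review described in the last bullet above. The system names, expected terminal statuses, and data structures are hypothetical; in practice each system would be queried through its own validated reporting interface.

# Expected terminal statuses for an approved batch in each system (hypothetical values)
EXPECTED_TERMINAL_STATUS = {
    "batch_record_app": "Approved",
    "lims": "Released",
}

def status_discrepancies(batch_id: str, statuses: dict[str, str]) -> list[str]:
    """Compare reported statuses against the expected terminal state for each system."""
    findings = []
    for system, expected in EXPECTED_TERMINAL_STATUS.items():
        actual = statuses.get(system, "<not found>")
        if actual != expected:
            findings.append(
                f"Batch {batch_id}: {system} status is '{actual}', expected '{expected}'"
            )
    return findings

# Example: a test rolled back in the LIMS after batch approval is flagged for review.
print(status_discrepancies("B-1234", {"batch_record_app": "Approved", "lims": "In Progress"}))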
Business rules establish normal patterns for getting work done in an efficient manner. Ideally, they can be used
to point out unusual data values that merit further investigation by a qualified person who deeply understands the
business process and can determine the merit of the data. This assures that data used to make decisions has a high
probability of meeting the ALCOA+ attributes.
Because there are a wide variety of systems and equipment used in dedicated manufacturing and other GxP
processes, it is unreasonable to expect off-the-shelf business rule checks. The reality is that business rules must be
developed for specific processes, equipment, and systems. Something as simple as a different model of equipment
can change data integrity vulnerabilities. While there is opportunity for the reuse of rules, there is no substitute for
qualified people to create and manage business rules to ensure they meet the quality objectives of the firm.
- Security Roster: List of personnel, sorted by security role, permits review of the current access rights for
everyone. Special attention should be paid to roles with enhanced rights to administer accounts, review
data, modify calculations or data, or modify method/workflow parameters.
- Access Change History: While the access roster displays the current access state, this report provides
a list of changes in access rights over time and can identify changes made in the past that may have an
impact on reported data.
- Date of Last Access: A report of all users and their last date of access is valuable when removing users
from a system who no longer require system access. This report drives the right behavior: if someone no
longer requires access to the system, remove their access.
- Vendor Accounts: This report is a list of vendor (or non-employee) accounts. Ideally, it also includes the
dates of access into the system by any vendor. This report could be combined into the Security Roster and
Access Change History, but vendors represent a higher risk than users, and should be carefully reviewed.
- Elevated Accounts: This report lists elevated accounts and compares them to a list of approved elevated
accounts, along with inactivated accounts, to detect anyone who should not have elevated privileges or whose
access should be removed when they leave the area or the company. The approved elevated access should be
controlled procedurally to prevent anyone with a conflict of interest in the business area from getting that access.
- Important Changes to System: This report scans the system log and extracts changes to the system clock,
the recycle bin, plus any other system events that might be important to test.
- Recycle Bin: A list of files in the recycle bin permits a review of the files to determine if GxP data is being
discarded (Note: it is possible to delete data without it appearing in the recycle bin).
- Sample Statuses: Larger applications, such as ELN, LIMS, EBRS or Electronic Document Management
Systems (EDMS) often use statuses to manage the flow of data through a process. Two important reports
can look at statuses for potential issues: (1) List of items that are in statuses not supported by company
business practices. These are statuses present in a commercial system that are not needed for company
practices. These statuses are typically mentioned as unused in SOPs; (2) Statuses that are not “aligned”
across systems. For example, a released batch might have all tests “Released” in an ELN/LIMS, “Complete”
in the EBRS, and “Approved” in the EDMS. If QC laboratory personnel rolled back a test to make changes,
the test would not be “Released.” This discrepancy (statuses not all in their terminal state) can be detected
with this report. Similarly, flag any data that has not been reviewed appropriately prior to batch release (e.g.,
audit trail review). These must be resolved before releasing the batch.
- Audit Trail Change Reasons: There is tremendous value in a report that simply prints the date and time
and the change reason for all changes in an audit trail. Such a report provides the means to assess the
quality of entries in the audit trail.
- Output by Analyst: A report listing the amount of work performed per analyst permits reviewers to identify
performance that is “too good to be true” or a spike that would indicate shared account usage. However,
such reports should anonymize user names to avoid privacy issues.
- Control Chart by Analyst: Some systems permit the use of control charts by analyst to look for aberrant
trends, for example, data whose variability is significantly skewed, or significantly lower than that of other
analysts. Repeated use of a data set with the same time stamps or information could also be detected.
Again, user names should be anonymized in the report to avoid privacy issues.
• Chromatography: Many data integrity enforcement actions have resulted from improper manipulation or
exclusion of chromatographic data. Modern systems maintain audit trails that indicate changes to sample
injection sequences, integration of peaks, processing of injections, and calculation of reporting results. The
inherent flexibility and complexity of these systems requires a number of business rules to assure data integrity.
Some business rules of value include:
- Manual Integration of Peaks in any Injection: Auto-integration of peaks is the regulatory expectation,
but may not be a reality for legacy methods. Peaks manually integrated are expected to receive enhanced
review by qualified personnel, to assure that baselines are not created to bias the results.
- By Site and Method: A report of percentage of peaks auto-integrated versus manually integrated. This can
lead efforts to reduce manual integration, resulting in improvements in the consistency and efficiency of
results.
- Aborted Runs: Firms have used the Abort Run feature to prevent undesired data from being successfully
recorded so they are not required to justify it (or reject material). Consequently, any aborted runs must be
justified. In addition, it would be useful to report the number of aborted runs over a time period (e.g., month
or quarter) to ensure that the feature is not used to destroy unwanted data.
- Multiple Integrations of Data: In addition to the use of manual integration to bias results, some firms
manipulate auto-integration parameters to obtain the desired results. This can be monitored by tracking the
number of changes to the processing method over a time period. Another means to detect reprocessing is to
search data channels for multiple entries for the same channel ID.
- Unprocessed Injections: When analysts are permitted to view peaks in real time, they can often determine
if peaks will yield desired results before the injection is processed. Personnel may choose to ignore an
injection and reinject the solution to get a more favorable result. Failure to process and justify all injections is
a clear violation of GxP requirements for complete records of testing.
- Short Runs: Personnel who repeatedly inject samples to obtain favorable results can leave unprocessed
injections (above), or create additional sample sequences (with 1 to 2 samples), or reinject samples under
testing. Searching for sample sequences with 1 to 2 samples can detect this behavior.
- Missing Injections in Sample Sequence: Since each injection in a run has a fixed run time, look for gaps
between consecutive injections that exceed the length of a single injection time. Such a gap indicates an
injection that was deleted after the event.
• Benchtop Systems: Common data integrity gaps for stand-alone benchtop systems include copying of data files
to other sample identities, collecting data and not forward processing it (choosing “best” result) for batch review,
permitting everyone to adjust instrument parameters to “adjust” test outcome, and permitting users to delete test
results. Because many of these systems generate data in individual files, it may be difficult to determine if a test
result has been processed and forwarded for inclusion in the batch record.
- Runs Present in Batch Release: This is a comparison between files in LIMS/ELN, and the archive of the
benchtop system. This report would ordinarily be used to identify data files not forwarded for inclusion in the
batch record.
- Modified Files: Identifying files where the modified date is later than the creation date indicates data
updates after the original save. In some systems this will not be useful by design, but in systems where data
is settled before the data file is saved, this search can identify files that merit review. This is especially true in
an archive, where files should be protected against update.
- Deleted Files: If a PC is configured so files may be placed in a recycle bin but not deleted, it is possible to
identify deleted files and review the appropriateness of each deleted file.
• Enterprise Systems: These systems provide superior data integration and handling, enabling reviewers to
look for data issues that merit closer scrutiny.
- Time Sequence Anomalies: A report that identifies steps in workflow that are out of order based on date
and time stamps, permits a review for multiple uploads of data, changes after test completion, or clock
modifications.
- Time to Complete Process by Method and Analyst: Comparing average time to complete a method
across several analysts can yield valuable information about resourcing and can also identify individuals
who complete a process in significantly less time than other people in the work group.
- Multiple Uploads: External systems that upload multiple times to a method might indicate someone
attempting to pick a dataset giving a favorable result. A threshold of three uploads is recommended as a
starting point for this report.
The above items are examples only; critical thinking should be applied to determine the business rules most
applicable and appropriate for an organization or business process.
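To illustrate how such business rules might be automated, the following is a minimal sketch in Python (using pandas) that flags sample sequences containing only one or two injections, and gaps between consecutive injections that exceed the expected run time. The file name and column names are assumptions for illustration only and would need to be mapped to the export format of the actual chromatography data system.

```python
# Illustrative sketch only: flags two of the business rules above -- short sample
# sequences and gaps between injections -- from a hypothetical CSV export of the
# CDS injection log. The column names (sequence_id, injection_time, run_minutes)
# are assumptions, not the schema of any particular chromatography data system.
import pandas as pd

injections = pd.read_csv("injection_log.csv", parse_dates=["injection_time"])

# Rule: sample sequences containing only 1 or 2 injections may indicate trial injections
seq_sizes = injections.groupby("sequence_id").size()
short_sequences = seq_sizes[seq_sizes <= 2]

# Rule: a gap between consecutive injections that exceeds the expected run time
# (plus a tolerance) may indicate an injection deleted after the event
flagged_gaps = []
for seq_id, grp in injections.sort_values("injection_time").groupby("sequence_id"):
    gaps = grp["injection_time"].diff().dt.total_seconds() / 60.0
    expected = grp["run_minutes"].max() * 1.5  # tolerance factor is arbitrary, for illustration
    for ts, gap in zip(grp["injection_time"], gaps):
        if pd.notna(gap) and gap > expected:
            flagged_gaps.append((seq_id, ts, round(gap, 1)))

print("Sequences with 1-2 injections:", list(short_sequences.index))
print("Possible missing injections (sequence, time, gap in minutes):", flagged_gaps)
```

A similar approach can be taken for other rules listed above, for example comparing file modification and creation dates for stand-alone benchtop data files.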
A great potential exists in this area to adopt AI and ML algorithms to look for data patterns and trends that merit
further review. This area of exploration is in its infancy within GxP areas, but is expected to grow rapidly in the coming
years. Algorithms have the potential to identify sites with differing error rates, outputs, and variability. This will provide
sites with the ability to ask questions and gain insights into their processes. Experience has shown that an initial
“danger” of these technologies is their ability to create questions faster than they can be processed and answered by
the organization. AI and ML are further covered in Appendix S1.
13.2.6 Governance
Senior management want metrics to measure the organization’s progress in detecting data integrity issues. It should
be noted that these reports will cause a few individuals to modify their practices to avoid detection. There are metrics
that can provide insights into uptake and efficiency:
• By Site and by Report: Number of times a report is executed, and the number of records in the report
• By Site: Number of investigations conducted based on reports, time invested in investigations, and number of
confirmed issues detected
These metrics, along with training, provide basic governance and allow leaders to determine which reports are
effective in detecting issues in the organization and use the knowledge to drive continual improvement. The number
of report executions measures the uptake of the reports into each site, especially in the first months of program
implementation.
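As a simple illustration of how such metrics might be derived, the following sketch (Python with pandas) aggregates a hypothetical log of report executions by site and by report; the file name and column names are assumptions, not a prescribed log format.

```python
# Illustrative sketch only: derives the governance metrics described above from a
# hypothetical execution log of the data integrity reports. The column names
# (site, report_name, records_returned, investigation_opened, issue_confirmed)
# are assumptions for illustration.
import pandas as pd

log = pd.read_csv("report_execution_log.csv")

# By site and by report: number of executions and number of records returned
uptake = (log.groupby(["site", "report_name"])
             .agg(executions=("report_name", "size"),
                  records=("records_returned", "sum")))

# By site: investigations opened and confirmed issues
outcomes = (log.groupby("site")
               .agg(investigations=("investigation_opened", "sum"),
                    confirmed_issues=("issue_confirmed", "sum")))

print(uptake)
print(outcomes)
```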
Some vendor programs allow analysts read/write/delete access to the desktop and folders where data is stored
because the system does not allow certain critical functions to be separated from user access roles. This creates
a serious problem with ensuring data integrity because the user can potentially edit or delete data at will and test
into compliance. They can also potentially download and run software that can cause problems with the data or the
system. Having a system with these gaps necessitates locking down the computer to prevent any potential problems
with data integrity.
Alternate Operating System Shells (“shells”) are software applications that provide additional configuration
capabilities to ensure only authorized people can access the computer and to restrict what the user can do. They do
not fundamentally change the operating system in use but add missing capabilities to give greater control over files
and data. These can be used to restrict users to only the tasks related to their work and to keep them from deleting or
moving the original data until it can be swept into the archive.
Shells may require different settings for different configurations of software. The configuration is not one-size-fits-all
and may need adjustment to get different vendor software to run appropriately. When developing the procedure for
installation, some flexibility needs to be incorporated into the settings to allow adjustments without compromising the
lockdown that needs to be accomplished.
Lockdown Features
The shell software can log off users due to inactivity and block virtually any application or any part of it: a window,
popup message, or dialog box. Applications can be set to run as Administrator, should they need Administrator
access, without giving that level of access to the user.
• Disable taskbar, desktop, clipboard, control panel, safe mode, drag and drop, and many others
• Restrict the browser to allow access to trusted sites only and block all others
• Use a selective start menu to show only items to which the user has been given access
• Prevent users from using downloaders or installing software, and block unwanted programs
• Hide system and network drives and block access to USB drives, DVDs, and CD burners
The software can monitor changes to the system and write these changes to the log file.
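As an illustration only, the following minimal sketch (Python, assuming a Windows environment) shows how a periodic verification of lockdown settings might be scripted. The registry values checked are standard Windows policy settings chosen purely for illustration; they are not the configuration mechanism of any particular shell product, and the set of checks would need to reflect the lockdown actually applied.

```python
# Illustrative sketch only (assumes Windows): checks a few common lockdown-related
# policy settings so that a periodic review can confirm the workstation remains
# restricted. These registry values are standard Windows policies used here purely
# for illustration; they are not the configuration of any specific shell product.
import winreg

CHECKS = [
    # (hive, key path, value name, expected value, description)
    (winreg.HKEY_CURRENT_USER,
     r"Software\Microsoft\Windows\CurrentVersion\Policies\System",
     "DisableTaskMgr", 1, "Task Manager disabled"),
    (winreg.HKEY_CURRENT_USER,
     r"Software\Microsoft\Windows\CurrentVersion\Policies\Explorer",
     "NoControlPanel", 1, "Control Panel hidden"),
    (winreg.HKEY_LOCAL_MACHINE,
     r"SYSTEM\CurrentControlSet\Services\USBSTOR",
     "Start", 4, "USB mass storage disabled"),
]

for hive, path, name, expected, description in CHECKS:
    try:
        with winreg.OpenKey(hive, path) as key:
            value, _ = winreg.QueryValueEx(key, name)
        status = "OK" if value == expected else f"NON-COMPLIANT (value={value})"
    except OSError:
        status = "NOT SET"
    print(f"{description}: {status}")
```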
Appendix O2
General Archiving
This appendix discusses good archiving practice, which starts with data that is of high quality and fit for its
intended use.
The archive is intended as a long-term storage solution (years). It is important to remember that archiving fulfills
a different need from backup, which is a duplicate copy of digital data intended to assist with disaster recovery.
Maintaining backup copies of data is not a viable substitute for archiving.
Data must be recorded in a durable, maintainable form for the retention period. The archive can only maintain the
level of integrity of the incoming data; it cannot remediate existing integrity issues.
Data should only be retained where there is a reason to retain it.
Archive only those records that are required by predicate rules and legislation, by business practices (including legal
considerations), or by procedures. Determine whether there is a need to keep all records or whether those that do not
fall under these categories can be safely discarded. Decisions should be documented and justified.
The archive is intended for long-term storage of key information. Draft documents are usually either superseded
by issued versions of the same document or never issued and therefore should not be archived. Draft documents
should only be archived if the business process requires the retention of drafts. If it is important to understand why a
document was never formally issued, the decision and reasons for this can be captured in a separate document that
is archived.
Perform the archiving of data at established intervals based on the system, user requirements, and regulatory
requirements. Often, data may be archived once the likelihood of accessing that record has reduced to a given
frequency, say once a year. Alternatively, archive data at a well-defined point in the workflow, e.g., at the end of final
approval of a GLP study or manufacturing batch.
When designing an electronic archive, consider that the required speed of response for different system parts may
not be the same. For example, a higher speed of response generally will be required for the data management
functions, such as Search and Report, compared with the retrieval of a specific Archive Information Package.
Individual computerized systems need to provide a mechanism to archive complete and accurate records, including
relevant metadata. Controls need to be in place to retrieve and read the data (including metadata) during the
retention period.
It is not uncommon to find a record in several places. When archiving such a record, locate all copies of it, check they
are consistent and archive only one record while deleting the others (following the due process for deletion). Add a
link from the location of the deleted copies to the retained record.
When archiving a newer version of a document, it is considered essential to record the reason for the update and
what is being replaced, augmented, or deleted. This also applies where a new document replaces one or several
existing documents. Note that all previous versions are still retained in the archive for auditing purposes. It is also
critical that the superseded document record provides a link to the updated document to ensure that the complete
records and history can be obtained.
A directory structure should be defined where metadata is saved in the same directory as the parent data. This will
facilitate maintenance and linking of metadata with parent data. Directory and filenames should be given careful
consideration. Some archiving software truncates filenames. Other software changes file metadata, such as the
timestamps. The path name alone should not be relied upon for identifying the record. Path names may change, and
are usually not sufficiently robust for record identification and location.
Avoid archiving compressed data files. These introduce another layer of conversion and the potential for data
corruption, as well as bringing the issue of continued availability of the decompression tool and an operating system
on which to run it. In some instances, this may result in the inability to fully restore the original data, that is, there is
loss of resolution or metadata.
Use procedures to preserve the integrity and identity of data when archiving from one set of media, such as files
on a hard drive, to another. Consider any potential loss of resolution when migrating images to formats such as JPG;
consider using TIFF instead. Hyperlinks should be avoided where the URL could change without the knowledge of
the Archivist.
Procedures should confirm that archived data, including relevant metadata, is available and human readable. The
archive needs to protect records from deliberate or inadvertent loss, damage, and/or alteration for the retention
period. Security controls should be in place to ensure the data integrity of the record throughout the retention period,
and validated where appropriate.
Verify that record integrity is being maintained, for example, by conducting tests as part of scheduled maintenance or
by regularly reviewing audit trails. Make sure there is a test environment for testing software and hardware changes
without the risk of corrupting data which has already been archived.
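One way to implement such an integrity verification is a checksum manifest that is generated when records enter the archive and re-verified during scheduled maintenance. The following is a minimal sketch in Python; the archive location and manifest format are assumptions for illustration only.

```python
# Illustrative sketch only: generates and verifies a SHA-256 checksum manifest for
# an archive directory, as one way of confirming during scheduled maintenance that
# archived records (and their co-located metadata files) have not been altered.
# The manifest location and layout are assumptions for illustration.
import hashlib
import json
from pathlib import Path

ARCHIVE_ROOT = Path("archive")
MANIFEST = ARCHIVE_ROOT / "manifest.json"

def sha256(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest() -> None:
    # Record a checksum for every file in the archive except the manifest itself
    entries = {str(p.relative_to(ARCHIVE_ROOT)): sha256(p)
               for p in ARCHIVE_ROOT.rglob("*") if p.is_file() and p != MANIFEST}
    MANIFEST.write_text(json.dumps(entries, indent=2))

def verify_manifest() -> list:
    # Report any archived file that is missing or whose checksum no longer matches
    expected = json.loads(MANIFEST.read_text())
    problems = []
    for rel_path, checksum in expected.items():
        target = ARCHIVE_ROOT / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif sha256(target) != checksum:
            problems.append((rel_path, "checksum mismatch"))
    return problems

if __name__ == "__main__":
    problems = verify_manifest()
    print(problems or "All archived files verified against the manifest.")
```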
Make sure that there is only one source of control over the archived data, using a small number of well-defined indices
and procedures. There may be several archivists, but they should all be using the same methods and procedures.
Keep the quality role separate from the archivist role. Quality should be seen to be independent, particularly during an
audit. Ensure that quality is empowered to report directly to management.
Collect relevant metrics from the operation of the archive so that its performance can be substantiated and monitored.
The metrics also will enable future requirements to be better predicted and planned. Sufficient funding should be
allocated for the ongoing maintenance of the archive.
If the archive is contracted to a third party, ensure contracts specify responsibilities to ensure the data is secure and
retrievable throughout the record retention period, even if the contract ends or the third party can no longer support
the contract.
Handle operational difficulties through a formal event and problem reporting system. Hold periodic meetings with the
key archive stakeholders to review archive procedures, performance, migration requirements, future requirements,
etc. Information Systems (IS)/IT are important stakeholders in the archival process and it is important to include IS/IT
in any relevant communications and meetings.
Ensure that IS/IT understands the archiving needs clearly so that the infrastructure is appropriate and can safely
support the archive. Ensure that backup and disaster recovery activities are appropriate to the system and media
used. Relying on a single copy of archive data is a significant data integrity risk.
The archive should be managed in alignment with the data lifecycle. This includes the destruction of all archive
copies, including backups, when the records reach the end of their retention period unless required to be retained for
legal reasons.
Consider to what level data must be deleted for it to be considered fully destroyed. Many deletion processes simply
remove the reference to the data without actually removing the data itself. The audit trail is part of the record and
should be destroyed with it. It is important to clearly specify procedures for the destruction of data. Eventually,
archive systems will need to be retired, and the migration of the archived data is critical. An archive that is allowed to
outlive its economic life will become expensive to maintain and eventually non-maintainable. Retirement planning
should be part of the design.
Appendix O3
Considerations
This appendix builds on Appendix O2 and adds the more detailed requirements needed for GLP.
Most GLP regulations (e.g., 21 CFR 58 [59], OECD Good Laboratory Practice and Compliance Monitoring No. 15
[60], etc.) have a specific requirement for an archive with an associated position of Archivist. The Archivist
controls and oversees the archive and the input and removal of data from the archive. An Archivist can have deputies
if required.
The principles described here can be the basis of records management for other GxP disciplines.
The basic requirements for the GLP archive for both physical (paper, pathology slides, specimens, etc.) and
electronic records are the same:
• The location of the archive must be stated in the GLP study report (Note: For an electronic record, the location may
be a UNC path or URL hyperlink to electronic storage).
• Access restricted to authorized individuals. Usually there is a paper log recording who enters and leaves the
archive and when. Visitors must be escorted and this recorded in the log.
• To ensure that physical records are protected and stored in optimum conditions, the environment must be
monitored and the records protected from hazards (e.g., weather, water pipes, fire, pests, etc.)
• There must be an index of archived studies and supporting records (e.g., training records, organization charts,
etc.)
• The index is updated as studies enter the archive or as studies are destroyed at the end of the applicable
retention period.
• There must be a record of when study data was taken out and by whom, and when the package was returned
and by whom. The records should be reconciled to determine whether any records are missing or have been added.
• Archived records must be readable over the record retention period. This remains a sponsor responsibility, as data
ownership should make clear, even if the archive is outsourced.
• An archive can be contracted to a third party provided there is an agreement detailing the roles and
responsibilities of both parties and the right for the sponsor to audit and allow GLP inspections.
• Location of the electronic archive must be included in each GLP study report.
An electronic archive is allowable under OECD GLP No. 15 [60] as one of three options:
• Off-line media storage (in which case, the media must be stored in the GLP archive)
For electronic records there are additional requirements for the archive:
• The archive can be outsourced, but the test facility management is responsible for archiving and a facility outside
of the GLP monitoring program must have a QMS. Test facility management must evaluate the QMS and risks for
SaaS and any e-archiving service. The need for an audit should be risk based.
• IT staff involved with operating or supporting a GLP electronic archive must be trained in GLP regulations as they
impact their work.
• An electronic archive should be separate and secure; in practice, study records are explicitly marked as archived
and locked so that they cannot be changed. Studies must be under the control of the archivist.
• A single global instance is acceptable as an archive as long as relevant SOPs mention this.
• The archiving process must not change electronic data, i.e., the content and meaning is maintained for all data.
Dynamic data should remain dynamic as long as is feasible, to preserve the GxP content and meaning.
• Keeping the data in the original system is the better option, as the application database will ensure the data
structures, data, and metadata of each study are maintained. This is acceptable if the study records can be locked
and cannot be changed by normal users.
Locked study records can be viewed by any user in an application and this does not count as accessing the archive.
Appendix O4
Periods and Requirements
This appendix gives examples of retention periods from various regulations around the world.
Note that this Guide mainly focuses on relevant GxP regulations. It is important to keep in mind that non-GxP
regulations can also impact the retention of records.
The ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts Appendix 5 [16] explains and defines the
specific data terminology used in the different regulations, e.g., raw data, source data, etc.
Table 16.1 summarizes extracts from the key regulations and directives listed that specify requirements for how data
is archived and for how long. It is not intended to be a complete reference but provides some additional information to
support the statements made within the body of the document. A regulated company should determine for itself which
regulations and corresponding retention periods apply.
Table 16.1: Extracts from Regulations and Directives Specifying Retention Requirements and Periods
1. 21 CFR Part 11—Electronic Records; Electronic Signatures (US) [10]
2. 21 CFR Part 58—Good Laboratory Practice for Nonclinical Laboratory Studies (US) [59]
3. 21 CFR Part 211—Current Good Manufacturing Practice for Finished Pharmaceuticals (US) [29]
4. 21 CFR Part 606—Current Good Manufacturing Practice for Blood and Blood Components (US) [61]
5. 21 CFR Part 820—Quality System Regulation (US) [62]
6. Commission Directive 2003/94/EC of 8 October 2003 laying down the principles and guidelines of good
manufacturing practice in respect of medicinal products for human use and investigational medicinal products for
human use (EU) [63]
7. Directive 2001/83/EC of the European Parliament and of the Council of 6 November 2001 on the Community
Code Relating to Medicinal Products for Human Use (EU) [64]
8. EudraLex Volume 4 EU Guidelines for Good Manufacturing Practice [65] and PIC/S PE 009-14 Guide to Good
Manufacturing Practice for Medicinal Products [66]
9. Guidelines on Good Distribution Practice of Medicinal Products for Human Use – 2013/C 343/01 (EU) [67]
10. ICH E6 (R2) Guideline for Good Clinical Practice (tripartite guideline EMA/CHMP/ICH) [45]
11. EU No. 536/2014 on Clinical Trials on Medicinal Products for Human Use [68]
Ref: 1, 21 CFR Part 11 [10], Part 11.10(c): Protection of records to enable their accurate and ready retrieval throughout the records retention period.

Ref: 2, 21 CFR Part 58 [59], 58.190(b): There shall be archives for orderly storage and expedient retrieval of all raw data, documentation, protocols, specimens, and interim and final reports. Conditions of storage shall minimize deterioration of the documents or specimens in accordance with the requirements for the time period of their retention and the nature of the documents or specimens. A testing facility may contract with commercial archives to provide a repository for all material to be retained. Raw data and specimens may be retained elsewhere provided that the archives have specific reference to those other locations.
Ref: 3, 21 CFR Part 211 [29], 211.180(c): All records required under this part, or copies of such records, shall be readily available for authorized inspection during the retention period at the establishment where the activities described in such records occurred. These records or copies thereof shall be subject to photocopying or other means of reproduction as part of such inspection. Records that can be immediately retrieved from another location by computer or other electronic means shall be considered as meeting the requirements of this paragraph.

Ref: 3, 21 CFR Part 211 [29], 211.180(d): Records required under this part may be retained either as original records or as true copies such as photocopies, microfilm, microfiche, or other accurate reproductions of the original records. Where reduction techniques, such as microfilming, are used, suitable reader and photocopying equipment shall be readily available.

Ref: 4, 21 CFR 606 [61], 606.160(d): Records shall be retained for such interval beyond the expiration date for the blood or blood component as necessary to facilitate the reporting of any unfavorable clinical reactions. You must retain individual product records no less than 10 years after the records of processing are completed or 6 months after the latest expiration date for the individual product, whichever is the later date. When there is no expiration date, records shall be retained indefinitely.
Ref: 5, 21 CFR Part 820 [62], 820.180(b): Record retention period. All records required by this part shall be retained for a period of time equivalent to the design and expected life of the device, but in no case less than 2 years from the date of release for commercial distribution by the manufacturer.

Ref: 6, Commission Directive 2003/94/EC [63], Article 9.1: For a medicinal product, the batch documentation shall be retained for at least one year after the expiry date of the batches to which it relates or at least five years after the certification referred to in Article 51(3) of Directive 2001/83/EC, whichever is the longer period.

Ref: 7, Directive 2001/83/EC [64], Title IV Article 51(3): In all cases and particularly where the medicinal products are released for sale, the qualified person must certify in a register or equivalent document provided for that purpose, that each production batch satisfies the provisions of this Article; the said register or equivalent document must be kept up to date as operations are carried out and must remain at the disposal of the agents of the competent authority for the period specified in the provisions of the Member State concerned and in any event for at least five years.

Ref: 7, Directive 2001/83/EC [64], Title VII Article 80(f): Holders of the distribution authorization must fulfil the following minimum requirements: they must keep the records referred to under (e) available to the competent authorities, for inspection purposes, for a period of five years.
Ref: 8, EudraLex Volume 4 [65] / PIC/S PE 009-14 (Part 1) [66], Chapter 4, 4.12: For other types of documentation, the retention period will depend on the business activity which the documentation supports. Critical documentation, including raw data (for example relating to validation or stability), which supports information in the Marketing Authorisation should be retained whilst the authorization remains in force. It may be considered acceptable to retire certain documentation (e.g., raw data supporting validation reports or stability reports) where the data has been superseded by a full set of new data. Justification for this should be documented and should take into account the requirements for retention of batch documentation; for example, in the case of process validation data, the accompanying raw data should be retained for a period at least as long as the records for all batches whose release has been supported on the basis of that validation exercise.
Ref: 8, EudraLex Volume 4 [65] / PIC/S PE 009-14 (Annexes) [66], Annex 2 (Biological substances and products), 28: Where human cell or tissue donors are used full traceability is required from starting and raw materials, including all substances coming into contact with the cells or tissues through to confirmation of the receipt of the products at the point of use whilst maintaining the privacy of individuals and confidentiality of health related information. Traceability records must be retained for 30 years after the expiry date of the medicinal product.

Ref: 8, EudraLex Volume 4 [65] / PIC/S PE 009-14 (Annexes) [66], Annex 11, Section 17: Data may be archived. This data should be checked for accessibility, readability and integrity. If relevant changes are to be made to the system (e.g. computer equipment or programs), then the ability to retrieve the data should be ensured and tested.
Ref: 8, EudraLex Volume 4 [65] / PIC/S PE 009-14 (Annexes) [66], Annex 14 (Blood and plasma), 4.3:
(EudraLex) Data needed for full traceability must be stored for at least 30 years, according to Article 4 of Directive 2005/61/EC and Article 14 of Directive 2002/98/EC.
(PIC/S) Data needed for full traceability must be stored according to national legislation. For EU/EEA this is for at least 30 years according to Article 4 of Directive 2005/61/EC and Article 14 of Directive 2002/98/EC.

Ref: 9, 2013/C 343/01 [67], 3.3.1 (Computerized systems): Data should only be entered into the computerised system or amended by persons authorised to do so.

Ref: 9, 2013/C 343/01 [67], 4.2 (General): Documents should be retained for the period stated in national legislation but at least five years. Personal data should be deleted or anonymised as soon as their storage is no longer necessary for the purpose of distribution activities.

Ref: 10, ICH E6 (R2) [45], 3.4: The IRB/IEC should retain all relevant records (e.g., written procedures, membership lists, lists of occupations/affiliations of members, submitted documents, minutes of meetings, and correspondence) for a period of at least 3 years after completion of the trial and make them available upon request from the regulatory authority(ies).
Ref: 10, ICH E6 (R2) [45], 5.5.6: The sponsor, or other owners of the data, should retain all of the sponsor-specific essential documents pertaining to the trial.

Ref: 10, ICH E6 (R2) [45], 8.1 Addendum: The sponsor should ensure that the investigator has control of and continuous access to the CRF data reported to the sponsor. The sponsor should not have exclusive control of those data.

Ref: 11, EU No. 536/2014 [68], Article 58: Unless other Union law requires archiving for a longer period, the sponsor and the investigator shall archive the content of the clinical trial master file for at least 25 years after the end of the clinical trial.
Table 16.2 lists extracts from non-regulatory guidances that propose requirements for how data is archived and for
how long.
12. PIC/S Guide PI 011-3 Good Practices for Computerised Systems in Regulated “GxP” Environments [69]
13. OECD Series on Principles of Good Laboratory Practice and Compliance Monitoring No. 1 OECD Principles on
Good Laboratory Practice (as revised in 1997) ENV/MC/CHEM(98)17 [70]
14. OECD Series on Principles of Good Laboratory Practice and Compliance Monitoring No. 17 Application of GLP
Principles to Computerised Systems (2016) ENV/JM/MONO(2016)13 [71]
Ref: 12, PIC/S Guide PI 011-3 [69], 14.4 Validation Strategies and Priorities: GxP compliance evidence is essential for the following aspects and activities related to computerized systems:
• Data input (capture and integrity), data filing, data-processing, networks, process control and monitoring, electronic records, archiving, retrieval, printing, access, change management, audit trails, and decisions associated with any automated GxP related activity.
• In this context, examples of GxP related activities might include: regulatory submissions, R&D, clinical trials, procurement, dispensing/weighing, manufacturing, assembly, testing, quality control, quality assurance, inventory control, storage and distribution, training, calibration, maintenance, contracts/technical agreements and associated records and reports.

Ref: 12, PIC/S Guide PI 011-3 [69], 21.1 ERES: EC Directive 91/356 sets out the legal requirements for EU GMP. The GMP obligations include a requirement to maintain a system of documentation...The main requirements here being that the regulated user has validated the system by proving that the system is able to store the data for the required time, that the data is made readily available in legible form and that the data is protected against loss or damage.
Ref: 12, PIC/S Guide PI 011-3 [69], 21.10 ERES: Issues to consider where electronic records are used to retain GxP data:
• Documentary evidence of compliance exists
• Archiving procedures are provided and records of use exist
• Procedures exist to ensure accuracy, reliability and consistency in accordance with the validation exercise reported for the electronic record system
• System controls and detection measures (supported by procedures) exist to enable the identification, quarantining and reporting of invalid or altered records
• Procedures exist to enable the retrieval of records throughout the retention period
• The ability exists to generate accurate and complete copies of records in both human readable and electronic form
• Access to records is limited to authorised individuals
• Secure, computer-generated, time-stamped audit trails to independently record GxP related actions following access to the system are used

Ref: 13, OECD ENV/MC/CHEM(98)17 [70], 1.1 Test Facility Organization and Personnel: At a minimum it [the test facility organization and personnel] should:
l) Ensure that an individual is identified as responsible for the management of the archive(s).

Ref: 13, OECD ENV/MC/CHEM(98)17 [70], 3.4 Archive Facilities: Archive facilities should be provided for the secure storage and retrieval of study plans, raw data, final reports, samples of test items and specimens. Archive design and archive conditions should protect contents from untimely deterioration.
Ref: 13, OECD ENV/MC/CHEM(98)17 [70], 10.1 Storage and Retention of Records and Materials: The following should be retained in the archives for the period specified by the appropriate authorities:
a) The study plan, raw data, samples of test and reference items, specimens, and the final report of each study;
b) Records of all inspections performed by the Quality Assurance Program, as well as master schedules;
c) Records of qualifications, training, experience, and job descriptions of personnel;
d) Records and reports of the maintenance and calibration of apparatus;
e) Validation documentation for computerized systems;
f) The historical file of all Standard Operating Procedures;
g) Environmental monitoring records.

Ref: 13, OECD ENV/MC/CHEM(98)17 [70], 10.2 Storage and Retention of Records and Materials: Material retained in the archives should be indexed so as to facilitate orderly storage and retrieval.

Ref: 13, OECD ENV/MC/CHEM(98)17 [70], 10.3 Storage and Retention of Records and Materials: Only personnel authorized by management should have access to the archives. Movement of material in and out of the archives should be properly recorded.

Ref: 13, OECD ENV/MC/CHEM(98)17 [70], 10.4 Storage and Retention of Records and Materials: If a test facility or an archive contracting facility goes out of business and has no legal successor, the archive should be transferred to the archives of the sponsor(s) of the study(s).

Ref: 14, OECD ENV/JM/MONO(2016)13 [71], 3.2 Storage of data, Article 73: When data (raw data, derived data or metadata) are stored electronically, requirements for back-up and archiving purposes should be defined. Back-up of all relevant data should be carried out to allow recovery following failure which compromises the integrity of the system.

Ref: 14, OECD ENV/JM/MONO(2016)13 [71], 3.2 Storage of data, Article 74: Stored data should be secured by both physical and electronic means against loss, damage and/or alteration. Stored data should be verified for restorability, accessibility, readability and accuracy. Verification procedures of stored data should be risk based. Access to stored data should be ensured throughout the retention period.
Ref: 14, OECD ENV/JM/MONO(2016)13 [71], 3.2 Storage of data, Article 77: Regarding procedures, the test facility management should describe how electronic records are stored, how record integrity is protected and how readability of records is maintained. For any GLP-relevant time period, this includes, but may not be limited to:

Ref: 14, OECD ENV/JM/MONO(2016)13 [71], 3.11 Archiving, Article 110: Any GLP-relevant data may be archived electronically. The GLP Principles for archiving must be applied consistently to electronic and non-electronic data. It is therefore important that electronic data is stored with the same levels of access control, indexing and expedient “retrieval” as non-electronic data.
Appendix O5
Software
17.1 Introduction
Legacy software may be needed to read archived dynamic data. This appendix looks at the different approaches
currently available to run legacy software after system retirement. Apply critical thinking to identify the most
appropriate, robust, and long-term solution to readability of the archived data.
Companies lacking a retention strategy and/or an adequate system retirement process may simply opt to leave
the retired system connected and available as a means to read data. This is a “do nothing” approach and is not
viable long term as any hardware failure or software corruption is likely to render the system unusable and the data
unreadable.
Should the system be running an unsupported operating system and be connected to the company network, this
introduces a serious vulnerability to cyberattacks. PI 041-1 (Draft 3) Good Practices for Data Management and
Integrity in Regulated GMP/GDP Environments [24] specifically recommends any outdated systems should be
isolated from the company network for this reason, for example, by the introduction of an additional firewall.
The latest version of some operating systems may support the installation of legacy software. There will still be
challenges in keeping older software running as updates are installed, since incremental changes over time can
interfere as much as a major upgrade.
Consider that the system’s validated state was maintained with the previous OS and therefore formal documented
testing should be conducted in the new environment to ensure that it is operational and suitable for its intended use
should the need arise to review the data during an inspection.
Documented periodic checks need to be conducted to ensure that OS updates have not caused problems with the
opening and viewing of records; a test data set should be retained to enable a quick check that the software still
opens the records and remains suitable for its intended use. The periodic review of the data needs to be clearly
defined in a procedure, along with the expected interval and the documentation of the outcome that would be generated.
Some applications may require the original, compatible operating system to work properly. A practical solution is to
have a complete Virtual Machine (VM) image containing the application software and any supporting software needed,
all running on the compatible operating system version. This can be achieved by virtualizing the physical system before
it is retired, or by re-installing the application and supporting software into a new VM when required. If the vendor
software and compatible OS are archived together it will make the VM environment setup simple, and the image
can be loaded and run when needed to view archived data. A VM solution may have a longer viable lifespan than a
hardware museum (see Section 17.5) but there will still be limits to which OS can be prolonged this way. One particular
risk with running a legacy OS in a VM is that the unsupported OS is neither patched nor patchable against security
vulnerabilities, and therefore the VM is highly susceptible to malware attacks if directly connected to the main network.
Cooperation may be required between IT and the process or data owner to manage and maintain the VM ongoing.
This could include, but is not limited to, system access controls, security, periodic review, disaster recovery
processes, etc.
A hardware museum is a planned, structured approach to storing legacy hardware to run its associated software and
view records. This is the most difficult of the options and is relatively uncommon. It is a last resort where there is no
other alternative to maintain readability for dynamic data, and the risks to patient safety and product quality resulting
from converting to static data are unacceptable. It is particularly impractical for enterprise systems, and even for a
stand-alone system there is no guarantee that the legacy hardware will operate when required as old components
may fail with no replacements available. A hardware museum only has a limited viable lifespan and should only be
used as a short- to medium-term solution; there should be a clear plan that addresses record retention beyond this.
The physical storage space required also poses challenges and is compounded in the rare case of the instrument
needing to be retained along with the computer for the software to run properly.
One location for such storage would be in the archive with the paper records because those storage conditions
would also be ideal to preserve the hardware and software from degradation. Another potential location would be
in the server room as those conditions would also be good for preserving the hardware, but the associated vendor
software disks or OS disks should be separated and stored elsewhere for disaster recovery purposes. In any case,
the hardware should not be kept running because that would wear it out prematurely.
17.6 Conclusion
There is no perfect solution to ongoing operability for legacy software. At some point, as discussed in Section 7.2,
maintaining the readability of the old dynamic records may become unfeasible and conversion to static format is the
only remaining option.
Note: A hardware museum is also known as mothballing, a time capsule, or a computer museum.
Appendix S1
Machine Learning
18.1 Introduction
Artificial Intelligence (AI) is a broad field of study within computer science that includes technologies such as Machine
Learning (ML), as well as deep and continuous learning. This appendix examines the area of ML and the importance
and implications of data and data integrity for what “machines” are able to process and learn from the data made
available to them.
18.2 Background
ML is a method of data analysis that builds and automates mathematical models (i.e., algorithms) based on data in
order to make predictions or decisions.
As previously stated, it is an area or branch of AI based on the idea that systems can learn from data, identify
patterns and make decisions with minimal human intervention and/or explicit programming.
In order to understand and properly appreciate the inherent importance of data to ML, and its integrity, we must first
understand and grasp the use of data within ML. For a “machine” or software system to “learn” there must be data
available to train the system (e.g., algorithms). But that is not the only reason for which data is used within such an
intelligent machine.
In general, there are four primary success factors to any good ML effort; these include:
• Data
• Algorithms
• Computations
• Predictions
As data integrity is the focus of this Good Practice Guide, this appendix focuses on the data portion of ML.
18.3 Scope
This appendix first identifies the lifecycle of data within a typical ML framework. This includes where data is required
to be input into the lifecycle, how data is split for various activities, and where data is created or reintroduced to
be used for additional learnings. This appendix then equates the typical ISPE GAMP® 5 [9] software phases (e.g.,
concept, project, and operation) to those of the ML model and those activities required to integrate this technology
into an overarching application and/or software product.
The vast majority of data is required prior to any model or algorithm selection; additionally, new data is used
throughout the iterative lifecycle of ML (see Figure 18.1). It is used for fully understanding and defining the business
case, data engineering activities, model training, tuning and selection, in addition to model evaluation (i.e., validation
and testing). Production data is also used to refine models (i.e., retraining), improve scoring/performance
(i.e., precision, accuracy, etc.), and allow the “machine” to keep learning, whether through supervised,
unsupervised, reinforcement, deep, or continuous learning.
The idea or concept phase to implement ML comes, as with any AI based technology, with the need to understand
the problem(s) to be addressed at a business level and develop a use case.
Once this is accomplished, the primary task is to build a data set based upon what the organization is trying to
achieve, not forgetting to identify assumptions. However, in order to build an appropriate set of data, an organization
must understand what information is important to them (i.e., meaningful) and find where that data/information
is available along with how to exclude unwanted/irrelevant data. This is defined by the desired features and/or
parameters to be directly involved in the ML, which can be small and/or even limited in quantity.
This process should begin with, and/or be supported by, a solid data governance strategy and a realization that not all
data is created equal. This includes awareness for data size, quality, and prevalence.
There are many places an organization can begin to look in order to acquire data. The first place is typically internal
to the company and its operations, often an existing data warehouse or data lake where the company's data has
hopefully already been organized, leveraging a standardized data nomenclature (see Section 4.7). Ideally, the
organization's internal data is not siloed, as it may not be possible to converge all data streams. Where data is siloed,
data drawn from multiple sources is usually in an “unorganized” format and requires proper preparation and labeling
to become useful to an ML model.
It is also possible to obtain data external to one’s own organization to serve as the basis for ML. This data may
come from various sources of which some may be structured, unstructured, and/or semi-structured. Regardless, as
previously mentioned, a data selection and governance strategy should be in place that aims for a diverse and non-
biased data set. The data set should also aim for a large number of data points, keeping in mind a reduction of overall
data complexity.
One must consider data classification (see Section 3.3), clustering (e.g., including data privacy needs), regression,
and ranking, as well as what values and/or metadata are “critical” or which simply add more complexity. It is important
to not just have the “right” data but also to have it in the “right” form or format, which is one of the major challenges in
the use of external data sources (see Section 1.6.4 Data Quality).
If obtaining and/or using external data (i.e., open source or public information) an organization must ensure that they
have the right to use such data and that Personal Information (PI) and/or Personally Identifiable Information (PII)
considerations are appropriately taken to comply with regulations such as the EU GDPR [17]. If this is not the case,
then certain data attributes may need to be anonymized.
• Privacy and Controls: Data classifications, data use, risk, mitigation controls
• Augmenting to diversify
Data sets are often inaccurate, and thus they need to be “prepared” prior to making them available for ML activities.
The quality and integrity of the data (see Section 1.6.4 and Appendix M2) make all the difference (the quality of
output is determined by the quality of input), and the data must be suitably classified and labeled (i.e., assigning tags
to make it more identifiable for predictive analysis). The data needs to be free of bias toward certain regions,
demographics, etc.
This is one of the most time-consuming efforts in the process and should ideally be done by dedicated data scientists
that possess distinct domain knowledge and expertise with the data. This aids in their ability to decide upon relevant
data structuring, cleaning (i.e., removal or replacement of missing values), labeling, annotating, and preparation for
further “processing” and use. This is imperative in order to achieve “good results” as the understanding and proper
preparation of data is directly attributable to proper selection, build, and testing of models.
Putting together the data in an optimal format is known as “feature transformation.” This includes the format (e.g.,
differing files), data cleaning (i.e., removing missing values), and feature extraction (i.e., identifying which features or
data elements are most important for prediction speed and accuracy). Normalizing the data set and reducing its
dimensionality can also help.
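As a simple illustration of these preparation steps, the following sketch uses scikit-learn to impute missing values, normalize features, and reduce dimensionality; the file name, column names, and parameter choices are assumptions for illustration, not a recommended configuration.

```python
# Illustrative sketch only: a typical "feature transformation" step -- cleaning
# missing values, normalizing numeric features, and reducing dimensionality --
# using scikit-learn. The CSV file and column names are assumptions for illustration.
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

raw = pd.read_csv("training_data.csv")
features = raw.drop(columns=["label"])  # assumes a 'label' column holds the expected output

preprocess = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="median")),   # replace missing values
    ("scale", StandardScaler()),                    # normalize feature ranges
    ("reduce", PCA(n_components=0.95)),             # retain components explaining 95% of variance
])

prepared = preprocess.fit_transform(features)
print(prepared.shape)
```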
During the project phase of ML, a model is selected based upon the question it is expected to answer. Common
models include, but are not limited to:
Once a model has been selected, the model and algorithms are configured to allow the machine to learn using the
“case data” selected during the concept phase, which has been divided into two sub-datasets to train, and validate/
test the machine (e.g., 80% training and 20% validation/testing).
The training data set is used to train the model for performing various actions and/or teaching it how to apply certain
concepts. This requires data input along with defined and expected outputs (i.e., supervised learning).
The validation/test data set is used to evaluate how well the model was trained and assists in fine tuning the model.
For instance, does the defined input generate the correct output? This may often require human verification and/
or the use of tools. Validation is performed to provide evidence that the accuracy of the model and its associated
algorithms delivers against output expectations. It is important to consider the need for multiple data sets, as use of
the data may alter the original data set, making it unavailable for re-validation.
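The following minimal sketch illustrates the 80%/20% split and hold-out evaluation described above, using scikit-learn; the data set, model choice, and acceptance threshold are assumptions for illustration only.

```python
# Illustrative sketch only: the 80/20 split and hold-out evaluation step described
# above, using scikit-learn. The data set, model choice, and acceptance threshold
# are assumptions for illustration, not a recommended configuration.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

data = pd.read_csv("training_data.csv")
X, y = data.drop(columns=["label"]), data["label"]

# 80% of the case data trains the model; 20% is held back for validation/testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Evaluate whether the defined inputs generate the expected outputs
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Hold-out accuracy: {accuracy:.3f}")
assert accuracy >= 0.9, "Model does not meet the (hypothetical) predefined acceptance criterion"
```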
Finally, there may be additional data used for the confirmation of acceptable performance (i.e., verification) within
the overarching application’s User Acceptance Testing (UAT) prior to deployment. This data is important from a
confidence perspective, in order to verify that certain performance scoring and/or defined outcomes are in fact being
achieved.
In the operational phase it is possible to integrate, enrich, and prepare new data to further refine existing models,
develop new predictive models, and establish performance monitoring measures to track ongoing effectiveness.
It is important to note and understand that model performance goes beyond statistics and includes items mentioned
such as applicability, reproducibility, and interpretability. There need to be QC standards and/or measures to ensure
acceptable performance throughout the model's usable life.
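As an illustration of such ongoing performance monitoring, the sketch below compares the deployed model's monthly accuracy on production data (where the true outcome has since been confirmed) against a predefined alert threshold; the file name, column names, and threshold are assumptions for illustration.

```python
# Illustrative sketch only: ongoing performance monitoring in the operational phase,
# comparing the deployed model's monthly accuracy on production data (where the true
# outcome has since been confirmed) against a predefined alert threshold. The file
# name, column names, and threshold are assumptions for illustration.
import pandas as pd
from sklearn.metrics import accuracy_score

ALERT_THRESHOLD = 0.85  # hypothetical limit agreed during validation

production = pd.read_csv("production_outcomes.csv", parse_dates=["timestamp"])
production["month"] = production["timestamp"].dt.to_period("M")

# Accuracy of the model's recorded predictions versus confirmed outcomes, per month
monthly = production.groupby("month").apply(
    lambda grp: accuracy_score(grp["confirmed_outcome"], grp["prediction"]))

for period, accuracy in monthly.items():
    flag = "ALERT - review and consider retraining" if accuracy < ALERT_THRESHOLD else "within limits"
    print(f"{period}: accuracy={accuracy:.3f} ({flag})")
```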
Poor data and a good model is a bad combination that can ruin the success of ML.
Risk management helps to drive much of the process and controls needed because it gives a direct indication of
possible harm that could happen in different scenarios. ML is about increasing efficiency while decreasing potential
harm, and it is imperative to outline possible scenarios and gaps from the traditional risk management approach; this
means performing risk assessments and then assessing the additional risks of having ML intervention. This must
also account for the volume and scope of the initial data set in ensuring all scenarios, features, etc. were properly
trained, and that the training data was correctly labeled prior to use, as this could lead to additional requirements to
be evaluated.
• Benefits: What does the intervention of ML mitigate from a traditional electronic system?
• Costs: What potential new risks have been brought about? Are there mitigation plans?
• System performance, down time, etc. (i.e., must consider business continuity and potential data loss scenarios)
Change management should continue following the current change process in the company and adhere to regulatory
requirements, taking into consideration possible mitigation efforts brought about in the risk management efforts
(Section 18.7.2) and focusing on how to maintain the integrity of the data.
User experience and system controls along with other human intervention are a great place to start; drafting a visual
of the process with possible scenarios and mitigation can help identify weak points in the process.
• Input: Internal procedures that apply to regulatory needs and how those may need to be adjusted or managed
differently with ML
• Output: Necessary regulatory deliverables required for change management, for example, if it is Software as a
Medical Device (SaMD) or a specific GxP software (i.e., a clinical system or a laboratory system can have very
different approaches and documents needed for changes)
For example:
• Data inputs and the process of ML: Details about how a change is initiated and what impacts on the data that
could have
For example:
- What are the rollback plans or separate environments that allow for issue detection and protection of data
sets?
ML may be used in a highly specific part of a process but could have large reaching effects; knowing the boundaries
of a change impact allows for effective mitigation controls.
FDA Discussion Paper: Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning
(AI/ML) – Based Software as a Medical Device (SaMD) [72]
BSI and AAMI Position Paper 2019: The emergence of artificial intelligence and machine learning algorithms in
healthcare: Recommendations to support governance and regulation [73]
ISO IEC 62304: 2006 Medical device software — Software life cycle processes [74]
Thinking on its own: AI in the NHS Section 5 – Overcoming System Challenges [75]
Appendix S2
Assurance
19.1 Introduction
As described in ISPE GAMP® Guide: Records and Data Integrity [8], Section 1.5.6 GxP Computerized System Life Cycle:

“Data integrity is underpinned by well-documented, validated GxP computerized systems, and the application of appropriate controls throughout both the system and data lifecycles.”

This foundational principle is often misunderstood and used to justify the creation of non-value adding documentation.

Note: There are tools used to automate and supplement the tracking and testing of non-product systems such as application lifecycle management systems, comparison tools, software testing tools, and bug tracking tools. The use of these tools is not the focus of the regulations, and good engineering, software engineering, and IT practices should be applied to ensure they are acceptable for use rather than a separate validation or qualification effort.
A system lifecycle approach, such as described in ISPE GAMP® 5 [9], should be applied to each GxP computerized
system. Record and data integrity should be built-in and maintained throughout the GxP computerized system
lifecycle phases, from concept through project and operations, to retirement. The GxP computerized system lifecycle
activities should be scaled based on the complexity and novelty of the system, and potential impact on patient safety,
product quality, and data integrity.
This appendix describes the application of the FDA Center for Devices and Radiological Health (CDRH) Computer
Software Assurance (CSA) [4] concepts within an ISPE GAMP® 5 [9] system lifecycle framework, in order to
help clarify what it means to have a well-documented, validated GxP computerized system that effectively and
efficiently achieves data integrity by design.
While CSA concepts apply broadly to the validation for intended use of a computerized system, in this appendix they
are specifically applied to data integrity, which is itself foundational to ensuring patient safety and product quality.
This appendix shows how a combination of critical thinking and CSA approaches can achieve data integrity through
effectively and efficiently managing data integrity risks in support of product quality and patient safety by focusing
efforts on the areas of highest risk to data integrity.
Computer Software Assurance was born from CDRH’s Case for Quality [4], which enhances and incentivizes the
adoption of practices and behaviors to improve medical safety, responsiveness, and how patients experience
products. As part of that effort, the FDA supports and encourages the use of automation, information technology, and
data solutions throughout the product lifecycle in the design, manufacturing, service, and support of life sciences.
Automated systems benefit product quality and patient safety by reducing errors, reducing patient risk, optimizing
resources, and increasing business value.
While the FDA planned guidance [4] uses the term “Computer Software Assurance” there is no intent to limit the
application of CSA to software only. Software is only one component within a computerized system, which in turn is
part of a wider operating environment. A holistic approach is needed to ensure the overall computerized system is fit
for intended use in support of the business process.
Risk-based testing models have been developed following good software engineering principles and practices.
Application of such approaches supports the effective achievement of data integrity objectives. CSA outlines a
risk-based approach to testing and details the intended uses the FDA considers as high risk. In addition, a potential
framework for testing methods and an acceptable record of objective evidence is provided. CSA emphasizes that
quality, not documentation, is the primary focus, and that the record demonstrating that the system performs as
intended should be primarily of value to the regulated organization.
A cornerstone of CSA is the application of critical thinking and risk-based principles when developing the computerized
system lifecycle strategy in support of data integrity. A regulated company should focus on understanding the intended
use of the system in support of a regulated business process, and the risks introduced by the system not performing
as intended with respect to patient safety, product quality, and data integrity. This allows a knowledgeable and
experienced team of SMEs to select and apply the appropriate strategy to evaluate the process, the data flows
through the process, the computerized systems supporting the process, and the degree of human interaction
with the system. Systems or functions directly impacting patient safety, product quality, and data integrity are likely to
require a rigorous assurance effort and recorded results. Conversely, if the system or function has only an indirect⁵
impact, the effort and recorded results should be appropriately scaled to be as least burdensome as possible.
CSA is consistent with General Principles of Software Validation [78], which states:
“The level of validation effort should be commensurate with the risk posed by the automated operation.”
CSA is also consistent with robust lifecycle methodologies such as ISPE GAMP® 5 [9] or AAMI TIR36 [79]. Like ISPE
GAMP® 5, CSA takes a lifecycle approach where activities and controls at earlier or later stages, such as a robust
risk assessment based on product and process understanding, can be leveraged during the verification stage. ISPE
GAMP® 5 presented the concept of risk-based testing to focus on testing functions with high-risk priorities. CSA builds
on the ISPE GAMP® 5 principles by directly acknowledging the acceptability of concepts from the software testing
industry, such as exploratory testing, which neither rely on scripted test cases nor extensively capture screenshots as
test records.
Ineffectual application of the risk-based approach combined with unskilled validation/quality practitioners has resulted in:
• Dry running of tests to reduce execution errors caused by the inordinate level of detail in the test instructions
• Focus on error-free execution of the formal script instead of software/system error detection
⁵ Examples of indirect impact are situations where the system feature, operation, or function is: (1) collecting and recording data from the process for monitoring, review, or statistical process control; or (2) related to quality system integrity, e.g., electronic signatures, logs of system configuration changes.
Instead, practitioners should spend more time actively testing to find defects and less time generating specifications in advance.

The ultimate objective of testing should always be the reduction of defects before the system goes live; the CSA approach expedites this in support of patient safety, product quality, and data integrity.

CSA is a balanced approach promoting a thorough understanding of the intended use of the system by the organization, an understanding of risk, use of more efficient and effective testing methods, and an appropriate level of objective evidence to improve and accelerate the use of technology in the regulated system landscape. This approach is aligned with the way systems are reviewed during any regulatory audit or inspection. The validation plan and summary are reviewed. The functional behavior of the system is primarily confirmed in production, for example, "show me the Qualified Person approval for this lot." In the event that testing is reviewed, it is either a high-risk area (where scripted testing should have been used) or an actual production failure that should have been caught before final acceptance and release.

Note: Validation is often erroneously regarded as requiring a rigorous formal structure and direct regulated company QA oversight to fulfill a "compliance" checkbox requirement. In reality, the underlying objective of validation is to support the intended use of the computerized system by the regulated company in support of a business process, and this can be best achieved by leveraging scripted and unscripted testing approaches within the validation activities. Assurance activities can occur outside of rigorous pre-defined test specifications and still support patient safety, product quality, and data integrity. ISPE GAMP® 5 Appendix D5 [9] states that "Unnecessary supporting documentation that does not add value to the normal test results should be avoided." (emphasis added) The principles of unscripted testing are discussed later in this appendix.

19.1.1 Current State and Potential Barriers
Other regulated and non-regulated industries have
increasingly moved forward and adopted frameworks
for modern testing and modern lifecycles. Technology
is continually evolving, creating new opportunities and expectations; therefore, the approaches for validation for
intended use should adapt accordingly to support companies to deliver products to market fast while continuing to
ensure patient safety, product quality, and data integrity. The life sciences industry has an opportunity to break away
from the past and change the perception of computerized systems validation. The conjunction of ISPE GAMP® 5 [9]
and CSA enables computerized systems to be validated for intended use with increased quality and speed.
Computerized system validation practices are often anchored in the past, and greatly influenced by CDRH’s General
Principles of Software Validation [78], which was finalized in 2002. This document was referenced by the FDA
Guidance for Industry: Part 11, Electronic Records; Electronic Signatures – Scope and Application (2003) [30] and
again in 2007 in the FDA’s Computerized Systems Used in Clinical Investigation [80]. General Principles of Software
Validation [78] covers both software that is itself, or is part of, a regulated medical device, as well as systems used
as part of production or the quality system (non-product systems). The requirements for the lifecycle and validation of
the regulated medical device software itself do not directly apply to such non-product systems, but this distinction has
not always been clearly and correctly applied by regulated companies. The average publication date of the software
engineering references in General Principles of Software Validation is 1993 [78]; and testing methodologies, tools,
and techniques have advanced significantly since then. Consequently, the current state of validation is influenced by
the misapplication of guidance for product systems and by long outdated software engineering practices.
During the same time period, the pharmaceutical industry implemented lifecycle models such as ISPE GAMP® 5
[9] and AAMI TIR36 [79]. ISPE GAMP® 5 [9] supports a patient-centric and QRM approach to the assurance of
computerized systems. Unfortunately, in many cases ISPE GAMP® 5 has not been effectively implemented in the field
due to the absence of critical thinking and the use of unskilled practitioners. There have been deficiencies in historical
approaches to validation that have led to data integrity risks, such as a lack of process definitions and data flows,
poorly defined configuration settings, gaps in the development of procedures and training for review of electronic
data and audit trails, and gaps in other aspects of validation for intended use and governance of automated business
processes. In the absence of critical thinking, validation typically becomes a template activity applying a formulaic risk
assessment per ISPE GAMP® 5 Appendix M3 [9]. Testing is then performed irrespective of the outcome of the risk
assessment (minimal or excessive, depending on the company) to suit the internal company ethos. In some regulated
companies, this has resulted in over-simplified, table-driven approaches with no application of critical thinking and a lack
of meaningful consideration of aspects such as complexity, novelty, and supplier development testing in the definition
of the appropriate system lifecycle activities.
All of the above means industry has not embraced advances in development and testing methodologies, techniques,
and tools; risk-based testing is just one of many techniques and tools available in modern software testing.
An area many regulated companies continue to struggle with is the continued use of detailed test scripts requiring
screenshots, without applying critical thinking. Some companies make the mistake of focusing on regulatory risk
over patient risk, and hence the major goal becomes documentation that will pass an inspection. In some cases, test
specifications are repeatedly executed in advance (dry run) until they can be executed without detecting a single
issue, at which point they have lost the ability to find any new defects (which is a primary goal of effective testing).
The ultimate aim of effective validation is that the computerized system (including people, process, procedures) is
fit for intended use. Testing focus is often on ensuring the automation feature, operation, or function performs as
intended without compromising patient safety, product quality, and data integrity, but it is important to verify the wider
functionality in terms of the underlying business process and data flows in support of data integrity. The regulated
companies and people performing the assurance activities should apply critical thinking and ensure the assurance
activities are value added and meaningful instead of an inspection-readiness task. Focusing on the regulated
company’s business needs first, and then leveraging work already performed, will lead to a higher quality system; the
validation effort is reduced when it values work already performed instead of adding an unnecessary burden
or redundant activity.⁶ The generation of supporting documentation for an inspection is a by-product of the system
lifecycle.
The level of effort should be commensurate to the risk acceptable within the organization as defined in its policies,
procedures, and plans. The regulated company determines the assurance activities based on their own need to
ensure systems are fit for intended use. The key is to determine the regulated company’s level of risk acceptance,
based on the intended use of the software, and factoring in the technical or procedural controls that are currently in
place or that will be put in place.
⁶ An example of a redundant activity is repeating functional testing (SQA) already performed in another, but still adequately controlled, environment because it lacked independent quality assurance unit review. The adequacy of testing is a technical issue as opposed to a quality assurance/compliance issue.
There are strong benefits for the life science industry in applying tools that enable test automation and efficiency gains within the system lifecycle (e.g., requirements, business process maps, test case management, test automation, traceability, and comparison utility). Teams can embed how they work and the tools they use as part of the process and no longer add manual steps to produce documentation and testing evidence. It should be noted that the use of incidental tools to aid in the validation effort does not trigger a separate validation effort; automated tools only require a documented assessment for their adequacy [11, 25], which should have been completed prior to their use in a validation project. A tool is not a regulated system, and its acceptability should be documented using the company’s non-GxP business practices, for example, by the application of good engineering practices and the availability of evidence of proper selection, installation, and control.

The regulatory requirements define that the software must be fit for its intended use [11, 30], but not how this is achieved. Testing documentation can vary in depth and detail yet still provide assurance. Each company needs to define how to implement the most effective, least-burdensome mitigation of the risks to patient safety, product quality, and data integrity as part of their overall validation process.

Note: Examples of quality risk management tools include:
• Failure Mode Effects Analysis (FMEA) is effective when real data is available and there are many components; a bottom-up, cause-and-effect approach can be helpful at identifying risks
• Fault-Tree Analysis (FTA) is a "top down" approach for identifying risk iteratively using a cause and effect model
• Hazard Analysis and Critical Control Point (HACCP) is widely used as a preventive food safety system where hazards are identified and controlled at specific points in the process
• Hazard Operability Analysis (HAZOP) may be used to identify potential hazards in a system and identify operability problems likely to lead to nonconforming products.
There are many tools, and regulated companies should apply critical thinking to identify and apply them effectively. A data integrity risk assessment should be included as part of the quality risk management approach, and should leverage detailed business process mapping and data flow diagrams to identify potential data integrity risks through the end-to-end process.

⁷ Indirect leveraging is predicated on an assessment and ongoing monitoring of the vendor’s quality management system and practices, including contracts. Direct leveraging of test results does not require this, as the test results of the system are directly reviewed and deemed acceptable.
This section describes how ISPE GAMP® 5 [9] principles can be combined with CSA concepts, and applies the
combined approach to a data integrity focused example.
ISPE GAMP® 5 [9] and CSA advocate a lifecycle approach with the application of computerized system best practices
within the QMS framework. See Figure 19.1.
The starting point within the regulated company’s system lifecycle is to ask the question: “What is needed to
be confident that the system is fit for intended use and meets the regulated company’s needs?”, and to define
an appropriate response. Confidence in system functionality can be achieved by leveraging supplier activities
demonstrating that the system performs as expected, based on the regulated company’s supplier management
practices. This includes activities performed to select the supplier, supplier activities performed within their own
lifecycle to produce the system, factory acceptance testing, etc. The regulated company will implement the
computerized system for its intended use in their operating environment, including the integration with other systems,
people, processes, and procedures. Regulated companies should identify risks associated with the intended use of
the system, and define and complete additional assurance in the form of validation activities to mitigate these risks, for example:
• Use of controls earlier or later in the project stage to mitigate risks (e.g., informal reviews, walk-throughs,
inspections, and static analysis)
• Use of the most appropriate and effective testing techniques at every point including unscripted testing (see
Section 19.3.2)
• Ongoing monitoring of existing and new risks, and the effectiveness of the current risk mitigation controls
These lifecycle activities provide a comprehensive assurance approach that can reduce the need for testing within the
regulated company’s system lifecycle. The need for additional assurance through testing is determined by assessing
the system’s impact on safety and quality, to answer the question: “What is needed to ensure the system does no
harm by addressing impact to patient safety, product quality, and data integrity?”
Features, operations, or functions with a direct impact to patient safety, product quality, and data integrity may require
the most rigorous assurance efforts and objective evidence, and indirect impacts require the least amount of rigor
and objective evidence. CSA introduces the use of risk-based documentation, following good software engineering
principles and good software testing practices that include unscripted and scripted testing to address this risk-based
approach.
This section provides a reference model for the additional assurance activities noted in Section 19.2.
In addition to the risks to patient safety, product quality, and data integrity, testing and assurance activities should
consider the following:
• The expected or feared failure modes and the primary undesirable operational outcomes
• How many tests are required to cover the scope under consideration?
• What is the specific testing technique applicable in this case (negative, positive, functional, structural, unit,
integration, acceptance, performance, regression, error-handling, boundary, etc.)?
• The role the testing plays in demonstrating fitness for intended use and discovering defects
A formulaic approach to always applying a particular testing technique to a particular risk priority may not achieve
the maximum test effectiveness. The effectiveness of testing primarily aimed at defect identification, for instance,
is directly related to the nature of the function to be tested, the nature of the likely defects, the architecture of the
component to be tested, which tools are being used for testing, and the logic of the process supported.
An effective testing and assurance strategy cannot be defined based solely on risk priority, and should additionally
consider the factors listed above (a list which is itself not exhaustive).
Testing activities fall under two broad categories: static techniques and dynamic techniques (not related to static and
dynamic data). Static techniques test the software without executing it, for example, code reviews, walk-throughs,
and static analysis. Static techniques are typically used in the development of systems and may reduce the amount of
dynamic testing undertaken by the supplier.
Dynamic testing occurs when the system functionality is confirmed through execution. CSA is primarily focused on
dynamic testing and a risk-based application through scripted and unscripted techniques. Scripted testing is testing
carried out following a documented sequence of test cases, that is, the tester’s actions are prescribed by written
instructions within a test case. Unscripted testing is dynamic testing where tester’s actions are not prescribed step-
by-step within written instructions and are experience based in nature, and may include ad hoc, error guessing, and
exploratory testing. See Table 19.1 for further details.
CSA identifies these as possible ways of providing assurance; they are not intended to be prescriptive or all-inclusive.
There are many possible ways and techniques for classifying and structuring testing, and software professionals
and SME testers should choose the appropriate testing approaches, structure, and tools. Regulated companies may
leverage any of the approaches or a combination of approaches that they determine will verify the intended use and/
or mitigate the risk most appropriately.
Scripted test cases are the traditional validation test cases, which typically include execution instructions, expected
results, independent review, and approval of test cases. Scripted testing is not restricted to manual execution of pre-
defined test specifications and can and should leverage automated test tools where appropriate.
Limited and robust scripted testing are respectively reserved for medium- to high-risk, and highest-risk features,
functions, or operations. For example, robust scripted testing may include positive, negative, and alternate path
testing; limited scripted testing may be used to test positive scenarios explicitly documented due to the risk to patient
safety and/or product quality. It should be noted that CSA leverages risk-based documentation and test rigor is not
automatically linked to documentation rigor – see Figure 19.5.
In scripted testing, it is a common misconception that extensive screenshots are required to provide objective
evidence to demonstrate validation of computer systems used in automation of business processes. CSA addresses
this misconception by clarifying the purpose of the objective evidence and what constitutes an acceptable record
demonstrating confidence in the system reliably and repeatedly performing as intended.
In the age of modern browsers and freely available image editors, the screenshot, once the gold standard of
supporting evidence, no longer has the same irrefutable status. For example, modern web browsers have built-in
tools that can edit text and replace images prior to taking screenshots in an undetectable fashion, as the edits
are indistinguishable from natively generated system records. Similarly, image editors can be used to manipulate
screenshots after creation, in an undetectable fashion, especially when the images are printed.
While extensive screenshots are now considered to bring little value to verification activities, there are cases where
a screenshot may still have merit when collected for areas with high impact to patient safety and/or product quality.
These are explained in the ISPE GAMP® Good Practice Guide: A Risk-Based Approach to Testing of GxP Systems
(Second Edition) [22]:
“Requirements for additional test evidence in the form of printouts, screenshots, etc., should be clearly defined in
the test method and focus on:
• Test steps which produce complex results, which may be difficult or time consuming to record manually
• When it is faster than manual recording of sufficient evidence to allow independent review
• When the result is something essentially visual and easier to review from a screenshot”
CSA identifies three unscripted testing approaches: (1) ad hoc testing; (2) error guessing; and (3) exploratory testing.
Ad hoc testing is unscripted testing performed without planning or pre-defined documentation, and may occur in an
informal test environment or a controlled environment, depending on the purpose of the testing. CSA recommends
ad hoc testing for the lower risk areas or as a precursor to defining scripted testing for higher risk functions. Ad hoc
tests may be experience based and are conducted randomly and informally with a minimal record of the activity (such
as the objectives and conclusions of the test activity and the identity of the tester). Figure 19.2 shows the least-
burdensome nature of the testing.
Ad-hoc testing was performed on deletion permissions when configured for out-of-the-box (OOB) roles and fields/screens. Testing executed by John Doe (JD) on 3rd June 2020. Issues 005, 006, and 007 were found and logged in the bug tracker. Ad-hoc testing concluded with all issues resolved and the system functioning as expected.

Initials: JD    Date: 3 June 2020
Error guessing and exploratory testing are both experience-based testing techniques and rely on the skill and
expertise of the tester. It is important to use testers who understand the supporting business process and can
anticipate how real users will/might use the software.
Error guessing is another unscripted test design technique in which test cases are designed to expose anticipated
errors based on the tester’s experience and general knowledge of failure modes. Tests may challenge the quality
of the system by injecting invalid entries and errors into the system to evaluate behavior, especially reliability, of the
system. A structured approach to error guessing is to list common failure modes and attempt to produce them.
Exploratory testing is unscripted testing in which the tester actively controls the test design as those tests are
executed. Initial test design is based on the tester’s existing relevant knowledge, prior exploration of the test item
(including results from previous tests), and critical thinking regarding common software behaviors and types of failure.
During test execution, the tester uses information gained while testing to design new and better tests dynamically.
A structured approach to exploratory testing is to list the use cases/scenarios or specific operations that need to be
covered. Figure 19.3 is a sample of exploratory testing where the specific use cases are noted and explicitly covered.
(Figure 19.3 column headings: Test Overview – capture at a high level the test activities executed; Use Cases; Issues; Tester Initials and Date)
Table 19.1 summarizes the acceptable assurance approaches and records discussed in the previous sections.
Unscripted Testing: Error guessing
- Test approach: Testing of requirement or function failure modes, with optional listing of expected failure modes in advance
- Test results captured: Details regarding any failures/deviations found
- Record: Summary description of failure modes tested; issues found and disposition; conclusion statement; record of who performed testing and date

Scripted Testing: Robust
- Test approach: Test objectives; test cases (step-by-step procedure); expected results and values; independent review and approval of test cases
- Test results captured: Pass/fail for each test case; details regarding any failures/deviations found and their disposition
- Record: Detailed report of assurance activity; result for each test case, including any critical values and indication of pass/fail; issues found and disposition; conclusion statement; record of who performed testing and date; record of who reviewed testing and date
These are all possible ways of providing assurance and are not intended to be prescriptive or all inclusive. Similarly,
different SMEs may be more appropriate for different test types; for example, interface testing is best done by a
system SME, whereas exploratory or error guessing may need both system and business expertise for maximum
defect detection.
The use of tools and automation for as many of the activities mentioned as possible is strongly encouraged over the
use of documents, for example, test management and test automation systems and tools, and the use of traceability
tools rather than traditional traceability matrices.
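Where a dedicated traceability tool is not available, even a lightweight script can replace a manually maintained matrix. The sketch below uses hypothetical requirement IDs and test-record fields to cross-check requirements against executed tests and report any that remain untraced; it is illustrative only.

```python
def untraced_requirements(requirements: set[str], test_records: list[dict]) -> set[str]:
    """Return requirement IDs with no passing test record tracing to them."""
    covered = {req
               for record in test_records
               if record.get("result") == "pass"
               for req in record.get("traces_to", [])}
    return requirements - covered


if __name__ == "__main__":
    reqs = {"URS-001", "URS-002", "URS-003"}
    tests = [
        {"id": "TC-01", "traces_to": ["URS-001"], "result": "pass"},
        {"id": "TC-02", "traces_to": ["URS-002"], "result": "fail"},
    ]
    # URS-002 has only a failed test and URS-003 has none, so both are reported.
    print(untraced_requirements(reqs, tests))
```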
19.4 Example: Applying Risk-based Approach from ISPE GAMP® 5 and CSA
This section contains an example model for integrating ISPE GAMP® 5 [9] with CSA, however regulated companies
should apply critical thinking to develop their own model. An effective testing or assurance strategy should consider
(with full and effective application of critical thinking) what is being tested, how it is built, who is performing the testing,
when, with what tools, and for what purpose. It is no longer enough to base the strategy solely on the potential residual
risk of the function, component, or attribute under test. Merging ISPE GAMP® 5 [9] with CSA begins with combining the
lifecycle approach (defined in Section 19.1.1) and risk-based assurance (Section 19.1.2).
Table 19.2 contains a simplistic example of a LIMS system using current conventional thinking, presented as a
teaching model.
Conventional approaches combine the primary scenarios with a few elements listed in the additional scenarios into
a formal validation protocol. Regulated companies often include the additional scenarios based on lessons learned
from production defects that escaped prior testing efforts on earlier releases or other systems. Given that exhaustive
testing (all combinations of preconditions and inputs) is impossible, formally documenting everything unnecessarily
creates a priority inversion in which documentation becomes more important than the quality of the system.
With CSA, the regulated company applies a risk-based approach to documentation and includes the primary
scenarios as part of the validation package. The additional scenarios are tested when the primary scenarios are
tested, but do not warrant formal validation documentation (e.g., risk assessments, scripts); the goal is to reduce the
probability of undiscovered defects remaining in the system that could impact intended use. Should the additional
scenarios warrant formal documentation and traceability, they should be moved from additional scenarios to primary.
Applying the ISPE GAMP® 5 Appendix M3 [9] risk methodology shown in Figure 19.4, the company should focus its
validation effort on areas of higher risk priority.
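The sketch below shows how such a two-step prioritization (severity and probability giving a risk class, then risk class and detectability giving a risk priority) could be encoded. The specific lookup tables are illustrative assumptions chosen only to reproduce the worked example in this appendix; they are not the Appendix M3 tables themselves, and each company should substitute the mappings defined in its own QRM procedures.

```python
# Illustrative two-step risk prioritization in the style of ISPE GAMP 5 Appendix M3.
# The lookup tables below are assumptions for illustration only.

RISK_CLASS = {  # (severity, probability) -> risk class
    ("High", "High"): 1, ("High", "Medium"): 2, ("Medium", "High"): 2,
    ("High", "Low"): 2, ("Low", "High"): 2, ("Medium", "Medium"): 2,
    ("Medium", "Low"): 3, ("Low", "Medium"): 3, ("Low", "Low"): 3,
}

RISK_PRIORITY = {  # (risk class, detectability) -> risk priority
    (1, "Low"): "High", (1, "Medium"): "High", (1, "High"): "Medium",
    (2, "Low"): "High", (2, "Medium"): "Medium", (2, "High"): "Low",
    (3, "Low"): "Medium", (3, "Medium"): "Low", (3, "High"): "Low",
}


def risk_priority(severity: str, probability: str, detectability: str) -> str:
    """Determine risk priority from the two illustrative lookup tables above."""
    risk_class = RISK_CLASS[(severity, probability)]
    return RISK_PRIORITY[(risk_class, detectability)]


# Example requirement from this appendix: Severity High, Probability Medium,
# Detectability Medium -> Medium priority under this illustrative mapping.
print(risk_priority("High", "Medium", "Medium"))  # Medium
```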
With CSA, the level of documentation detail needed can also be scaled based on risk priority, as shown in Figure 19.5.
Figure 19.5: Scaling the Test Rigor and Documentation Based on Risk Priority
Applying an integrated ISPE GAMP® 5 [9] and CSA approach that leverages critical thinking to the example above,
the validation activities can be determined, as shown in the example in Table 19.3.
Table 19.3 (extract):
Requirement: Only Privileged Users (PU) can delete laboratory data
Risk assessment: Severity: High; Probability: Medium; Detectability: Medium; Risk Priority: M
Assurance approach: Unscripted Testing – recommend exploratory testing
Test scenarios: Primary scenarios as determined by the regulated company
PS 1. PU can delete laboratory data
PS 2. General users cannot delete laboratory data
PS 3. …
A full listing of all lifecycle testing activities for that requirement is shown in Table 19.4.
PS 3. … Exploratory testing …
AS 6. … … …
*These can be tested before validation testing begins, or when the primary scenarios are tested. Should they
require formal documentation and traceability, they should be changed to primary scenarios.
The emphasis should always be on testing to ensure patient safety, product quality, and data integrity rather than the
generation of documentation for inspection purposes alone. In this case, only the primary scenarios are documented,
which provides more time to test using modern techniques (e.g., automation). The additional scenarios are tested
as determined by the manufacturer to ensure they work properly for their business and not just for inspection
purposes alone. The level of documentation should take into account any future needs for regression testing (i.e.,
is the documentation sufficiently detailed that a repeat iteration would follow exactly the same test procedure?) and
whether it provides sufficient assurance to justify leveraging these tests rather than repeating them as formal scripted
testing in a later environment. The use of ongoing metrics and monitoring is recommended to ensure the system is
operating as expected.
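As one hedged illustration of covering the primary scenarios with automation rather than manual scripts, the sketch below checks PS 1 and PS 2 from the example above. The LIMS interface, role names, and record IDs are hypothetical; a real implementation would exercise the actual system.

```python
# Automated check of the example primary scenarios (hypothetical LIMS interface).
class FakeLims:
    """Toy in-memory stand-in so the sketch runs; real tests would target the actual system."""
    def __init__(self):
        self.records = {"LAB-001": "8.15 mg/mL"}

    def delete_record(self, user_role: str, record_id: str) -> bool:
        if user_role != "privileged":          # only privileged users may delete
            raise PermissionError("deletion not permitted for role: " + user_role)
        return self.records.pop(record_id, None) is not None


def test_ps1_privileged_user_can_delete():
    lims = FakeLims()
    assert lims.delete_record("privileged", "LAB-001") is True


def test_ps2_general_user_cannot_delete():
    lims = FakeLims()
    try:
        lims.delete_record("general", "LAB-001")
        assert False, "general user was able to delete laboratory data"
    except PermissionError:
        assert "LAB-001" in lims.records  # the record must still exist
```

Run under a test runner such as pytest, these checks produce a repeatable record of the primary scenarios without manual screenshots.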
19.5 Example: Applying ISPE GAMP® 5 and CSA Using Direct Leveraging of Testing
Throughout the System Lifecycle
This section revisits the example in Section 19.4, but here uses the lifecycle approach endorsed by ISPE GAMP® 5
[9] and CSA to leverage the efforts from earlier stages and increase the amount of testing to ensure patient safety
and product quality. Specifically, this example presents a method for leveraging functional and UAT performed in an
earlier environment (for example, in a development or evaluation environment) instead of repeating the same testing
in a formal validation environment. As noted in Section 19.4, the regulated companies should apply critical thinking to
develop their own model.
19.5.1 Background
The regulated company applies a software testing process as part of implementing a new computerized system. This
testing (undertaken as Software Quality Assurance (SQA)) is separate and distinct from the development testing
performed by the vendor during the original development of the computerized system. The intent of this testing is risk
awareness through defect prevention and defect detection; it is led by the test team with support from the business
team and technical team; all three own the accountability for the quality of the system through reviews and analysis to
prevent defects, and scripted and unscripted testing to detect defects. Close collaboration ensures the SQA activity is
aligned with intended use and technical complexity. The test team should plan and execute the testing approach once
this alignment is achieved.
While non-GxP in nature, the earlier environments are still controlled and governed by good engineering
practices (e.g., configuration management, release management); for example, all deployments should be approved
before execution; all deployments are assessed for the impact of changes; test planning should be performed in
advance; test scenarios/objectives should be reviewed before execution; and the test results should be documented
in a fashion that can be reviewed.
The validation plan provides the rationale and justification for using testing from the earlier environments, and the
validation SOP specifically allows leveraging testing from an earlier environment.
Upon completion of the internal test cycle, the test artifacts are evaluated by the validation team⁸ to confirm each
requirement has been tested to the level of rigor commensurate with the risk priority (e.g., as determined by ISPE
GAMP® 5 Appendix M3 [9]). The focus is functional acceptability of the requirements based on the artifacts, that is, is
the function, operation, or feature working as required? If yes, trace the requirement to the functional testing effort in
the traceability matrix. If no, additional testing is performed to remediate the gap before establishing traceability.
The regulated company’s GxP controls begin at the point of leveraging, that is, when the earlier environment test
artifacts are evaluated for inclusion as part of the overall validation package via the traceability matrix. The regulated
company should refrain from applying GxP documentation standards on the earlier environment testing. For example,
as part of unscripted testing the test evidence may be generated before the test documentation is completed or test
cases may be executed by automation. The question is: “Has the right testing been performed?”, not “Has the work
been documented to good documentation standards?” QA should review the traceability matrix to ensure that the test
strategies were followed and implemented as planned.
The validation summary report documents the extent of testing from the earlier environment that was leveraged, with
reference back to the justification for this in the validation plan.
⁸ The test team and validation team may be one and the same.
Similarly, the regulated company should determine the level of risk-based documentation from a CSA perspective.
Areas of higher risk should have more detailed test documentation, as noted in Figure 19.5. Successful application
of CSA depends on the ability to recognize that different testing techniques can be combined to achieve risk-based
assurance; for example, even the highest-risk requirement can be confirmed based on unscripted testing (e.g., a
collection of exploratory test scenarios that cover the feature fully). See Table 19.5.
Table 19.5 (extract):
Requirement: Only Privileged Users (PU) can delete laboratory data
Risk assessment: Severity: High; Probability: Medium; Detectability: Medium; Risk Priority: M
Leveraged assurance: A retrospective review of the SQA cycle documentation identified that the following test scenarios had been covered:
Ad hoc/Basic Assurance:
1. Deletion permissions work when configured for Out-of-the-Box (OOB) roles and fields/screens
Automated testing:
1. User roles (including lab user) configuration programmatically confirmed
2. …
Exploratory testing:
1. Deletion functionality is disabled versus hidden, i.e., inaccessible using keyboard shortcuts during a transaction
2. Permissions are enforced when users attempt to delete via the API (without the user interface)
3. …
*This represents one middle-of-the-road example of leveraging, and many other techniques may be used. For
example: (1) the traceability matrix from the SQA cycle can be used directly without creating a separate one; or
(2) the business team, technical team, and test team can define a test design specification to prospectively guide
the test effort (as opposed to the retrospective guidance in the example), which assures the scenarios for leveraging
are documented sufficiently.
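The second exploratory scenario above (permissions enforced when deletion is attempted via the API, bypassing the user interface) could be probed with a short script along the following lines. The base URL, endpoint, token, and expected status codes are assumptions for illustration only and would be replaced by the details of the actual system under test.

```python
# Probe API-level enforcement of deletion permissions (illustrative only).
# The base URL, endpoint, and token are hypothetical placeholders.
import requests

BASE_URL = "https://lims.example.invalid/api/v1"
GENERAL_USER_TOKEN = "<token for a non-privileged account>"


def attempt_delete_as_general_user(record_id: str) -> int:
    """Attempt a deletion via the API with a non-privileged credential and return the HTTP status."""
    response = requests.delete(
        f"{BASE_URL}/lab-results/{record_id}",
        headers={"Authorization": f"Bearer {GENERAL_USER_TOKEN}"},
        timeout=10,
    )
    return response.status_code


if __name__ == "__main__":
    status = attempt_delete_as_general_user("LAB-001")
    # The expectation is an authorization failure (e.g., 401/403), never a 2xx success.
    print("Deletion blocked" if status in (401, 403) else f"UNEXPECTED status: {status}")
```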
The leveraging of UAT may occur in analogous fashion, where the business users can also provide their test
scenarios and test data for inclusion during the testing stages to minimize or obviate the need for a separate user
acceptance test cycle or PQ. The business users can review the output of the testing to ensure their needs are met.
19.6 Conclusion
CSA describes a risk-based approach to the validation of computerized systems that reinforces ISPE GAMP® 5 [9]
principles and key concepts. The foundation is the application of critical thinking by a knowledgeable and experienced
team of SMEs. The assurance activities within the quality system should be meaningful and add value, applying a
pragmatic, least-burdensome approach to ensuring patient safety, product quality, and data integrity.
20 Appendix G1 – References
1. EudraLex Volume 4 – Guidelines for Good Manufacturing Practice for Medicinal Products for Human and
Veterinary Use, Chapter 1: Pharmaceutical Quality System, January 2013, http://ec.europa.eu/health/documents/
eudralex/vol-4/index_en.htm.
2. 21 CFR Part 211.160 – Current Good Manufacturing Practice for Finished Pharmaceuticals; General
requirements, Code of Federal Regulations, US Food and Drug Administration (FDA), www.fda.gov.
3. 21 CFR Part 211.194 – Current Good Manufacturing Practice for Finished Pharmaceuticals; Laboratory records,
Code of Federal Regulations, US Food and Drug Administration (FDA), www.fda.gov.
4. US FDA Center for Devices and Radiological Health (CDRH), Case for Quality, Food and Drug Administration
(FDA), https://www.fda.gov/medical-devices/quality-and-compliance-medical-devices/case-quality.
5. Wyn, S., Reid, C.J., Clark, C., Rutherford, M.L., Watson, H.D., Vuolo-Schuessler, L.L., Perez, A., “Why ISPE
GAMP® Supports the FDA CDRH: Case for Quality Program,” Pharmaceutical Engineering, November/December
2019, Vol. 39, No. 6, pp. 37-41, www.ispe.org.
6. “Understanding Barriers to Medical Device Quality,” Center for Devices and Radiological Health (CDRH),
Food and Drug Administration (FDA), www.fda.gov/about-fda/cdrh-reports/understanding-barriers-medical-
device-quality.
7. ISPE GAMP® Good Practice Guide Series, International Society for Pharmaceutical Engineering (ISPE),
www.ispe.org.
8. ISPE GAMP® Guide: Records and Data Integrity, International Society for Pharmaceutical Engineering (ISPE),
First Edition, March 2017, www.ispe.org.
9. ISPE GAMP® 5: A Risk-Based Approach to Compliant GxP Computerized Systems, International Society for
Pharmaceutical Engineering (ISPE), Fifth Edition, February 2008, www.ispe.org.
10. 21 CFR Part 11 – Electronic Records; Electronic Signatures, Code of Federal Regulations, US Food and Drug
Administration (FDA), www.fda.gov.
11. EudraLex Volume 4 – Guidelines for Good Manufacturing Practices for Medicinal Products for Human and
Veterinary Use, Annex 11: Computerized Systems, June 2011, http://ec.europa.eu/health/documents/eudralex/
vol-4/index_en.htm.
12. MHRA Guidance: ‘GXP’ Data Integrity Guidance and Definitions, Revision 1, March 2018, Medicines &
Healthcare products Regulatory Agency (MHRA), www.gov.uk/government/organisations/medicines-and-
healthcare-products-regulatory-agency.
13. OECD Series on Principles of Good Laboratory Practice and Compliance Monitoring, Draft Advisory Document of
the Working Group on Good Laboratory Practice on GLP Data Integrity, Organisation for Economic Cooperation
and Development (OECD), August 2020, www.oecd.org/chemicalsafety/testing/draft-glp-guidance-documents-
public-comments.htm.
14. DAMA International, DAMA International’s Guide to the Data Management Body of Knowledge (DAMA-DMBOK2), Second Edition, ISBN (PDF): 9781634622363, DAMA International, https://technicspub.com/dmbok/.
15. ISPE GAMP® RDI Good Practice Guide: Data Integrity – Manufacturing Records, International Society for
Pharmaceutical Engineering (ISPE), First Edition, May 2019, www.ispe.org.
16. ISPE GAMP® RDI Good Practice Guide: Data Integrity – Key Concepts, International Society for Pharmaceutical
Engineering (ISPE), First Edition, October 2018, www.ispe.org.
17. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of
natural persons with regard to the processing of personal data and on the free movement of such data, and
repealing Directive 95/46/EC (General Data Protection Regulation), GDPR Regulation (EU) 2016/679, (General
Data Protection Regulation), gdpr-info.eu/.
18. Health Insurance Portability and Accountability Act of 1996 (HIPAA), U.S. Department of Health & Human
Services, Centers for Disease Control (CDC), www.cdc.gov/phlp/publications/topic/hipaa.html.
19. ISPE GAMP® Good Practice Guide: IT Infrastructure Control and Compliance, International Society for
Pharmaceutical Engineering (ISPE), Second Edition, August 2017, www.ispe.org.
21. ISPE GAMP® Good Practice Guide: A Risk-Based Approach to Operation of GxP Computerized Systems,
International Society for Pharmaceutical Engineering (ISPE), First Edition, January 2010, www.ispe.org.
22. ISPE GAMP® Good Practice Guide: A Risk-Based Approach to Testing of GxP Systems, International Society for
Pharmaceutical Engineering (ISPE), Second Edition, December 2012, www.ispe.org.
23. WHO Technical Report Series, No. 996, Annex 5: Guidance on Good Data and Record Management Practices,
World Health Organization (WHO), 2016, http://apps.who.int/medicinedocs/en/d/Js22402en/.
24. PIC/S Draft Guidance: PI 041-1 (Draft 3) Good Practices for Data Management and Integrity in Regulated GMP/
GDP Environments, November 2018, Pharmaceutical Inspection Co-operation Scheme (PIC/S),
www.picscheme.org/.
25. PIC/S Guide to Good Manufacturing Practice for Medicinal Products, Annex 11: Computerised Systems, PE 009-
14 (Annexes), July 2018, Pharmaceutical Inspection Co-operation Scheme (PIC/S), www.picscheme.org/.
26. Amazon, “Amazon Compute Service Level Agreement,” Last Updated: 22 July 2020, https://aws.amazon.com/
compute/sla/#:~:text=AWS%20will%20use%20commercially%20reasonable,the%20%E2%80%9CService%20
Commitment%E2%80%9D.
27. International Council for Harmonisation (ICH), ICH Harmonised Tripartite Guideline, Quality Risk Management –
Q9, Step 4, 9 November 2005, www.ich.org.
28. ISO 14971:2019 Medical Devices -- Application of Risk Management to Medical Devices, International
Organization for Standardization (ISO), www.iso.org.
29. 21 CFR Part 211 – Current Good Manufacturing Practice for Finished Pharmaceuticals, Code of Federal
Regulations, US Food and Drug Administration (FDA), www.fda.gov.
30. FDA Guidance for Industry: Part 11, Electronic Records; Electronic Signatures – Scope and Application, August
2003, US Food and Drug Administration (FDA), www.fda.gov.
31. EudraLex Volume 4 – Guidelines for Good Manufacturing Practice for Medicinal Products for Human and
Veterinary Use, Chapter 4: Documentation, January 2011, http://ec.europa.eu/health/documents/eudralex/vol-4/
index_en.htm.
32. International Council for Harmonisation (ICH), ICH Harmonised Tripartite Guideline, Pharmaceutical Quality
System – Q10, Step 4, 4 June 2008, www.ich.org.
33. EudraLex Volume 4 – Guidelines for Good Manufacturing Practices for Medicinal Products for Human and
Veterinary Use, Annex 2: Manufacture of Biological Active Substances and Medicinal Products for Human Use,
http://ec.europa.eu/health/documents/eudralex/vol-4/index_en.htm.
34. PIC/S Guide to Good Manufacturing Practice for Medicinal Products, Annex 2: Manufacture of Biological
active substances and Medicinal Products for Human Use, PE 009-14 (Annexes), June 2018, Pharmaceutical
Inspection Co-operation Scheme (PIC/S), www.picscheme.org/.
35. FDA Guidance for Industry: Data Integrity and Compliance With Drug CGMP Questions and Answers, December
2018, US Food and Drug Administration (FDA), www.fda.gov.
36. ISPE GAMP® Good Practice Guide: Validation and Compliance of Computerized GCP Systems and Data (Good
eClinical Practice), International Society for Pharmaceutical Engineering (ISPE), First Edition, December 2017,
www.ispe.org.
39. Japanese Pharmacopoeia (JP), Pharmaceuticals and Medical Devices Agency (PMDA), https://www.pmda.go.jp/
english/rs-sb-std/standards-development/jp/0005.html.
40. FDA Warning Letters, US Food and Drug Administration (FDA), www.fda.gov.
41. ISPE GAMP® Good Practice Guide: A Risk-Based Approach to GxP Compliant Laboratory Computerized
Systems, International Society for Pharmaceutical Engineering (ISPE), Second Edition, October 2012,
www.ispe.org.
42. ISPE GAMP® Good Practice Guide: A Risk-Based Approach to GxP Process Control Systems, International
Society for Pharmaceutical Engineering (ISPE), Second Edition, February 2011, www.ispe.org.
43. ISPE GAMP® Good Practice Guide: Manufacturing Execution Systems – A Strategic and Program Management
Approach, International Society for Pharmaceutical Engineering (ISPE), First Edition, February 2010, www.ispe.org.
44. MHRA/HRA, Joint statement on seeking consent by electronic methods, September 2018, Medicines and
Healthcare products Regulatory Agency (MHRA) and Health Research Authority (HRA), https://www.hra.nhs.uk/
planning-and-improving-research/best-practice/informing-participants-and-seeking-consent/#:~:text=Joint%20
HRA%20and%20MHRA%20statement%20on%20seeking%20consent,for%20seeking%20and%20documenti-
ng%20consent%20using%20electronic%20methods.
45. International Council for Harmonisation (ICH), ICH Harmonised Guideline, Integrated Addendum to ICH E6(R1):
Guideline for Good Clinical Practice E6(R2), Step 4, 9 November 2016, www.ich.org.
46. Rowley, J., “The Wisdom Hierarchy: Representations of the DIKW Hierarchy,” Journal of Information Science,
2007, 33(2), pp.163-180.
47. Ackoff, R. L., “From Data to Wisdom,” Journal of Applied Systems Analysis, 1989, 16(1), pp. 3-9.
49. Kane, P., A Blueprint for Knowledge Management in the Biopharmaceutical Sector, Doctoral thesis, Dublin
Institute of Technology (DIT), 2018. doi.org/10.21427/aex5-5p19.
Note on Figure 8.2: Diagram by Dr. Juan C. Dürsteler, adapted from “An Overview of Understanding” by
N. Shedroff in the book Information Anxiety 2 by R. S. Wurman (2000), further adapted by Dr. Paige Kane.
50. International Council for Harmonisation (ICH), ICH Harmonised Tripartite Guideline, Pharmaceutical
Development – Q8(R2), Step 5, August 2009, www.ich.org.
51. International Council for Harmonisation (ICH), ICH Harmonised Tripartite Guideline, Development and
Manufacture of Drug Substances (chemical entities and biotechnological/biological entities) – Q11, Step 4,
1 May 2012, www.ich.org.
52. International Council for Harmonisation (ICH), ICH Harmonised Tripartite Guideline, Technical and Regulatory
Considerations for Pharmaceutical Product Lifecycle Management – Q12, Final Version, Adopted 20 November
2019, www.ich.org.
53. Trees, L., Improving the Flow of Organizational Knowledge, 2018, Houston: APQC Knowledge Base,
www.APQC.org.
55. ISPE Cultural Excellence Report, International Society for Pharmaceutical Engineering (ISPE), Fifth Edition,
April 2017, www.ispe.org.
56. Newton, M. E., and White, C. H., “Data Quality and Data Integrity: What is the Difference?” iSpeak Blog, 15 June
2015, International Society for Pharmaceutical Engineering (ISPE), www.ispe.org.
57. Perez, A.D., Canterbury, J., Hansen, E., Samardelis, J.S., Longden, H., Rambo, R.L., “Application of the SOC2+
Process to Assessment of GxP Suppliers of IT Services,” Pharmaceutical Engineering, July/August 2019, Vol.
39, No. 4, pp. 14-20, www.ispe.org.
58. ISO/IEC 27001:2013 Information Technology -- Security Techniques -- Information Security Management
Systems -- Requirements, ISO/IEC JTC1, International Organization for Standardization (ISO), www.iso.org, and
International Electronical Commission (IEC), www.iec.ch.
59. 21 CFR Part 58 – Good Laboratory Practice for Nonclinical Laboratory Studies, Code of Federal Regulations, US
Food and Drug Administration (FDA), www.fda.gov.
60. OECD Series on Principles of Good Laboratory Practice and Compliance Monitoring, Number 15, Establishment
and Control of Archives that Operate in Compliance with the Principles of GLP, ENV/JM/MONO(2007)10,
Organisation for Economic Cooperation and Development (OECD), July 2007, www.oecd-ilibrary.org/
environment/oecd-series-on-principles-of-good-laboratory-practice-and-compliance-monitoring_2077785x.
61. 21 CFR Part 606 – Current Good Manufacturing Practice for Blood and Blood Components, Code of Federal
Regulations, US Food and Drug Administration (FDA), www.fda.gov.
62. 21 CFR Part 820 – Quality System Regulation; General, Code of Federal Regulations, US Food and Drug
Administration (FDA), www.fda.gov.
63. Commission Directive 2003/94/EC of 8 October 2003 laying down the principles and guidelines of good
manufacturing practice in respect of medicinal products for human use and investigational medicinal products for
human use, Official Journal of the European Union, www.legislation.gov.uk/eudr/2003/94/adopted.
64. Directive 2001/83/EC of the European Parliament and of the Council of 6 November 2001 on the Community
Code Relating to Medicinal Products for Human Use, Official Journal L – 311, 28/11/2004, pp. 67-128, www.ema.
europa.eu/en/documents/regulatory-procedural-guideline/directive-2001/83/ec-european-parliament-council-6-
november-2001-community-code-relating-medicinal-products-human-use_en.pdf.
65. EudraLex Volume 4 – Guidelines for Good Manufacturing Practice for Medicinal Products for Human and
Veterinary Use, http://ec.europa.eu/health/documents/eudralex/vol-4/index_en.htm.
66. PIC/S Guide to Good Manufacturing Practice for Medicinal Products, PE 009-14, July 2018, Pharmaceutical
Inspection Co-operation Scheme (PIC/S), www.picscheme.org/.
67. Guidelines on Good Distribution Practice of medicinal products for human use – 2013/C 343/01, https://eur-lex.
europa.eu/LexUriServ/LexUriServ.do?uri=OJ:C:2013:343:0001:0014:EN:PDF.
68. Regulation (EU) No 536/2014 of the European Parliament and of the Council of 16 April 2014 on clinical trials on
medicinal products for human use, and repealing Directive 2001/20/EC, ec.europa.eu/health/human-use/clinical-
trials/regulation_en.
69. PIC/S Guidance: PI 011-3 Good Practices for Computerised Systems in Regulated “GXP” Environments, 25
September 2007, Pharmaceutical Inspection Co-operation Scheme (PIC/S), www.picscheme.org.
70. OECD Series on Principles of Good Laboratory Practice and Compliance Monitoring, Number 1, OECD
Principles on Good Laboratory Practice (as revised in 1997) ENV/MC/CHEM(98)17, Organisation for Economic
Cooperation and Development (OECD), January 1998, www.oecd-ilibrary.org/environment/oecd-series-on-
principles-of-good-laboratory-practice-and-compliance-monitoring_2077785x.
71. OECD Series on Principles of Good Laboratory Practice and Compliance Monitoring, Number 17, Application of
GLP Principles to Computerised Systems, ENV/JM/MONO(2016)13, Organisation for Economic Cooperation and
Development (OECD), April 2016.
72. FDA Discussion Paper: Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD), US Food and Drug Administration (FDA), www.fda.gov/media/122535/download.
73. BSI and AAMI Position Paper 2019: “The emergence of artificial intelligence and machine learning algorithms in
healthcare: Recommendations to support governance and regulation,” https://www.bsigroup.com/globalassets/
localfiles/en-gb/about-bsi/nsb/innovation/mhra-ai-paper-2019.pdf.
74. IEC 62304:2006 Medical device software – Software life cycle processes, International Electrotechnical Commission (IEC), www.iec.ch, and International Organization for Standardization (ISO), www.iso.org.
75. Harwich, E., Laycock, K., “Thinking on its own: AI in the NHS Section 5 – Overcoming System Challenges,”
Reform, Reform Research Trust, https://reform.uk/sites/default/files/2018-11/AI%20in%20Healthcare%20report_
WEB.pdf.
76. Taulli, T., Artificial Intelligence Basics: A Non-Technical Introduction, Apress L. P., 2019. ISBN-13: 978-1-4842-
5028-0 / ISBN-13: 978-1-4842-5027-3.
77. 21 CFR Part 820.70 – Production and Process Controls, Code of Federal Regulations, US Food and Drug
Administration (FDA), www.fda.gov.
78. FDA Guidance: General Principles of Software Validation; Final Guidance for Industry and FDA Staff, January 2002, US Food and Drug Administration (FDA), www.fda.gov.
79. AAMI TIR36:2007, Validation of Software for Regulated Processes, December 2007, Association for the Advancement of Medical Instrumentation (AAMI), www.aami.org.
80. FDA Guidance for Industry: Computerized Systems Used in Clinical Investigations, May 2007, US Food and Drug Administration (FDA), www.fda.gov.
81. ASTM Standard E2500-13, “Standard Guide for Specification, Design, and Verification of Pharmaceutical and
Biopharmaceutical Manufacturing Systems and Equipment,” ASTM International, West Conshohocken, PA,
www.astm.org.
21 Appendix G2 – Glossary
21.1 Acronyms and Abbreviations
21.2 Definitions
Archive
A designated secure area or facility (e.g., cabinet, room, building, or computerised system) for the long-term retention of data and metadata for the purposes of verification of the process or activity.
Biometrics
A method of verifying an individual’s identity based on measurement of the individual’s physical feature(s) or repeatable action(s) where those features and/or actions are both unique to that individual and measurable.
Computerized System
A broad range of systems including, but not limited to, automated manufacturing equipment, automated laboratory equipment, process control and process analytical, manufacturing execution, laboratory information management, manufacturing resource planning, clinical trials data management, vigilance, and document management systems. The computerized system consists of the hardware, software, and network components, together with the controlled functions and associated documentation.
Critical Thinking (ISPE GAMP® Guide: Records and Data Integrity [8])
A systematic, rational, and disciplined process of evaluating information from a variety of perspectives to yield a
balanced and well-reasoned answer.
Data Governance
The arrangements to ensure that data, irrespective of the format in which they are generated, are recorded, processed, retained, and used to ensure a complete, consistent, and accurate record throughout the data lifecycle.
Data integrity is the degree to which data are complete, consistent, accurate, trustworthy, reliable and that these
characteristics of the data are maintained throughout the data life cycle. The data should be collected and maintained
in a secure manner, so that they are attributable, legible, contemporaneously recorded, original (or a true copy) and
accurate. Assuring data integrity requires appropriate quality and risk management systems, including adherence to
sound scientific principles and good documentation practices.
Data migration is the process of moving stored data from one durable storage location to another. This may include
changing the format of data, but not the content or meaning.
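As an illustrative sketch only (the file names, fields, and format choice below are hypothetical and not taken from this Guide), a format-only migration might rewrite CSV records as JSON and then confirm that the content is unchanged:

# Minimal sketch of a format-only data migration: the storage format changes
# from CSV to JSON while the record content and meaning are preserved.
import csv
import json

def migrate_csv_to_json(source_path: str, target_path: str) -> None:
    # Read the records exactly as captured in the source file.
    with open(source_path, newline="") as src:
        records = list(csv.DictReader(src))

    # Write the same records in the new format; the values are not altered.
    with open(target_path, "w") as dst:
        json.dump(records, dst, indent=2)

    # Verify that the migrated data are identical in content to the source.
    with open(target_path) as dst:
        if json.load(dst) != records:
            raise ValueError("Content changed during migration")

migrate_csv_to_json("batch_records.csv", "batch_records.json")

A content check of this kind reflects the definition above: the format may change during migration, but the content and meaning may not.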
The assurance that data produced is exactly what was intended to be produced and fit for its intended purpose. This
incorporates ALCOA.
Data quality is the assurance that the data produced are generated according to applicable standards and fit for
intended purpose in regard to the meaning of the data and the context that supports it. Data quality affects the value
and overall acceptability of the data in regard to decision-making or onward use.
Data transfer is the process of transferring data between different data storage types, formats, or computerized
systems.
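As a further hedged illustration (the paths and checksum approach are hypothetical examples, not a prescribed method), a transfer between storage locations can be verified by comparing checksums of the source and destination copies:

# Minimal sketch of a verified data transfer: the file is copied to a new
# storage location and a checksum comparison confirms the contents match.
import hashlib
import shutil

def sha256_of(path: str) -> str:
    # Compute the SHA-256 digest of a file, reading it in chunks.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def transfer_with_verification(source: str, destination: str) -> None:
    shutil.copy2(source, destination)  # copy the file and its timestamps
    if sha256_of(source) != sha256_of(destination):
        raise RuntimeError("Transfer verification failed: checksums differ")

transfer_with_verification("results.xml", "/archive/results.xml")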
Electronic Record
Any combination of text, graphics, data, audio, pictorial, or other information representation in digital form that is created, modified, maintained, archived, retrieved, or distributed by a computer system.
Electronic Signature
A computer data compilation of any symbol or series of symbols executed, adopted, or authorized by an individual to be the legally binding equivalent of the individual’s handwritten signature.
GxP Regulation (ISPE GAMP® Guide: Records and Data Integrity [8])
The underlying international pharmaceutical requirements, such as those set forth in the US FD&C Act, US PHS
Act, FDA regulations, EU Directives and guidelines, Japanese regulations, or other applicable national legislation or
regulations under which a company operates. These include but are not limited to:
• Good Manufacturing Practice (GMP) (pharmaceutical, including Active Pharmaceutical Ingredient (API),
veterinary, and blood)
Raw data is defined as the original record (data) which can be described as the first-capture of information, whether
recorded on paper or electronically. Information that is originally captured in a dynamic state should remain available
in that state.
600 N. Westshore Blvd., Suite 900, Tampa, Florida 33609 USA
Tel: +1-813-960-2105, Fax: +1-813-264-2816
www.ISPE.org