

Former Member
August 29, 2014 | 26 minute read

SAP BODS Reference guide for any project

 8  28  28,956

1. Introduction

SAP Business Objects Data Services is an Extract, Transform and Load (ETL) tool used to move and manipulate data between source and target environments. SAP Data Services provides a data management platform that can support various initiatives including business intelligence, data migration, application integration and many other specific applications. SAP Data Services Jobs are the executable components within the application that can be deployed in either a batch or real-time (services) based architecture.

The following document details the best practices regarding development within the SAP Data Services product. This includes:

General SAP Data Services naming standards
Design best practices
Performance considerations
Audit and execution framework
Audit database schema

Related areas that are not covered in this document include:

Change control and project migration


Data modeling techniques
1.1. Audience

This is a technical document intended only for developers and peer reviewers who are already experienced in SAP Data Services.

2. Data Services Naming Standards

2.1. Overview

The use of naming conventions within SAP Data Services assists in supporting a single- or multi-user development environment in a controlled fashion. It also assists in the production of documentation: with correct naming and object descriptions, Data Services can produce component-based documentation through its Auto Documentation tool, found within the web-based Management Console application.

The following sections describe the naming conventions for each type of object in Data Services.

The use of naming conventions can result in long names. To avoid very long object names being truncated in the design workspace of Data Services Designer, it is possible to increase the number of characters displayed for an object. To do so, in the Designer go to Tools > Options:

The parameter "Number of characters in workspace icon name" defines the maximum number of characters displayed in the workspace. Set this parameter to the desired value.

As a general note, Data Services object names should not have the following embedded in them:

Object versions (e.g. naming a Data Flow DF_LOAD_SALES_V0.3). Versioning should be handled by central repositories, not by naming conventions.
Environment-specific information (e.g. naming a datastore DS_WAREHOUSE_DEV_1). Environment information should be configured using datastore configurations, not by creating different names for each datastore.

2.2. Server Environment Objects

Object | Naming Convention | Example
SANDPIT | SPT | JS_PRJ_SPT_001
DEVELOPMENT | DEV | JS_PRJ_DEV_001
TESTING | TST | JS_PRJ_TST_001
PRODUCTION | PRD | JS_PRJ_PRD_001

2.3. Server Objects

The naming conventions for other server-side objects are defined below:

Object | Naming Convention | Example
Job Server | JS_ | JS_PRJ_SPT_001
Job Server Group (Cluster) | JS_GR_ | JS_GR_PRJ_TST_001
Data Services Local Repository | DSL_ | DSL_SPT_001
Data Services Central Repository | DSC_ | DSC_SPT_001
Data Services Profile Repository | DSP_ | DSP_DEV_001
Data Services Data Quality Repository | DSQ_ | DSQ_DEV_001

2.4. Reusable Objects

Object | Naming Convention | Example
Project | PRJ_{Name} | PRJ_Load_Warehouse
Batch Job | JOB_{Short Name}_{Description} | JOB_LW_Load_Data_Warehouse
Real Time Job | RJB_{Short Name}_{Description} | RJB_LW_Update_Customers
Work Flow contained in one Job only | WF_{Job Short Name}_{Description} | WF_LW_Load_Dimensions
Work Flow that is reused | WF_G_{Description} | WF_G_Load_Dimensions
Data Flow contained in one Job only | DF_{Job Short Name}_{Description} | DF_LW_Load_Customer
Data Flow that is reused | DF_G_{Description} | DF_G_Start_Job
Embedded Data Flow | Same as Data Flow except use EDF_ | EDF_G_Write_Audit
ABAP Data Flow | Same as Data Flow except use ADF_ | ADF_LW_Load_Customer
Custom Function contained in one Job only | FN_{Job Short Name}_{Description} | FN_LW_Customer_Lookup
Custom Function that is reused | FN_G_{Description} | FN_G_Convert_Time
SAP Datastore Configuration | {ENV}_{SYSTEM}_{Client} | TST_BIO_400
Non-SAP Datastore Configuration | {ENV}_{SYSTEM}_{Description} | DEV_IM_ADMIN
Server Configuration | {Number}.{ENV}_{Description} | 1.Sandpit or 4.TST_MIGRATION

2.5. Sources and Targets


Object | Naming Convention | Example
Datastore that connects to a database | DS_{Description} | DS_Source
Datastore that connects to a web service | DS_WS_{Description} | DS_WS_Customers
Datastore that connects to a custom adapter | DS_{Type}_{Description} | DS_HTTP_Legacy_Customers
Application Datastore that connects to an application e.g. SAP R/3 | AP_{Application}_{Description} | DS_R3_Finance
Application Datastore that connects to an SAP BW Source | AP_BW_{Description} | DS_BW_Sales
File Format Template | FMT_{Delimiter}_{Description} (Delimiter = CSV, TAB, FIX) | FMT_CSV_Customers
DTD | DTD_{Name} | DTD_Customer_Hierarchy
XSD Schema | XSD_{Name} | XSD_Customer_Hierarchy
SAP IDoc | IDC_{Name} | IDC_SAP_Customers
Cobol Copy Book | CCB_{Name} | CCB_Account

2.6. SAP-specific Extractions

The principle of keeping all the SAP extraction names the same helps with debugging of flows inside SAP. The application name describes the source (SAP DM, BP or BW).

Object | Naming Convention | Example
SAP R/3 Data Flow | Z{APP}{Name}{DESC} | ZBWFLEX3DSO
Generated ABAP Filename | Z{APP}{Name}{DESC} | ZBWFLEX3DSO.sba
ABAP Program Name in R/3 | Z{APP}{Name}{DESC} | ZBWFLEX3DSO
Job Name in R/3 | Z{APP}{Name}{DESC} | ZBWFLEX3DSO

2.7. Work Flow Objects

Object | Naming Convention | Example
Script | SCR_{Description} | SCR_Initialise_Variables
Condition | CD_{Description} | CD_Full_Or_Delta
While Loop | WHL_{Description} | WHL_No_More_Files
Try | TRY_{Description} | TRY_Dimension_Load
Catch | CAT_{Description}_{Error group} | CAT_Dimension_Load_All

2.8. Variables

Object | Naming Convention | Example
Global Variable | $G_{Description} | $G_Start_Time
Parameter Variable – Input | $P_I_{Description} | $P_I_File_Name
Parameter Variable – Output | $P_O_{Description} | $P_O_Customer_ID
Parameter Variable – Input/Output | $P_IO_{Description} | $P_IO_Running_Total
Local Variable | $L_{Description} | $L_Counter

2.9. Transforms

Object | Naming Convention | Example
Case | CSE_{Description} | CSE_Countries
Date Generation | DTE_{Description} | DTE_GENERATION
Data Transfer | DTF_{Description} | DTF_StageData
Effective Date | EFD_{Description} | EFD_Effective_From_Date_Seq
Hierarchy Flattening (Horizontal) | HFH_{Description} | HFH_Customers
Hierarchy Flattening (Vertical) | HFV_{Description} | HFV_Customers
History Preservation | HSP_{Description} | HSP_Products
Map CDC Operation | CDC_{Description} | CDC_Products
Map Operation | MAP_{Description} | MAP_Customer_Updates
Merge | MRG_{Description} | MRG_Customers
Pivot | PVT_{Description} | PVT_Products
Query | QRY_{Description} | QRY_Map_Customers
Reverse Pivot | RPT_{Description} | RPT_Products
Row Generation | ROW_{Number of Rows} | ROW_1000
SQL | SQL_{Description} | SQL_Extract_Customers
Table Comparison | TCP_{Target Table} | TCP_Customer_Dimension
Validation | VAL_{Description} | VAL_Customer_Flatfile
XML Pipeline | XPL_{Description} | XPL_Cust_Hierarchy

3. General Design Standards

3.1. Batch Jobs

Batch Jobs should generally contain all the logic for a related set of activities. The content and functionality of each Job should be driven by the scheduling requirements. This mechanism generally separates Jobs by the source system accessed and by frequency of execution, i.e. for each period (such as nightly, weekly, etc.) that needs to be delivered. This is because different systems will have different availability times, and hence the Jobs will have different scheduling requirements.

Jobs should also be built with the following guidelines:

Work Flows should be the only objects used at the Job level. The only exceptions are Try and Catch blocks and Conditionals where Job-level replication is required.

Parallel Work Flows should be avoided at the Job level, as Try and Catch cannot be applied when items are in parallel.

3.2. Real-Time Jobs

Real-time jobs should only be considered when there is a need to process XML messages in real time, or where real-time integration is required with another application, e.g. SAP R/3 IDocs. Real-time jobs should not be used where:

Systems only need to be near-time. A better approach is to create a batch job and run it regularly (e.g. every 5 minutes).
Complex ETL processing is required, such as aggregations.

Often real-time jobs will be used to process XML into a staging area, and a batch job will run regularly to complete the processing and perform aggregations and other complex business functions.

3.3. Comments

Comments should be included throughout Data Services jobs.  With the Auto documentation functionality, comments can be passed straight through into the
technical documentation.

Comments should be added in the following places:

Description field of each object. Every reusable object (i.e. Job, Work Flow, Data Flow, etc.) has a description field available. This should include the author, date, and a short description of the object.
Scripts and Functions – comments are indicated by a # in scripts and functions. At the top of any code should be the author, create date, and a short description of the script (see the sketch after this list). Comments should be included within the code to describe tasks that are not self-explanatory.
Annotations – Annotations should be used to describe areas of a Work Flow or Data Flow that are not self-explanatory. It is not necessary to clutter the design areas with useless comments such as "this query joins the tables".
Field Comments – Tables should have comments attached to each field. These can be manually entered, imported from the database, or imported from any tool that supports CWM (Common Warehouse Metamodel).
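For example, a script header of the kind described above might look like the following sketch (the author, date and job name are placeholders):

# Author       : <author name>
# Date created : 2014-08-29
# Description  : Initialises the global variables for JOB_LW_Load_Data_Warehouse and
#                records the start of the execution in the audit tables.
$G_Job_Start_Time = sysdate();   # first statement of the script proper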

3.4. Global Variables

Variables that are specific to a Data Flow or Work Flow should NOT be declared as global variables. They should be declared as local variables and passed as parameters to the dependent objects. The reasoning behind this is two-fold. Firstly, because Data Services can run these objects in a sequential or parallel execution framework, local variables and parameters allow values to be modified without affecting other processes. Secondly, Work Flows and Data Flows can be reused in multiple Jobs, and by declaring local variables and parameters you break the reliance on Job-level global variables having been configured and assigned the appropriate values. Some examples of variables that should be defined locally are:

The filename for a flat file source for a Data Flow to load
Incremental variables used for conditionals or while-loops

The global variables that are used should be standardized across the company. Some examples of valid global variables are listed in the table below; a sketch of an initialisation script that sets several of them follows the table.

Variable | Description | Example
Recovery Flag | A flag that is used to indicate the job should be executed in recovery mode | $G_Recovery
Start Date-Time | The start time variable should indicate the date and time that the job should start loading data from. This is often the finish date of the last execution | $G_Start_Datetime
End Time | The end time variable should indicate the date and time that the job should end loading data from. This should be set when the job starts in order to avoid overlaps | $G_End_Datetime
Debug | A flag that tells the job to run in a debug mode. The debug mode allows custom debug commands to run | $G_Debug
Log | A flag that tells the job to run in logging mode | $G_Log
Execution Id | An ID that represents the current execution of the job. This is used as a reference point when writing to audit tables | $G_Exec_ID
Job Id | An ID that represents the job. This is used as a reference point when writing to audit tables | $G_Job_ID
Database Type | When developing generic jobs, it can often be useful to know the underlying database type (SQL Server, Oracle, etc.) | $G_DB_Type
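A minimal sketch of an initialisation script that sets some of these standard globals is shown below; the PROCESS_EXECUTION query and the date format are assumptions and should be adapted to the audit schema and database in use:

# SCR_Initialise_Variables - sets the standard globals for this run
$G_Recovery       = 'N';          # set to 'Y' only when re-running after a failure
$G_Debug          = 'Y';
$G_Log            = 'Y';
$G_End_Datetime   = sysdate();    # capture the load window end before processing starts
# window start = finish time of the last execution (illustrative SQL and date format)
$G_Start_Datetime = to_date(sql('DS_PROJ_ADMIN', 'SELECT MAX(END_TIME) FROM PROCESS_EXECUTION'), 'yyyy.mm.dd hh24:mi:ss');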

3.5. Work Flows

The following guidelines should be followed when building Work Flows:

Objects can be left unconnected to run in parallel if they are not dependent on each other. Parallel execution is particularly useful for Work Flows that are replicating a large number of tables into a different environment, or mass-loading flat files (common in extract jobs). However, care needs to be taken when running parallel Data Flows, particularly if the parallel Data Flows are using the same source and target tables. A limit can be set on the number of available parallel execution streams under Tools – Options – Job Server – Environment settings (the default is 8) within the Data Services Designer tool.

Work Flows should not rely on global variables for local tasks; instead, local variables should be declared as local and passed as parameters to the Data Flows that require them. It is acceptable to use global variables for environment and global references; however, other than the "initialization" Work Flow that starts a Job, Work Flows should generally only reference global variables, not modify them.

3.6. Try/Catch

Try and Catch objects should generally be used at the start of a Job and at the end of a Job. The catch at the end of the Job can be used to log a failure to the audit tables, notify someone of the failure, or provide other required custom functionality. Try/Catch objects can be placed at the Job and Work Flow level and can also be programmatically referenced within the scripting language.

Generally, try/catch shouldn't be used as it would be in typical programming languages such as Java, because in Data Services, if something goes wrong, the best approach is usually to stop all processing and investigate.

It is quite common in the "catch" object to have a script that re-raises an exception (using the raise_exception() or raise_exception_ext() functions). This allows the error to be caught and logged while, at the same time, the Job is still marked with a red light in the Data Services Administrator to indicate that it failed.
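A sketch of such a catch script is shown below; it assumes the framework function FN_G_ErrorProcessExecution described in section 7.16 is available, and that error_message() is used inside the Catch block:

# SCR_Catch_All - placed inside CAT_Dimension_Load_All (catch-all exception group)
FN_G_ErrorProcessExecution($G_Exec_ID, $G_DB_Type);                      # mark the execution as ERROR in the audit tables
print('Job failed: ' || error_message());                               # surface the original error in the trace log
raise_exception_ext('Job [$G_Job_ID] failed: ' || error_message(), 1);  # re-raise so the job still shows as failed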

3.7. While Loops

While loops are mostly used for jobs that need to load a series of flat files or XML files and perform some additional functions on them, such as moving them to a backup directory and updating control tables to indicate load success or failure.

The same standards regarding the use of global variables should also be applied to while loops. This means variables that need to be updated (such as an iteration variable) should be declared as local variables. The local variables should be passed to the underlying Data Flows using parameters.
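The sketch below shows the shape of such a loop; the file-list variable, the separator and the surrounding control-table updates are illustrative only:

# WHL_No_More_Files - loop condition: $L_File_Index <= $L_File_Count
# SCR_Next_File (first object inside the loop), using local variables per the guideline above
$L_Current_File = word_ext($L_File_List, $L_File_Index, ',');   # pick the next file from a comma-separated list
print('Processing file [$L_Current_File]');
# ... a Data Flow runs here with $L_Current_File passed in as $P_I_File_Name ...
$L_File_Index = $L_File_Index + 1;                              # advance the iteration variable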

3.8. Conditionals

Conditionals are used to choose which object(s) should be used for a particular execution. Conditionals can contain all objects that a Work Flow can contain. They are generally used for the following types of tasks:

Indicating if a job should run in recovery mode or not.
Indicating if a job should be an initial or delta load.
Indicating whether a job is the nightly batch or a weekly batch (i.e. the weekly batch may have additional business processing).
Indicating whether parts of a job should be executed, such as executing the extract, clean, and conform steps but not the deliver step.
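As a sketch, the If expression of a conditional such as CD_Full_Or_Delta is simply a Boolean expression over a variable set earlier in the Job (the value 'DELTA' is illustrative); the Then branch holds the delta Work Flow and the Else branch the initial-load Work Flow:

# If expression of CD_Full_Or_Delta ($G_Execution_Type is set in the initialisation script)
($G_Execution_Type = 'DELTA')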

3.9. Scripts and Custom Functions

The following guidelines should be followed when building scripts and custom functions:

The sql() function should be used only as a last resort, because tables accessed through the sql() function are not visible in the metadata manager. The lookup_ext() function can be used for lookup-related queries, and a Data Flow should be built for insert/update/delete queries (see the sketch at the end of this section).
Custom functions should be written where the logic is too complex to write directly into the mapping part of a Data Flow, or where the logic needs to be componentized, reused and documented in more detail.
Global variables should never be referenced in a custom function; they should be passed in/out as parameters. A custom function can be shared across multiple Jobs and therefore referencing Job-level global variables is bad practice.

Note the following areas to be careful of when using custom functions:

Often custom functions will cause the Data Flow's pushdown SQL not to generate effectively. This often happens when using a custom function in the where clause of a query.
Calling custom functions in high-volume Data Flows can cause performance degradation (particularly where parallel execution is used).
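The sketch below shows the narrow case where sql() is acceptable, fetching a single control value in a script, with the caveat that the table will not appear in the metadata manager; the datastore and table names are illustrative:

# Acceptable: a one-off scalar read in a script (note: not visible to impact/lineage analysis)
$L_Batch_Date = sql('DS_PROJ_ADMIN', 'SELECT MAX(LOAD_DATE) FROM CONTROL_BATCH');
# Row-by-row lookups belong in lookup_ext() inside a Query transform,
# and insert/update/delete logic belongs in its own Data Flow.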

3.10. Data Flows

In general, a Data Flow should be designed to load information from one or more sources into a single target. A single Data Flow should generally not have multiple tables as a target. Exceptions are:

Writing out to auditing tables (i.e. writing out the row count).
Writing out invalid rows to a backup table.

The following items should be considered best practices in designing efficient and clean Data Flows:

All template/temporary tables should be imported, approved and optimized by database experts before release into a production environment.
The "Pushdown SQL" should be reviewed to ensure indexes and partitions are being used efficiently.
All redundant code (such as useless transforms or extra fields) should be removed before releasing.
Generally the most efficient method of building a Data Flow is to use the least number of transforms.

There are several common practices that can cause instability and performance problems in the design of Data Flows. These are mostly caused when Data Services needs to load entire datasets into memory in order to achieve a task. Some tips to avoid these are as follows:

Ensure all source tables in the Data Flow are from the same datastore, thus allowing the entire SQL command to be pushed down to the database.
Each Data Flow should use one main target table (this excludes tables used for auditing and rejected rows).
Generally the "Pushdown SQL" should contain only one SQL command. There are cases where more commands are acceptable, for example if one of the tables being queried only returns a small number of rows; however, multiple SQL commands generally mean that Data Services needs to perform in-memory joins, which can cause memory problems.
Check that all "order by", "where", and "group by" clauses in the queries are included in the pushdown SQL.
If reverse pivot transforms are used, check that the input volume is known and consistent and can therefore be tested.
If the "PRE_LOAD_CACHE" option is being used on lookups, ensure that the translation table dataset is small enough to fit into memory and will always be of a comparable size.
Always try to use the "sorted input" option on table comparisons, being careful to ensure that the input is sorted in the "pushdown SQL".

4. SAP Data Services Design Guidelines

4.1. Overview

Technical requirements should identify all sources, targets, and the transforms and mappings that should occur between them. The best technique for translating these requirements into a SAP Data Services design is to use the Kimball[1] recommended ETL technique of Extract, Clean, Conform, and Deliver. The Kimball techniques are industry-accepted and work very well with the Data Services architecture. These steps translate to the following real-world examples:

Staging (Extract) – Staging the information from source systems and loading it into a temporary/persistent staging area.
Transformation (Conform) – The transformation step is where the data is standardized for the target system. This step is generally the most complex and will include matching disparate data sources, de-duplication, aggregations and any other business rules required to transform the source information into the target data structures.
Validation (Clean) – The validation step is used to detect and record the existence of data-quality errors on the target side.
Load (Deliver) – This is the final step, which involves loading the information into the target systems or generating flat files.

Each of these steps can be translated in SAP Data Services into a Data Flow (or a series of Data Flows for more complex operations).

4.2. Extract Data Flow

The purpose of an extract Data Flow is to take a source dataset and load it into an equivalent staging table. The source datasets could be any of the following:

A table in a database (i.e. Oracle, SQL Server)
A fixed-format or delimited flat file
An XML document
A supported application interface (i.e. SAP IDoc)

The extract Data Flow should be designed based on the following principles:

The staging table should be a near match of the source dataset and should include all the fields from the source dataset. Including all fields is a trivial exercise and can be useful in that the extract job will not need to be modified and retested if other fields are required in the future.
Additional value-add fields can be added to the staging table, such as:
A surrogate key for the record (this is useful for auditing and data lineage)
Date/time the record was loaded into staging
Date/time the record was loaded into the target system
Flag indicating if the record quality has been validated
Flag indicating if the record has been processed into the target system
The source system that the record came from

Note: All of the additional values above can be assigned to the record by referencing the execution ID of the Job within the EIM framework database.

The Data Flow should generally be very simple, containing only the source, one query transform, the target table, and any audit tables.

Where possible, the query transform should be used to filter the incoming dataset so that only new or updated records are loaded each time (source-based changed data capture), as sketched below.
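As a sketch, the where clause of the extract Query transform could use the standard date-window globals, assuming the source table carries a last-changed timestamp (the Query, table and column names are illustrative):

# Where clause of QRY_LW_Extract_Customer (pushes down, since it compares a column to variables)
SRC_CUSTOMER.CHANGE_DATE >= $G_Start_Datetime
AND SRC_CUSTOMER.CHANGE_DATE < $G_End_Datetime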

5. Performance Considerations

5.1. Overview

The approach to producing stable and efficient Data Flows within Data Services is to ensure that the minimal amount of data flows through the Data Flows and that as many operations as possible are performed on the database. When this doesn't happen, bottlenecks can occur that make the flows inefficient. Some typical causes of these problems are:

SQL not being pushed down correctly to the database (i.e. the where condition, group by, and order by commands)
Table comparisons using incorrect target table caching options
Target table auto-update
Reverse Pivot transforms
Complex XML generation

5.2. Pushdown SQL

It is important with large incoming datasets to ensure that the "pushdown SQL" command is running efficiently. Running large queries that have not been optimized can create a severe impact on the database server.

The following items in the pushdown SQL should be checked:

If the incoming datasets are small, it may not be necessary to index every field; however, in general, indexes should be in place on all filtered and joined fields (this may not be possible depending on the source environment). The extract, clean, conform and deliver model described previously allows us to reduce the impact of the overall ETL process on the source systems by staging the data at various points in the process, and therefore allows us to index and partition the data tables where required.
The optimized SQL generated by Data Services should be pushed down to one command. If there are multiple SQL commands, this usually means that SDS will need to perform a potentially memory-intensive join on the Job Server.
Any Sort, Where, and Group By clauses in queries should be reflected in the optimized SQL.

Some common reasons the Where clause doesn't push down to the SQL include:

Using a custom or complex function in the Where clause. The way around this is to set the variable values in a script prior to the Data Flow and replace the custom function with the variables where possible (see the sketch at the end of this section).
Routing a source table into multiple queries. If you need to use the same source table multiple times in a single Data Flow, you should add multiple instances of the source table to the Data Flow and connect each to the respective Query object(s).

The above statements are not strict rules, and there are many exceptions that can pass through without the pushdown being affected. These include:

Using the Where clause to route data to multiple queries (for example, routing rejected records to a different table)
Filtering values that have been derived within the Data Flow
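A sketch of the script workaround mentioned above is shown below; FN_G_Convert_Time is the reusable custom function from the naming examples in section 2.4, and the column names are illustrative:

# SCR_Set_Filter_Values - resolve the custom function once, before the Data Flow runs
$G_Cutoff_Date = FN_G_Convert_Time($G_End_Datetime);
# The Query where clause then compares columns to plain variables, which can be pushed down:
#   STG_SALES.LAST_UPDATE >= $G_Start_Datetime AND STG_SALES.LAST_UPDATE < $G_Cutoff_Date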

5.3. Table Comparison Optimization

In general, the "sorted input" option should be ticked when using table comparisons. The alternatives are:

No caching – this option doesn't have any memory impact; however, it is by far the slowest option and should only be used if the input dataset is known to be very small.
Cached comparison table – this option is similar in speed to the sorted input option; however, it means that the entire comparison table will be cached into memory.

The key to using the "sorted input" option is to ensure that the incoming dataset is sorted. This sort must be done in the pushdown SQL, otherwise the memory problems associated with large datasets could still occur.

5.4. Reverse Pivot Transform

The Reverse Pivot transform is a very useful transform that can be used to turn row values into column names. This transform has a "group by" checkbox that allows it to perform the pivot more efficiently if the incoming dataset is grouped by the non-pivot columns. Generally, a query should be used before the reverse pivot to sort the data by the non-pivoted columns (ensuring this sort is reflected in the pushdown SQL). This will improve performance and reduce the memory requirements of the transform.

5.5. Target Table Auto-Update

Auto-correct load, within the update control options, can be a tempting method of ensuring that primary key violations do not occur. The problems with using it are that it performs very badly across heterogeneous databases (it updates all rows whether they have changed or not) and it often goes unnoticed when code reviews are performed. A better method of achieving the same functionality is to use a Table Comparison transform before loading the target table. Using the table comparison has the following advantages:

The columns that cause an update can be defined (vs. just using all the columns)
The sorted input and caching options can be used to improve performance
It is more readable and clearer on the Data Flow

On Oracle, the auto-correct load option can be implemented as a Merge command to improve performance. If auto-correct is selected, then document that this is the case in the Data Flow by adding an annotation. This will improve visibility, support and maintenance of the Data Flow.

5.6. Case Transforms

The Case transform should never be used simply as a filter, because the "Pushdown SQL" will not reflect the filter and unnecessary rows will be pulled from the underlying database into the SDS engine. The better way to do this is to use the Where clause in a Query object to filter the dataset you require from the source database, and then use a Case transform to split the dataset and route the data down the correct path.

6. Job Template and Execution Framework

SAP Data Services provides a data management platform that can support various initiatives including business intelligence, data migration, application integration and many more specific applications. SAP Data Services Jobs are the executable components within the application that can be deployed in either a batch or real-time (services) based architecture.

To ensure that all SAP Data Services Jobs follow a consistent strategy for storing Job parameters and recording Job execution, including messages, statistics and error handling, a framework has been designed. The framework contains a number of shared components where commonality is possible, delivering efficiency and cost savings in multiple project deployments and maintenance.
The details of the database schema required to support the framework are summarised below.

The database schema is designed for use in four main ways:

To parameterize the Jobs and store the parameter values in a database structure external to the Job and application layer
To record the execution of the Jobs within the SAP Data Services application framework, recording either successful execution or failure within the schema. Execution can be recorded at either Job or step level
To record messages, statistics and parameter values within the Jobs in a standard framework for reporting and monitoring purposes
To enable flexible configuration considering multiple environments, types of execution runs, various execution steps, etc.

7. Framework Custom Functions

The following custom functions are utilized to perform shared tasks within the framework templates. These custom functions are written to perform generic tasks and as such are not tied to any specific template or project. If project-specific additional functionality is required, then the custom functions can be replicated and renamed with a project reference.

7.1. FN_G_StartProcessExecution

Inputs: $P_I_DB_Type
        $P_I_ObjectID
        $P_I_Execution_Type_ID
        $P_I_Process_Start_DateTime

Output: $L_ProcessExecID

Description: Records the execution of the job in the PROCESS_EXECUTION table, by setting the STATUS to ‘STARTED’. This status is then updated to either ‘COMPLETED’ or ‘ERROR’ based on the execution flow of the job. It returns the execution ID of the current execution.
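A sketch of how this function might be called from the pre-processing script, using the globals described in section 8.1:

$G_Job_Start_Time = sysdate();
$G_Exec_ID = FN_G_StartProcessExecution($G_DB_Type, $G_Job_ID, $G_Execution_Type_ID, $G_Job_Start_Time);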

7.2. FN_G_StartProcessExecutionStep

Inputs: $P_I_DB_Type
        $P_I_ExecutionID
        $P_I_ObjectStepID

Output: $L_ExecStepID

Description: Records the execution of the job step in the PROCESS_EXECUTION_STEP table, by setting the STATUS to ‘STARTED’. This status is then updated to either ‘COMPLETED’ or ‘ERROR’ based on the execution flow of the job for that step. It returns the execution ID of the current step execution.

7.3. FN_G_InsertStatistic

Inputs: $P_I_ProcessExecutionID
        $P_I_StatisticID
        $P_I_MeasuredObjectID
        $P_I_StatisticValue
        $P_I_DB_Type

Output: NA

Description: Records the statistics for a particular object in the PROCESS_STATISTIC table. You can define the object to be measured in PROCESS_OBJECT and the type of statistic in the PROCESS_OBJECT table.

7.4. FN_G_InsertProcessExecutionParameter

Inputs: $P_I_ProcessExecutionID
        $P_I_ParameterValue
        $P_I_DB_Type
        $P_I_ProcessObjParamID

Output: $L_ProcessExecParam_ID

Description: Records the instance of the parameters for the specific execution. For every execution, it records the parameter values which were passed to execute that particular job. This provides quick insight into the job execution during troubleshooting.

7.5. FN_G_InsertMessage

Inputs: $P_I_ProcessExecutionID
        $P_I_MessageText
        $P_I_MessageType
        $P_I_Debug
        $P_I_Log
        $P_I_DB_Type
        $P_I_Version

Output: NA

Description: Records the messages from various components of the job for a specific execution and message type. These messages are generally information, warning and error messages.
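A sketch of a typical call from a script; the message type 'INFO' and the version value 1 are assumptions and should match the values defined in the framework tables:

# Logged to PROCESS_MESSAGE only if $G_Log = 'Y'; printed to the job monitor only if $G_Debug = 'Y'
FN_G_InsertMessage($G_Exec_ID, 'Staging load completed for job [$G_Job_ID]', 'INFO', $G_Debug, $G_Log, $G_DB_Type, 1);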

7.6. FN_G_GetStepTypeID

Inputs: $P_I_StepTypeName
        $P_I_Version

Output: $L_StepTypeID

Description: Returns the PROCESS_STEP_TYPE_ID from the PROCESS_STEP_TYPE table for the input StepTypeName.

7.7. FN_G_GetProcessObjStepID

Inputs: $P_I_StepTypeID
        $P_I_ObjectID
        $P_I_Version

Output: $L_ObjectStepID

Description: Returns the PROCESS_OBJECT_STEP_ID from the PROCESS_OBJECT_STEP table for input object and step type.

7.8. FN_G_GetProcessObjParamID

Inputs: $P_I_ObjectID
        $P_I_ParameterName
        $P_I_Version

Output: $L_ProcessObjParam_ID

Description: Returns the process_object_parameter_ID for input object and parameter name.

7.9. FN_G_GetParameterValue

Inputs: $P_I_ObjectID
        $P_I_ParameterName
        $P_I_Version

Output: $L_paramValue

Description: Returns the Parameter Value for input object and parameter name.

7.10. FN_G_GetObjectID

Inputs: $P_I_ObjectName
        $P_I_ObjectTypeName
        $P_I_Version

Output: $L_ObjectID

Description: Returns the Object ID for the input object name and object type.

7.11. FN_G_GetMessageTypeID

Inputs: $P_I_MessageTypeName
        $P_I_Version

Output: $L_MessageTypeID

Description: Returns the PROCESS_MESSAGE_TYPE_ID from the PROCESS_MESSAGE_TYPE table


7.12. FN_G_GetLatestRunType

Inputs: $P_I_JobName

Output: $L_RunType

Description: Gets the Execution Type of the specified Job for the latest execution record in the PROCESS_EXECUTION table.

7.13. FN_G_GetLatestRunStatus

Inputs: $P_I_JobName

Output: $L_status

Description: Gets the STATUS of the specified Job for the latest execution record in the PROCESS_EXECUTION table.

7.14. FN_G_GetExecutionTypeID

Inputs: $P_I_EXECUTION_TYPE
        $P_I_Version

Output: $L_ExecTypeID

Description: Gets the PROCESS_EXECUTION_TYPE_ID from the PROCESS_EXECUTION_TYPE table for the defined RUN_TYPE for the job.

7.15. FN_G_ErrorProcessExecutionStep

Inputs: $P_I_ProcessExecutionStepID
        $P_I_DB_Type

Output: NA

Description: Updates a row in the process_execution_step table, setting the STATUS to ‘ERROR’ based on the input execution step ID

7.16. FN_G_ErrorProcessExecution

Inputs: $P_I_ProcessExecutionID
        $P_I_DB_Type

Output: NA

Description: Updates a row in the process_execution table, setting the STATUS to ‘ERROR’ and END_TIME to system time, based on the input execution ID

7.17. FN_G_EndProcessExecutionStep

Inputs: $P_I_ProcessExecutionStepID
        $P_I_DB_Type

Output: NA

Description: Updates a row in the process_execution_step table, setting the STATUS to ‘COMPLETED’ based on the input execution step ID

7.18. FN_G_EndProcessExecution

Inputs: $P_I_ProcessExecutionID
        $P_I_DB_Type

Output: NA

Description: Updates a row in the process_execution table, setting the STATUS to ‘COMPLETED’ and END_TIME to system time, based on the input
execution ID

8. Framework Structure

The purpose of the framework is to maintain control over the Data Services deployment as it is rolled out through the enterprise. The initial design considerations cover a number of interface patterns, particularly multi-source merge, multi-target split and point-to-point direct patterns.

The generic template Job is broken down into a number of components, and the main processing logic is surrounded by a series of Try/Catch blocks which apply standardized error-handling logic.

8.1. Pre-Processing Logic – WF_G_Pre_Processing

The pre-processing step contains the standard template logic to initialize the global variables and record the successful start of the Job execution in the PROCESS_EXECUTION table. The processing sequence in this section should not be altered, as the steps depend on each other and altering the order might cause the execution to fail.

The global variables that are populated by this step are detailed in the table below:

Variable: $G_Job_Start_Time
Description: Captures the system date/time, for passing the timestamp value to START_PROCESS_EXECUTION. This value is logged at the start time of the execution.
Method of population: sysdate()
Special considerations: This value is mandatory and must be set before calling the start-of-process-execution function.

Variable: $G_Prev_Exec_Type
Description: Fetches the execution type of the latest execution. This variable is normally set before the start of the current execution; otherwise it will fetch the execution type of the current execution.
Method of population: Calls the custom function FN_G_GetLatestRunType() by passing the current job name.
Special considerations: If a NULL value is returned, it does not impact the execution, as this is required for display only.

Variable: $G_Prev_Exec_Status
Description: Fetches the execution status of the latest execution. This variable is normally set before the start of the current execution; otherwise it will fetch the execution status of the current execution, which will be ‘STARTED’.
Method of population: Calls the custom function FN_G_GetLatestRunStatus() by passing the job name.
Special considerations: If a NULL value is returned, it does not impact the execution, as this is required for display only.

Variable: $G_DB_Type
Description: Identifies the DB_TYPE of the ‘DS_PROJ_ADMIN’ datastore. The admin framework templates and custom functions use this datastore for accessing stored procedures and tables.
Method of population: Calls the function db_type() by passing the datastore name.
Special considerations: This value is mandatory and must be set before calling any other custom functions, as most of them require this value for execution. If this value is not set, jobs will fail.

Variable: $G_Job_ID
Description: Fetches the object ID of the current Job.
Method of population: Calls the custom function FN_G_GetObjectID by passing the job name and object type.
Special considerations: This value is mandatory to fetch data from other masters. It should not be null. Ensure that the job you are running is defined in the framework tables.

Variable: $G_Execution_Type
Description: Fetches the execution type of the current job execution. If the value is not set for this job in PROCESS_OBJECT_PARAMETER, then it is set to ‘DEVELOPMENT’ by default.
Method of population: Calls the custom function FN_G_GetParameterValue for parameter name ‘RUN_TYPE’.
Special considerations: For all jobs, parameter values must be set in PROCESS_OBJECT_PARAMETER.

Variable: $G_Execution_Type_ID
Description: Fetches the execution type ID for the input execution type.
Method of population: Calls the custom function FN_G_GetExecutionTypeID for $G_Execution_Type.

Variable: $G_Debug
Description: Set to the parameter value for the job from PROCESS_OBJECT_PARAMETER. This is set to ‘Y’ for debugging messages to be printed on the job monitor. FN_G_InsertMessage prints and logs messages based on this.
Method of population: Calls the custom function FN_G_GetParameterValue for value ‘DEBUG’.
Special considerations: If the value is not defined in the database, it defaults to ‘Y’.

Variable: $G_Log
Description: Set to the parameter value for the job from PROCESS_OBJECT_PARAMETER. This is set to ‘Y’ for logging messages in the PROCESS_MESSAGE table. FN_G_InsertMessage prints and logs messages based on this.
Method of population: Calls the custom function FN_G_GetParameterValue for value ‘LOG’.
Special considerations: If the value is not defined in the database, it defaults to ‘Y’.

Variable: $G_Log_Exec_Steps
Description: Set to the parameter value for the job from PROCESS_OBJECT_PARAMETER. This is set to log the execution at step level. If this is set to ‘N’, nothing will be logged in PROCESS_EXECUTION_STEP. To use this feature, you must define the values in PROCESS_OBJECT_STEP and PROCESS_STEP_TYPE.
Method of population: Calls the custom function FN_G_GetParameterValue for value ‘LOG_EXEC_STEPS’.
Special considerations: If the value is not defined in the database, it defaults to ‘Y’.

Variable: $G_Check_Pre_Reg
Description: Set to the parameter value for the job. This is used in a conditional to check whether the CHECK_PRE_REG step should be executed or not.
Method of population: Calls the custom function FN_G_GetParameterValue for the ‘CHECK_PRE_REG’ parameter value.
Special considerations: Defaults to ‘Y’ if not set.

Variable: $G_Exec_ID
Description: Set to the execution ID of the current execution once the job is run. This ID is referenced in all the transactional tables for relationships and reporting.
Method of population: Calls the custom function FN_G_StartProcessExecution.

Variable: $G_Step_Type
Description: This value is set manually per step, based on where you are putting the code. It should be set to the step name and should be exactly the same as one of the step types defined in PROCESS_STEP_TYPE. This value should be set at the start of every step.
Method of population: Set manually.
Special considerations: Setting a wrong value will not cause any error; however, no step execution will be recorded for that step.

Variable: $G_Step_Type_ID
Description: Set to the step type ID for the step type set in $G_Step_Type.
Method of population: Calls the custom function FN_G_GetStepTypeID.

Variable: $G_Exec_Step_ID
Description: Set to the step execution ID of the current execution of the step during the job run. This ID is referenced in the step-level transactional tables for relationships and reporting.
Method of population: Calls the custom function FN_G_StartProcessExecutionStep.

Variable: $G_Error_Level
Description: Set to ‘ERROR’ when used in catch blocks.
Method of population: Set manually.
Special considerations: Defaults to ‘ERROR’ if nothing is specified and execution ends up in the raise error of the catch block.

Variable: $G_Error_Type
Description: Set to the type of error which is caught by the try/catch block. Used by raise_exception_ext.
Method of population: Set manually.
Special considerations: Defaults to ‘No Type Specified’ if nothing is specified and execution ends up in the raise error of the catch block.

Variable: $G_Error_ID
Description: Set to the unique error ID in every catch block. Used by raise_exception_ext.
Method of population: Set manually.
Special considerations: Defaults to ‘9999’ if nothing is specified and execution ends up in the raise error of the catch block.

As part of SAP Data Services development best practices, all of the global variables and their respective values will be written to the PROCESS_EXECUTION_PARAMETERS and underlying PROCESS_MESSAGE tables to record the values for the specific Job execution (if the variable $G_Log is set to ‘Y’). Writing the variable values to the parameter and message tables aids the maintenance and support effort by providing visibility of Job parameter values at the time of execution.
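The sequence above can be condensed into the following sketch of the initialisation script; db_type(), job_name() and sysdate() are standard Data Services functions, while the object-type name 'JOB' and the version argument (1) are assumptions that should match the framework tables:

# WF_G_Pre_Processing / SCR_Initialise_Variables - order matters: $G_DB_Type and $G_Job_ID first
$G_Job_Start_Time    = sysdate();
$G_DB_Type           = db_type('DS_PROJ_ADMIN');
$G_Job_ID            = FN_G_GetObjectID(job_name(), 'JOB', 1);
$G_Execution_Type    = FN_G_GetParameterValue($G_Job_ID, 'RUN_TYPE', 1);
$G_Execution_Type_ID = FN_G_GetExecutionTypeID($G_Execution_Type, 1);
$G_Exec_ID           = FN_G_StartProcessExecution($G_DB_Type, $G_Job_ID, $G_Execution_Type_ID, $G_Job_Start_Time);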

8.2. Post-Processing Logic – WF_G_Post_Processing


8.3. Error Handling – Try/Catch Blocks
