SQLScript

The 2nd edition of SQLScript for SAP HANA is now available.

Officially, the 2nd edition of SQLScript for SAP HANA , the English edition of my book, has been available since mid-August. Today I received my author copies from SAP PRESS and for the first time I am holding a bound copy in my hand. The format has become slightly larger. And the colouring corresponds to the German edition.

Comparison 1st and 2nd edition
Comparison between 1st and 2nd edition

In terms of content, a lot has changed. Some topics are new, some small things have been corrected or optimized and one item has even been removed: the Web-Based Development Workbench on the Neo Cloud environment will probably no longer play a role.

New topics in the 2nd edition of SQLScript for SAP HANA

Some topics have been added in SP3 and SP4 of HANA 2.0, which have also found their way into the new edition. And I had simply forgotten one important point in the first edition. 😉

What else has changed from the first edition

And, of course, there are countless other minor additions and corrections. For example, many screenshots of the BW interfaces in the SAP GUI were replaced by the new BW/4HANA interfaces in Eclipse.

Although the content has increased, the new edition has fewer pages. This is due to the new, larger format.

Is it worth buying if you already have the first edition?

The basics about SQLScript in the book haven’t changed. Thus, the update to the 2nd edition is not necessary for beginners in the subject. But for professionals interested in the latest features, it may be worth buying. And, of course, it should be the current edition for the company library, so that employees are always up to date.

The HANA Web Console from Brandeis Consulting

The HANA Web-Console (HWC) from Brandeis Consulting is the quick entry into SQLScript programming on the SAP HANA database. For trainings and also for the daily work with SQLScript it is very practical, if you don’t have to start the SAP HANA Studio or Eclipse first. And not every participant in our e-learning courses has the time or inclination to deal with the installation of the tools. That’s why we developed the HANA Web Console (HWC). With this tool you can easily query the data model from the English edition of my book SQLScript for SAP HANA .

A first test of our HANA Web-Console

For example, the data model contains a table TASKS with tasks. With a little query, we can take a look at those. Just enter the statement SELECT * FROM TASKS in the input field below and click on the Execute button.

The HANA Web Console
A picture of the HANA Web-Console in SE16 style 😉

And more complex queries with anonymous blocks are also possible.

What is the HWC?

The name says it all: The HANA Web Console is an SQL console for SAP HANA as a web application. The focus is on simplicity. Less is more. For our SQLScript courses, a very simple SQL console is actually sufficient. And it doesn’t matter whether that’s the SQL console from SAP HANA Studio, the Web-Based Development Workbench, or the Database Explorer of the Web IDE. The most important difference comparing to these three: The HWC just runs. Tests can be carried out at any time without configuration and lengthy start-up.

Use cases for the HANA Web Console[1]Not all features are implemented yet.

  • Live training courses, the HWC can be used directly as a standalone application. No configuration and tedious setup. Just share the URL with everyone and you’re good to go.
  • In e-learning courses, the HWC can be positioned within the learning content. The participant can perform the exercises directly inplace and does not have to switch between the programs.
  • Interactively in blog posts, readers can directly try out what the author means. The console can be prefilled with queries.
  • The console can provide limited, read-only access to an SAP system to support development.

Limitations of the HANA Web Console

But of course, there are some limitations. Because our HWC is unprotected on the Internet for everyone to reach. That is why we have restricted this access accordingly:

  • Only read operations are allowed
  • The user and the schema cannot be changed
  • Dynamic SQL is not allowed
  • Only one instruction can be executed at a time. This can also be an anonymous block, so that complex programs can be tested.

The HWC is still under development. It can be changed or moved at any time. So please don’t use it in blogs or articles yet.

Same Same – but different: The model for the HWC

The sqlbolt.com site has been a model and inspiration for the HWC in many aspects. This is a simple, interactive SQL tutorial that I recommend to attendees who are unsure if their SQL skills are sufficient for the SQLScript Basics or SQLScript for BW Consultants training. What excited me most was the mix between content and exercises. And all this without installing a development environment or complicated connection parameters.

And that’s how the idea of the HANA Web Console came about. Same concept, but slightly different. HWC’s initial focus is more on blogs and articles than training. The project was implemented by our working student Gautam Kumar and will be further supervised by him.

Outlook

The first version of the HANA Web Console is just about ready. And already it is incredibly practical and with me personally almost daily in use. I’m looking forward to my next blog article with SQLScript where I can incorporate the HWC.

But there are also other features planned:

  • Presetting with a query
  • A history of the last queries
  • Selection of different configurations (schema/user/stylesheet…) via URL parameter

There will be a lot more happening here in the coming months. I will keep up to date on this page.

I welcome any feedback on the HANA Web Console. Suggestions are very welcome.

Anmerkungen und Verweise

Anmerkungen und Verweise
1Not all features are implemented yet.

The 2nd edition of SQLScript for SAP HANA will be published in August

The 2nd edition of the English edition of SQLScript for SAP HANA is scheduled for release on August 26, 2021. The issue is updated and expanded with a few new topics. It thus largely corresponds to the current German edition.

New topics

SQLScript for SAP HANA 2nd Edition

Other topics include:

  • Lateral Join – The term lateral join is somewhat misleading, as it is not another type of join, such as inner join or left outer join. Rather, a lateral join is a join where the join partner is a correlated subquery or function. You can think of a lateral join as executing the subquery separately for each row of the previous result. It is possible to access the data of the current row by the correlation name.
  • The SQLSCRIPT_STRING library provides additional functions for processing character strings:
    • With SPLIT you can split a string of separators into several small sections
    • With FORMAT you can format a string with a syntax from the Python language
    • And with TABLE_SUMMARY you can make a string out of a table
  • With ABAP BASIS 7.53, scalar AMDP functions are possible. This is very important, especially for BW/4HANA routine development, since this was not possible until now.
  • Data series can be generated with the SERIES_GENERATE functions. This is especially handy for date values.

And, of course, there are countless other minor additions and corrections. For example, many screenshots of the BW interfaces in the SAP GUI were replaced by the new BW/4HANA interfaces in Eclipse.

The ABAPVARCHARMODE: Blanks and Empty Strings in ABAP and SQLScript

The two SAP programming languages ABAP and SQLScript behave differently when it comes to processing strings with blanks. This can lead to errors and confusion. It is therefore important to understand the ABAP VARCHAR mode or ABAPVARCHARMODE.

Strings in ABAP with Text Fields and Text Strings

Unfortunately, there are already the first differences here even within ABAP. I have taken the terms text fields and text strings from the SAP documentation. Here is the direct comparison

Text fields, internal ABAP type c

Definition with

DATA lv_char(10) TYPE c 
                 VALUE 'Brandeis '.
WRITE lv_char && lv_char.

In this data type, spaces are always removed at the end. This is already true for the Text field literal which is noted with a “normal” apostrophe.

Text strings, internal ABAP type string

Definition with

DATA lv_string TYPE string 
               VALUE `Brandeis `.
WRITE lv_string && lv_string.

In the case of the string, the blanks are retained. But only if it is Text string literal with the simple backquotes (=slanted inverted commas) in the source text. Unfortunately, most people don’t even know how to type these backquotes on a German keyboard. 😉

The constant SPACE: Just no space character

SPACE

By the way, the global constant SPACE is a text field of length 1. And in the ABAP debugger we can see that there is exactly one blank in it. But unfortunately it behaves differently. We notice this when we perform a concatenation with this constant SPACE . It is always ignored:

DATA lv_char(10) TYPE c
                 VALUE 'Brandeis '.

WRITE lv_char && SPACE && lv_char.  " Ergibt: BrandeisBrandeis

The behavior of SPACE is also somewhat confusing because it behaves differently again as a separator in the CONCATENATE statement:

DATA lv_char(10) TYPE c
                 VALUE 'Brandeis'.
DATA lv_concat(20) TYPE c.

CONCATENATE lv_char lv_char INTO lv_concat SEPARATED BY space.

WRITE lv_concat. "Ergibt: Brandeis Brandeis

Here exactly the expected space is inserted!

Strings with spaces in SQLScript

In the SAP HANA database, things are much simpler by default: In SQLScript, strings are really stored and processed exactly as they are. Spaces remain at the end and nothing is automatically removed. But of course there is one exception that is almost the rule: the ABAP VARCHAR mode [1]See SAP Note 2262114 – One Space Handling for ABAP in SAP HANA. This is used when the SAP HANA database is located under an ABAP system such as an S/4HANA or BW/4HANA in order to be compatible with the SPACE of the ABAP.

Why an extra mode?

Empty or initial character strings are sent to the database by ABAP as SPACE, which is actually a blank character. In ABAP, SPACE therefore also works correctly as a comparison value in WHERE conditions. This means that a selection in SQLScript should actually take place with a space character. Use the ABAP VARCHAR mode to communicate the HANA: Make no distinction between a single blank literal and an empty string. Both are mapped to the empty string. This makes the selection in SQLScript easier.

This solves the selection problem, but a new problem has arisen: If we ever need a single space character just like that, we can no longer specify it as a literal ' '. Because it automatically becomes a ''. So, as with text fields in ABAP, we have a problem if we want to separate two words with a space, for example. Instead of a literal with a space, we instead need to trouble the function call CHAR(32) to produce us a space:

SELECT 'Peter' || ' ' || 'Mueller' FROM DUMMY;      --Not working as expected

SELECT 'Peter' || char(32) || 'Mueller' FROM DUMMY; --Separated with a single blank

The session variable ABAPVARCHARMODE

We can find out directly in our source code whether the ABAP VARCHAR mode is active or not. This information is in the session variable ABAPVARCHARMODE. If this is set to TRUE, the processing in SQLScript behaves as described above. We can also set this mode ourselves for testing purposes. However, this is dangerous. This is because, as a session parameter, this setting applies to the entire database session. Here we should not change the default value for AMDPs.

But to try it out in the SQL console, you can use this code here. It nicely shows how the behavior changes when queried:

SET 'ABAPVARCHARMODE' = 'FALSE';
SELECT 'Peter' || ' ' || 'Mueller' FROM DUMMY;
SET 'ABAPVARCHARMODE' = 'TRUE';
SELECT 'Peter' || ' ' || 'Mueller' FROM DUMMY;

Summary

Spaces at the end of a string are normally removed in ABAP, but not in SQLScript. Empty strings or strings and literals that consist of exactly one blank character cause problems in SQLScript if an ABAP system is involved. This is because there is a mapping from a space character to the empty string. However, the function call CHAR(32) is a good way to get around this.

Anmerkungen und Verweise

SQL Expressions in BW/4HANA Composite Provider (HCPR)

YouTube

Mit dem Laden des Videos akzeptieren Sie die Datenschutzerklärung von YouTube.
Mehr erfahren

Video laden

With the SPS04 for BW / 4HANA 2.0, SAP has built some useful features into the composite provider (list of new features in the SAP documentation: https://help.sap.com/viewer/b3701cd3826440618ef938d74dc93c51/2.0.6/de-DE/d8b12ec099e04e6aa77a312be687db63.html ):

  • New join types are possible: Full Outer, Referential, Right Outer Join
  • The restrictions regarding the order of JOIN and UNION nodes no longer apply.
  • The operators , =, != now also allow non-equi joins.
  • Projection and aggregation nodes can be created
  • Columns calculated with SQL expressions can be defined
  • The data can be filtered with SQL predicates the data.

Especially the last two points are of course interesting for me as a SQLScript expert. This opens up a whole new range of possibilities, which I would like to analyse in this article. Unfortunately, the interface for the SQL features is an imposition, which is why I’m giving a bit of a tutorial on how to use them here.

The differences to the Calculation Views (CV) of SAP HANA are visibly disappearing. Only the modeling interface for the CVs is much more comfortable, especially in the WebIDE. I will return to this striking similarity to the CVs later.

Filtering in BW/4HANA Composite Providers with SQL Expressions

Context menu on a node in the composite provider

In each node type, except for the part providers themselves, it is possible to specify an SQL filter. To do this, you have to go to the context menu of the respective node.

The interface that then pops up in a popup is anything but self-explanatory. At first it looks like a typical formula editor. Because it is supposed to be a filter, I guess a predicate has to be entered. So let’s start with a simple comparative predicate. The first question here is how we can refer to fields or InfoObjects. Unfortunately, there is no corresponding support by value help or selection and we have to experiment willy-nilly. Here it turns out that we have to refer to the names in the target of the respective node, using the special notation with quotation marks and capital letters for identifiers. Errors in the SQL expressions are only displayed when the BW/4HANA Composite Provider is activated.

Formula editor for filtering with SQL predicates in HCPR

The following predicates work:

  • Comparison predicates with the comparison operators specified in the selection list
  • Comparison with BETWEEN, e.g. "0CO_AREA" BETWEEN 'A' AND 'B'
  • Quantity comparison with IN, e.g. "0CO_AREA" IN ( 'A' , 'B')
  • LIKE for filtering with simple patterns
  • LIKE_REGEXPR for filtering with regular expressions. Caution: The value “like_regex” suggested in the selection box does not work.
  • IS (NOT) NULL – This could be handy for filtering OUTER JOINs data to simulate the EXISTS predicate.
  • CASE-expressions in comparisons
"0CO_AREA" = CASE "GXXMSSID" 
                  WHEN 'P01100' THEN 'ABC'
                  WHEN 'Q01100' THEN 'CDE'
                  WHEN 'D01100' THEN 'DEF'
                  ELSE ''
             END

Not allowed are:

Unfortunately, this means that all options for filtering with subqueries are missing. But many things you can do with EXISTS and IN/ANY/ALL/SOME with subquery can now be done with the extended JOIN conditions and aggregation nodes.

Fields in HCPR calculated with SQL expressions

The procedure for calculated fields is quite similar to that for filters. Use the context menu in the right target area to create the calculated field. A modal window is displayed in which, this time, the field properties are also to be entered:

  • Group
  • Type
  • Name
  • Description
  • association with
  • Unique name
  • Checkbox Force Group By
  • Data type & Length
Popup window for creating a calculated field

With the experiences from the filtering I tried here also again, which expressions are permitted and which not.

Permitted expressions for calculated fields

  • SQL functions from the selection box – with a few exceptions such as the LIKE_REGEX already mentioned above. However, here are SQL functions that are not mentioned in the most recent reference documentation. For example, the ABAP_EXTRACT_DAY( ) function. However, this is also available normally in the SQL console on a HANA Express.
  • Other SQL functions not listed here. An example of this would be the SQL function WEEKDAY(), which can be used without further ado, but is not offered for selection.
  • Simple and complex CASE expressions.
  • Literals, also typed
  • Fields from the target list in special notation with quotes and in capital letters

And also all combinations of these expressions to more complex expressions are allowed.

Not allowed expressions are:

  • Other SQL functions known in SAP HANA will cause errors when enabling HCPR. For example, the RAND() function.
  • Aggregate functions, not even in aggregation nodes 🙁
  • Window Functions
  • Sub-Queries

So you’re left with a bit of trial and error as to which functions work here and which don’t. Especially with the SQL functions it is very experimental. By the way, the error messages are identical to those of the Calculation Views. From there, I suspect it’s the same amount of features allowed here.

Does anyone know an overview of what is allowed there and what is not?

Conclusion on SQL Expressions in BW/4HANA Composite Providers

The new features are nice, but not 100% mature yet. Specifically, I would like to see:

  • A value help for the available fields in an expression
  • A complete and correct list for the selection of possible SQL functions
  • An ad-hoc check of the syntax and not only when activating the HCPR
  • Better a good code editor with syntax highlighting and code completion than a bad formula editor.
  • That all SQL functions displayed are also included in the SAP HANA SQL reference documentation.
  • That also the other SQL functions are allowed, because the selection seems to me very arbitrary

The other missing features, for example the window functions, can have an extreme impact on the execution plan and optimization. SAP obviously does not want to take this risk. Personally, I think that’s a shame, but I can understand it well.

Addendum (January 29th, 2021): Today I saw the use of both features in a project. Using it had a positive effect on the architecture used, which I really liked.

Cheat Sheet – The SAP HANA SQLScript Overview

Cheat Sheet – The SAP HANA SQLScript Overview

This SQLScript Overview is the perfect complement to the training of Brandeis Consulting

Syntax, examples and descriptions, matching the contents of the SQLScript basic training at Brandeis Consulting. This overview shows only the most important aspects. For details, please follow the links to the reference. If there are several identical concepts (e.g. identical SQL functions), only one of them is usually described here. The text only includes links if there are relevant additional information behind it. Examples always refer to the English demo data model for the book.

Notation in code

  • [Square Brackets] – Optional elements
  • UPPERCASE – Keywords and SQL functions
  • lowercase – Identifiers
  • AngleBrackets – descriptive elements. Abbreviations are explained in the text
  • {Repetitions}... are optional and are written in curly brackets with 3 dots.

Under the line are references, for example

  • Book S. – Refers to the page number of the 2nd, german edition SQLScript für SAP HANA . The 2. english edition will follow this year…
  • SAP Documentation – A link to the corresponding chapter of the latest SAP documentation from SAP HANA on Premise! Not Cloud!

Identifier

Table or column identifiers are case ensitive. This is not noticeable with the identifiers in simple notation without quotes, since they are always implicitly converted to UPPERCASE. There is no difference between the names ColumX, CoLuMx and COLUMX . Internally, all three are COLUMX interpreted as. Only ASCII letters, digits and the characters #_ are allowed.

Names in special notation are included in double quotes. This will interpret you exactly as they appear. The three names "ColumX", "CoLuMx" and "COLUMX" are different. All Unicode characters are allowed between the double quotes.


Comments

Line-end comments are introduced with two hyphens -- and continue until the line break.

Block comments start with /* and end with */ . They can extend within a line or over several lines..

ABAP comments with * at the first position only work in AMDP and should therefore not be used.


Book p. 59 SAP Documentation

Scalar expressions

Deliver exactly one elementary value, such as a date, string, or number.


Book p. 69 SAP Documentation

Literals

Literals are constant values in SQLScript code.

LabelFormatExample
ASCII StringsIn Quotes‘Peter’
Unicode StringsIn quotes with an N as a prefixN’Jörg’
IntegersSequence of digits123
DecimalsNumber sequence with decimal point123.456
Floating-point numbersMantisse and exponent, separated by an E1.23E+04
DatePrefix DATEDATE’2017-11-10′
TimePrefix TIMETIME’15:42:04.123′
TimestampPrefix TIMESTAMPTIMESTAMP’2011-12-31 23:59:59′

Book p. 61SAP Documentation

SQL Functions

The HANA database comes with hundreds of SQL functions for calculating data. The most important ones can be found on this page for the respective data types:


Book p. 167ffSAP Documentation

CASE Expressions

A CASE expression returns a scalar value. There are two variants of CASE expressions.
The simple CASE expression compares an expression to several other expressions for equality:

SELECT id, 
       CASE status
         WHEN 1 THEN 'Neu'
         WHEN 2 THEN 'In Bearbeitung'
         ELSE 'Nicht relevant'
       END AS status_text
  FROM tasks;

The complex CASE expression evaluates N independent predicates. The first to TRUE be evaluated yields the result:

SELECT CASE 
         WHEN status NOT IN (1, 2, 3) 
           THEN 'Open'
         WHEN due_date < current_date 
           THEN 'Critical'
         ELSE 'Others' END AS statustext
  FROM tasks;

If no true comparison or predicate is found, either the value is returned from the ELSE clause or NULL


Book p. 119 SAP Documentation

Table expressions

An expression that returns a table. These can be:


Book p. 59

SQLScript Übersicht

Brandeis Consulting

Training and advice from the author of the book SQLScript for SAP HANA.
If you have questions about SQLScript and training, simply email info@brandeis.de.
(C) Brandeis Consulting GmbH

SELECT

The SELECT statement defines a table. The columns are created with the field list, while the corresponding rows result from the other clauses.

Syntax:

SELECT [TOP number]  
  [DISTINCT] Fieldlist
  FROM TableExpression 
  [WHERE Predikate] 
  [GROUP BY ExpressionList] 
  [SetOperation] 
  [ORDER BY ExpressionList] 
  [LIMIT NumberOfLines[OFFSET Offset]]

The FROM clause describes the source of the data

FROM TableExpression [[AS] Alias]

If necessary, there is a further table expression with JOIN one.

The WHERE clause filters the data. Only records for which a predicate evaluates to TRUE are added to the result set of the query.


Book p. 114 SAP Documentation

Field List

Definition of the columns of the query.

ScalarExpression1 [[AS] Alias1 ] [ { ,
ScalarExpression2 [[AS] Alias2] } ... ]

Multiple columns are separated by commas. All columns of the sources are addressed with *.

SELECT *, 
       LEFT(coarea, 2) AS country, 
       amount * 1,16   AS net_amount, 
       CASE LEFT(coarea,2) 
         WHEN 'DE' THEN '1' 
         WHEN 'FR' THEN '2' 
         ELSE '9' 
       END AS ccode, 
  FROM TableExpression

Book p. 116 SAP Documentation

Joins

One JOIN is used to create a combination of multiple tables. The ON condition defines which rows from the participating tables are combined in a row of the query result.

SELECT ...
  FROM TableExpression1
  JoinType JOIN TableExpression2
  ON JoinPredikate;

The CROSS JOIN is the only join type without a ON condition. It is the Cartesian product of two tables.

As a INNER JOIN result, only rows that have found a partner on the other side.

If no partner is found in an OUTER JOIN, the NULL Value is populated.


Book p. 134SAP Documentation

Subqueries

Scalar subqueries provide exactly one row and one column, for example in field lists or for comparison. More than one result row results in a runtime error.

SELECT assignee,
       id,
       due_date,
      (SELECT lastname
         FROM users AS u
        WHERE o.assignee = u.id)
        AS assignee_name
  FROM tasks AS o
 WHERE  o.due_date =
           (SELECT MAX(due_date)
              FROM tasks AS i
             WHERE o.assignee = i.assignee )

Column subqueries return multiple values in exactly one column. They are used for quantity comparison with IN.

SELECT * 
FROM tasks
WHERE
   status IN (SELECT id
                FROM status
               WHERE is_final = true)

Table subqueries provide a table that can be used as a table expression in the FROM clause. They can be elegantly replaced with table variables.

SELECT category,
       COUNT(*)
  FROM ( SELECT
          CASE
            WHEN status IN (1, 2)
            AND due_date < current_date
               THEN 'Critical'
            WHEN status IN (1, 2)
               THEN 'Open'
            ELSE 'OK' END AS category
         FROM tasks)
 GROUP BY category;

Book S.159ffSAP Documentation

Predicates

Predicates are logical expressions that have the value TRUE, FALSE or UNKNOWN . In WHERE clauses, ON clauses, or when evaluating conditions in CASE expressions, it is only relevant whether a predicate evaluates to TRUE or not.

Predicates can be combined with the logical operators NOT , AND and OR. Brackets increase the readability enormously here!

The main predicates are:

  • Comparisons:
    Expression1 Operator Expression2
  • IS NULL – The only predicate that can determine NULL values
  • EXISTS – Evaluated TRUE when the subquery returns at least one row.
  • Set comparisons with IN, ANY, SOME or ALL
  • LIKE and LIKE_REGEXPR – Search for patterns, also possible with regular expressions.

EXISTS Predicate

The EXISTS quantor checks whether a subquery returns a result or not.

SELECT DISTINCT assignee
  FROM tasks AS t 
 WHERE NOT EXISTS (
      SELECT id
      FROM projects AS p
      WHERE p.project_manager= t.assignee );

Book S.146SAP Documentary

Table variables

Table variables are usually declared by assignment and populated with data. Because they represent a table expression with a leading colon, table variables can be used within a SELECT query in the same way as a DB table. You can also think of a table variable as a view, which also describes their role in execution quite well.

DO BEGIN
  lt_tmp = SELECT id, 
                  title,
                  assignee 
             FROM :tasks;
   SELECT *
     FROM :lt_tmp; 
END;

Book p. 112 SAP Documentation

Anonymous blocks

An Anonymous Block is a procedure that is not stored under a name in the DB. Instead, the entire code is passed from the application or console to the DB.

DO BEGIN
  SourceCode
END;

Book p. 81 ff.SAP Documentation

UDF Functions

User Defined Functions (UDF) are sub-programs that represent an expression, either

They are often created via WebIDE, AMDP, or the HANA XS repository. This is the SQL way:

CREATE FUNCTION FunctionName 
  [(ParameterList)]
  RETURNS ParameterDefinition
  [LANGUAGE SQLSCRIPT] 
  [SQL SECURITY {DEFINER|INVOKER} ] 
  [DEFAULT SCHEMA DefaultSchema] 
  [DETERMINISTIC] 
AS BEGIN 
  SourceCode
END

Book p. 91SAP Documentation

Aggregation

Aggregation reduces the number of rows. The expression list in the GROUP BY clause determines the granularity of the query result because a row is formed for each existing combination.

Columns that do not appear in the GROUP BY clause must be summarized with an aggregate function, such as.B. with MIN() or SUM(). Typical query:

SELECT assignee, 
       status, 
       SUM(effort)
  FROM tasks
  GROUP BY assignee, 
           status;     

Book S.123 SAP Documentation

Procedures

are sub-programs in SQLScript. They are often created through a development environment such as WebIDE, Eclipse-HANA repository, or AMDP. Direct creation via SQL is also possible:

CREATE [OR REPLACE] PROCEDURE ProcedureName 
     [(ParameterList)] 
     [LANGUAGE {SQLSCRIPT|RLANG} ] 
     [SQL SECURITY {DEFINER|INVOKER} ] 
     [DEFAULT SCHEMA DefaultSchema] 
     [READS SQL DATA] 
     [WITH ENCRYPTION] 
AS 
BEGIN [SEQUENTIAL EXECUTION]
    SourceCode
END

Book p. 82SAP Documentation

NULL

Actually, NULL it is not a value, but a symbol that represents the absence of a value. A comparison with NULL always results UNKNOWN. Each calculation with NULL results in the NULL.

Only the IS NULL predicate can filter NULL in a column. To handle it in expressions, there are the two SQL functions:

  • IFNULL( Expression, Fallback )
  • COALESCE( Expression1 {, Expression2 } ...)

Typical causes of NULL values

  • OUTER JOINs
  • CASE without ELSE branch
  • NULL in DB tables
  • The NULLIF() SQL function

Book p.73SAP Documentation

The table DUMMY

This table cannot be changed and contains exactly one column named DUMMY and a row that holds the value X. The table is useful for testing scalar expressions:

SELECT Expression FROM DUMMY;

Or for the construction of fixed table variables:

lt_year = SELECT '2020' AS year FROM dummy 
          UNION ALL
          SELECT '2021' AS year FROM dummy;

Book p. 76

UNION ALL and Set Operators

SELECT queries with a compatible column structure can be combined with the following operators:

  • UNION ALL – The union of two tables.
  • UNION – dito, but is slower because it eliminates duplicates
  • INTERSECT – Forms the intersection, which can alternatively be realized with
    INNER JOIN
    one.
  • EXCEPT or MINUS – Is a Set-Subtraction that can alternatively be implemented with the EXISTS predicate.

These operators, except UNION ALL , consider the rows to be elements of a set. The elements are identical if all fields are identical, i.e. there are no “key fields” here. The specified alternatives are suitable if you do not want to perform the operations on all columns.


Book p. 157SAP Documentation

Time data types

The following data types can be used for Points in time (PIT).

TypeStandard
DATE'YYYY-MM-DD'
TIME'HH24-MI-SS'
SECONDDATE'YYYY-MM-DD HH24-MI-SS'
Timestamp'YYYY-MM-DD HH24-MI-SS.FF7'

SQL Functions

SQL functionDescription
CURRENT_DTLocal time for the data type (DT)
CURRENT_UTCDTdito with coordinated world time
ADD_UOTS(PIT, distance)adds the distance (+/-) in the Unit of Time (UOT) to the point in time (PIT)
UOTS_BETWEEN( PIT1, PIT2)Distance of the Points in Time in the unit of time.
TC(PIT)Time component (TC) as INTEGER
ISOWEEK(PIT)CalendarWeek ISO, e.B. 2021-W12
WEEK(PIT)CalendarWeek US, Num.
WEEKDAY(PIT)Numeric: Mo=0, Sun=6
QUARTER(PIT,[Offset])quarter, if necessary differerent Fiscal year
LOCALTOUTC(PIT, TimeZone])Local time to UTC
UTCTOLOCAL(PIT, TimeZone])UTC to local time

The unit of time (UOT) in the functions is either SECOND, DAY, MONTH or YEAR .


Book p. 191 SAP Documentation

Conversion between time and character strings

TO_VARCHAR( PIT, [Format])
Conversion of the point in time (PIT) to a character string.

TO_TDT( String, [Format])
Conversion string to time data type (TDT).

Symbols for formatting

UnitSymbolDescription
YearYYYYyear, 4 digits
 YYYear, 2 Digits
QuarterQNumeric
MonthMMNumeric, 2-digit
 MONTHName in EN
 MONabbreviation in EN
 RMRoman spelling
WeekWW. in the month
 WWW. in the year, not ISO!!!
DayDNumeric
 DDNumeric, 2-digit
 DAYName in EN
 DYabbreviation in EN
HourHH1212h Time without AM/PM
 HH24Hours (0-23)
 AM | PMmorning or afternoon
MinuteMINumeric, 2-digit
SecondSSNumeric, 2-digit
 SSSSSSec. after midnight
 FF[1..7]Digits of sec. N-digit

In addition to the symbols, delimiters can also be used:

SELECT TO_VARCHAR(CURRENT_TIME, 
                  'HH24.MI.SS') 
       TO_DATE('DEC-29-20', 
               'MON-DD-YY')
FROM DUMMY;

Book p. 193 SAP Documentation

Character Strings

Unlike ABAP, spaces are not automatically removed at the end! The operator || concatenates two strings.

Types

Data typeDescriptionMax. Length
NVARCHAR(N)Unicode String5000
VARCHAR(N)ASCII String5000
ALPHANUM(N)Alphaconverted127
CLOB, NCLOBLarge Strings2GB

SQL Functions

SQL FunctionDescription
LENGTH(Str)Length
ABAP_LOWER( Str)Conversion of the string to lowercase / uppercase letters
ABAP_UPPER( Str)
LEFT(Str , length)Left/Right part of the string with the length
RIGHT(Str, length)
SUBSTR( Str, Pos, length)Part of the string
SUBSTR_BEFORE( Str1, Str2)Part of String1 before/after String2
SUBSTR_AFTER( Str1, Str2)
LOCATE(Str1, Str2)Position of String2 in String1
REPLACE(Str1, Str2, Str3)Replaces String2 in String1 with String3
LPAD(Str, length [Pattern])Filling left/right with pattern up to length
RPAD(Str, length [Pattern])
ABAP_ALPHANUM( Str, length)Alpha conversion
LTRIM(Str [,CS])Remove the character set (CS) from left/right
RTRIM(Str [,CS])

There are also some of the functions available in a variant for use with regular expressions:


Book p. 82 SAP Documentation

Numeric data types

In contrast to ABAP, an assignment is not automatically rounded off commercially! If the target format does not fit, it is truncated. 😥

TypeDescription
IntInteger
DEC(p,s)Fixed-point number
DECDecimal Float
RealBinary Float

SQL Functions

SQL functionDescription
ROUND(Num, Digits)Commercial rounding
NDIV0(Num, N)Division of Num by N, NULL if N=0
RAND()Random number between 0 and 1
ABS(Num)Absolute value
SIGN(Num)Sign 1 or -1
MOD(Num, N)Division remainder of Num/N
CEIL(Num)Up/down on INTEGER
FLOOR(Num)

And many more in the SAP documentary…


Book P.207SAP Documentation

Reusing Business Logic in AMDP Transformation Routines

Last week I read Lars Breddemann’s great article about separating business logic from the underlying database tables without using dynamic SQLScript. This is a problem that also arises again and again in SAP BW/4HANA: the reuse of logic in AMDP transformation routines.

Lars’s example is very small, and I want to expand it a bit in this article based on a scenario I’ve experienced in a project in the past. In addition, I would like to show a more elegant way of calling such outsourced function, which facilitates the application. Here’s the problem:

The problem – always the same logic on different tables

A customer has branches in several countries. The data models are strictly separated by country, i.e. they are repeated for each country. The name of the ADSO consists of the ISO Code of the country at the 4. and 5. position:

Examples from the Profit Center Accounting:

  • PCADE01 – Germany, balances
  • PCADE02 – Germany, Statistical key figures
  • PCAFR01 – France, balances
  • PCAFR02 – France, Statistical figures
  • PCAUK01 – Great Britain, balances,
  • PCAUK02 – United Kingdom, Statistical figures

For the master data, all InfoObjects are compounded to the sourcesystem characteristic The characteristics clearly correspond to a country.

  • DE100 – Master data Germany
  • FR100 – Master data France

There are, of course, other approaches to deal with data of several countries. But I do not want to discuss in this article whether the modeling chosen here is so optimal. The aim of the project was to provide data models for more than 20 countries in future.

As a developer, I am now particularly interested in the requirement that the business logic should be exactly the same in the different transformation routines. Of course, a lookup for a data model in France should only go to the French tables and the same lookup in Germany to the German tables. And the master data should also be restricted to the source system France or Germany for optimal access. In ABAP, this is a trifle, because in a SELECT statement you can specify only the table name dynamically.

The solution approach

In Lars’ example, SQLScript solves this as this:

DO BEGIN
    my_customers = SELECT customer_name, location, tx_date
                   from customers_a
                   where customer_name = 'ACME'
                     and location = 'NEW YORK';
                        
    SELECT  
         customer_name, location, tx_date
    FROM
        earliest_customer_transaction (:my_customers);
END;

So the business logic is encapsulated in a EARLIEST_CUSTOMER_TRANSACTION function, and the data is passed through a table variable. The caller selects from the appropriate DB tables. This works very well, but it can be simplified. And of course, as announced, I’ll build a more complex example of how to reuse business logic in AMDP transformation routines.

IN-Parameter

First, let’s look at the input parameters of the procedure [1]For simplicity, I limit OBDA to procedures. The same always applies to IN parameters of functions. The table parameters defined a structure: Which fields are needed with which data types. This is useful so that the program code can be checked statically within the procedure.

Untyped table parameters – a dead end

But it is not mandatory. Since HANA 2.0 SPS04, you can also do without typing table parameters:

CREATE OR REPLACE PROCEDURE do_something( IN inTab TABLE(...), 
                                          OUT outTab TABLE(...))
...

However, these untyped table parameters only make sense if you want to access the data dynamically. That is, if the column names are not known until run time. However, we did not want to consider dynamic coding in this article, as written in the introduction.

Partially typed table parameters

It is often overlooked that as a caller to a procedure, you do not have to precisely hit the structure of the table parameters. The order of the columns doesn’t matter here[2]The book “Thinking in Sets” by Joe Celco describes this very nicely. Column sequences are only relevant for the application layer. Internally, they don’t matter. Similarly, additional columns are irrelevant,they are ignored by the system and later removed by the Optimizer.

Another interesting aspect is the fact that you can not only pass table variables into a table parameter. The names of database tables, views, or table functions are also a valid table expression and can be used in SQLScript. By specifying the table name directly, we can optimize the above example as such:

DO BEGIN
    SELECT  
         customer_name, location, tx_date
    FROM
        earliest_customer_transaction (customers_a)
    WHERE customer_name = 'ACME'
      AND location = 'NEW YORK';
      
END;

Unfortunately, there is a limitation in AMDP, so you always have to use a table variable.

Typing as a kind of interface

In Lars’s article, I like perspective at the data as instances of table definitions. He writes:

Tables/views are the data types that developers work with, while records (rows) are the instances of the data types.

If we stay in this image, IN table parameters only describe something that corresponds to the concept of an interface in object-oriented programming languages: it determines what properties must have the passed data at least. There may also be additional columns.

OUT parameters

The OUT table parameters of a procedure, but not the RETURNS value of a table function, can also be untyped. However, this can only be used if this procedure is called directly from the application layer via the SQL interface.

Attempting to call such a procedure in an anonymous block or other procedure results in this error message:

Error: (dberror) [7]: feature not supported: nested call on procedure "MULTI_TAB"."EARLIEST_CUSTOMER_TRANSACTION_PROC" has any table output parameter RESULT

This means that the return tables must be fully typed. It is not possible to return more columns than defined. This means a lot of work in an AMDP transformation routine with 200 InfoObjects. Because at the end of such a procedure, a projection must be made to the expected output format.

Reusing Logic in AMDP Transformation Routines

If we now transfer these findings to the reuse of logic in AMDP transformation routines with the above problem, then a solution might look like this:

The Common AMDP Procedure

The same business logic should be followed for all countries. This is defined in one AMDP procedure. The data is passed to this procedure from the outside as a table parameter. Only the columns actually used are defined per table parameter.

All data or just the dynamic ones?

If all DB tables are passed to the procedure, this has a huge advantage: no database accesses must take place within the procedure. All data comes from the outside. And with that we have something called dependency injection in the object-oriented world: the dependencies are passed on from the outside. This allows us to easily test our business logic from ABAP with unit testing. When you ask ABAP developers about a unit test for ABAP transformation routines, you are often laughed at! (Has already happened to me. 😉 ) This is now quite simply possible.

So we should also pass standard tables and master data tables if they are needed.

CLASS zcl_tr_demo DEFINITION
  PUBLIC
  FINAL
  CREATE PUBLIC .
  PUBLIC SECTION.
    INTERFACES if_amdp_marker_hdb.
    TYPES: BEGIN OF gty_s_totals,
             sysid    TYPE c LENGTH 6,
             ryear    TYPE n LENGTH 4,
             rbukrs   TYPE c LENGTH 4,
             prctr    TYPE c LENGTH 10,
             balance  TYPE p LENGTH 000009 DECIMALS 000002,
             currunit TYPE c LENGTH 5,
           END OF gty_s_totals.
    TYPES gty_t_totals TYPE STANDARD TABLE OF gty_s_totals WITH DEFAULT KEY.
    TYPES: BEGIN OF gty_s_stat_kyf,
             sysid    TYPE c LENGTH 6,
             ryear    TYPE n LENGTH 4,
             rbukrs   TYPE c LENGTH 4,
             prctr    TYPE c LENGTH 10,
             balance  TYPE p LENGTH 000009 DECIMALS 000002,
             currunit TYPE c LENGTH 5,
           END OF gty_s_stat_kyf.
    TYPES gty_t_stat_kyf TYPE STANDARD TABLE OF gty_s_stat_kyf WITH DEFAULT KEY.
    TYPES: BEGIN OF gty_s_pplant,
             sysid TYPE c LENGTH 6,
             plant TYPE c LENGTH 4,
             bukrs TYPE c LENGTH 4,
           END OF gty_s_pplant.
    TYPES gty_t_pplant TYPE STANDARD TABLE OF gty_s_pplant WITH DEFAULT KEY.
    TYPES: BEGIN OF gty_s_in_out_tab,
             recordmode                     TYPE rodmupdmod, " InfoObject: 0RECORDMODE
             logsys                         TYPE rsdlogsys, " InfoObject: 0LOGSYS
             sysid                          TYPE c LENGTH 6,
"            ...
             fiscper                        TYPE n LENGTH 7,
             fiscvar                        TYPE c LENGTH 2,
"            ...
             balance                        TYPE p LENGTH 000009 DECIMALS 000002,
             record                         TYPE c LENGTH 56,
             sql__procedure__source__record TYPE c LENGTH 56,
           END OF gty_s_in_out_tab.
    TYPES gty_t_in_out_tab TYPE STANDARD TABLE OF gty_s_in_out_tab WITH DEFAULT KEY.
    METHODS: do_transformation_pca IMPORTING VALUE(it_intab)    TYPE gty_t_in_out_tab
                                             VALUE(it_totals)   TYPE gty_t_totals
                                             VALUE(it_stat_kyf) TYPE gty_t_stat_kyf
                                             VALUE(it_pplant)   TYPE gty_t_pplant
                                             VALUE(iv_sysid)    TYPE char6
                                   EXPORTING VALUE(et_result)   TYPE gty_t_in_out_tab.
ENDCLASS.

As you can see from the example above: The typing of the whole tables can quickly become complex. The return tables must either fully match the structure of the transformation routines. Or it is limited to the fields determined in the routine. But then it has to be mixed with the other data from the INTAB by join. I will show this below.

However, calling the new procedure in the AMDP transformation routine is easy, even if we are not allowed to pass the table names directly to the procedure here:

CLASS /BIC/SKCP38MSW2ZH49HEXE5D_M IMPLEMENTATION.
METHOD GLOBAL_END BY DATABASE PROCEDURE FOR HDB LANGUAGE SQLSCRIPT OPTIONS READ-ONLY
using ZCL_TR_DEMO=>DO_TRANSFORMATION_PCA
      /BIC/APCADE0022
      /BIC/APCADE0012
      /BIC/PPLANT      .
      
-- *** Begin of routine - insert your code only below this line ***
-- Note the _M class are not considered for DTP execution.
-- AMDP Breakpoints must be set in the _A class instead.
lt_stat_kyf = select * from "/BIC/APCADE0022";
lt_totals = select * from   "/BIC/APCADE0012";
lt_plant = select * from    "/BIC/PPLANT"; 
"ZCL_TR_DEMO=>DO_TRANSFORMATION_PCA"( it_intab => :intab,
                                      it_totals => :lt_totals,
                                      it_stat_kyf => :lt_stat_kyf ,
                                      IT_PPLANT => :lt_plant,
                                      ET_RESULT => OUTTAB );
                                  
-- *** End of routine - insert your code only before this line ***
ENDMETHOD.
ENDCLASS.

It is important that all DB tables must be specified in the USING clause.

Same Same, but different – Flexibility and change robustness

Flexible input

With regard to the database tables to be read, we are already very flexible with the above approach. The tables can vary from country to country as long as the columns defined in the table parameters exist. Since we pass the data with SELECT * from the source tables, the type of table parameters can also change later, without the need to adjust the transformation routines.

Decoupling the output

In the example above, the procedure returned an output table ET_RESULT with the exact structure of the transformation OUTTAB routines. This saves some paperwork on the one hand, but it is not very flexible. What happens if the country ADSOs differ marginally? For example, a few additional fields that have nothing to do with the implemented business logic. Or if you want to add a field for all data models. Then this must be done simultaneously for all countries. The above approach is therefore less flexible.

However, if we limit the fields of the output of the procedure to the relevant fields for the procedure, then we can mix their result with the original INTAB by means of the INNER JOIN original. For the join condition, the field RECORD that has a unique value for each record in the INTAB. The INNER JOIN can also filter out records in the procedure.

The result could look like this:

-- *** Begin of routine - insert your code only below this line ***
-- Note the _M class are not considered for DTP execution.
-- AMDP Breakpoints must be set in the _A class instead.
lt_stat_kyf = select * from "/BIC/APCADE0022";
lt_totals = select * from   "/BIC/APCADE0012";
lt_plant = select * from    "/BIC/PPLANT"; 
"ZCL_TR_DEMO=>DO_TRANSFORMATION_PCA"( it_intab => :intab,
                                      it_totals => :lt_totals,
                                      it_stat_kyf => :lt_stat_kyf ,
                                      IT_PPLANT => :lt_plant,
                                      ET_RESULT => lt_result );
                                      
outtab = select it.RECORDMODE,
                it.LOGSYS,
                it.RYEAR,
"               ...
                it.CURTYPE,
                res.FISCPER,
                res.FISCVAR,
                it.CHARTACCTS,
"               ...
                it.CURRUNIT,
                res.BALANCE,
                it.RECORD,
                it.SQL__PROCEDURE__SOURCE__RECORD
           from :intab as it
           inner join :lt_result as res
           on it.record = res.record ;
-- *** End of routine - insert your code only before this line ***

The Unit Test

And now a small example of testing the procedure with unit tests. Of course, all parameters must be supplied with appropriate data. This can be done very elegantly with the constructor expression VALUE # The three points in the example only need to be replaced by appropriate data:

*"* use this source file for your ABAP unit test classes
CLASS ltcl_ DEFINITION FINAL FOR TESTING
  DURATION SHORT
  RISK LEVEL HARMLESS.
  PRIVATE SECTION.
    DATA mo_cut TYPE REF TO zcl_tr_demo.
    METHODS:
      setup,
      first_test FOR TESTING RAISING cx_static_check.
ENDCLASS.
CLASS ltcl_ IMPLEMENTATION.
  METHOD first_test.
    mo_cut->do_transformation_pca(
      EXPORTING
        it_intab    = VALUE #( ( ... ) )
        it_totals   = VALUE #( ( ... ) )
        it_stat_kyf = VALUE #( ( ... ) )
        it_pplant   = VALUE #( ( ... ) )
      IMPORTING
        et_result   = DATA(lt_result)
    ).
    cl_abap_unit_assert=>assert_equals( act = lt_result
                                        exp = VALUE zcl_tr_demo=>gty_t_in_out_tab( (  ... ) ) ).
  ENDMETHOD.
  METHOD setup.
    CREATE OBJECT mo_cut.
  ENDMETHOD.
ENDCLASS.

Conclusion

Reusing business logic in AMDP transformation routines is also possible in SQLScript. However, you have to know, what you are doing. And you have to write a lot of code. Therefore, it will probably only be worthwhile when more than two data flows require exactly the same logic. The overhead for a flexible solution is larger than in ABAP, because more needs to be typed and mapped. A positive side effect: you get the opportunity to perform unit tests that are decoupled from the database status.

In large BW/4HANA projects with multiple identical data flows, there is no way around outsourcing business logic. If you violate the DRY principle [3]Don’t Repeat Yourself too often, you accept technical debts. These are then due for payment during operation and maintenance of the system.


We are happy to support you in the planning phase of your BW/4HANA in the field of modelling. With training, workshops, advice or through the placement of specialized consultants. Please contact us.

Anmerkungen und Verweise

Anmerkungen und Verweise
1For simplicity, I limit OBDA to procedures. The same always applies to IN parameters of functions
2The book “Thinking in Sets” by Joe Celco describes this very nicely. Column sequences are only relevant for the application layer. Internally, they don’t matter
3Don’t Repeat Yourself

Treat unallowed characters in SAP BW/4HANA with SQLScript

With SQLScript, errors can be avoided elegantly and performantly during loading due to the unallowed characters in SAP BW. This article provides an overview of the problem and outlines the solutions in AMDP transformation routines to clean up the data.


The problem

The allowed characters

In SAP BW, not all characters are allowed in the values of InfoObjects. By default, these are only the following characters:

!"%&'()*+,-./:;<=>?_0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ

If you need more than this, which is likely to be an SAP customer for an international clientele, you have to set this separately in the BW Customizing Guide or in the RSKC code> transaction. There are three options that cannot be combined:

Single characters

Either you maintain here a ridiculously short list of 75 more characters. This will not be enough sooner or later, as no umlauts, accents or Asian characters are allowed in the standard.

ALL_CAPITAL

Or you can allow nearly all characters when you use here the ALL_CAPITALcode> constant (SAP Note 173241 – allowed characters in the BW) .

ALL_CAPITAL _PLUS_HEX

The third option is to use ALL_CAPITAL_PLUS_HEXcode> in the customizing. SAP explicitly discourages this, as control characters are also allowed in the InfoObject values. This can cause problems with the front-end tools (SAP Note 1075403 – unallowed characters and ALL_CAPITAL). This option is intended only for the short-term maintenance of productive operations. If it has to be used, then the data should be cleaned up in a timely manner and then the value should be deferred.

Irrespective of the above settings, a characteristic value must never consist of a simple double cross or hash ‘#’ and must not begin with a quotation mark ‘!’.

Treatment of unallowed characters in ABAP

In ABAP transformation routines, there are different approaches to removing the unauthorized characters. In the blog of Denis Reis, for example, you will find ideas on how to do this manualy as long as you have a positive list of allowed characters (Denis Reis: Useful ABAP transformation routines).

Function

And of course there is more than one function module to do this. With RSKC_CHAVL_CHECKcode> you can check the values against the current settings of the system. And with SCP_REPLACE_STRANGE_CHARScode> (SAP Note 1062237 – Transliteration of characters with SCP_REPLACE_STRANGE_CHARS) the suspicious characters can be replaced. But there can occur performance problems with this function module(SAP Note 2305674 – Performance of SCP_REPLACE_STRANGE_CHARS is bad).

Analysis of errors due to unallowed characters in SQLScript

If the loading or activation of data stops due to unallowed characters, then the first thing to do is to understand why the error happens. In the ABAP debugger there is the HEX-Code of each character visible.

With the scalar function UNICODE code> you can determine in SQLScript the corresponding Unicode number of a character ( SAP documentation UNICODE).

Now we only have to analyze every single letter of a faulty value. To do this, we create an anonymous block in the SQL console, in which we disassemble the suspicious strings using the SERIES_GENERATE_INTEGERcode> function. The example goes to a table from the demo data model from my book and needs to be adopted to your needs. The relevant parts are provided with comments.

DO BEGIN

 lt_suspect_data = SELECT title AS suspect --Replace with the column name
 FROM tasks --Replace by your table e.g. "/BIC/A... 2"
 WHERE id = 1; --Restrict to the corrupt data
                     
                     
 SELECT suspect,
 element_number,
 SUBSTR(suspect, element_number, 1),
 UNICODE(SUBSTR(suspect, element_number, 1))
 FROM :lt_suspect_data
 CROSS JOIN SERIES_GENERATE_INTEGER( 1,1,60) 
 WHERE SUBSTR(suspect, element_number, 1) < > ''
 ORDER BY suspect, 
 element_number;
END;

Non-Breaking Space

Although the above SAP note mentions above all the control characters as a source of error, there may be other characters that cause problems. In this way, for example, we identified the Unicode character 0160 code> in a project that did not want to be loaded. It is a so-called non-breaking spacecode> (NBSP) that also causes problems in other contexts(SAP notes on this).

Handling Unallowed Characters in SQLScript

Single characters

When it comes to replacing individual characters, the simplest method is the SQL-Function REPLACE. One difficulty with programming is that the character to be replaced is not on the keyboard. But if we know its Unicode character number, then we can use the SQL function NCHAR ( < character number > ) to generate the character.

OutTab = SELECT ...
 REPLACE("/BIC/BUTXT", NCHAR(0160), ' ') AS "/BIC/BUTXT",
  ...
 FROM :InTab;

If you want to replace multiple characters in this way, you have to nest the REPLACE code> function. This reduces the clarity and thus the maintainability.

Replace with regular expressions

Regular expressions describe a pattern that can also be used to make substitutions. In SQLScript, there is the SQL-Function REPLACE_REGEXPR. In these patterns, UNICODE characters (see also here) can be entered with the following notation:

\x{<HEX-Zeichennummer>} for \x{A0} code> example, for the above-mentioned non-breaking space (NBSP). And with the square bracket you can also search elegantly for several Unicode characters at the same time.

OutTab = SELECT ...
 REPLACE_REGEXPR('[x {A0} x {8239} ]' IN title WITH ' ') AS "/BIC/BUTXT",
  ...
 FROM :InTab;

However, we still have to name all characters individually with their HEX character number.

All control characters

If all control characters HEX00 to HEX1F mentioned in SAP Note 1075403 are to be removed, this can also be done very elegantly with a regular expression. The character class [[:cntrl:]]code> includes exactly these characters ( Wikipedia RegularExpressions).

The corresponding query will look like this:

OutTab = SELECT ...
 REPLACE_REGEXPR('[[:cntrl:]]' IN title WITH '') AS "/BIC/BUTXT",
  ...
 FROM :InTab;

Or you can use the negation of the character class of all printable characters: [^[:print:]]code> . This expression already includes the NBSP.

OutTab = SELECT ...
 REPLACE_REGEXPR('['[:p rint:]]' IN title WITH '') AS "/BIC/BUTXT",
  ...
 FROM :InTab;

Warning – please test

Unfortunately, regular expressions are extremely fragile. A small thing can render the expression completely unusable. This means that we should always test our logic beforehand. I usually do this in the SQL console with a small example:

create table test_char(line nvarchar(100));
insert into test_char values ('ABCDE');
insert into test_char values ('AB'||nchar(160)||'CDE');
insert into test_char values (nchar(0010)||'ABCDE');
insert into test_char values ('AB'||nchar(10)||'CDE');
insert into test_char values ('ABCDE');
 SELECT line,
        REPLACE_REGEXPR('[[:cntrl:]\x{00A0}]' IN line WITH '#') AS "/BIC/BUTXT"
   from test_char;
                               

Conclusion

There are also very elegant ways in SQLScript to remove the characters that are not allowed. However, in-depth SQLScript knowledge and careful testing are very important here.


I am pleased that this text will help you in your work in the projects. I would be happy about links, quotes with references and, above all, mentions on social media. And of course also about any form of feedback, if you have had other or complementary experiences.

Why are scalar UDFs so slow?

When loading data from source systems, there are requirements in SAP BW that are often repeated at the field level. This includes, above all, logics for cleaning up and processing the data, such as:

  • Add and remove leading zeros
  • Case distinctions
  • Remove spaces at the beginning or end
  • Conversion to capital letters
  • Derive one field from another with solid pattern substrings, e.g. COAREA ==> COUNTRY
  • Derive time characteristics

The following listing shows a few examples in the source code:

outTab = SELECT TO_VARCHAR(TO_DATE(calday), 'YYYYMM') AS calmonth,
				"/BIC/SRCSYS",
				...
				"/BIC/CURTYP",
				LEFT("/BIC/COAREA", 2) AS "/BIC/COUNTRY",
				SUBSTRING("/BIC/BI_PROFCT", 9, 2) as "/BIC/PCACCID",
				CASE SUBSTRING("/BIC/PROFCT", 8, 3)
				  when '643' then '1' 
				  when '655' then '1'
				  when '641' then '2' 
				  when '651' then '2'
				  when '643' then '3' 
				  when '655' then '3'
				  else ''
				end as  "/BIC/PRBY',
				...
		   FROM :intab;

Typical expressions in the field list of a transformation routine

What all these requirements have in common is that they

  • Easy to implement in SQLScript with existing SQL capabilities
  • often repeat
  • can be executed very quickly by SAP HANA as long as they are implemented directly in the transformation routines.

DRY – Don’t Repeat Yourself

The well-known DRY principle should actually come into play here. So that you don’t repeat the source code all the time. In the field list, the scalar, custom functions (UDF) are available for this purpose. Until recently, these were not an issue for the transformation routines in the BW, because they can only be created with the AMDP Framework since AS ABAP Release 753 [1]SAP documentation for AMDP functions. But scalar UDFs would be ideal for precisely these requirements. This ensures a uniform implementation. And in the source code, one comes from a technical description of the expressions to a technical view:

outTab = SELECT "ZCL_CALDAY=>TO_CALMONTH"(calday) AS calday
				"/BIC/SRCSYS",
				...
				"/BIC/CURTYP",
				"ZCL_COAREA=>TO_COUNTRY"("/BIC/COAREA") AS "/BIC/COUNTRY",
				"ZCL_PROFCT=>TO_PCACCID"(/BIC/PROFCT") as "/BIC/PCACCID",
				"ZCL_PROFCT=>TO_PRBY"(/BIC/PROFCT") as "/BIC/PRBY',
				...
		   FROM :intab;

Outsourcing the expressions to a scalar UDF

The outsourced functions look much more elegant. And they are also superior from the point of view of maintenance. For example, if another profit center appears in the list, there is exactly one feature that needs to be adjusted. Using the example of ZCL_PROFCT=>TO_PRBY I show such a scalar AMDP function:

CLASS zcl_profct DEFINITION
  PUBLIC
  FINAL
  CREATE PUBLIC .

  PUBLIC SECTION.
    INTERFACES if_amdp_marker_hdb.
    METHODS to_prby IMPORTING VALUE(iv_profct) TYPE char10
                    RETURNING VALUE(rv_prby)   TYPE char1.
ENDCLASS.

CLASS zcl_profct IMPLEMENTATION.
  METHOD to_prby BY DATABASE FUNCTION FOR HDB LANGUAGE SQLSCRIPT OPTIONS DETERMINISTIC READ-ONLY.
    rv_prby = case SUBSTRING(:iv_profct, 8, 3)
                  WHEN '643' then '1'
                  WHEN '655' then '1'
                  WHEN '641' then '2'
                  WHEN '651' then '2'
                  WHEN '643' then '3'
                  WHEN '655' then '3'
                  ELSE ''
                END ;
  ENDMETHOD.

ENDCLASS.

The example shows that the function is just a simple “shell” for a CASE expression. By the way, outsourcing to the ABAP world also makes it easy to write UnitTests for the functions. Since the functions in the database are only generated from the ABAP on the first call, UnitTests are also suitable for this purpose. [2]I deliberately do not use data items generated by BW-InfoObjects to minimize dependency.

Elegant but slow

As elegant as the outsourcing of logic in UDFs is, the concept is unfortunately not to be used for large amounts of data. This is because the UDFs have a significant influence on the duration. I would like to show this in another expression from the example above:

TO_VARCHAR(TO_DATE(calday), 'YYYYMM')

This simple expression converts an ABAP DATS value into a SQLScript date, which is then output to a yYYYMM string. This corresponds to the format of the popular InfoObject CALMONTH.

Example: 20200928 ==> 202009

For this purpose, I create an AMDP UDF according to the above pattern:

  METHOD to_calmonth BY DATABASE FUNCTION FOR HDB LANGUAGE SQLSCRIPT OPTIONS DETERMINISTIC READ-ONLY.
    rv_result = TO_VARCHAR(TO_DATE(:iv_calday), 'YYYYMM');
  ENDMETHOD.

I made two queries in the SQL console on a small BW table with about 3 million entries: one directly with the printout and one with the function call.

Without UDF

do begin
	select budat, 
	       TO_VARCHAR(TO_DATE(budat), 'YYYYMM'),
	       account, 
	       amount, 
	       curr
	       from "/BIC/AZBR_E1_S2";     
end;

Console

Statement 'do begin select budat, TO_VARCHAR(TO_DATE(budat), 'YYYYMM'), account, amount, curr from ...' 
successfully executed in 501 ms 237 µs  (server processing time: 801 ms 664 µs)
Fetched 1000 row(s) in 102 ms 468 µs (server processing time: 0 ms 677 µs)
Result limited to 1000 row(s) due to value configured in the Preferences

With UDF

do begin
	select budat, 
	       "ZCL_CALDAY=>TO_CALMONTH"(BUDAT),
	       account, 
	       amount, 
	       curr
	       from "/BIC/AZBR_E1_S2";     
end;

Console

Statement 'do begin select budat, "ZCL_CALDAY=>TO_CALMONTH"(BUDAT), account, amount, curr from ...' 
successfully executed in 2:09.418 minutes  (server processing time: 4:18.632 minutes)
Fetched 1000 row(s) in 103 ms 144 µs (server processing time: 0 ms 675 µs)
Result limited to 1000 row(s) due to value configured in the Preferences

The runtimes are so different that you actually have to assume a bug. Because the complexity of the Besipiel is minimal. With less complexity, I don’t need UDF anymore. What I find amazing:

  • In fact, there are only 365 different values of BUDAT in the table. Thus, the system has the information: With the same input, the same result should also be delivered. See [3]SAP documentation for the CREATE FUNCTION statement . This should run it a maximum of 365 times and then be read out of the buffer, right?
  • I would also have expected only the dictionary of the column to be converted, so each value exactly once.
  • If you had replaced the function call with the function content with a primitive text editor with simple search/replacement, you are over 100 times faster. The Optimizer of an SAP HANA 7[4]Scalar User Defined Functions in SAP HANA – The first blog post on the topic of Rich Heilmann from 2013 should get this years after the introduction of scalar UDFs, right?

In order to fully document the whole thing, I also did the analysis with PlanViz.

PlanViz of the query with expression
PlanViz of the query with UDF

It is clearly visible that the same execution plan is broadly chosen. But the runtime of the one, red-marked JECalculate node contains the entire runtime for the UDF call. This should at least be parallelised.

Why is that? Do I have a misconception in my example? Can I still optimize this? Or are the UDFs just so slow. While researching forums, I have seen that I am not alone with my problems. There are also several ways to slow down the system with unfavorable functions. [5](Examplesof unfavorable UDF function in memory problem [6]Multiple call of scalar UDFs with multiple return parameters [7]No parallel execution of multiple UDFs

In any case, there is still great potential. At least for simple expressions without querying other DB tables.
I realise that there are also complex requirements that cannot be reduced to a single expression. But then at least parallel processing would be desirable.

Update 10/8/2020:

Update of 8.10.2020: I still have a bit of a discussion about performance optimization today. I also stumbled across the HINTS. The HINT (INLINE) is supposed to trigger exactly the desired behavior: Optimization should be done globally. But unfortunately there is nothing in the documentation about functions. Everything refers to procedures. And this is also in line with my observation: the hint brings absolutely no change. It remains slow.

do begin
    select budat, 
           "ZCL_CALDAY=>TO_CALMONTH"(BUDAT),
           account, 
           amount, 
           curr
           from "/BIC/AZBR_E1_S2"
           WITH HINT(INLINE);     
end;

Addendum, 11/10/2020:

Lars Breddemann has taken up the topic in his blog and has made his own measurements in this area. Some of these measurements I can relate to, but others I cannot. I suspect the differences are related to the version of the HANA database being used. All my measurements have been done on a BW/4HANA 2.0 SPS0 system with the HANA 2.00.036.00.1547699771. Lars probably used a newer version of HANA. Here are my comments on individual points:

The keyword DETERMINISTIC

The worse performance of DETERMINISTIC functions was confirmed by Lars. He recommends not using the DETERMINISTIC keyword, since it almost always leads to a non-optimal execution plan.

This also improves performance for me, but the scalar UDF still remains slower by a factor of 10. Here are the two implementation plans for it:

Without scalar UDF

With scalar UDF

For 3million records, that’s about 0.35 seconds vs. 3.7 seconds. The difference remains too great.

Memory consumption

I also can’t confirm Lars’ observation that memory consumption is significantly lower when using a scalar UDF. For me, the memory consumption is significantly larger with the scalar UDF:

532.4MBytes vs. 31.8 GBytes

When using TOP 500:

906.6 KBytes vs. 7.2 MBytes

The Engines

Another difference is the engines used. In Lars’ case, EXPLAIN shows that both Row and Column engines are used. For me, only the column engine is involved.

On my system state HANA 2.0 SP03 the performance of scalar UDFs is already much better after omitting DETERMINISTIC. However, it is still a far cry from using the expressions directly in the field list of a query. Since Lars has different observations on some points, there are obviously also dimensions that we have not considered so far. Probably the HANA version plays a big role in this.

Interim balance

On SAP HANA Cloud

Since I don’t have HANA 2.0 SP04 or SP05 available, I moved to SAP HANA Cloud for my next measurements. In the process, I initially noticed considerably shorter runtimes. However, since it’s a completely different hardware, I can’t make any valid statements about how much of it is the HANA version.

But what strikes me about the execution plans is that the system optimizes much more cleverly here. The queries used above do not process all 3M records, as this is not necessary for the output of 1000 records in the console. 😉

DISTINCT

If you perform the SELECT query with DISTINCT, then the system is forced to process the entire data set. However, this leads to additional work in a JEPreAggregat node, which then takes care of the GROUP BY. To keep this effort from dominating, I only looked at the one column:

select distinct  
   -- to_calmonth(budat) as calmonth
    TO_VARCHAR(TO_DATE(budat), 'YYYYMM') as calmonth
           from ztr_data;

The difference between the scalar UDF and the direct use of the expressions seems to melt more and more.

I will provide more detailed measurements in this article soon.

Anmerkungen und Verweise

SQLScript Unit Tests, the End-User Test Framework

With version 2.0 SPS04 of the SAP HANA database, the so-called “End-User Test Framework in SQLScript” was introduced. This makes it possible to create SQLScript Unit tests in the form of  User Defined Libraries (UDL). These can then be SQLSCRIPT_RUN_TESTS_ON_ORIGINAL_DATA executed automatically when the procedure is called.

What is being tested?

Most of the time, test cases are used to check for a subprogram whether the expected output is also generated when a given input is entered. In the case of databases, therefore, these are procedures and functions that are primarily tested. However, you can also test the state of the database by querying, for example, the number of records in a table. For example, in the following example, we can verify that the scripts for the demo data model were successful and generated the expected number of records.

Example

To explain the concept of the Test Framework, let’s start with a simple example. In Listing 1.2 you will find a test library that contains only the bare essentials. It consists of only T_AUFGABEN_CNT one test procedure . It checks whether there are exactly 1000 entries in the table AUFGABEN.

CREATE LIBRARY test_demo_dm 
LANGUAGE SQLSCRIPT TEST 
AS BEGIN 

 @test()
 PUBLIC PROCEDURE t_aufgaben_cnt 
 AS BEGIN
 USING sqlscript_test AS test;
 test:expect_eq((SELECT COUNT(*) FROM tasks),
 1000 );
 END; 
END;

To run this SQLScript Unit test in our test library, we call the following procedure in the SQL console. You may need to customize the name of the schema:

CALL SQLSCRIPT_RUN_TESTS_ON_ORIGINAL_DATA('{"schema":"SYSTEM",
 "library":"TEST_DEMO_DM"}', ?, ?, ?);

The result of this procedure call is 3 tables. In our case, only in the first one there is actually an entry, because the test was successful. As soon as a test fails, the 2. Table a message is issued and the 3. Table contains details about the call stack of errors. Below is an example of the output tables

The language of the tests

The first thing that is striking about the small example is that the library is created with the language SQLSCRIPT TEST. This distinguishes test libraries from productive libraries. In fact, the implementation of the test is also SQLScript.

Pragmas for procedures

Pragmas are used to classify the individual procedures of the test libraries. The most important role is played by the parameterless test procedures. These are marked by a leading pragma @test() .

In order to establish a defined system state  before the test, a corresponding procedure can be marked with the pragma @TestSetUpLibrary().

After the tests are performed, the procedure with the pragma @TestTearDownLibrary() may be called to clean up after the test procedures.

Test procedures

Within the test procedures, the system’s test procedures can be used to verify SQLSCRIPT_TEST the states of the program. The different inspection procedures can be found in Table 1.1. Any scalar expression can be used as a parameter value of 1 or 2.

Testing procedure Description
EXPECT_EQ(<Wert1>, <Wert2>) Checks whether both values are equal
EXPECT_NE(<Wert1>, <Wert2>) Checks whether the two values are different
EXPECT_GE(<Wert1>, <Wert2>) Checks whether value 1 is greater than or equal to the value 2
EXPECT_GT(<Wert1>, <Wert2>) Checks whether value 1 is greater than value 2
EXPECT_LE(<Wert1>, <Wert2>) Checks whether value 1 is less than or equal to the value 2
EXPECT_LT(<Wert1>, <Wert2>) Checks whether value 1 is less than value 2
EXPECT_NULL(<Wert>) Checks whether the value is null
EXPECT_TABLE_EQ(<Tabelle1>, <Tabelle2>[,<Reihenfolge ignorieren>]) Checks whether the two tables are identical. The order does not matter, as long as the corresponding flag is not set to FALSE.
FAIL(<Nachricht>) The test is set to failed and the Message is issued

Modularization in the test library

These check procedures of our SQLScript Unit tests do not need to be called directly in the test procedure. It can also be further modularized within the test library. For example, if multiple tables should be checked for the number of records it contains, it is a good way to offload this part of the code to a separate procedure to avoid unnecessary repetition.

In Listing 1.3, the logic to check the number of records within a table was outsourced to a private procedure. This makes it easy to check a lot of tables without having to repeat the same code every time.

CREATE LIBRARY TEST_DEMO_DM 
LANGUAGE SQLSCRIPT TEST 
AS BEGIN 
  
 PRIVATE PROCEDURE check_table_count
 (IN iv_tabname NVARCHAR(30),
 IN iv_expected_count INTEGER)
 AS BEGIN
 USING sqlscript_test AS test;
 DECLARE lv_query NVARCHAR(5000);
 DECLARE lv_count INTEGER;
    
 lv_query = 'SELECT COUNT(*) FROM' || :iv_tabname;
 EXEC lv_query INTO lv_count;
 test:expect_eq( :lv_count, :iv_expected_count );
      
 END;
  
 @test()
 PUBLIC PROCEDURE t_table_count 
 AS BEGIN
 check_table_count( 'TASKS', 1000);
 check_table_count( 'USER', 30);
 check_table_count( 'STATUS', 6); 
 check_table_count ( 'STATUS_TEXT', 12);
 --check_table_count( 'STATUS_TEXT', 13);
 END; 
END;

Test

If we run the above test library and have done everything correctly, we will not receive an error message. To demonstrate, I changed STATUS_TEXT the parameters for the table. See the following three illustrations for the three return tables in the test. The first table shows us that the test procedure was executed in the test library with the result FAILED:

Result of the test execution

In the second table we find the reason for the error. In our case, the number of records is not correct.

Error messages of test execution

The third table now returns the corresponding call stack.

Call stack of reported bugs

Further concepts

In addition to these basic concepts, there is also the possibility to use so-called configurations to create different preconditions for create the test procedures.

The test procedures, in turn, can divided into groups by a classification.

When calling SQLSCRIPT_RUN_TESTS_ON_ORIGINAL_DATA  the procedure, you can enter a JSON document that specifies in detail which procedures with which classifications should be executed in which configurations.

These details can be found in the reference documentation SAP HANA SQLScript Reference for SAP HANA Platform.

Review

With the Test Framework, we now have a very useful tool in our hands that allows us to automatically test our programs at the database level. Unit testing is now a widely used standard, which is also applicable to SQLScript developments. Test-driven development (TDD) is also possible with the framework.