kilian_kilger

New ABAP expressions for generic and dynamic programming in ABAP Platform 2021:


Part I - Dynamic Access to (maybe generic) references


Do you use the new ABAP expressions like constructor operators or table selectors in your coding? But do you often find that with generic programming, i.e. with data types like REF TO DATA, DATA or ANY, you fall back to the programming style of the 1970s? Then the new ABAP Platform 2021 (which shipped with kernel 7.85 last week) has some new features to help you clean up your coding.

The main mantra of the new release is: "Get rid of field-symbols!"

This is the first part of a series of blog posts. Please check back: this page may link to the sequel in a few weeks, or be updated if new topics concerning generic programming in ABAP arise.

1. The old days: how to handle generic data references classically?


When using non-generic references in ABAP you could always write the following:
DATA foo TYPE REF TO i.
...
foo->* = 5.

Here and in the following the CREATE DATA statement or NEW operator has been omitted.

But when using generically typed references this was not possible:
DATA foo TYPE REF TO data.
...
foo->* = 5. " Syntax error: No dereferencing of generic reference possible

The only possibility to access the variable "foo" would be to use field-symbols.
DATA foo TYPE REF TO data.
...
ASSIGN foo->* TO FIELD-SYMBOL(<fs>).
<fs> = 5.

This makes the code uglier and more difficult to read. It also makes dereferencing the reference impossible inside ABAP expressions, as there is no expression variant of ASSIGN. Another disadvantage is the tricky error handling of ASSIGN. You can have subtle bugs when the error handling is forgotten.

2. Dereferencing generic references is now possible: (nearly) everywhere!


We have now lifted the above restriction. You can use the dereferencing operator in most places in ABAP where you can use generically typed ABAP variables. A simple example would be:
DATA foo TYPE REF TO data.
...
my_object->meth( foo->* ).

If FOO is the initial reference, then a runtime error will occur, as in the non-generic case. So no error handling is necessary in most cases.

Of course this also works in ABAP SQL, as follows:
DATA ref TYPE REF TO data.
...
SELECT * FROM T100
INTO TABLE @ref->*.

This, however, immediately leads to a new question: the variable REF is typed as "REF TO DATA", not as a reference to an internal table type.

The latter is not possible in ABAP yet. There simply is no "REF TO TABLE" type.
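
To make this concrete, here is a minimal sketch of the SELECT example with the omitted CREATE DATA filled in. The report name is hypothetical, and the WHERE, ORDER BY and UP TO additions are only there to keep the result small. The point is that REF stays typed as REF TO DATA, and only the anonymous data object it points to is an internal table:

```abap
REPORT zdemo_generic_select. " hypothetical report name

DATA ref TYPE REF TO data.

" Create an anonymous internal table with the line type of database table T100.
" The reference variable itself remains generic.
CREATE DATA ref TYPE STANDARD TABLE OF t100 WITH EMPTY KEY.

" The generic reference is dereferenced directly in the INTO clause.
SELECT * FROM t100
  WHERE sprsl = @sy-langu
  ORDER BY PRIMARY KEY
  INTO TABLE @ref->*
  UP TO 10 ROWS.

" Built-in table functions accept the dereferenced generic reference as well.
DATA(row_count) = lines( ref->* ).
ASSERT row_count <= 10.
```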

3. Generic References and Internal Tables


In the past in many circumstances you could not use field-symbols of type ANY or variables of type DATA to access internal tables.
FIELD-SYMBOLS <any> TYPE any.
...
READ TABLE <any> ASSIGNING FIELD-SYMBOL(<line>)
WITH TABLE KEY (dyn_key) = value. " Syntax error: <any> is no internal table

Note that I am using a dynamic key specification here.

You had to manually "reassign" the field-symbol as follows:
FIELD-SYMBOLS <any> TYPE any.
FIELD-SYMBOLS <table> TYPE any table.
...

ASSIGN <any> TO <table>.
IF sy-subrc <> 0.
... " error handling!
ENDIF.

READ TABLE <table> ASSIGNING FIELD-SYMBOL(<line>)
WITH TABLE KEY (dyn_key) = value.

This makes the coding at least 5 lines longer, because of the error handling and the check for sy-subrc. It is also error-prone, as you might forget the error handling, which yields all kinds of funny results if you do this inside a loop and the field-symbol of the last loop iteration is still assigned.

You can now use variables and field-symbols of type ANY and DATA directly in LOOP and READ statements. This gives many new possibilities:
DATA ref TYPE REF TO data.
...
LOOP AT ref->* ASSIGNING FIELD-SYMBOL(<fs>). " now possible
ENDLOOP.

READ TABLE ref->* ASSIGNING FIELD-SYMBOL(<fs>) " now possible
WITH KEY (dyn_key) = value.

It also makes it possible to directly dereference a reference and apply a table selector.
DATA itab_ref TYPE REF TO data.
...
itab_ref->*[ (dyn_key) = key_value ] = value.

The same mechanism has been applied to internal table functions like LINES:
DATA itab_ref TYPE REF TO data.
...
IF lines( itab_ref->* ) > 0.
...
ENDIF.

If ITAB_REF does not point to an internal table at runtime, the new runtime error ITAB_ILLEGAL_OPERAND occurs.
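
The statements above can be combined into a small sketch like the following. The report name, the types TY_LINE and TY_TAB and the sample rows are made up for illustration; only the use of ITAB_REF->* in LINES, LOOP and READ TABLE is taken from this post:

```abap
REPORT zdemo_generic_itab. " hypothetical report name

" Illustrative types, not part of any real application.
TYPES: BEGIN OF ty_line,
         id   TYPE i,
         text TYPE string,
       END OF ty_line,
       ty_tab TYPE SORTED TABLE OF ty_line WITH UNIQUE KEY id.

DATA itab_ref  TYPE REF TO data.
DATA dyn_key   TYPE string VALUE `ID`.
DATA row_count TYPE i.

" The generic reference points to an internal table at runtime.
itab_ref = NEW ty_tab( ( id = 1 text = `one` )
                       ( id = 2 text = `two` ) ).

" Built-in table function on the dereferenced generic reference.
ASSERT lines( itab_ref->* ) = 2.

" LOOP directly over the dereferenced generic reference.
LOOP AT itab_ref->* ASSIGNING FIELD-SYMBOL(<row>).
  row_count = row_count + 1.
ENDLOOP.
ASSERT row_count = 2.

" READ TABLE with a dynamic key specification; sy-subrc is set as usual.
READ TABLE itab_ref->* ASSIGNING FIELD-SYMBOL(<hit>)
     WITH KEY (dyn_key) = 2.
ASSERT sy-subrc = 0.
```

If ITAB_REF pointed to something other than an internal table, the first table access would end in the runtime error mentioned above rather than in a syntax error.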

There is, however, a serious limitation. You still cannot access variables of type DATA or ANY by index. For that you still need a field-symbol of type INDEX TABLE.
DATA itab_ref TYPE REF TO data.
...
itab_ref->*[ 1 ] = value. " syntax error
itab_ref->*[ (dyn_key) = key_value ] = value. " ok

4. Introducing Dynamic Reference Expressions


The previous paragraphs are only one part of the solution. What if the target of your reference is a structure type, which you do not know exactly at compile time? How to access the individual components of the structure?

In the past you would have done something like this:
DATA struct_ref TYPE REF TO data.
...
ASSIGN struct_ref->* TO FIELD-SYMBOL(<fs>).
IF sy-subrc <> 0.
" error handling 1
ENDIF.
ASSIGN COMPONENT 'COMP' OF STRUCTURE <fs> TO FIELD-SYMBOL(<fs2>).
IF sy-subrc <> 0.
" error handling 2
ENDIF.
<fs2> = value.

Some more knowledgeable colleagues even know the completely dynamic ASSIGN, where you could do this all in one step:
DATA struct_ref TYPE REF TO data.
...
ASSIGN ('STRUCT_REF->COMP') TO FIELD-SYMBOL(<fs>).
IF sy-subrc <> 0.
" error handling
ENDIF.
<fs> = value.

This has of course serious drawbacks:

  • You cannot do this inside expressions

  • Everything is dynamic. If you change the name of the variable STRUCT_REF, you will only find out at runtime if there is an error

  • ASSIGN is dangerous, because of the sy-subrc error handling you might forget

  • You need many lines of code


There is also a little-known variant of ASSIGN you could use:
DATA struct_ref TYPE REF TO data.
...
ASSIGN STRUCT_REF->('COMP') TO FIELD-SYMBOL(<fs>).
IF sy-subrc <> 0.
" error handling
ENDIF.
<fs> = value.

We decided that this variant is a good model for a new kind of ABAP expression, which you can use in many places in ABAP platform 2021. You can now write:
DATA foo TYPE REF TO data.
DATA comp_name TYPE string VALUE `comp`.
...
my_object->meth( foo->(comp_name) ).

You can use this new kind of expression in most places where you can use expressions and generically typed variables.

But what to do, if the component is again a reference to another structure or a reference to a simple type? You have two possibilities.

  • The component name can be an arbitrary assign expression, like: COMP->* or
    COMP-COMP2 or COMP->COMP2

  • You can use chaining on this new kind of expression.


DATA foo TYPE REF TO data.
...
" assign expression:
my_object->meth( foo->('comp1->comp2->*') ).

" assign expression with structures:
my_object->meth( foo->('comp1-comp2->*') ).

" new kind of daisy-chaining:
my_object->meth( foo->('comp1')->('comp2')->* ).

Of course you always get nice exceptions if the components do not exist or are not assigned.

No sy-subrc is set, of course!

The last example with daisy-chaining does not work with structures yet. So you can't write:
DATA foo TYPE REF TO data.
...
my_object->meth( foo->('COMP')-('COMP2') ).

This might be a possible improvement in later ABAP releases.
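
As a minimal sketch of both variants with the data creation filled in: the report name and the types TY_INNER and TY_OUTER are invented for this example, and the component INNER is deliberately a reference, because chaining through plain substructures is not supported yet, as described above.

```abap
REPORT zdemo_dynamic_components. " hypothetical report name

" Illustrative types: INNER is a reference component, so it can be chained.
TYPES: BEGIN OF ty_inner,
         amount TYPE i,
       END OF ty_inner,
       BEGIN OF ty_outer,
         inner TYPE REF TO ty_inner,
       END OF ty_outer.

DATA foo    TYPE REF TO data.
DATA amount TYPE i.

foo = NEW ty_outer( inner = NEW #( amount = 42 ) ).

" Variant 1: a single dynamic name containing an assign expression.
amount = foo->('INNER->AMOUNT').
ASSERT amount = 42.

" Variant 2: daisy-chaining of dynamic component accesses,
" here first in a write position and then in a read position.
foo->('INNER')->('AMOUNT') = 43.
amount = foo->('INNER')->('AMOUNT').
ASSERT amount = 43.
```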

Of course this new feature makes most sense if you do not know the target type exactly. If you know the target type exactly at compile time, you can always use a CAST:
DATA foo TYPE REF TO data.
...
CAST concrete_type( foo )->comp = 5.

This makes of course even more sense, if you need to access many components of the structure in one method.

So the rule of thumb would be:

  • If you know the type of the reference exactly at compile time and need to access multiple components, you should opt for CAST #( ).

  • If you don't know the type of the reference exactly, but only know that the target type contains a column named BLA, use REF->('BLA')

  • If you just need a single component from REF but you know the target type of REF statically, both methods are possible. It depends on the context whether you should prefer REF->('BLA') or
    CAST concrete_type( ref )->bla. I would probably use the CAST #( ) more often in this case, as you will get syntax errors when the component is deleted or renamed in the original structure. This also enables "where used" functionality. Of course you will get a stronger coupling to the specific type in this case, which might not be desired in all cases.


5. Dereferencing fully generic types


In the past you could only dereference variables which are explicitly typed as references. But to allow daisy-chaining this restriction had to be lifted. Here is why:
DATA foo TYPE REF TO data.
...
FOO->(COMP)->(COMP2) = 5.
" ^__________ the result of FOO->(COMP) is *no* reference,
" but of type DATA.

It is now also possible to dereference completely generic types. An exception is thrown at runtime, if no reference is assigned to the variable.
METHODS foo IMPORTING value TYPE data.
...
METHOD foo.
value->* = 5. " possible, runtime error if value is not a reference.
ENDMETHOD.
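
A complete version of this example might look as follows. This is only a sketch: the report name, the class LCL_DEMO and the method name SET_FIVE are hypothetical, and the caller passes a typed reference so that the result can be checked statically.

```abap
REPORT zdemo_generic_deref. " hypothetical report name

CLASS lcl_demo DEFINITION.
  PUBLIC SECTION.
    " VALUE is fully generic: the compiler does not know it is a reference.
    CLASS-METHODS set_five IMPORTING value TYPE data.
ENDCLASS.

CLASS lcl_demo IMPLEMENTATION.
  METHOD set_five.
    " Checked at runtime only: fails if VALUE does not contain a reference.
    value->* = 5.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  " INT_REF is statically typed as REF TO i.
  DATA(int_ref) = NEW i( ).
  lcl_demo=>set_five( int_ref ).
  ASSERT int_ref->* = 5.
```

Note that only the parameter VALUE itself is read-only inside the method; the data object it refers to can still be changed through the reference.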

From a language-theoretic point of view, the principle is the following: if something cannot be checked at compile time but could be valid at runtime, we should not disallow it, but postpone the check to runtime.

In ASSIGN, it was always possible to dereference a fully generic variable. But this is not really any safer:
METHODS foo IMPORTING value TYPE data.
...
METHOD foo.
FIELD-SYMBOLS <fs> TYPE REF TO i.
ASSIGN value->* TO <fs>.
IF sy-subrc <> 0.
... " error handling
ENDIF.
ENDMETHOD.

6. Performance


Many people ask about performance when new ABAP expressions are introduced.

The new ABAP expressions described in this document have a very small performance penalty in comparison with their non-generic counterparts. The following holds regarding performance:

  • The new expressions are at least as fast as the ASSIGN statement. If you do proper error handling in ASSIGN, the new expressions will be faster.

  • When you use the same expression many times in one ABAP method, using the old ASSIGN with a field symbol and continuously using that single field symbol is still a bit faster.


Neither observation is new: both apply (in a similar way) to nearly every other kind of ABAP expression. Also be aware regarding performance: measure, don't guess! Prefer the kind of coding which is cleanest. Fall back to less clean coding only if you actually have performance problems.

This immediately leads to another question: What is faster?
object->meth( BLA->('BLUB->BLOB->*') )

vs.

object->meth( BLA->('BLUB')->('BLOB')->* )

i.e. old style ASSIGN-expressions vs. the new kind of daisy chaining.

The answer, of course, is: it depends. Generally we assume that old style ASSIGN expressions are a tiny bit faster if the chain is short. This (nearly unmeasurable) performance benefit should diminish for longer chains.

If you need ABAP coding to produce the string 'BLUB->BLOB->*', then the new daisy chaining will have an advantage. So it might depend on the context which of the two variants performs better. Here too, we would stick to the coding which is clearer and more easily understandable.
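
In the spirit of "measure, don't guess", a rough way to compare the two variants on your own system is sketched below. The report name, the types, the variable names and the iteration count are all assumptions made for this example. GET RUN TIME returns microseconds, and the numbers will vary between runs and systems, so no expected result is given here.

```abap
REPORT zdemo_chain_timing. " hypothetical report name

" Illustrative types matching the chain BLA -> BLUB -> BLOB.
TYPES: BEGIN OF ty_blub,
         blob TYPE REF TO i,
       END OF ty_blub,
       BEGIN OF ty_bla,
         blub TYPE REF TO ty_blub,
       END OF ty_bla.

DATA bla    TYPE REF TO data.
DATA target TYPE i.
DATA: t0 TYPE i,
      t1 TYPE i,
      t2 TYPE i.

bla = NEW ty_bla( blub = NEW #( blob = NEW #( 1 ) ) ).

GET RUN TIME FIELD t0.
DO 100000 TIMES.
  " Old style ASSIGN expression inside a single dynamic name.
  target = bla->('BLUB->BLOB->*').
ENDDO.
GET RUN TIME FIELD t1.
DO 100000 TIMES.
  " New kind of daisy-chaining.
  target = bla->('BLUB')->('BLOB')->*.
ENDDO.
GET RUN TIME FIELD t2.

" Elapsed microseconds per variant; compare these in the debugger or log them.
DATA(micros_single_name) = t1 - t0.
DATA(micros_chained)     = t2 - t1.
```

A measurement like this only compares the access itself. If the string for the first variant has to be built at runtime, that cost needs to be included in the first loop.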

7. Outlook


The new expressions provide an easier way to handle fully generic variables or references in ABAP. They can be used in expressions, throw exceptions and do not set any sy-subrc. They can be combined to form even more powerful constructs.

As of ABAP platform 2021 still not every combination of the expressions is possible at every position in the ABAP coding. This leaves room for (possible) improvements in later releases.

What else does not work at the moment?

7.1. Possible Improvement: Daisy-Chaining in ASSIGN for "simple" variables


Using chaining of generic reference component access in ASSIGN does not work at the moment, i.e. the following does not work yet:
DATA foo TYPE REF TO data.
...
" Syntax error in ABAP platform 2021:
ASSIGN foo->(comp_name)->* TO FIELD-SYMBOL(<fs>).

This chaining only works outside of ASSIGN. The reason is the sy-subrc semantics: inside ASSIGN, sy-subrc must be set and no exception should be thrown.

Due to implementation choices in the original ASSIGN implementation, it does work when using a table selector though:
DATA itab TYPE TABLE OF REF TO data.
...
"No syntax error in ABAP platform 2021:
ASSIGN itab[ 1 ]->('BLA')->* TO FIELD-SYMBOL(<fs>).

The latter sets sy-subrc to 4 if the itab is empty, but will raise a runtime error (RABAX) if there is no component BLA or references are not assigned. This is in sync with the behaviour in previous ABAP releases.

7.2. Possible Improvement: Dynamic access to structure components


At the moment the new dynamic expressions can only be used when the starting variable is a reference. If the starting variable is typed as DATA or ANY, or is itself of a structure type, and holds a structure at runtime, they provide no benefit.

The following would be imaginable:
METHODS foo IMPORTING value TYPE data.
...
METHOD foo.
value-(comp_name) = 5. " Syntax error in ABAP platform 2021
value->(comp_name)-(struc_comp) = 5. " Syntax error in ABAP platform 2021
ENDMETHOD.

This could also be a possible future improvement.

8. Summary


In this article we described how you can:

  • write better code when using generic data references by using the arrow operator ->* directly inside expressions

  • use the new dynamic reference expressions to access components of generic data or object references


Using these new tools will make generic coding ready for the 21st century and lead to much shorter, more concise code with fewer field symbols.

Please give your feedback in the comments below. The ABAP language is strongly influenced by your input.

Please ask questions also in the corresponding Q&A forums.