‎2018 Oct 02 4:02 PM
Hi,
I'm a little lazy to develop it myself... Did anyone implement the algorithm "Longest common subsequence problem" between 2 strings, in ABAP, and would like to share it? This is an algorithm which can be used in Git to display the characters which are different between two lines for instance.
I want to achieve something like that (Azure DevOps), i.e. 2 colors on the same line (light green versus darker green) - My issue is to know which characters are different, not how to render the colors of course:

Or do you know a better alternative?
Thanks 🙂
Sandra
‎2018 Oct 02 5:51 PM
I'm not so lazy in fact 🙂 - Here is a basic solution, not much tested, use it at your own risk (and from there, it's easy to implement the colors - note that there can be different algorithms for the diff, more intelligent like comparison based on words) :
CLASS lcl_lcs DEFINITION.
PUBLIC SECTION.
METHODS constructor
IMPORTING
x TYPE string
y TYPE string.
DATA: result TYPE string READ-ONLY.
PRIVATE SECTION.
METHODS lcs_length
RETURNING
VALUE(lcs_length) TYPE i.
METHODS backtrack
IMPORTING
i TYPE i
j TYPE i.
DATA: x TYPE REF TO string,
y TYPE REF TO string,
c TYPE TABLE OF i,
m TYPE i,
n TYPE i,
dim_j TYPE i,
dim_c TYPE i.
ENDCLASS.
CLASS lcl_lcs IMPLEMENTATION.
METHOD constructor.
me->x = REF #( x ).
me->y = REF #( y ).
m = strlen( x ).
n = strlen( y ).
dim_j = strlen( y ) + 1.
dim_c = ( m + 1 ) * ( n + 1 ).
" add lines by using "factorial" (may overflow by 2, more memory but it's fast)
APPEND INITIAL LINE TO c.
DO.
APPEND LINES OF c TO c.
IF lines( c ) >= dim_c.
EXIT.
ENDIF.
ENDDO.
lcs_length( ).
backtrack( i = ( m - 1 ) j = ( n - 1 ) ).
FREE c.
ENDMETHOD.
METHOD lcs_length.
*function LCSLength(X[1..m], Y[1..n])
* C = array(0..m, 0..n)
* for i := 0..m
* C[i,0] = 0
* for j := 0..n
* C[0,j] = 0
* for i := 1..m
* for j := 1..n
* if X[i] = Y[j]
* C[i,j] := C[i-1,j-1] + 1
* else
* C[i,j] := max(C[i,j-1], C[i-1,j])
* return C[m,n]
*Alternatively, memoization could be used.
DO m TIMES.
DATA(i) = sy-index.
DATA(im) = sy-index - 1.
DO n TIMES.
DATA(j) = sy-index.
DATA(jm) = sy-index - 1.
IF x->*+im(1) = y->*+jm(1).
c[ dim_j * i + j + 1 ] = c[ dim_j * im + jm + 1 ] + 1.
ELSE.
c[ dim_j * i + j + 1 ] = nmax( val1 = c[ dim_j * i + jm + 1 ] val2 = c[ dim_j * im + j + 1 ] ).
ENDIF.
ENDDO.
ENDDO.
lcs_length = c[ dim_j * m + n + 1 ].
ENDMETHOD.
METHOD backtrack.
*Reading out a LCS
*The following function backtracks the choices taken when computing the C table. If the
*last characters in the prefixes are equal, they must be in an LCS. If not, check what
*gave the largest LCS of keeping {\displaystyle x_{i}} x_{i} and {\displaystyle y_{j}} y_{j},
*and make the same choice. Just choose one if they were equally long. Call the function with i=m and j=n.
*function backtrack(C[0..m,0..n], X[1..m], Y[1..n], i, j)
* if i = 0 or j = 0
* return ""
* if X[i] = Y[j]
* return backtrack(C, X, Y, i-1, j-1) + X[i]
* if C[i,j-1] > C[i-1,j]
* return backtrack(C, X, Y, i, j-1)
* return backtrack(C, X, Y, i-1, j)
IF i = -1 OR j = -1.
" nothing (stop the recursivity)
ELSEIF x->*+i(1) = y->*+j(1).
backtrack( i = i - 1 j = j - 1 ).
result = result && x->*+i(1).
ELSEIF c[ dim_j * ( i + 1 ) + ( j ) + 1 ] > c[ dim_j * ( i ) + ( j + 1 ) + 1 ].
backtrack( i = i j = j - 1 ).
ELSE.
backtrack( i = i - 1 j = j ).
ENDIF.
ENDMETHOD.
ENDCLASS.
CLASS ltc_main DEFINITION
FOR TESTING
DURATION SHORT
RISK LEVEL HARMLESS
INHERITING FROM cl_aunit_assert.
PRIVATE SECTION.
METHODS test FOR TESTING.
ENDCLASS.
CLASS ltc_main IMPLEMENTATION.
METHOD test.
cl_abap_unit_assert=>assert_equals( act = NEW lcl_lcs( x = `XMJYAUZ` y = `MZJAWXU` )->result exp = `MJAU` ).
ENDMETHOD.
ENDCLASS.
‎2018 Oct 02 5:51 PM
I'm not so lazy in fact 🙂 - Here is a basic solution, not much tested, use it at your own risk (and from there, it's easy to implement the colors - note that there can be different algorithms for the diff, more intelligent like comparison based on words) :
CLASS lcl_lcs DEFINITION.
PUBLIC SECTION.
METHODS constructor
IMPORTING
x TYPE string
y TYPE string.
DATA: result TYPE string READ-ONLY.
PRIVATE SECTION.
METHODS lcs_length
RETURNING
VALUE(lcs_length) TYPE i.
METHODS backtrack
IMPORTING
i TYPE i
j TYPE i.
DATA: x TYPE REF TO string,
y TYPE REF TO string,
c TYPE TABLE OF i,
m TYPE i,
n TYPE i,
dim_j TYPE i,
dim_c TYPE i.
ENDCLASS.
CLASS lcl_lcs IMPLEMENTATION.
METHOD constructor.
me->x = REF #( x ).
me->y = REF #( y ).
m = strlen( x ).
n = strlen( y ).
dim_j = strlen( y ) + 1.
dim_c = ( m + 1 ) * ( n + 1 ).
" add lines by using "factorial" (may overflow by 2, more memory but it's fast)
APPEND INITIAL LINE TO c.
DO.
APPEND LINES OF c TO c.
IF lines( c ) >= dim_c.
EXIT.
ENDIF.
ENDDO.
lcs_length( ).
backtrack( i = ( m - 1 ) j = ( n - 1 ) ).
FREE c.
ENDMETHOD.
METHOD lcs_length.
*function LCSLength(X[1..m], Y[1..n])
* C = array(0..m, 0..n)
* for i := 0..m
* C[i,0] = 0
* for j := 0..n
* C[0,j] = 0
* for i := 1..m
* for j := 1..n
* if X[i] = Y[j]
* C[i,j] := C[i-1,j-1] + 1
* else
* C[i,j] := max(C[i,j-1], C[i-1,j])
* return C[m,n]
*Alternatively, memoization could be used.
DO m TIMES.
DATA(i) = sy-index.
DATA(im) = sy-index - 1.
DO n TIMES.
DATA(j) = sy-index.
DATA(jm) = sy-index - 1.
IF x->*+im(1) = y->*+jm(1).
c[ dim_j * i + j + 1 ] = c[ dim_j * im + jm + 1 ] + 1.
ELSE.
c[ dim_j * i + j + 1 ] = nmax( val1 = c[ dim_j * i + jm + 1 ] val2 = c[ dim_j * im + j + 1 ] ).
ENDIF.
ENDDO.
ENDDO.
lcs_length = c[ dim_j * m + n + 1 ].
ENDMETHOD.
METHOD backtrack.
*Reading out a LCS
*The following function backtracks the choices taken when computing the C table. If the
*last characters in the prefixes are equal, they must be in an LCS. If not, check what
*gave the largest LCS of keeping {\displaystyle x_{i}} x_{i} and {\displaystyle y_{j}} y_{j},
*and make the same choice. Just choose one if they were equally long. Call the function with i=m and j=n.
*function backtrack(C[0..m,0..n], X[1..m], Y[1..n], i, j)
* if i = 0 or j = 0
* return ""
* if X[i] = Y[j]
* return backtrack(C, X, Y, i-1, j-1) + X[i]
* if C[i,j-1] > C[i-1,j]
* return backtrack(C, X, Y, i, j-1)
* return backtrack(C, X, Y, i-1, j)
IF i = -1 OR j = -1.
" nothing (stop the recursivity)
ELSEIF x->*+i(1) = y->*+j(1).
backtrack( i = i - 1 j = j - 1 ).
result = result && x->*+i(1).
ELSEIF c[ dim_j * ( i + 1 ) + ( j ) + 1 ] > c[ dim_j * ( i ) + ( j + 1 ) + 1 ].
backtrack( i = i j = j - 1 ).
ELSE.
backtrack( i = i - 1 j = j ).
ENDIF.
ENDMETHOD.
ENDCLASS.
CLASS ltc_main DEFINITION
FOR TESTING
DURATION SHORT
RISK LEVEL HARMLESS
INHERITING FROM cl_aunit_assert.
PRIVATE SECTION.
METHODS test FOR TESTING.
ENDCLASS.
CLASS ltc_main IMPLEMENTATION.
METHOD test.
cl_abap_unit_assert=>assert_equals( act = NEW lcl_lcs( x = `XMJYAUZ` y = `MZJAWXU` )->result exp = `MJAU` ).
ENDMETHOD.
ENDCLASS.