Implementing a plain text search in ABAP is more challenging than mere querying table attributes. A user-specified search pattern cannot be matched using a simple test for equality. Some sophisticated plain text
matching algorithms are needed for attractive Google-like search:
- the Full Text Search (FTS) matches a pattern exactly.
- the Fuzzy search performs approximate string matching according to a distance algorithm tuned to a problem domain that finds both exact matches and deviations.
- the Acoustic search is an approximate String matching according to a phonetic algorithm.
Those algorithms make the difference between searching and finding, but the challenge is to create an efficient implementation.
An approach used by search engines like TREX or the SAP Enterprise Search tool is to pre-process the text and store results in index tables, enabling fast access
https://en.wikipedia.org/wiki/String-searching_algorithm
Without TREX, generic text search might not be high on your agenda, but let us consider other options...
ABAP SQL
Fuzzy search is not supported in ABAP SQL (formerly known as Open SQL). A simple pattern matching is enabled by the
LIKE statement of the WHERE clause
SELECT product FROM catalog WHERE name LIKE '%_agento%'.
Long texts saved in tables STXH and STXL are accessed with function modules like READ_TEXT that can also read the text buffer and the archive. This approach does not allow searching for long text.
ABAP Database Connectivity
ABAP Database Connectivity (ADBC) can be used as a powerful alternative to ABAP SQL to
- create and manage connections to all database maintained in transaction DBACOCKPIT (table DBCON).
- execute DB specific statements
- process the query results
Known Problems:
- No automatic client handling
- Security risk: SQL Injection.
Important ADBC Classes
- CL_SQL_CONNECTION
- CL_SQL_STATEMENT
- execute_ddl( ) - CREATE TABLE, DROP TABLE
- execute_query( ) - SELECT FROM WHERE
- execute_update( ) - INSERT INTO VALUES ( )
- CL_SQL_PREPARED_STATEMENT
- CL_SQL_RESULT_SET - Result set of an SQL query
- set_param_table( )
- next_package( )
- close( )
- CX_SQL_EXCEPTION
- CL_SHDB_SELTAB method sql_where_condition( ) and the static method COMBINE_SELTABS( ) shall be used to convert SELECT-OPTIONS into Native SQL.
- Demo Report DEMO_ADBC_DDL_DML
HANA DB
Queries with the
CONTAINS( ) predicate can be created via ADBC on HANA DB (HDB) to implement FTS, Fuzzy and Acoustic search (cf.
search,
analysis and
mining).
As a prerequisite, a full-text index must be created for the column in the dictionary (this is only possible for HDB) or the column must be defined as TEXT at HDB table level.
In my scenario, I replicated available long text content into a HDB TEXT column. This in effect creates an index, enabling queries using CONTAINS( ).
Sample Code
generates SQL statement for fuzzy text search for the pattern TEST OR PALLET
SELECT TOP 200 DISTINCT SCORE() AS FUZZY_SCORE, K.EBELN, P.EBELP
FROM EKKO AS K INNER JOIN EKPO AS P ON K.EBELN = P.EBELN
AND K.MANDT = P.MANDT
INNER JOIN LFA1 AS L ON K.LIFNR = L.LIFNR AND K.MANDT = L.MANDT
LEFT OUTER JOIN ADRC AS A ON L.ADRNR = A.ADDRNUMBER
AND L.MANDT = A.CLIENT
WHERE K.MANDT = '140' AND A.NATION = ''
AND CONTAINS(P.TXZ01, 'TEST OR PALLET',
FUZZY(0.8, 'textSearch=compare,similarCalculationMode=search' ))
AND K.EKORG = 'GLOB'
AND K.KDATB <= '20190213' AND K.KDATE >= '20190424'
AND K.LOEKZ = '' AND P.LOEKZ = ''
AND SCORE() >= 0.20000000
ORDER BY FUZZY_SCORE DESC
It works! (and yes, you could also use AMDP).
After this successful endeavor, I searched the web and realized the
CONTAINS( ) predicate is also available on other platforms.
Sybase
The SAP Netweaver demo system NPL runs on Sybase (SYB or SAP ASE). ABAP does not support full text index for other DBs.
Index ZTOAD-Y01 is of type Full Text. This is not implemented for DB SYB
Message no. DT558
But
Sybase also support the CONTAINS( ) statement for exact and fuzzy text search.
So how to enable the indexing? Sybase's help says
NGRAM TEXT index search is mainly useful when words are misspelled. Sybase IQ does not support searches like synonyms and antonyms.
Oracle
Oracle supports fuzzy text search with variants of the CONTAINS( ) statement.
SQL Server
MS SQL Server supports full-text searching with the CONTAINS( ) statement.
Other DBs
You could try to use a custom of the string distance functions as a DB procedure if not already available on a given platform.
- Here is a T-SQL (MS SQL server) implementation of the Levenshtein distance.
- MySQL has Full Text Search
WHERE MATCH(column)
AGAINST('Rose', 'Crown')
and the SOUNDEX( ) function for acoustic search
SELECT * FROM `author_bios`
WHERE SOUNDEX(`auth_last_name`) = SOUNDEX('Williams')
Summary
High performance full-text search is a feature implemented by many database in non standard ways. Using ABAP DB Connectivity opens up new vistas, like the helpful CONTAINS( ) statement supported by SAP HANA DB an other databases.
I would call
Fuzzy Text search the poster child of ADBC.
Content reposted from www.informatik-dv.com