Solved: Problem exporting text in Hebrew

Former Member · ‎2007 Jan 20

When using the OPEN DATASET...CLOSE DATASET code to save content from the SAP system into an external file, I ran across an annoying problem.

This is the relevant sample of the code:

open dataset CURR_FILE in text mode for output encoding utf-8.

transfer CURR_LINE to CURR_FILE.

close dataset CURR_FILE.

The goal was to create a file for another program that can only read CSV files. The data was exported correctly except for the Hebrew text, which came out as gibberish. The same text looked fine when the file was opened as a plain text file, and the same code worked on our previous, pre-Unicode SAP system; the problem only surfaced with the move to UTF-8. When the file is opened in Excel, the Hebrew shows up as the same gibberish unless the file is brought in through Excel's import wizard.

There is an option to write the file as non-Unicode by declaring it in the OPEN DATASET line, but this causes a short dump when the TRANSFER command is called. I currently can't process the file externally after it's created, because it is transferred automatically as soon as it is created. If anyone knows of a solution to this problem, I'd be grateful for the help.

Former Member · ‎2007 Jan 21

Hi Max,

It looks like you don't need to write the file as Unicode, but as simple ASCII.

Look at the documentation for class CL_ABAP_CONV_OUT_CE and use it to convert from Unicode to ASCII.

After that you should call OPEN DATASET IN LEGACY MODE and write the file.

Bye,

Ofer

rahulkavuri · ‎2007 Jan 21

Hi, please first check whether you are working on a Unicode system.

If not, check whether the language packs are installed on your system.

Run the program RSCPINST in SE38. It shows a pop-up if you are working on a Unicode system and also lists the language packs currently installed on your system. If Hebrew is not installed, you need to contact the Basis team to install that language pack.
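The Unicode check can also be done from code. This is only a minimal sketch: the report name is made up for illustration, and it relies on the CHARSIZE attribute of CL_ABAP_CHAR_UTILITIES, which gives the size of one character in bytes.

```abap
REPORT z_check_unicode.  " hypothetical report name

* CHARSIZE is the length of one character in bytes on this system:
* 1 on a non-Unicode system, 2 on a Unicode system.
IF cl_abap_char_utilities=>charsize > 1.
  WRITE / 'Unicode system'.
ELSE.
  WRITE / 'Non-Unicode system'.
ENDIF.
```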

award points if found helpful

Former Member · ‎2007 Jan 21

Hi Max,

Try ENCODING DEFAULT in the OPEN DATASET statement. This may solve your problem.

Asvhen

Former Member · ‎2007 Jan 21

Hi Max,

If you have a non-Unicode system (to check this, look at the "Unicode: Yes/No" field under System -> Status), then you need to convert your text from the system codepage to the UTF-8 encoding. This can be done with class CL_ABAP_CONV_OUT_CE. The following text and sample program are from the documentation for this class.

Best Regards,

Ofer

CL CL_ABAP_CONV_OUT_CE

____________________________________________________

Short Text

Code Page and Endian Conversion (System Format -> External)

Functionality

Instances of the class CL_ABAP_CONV_OUT_CE allow you to convert ABAP data objects to binary data. (That is, data in the system format is converted to an external format.)

You can convert character sets (for text data) and the byte order (for numeric data).

Additionally, there are static methods, which allow you to ascertain the hexadecimal or decimal value in the Unicode codepage for any character in the current codepage. These methods are: CL_ABAP_CONV_OUT_CE=>UCCP and CL_ABAP_CONV_OUT_CE=>UCCPI.

Relationships

CL_ABAP_CONV_IN_CE

Converts binary data into ABAP data objects

CL_ABAP_CONV_X2X_CE

Converts ABAP data objects between two external binary formats

CL_ABAP_CHAR_UTILITIES

Various attributes and methods for character sets and byte order

CL_NLS_STRUC_CONTAINER

Corrects alignment of structures in containers of type C (or STRING). You need to make this correction if East-Asian characters ("full-width" characters in Chinese, Japanese, and Korean) are to be copied from a non-Unicode to a Unicode system or vice versa. You do not need to make the correction if you use the method CONVERT_STRUC from this class.

Example

In the following example, text from the system codepage is converted to UTF-8 and numbers from the system codepage are converted to little-endian format:

DATA:
  text(4) TYPE c VALUE 'ABC',
  int     TYPE i VALUE 258.

DATA:
  buffer1 TYPE xstring,
  buffer2 TYPE xstring,
  conv    TYPE REF TO cl_abap_conv_out_ce.

conv = cl_abap_conv_out_ce=>create(
         encoding = 'UTF-8'
         endian   = 'L' ).

conv->convert( EXPORTING data   = text
               IMPORTING buffer = buffer1 ).

conv->convert( EXPORTING data   = int
               IMPORTING buffer = buffer2 ).

Calling the CREATE method creates a conversion instance. For example, the target codepage or the byte order used in the output buffer can be specified as parameters.

After calling the CONVERT method, the buffer1 variable contains the 4 bytes 41424320 (hexadecimal) that represent the string "ABC" in UTF-8. The buffer2 variable contains the 4 bytes 02010000 that represent the value 258 in little-endian format.
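For the Hebrew export discussed in this thread, the same class could in principle be pointed at a non-Unicode code page instead of UTF-8. The sketch below is an assumption-laden illustration, not a tested solution: the report name and variables are made up, the sample text is a placeholder, and '1824' is simply the code page number reported as working later in this thread. It assumes the ENCODING parameter accepts an SAP code page number.

```abap
REPORT z_conv_to_codepage.  " hypothetical report name

DATA: lv_text   TYPE string,
      lv_buffer TYPE xstring,
      lo_conv   TYPE REF TO cl_abap_conv_out_ce,
      lx_error  TYPE REF TO cx_root,
      lv_msg    TYPE string.

* Placeholder content; in the real program this would be the line to export.
lv_text = 'ABC'.

TRY.
    " Assumption: '1824' is a code page number known to this system.
    lo_conv = cl_abap_conv_out_ce=>create( encoding = '1824' ).

    lo_conv->convert( EXPORTING data   = lv_text
                      IMPORTING buffer = lv_buffer ).

    " An XSTRING is written to the list as hexadecimal digits.
    WRITE: / 'Converted bytes:', lv_buffer.

  CATCH cx_root INTO lx_error.
    " Broad catch to keep the sketch short; production code should
    " catch the specific conversion exceptions instead.
    lv_msg = lx_error->get_text( ).
    WRITE: / 'Conversion failed:', lv_msg.
ENDTRY.
```

The resulting byte buffer would then have to be written with a file opened in binary mode, since it is already encoded.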

Former Member · ‎2007 Jan 21

Thanks for your answers.

The system is Unicode (otherwise, I wouldn't have had the problem).

Former Member · ‎2007 Jan 21

Hi Max,

Is the program marked as Unicode-compatible, and did it pass the extended check/Code Inspector with the Unicode flag on?

Also, can you give the exact Hebrew character that you expect to see, its original hex value in the debugger, and its hex value in the file that was created? Is the value in the file different from the value in the program?

How can you tell that the Hebrew is garbled? Did you view the file in IE with the UTF-8 encoding?
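One way to get the hex values in the file is to read it back in binary mode and print the raw bytes. A minimal sketch, assuming a hypothetical file path and report name:

```abap
REPORT z_dump_file_hex.  " hypothetical report name

DATA: lv_file TYPE string VALUE '/tmp/export.csv',  " hypothetical path
      lv_raw  TYPE xstring.

OPEN DATASET lv_file FOR INPUT IN BINARY MODE.
IF sy-subrc <> 0.
  WRITE: / 'Could not open', lv_file.
  RETURN.
ENDIF.

* With an XSTRING target in binary mode, the rest of the file is read at once.
READ DATASET lv_file INTO lv_raw.
CLOSE DATASET lv_file.

* An XSTRING is written to the list as hexadecimal digits.
WRITE: / 'File bytes (hex):', lv_raw.
```

As a reference point, the Hebrew letter alef (U+05D0) is the two bytes D7 90 in UTF-8, while single-byte Hebrew code pages such as ISO 8859-8 store it as the single byte E0.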

Cheers,

Ofer

Former Member · ‎2007 Jan 21

Hi Ofer

The program is marked Unicode-compatible and it passed the inspections. It writes the Hebrew text to the file without any problem, and that text displays correctly when the file is opened as a text file, through Excel's import function, or in IE. What I'm looking for is a way to drop the Unicode encoding before the download so that the file can be opened directly as CSV (the Excel format that separates items with commas).

The proper way should be the non-unicode definition but this definition causes a dump when I use the TRANSFER command to move data into the file.

Former Member · ‎2007 Jan 21

OK. The problem was solved by following Ofer's directions, using the standard program RSCP_CONVERT_FILE, and replacing the standard non-Unicode code page 1100 with code page 1824 (which was found after a great deal of experimenting). The OPEN DATASET line is now:

OPEN DATASET filename FOR OUTPUT IN LEGACY TEXT MODE CODE PAGE '1824'.
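For anyone landing here later, the statement above might sit in a program roughly like the following. This is only a sketch under assumptions: the report name, file path, line type and sample rows are invented for illustration, and the exception handling is one possible way to avoid the short dump mentioned earlier if a character cannot be converted to the target code page.

```abap
REPORT z_export_hebrew_csv.  " hypothetical report name

TYPES ty_line TYPE c LENGTH 255.

DATA: lv_file  TYPE string VALUE '/tmp/export.csv',  " hypothetical path
      lt_lines TYPE STANDARD TABLE OF ty_line,
      lv_line  TYPE ty_line,
      lx_error TYPE REF TO cx_root,
      lv_msg   TYPE string.

* Placeholder rows; the real program would fill these with the CSV lines.
APPEND 'ID,NAME'   TO lt_lines.
APPEND '1,EXAMPLE' TO lt_lines.

TRY.
    OPEN DATASET lv_file FOR OUTPUT IN LEGACY TEXT MODE CODE PAGE '1824'.
    IF sy-subrc <> 0.
      WRITE: / 'Could not open', lv_file.
      RETURN.
    ENDIF.

    LOOP AT lt_lines INTO lv_line.
      " Each TRANSFER converts the line to the code page given above.
      TRANSFER lv_line TO lv_file.
    ENDLOOP.

    CLOSE DATASET lv_file.

  CATCH cx_sy_conversion_codepage
        cx_sy_codepage_converter_init
        cx_sy_file_access_error INTO lx_error.
    " Report the problem instead of letting the program dump.
    CLOSE DATASET lv_file.
    lv_msg = lx_error->get_text( ).
    WRITE: / 'Export failed:', lv_msg.
ENDTRY.
```

OPEN DATASET also offers the additions REPLACEMENT CHARACTER and IGNORING CONVERSION ERRORS for cases where unconvertible characters should be substituted rather than treated as errors.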
