‎2007 Jan 20 7:10 PM
When using the OPEN DATASET...CLOSE DATASET code to save content from the SAP system into an external file, I ran across an annoying problem.
This is the relevant sample of the code:
open dataset CURR_FILE in text mode for output encoding utf-8.
transfer CURR_LINE to CURR_FILE.
close dataset CURR_FILE.
The purpose of the process was to create a file that can be opened by another program that only reads CSV files. The data was written correctly except for Hebrew text, which came out as gibberish. The same text looks fine when the file is opened as a text file, and the same code worked in a previous, pre-Unicode SAP system, but with the new UTF-8 encoding this problem surfaced. When the file is opened in Excel, the Hebrew text shows the same gibberish unless it is imported through the Excel import wizard.
There is an option to save the file as non-Unicode by declaring it in the OPEN DATASET line, but this causes a short dump when the TRANSFER command is called. I currently have no way to process the file externally after it is created, as it is transferred automatically once created. If anyone knows of a solution to this problem, I'd be grateful for the help.
‎2007 Jan 21 6:15 AM
Hi, please first check whether you are working on a Unicode system.
If not, check whether you have the language packs installed on your system.
Run the program RSCINST in SE38. It shows a pop-up if you are working on a Unicode system and also lists the language packs currently installed on your system. If Hebrew is not installed, contact your Basis team to install that language pack.
‎2007 Jan 21 8:08 AM
Hi Max,
Try ENCODING DEFAULT in the OPEN DATASET statement. This may solve your problem.
Asvhen
‎2007 Jan 21 8:14 AM
Hi Max,
If you have a non-Unicode system (to check this, go to System -> Status, where there is a field "Unicode: Yes/No"), then you need to convert your text from the system code page to the UTF-8 encoding. This can be done with class CL_ABAP_CONV_OUT_CE. The following text and sample program are from the documentation for this class.
Best Regards,
Ofer
CL CL_ABAP_CONV_OUT_CE
____________________________________________________
Short Text
Code Page and Endian Conversion (System Format -> External)
Functionality
Instances of the class CL_ABAP_CONV_OUT_CE allow you to convert ABAP data objects to binary data. (That is, data in the system format is converted to an external format.)
You can convert character sets (for text data) and the byte order (for numeric data).
Additionally, there are static methods, which allow you to ascertain the hexadecimal or decimal value in the Unicode codepage for any character in the current codepage. These methods are: CL_ABAP_CONV_OUT_CE=>UCCP and CL_ABAP_CONV_OUT_CE=>UCCPI.
Relationships
CL_ABAP_CONV_IN_CE: Converts binary data into ABAP data objects.
CL_ABAP_CONV_X2X_CE: Converts ABAP data objects between two external binary formats.
CL_ABAP_CHAR_UTILITIES: Various attributes and methods for character sets and byte order.
CL_NLS_STRUC_CONTAINER: Corrects alignment of structures in containers of type C (or STRING). You need to make this correction if East-Asian characters ("full-width" characters in Chinese, Japanese, and Korean) are to be copied from a non-Unicode to a Unicode system or vice versa. You do not need to make the correction if you use the method CONVERT_STRUC from this class.
Example
In the following example, text from the system codepage is converted to UTF-8 and numbers from the system codepage are converted to little-endian format:
DATA:
  text(4) TYPE c VALUE 'ABC',
  int     TYPE i VALUE 258.
DATA:
  buffer1 TYPE xstring,
  buffer2 TYPE xstring,
  conv    TYPE REF TO cl_abap_conv_out_ce.
conv = cl_abap_conv_out_ce=>create(
         encoding = 'UTF-8'
         endian   = 'L' ).
conv->convert( EXPORTING data   = text
               IMPORTING buffer = buffer1 ).
conv->convert( EXPORTING data   = int
               IMPORTING buffer = buffer2 ).
Calling the CREATE method creates a conversion instance. For example, the target codepage or the byte order used in the input buffer can be specified as the parameter.
After calling the CONVERT method, the buffer1 variable contains the 4 bytes 41424320 (hexadecimal) that represent the string "ABC" in UTF-8. The buffer2 variable contains the 4 bytes 02010000 that represent the value 258 in little-endian format.
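As a further illustration (not part of the SAP documentation), the same call pattern can be used to look at the bytes a Hebrew character turns into, which helps when comparing against a hex dump of the written file. This is a minimal sketch, assuming a Unicode system; the report name and variable names are made up for the example.

```abap
REPORT z_hebrew_utf8_bytes. " hypothetical report name

DATA: alef   TYPE c LENGTH 1 VALUE 'א', " Hebrew letter Alef, U+05D0
      buffer TYPE xstring,
      conv   TYPE REF TO cl_abap_conv_out_ce.

" Create a converter from the system format to UTF-8
conv = cl_abap_conv_out_ce=>create( encoding = 'UTF-8' ).

" Convert the single character into its external byte sequence
conv->convert( EXPORTING data   = alef
               IMPORTING buffer = buffer ).

" Output the bytes in hexadecimal; U+05D0 encoded as UTF-8 is D7 90
WRITE / buffer.
```

If the file contains D7 90 at the position of the Alef, the data was written as valid UTF-8 and the garbled display comes from how the reading program interprets the file.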
‎2007 Jan 21 8:59 AM
Thanks for your answers.
The system is Unicode (otherwise, I wouldn't have had the problem).
‎2007 Jan 21 10:26 AM
Hi Max,
Is the program marked as Unicode-compatible, and did it pass the extended check/Code Inspector with the Unicode flag on?
Also, can you give the exact Hebrew character that you expect to see, its original hex value in the debugger, and its hex value in the file that was created? Is the value in the file different from the value in the program?
How can you tell that the Hebrew is garbled? Did you view the file in IE with the UTF-8 encoding?
Cheers,
Ofer
‎2007 Jan 21 12:06 PM
Hi Ofer
The program is marked Unicode-compatible and it passed the inspections. It writes Hebrew text to the file without any problem, and that text displays correctly when the file is opened as a text file, through the import function of Excel, or in IE. What I'm looking for is a way to drop the Unicode encoding before the download so that the file can be opened directly as CSV (the Excel format that separates items with commas).
The proper way should be the NON-UNICODE addition, but it causes a dump when I use the TRANSFER command to write data to the file.
‎2007 Jan 21 12:48 PM
Hi Max,
It looks like you don't need to write the file as Unicode, but as simple ASCII.
Look at the documentation for class CL_ABAP_CONV_OUT_CE and use it to convert from Unicode to ASCII.
After that you should call OPEN DATASET in legacy mode (IN LEGACY TEXT MODE) and write the file.
Bye,
Ofer
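One possible reading of this advice is to do the conversion explicitly with CL_ABAP_CONV_OUT_CE and write the resulting bytes in binary mode, so that the file layer performs no further conversion. The following is only a sketch under assumptions: the file path and the sample line are placeholders, and code page '1824' is taken from the end of this thread; the right code page depends on what the receiving program expects.

```abap
DATA: curr_file TYPE string VALUE '/tmp/export.csv', " placeholder path
      curr_line TYPE string VALUE 'field1,field2',   " placeholder CSV line
      line_out  TYPE string,
      buffer    TYPE xstring,
      conv      TYPE REF TO cl_abap_conv_out_ce.

" Binary mode writes no line terminator, so append one explicitly
CONCATENATE curr_line cl_abap_char_utilities=>cr_lf INTO line_out.

TRY.
    " Convert from the system format to the target code page
    conv = cl_abap_conv_out_ce=>create( encoding = '1824' ).
    conv->convert( EXPORTING data   = line_out
                   IMPORTING buffer = buffer ).
  CATCH cx_root.
    " Converter could not be created or a character could not be converted
    MESSAGE 'Conversion to the target code page failed' TYPE 'E'.
ENDTRY.

" Write the already converted bytes unchanged
OPEN DATASET curr_file FOR OUTPUT IN BINARY MODE.
IF sy-subrc <> 0.
  MESSAGE 'File could not be opened' TYPE 'E'.
ENDIF.
TRANSFER buffer TO curr_file.
CLOSE DATASET curr_file.
```

With a code page addition on OPEN DATASET in legacy text mode, the runtime performs this conversion itself during TRANSFER, so the explicit converter is mainly useful when the bytes need to be inspected or adjusted before writing.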
‎2007 Jan 21 3:52 PM
OK. The problem was solved by following Ofer's directions, using the standard program RSCP_CONVERT_FILE, and replacing the standard non-Unicode code page 1100 with code page 1824 (which was found after a great deal of experimenting). The OPEN DATASET line is now:
OPEN DATASET filename FOR OUTPUT IN LEGACY TEXT MODE CODE PAGE '1824'.
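For anyone reusing this, here is a minimal sketch of the complete write sequence around that line, with basic error handling added. The path and the sample line are placeholders. SAP code page 1100 corresponds to ISO 8859-1, which has no Hebrew letters, so a conversion error during TRANSFER would explain the earlier short dump; catching CX_SY_CONVERSION_CODEPAGE turns that dump into a handled error.

```abap
DATA: filename  TYPE string VALUE '/tmp/export.csv', " placeholder path
      curr_line TYPE string VALUE 'field1,field2'.   " placeholder CSV line

" Open the file so that TRANSFER converts text to code page 1824
OPEN DATASET filename FOR OUTPUT IN LEGACY TEXT MODE CODE PAGE '1824'.
IF sy-subrc <> 0.
  MESSAGE 'File could not be opened' TYPE 'E'.
ENDIF.

TRY.
    " Characters missing from the target code page raise an exception here
    TRANSFER curr_line TO filename.
  CATCH cx_sy_conversion_codepage.
    CLOSE DATASET filename.
    MESSAGE 'Line contains characters not in the target code page' TYPE 'E'.
ENDTRY.

CLOSE DATASET filename.
```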