‎2014 Mar 10 5:51 AM
Dear SDN members,
I am facing some issues with XML encoding for diacritical characters (non-English special characters).
For example, the org unit name AfricaéçaçÃtest is converted to Africa a test.
The special characters are replaced with spaces. I used the code below:
DATA: lv_xml_data TYPE string,
      lv_buffer   TYPE xstring,
      lo_conv     TYPE REF TO cl_abap_conv_in_ce.
IF lv_xml_data IS NOT INITIAL.
* ====================================================================================
* Correct the encoding of the output
* ====================================================================================
  EXPORT lv_xml_data TO DATA BUFFER lv_buffer.
  lo_conv = cl_abap_conv_in_ce=>create( input       = lv_buffer
                                        encoding    = '1105'
                                        ignore_cerr = 'X'
                                        replacement = '' ).
  lo_conv->convert( EXPORTING input = lv_buffer
                    IMPORTING data  = lv_xml_data ).
  SHIFT lv_xml_data UP TO '<root'.
ENDIF.
ev_xml_data = lv_xml_data.
I tried removing the EXPORT, which is causing the issue (AfricaéçaçÃtest converts to Africaéçaçà test).
I tried changing the encoding to 4110 (UTF-8), 1160, 1101, 1100, etc., but it didn't work.
I also tried CALL FUNCTION 'SCMS_STRING_TO_XSTRING' EXPORTING text = lv_xml_data IMPORTING buffer = lv_buffer.
and also string to binary and then binary to xstring, but none of these has solved my problem.
I would appreciate your inputs to resolve this issue.
Thanks & Regards
Satish
‎2014 Mar 10 7:34 AM
Hi Satish,
The function module 'SCMS_STRING_TO_XSTRING' has this issue.
Use the BCS class to avoid it; the special characters are then sent without truncation or a dump.
TRY.
    CALL METHOD cl_bcs_convert=>string_to_solix
      EXPORTING
        iv_string   = lv_string
        iv_codepage = lc_codepage
        iv_add_bom  = gc_x
      IMPORTING
        et_solix    = lt_solix.
  CATCH cx_bcs.
ENDTRY.
*-- Create persistent send request
l_send_request = cl_bcs=>create_persistent( ).
wt_contents[] = t_contents[].
*-- Get the length of the Document
DESCRIBE TABLE wt_contents LINES l_cnt.
READ TABLE wt_contents INTO ws_contents INDEX l_cnt.
l_doc_len = ( l_cnt - 1 ) * 255 + strlen( ws_contents ).
*-- Subject of the mail
l_sub = w_mail_subj.
*-- Create Document
TRY.
    l_document = cl_document_bcs=>create_document(
                   i_type       = lc_htm
                   i_text       = wt_contents
                   i_length     = l_doc_len
                   i_subject    = l_sub
                   i_language   = sy-langu
                   i_importance = '1' ).
  CATCH cx_document_bcs.
ENDTRY.
*-- Subject of the mail
MOVE w_mail_subj TO l_subj.
w_document = l_document.
Raghav
‎2014 Mar 10 7:41 AM
‎2014 Mar 11 7:08 AM
You can use class cl_abap_conv_x2x_ce to change the encoding of the hex representation of a text.
Looking at your example, you are trying to change the encoding of the text directly.
Consider é. Its hex representation in UTF-8 is C3A9. When this hex data is interpreted as Latin-encoded text, it reads as é.
The snippet below converts AfricaéçaçÃtest to Africaéçaç#test.
[Code snippet not preserved.]
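The byte-level point above (the same bytes C3A9 read as é under UTF-8 but as é under Latin-1) can be checked with a minimal sketch like the one below. It is not the snippet from this post. It assumes a Unicode system and that code page 4110 is UTF-8 and 1100 is ISO 8859-1 (Latin-1); the report name and variable names are made up for illustration.

```abap
REPORT z_demo_encoding_views. " hypothetical report name

DATA: lo_in     TYPE REF TO cl_abap_conv_in_ce,
      lo_out    TYPE REF TO cl_abap_conv_out_ce,
      lv_bytes  TYPE xstring,
      lv_utf8   TYPE string,
      lv_latin1 TYPE string,
      lv_fixed  TYPE string.

* C3A9 is the UTF-8 byte sequence for é, as stated in the post.
lv_bytes = 'C3A9'.

* Read the bytes as UTF-8 (assumed: code page 4110): one character.
lo_in = cl_abap_conv_in_ce=>create( encoding = '4110'
                                    input    = lv_bytes ).
lo_in->read( IMPORTING data = lv_utf8 ).

* Read the very same bytes as Latin-1 (assumed: code page 1100):
* two characters, which is the garbled form seen in the XML.
lo_in = cl_abap_conv_in_ce=>create( encoding = '1100'
                                    input    = lv_bytes ).
lo_in->read( IMPORTING data = lv_latin1 ).

* Repair direction: write the garbled text out as Latin-1 bytes again,
* then read those bytes as UTF-8.
lo_out = cl_abap_conv_out_ce=>create( encoding = '1100' ).
lo_out->convert( EXPORTING data   = lv_latin1
                 IMPORTING buffer = lv_bytes ).
lo_in = cl_abap_conv_in_ce=>create( encoding = '4110'
                                    input    = lv_bytes ).
lo_in->read( IMPORTING data = lv_fixed ).

WRITE: / lv_utf8, / lv_latin1, / lv_fixed.
```

The repair step only works when every garbled pair maps back to a valid UTF-8 sequence; a lone Ã followed by a plain letter does not, which is where the # comes from.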
‎2014 Mar 11 7:31 AM
Hi Manish,
The above code replaces the accented characters with #; I want the output without #.
Please suggest.
Thanks & Regards
Satish
‎2014 Mar 11 7:47 AM
It is supposed to show #, because à is not followed by something that can be interpreted correctly in UTF-8. You can dive deeper to the hex level and do some substitutions that correct the output. The hex equivalent of Ãtest is C374657374. C3 could not be converted, so you get a #. In order to get à in UTF-8, C3 should be replaced by C383.
So, do not ignore the conversion error, and apply a suitable substitution at the hex level.
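A rough sketch of that advice, using the hex value quoted above, might look like this. It assumes code page 4110 is UTF-8 and that a strict read (ignore_cerr = abap_false) raises cx_sy_conversion_codepage on the lone C3; the report name, constants, and variables are hypothetical.

```abap
REPORT z_demo_strict_decode. " hypothetical report name

CONSTANTS: lc_lone_c3 TYPE x LENGTH 1 VALUE 'C3',
           lc_c383    TYPE x LENGTH 2 VALUE 'C383'.

DATA: lo_in    TYPE REF TO cl_abap_conv_in_ce,
      lv_bytes TYPE xstring,
      lv_text  TYPE string.

* Hex from the post: C3 followed by the ASCII bytes of "test".
lv_bytes = 'C374657374'.

TRY.
    " Strict mode: do not ignore conversion errors.
    lo_in = cl_abap_conv_in_ce=>create( encoding    = '4110'
                                        input       = lv_bytes
                                        ignore_cerr = abap_false ).
    lo_in->read( IMPORTING data = lv_text ).
  CATCH cx_sy_conversion_codepage.
    " Substitution suggested in the post: C3 becomes C383.
    " This is only safe here because the buffer holds a single, lone C3.
    " A blind replace would also damage valid pairs such as C3A9.
    REPLACE FIRST OCCURRENCE OF lc_lone_c3 IN lv_bytes
            WITH lc_c383 IN BYTE MODE.
    lo_in = cl_abap_conv_in_ce=>create( encoding = '4110'
                                        input    = lv_bytes ).
    lo_in->read( IMPORTING data = lv_text ).
ENDTRY.

WRITE / lv_text.
```

For real data, the substitution would have to check the byte after each C3 before replacing it, so that valid two-byte sequences are left untouched.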