on 2022 Sep 14 12:36 PM
Hi,
SAP Anywhere 17.0.11.7058
Our database is of character encoding 1252 Latin (Western Europe) for CHAR and UTF-8 for NCHAR character sets, everything gets stored just fine and retrieved on our Powerbuilder application, but we also have an interface that exports XML files - if we try to export a record with some UTF-8 characters in them, we get the following error:
There was an error reading the results of SQL Statement. The displayed results may be incorrect or incomplete. Control character in XML output SQLCODE=-911, ODBC 3 State = "HY000"
Our output statement is as follows:
EXECUTE IMMEDIATE 'UNLOAD SELECT XMLGEN( xml version="1.0" encoding="UTF-8" Root{$x}/Root, (SELECT Header.No, Header.JournalName, ... for xml auto, elements) AS x ) TO ''' + filepath_filename + ''' ENCODING ''UTF-8'' format ASCII quotes off escapes off';
Outputs of database queries which do not retrieve UTF characters are working fine.
Any help appreciated.
Cheers
Request clarification before answering.
XML datatype appears to use the CHAR datatype so yours will be encoded in 1252. https://dcx.sap.com/index.html#sqla170/en/html/819b6cda6ce2101488ce846acd462106.html*loio819b6cda6ce...
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
That following discussion with Nick Elson might give some clues how to proceed...
Thanks, but I think there's a misunderstanding - we are not storing the XML at all as a datatype in any table, we are just retrieving text/numeric fields of 2 tables, with some of the fields being NVARCHAR and LONG NVARCHAR, others being mostly CHAR and LONGS, as the XML files - just pure text files with XML markup notation.
And this works fine as long as there are no UTF characters to be retrieved, but as soon as we get some, we are getting the above message?
Our "unload SELECT XMLGEN" line clearly states that encoding should be UTF-8:
EXECUTE IMMEDIATE 'UNLOAD SELECT XMLGEN( <?xml version="1.0" encoding="UTF-8" ?>
and the last output file settings specifies it as well:
ENCODING UTF-8 format ASCII quotes off escapes off'
Here is another thing - when I tried it on a database which has even CHAR as UTF encoding set, everything works just fine. It's only on this database which has CHAR as Latin1252 and NCHAR as UTF-8 which is giving us this problem.
That's exactly the point: The XML functions like XMLGEN() return XML as data type which is based on VARCHAR - and as such, is restricted to your CHAR char set. Therefore it works with UTF8 as CHAR char set but will fail with single-byte char for non-fitting code points...
I see, thanks. So there is simply no way for me to fix it like that?
In that case, I will convert the database to both CHAR and NCHAR encoding UTF-8. Are there any implications of it I should know about? I was a bit surprised UTF-8 wasn't a default encoding set for CHAR, but Latin 1252. Is there any reason for that, anything I should be cautious about?
I don't think you can change the XML data type. If you rely on cp1252 as CHAR charset (like we do...), another approach would be to create the XML document and then use csconvert() to convert it to NCHAR (like we have done regularly). You would still be restricted to the cp1252 charset as the XML functions can only handle that (and therefore won't support, say eastern European characters) but would then encode the result accordingly.
This should resolve the issue if your client expects UTF-8 or other encodings.
Handling multibyte characters always has a bit of a performance impact relative to singlebyte characters but it is probably very minor. It might even help you end-to-end if your web clients are UTF-8 since you will skip a charset translation from 1252 to UTF-8 for all data going to the client.
You will need to choose between UTF8BIN (smaller performance impact but less flexible sorting/comparisons [for example, you cannot get accent insensitivity]) and UCA (more impact but has linguistic sorting). NCHAR uses UCA by default; I recommend using that for CHAR too.
User | Count |
---|---|
82 | |
29 | |
9 | |
8 | |
7 | |
7 | |
7 | |
6 | |
6 | |
6 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.