on 2018 Dec 20 10:51 AM
We are hitting a problem with the extremely useful FOR JSON RAW option:
CREATE TABLE Blah67(PK int default autoincrement, stuff char(50), primary key (PK));If I select the contents of the table as XMLINSERT INTO Blah67(Stuff) values ('£'); INSERT INTO Blah67(Stuff) values ('€'); INSERT INTO Blah67(Stuff) values ('é');
select * from Blah67 for xml raw;everything is fine:
forxml(1,'PK',xmltab."1",'stuff',xmltab."2")
====================================================================
<row PK="1" stuff="£"/><row PK="2" stuff="€"/><row PK="3" stuff="é"/>
but with JSON
select * from Blah67 for JSON raw;it all goes haywire:
forjson(256,'PK',jsontab."1",'stuff',jsontab."2")
====================================================================
[{"PK":1,"stuff":"£"},{"PK":2,"stuff":"€"},{"PK":3,"stuff":"é"}]
reproducible in 17.0.9.4913 & 16.0.0.2546
I am using catalog collation 1252LATIN1 and character set windows-1252
That is a bit odd. I can see that, internally, FOR JSON always builds its strings as NCHAR since JSON is always in unicode according to rfc34627.
But the server seems to describe FOR JSON values as the "XML" datatype which seems to be equivalent to "LONG VARCHAR". That would mean that the UTF-8 string would be treated as if it were CHAR charset and, assuming your terminal is using 1252LATIN1, it would not undergo character set translation on the way to the client. Unfortunately, I'm not very familiar with our XML datatype handling but, given that you appear to be getting a UTF-8 encoded string back (a string that should have been converted back to 1252LATIN1), we do appear to be treating FOR JSON as CHAR which is incorrect.
In the end, you want your JSON to be in NCHAR (UTF-8) since valid JSON must always be encoded in Unicode according to the standard.
Given what I see, I think the only workaround to get truly valid JSON is to use a database with a CHAR encoding of UTF-8. And I will need to report a bug...
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
FWIW, here's an older answer from Nick on the pecularties of XML being implemented as LONG VARCHAR...
Hi, i have the same problem as Justin. Is there any fix for this problem yet? I consume the JSON RAW result on a webserver and can't get the characters displayed correctly. I tried setting the JSON RAW as "LONG NVARCHAR" but that doesn't work.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Well, it's easy but requires a database reload as explained here - to quote:
You can change your database collation from one collation to another by performing an unload and reload into the collation of choice. Collations are chosen at database creation time and cannot be changed without rebuilding the database.
Once you've done it, do check that any non-standard characters are displaying properly in your application - it may be expecting something else!
This workaround worked for me:
DECLARE vTemp LONG VARCHAR; select * INTO vTemp from Blah67 for JSON raw; SELECT CAST(CAST(vTemp AS BINARY) AS LONG NVARCHAR);
It should even work (do nothing) if the bug gets fixed some time in the future...
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
69 | |
11 | |
10 | |
10 | |
9 | |
9 | |
6 | |
6 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.