
dbremote on windows + linux (Character encoding problem)

Former Member

Hello,

We use SQL Anywhere 10.0.1.4103.

The consolidated server runs on Windows, and most of the remote sites run Windows too. We recently added a few Linux remotes, and on those we see a replication problem with some characters.

When a Windows remote enters text in the database, replication sometimes fails on the Linux systems with this message:

I. 2013-09-03 09:03:00. INSERT INTO DBA.CardEntries(CardEntries,ClientCard,CreationDate,Amount,
                              REMOTENAME,Salesperson,EntryProcessed,BatchID)
                        VALUES ('O000CB','J001UW','12:26:15.446918 2013/08/30',169,'SiteTest1','Sil Sch.r',0,NULL)
E. 2013-09-03 09:03:00. SQL statement failed: (-131) Syntax error near 'Sil Sch.,0,NULL)' on line 3
E. 2013-09-03 09:03:00. Skipping:
E. 2013-09-03 09:03:00. INSERT INTO DBA.CardEntries(CardEntries,ClientCard,CreationDate,Amount,
                              REMOTENAME,Salesperson,EntryProcessed,BatchID)
                        VALUES ('O000CB','J001UW','12:26:15.446918 2013/08/30',169,'SiteTest1','Sil Sch.r',0,NULL)

The "offending" text is the entry for the sales person which should be "Sil Schär", but apparently it stumbles over the ä character.

All databases have the same encodings:

CHAR Collation: 1251LATIN1

CHAR Encoding: windows-1252

NCHAR Collation: UCA

NCHAR Encoding: UTF-8

The replication is done via Email. In the connection string of dbremote we specify nothing special, only uid,pwd,eng,dbn

I think that dbremote on Linux uses the wrong character set when decoding the emails it receives, and then tries to apply the SQL operation in the wrong encoding.

The strange thing is that messages containing the ü character replicate correctly, while the ä seems to cause problems.

Any ideas on how to solve it?

VolkerBarth
Contributor

"ß" won't be a problem in CH, right?

Former Member

Yes, we don't use the "ß", but then, we have éèà and ç for our French customers as well... 🙂

Accepted Solutions (0)

Answers (3)

regdomaratzki
Product and Topic Expert

Try adding "charset=none" to the connection string for dbremote on all nodes. John's comment that dbremote assumes the OS charset when reading a message is correct, and Volker points to a very relevant section of the readme file.

Former Member

Thanks, this solved the problem. I assume a "CharSet=windows-1252" would also work?

regdomaratzki
Product and Topic Expert

Assuming that is the proper character set, yes. If all the databases use the same encoding, I prefer using charset=none everywhere to tell dbremote never to do character set translation. That setting works on every computer, regardless of the locale configured there.

johnsmirnios
Participant

I think you are right. Something is likely interpreting the data as UTF-8 on Linux. In cp1252, 'ä' is encoded as 0xE4, which in UTF-8 is the lead byte of a 3-byte character, so it ends up gobbling the 'r' and the closing quote as part of that character. That's why you see the syntax error. In cp1252, 'ü' is encoded as 0xFC, which is not a valid UTF-8 lead byte (or follow byte, for that matter), so it just gets passed through as a single byte and doesn't cause problems.
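The byte-level argument above can be checked with a short Python sketch. It is only an illustration of the UTF-8 byte ranges, not of what dbremote does internally; the helper name `utf8_role` is made up for this example.

```python
def utf8_role(byte: int) -> str:
    """Classify one byte by the role it can play in a UTF-8 stream."""
    if byte <= 0x7F:
        return "ascii"
    if 0x80 <= byte <= 0xBF:
        return "continuation"
    if 0xC2 <= byte <= 0xDF:
        return "lead-2"
    if 0xE0 <= byte <= 0xEF:
        return "lead-3"
    if 0xF0 <= byte <= 0xF4:
        return "lead-4"
    return "invalid"  # 0xC0, 0xC1 and 0xF5..0xFF never appear in valid UTF-8


# Encode the two umlauts the way a windows-1252 database stores them.
a_umlaut = "ä".encode("cp1252")  # single byte 0xE4
u_umlaut = "ü".encode("cp1252")  # single byte 0xFC

role_a = utf8_role(a_umlaut[0])  # announces a 3-byte sequence
role_u = utf8_role(u_umlaut[0])  # not usable as lead or follow byte

# The tail of the failing statement, as cp1252 bytes, read back as UTF-8.
raw = "'Sil Schär',0,NULL)".encode("cp1252")
try:
    raw.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
```

Python's strict decoder rejects the sequence outright, because 'r' is not a continuation byte. A more lenient reader that trusts the lead byte would instead consume the two following bytes, which matches the truncated 'Sil Sch.,0,NULL)' in the error message.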

I don't know anything about the dbremote or email side of things though. I expect that the emails being sent don't have the encoding specified in the header and therefore the other side is assuming OS charset? Perhaps you can add such a header yourself?

VolkerBarth
Contributor

Here's a note on SQL Remote and charsets from the newest v12 EBF (3942).

In my understanding, it doesn't describe a bugfix but a "how to" - and possibly that might work for you, too:

================(Build #3850  - Engineering Case #730270)================

SQL Remote always assumes that all databases involved in replication share the same character set. By default, SQL Remote will always apply source CHAR data to a target database using the default character set for the operating system it is running on, ignoring the source data character set.

When using a database character set that is different than the default character set for the operating system, dbremote must be instructed to perform explicit data conversion to that character set on its connection string:

e.g. dbremote -c "CHARSET=utf8;…"

or instruct dbremote to always use the CHAR character set of the target database to apply the remote CHAR data:

e.g. dbremote -c "CHARSET=none;…"
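If dbremote is started from a script on each node, the CharSet parameter can be appended in one place. This is a minimal Python sketch under assumptions: the helper `with_charset` is hypothetical, the uid/pwd/eng/dbn values are placeholders, and the simple split on ";" does not handle quoted values that themselves contain a semicolon.

```python
def with_charset(conn: str, charset: str = "none") -> str:
    """Return conn with CharSet set, replacing any existing CharSet entry."""
    parts = [p for p in conn.split(";") if p.strip()]
    # Drop an existing CharSet parameter, compared case-insensitively.
    kept = [p for p in parts if p.split("=", 1)[0].strip().lower() != "charset"]
    kept.append(f"CharSet={charset}")
    return ";".join(kept)


# Placeholder connection values; substitute the ones for the actual site.
base = "uid=dba;pwd=secret;eng=remote1;dbn=remote1"

# Argument list for the dbremote call, e.g. for subprocess.run(argv).
argv = ["dbremote", "-c", with_charset(base)]
```

Passing the arguments as a list avoids shell quoting problems with the semicolons in the connection string.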