
Cyrillic alphabet not available in initial replication, but available for later insert and update

Former Member

Hello, I have a central db (UCA, UTF-8, Sybase 11.0.1.2960) and remote dbs (same set-up). In this database I store brands:

CREATE TABLE "icat"."Brand" (
    "BrandID" NUMERIC(16,0) NOT NULL,
    "Name" NVARCHAR(50) NOT NULL,
    PRIMARY KEY ( "BrandID" ASC )
)

When I add a new brand that contains Cyrillic (“трудно”), it replicates to one of my remote dbs without any problem. The statement looks like:

INSERT INTO icat.Brand(BrandID, Name) VALUES (1000003637,TO_NCHAR(0xD0A2D0A0D0A3D094D09DD09E,'UTF-8'))

Mind the TO_NCHAR and the fact that the name field is NVARCHAR.

However, when I create a new remote DB and run SQL Remote to generate the replication files and read these at the remote db, the “трудно” turns into other, even weirder characters (squares and whatnot). If I then update the name in the central db and run the replication, it comes out correct, with the update statement using TO_NCHAR.

Does anyone have a clue as to why the initial load of data is handled differently from later inserts and updates?

Kind regards Dimitri

Former Member

Maybe I am not understanding the nature of your problem, but the sample string (shown) being passed to TO_NCHAR() does not seem to include any Cyrillic characters. Did I miss something?

In UTF-8 0x313233C387C389C389C387 is

'1', '2', '3' [ 0x31, 0x32, 0x33 ] followed by 'Ç', 'É', 'É', 'Ç' [0xc387, 0xc389, 0xc389,0xc387 ]

http://www.decodeunicode.org/en/u+00c7/properties http://www.decodeunicode.org/en/u+00c9/properties
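The decoding described in this reply can be reproduced with a few lines of Python. This is only an illustrative sketch that decodes the quoted hex literal as UTF-8; it does not involve SQL Anywhere itself, and the variable names are made up for the example.

```python
# Decode the hex literal quoted in the reply as UTF-8 and list its code points.
raw = bytes.fromhex("313233C387C389C389C387")
text = raw.decode("utf-8")

# One entry per decoded character, e.g. "U+0031" for the digit 1.
code_points = [f"U+{ord(ch):04X}" for ch in text]

print(text)
print(code_points)
```

The digits decode from one byte each and the accented letters from two bytes each, so none of the decoded characters fall in the Cyrillic block.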

Since it works after you change the data, I suspect the input side of this is somehow failing you.

HTH

Former Member

The insert statement that I copied must have been from another test; I'll run another test on Monday and post that. But the problem is not the insert statements that I can get from the log, but rather the initial load of data, which does not produce these kinds of statements, just a select and a count of how many records were inserted.

Former Member

TO_NCHAR(0xD0A2D0A0D0A3D094D09DD09E,'UTF-8') is what I get when I pass трудно
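As a cross-check, the hex argument can be rebuilt from a string outside the database. The sketch below uses a hypothetical helper, `to_nchar_literal`, that simply formats the UTF-8 bytes of a Python string in the same shape as the statement in the log; it is not how SQL Remote produces the statement. Note that the hex shown in this thread corresponds to the uppercase spelling ТРУДНО, which suggests the stored value may have been uppercase.

```python
def to_nchar_literal(value: str) -> str:
    """Format a string as a TO_NCHAR(0x..., 'UTF-8') expression (hypothetical helper)."""
    hex_bytes = value.encode("utf-8").hex().upper()
    return f"TO_NCHAR(0x{hex_bytes},'UTF-8')"


upper = to_nchar_literal("ТРУДНО")  # uppercase spelling
lower = to_nchar_literal("трудно")  # lowercase spelling

print(upper)
print(lower)
```

Each Cyrillic letter occupies two bytes in UTF-8, so both six-letter spellings yield 24 hex digits, but with different byte values.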

VolkerBarth
Contributor

I run the sql remote to generate the replication files and read these at the remote db

What exactly do you mean by that? Do you use the DBXTRACT utility to create the remote database? As to the "read these": do you mean a reload.sql and the corresponding unloaded DAT files that are referenced by LOAD TABLE statements? Or are these message files sent by DBREMOTE?

Former Member

All messages are generated and read by using the dbremote.exe

During the initial replication, the log on the receiving (remote) database says:

select name , .... from brand

5000 rows synchronized.

Later, when I insert or update a record at the central database, the log at the remote database has insert and update statements using TO_NCHAR.

VolkerBarth
Contributor

So you are issuing SYNCHRONIZE SUBSCRIPTION statements at the consolidated to "fill" the remote database? (I'm asking since we have always extracted data from the consolidated and re-loaded those into the remotes locally before we have shipped them to remote users...)

Do you use the CharSet (CS) connection parameter when using SQL Remote?


When you compare the contents of the message file of an initial replication and of a "normal" run (say, by using DBREMOTE -v -o MySrConsole.log), do the data for the unicode column differ for the same row (say, when altered to the same value)?
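To illustrate why the client character set can matter here, the following sketch shows what happens in general when UTF-8 bytes are interpreted under a different single-byte character set. Latin-1 is chosen purely as an assumption for the demonstration; the thread does not establish which character set dbremote actually used during the initial load.

```python
original = "ТРУДНО"

# The bytes as they would travel if the sender encodes the text as UTF-8.
wire_bytes = original.encode("utf-8")

# A receiver that wrongly assumes a single-byte charset (Latin-1 is an
# assumption for this demo) turns every byte into its own character.
garbled = wire_bytes.decode("latin-1")

# The damage is reversible only while the raw bytes are still intact.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(original), len(wire_bytes), len(garbled))
print(repaired == original)
```

Six letters become twelve unrelated characters, several of them unprintable control codes, which is consistent with text showing up as squares.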

Former Member

Adding the charSet connection parameter to the dbremote command seems to fix everything.

However, I was under the impression that I would need to convert all my varchar fields to nvarchar to hold the Cyrillic text. But during yesterday's tests with CharSet, I noticed that even the varchar fields had no problem with the Cyrillic symbols. Is this normal behavior?

VolkerBarth
Contributor

If you only (or primarily) use Cyrillic, I guess the single-byte code page 1251 and the corresponding collation 1251CYR should be sufficient to store these values in CHAR fields (and historically, for SQL Anywhere databases, they will have been sufficient before Unicode support/NCHAR was introduced with v10...). However, you will not be able to store characters from different languages/scripts there, say Latin characters.
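The trade-off between a single-byte code page and UTF-8 can be checked with Python's built-in `cp1251` and `utf-8` codecs. This is a sketch of the encodings only, not of SQL Anywhere collations, and `fits_cp1251` is a hypothetical helper. Unaccented ASCII letters are part of code page 1251, so the limitation concerns characters outside it, such as accented Latin letters.

```python
def fits_cp1251(text: str) -> bool:
    """Return True if every character can be encoded in code page 1251 (hypothetical helper)."""
    try:
        text.encode("cp1251")
        return True
    except UnicodeEncodeError:
        return False


word = "трудно"
cp1251_bytes = word.encode("cp1251")  # one byte per Cyrillic letter
utf8_bytes = word.encode("utf-8")     # two bytes per Cyrillic letter

print(len(cp1251_bytes), len(utf8_bytes))
print(fits_cp1251("трудно"), fits_cp1251("brand"), fits_cp1251("Ç"))
```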

Accepted Solutions (1)

Former Member

Adding the CharSet=utf-8 connection parameter to the dbremote command seems to fix everything.

"C:\Program Files (x86)\SQL Anywhere 11\BIN32\dbremote.exe" -c "eng=icat9636;dbn=icat9636;CharSet=utf-8" -b -qc -r -os 50M -o "d:\applicationdata\ICAT\Sync\SqlRemoteLogs\icatcentral\dbremote_messages.log" -l 100000 -t -v "D:\Databases\Sybase\icat\icatlocal"
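When several dbremote scripts are in use, a quick check that each connection string declares a character set can help avoid this problem on a new remote. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it treats parameter names case-insensitively, and it uses a naive split on semicolons that does not handle quoted values. It accepts both the long name CharSet and the short form CS mentioned earlier in the thread.

```python
from typing import Dict, Optional


def parse_connection_string(conn: str) -> Dict[str, str]:
    """Split a 'key=value;key=value' string into a dict with lowercased keys."""
    params = {}
    for part in conn.split(";"):
        if not part.strip():
            continue  # ignore empty segments such as a trailing ';'
        key, _, value = part.partition("=")
        params[key.strip().lower()] = value.strip()
    return params


def declared_charset(conn: str) -> Optional[str]:
    """Return the CharSet (or CS) value, or None if it is not declared."""
    params = parse_connection_string(conn)
    return params.get("charset") or params.get("cs")


with_charset = declared_charset("eng=icat9636;dbn=icat9636;CharSet=utf-8")
without_charset = declared_charset("eng=icat9636;dbn=icat9636")

print(with_charset)
print(without_charset)
```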
