cancel
Showing results for 
Search instead for 
Did you mean: 
Read only

Remove illegal char from xml column

Former Member
10,534

So I've inherited this database with some invalid xml data in one table.

I know for a fact that the character is a control char and I can find it in the column (which is of type XML):

SELECT CHARINDEX(char(26), BLA_COLUMN) FROM BLA_TABLE where ID_RECORD = 1234 Returns a valid index. Now, when I try to replace that char with something else:

UPDATE BLA_TABLE SET BLA_COLUMN = REPLACE(BLA_COLUMN, char(26), char(32)) WHERE ID_RECORD = 1234

The statement fails with message: "XML parser error: character: 603, line:1, column:603 Illegal control character SQLCODE=-888, ODBC 3 State="HY000"

Any ideas how to get rid of that character? (Running SQLA 12)

Thank you!

EDIT

Looks like Volker's suggestion to first update the BLA_COLUMN to a Null value then to update to a valid XML, replacing char(26) with char (32), provides a workaround. Thanks Volker!

View Entire Topic
VolkerBarth
Contributor

Is the same true if you try to set the column to null (which would look like the previous contents is parsed, too)...?


EDIT: As this seems to have done the trick, it shows that

  • the parser does not validate the old contents (preventing some kind of "correction deadlock")
  • and NULL is not that bad now and then(Breck, do you hear me?:)
justin_willey
Participant
0 Likes

If that is happening it's hard to see how you would ever be able to fix it - it's odd that it could get in there in the first place unless the behaviour has been changed at some time and it is now stricter. If this is the case, it's beginning to look more like a bug than a feature 🙂

Former Member

Setting the column to Null first, then updating it with a valid XML (ie char 26 replaced with char 32) seems to do the trick.

justin_willey
Participant

I had a go reproducing the problem (on 10.0.1 and 16.0) but couldn't - I could select and update the bad XML without an issue - so I was wondering what was causing the problem. Anyway, you have a work around so that's great.

create table TestXML(PK int default autoincrement, XMLStuff xml, primary key(PK));

insert into TestXML(XMLStuff) values ('<Bad XML>');
insert into  TestXML(XMLStuff) values ('<GoodXML/>');
insert into  TestXML(XMLStuff) values (string('<ReallyBadXML',char(26),'>'));
insert into  TestXML(XMLStuff) values (string('<EvenWorseXML',char(26)));

select * from TestXML where PK = 1;
-- works OK

update TestXML set XMLStuff = '<BetterXML/>' where  PK = 1;
-- works OK

select * from TestXML where PK = 3;
-- works OK

update TestXML set XMLStuff = '<BetterXML/>' where  PK = 3;
-- works OK

select * from TestXML where PK = 4;
-- works OK

update TestXML set XMLStuff = '<BetterXML/>' where  PK = 4;
-- works OK

drop table TestXML;
VolkerBarth
Contributor

12.0.1.4085 shows the same success.

Also trying to turn a valid XML into an invalid one doesn't raise an error, such as:

update TestXML set XMLStuff =
  replace(XMLStuff, char(47), char(26)) where  PK = 2;
-- works OK
Former Member

SQL Anywhere only validates XML when parsing it (for example, using OPENXML). It is not validated when inserting or updating a value, or casting to the XML type. I wonder if the original database has a trigger, computed column, or check constraint that ends up parsing the XML.

Former Member

You guys are right. Having looked more carefully at the table, I've notice the presence of a trigger that indeed does some parsing on update (disabling the trigger allowed the updates of fixing the xml just fine). Can't believe that didn't cross my mind! So there you go, case closed. Thank you!