cancel
Showing results for 
Search instead for 
Did you mean: 
Read only

openxml xml parser error

6,504

Hello!

When using "openxml" on results of the Google Geocoding API, I receive an XML parser error with the message "Invalid control character" for certain address details. Here's an example:

select * from openxml("CallGoogleapi"('https://maps.googleapis.com/maps/api/geocode/xml?address=pasica+75000+tuzla&language=en'),'/GeocodeR... with("lat" long varchar 'text()')

with "CallGoogleapi" defined as

FUNCTION "CallGoogleapi"( in "sLocation" long varchar ) 
returns xml
url '!sLocation!'
type 'HTTP:GET'

What can I do, so that openxml does not geerate this error?

Thanks in advance!

Dirk

View Entire Topic

I forgot to mention that this call normally works. For "berlin" I can calculate lat/log without oproblems:

"https://maps.googleapis.com/maps/api/geocode/xml?address=berlin&language=en"

"......<lat>52.5200066</lat><lng>13.4049540</lng>....."

In the query "pasica+75000+tuzla" sa-17 reports the parsing error at character 150, line 6, column 41:

1    < ?xml version="1.0" encoding="UTF-8"?>
2    -< GeocodeResponse>
3     < status>OK< /status>
4     < result>
5        < type>route< /type>
6        < formatted_address>Hadži Hasanage Pašića, Tuzla 75000, Bosnia and Herzegovina< /formatted_address>
7        < address_component>
8          < long_name>Hadži Hasanage Pašića< /long_name>
          .....

Saving the result to a file, I see the following result with an unicode character:

  <formatted_address>Hadži Hasanage Paši**/u001a**a, Tuzla 75000, Bosnia and Herzegovina</formatted_address>

In line 6, column 41 there is a hex "1A00" or U+001A. Apparently, openxml has a problem with this character. Would I have to do a conversion here?

Kind regards, Dirk

Former Member

I'm not sufficiently expert in XML to decide whether this is correct but my guess is that 'encoding="UTF-8"' makes it valid. Unless your database's char collation is UTF-8, it might be worth trying to use nvarchar instead of varchar in your definitions.
At least, this information could help to locate the problem.
Cheers
Volker

0 Likes

I tried both, but I still get the XML parsing error at the same position.

VolkerBarth
Contributor
0 Likes

Does it work when you replace the mask "/u" with "\\u", as that is the expected encoding when using UNISTR to generate NCHAR data based on their code point?

VolkerBarth
Contributor
0 Likes

What encoding does the file use?

0 Likes

I called "SELECT "CallMapsGoogleapi"('https://maps.googleapis.com/maps/api/geocode/xml?address=pasica+75000+tuzla&language=de@region=de')" and saved the rsult directly in Interactive SQL to a file?

Interactive SQL used UTF-8.

0 Likes

There is no /u or \\u in the output. I can't copy the character to the editor. It is the 20th character in this string: "Hadži Hasanage Paši\x1aa" (between i and a at the end).

As soon as I save the comment, the UTF-8 character disappears.