on 2014 Feb 13 3:54 PM
Hi,
Interested in the intriguing possibilities of spatial data manipulation now that you have your SPS07 instance?
Wish to learn more or contribute along with other explorers in this brave new world of business data meets geospatial?
There appears to be very little in way of example content. The official reference guide, which albeit useful, does not provide use cases within the context of SQLScript where the functionality could really be exploited over large geospatial datasets.
http://help.sap.com/hana/SAP_HANA_Spatial_Reference_en.pdf
There are a few other blogs out there which present wonderful slides or flashy videos yet add little to illuminate what's going on under the hood, how the engine was put together component by component with example code which we as developers love and learn from best.
I thought I'd throw out this discussion into the wild to hopefully attract like-minded souls who may be starting out on this geo adventure, where we can share thoughts, solutions, workarounds and information to everyone else.
So without further ado.....
Hi Aron,
Regarding your test of ST_DISTANCE - taken from the spatial reference guide:
"The ST_Distance method computes the shortest distance between two geometries. For planar spatial reference systems, the distance is calculated as the Cartesian distance within the plane, computed in the linear units of measure for the associated spatial reference system. For round-Earth spatial reference systems, the distance is computed taking the curvature of the Earth's surface into account using the ellipsoid parameters in the spatial reference system definition."
This to me signifies that when I use ST_DISTANCE (linear measure) I already get back my result in metres, and performing a few tests this would seem correct.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Just to double check.....the above example is where I have specifically indicated the data as SRID 4326. Let's do some direct tests.
Let's take the distance from John o' Groats, Highland KW1 to Land's End.
With ST_DISTANCE
SELECT NEW ST_Point(58.641759,-3.068672).ST_Distance( NEW ST_Point(50.05935,-5.708054) ) FROM dummy
I get 8.979090690612793. This is however with direct provision of values where the SRID is not specified and hence I assume it is using the default of Off course we well know distance from John o'Groats to Land's End is more than 8 metres, else we'd all walk/cycle it no sweat!
Assuming this is decimal degrees, let's convert using the following formula:
distance [m] = 6378137.0 [Earth radius m] * Pi * distance [degree] / 180.0
thus
distance = 6378137.0 * 3.14159 * 8.979090690612793 / 180.0
thus
distance = 999546.959 metres or 999.54Km.
Just double checking my math.....Also need to figure out how to specify SRID 4326 else I assume the default SRID 0 is being used.
I'm not convince yet.
I don't think the SAP pdf is very clear on this.
For example:
select NEW ST_Point( 100, 100 ).ST_Distance(NEW ST_Point( 100.0005, 100.0005)) from dummy;
Results: 0.0007066726684570312
Are you suggesting the results are in Metres? It seems very low.
Using the following 2 websites I manually get a distance of 0.06270 km
Convert Latitude/Longitude to Decimal
What am I missing?
Only(!) 30Km out
Still trying to determine if we are working within the SQLConsole with hard-coded geographic
values how we specify the SRID (or indeed can we change the default SRID).
In my case I always want to work with 4326. In the Insert clause you can specifiy data represents 4326, as an example:
INSERT INTO "SCHEMA"."package" VALUES(1, new ST_POLYGON('POLYGON((6.892857208251808 46.47508312878526, 6.892847162786849 46.47506597958532, 6.892975863871317 46.47503129729259, 6.892980220749916 46.47504694818233, 6.892857208251808 46.47508312878526))',4326));
If working within the console for testing, I don't want default SRID 0 (which I understand is for geometric than geographic representation).
It's not easy to determine from the reference guide how you may specify the SRID with ST_POINT, ST_DISTANCE etc. Or indeed change the default. Looking at at.
With regard to SRID, consider the following:
SELECT NEW ST_Point(58.641759,-3.068672).ST_AsEWKT() FROM dummy
This returns:
SRID=0;POINT (58.641759000000000 -3.068672000000000)
Which tells me that by default all values and computations reference default SRID 0.
I thought we might have been able to do something along the lines of the following (according to other non-hana examples I have seen), doesn't like the additional argument for SRID
SELECT NEW ST_Point(58.641759,-3.068672,4326).ST_Distance( NEW ST_Point(50.05935,-5.708054,4326)) FROM dummy;
or something like below, according to format used for creating my polygon shapes via INSERT:
SELECT NEW ST_Point('POINT(58.641759,-3.068672)',4326).ST_Distance( NEW ST_Point('POINT(50.05935,-5.708054)',4326)) FROM dummy;
I also tried the follow which doesn't seem to produce any different results.
CREATE COLUMN TABLE SpatialLocations(
id integer, location ST_POINT(4326));
INSERT INTO SpatialLocations VALUES(1, new ST_POINT('POINT(100 100)'));
INSERT INTO SpatialLocations VALUES(2, new ST_POINT('POINT(100.0005 100.0005)'));
INSERT INTO SpatialLocations VALUES(3, new ST_POINT('POINT(100 100)',4326));
INSERT INTO SpatialLocations VALUES(4, new ST_POINT('POINT(100.0005 100.0005)',4326));
select A.location.ST_AsGeoJSON(), B.location.ST_AsGeoJSON() , A.location.ST_Distance(B.location) as "ST_Distance"
from SpatialLocations A , SpatialLocations B where A.id = 1 and B.id = 2;
select A.location.ST_AsGeoJSON(), B.location.ST_AsGeoJSON() , A.location.ST_Distance(B.location) as "ST_Distance"
from SpatialLocations A , SpatialLocations B where A.id = 3 and B.id = 4;
Adding 4326 to the insert didn't seem to make any difference also to the SRID assigned:
select location, location.ST_SRID() from SpatialLocations;
LOCATION LOCATION.ST_SRID()
010100000000000000000059400000000000005940 0
010100000079E926310800594079E9263108005940 0
010100000000000000000059400000000000005940 0
010100000079E926310800594079E9263108005940 0
NOTE: I tried with 1000004326 & 4326 will similar results. We must be missing a trick.
That's a good test Aron. I quote from page 7 of the spatial reference guide:
"By default, the database server adds the following spatial reference systems to a new database:
Default - SRID 0
This is the default spatial reference system used when constructing a geometry and the SRID is not specified in the SQL and is not present in the value being loaded."
You have specified the SRID when inserting the values - so the SRID should be "present in the value being loaded" when performing your SELECT.
Only questionable part is the "SRID is not specified in the SQL". I would not have thought this necessary as you have explicitly specified this with the values upon insertion as I have done.
The "trick" is - How do we specify the SRID in SQL?
With the examples I have posted earlier, trying to emulate other spatial SQL implementations, I have had no luck specifying SRID 4326 as a third argument.
Looking at page 53 - ST_SRID Method
"Returns the SRID of the geometry".
But on page 95, it states:
"Retrieves or modifies the spatial reference system associated with the geometry value.".
I have tried chaining ST_SRID(4326) to ST_POINT and ST_DISTANCE without success.
And if only there was a system configuration value we could change to set default from 0 to 4326. I had a quick look but couldn't find anything.
We'll get there......
Hi Aron, Hi Jon-Paul,
You came across a tough question; the SRS handling is not trivial.
There are two important meta data items for a spatial column: its spatial reference system and the used unit of measure. The spatial reference system is linked to the units of measure, if you look at them with:
>>
select * from "PUBLIC"."ST_SPATIAL_REFERENCE_SYSTEMS";
<<
You will see two units columns: LINEAR and ANGULAR Units of Measure (UoM), you can find them here:
>>
select * from "PUBLIC"."ST_UNITS_OF_MEASURE";
<<
So the ST_Distance function is using the LINEAR UoM, which is for 4326 meters.
In order to invoke it you must assure that both points used in the distance calculation are in the same SRS. In the next revision we added SRS deduction, so you don’t need to specify it in every parameter as we look it up from the column but this is not available in revision 70, thus it must be specified such as:
>>
create column table spatialtest
(
location st_point(4326)
);
insert into spatialtest values (new ST_Point(9.283844, 48.544462));
insert into spatialtest values (new ST_Point(9.285121, 48.544924));
insert into spatialtest values (new ST_Point(9.278437, 48.543582));
select
location.st_distance(ST_GeomFromEWKT('SRID=4326;POINT(9.283844 48.544462)'))
from
spatialtest ;
<<
This will give you the distances in meters.
If the SRS is not specified the default 0 is taken, which results in a euclidian distance calculation, Thus you get: 0,0007066726684570312 for:
>>
select NEW ST_Point(100, 100).ST_Distance(NEW ST_Point(100.0005,100.0005)) from dummy;
<<
Cheers
Gerrit
Thanks Gerrit.
Simliar to my comment further down SRID still does not appear to be saved on Insert for me using your example. After using your Inserts I see:
SELECT location.ST_AsEWKT() FROM spatialtest;
So because SRID is still stored as 0 then I get no valid results with your example:
select location.st_distance(ST_GeomFromEWKT('SRID=4326;POINT(9.283844 48.544462)') from spatialtest ;
Unless I specify SRID again with the following:
select new ST_POINT(location,4326).st_distance(ST_GeomFromEWKT('SRID=4326;POINT(9.283844 48.544462)'))from spatialtest ;
Any ideas?
Jon-Paul - Did all of Gerrits example work for you?
Thanks for trying it out. I'm glad it works for you. That's weird that we are getting different results.
Interestingly I have insufficient authorisation to view (select *) from ST_SPATIAL_REFERENCE_SYSTEMS, and it will take an eternity for me to get access, so I wonder if there is something missing in my assigned roles that might be causing the problem.
There's nothing mentioned in the PDF about any specific roles needed but.....
hi Jon-Paul,
thanks for this great post!
below is some syntax for a haversine db function for hana for testing. it's not 100% due to the inaccuracies of the calc, but it does get close with 966 and change km for the above example.
you can change the units of output as per the syntax below as well.
cheers,
jamie
-- units to use: e.g. use 3956 for output in miles
-- rPd:= 0.017453293; // rad per degree (PI/180 where PI = 3.1415926535)
-- rMi:= 3956; // radius in miles
-- rKm:= 6371; // radius in kilometers
-- rFt:= 20895592 (rMi * 5282); // radius in feet
-- rM:= 6371000 (rKm * 1000); // radius in meters
CREATE FUNCTION DISTBYUNIT(LAT1 DECIMAL(13,10), LON1 DECIMAL(13,10), LAT2 DECIMAL(13,10), LON2 DECIMAL(13,10), UNIT FLOAT)
RETURNS DIST DOUBLE
LANGUAGE SQLSCRIPT READS SQL DATA AS
BEGIN
DIST := :UNIT * 2 * ATAN(SQRT(SIN((LAT2-LAT1)*0.017453293/2) * SIN((LAT2-LAT1)*0.017453293/2) + COS(LAT1*0.017453293) * COS(LAT1*0.017453293) * SIN((LON2-LON1)*0.017453293/2) * SIN((LON2-LON1)*0.017453293/2)) / SQRT(1-SIN((LAT2-LAT1)*0.017453293/2) * SIN((LAT2-LAT1)*0.017453293/2) + COS(LAT1*0.017453293) * COS(LAT2*0.017453293) * SIN((LON2-LON1)*0.017453293/2) * SIN((LON2-LON1)*0.017453293/2)));
END;
-- function input example
DISTBYUNIT(58.641759, -3.068672, 50.05935, -5.708054, 6371) AS DIST_KM
--result
966.3523928691291
Hi Jamie,
Thanks for the example, Ive been using similar geo formula solutions in JavaScript and it's nice to see a variation in a defined function.
It's also why Im passionate about the spatial engine as a feature-rich toolkit doing all the heavy lifting for us, so we can focus on the application than being concerned about ATAN2 and SIN.
I have an existing table with LATITUDE and LONGITUDE columns and want to use them to create a new ST_POINT column. From there, I hope to take advantage of the new geospatial functions.
This sounded like a simple extension to Gerrit's example above but it turns out there's a little more to it.
As Gerrit says, to use latitudes and longitudes, you must specify the SRID of 4326 when you create the column:
alter table GHCND_STATIONS add( LOCATION st_point(4326) );
However, in my case, it was also necessary to specify the SRID during the update:
update GHCND_STATIONS
set LOCATION = new st_point( 'POINT(' || LONGITUDE || ' ' || LATITUDE ||')',4326)
Garrit's INSERT statements worked without the SRID but when I tried this in an UPDATE statement, the resulting ST_Point has an SRID of -1 and attempts to use them in distance calculations return NULL.
Performance was disappointing. The table in question contains 91,000 weather stations. The query below returns stations within 50 km of a specified lat/long and takes almost 11 seconds to run (on an AWS HANA):
SELECT ID, name, longitude, latitude, location.ST_AsEWKT() As EWKT,
location.st_distance(ST_GeomFromEWKT('SRID=4326;POINT(-94.1167 29.7000)'))/1000 as Dist_KM
FROM GHCND_STATIONS
where location.st_distance(ST_GeomFromEWKT('SRID=4326;POINT(-94.1167 29.7000)'))/1000 < 50
Almost all the time went into WHERE clause.
Hi Michael,
thank you for your example. I looked into it and yes the performance is surely not acceptable. The reason for this is that the query is not getting optimized correctly, in detail the predicate is not evaluated at an early stage which leads to a full materialization of all points as an intermediate result. I found the issue and we will bring it to the next revision.
There is a more "friendly" way to express the query by using the ST_WithinDistance predicate:
>>
WHERE location.ST_WithinDistance(ST_GeomFromEWKT('SRID=4326;POINT(-94.1167 29.7000)'),50000) = 1;
<<
However, it is suffering from the same problem currently. But in principle it is more optimizer friendly.
If you want to have an intermediate solution you can use ST_Intersects in combination with ST_Buffer which will execute much faster (on my machine with 100.000 points ca. 100ms).
>>
WHERE location.ST_Intersects(ST_GeomFromEWKT('SRID=4326;POINT(-94.1167 29.7000)').ST_Buffer(0.5)) = 1;
<<
Let me know if this works for you.
Cheers
Gerrit
Thanks for the information. I look forward to improved optimization.
The immediate solution you proposed is behaving as if ST_Buffer wasn't working. The test point happens to match exactly one record in the database and this is the only record returned. If I move the test point .0001 degrees, no results are returned.
My reading of the doc is that ST_Buffer(0.5) creates a half-meter buffer, so I tried some variants such as ST_Buffer(50000, 'meter'). I also tried putting the buffer on the location column. None of this made any difference.
Here is one example (the intent is to return all points within 50km of a reference point):
WHERE location.ST_Intersects(ST_GeomFromEWKT('SRID=4326;POINT(-94.1167 29.7000)').ST_Buffer(50000, 'meter')) = 1;
User | Count |
---|---|
66 | |
8 | |
8 | |
6 | |
6 | |
6 | |
6 | |
6 | |
5 | |
5 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.