on 2010 Sep 16 12:45 AM
It's common for the first line of a comma-separated or tab-separated text file to be the column names for the data that follows, so it's natural to want to take advantage of this in OPENSTRING or LOAD TABLE.
I'm sure that this would have already been considered by the SQLA development folks, so I'm wondering:
- Have you found a reason why this isn't a good idea or not feasible, so don't hold my breath?
OR
- Is this on a long list of "good ideas", but you've been adding more important stuff and you can't do everything all at once, so maybe someday down the road, no promises?
Thanks!
Loyal fan,
Dan K.
Request clarification before answering.
Maybe I am completely off the mark here, but assuming your server is on Windows, could you not create a remote data access server using the Microsoft Access Text Driver as the underlying ODBC driver. I believe the driver will then interpret the first line as the column names and make the appropriate guess as to what datatype the data is. You can then create a proxy table to the csv file and use select * into ... to create the base table with the proper column names, datatypes and data.
Try the following:
1) create a file called foo.csv with the following contents
col1,col2,col3,col4,col5
1,abc,3,4,5
6,def,8,9,10
2) create a dsn called mycsvdsn using the Mictosoft Access Text Driver and set the directory to point to the directory that foo.csv resides in.
3) connect to SA and create a remote server
CREATE SERVER rem CLASS 'odbc' USING 'mycsvdsn'
4) create the proxy table
CREATE EXISTING TABLE foo AT 'rem;;;foo.csv;
5) query the proxy table to make sure the column names and data look correct
SELECT * FROM foo
6) create and load the base table
SELECT * INTO mybasetab FROM foo
7) verify that mybasetab has the correct schema and data
HTH
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
OPENSTRING WITH and LOAD TABLE (...) serve multiple roles:
I'm suggesting that you want to sever the first role from them when input can self-identify its columns.
Case 1:
The input's first row says it has columns InA, InB, InC, InD, not necessarily in that order, and where you don't need InC. You might have
OPENSTRING WITH ( InA CHAR(30), InB CHAR(30), InD CHAR(30))
OPTION(COLUMNNAMES 1 SKIP 1)
and not care that there's an extra column or if the file actually has the columns in a different order.
Or, your destination table has columns DestP, DestQ, DestR, where InA -> DestQ, InB -> DestP, InD -> DestR. Making up a silly syntax just as an example, you might have
LOAD TABLE DestTbl (DestP COMEFROM InB,
DestQ COMEFROM InA,
DestR COMEFROM InD) COLUMNNAMES 1 SKIP 1
Case 2:
The input file's first row has the same column names as your destination table, but not necessarily in the same order, and it might have extra columns or missing columns. Then, you might want something like a WITH AUTO NAME option, e.g.
LOAD TABLE DestTbl (DestP, DestQ, DestR) COLUMNNAMES 1 SKIP 1
WITH AUTO NAME
which won't care about extra columns in the input and will set missing columns to their default.
A capability like this would make external unload/reload much more robust and flexible. The old I-SQL INPUT had some self-defining formats, and though those formats are obsolete, the concept was good. A column name header row is so common in CSVs and TSVs that (humbly, and with great respect) I'd rather not have to go through Microsoft Desktop Driver ODBC support to use them.
[All syntax examples are only to illustrate the point - not actual syntax proposals.]
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
64 | |
8 | |
7 | |
7 | |
6 | |
5 | |
5 | |
4 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.