on 2013 Jul 22 1:02 PM
Hi I have huge table which is coming from network in chunks. Rows are not ordered so can be duplicates. I need to keep track of duplicates and preserve all rows so that in future I can return rows as and when it came. Is there any framework or algorithms that help me achive this task? I would like to do versioning for all the rows so that I can go back in time and get rows for particular time date etc.
Please guide. How should I approach? I am new to Database. I need to implement this in Java. Thanks in advance.
Request clarification before answering.
Add an extra column to the table...
input_order BIGINT NOT NULL DEFAULT AUTOINCREMENT,
Load the input data into the other columns, but let this column default by leaving it out of the column list being inserted. You can do this (leave a column out) when using either the INSERT or LOAD TABLE statements.
Then, input_order will be assigned the values 1, 2, 3 according to the order the rows were inserted.
This column WILL be unique, so you can make it the primary key if you want.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
You say, you would like to select rows for a particular date afterwards. To simplify that, you could additionally add a column to track the date/time when the row was inserted automatically:
input_time datetime NOT NULL DEFAULT CURRENT TIMESTAMP,
The DEFAULT CURRENT TIMESTAMP (or CURRENT UTC TIMESTAMP, if you prefer a consistent time reference) will be filled automatically with the current date/time.
Just as with the column Breck has suggested, let out this coulmn in the INSERT/LOAD TABLE statements.
Hi @Breck thank a lot for the answer. How do I achieve this versioning using JDBC? I am uploading data into batches using addBatch() method.
e.g. If you have a table:
create table t1( c1 integer, c2 varchar(32), input_order integer not null default autoincrement, input_time datetime not null default current timestamp );
Then your PreparedStatement
JDBC code would look like:
PreparedStatement ps = conn.prepareStatement("INSERT INTO t1 (c1, c2) VALUES (?, ?)"; ps.setInt(1,1); ps.setString(2,"Hello"); ps.addBatch(); ps.executeBatch();
The key importance here is specifying the columns to be filled in the INSERT statement ((c1, c2)
), while letting the other columns use the default values.
User | Count |
---|---|
82 | |
29 | |
9 | |
8 | |
7 | |
7 | |
7 | |
6 | |
6 | |
5 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.