Best approach to archive a table of 6 billion rows
12-05-2011, 07:19 AM
Post: #1
Best approach to archive a table of 6 billion rows


Hi,
We have a table (MyTable) of 534,315,157 rows, with a clustered index of 3,363,756 KB. This table is replicated to other databases (both a warm standby and Oracle databases). On average, 300K rows are inserted into this table per day. We now have to delete all rows older than 2.5 years; after calculation, the total number of rows to be deleted is 267,216,286.
During the production release we have only 3 hours (at most) to delete these rows after switching off replication. After analysis, we have the following options:
  • Extract all keys (the clustered index key) from the warm standby database; a nonclustered index is present on the date column there. Load all the keys into KEY_TABLE in the main database, then delete from MyTable in batches of 500/1000 rows, joining KEY_TABLE and MyTable (a sketch of this batched delete follows the list below).
  • Create a new table MyTable_New and insert the rows to be retained into MyTable_New. Truncate the old table MyTable, and finally rename the new table MyTable_New to MyTable.
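
A minimal sketch of the batched delete in option 1. The single-column clustered key (id) and the batch size of 1000 are assumptions for illustration only:

    set rowcount 1000                  -- cap each DELETE at 1000 rows

    declare @batch int
    select @batch = 1

    while @batch > 0
    begin
        begin tran
            delete MyTable
            from MyTable, KEY_TABLE
            where MyTable.id = KEY_TABLE.id

            select @batch = @@rowcount -- capture the DELETE's row count immediately
        commit tran
    end

    set rowcount 0                     -- restore the default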

Since the application is logging 300K rows a day, the number of rows to be inserted in option 2 (the retained rows) will be higher than the number of rows to be deleted.

What will be the best approach to delete 267,216,286 rows from MyTable, considering the time limit?
Please feel free to add any new approach. We are going to test both of these options and report the statistics.

Thanks in advance for your suggestions.



BRs,
Pradyut
04-13-2012, 02:30 AM
Post: #2
RE: Best approach to archive a table of 6 billion rows


Pradyut

(Sorry, I have been away.)

For both options, the configuration of the server, and the resource allocations for the database (or table) concerned, are going to have a large impact on the speed, and on whether the tasks complete in 3 hours or not. So I recommend that you spend some time gearing up the server, and ensuring optimal performance.

Depending on how you perceive it, this is Option 3, or a variation of Option 2. For such a large percentage of DELETEs, I would:
• Create MyTable_New correctly
• SELECT *
      INTO EXISTING TABLE MyTable_New
      FROM MyTable
      WHERE retain-condition

SELECT INTO is much faster, as it logs allocation units only. You will need the 'select into/bulkcopy/pllsort' db option set, which will break the transaction log recovery cycle, but I am assuming you have to DUMP DATABASE anyway after this task, and re-synch with the replicated dbs.

SELECT INTO now allows EXISTING TABLE. If you are on an older version which does not have it, remove the explicit CREATE TABLE, and create the target table the old-fashioned way, implicitly, within the SELECT INTO; make sure you provide a full and correct column list. A sketch of the whole sequence follows.
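
This is a sketch only, not a tested script: the database name (MyDatabase), the column list of MyTable_New, and the date column (tran_date) with its 30-month cutoff are all assumptions for illustration.

    use master
    go
    exec sp_dboption 'MyDatabase', 'select into/bulkcopy/pllsort', true
    go
    use MyDatabase
    go
    checkpoint                        -- activate the option change
    go

    -- 1. Create the target correctly (schema shown is hypothetical)
    create table MyTable_New (
        id        int          not null,
        tran_date datetime     not null,
        payload   varchar(255) null
    )
    go

    -- 2. Minimally logged copy of the rows to retain
    select *
      into existing table MyTable_New
      from MyTable
     where tran_date >= dateadd(mm, -30, getdate())
    go

    -- 3. Swap the tables
    truncate table MyTable
    go
    exec sp_rename 'MyTable', 'MyTable_Old'
    exec sp_rename 'MyTable_New', 'MyTable'
    go

Remember the DUMP DATABASE and replication re-synch afterwards, as above.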

----

Option One. DELETEs are the slowest operation. If you consider that option seriously, I would take this approach. Forget about the 3-hour window; treat it like a normal OLTP process, which will get replicated to the external dbs in the normal way, as per normal daytime constraints. Run the DELETE in a loop, with a batch size of 100 (you want really small transactions, which will get replicated without issue, and will avoid interfering with normal daytime usage). Set the priority to LOW (e.g. via an execution class), so that it will not interfere with the normal business usage during the day. Allow it to run for hours or days.

Within the loop, when the number of rows deleted is a multiple of 1000, PRINT a progress message with a timestamp to the output, which you can capture with -o in isql. That will allow you to watch the progress. You also need to SET FLUSHMESSAGE ON, so the messages are not buffered. A sketch of the loop follows.
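
A sketch of the loop, assuming the same hypothetical date column (tran_date) and 30-month cutoff as in the SELECT INTO sketch above:

    set flushmessage on                -- push PRINT output to isql immediately
    set rowcount 100                   -- really small transactions

    declare @batch int, @total int, @msg varchar(80)
    select @batch = 1, @total = 0

    while @batch > 0
    begin
        begin tran
            delete MyTable
            where tran_date < dateadd(mm, -30, getdate())

            select @batch = @@rowcount -- capture the DELETE's row count immediately
        commit tran

        select @total = @total + @batch
        if @total % 1000 = 0
        begin
            select @msg = convert(varchar, @total) + ' rows deleted at '
                        + convert(varchar(30), getdate(), 109)
            print @msg
        end
    end

    set rowcount 0                     -- restore the default

Run the session under a low-priority execution class (sp_bindexeclass) if you have the Logical Process Manager configured; that detail is installation-specific.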



Ashirvad to my Shishyas, Cheers to the others
Derek Asirvadem
Information Architect / Sr Sybase DBA
Website
Selection of Useful Documents for the Sybase DBA