
Periodic Maintenance of a Large-Scale Data Warehouse

September 9, 2011

This section covers the maintenance tasks that need to be done periodically to keep a SQL Server data warehouse running smoothly. These include, but are not limited to, the following tasks:

  • Index maintenance
  • Statistics maintenance
  • Database consistency checks

In the largest databases, the maintenance tasks are often the most challenging part of operations; they can be more difficult than tuning the queries themselves. And because maintenance tasks can parallelize over many CPUs, they may require more CPUs than you would originally estimate for your user workload alone.

Index Maintenance

Indexes on tables that are frequently modified (via INSERT, UPDATE, and DELETE statements) need to be rebuilt occasionally. It is recommended that you rebuild your indexes before they reach 50 percent fragmentation. If your table is too big to rebuild its indexes regularly, consider partitioning the table and indexes so you can rebuild one index partition at a time. If that is not feasible, at least reorganize the indexes; reorganizing does not reclaim the unused space, but it does reorder the pages. Use the sys.dm_db_index_physical_stats dynamic management function to check the amount of fragmentation in a table or partition.
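
The following T-SQL is a minimal sketch of that check, assuming a hypothetical fact table dbo.FactSales with a hypothetical index IX_FactSales_DateKey; it reads fragmentation from sys.dm_db_index_physical_stats and shows the rebuild and reorganize alternatives:

    -- Report average fragmentation for each index on the (hypothetical) fact table.
    SELECT i.name AS index_name,
           ps.partition_number,
           ps.avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(
             DB_ID(), OBJECT_ID(N'dbo.FactSales'), NULL, NULL, 'LIMITED') AS ps
    JOIN sys.indexes AS i
        ON i.object_id = ps.object_id
       AND i.index_id = ps.index_id;

    -- Rebuild the whole index, or just one partition of a partitioned index.
    ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales REBUILD;
    ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales REBUILD PARTITION = 3;

    -- Reorganize instead: reorders pages but does not reclaim unused space.
    ALTER INDEX IX_FactSales_DateKey ON dbo.FactSales REORGANIZE;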

Index maintenance is a resource-intensive process if you have one or more indexes on a large fact table.

Statistics Maintenance

Statistics maintenance is critical for getting the best query plans in SQL Server 2008. For most databases, it is best to leave the defaults of Auto Create Statistics and Auto Update Statistics turned on. Even with these on, you still need to periodically update the statistics on your most frequently modified tables. Create jobs that run periodically, perhaps once a week.
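
A weekly SQL Server Agent job step for this can be as simple as the following sketch (the table name is a hypothetical placeholder):

    -- Refresh all statistics on a frequently modified (hypothetical) table.
    UPDATE STATISTICS dbo.FactSales;

    -- Or refresh the whole database; sp_updatestats skips statistics
    -- on tables that have had no modifications.
    EXEC sp_updatestats;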

Statistics are still maintained per table, not per partition; SQL Server does not yet have the ability to manage statistics by partition.

Use the sampling rate that gives you the best plans. Most databases will be fine with the default sampling rate; change it only if you start getting bad query plans and discover that a higher sampling rate produces better ones. In that case, experiment with higher rates until the plans improve: start with FULLSCAN and use it unless gathering statistics takes too much time, and if you do not have time for FULLSCAN, reduce the sampling rate from there.
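
The sampling rate is set in the WITH clause of UPDATE STATISTICS; a short sketch, again with a hypothetical table name:

    -- Start with a full scan for the most accurate statistics.
    UPDATE STATISTICS dbo.FactSales WITH FULLSCAN;

    -- If FULLSCAN takes too long, fall back to an explicit sampling rate.
    UPDATE STATISTICS dbo.FactSales WITH SAMPLE 25 PERCENT;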

Backup

A good backup strategy is essential for a well-managed data center. Use hardware snapshot backups whenever possible to get near-instantaneous backups. If your hardware does not have that feature, use the compressed backups in SQL Server 2008 or from any of the third-party vendors that supply compressed backup software. Compression saves you valuable disk space and is faster than a regular backup because less data is written.
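
A minimal sketch of a compressed backup (the database name and file path are hypothetical):

    -- Write a compressed backup; less data is written, so it usually runs faster.
    BACKUP DATABASE SalesDW
    TO DISK = N'G:\Backups\SalesDW.bak'
    WITH COMPRESSION, CHECKSUM, STATS = 10;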

If your database is very large, striping the backup across multiple files and tapes is the best way to increase speed. The number of files to use depends on the point at which your disk subsystem saturates.
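
Striping simply means listing several target files in one BACKUP statement; SQL Server writes to all of them in parallel. A sketch with hypothetical paths, ideally on separate physical drives:

    -- Stripe the backup across four files to increase throughput.
    BACKUP DATABASE SalesDW
    TO DISK = N'G:\Backups\SalesDW_1.bak',
       DISK = N'H:\Backups\SalesDW_2.bak',
       DISK = N'I:\Backups\SalesDW_3.bak',
       DISK = N'J:\Backups\SalesDW_4.bak'
    WITH COMPRESSION, STATS = 10;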

Reference: Microsoft SQL Server Books Online.
