EnjoY | Database Research And Development: MSSQL Server:Auto Create Statistics and Auto Update Statistics

Wednesday, September 16, 2020

MSSQL Server:Auto Create Statistics and Auto Update Statistics

This article is half-done without your Comment! *** Please share your thoughts via Comment ***

Statistics comprises lightweight objects that are used by SQL Server Query optimizer to determine the optimal way to retrieve data from the table. SQL Server optimizer uses the histogram of column statistics to choose the optimal query execution plan. If a query uses a predicate which already has statistics, the query optimizer can get all the required information from the statistics to determine the optimal way to execute the query. SQL Server creates statistics in two ways:

  1. When a new index is created on a column.
  2. If the AUTO_CREATE_STATISTICS option is enabled.

In this article, Auto Create Statistics and Auto Update Statistics options are analyzed. They are database specific and can be configured using SQL Server management studio and T-SQL Query. The following topics are covered in the paper:

  • A detailed explanation of Auto Create Statistics and Auto Update Statistics options.

  • Different approaches to enable Auto Create Statistics and Auto Update Statistics options.

  • A working example of Auto Create Statistics and Auto Update Statistics options.

  • Auto Create Statistics: Good or Bad.

Auto Create Statistics

When a user creates an index on a table, SQL Server optimizer by default creates statistics on the indexed column. If Auto Create Statistics option is enabled, SQL optimizer creates statistics on the nonindexed columns that are used in a query predicate. Optimizer creates statistics for individual columns that are used in a predicate. This statistics is created when a column does not have any statistics created on it previously. And it is used to generate an optimal query execution plan. The naming convention for the statistics created by Auto Create Statistics option is as follows: column_name + statistics_name in hexadecimal format. For example, _WA_SYS_00000003_01142BA1.

Auto Update Statistics

As the name suggests, when we enable Auto Update Statistics option, SQL Server automatically updates the statistics if it is outdated. Statistics is considered to be outdated if:

  1. Insert operation is performed on an empty table.
  2. A table had less than 500 rows when the statistics was generated, and the column modification counter was changed more than 500.
  3. A table had more than 500 rows when the statistics was generated, and the column modification counter is changed more than 500 + 20% of the number of rows in the table.

Different Approaches to Enable Auto Create Statistics and Auto Update Statistics

Now, we can enable Auto Update Statistics by the following methods.

Method 1: Using SQL Server Management Studio

To enable Auto Update Statistics, open SQL Server Management Studiothen in object explorer expand SQL Server instance and right-click the database which you want to enable Auto Update Statistics on. See the image below:

After that database Properties dialog window opens. In the dialog window, click Options. In the right pane, you will see the following options:

  • Auto Update Statistics
  • Auto Create Statistics.

See the image below:

To enable an option, click on drop down box, opposite to Auto Update Statistics or Auto Create Statistics option and select True. See the image below:

After you’ve selected TRUE, click OK to close the dialog box.

Method 2: Using T-SQL Script

We can also change Auto Create Statistics and Auto Update Statistics using T-SQL. To do that, open SQL Server management studio. Press Ctrl + N to open the new query editor window. In Query editor window run the query given below.

USE [master] 
go 

ALTER DATABASE [StackOverflow2013] 
SET auto_create_statistics ON 
go 

ALTER DATABASE [StackOverflow2013] 
SET auto_update_statistics ON WITH no_wait 
go

Once both options are enabled, run the following query to verify:

SELECT NAME, 
       CASE 
         WHEN is_auto_create_stats_on = 1 THEN 'Enabled.' 
         ELSE 'Disabled' 
       END AS 'Auto Creates statistics Status', 
       CASE 
         WHEN is_auto_update_stats_on = 1 THEN 'Enabled' 
         ELSE 'Disabled' 
       END AS 'Auto Update statistics Status' 
FROM   sys.databases 
WHERE  database_id > 4

The output is supposed to be as follows:

Demo Setup

In this demonstration, I am going to use StackOverFlow2013 database. It can be downloaded from here. Once you’ve extracted the files, please attach the database with SQL Server instance and create a new database named DemoDatabase. After that, create a table Users in DemoDatabase. To do that, run the following query:

USE [DEMODATABASE] 
GO 

CREATE TABLE [DBO].[USERS] 
  ( 
     [ID]             [INT] IDENTITY(1, 1) NOT NULL, 
     [ABOUTME]        [NVARCHAR](MAX) NULL, 
     [AGE]            [INT] NULL, 
     [CREATIONDATE]   [DATETIME] NOT NULL, 
     [DISPLAYNAME]    [NVARCHAR](40) NOT NULL, 
     [DOWNVOTES]      [INT] NOT NULL, 
     [EMAILHASH]      [NVARCHAR](40) NULL, 
     [LASTACCESSDATE] [DATETIME] NOT NULL, 
     [LOCATION]       [NVARCHAR](100) NULL, 
     [REPUTATION]     [INT] NOT NULL, 
     [UPVOTES]        [INT] NOT NULL, 
     [VIEWS]          [INT] NOT NULL, 
     [WEBSITEURL]     [NVARCHAR](200) NULL, 
     [ACCOUNTID]      [INT] NULL 
  ) 
GO

Once the table is created in DemoDatabase, we will import data from Users table which is in StackOverflow2013 database. To do that, execute the following code in StackOverflow2013 database:

INSERT INTO DEMODATABASE..USERS 
            (ID, 
             ABOUTME, 
             AGE, 
             CREATIONDATE, 
             DISPLAYNAME, 
             DOWNVOTES, 
             EMAILHASH, 
             LASTACCESSDATE, 
             LOCATION, 
             REPUTATION, 
             UPVOTES, 
             VIEWS, 
             WEBSITEURL, 
             ACCOUNTID) 
SELECT ID, 
       ABOUTME, 
       AGE, 
       CREATIONDATE, 
       DISPLAYNAME, 
       DOWNVOTES, 
       EMAILHASH, 
       LASTACCESSDATE, 
       LOCATION, 
       REPUTATION, 
       UPVOTES, 
       VIEWS, 
       WEBSITEURL, 
       ACCOUNTID 
FROM   USERS 
Output

(2465713 rows affected).

The query above will insert 2465713 rows in Users table created on DemoDatabase. Please note that we have not created any indexes or statistics on the Users table. To verify that, run the following query:

USE demodatabase 
go 
EXEC Sp_helpstats 
  'Users'

As neither indexes nor statistics have been created, the output will be as follows:

Now, AUTO_CREATE_STATISTICS option is enabled on Users table, meaning that SQL optimizer will create an index and a predicate on a column. Now run the following queries to check the behavior of a Query optimizer:

USE DEMODATABASE 
GO 

SELECT DISTINCT AGE 
FROM   USERS 
GO 

SELECT DISTINCT LOCATION 
FROM   USERS 
GO

As we have enabled AUTO_CREATE_STATISTICS on DemoDatabase, SQL server will automatically create statistics on Age and Location columns. To verify that, run the following query:

USE demodatabase 
go 
EXEC Sp_helpstats 
  'Users'

The output is supposed to be as follows:

As you can see in the image above, SQL Optimizer automatically creates statistics on Age and Location columns.

As I mentioned before, when Auto Update Statistics is enabled, SQL Server updates the statistics of the column if a table has more than 500 rows and if row modification counter has been changed more than 500 + 20% of the rows after the last statistics update.

Now, let’s consider the behavior of a query optimizer when Auto Update Statistics is enabled. Before we run our workload, first check the last statistics update time. To do that, Open SQL Server management studio >> Expand Database >> Expand Users table >> Right click Statistics. See the image below:

 

In Statistics Properties dialog window you can find the value for “Statistics for these columns was last updated.” In the picture given below, it is 5/29/2019 01:03:12 PM.

Alternatively, you can check the last time for statistics update by running the following query:

DBCC SHOW_STATISTICS(<Table Name>, StatisticsName) WITH STAT_HEADER

Now let’s run our workload to verify that statistics is being updated. Execute the following query:

INSERT INTO DEMODATABASE..USERS 
            (ID, 
             ABOUTME, 
             AGE, 
             CREATIONDATE, 
             DISPLAYNAME, 
             DOWNVOTES, 
             EMAILHASH, 
             LASTACCESSDATE, 
             LOCATION, 
             REPUTATION, 
             UPVOTES, 
             VIEWS, 
             WEBSITEURL, 
             ACCOUNTID) 
SELECT ID, 
       ABOUTME, 
       AGE, 
       CREATIONDATE, 
       DISPLAYNAME, 
       DOWNVOTES, 
       EMAILHASH, 
       LASTACCESSDATE, 
       LOCATION, 
       REPUTATION, 
       UPVOTES, 
       VIEWS, 
       WEBSITEURL, 
       ACCOUNTID 
FROM   USERS

Once the query is executed, check the statistics properties by running the following query:

DBCC SHOW_STATISTICS(‘USERS’, ‘_WA_SYS_00000003_01142BA1’) WITH STAT_HEADER 

The output is supposed to be as follows:

As you can see in the picture above, statistics has been updated after executing the workload on the Users table.

Auto Update Statistics: Good or Bad

The most important question is, “Should I enable Auto Update Statistics or not?” The answer is, “It depends on the workload and how the application is configured.”

As we know, outdated statistics causes many performance issues because SQL optimizer is not able to create an optimal path to retrieve data from a table; hence, statistics must be up to date. In that respect, Auto Update Statistics is always a good option, but the real problem is OLTP applications.

In the OLTP database, there are chances that Auto Update Statistics can reduce the performance because it is a resource-intensive operation, and it will impact other transactions. Similarly, if the application executes a bulk insert or update operation, update statistics can hurt the performance.

But in most cases, it is always advisable to enable auto update statistics.

Summary

In this article, the following topics are covered:

  1. A detailed explanation of Auto Create Statistics and Auto Update Statistics options.
  2. Different approaches to enable Auto create statistics and Auto Update Statistics options.
  3. A working example of Auto Create Statistics and Auto Update Statistics.
  4. Auto Create Statistics: Good or Bad.

No comments:

Post a Comment

It’s all about friendly conversation here at small review :) I’d love to be hear your thoughts!

Be sure to check back again because I do make every effort to reply to your comments here.

Featured Post

SQL Server : SELECT all columns to be good or bad in database system

This article is half-done without your Comment! *** Please share your thoughts via Comment *** In this post, I am going to write about one o...

Popular Posts