I was asked for help on a problem that can be simplified to:
A SQL Server instance is exhibiting excessive disk space usage. Much of that space is consumed by one table and appears to be unused space within the table, specifically growth in the LOB_DATA allocation unit. In this table one column, named body, is stored as nvarchar(max). The number of rows in the table does not vary much over time.
To assist with diagnosing this I asked some questions starting with the simplest explanations I could think of.
Question 0: What is the edition and version of the SQL Server instance and what is the database compatibility level? We can use this information to look for known issues.
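A minimal way to gather all of that in one place, assuming ad hoc query access to the instance:

    -- Instance edition, product version, and servicing level
    SELECT SERVERPROPERTY('Edition')         AS Edition,
           SERVERPROPERTY('ProductVersion')  AS ProductVersion,
           SERVERPROPERTY('ProductLevel')    AS ProductLevel;

    -- Compatibility level of the current database
    SELECT name, compatibility_level
    FROM sys.databases
    WHERE name = DB_NAME();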
Question 1: What is the “large value types out of row” setting for the table? You should be able to query something like select large_value_types_out_of_row from sys.tables WHERE …
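For example, something like the following, where dbo.Messages stands in for the real table name (a value of 1 means large values are pushed out of row; 0 means they are kept in row when they fit):

    SELECT name, large_value_types_out_of_row
    FROM sys.tables
    WHERE name = N'Messages';  -- hypothetical table name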
My primary bet is that code is repeatedly writing to the table. That by itself may not be a huge problem, but combined with snapshot isolation being enabled, SQL Server may be keeping ghost records around for transactions that are still in progress, which would contribute to the growth.
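One way to check whether ghost and version-ghost records are piling up is sys.dm_db_index_physical_stats; the ghost counts are only populated in SAMPLED or DETAILED mode, and dbo.Messages is again a stand-in for the real table:

    SELECT index_id,
           alloc_unit_type_desc,         -- LOB_DATA is the unit of interest here
           page_count,
           ghost_record_count,
           version_ghost_record_count    -- ghosts retained for snapshot readers
    FROM sys.dm_db_index_physical_stats(
             DB_ID(), OBJECT_ID(N'dbo.Messages'), NULL, NULL, 'DETAILED');

Note that DETAILED mode reads the whole table, which can be expensive on a large one.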
Question 2: Do we know whether there are repeated writes to the table (i.e., instead of a single insert for each row that exists, there may be multiple updates, or even deletes and inserts replacing the rows)?
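Short of instrumenting the table, sys.dm_db_index_operational_stats keeps cumulative leaf-level counters since the last restart (or index rebuild), which can give a rough first answer; dbo.Messages is a placeholder:

    SELECT i.name AS index_name,
           os.leaf_insert_count,
           os.leaf_update_count,  -- far exceeding insert count suggests repeated updates
           os.leaf_delete_count
    FROM sys.dm_db_index_operational_stats(
             DB_ID(), OBJECT_ID(N'dbo.Messages'), NULL, NULL) AS os
    JOIN sys.indexes AS i
        ON i.object_id = os.object_id
       AND i.index_id  = os.index_id;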
Question 3: If we haven’t confirmed that there are repeated writes, have we proved that there are none? We might do this by adding a trigger to the table that records at least some minimal data about inserts, updates, and deletes, as sketched below.
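A minimal sketch of such a trigger, assuming the table is dbo.Messages with an integer key column id, and using a hypothetical audit table:

    -- Hypothetical audit table for write activity
    CREATE TABLE dbo.MessagesWriteAudit
    (
        audit_id     bigint IDENTITY(1,1) PRIMARY KEY,
        message_id   int       NOT NULL,
        write_action char(1)   NOT NULL,  -- 'I', 'U', or 'D'
        occurred     datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
    );
    GO

    CREATE TRIGGER dbo.trg_Messages_WriteAudit
    ON dbo.Messages
    AFTER INSERT, UPDATE, DELETE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Rows present in both inserted and deleted are updates;
        -- only in inserted are inserts; only in deleted are deletes.
        INSERT INTO dbo.MessagesWriteAudit (message_id, write_action)
        SELECT COALESCE(i.id, d.id),
               CASE WHEN i.id IS NOT NULL AND d.id IS NOT NULL THEN 'U'
                    WHEN i.id IS NOT NULL THEN 'I'
                    ELSE 'D'
               END
        FROM inserted AS i
        FULL OUTER JOIN deleted AS d ON d.id = i.id;
    END;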
Question 4: If records are repeatedly being updated, can that be prevented or reduced? One example I have seen in the past: whitespace was trimmed from columns in some code paths but not others, so the data being synced with the database looked different during the comparison step but became identical during the save step, producing an update on every sync.
If Questions 2, 3, or 4 indicate that the same values are being written repeatedly, an INSTEAD OF trigger that skips the UPDATE when all the fields match could be a temporary workaround to prevent the excessive file growth.
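A sketch of that workaround, simplified to just the body column (a real version would have to compare and copy every updatable column); dbo.Messages and id are again placeholders:

    CREATE TRIGGER dbo.trg_Messages_SkipNoOpUpdates
    ON dbo.Messages
    INSTEAD OF UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Re-issue the update only for rows where body actually changed.
        UPDATE m
        SET m.body = i.body
        FROM dbo.Messages AS m
        JOIN inserted AS i ON i.id = m.id
        JOIN deleted  AS d ON d.id = i.id
        WHERE i.body <> d.body
           OR (i.body IS NULL AND d.body IS NOT NULL)
           OR (i.body IS NOT NULL AND d.body IS NULL);
    END;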
Question 5: What percentage of rows have DATALENGTH(body) > 7850? A short-term workaround might be to add views and/or triggers that limit the length of body so it fits in row, changing the actual storage of body to nvarchar(4000) (or whatever fits) and truncating the rest. If there isn’t any (or much) data exceeding what an nvarchar(4000) column can store, then perhaps a temporary workaround is changing the column definition from nvarchar(max) to nvarchar(4000), and the application code may not even need to change at all for that schema change.
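DATALENGTH returns bytes, and nvarchar stores two bytes per character, so 7,850 bytes is roughly 3,925 characters, comfortably inside the 8,000-byte capacity of nvarchar(4000). A quick way to measure the distribution (table name hypothetical):

    SELECT COUNT(*) AS total_rows,
           SUM(CASE WHEN DATALENGTH(body) > 7850 THEN 1 ELSE 0 END) AS rows_over_7850_bytes,
           100.0 * SUM(CASE WHEN DATALENGTH(body) > 7850 THEN 1 ELSE 0 END)
                 / NULLIF(COUNT(*), 0)                              AS pct_over_7850_bytes,
           MAX(DATALENGTH(body)) AS max_body_bytes
    FROM dbo.Messages;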
Question 6: Has rebuilding the table by copying it to a new table been tried? Much less likely, but I have seen a problem before where, due to a bug in SQL Server somewhere around 2000-2005, a table held extra space that could not be reclaimed. We had to copy the data into a new table and delete the old one, after updating to a version of SQL Server that had the fix, to prevent it from happening again.
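If it comes to that, the copy-and-swap can be sketched as below; names are hypothetical, and this ignores indexes, constraints, permissions, triggers, and concurrent writes, all of which a real migration has to handle:

    -- Copy the rows into a freshly allocated table.
    SELECT *
    INTO dbo.Messages_rebuilt
    FROM dbo.Messages;

    -- Swap names once the copy is verified.
    EXEC sp_rename 'dbo.Messages', 'Messages_old';
    EXEC sp_rename 'dbo.Messages_rebuilt', 'Messages';

    -- After validation, drop the old copy to release the space.
    DROP TABLE dbo.Messages_old;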
Resolution
I believe the resolution for this problem may have been a reduction in the number of updates to the rows of the table, combined with a drastic reduction in the number of transactions that were rolled back after performing updates to the table.