Heaps (Tables without Clustered Indexes)

A heap is a table without a clustered index. One or more nonclustered indexes can be created on tables stored as a heap. Data is stored in the heap without specifying an order. Usually data is initially stored in the order in which is the rows are inserted into the table, but the Database Engine can move data around in the heap to store the rows efficiently; so the data order cannot be predicted. To guarantee the order of rows returned from a heap, you must use the ORDER BY clause. To specify a permanent logical order for storing the rows, create a clustered index on the table, so that the table is not a heap.

Note

There are sometimes good reasons to leave a table as a heap instead of creating a clustered index, but using heaps effectively is an advanced skill. Most tables should have a carefully chosen clustered index unless a good reason exists for leaving the table as a heap.

When to Use a Heap

When a table is stored as a heap, individual rows are identified by reference to an 8-byte row identifier (RID) consisting of the file number, data page number, and slot on the page (FileID:PageID:SlotID). The row ID is a small and efficient structure. Sometimes data professionals use heaps when data is always accessed through nonclustered indexes and the RID is smaller than a clustered index key. Heaps are also used for

If a table is a heap and does not have any nonclustered indexes, then the entire table must be read (a table scan) to find any row. SQL Server cannot seek a RID directly on the heap. This can be acceptable when the table is small.

When Not to Use a Heap

Do not use a heap when the data is frequently returned in a sorted order. A clustered index on the sorting column could avoid the sorting operation.

Do not use a heap when the data is frequently grouped together. Data must be sorted before it is grouped, and a clustered index on the sorting column could avoid the sorting operation.

Do not use a heap when ranges of data are frequently queried from the table. A clustered index on the range column will avoid sorting the entire heap.

Do not use a heap when there are no nonclustered indexes and the table is large, unless you intend to return the entire table content without any specified order. In a heap, all rows of the heap must be read to find any row.

Do not use a heap if the data is frequently updated. If you update a record and the update uses more space in the data pages than they are currently using, the record has to be moved to a data page that has enough free space. This creates a forwarded record pointing to the new location of the data, and forwarding pointer has to be written in the page that held that data previously, to indicate the new physical location. This introduces fragmentation in the heap. When scanning a heap, these pointers must be followed which limits read-ahead performance, and can incur additional I/O which reduces scan performance.

Managing Heaps

To create a heap, create a table without a clustered index. If a table already has a clustered index, drop the clustered index to return the table to a heap.

To remove a heap, create a clustered index on the heap.

To rebuild a heap to reclaim wasted space:

Create a clustered index on the heap, and then drop that clustered index.

Use the ALTER TABLE ... REBUILD command to rebuild the heap.

Warning

Creating or dropping clustered indexes requires rewriting the entire table. If the table has nonclustered indexes, all the nonclustered indexes must all be recreated whenever the clustered index is changed. Therefore, changing from a heap to a clustered index structure or back can take a lot of time and require disk space for reordering data in tempdb.