Wednesday, April 15, 2009

Custom Paging

Problem
I need to query a large amount of data to my application window and use paging to view it. The query itself takes a long time to process and I do not want to repeat it every time I have to fetch a page. Also, the number of rows in the result set could be huge, so I am often fetching a page from the end of the result set. I can't use the default paging because I wait a long time until I get the data back. What are my options?

Solution
There are few possible solutions out there for paging through a large result set. In this tip, I am going to focus on three examples and compare the performance implications. The examples are:

  • Example 1 - I use a temporary table (#temp_table) to store the result set for each session.
  • Example 2 - I use a Common Table Expression (CTE) to page through the result set.
  • Example 3 - I populate a global temporary table to store the complete result set.

The first two examples are similar to some of the most commonly used paging stored procedure options, the third example is my own extension which I wanted to show for comparison in this specific case of a complex query with a large large result set.


Example #1 - Using a session temporary table (#temp_table)

In this stored procedure, I create the temporary table and insert only the relevant rows into it based on the input parameters:

CREATE PROCEDURE dbo.proc_Paging_TempTable
(
@Page int,
@RecsPerPage int
)
AS

-- The number of rows affected by the different commands
-- does not interest the application, so turn NOCOUNT ON
SET NOCOUNT ON

-- Determine the first record and last record
DECLARE @FirstRec int, @LastRec int

SELECT @FirstRec = (@Page - 1) * @RecsPerPage
SELECT @LastRec = (@Page * @RecsPerPage + 1)

-- Create a temporary table
CREATE TABLE #TempItems
(RowNum int IDENTITY PRIMARY KEY,
Title nvarchar(100),
Publisher nvarchar(50),
AuthorNames nvarchar(200),
LanguageName nvarchar(20),
FirstLine nvarchar(150),
CreationDate smalldatetime,
PublishingDate smalldatetime,
Popularity int)

-- Insert the rows into the temp table
-- We query @LatRec + 1, to find out if there are more records
INSERT INTO #TempItems (Title, Publisher, AuthorNames, LanguageName,
FirstLine, CreationDate, PublishingDate, Popularity)
SELECT TOP (@LastRec-1)
s.Title, m.Publisher, s.AuthorNames, l.LanguageName,
m.FirstLine, m.CreationDate, m.PublishingDate, m.Popularity
FROM dbo.Articles m
INNER JOIN dbo.ArticlesContent s
ON s.ArticleID = m.ID
LEFT OUTER JOIN dbo.Languages l
ON l.ID = m.LanguageID
ORDER BY m.Popularity desc

-- Return the set of paged records
SELECT *
FROM #TempItems
WHERE RowNum > @FirstRec
AND RowNum < @LastRec

-- Drop the temp table
DROP TABLE #TempItems

-- Turn NOCOUNT back OFF
SET NOCOUNT OFF
GO


Example #2 - Using a Common Table Expression (CTE)

In this example, I use a CTE with the ROW_NUMBER() function to fetch only the relevant rows:

CREATE PROCEDURE dbo.proc_Paging_CTE
(
@Page int,
@RecsPerPage int
)
AS
-- The number of rows affected by the different commands
-- does not interest the application, so turn NOCOUNT ON

SET NOCOUNT ON


-- Determine the first record and last record

DECLARE @FirstRec int, @LastRec int

SELECT @FirstRec = (@Page - 1) * @RecsPerPage
SELECT @LastRec = (@Page * @RecsPerPage + 1);

WITH TempResult as
(
SELECT ROW_NUMBER() OVER(ORDER BY Popularity DESC) as RowNum,
s.Title, m.Publisher, s.AuthorNames, l.LanguageName,
m.FirstLine, m.CreationDate, m.PublishingDate, m.Popularity
FROM dbo.Articles m
INNER JOIN dbo.Content s
ON s.ArticleID = m.ID
LEFT OUTER JOIN dbo.Languages l
ON l.ID = m.LanguageID
)
SELECT top (@LastRec-1) *
FROM TempResult
WHERE RowNum > @FirstRec
AND RowNum < @LastRec



-- Turn NOCOUNT back OFF
SET NOCOUNT OFF
GO


Example #3 - Using a global temporary table to hold the whole result

In this example, I use a global temporary table to store the complete result set of the query. In this scenario, this temporary table will be populated during the first execution of the stored procedure. All subsequent executions of the stored procedure will use the same temporary table. The idea behind this approach is that, when using a Global temporary table, other sessions can also use the same table (if they are aware of the GUID and need the same data). In order to drop the temporary table, you will have to either drop it explicitly or disconnect the session.

If this approach does not work for you, you could use the same technique method to create "temporary" tables in your user defined database with a unique extension. One specific scenario when this technique could be useful is when the tempdb database is already being a bottleneck. If that is the case, with this approach you can always create a dedicated database for these tables. Just do not forget to drop the temporary objects when they are not required.

CREATE PROCEDURE dbo.proc_Paging_GlobalTempTable
(
@Page int,
@RecsPerPage int,
@GUID uniqueidentifier = null OUTPUT -- will output the extension of the table.
-- This parameter should be sent by the application:
-- First time it should be NULL and after, it should be
-- populated by the value that was sent back from the SP.
)
AS
-- The # of rows affected ny the different commands
-- does not interest the application, so turn NOCOUNT ON
SET NOCOUNT ON

-- Determine the first record and last record
DECLARE @FirstRec int, @LastRec int, @cmd varchar(2000)

SELECT @FirstRec = (@Page - 1) * @RecsPerPage
SELECT @LastRec = (@Page * @RecsPerPage + 1)

-- If the GUID is null (first execution) -
-- The global table is created, otherwise it will be queried only:
IF @GUID IS NULL
BEGIN

SET @GUID = NEWID()

SET @cmd = 'SELECT RowNum=IDENTITY(INT,1,1),
s.Title, m.Publisher, s.AuthorNames, l.friendlyName,
m.FirstLine, m.CreationDate, m.PublishingDate, m.Popularity
INTO [##tmp_' + CONVERT(VARCHAR(40),@GUID) + ']
FROM dbo.Abstracts m
INNER JOIN dbo.AbstractsContentSearch s ON s.AbstractID = m.ID
LEFT OUTER JOIN dbo.Languages l on l.ID = m.LanguageID
ORDER BY Popularity DESC;
CREATE UNIQUE INDEX [IDX_##tmp_' + CONVERT( VARCHAR(40),@GUID) + ']
ON [##tmp_' + CONVERT(VARCHAR(40),@GUID) + '] (RowNum)'
EXEC (@cmd)
END

-- Fetch the rows of the desired page
SET @cmd = 'SELECT top (' + CONVERT(VARCHAR(20),@LastRec-1) + ') *
FROM [##tmp_' + CONVERT(VARCHAR(40),@GUID) + ']
WHERE RowNum > ' + CONVERT(VARCHAR(20),@FirstRec) +
' AND RowNum < ' + CONVERT(VARCHAR(20),@LastRec)
EXEC (@cmd)

-- Turn NOCOUNT back OFF
SET NOCOUNT OFF
GO

No comments:

Post a Comment