Pivot Multiple Worksheets

Pivot Multiple Worksheets

As I’ve mentioned before, these days I use Excel more and more for developing and testing SQL code. As part of that I often compare of sets of output from SQL. And as part of that I sometimes I find it useful to pivot multiple worksheets.

For example, I just finished a project of translating a query from one data warehouse to another. The new database has a completely different schema than the old – new tables, new fields, new behaviors. My goal was to develop a query that returned the same results from the new database as those from the old.

To compare the outputs, I created two tables (listobjects) in a single workbook. The first table had a connection to the old data warehouse and uses the old query as its Command Text. The second table is connected to the new data warehouse, and was where I’d test the SQL I was developing.

Especially at first, there were quite a few differences in the output of these two queries in these two tables. Comparing the outputs in a pivot table let me see these differences clearly, both in summary and in detail.

Here’s a very simple example using my trusty pie data. In this example I have two different tables on two sheets with slightly different pie orders. Here’s the output from data warehouse 1…

pie table 1

and here it is from data warehouse 88b…

pie table 2

I conveniently placed differences in the Quantity column near the top, so you may be able to just pick them out by eye. And you may even have caught the one date field discrepancy. However, after combining the two tables into one, adding a “Source” column and then pivoting, the differences become easy to pick out, especially with a little conditional formatting:

pie pivot comparison

In the pivot above, “2”s in the Grand Total column represent all the records where the two queries returned the same results. The “1”s point to the discrepancies.

This is a flexible and powerful comparison method. Benefits include:

  • You can quickly add or subtract fields from the pivot to pinpoint the differences.
  • You can change the orders of the fields.
  • If you add subtotals you can then double-click on those with disrepancies to drill down to just those results.

For a while I created these combined source tables manually, just pasting the two sets of results together, adding a column “Source” column with “DW_1” and “DW_88b.” This worked fairly well, but after several times it cried out for automation.

The VBA below keys off of selected sheets in a workbook. Just select the ones you want to pivot and then run the code. Here you can see that both sheets are selected, and I’ve added the “Pivot Multiple Sheets” macro to the tab’s right-click menu (with MenuRighter, of course).

ply menu

The code first collects all the data necessary for the connection and then closes the source workbook. (I did this to avoid memory leaks or whatever it is that makes things go wonky if the workbook is open at the same time I’m creating a connection to it.) It uses that data to create the Source and SQL strings. The SQL is just a series of SELECTS, one for each selected worksheet, connected with UNION ALLs.

The newly created worbook contains a table with the connection to the source workbook and a pivot table pointed at that table. The table’s “Source” field becomes a column in the pivot table, containing the names of the two or more worksheets. The rest of the table columns become pivot table row fields. The connection in this workbook is live, so that if you make changes to the source they will appear in this workbook once you refresh the data and pivot:

Sub Pivot_Multiple_Sheets()

Dim wbToPivot As Excel.Workbook
Dim SheetsToPivot As Excel.Sheets
Dim SourceFullName As String
Dim SourceString As String
Dim wbWithPivot As Excel.Workbook
Dim wsWithQueryTable As Excel.Worksheet
Dim SheetsToPivotCount As Long
Dim SheetsToPivotNames() As String
Dim qt As Excel.QueryTable
Dim i As Long
Dim SqlSelects() As String
Dim sql As String
Dim pvt As Excel.PivotTable
Dim pvtField As Excel.PivotField

If ActiveWorkbook Is Nothing Then
    MsgBox "No active workbook."
    Exit Sub
End If
Set wbToPivot = ActiveWorkbook

If Not wbToPivot.Saved Then
    MsgBox "Please save this workbook before running." & vbCrLf & _
                 "Workbook will be closed by this utility" & _
                 "after the process is completed."
    Exit Sub
End If

'This code acts on the Selected Sheets
Set SheetsToPivot = wbToPivot.Windows(1).SelectedSheets
If SheetsToPivot.Count = 1 Then
    MsgBox "Please select two or more worksheets (no charts)."
    Exit Sub
End If

SheetsToPivotCount = SheetsToPivot.Count

For i = 1 To SheetsToPivotCount
    If Not TypeName(SheetsToPivot(i)) = "Worksheet" Then
        MsgBox "Please select two or more worksheets (no charts)."
        Exit Sub
    End If
Next i

SourceFullName = wbToPivot.FullName
ReDim SheetsToPivotNames(1 To SheetsToPivotCount)
For i = 1 To SheetsToPivotCount
    SheetsToPivotNames(i) = SheetsToPivot(i).Name
Next i
'Change Selection to only one sheeet
SheetsToPivot(1).Select
'Close the source workbook before creating the new one and its connections
'Save it so not prompted
wbToPivot.Close True

Set wbWithPivot = Workbooks.Add
'Delete any extra worksheets
For i = wbWithPivot.Worksheets.Count To 2 Step -1
    Application.DisplayAlerts = False
    wbWithPivot.Worksheets(i).Delete
    Application.DisplayAlerts = True
Next i
Set wsWithQueryTable = wbWithPivot.Worksheets(1)
wsWithQueryTable.Name = "Data Table"
'Don't know why this is needed, but otherwise .CommandText line below fails
wsWithQueryTable.Activate

'I got rid of a lot of fields in connection - still seems to work
SourceString = "ODBC;DSN=Excel Files;DBQ=" & SourceFullName
'Create an array of SELECT statements
ReDim SqlSelects(1 To SheetsToPivotCount)
For i = 1 To SheetsToPivotCount
    SqlSelects(i) = "SELECT" & vbCrLf & _
                    "'" & SheetsToPivotNames(i) & "' as Source," & vbCrLf & _
                    "Sheet" & i & ".*" & vbCrLf & _
                    "FROM" & vbCrLf & _
                    "`" & SourceFullName & "`.[" & SheetsToPivotNames(i) & "$] AS Sheet" & i
Next i
'Connect the SELECTS with UNION ALL
For i = LBound(SqlSelects) To UBound(SqlSelects) - 1
    sql = sql & SqlSelects(i) & vbCrLf & "UNION ALL" & vbCrLf
Next i
sql = sql & SqlSelects(i)

Set qt = wsWithQueryTable.ListObjects.Add(SourceType:=0, Source:=SourceString, Destination:=wsWithQueryTable.Range("$A$1")).QueryTable
With qt
    .CommandText = sql
    .ListObject.DisplayName = "tbl" & Format(Now(), "yyyymmddhhmmss") & Right(Format(Timer, "#0.00"), 2)
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .RefreshPeriod = 0
    .PreserveColumnInfo = True
    'I like it to preserve the widths the first time it's run, and below turn it to false
    .AdjustColumnWidth = True
    .Refresh BackgroundQuery:=False
    .AdjustColumnWidth = False
End With
wbWithPivot.Worksheets.Add
With ActiveSheet
    .Name = "Pivot"
    Set pvt = .Parent.PivotCaches.Create(SourceType:=xlDatabase, SourceData:=qt.ListObject.Name).CreatePivotTable(TableDestination:=.Range("A1"))
    pvt.AddDataField Field:=pvt.PivotFields("Source"), Function:=xlCount
    With pvt.PivotFields("Source")
        .Orientation = xlColumnField
        .Position = 1
    End With
    For Each pvtField In pvt.PivotFields
        If pvtField.Name <> "Source" Then
            pvtField.Orientation = xlRowField
            pvtField.Position = pvt.RowFields.Count
        End If
    Next pvtField
End With
End Sub

To use this code put it in your Personal.xlsb or any workbook besides the one with the source data.

This code could use some more error-checking. For example, if the two sheets have a different number of columns. Even more important is the addition of whatever kind of general error handling you use so you exit gracefully from bad connection strings and other such inevitable problems.

Speaking of bad connection strings, you may notice that I’ve ditched the Default Directory, DriverId, BufferSize, MaxPageTimeOuts and whatnot from the connection. I did that to see if it worked. It did, so I never added them back. I see that they reappear in the connection properties for the table:

connection properties

I ran this code in Excel 2010 and 2013. I don’t know how portable this code is to other Excel versions. I also don’t know if you’ll have performance issues if you have the source and pivot workbooks open at the same time.

If you’re interested in this topic be sure to take a look at Kirill Lapin’s method, posted on Contextures. His method keeps the source and the pivot table in one workbook, deleting the connection in between refreshes of the pivot table. I think Kirill’s method is nice for more traditional pivot table use where you want to merge different data sets with the same format, e.g., eastern and western sales regions.

I like my method because it requires no setup for the source workbook, keeps a refreshable connection and arranges the pivot table for comparison.

I’d love to hear anybody’s opinion on the stability of this method, i.e., when opening both the source and the connected data at the same time. Also, I’m curious if this code works in other versions besides 2010 and 2013. These are areas where my knowledge is pretty piecemeal, so any help would be appreciated.

4 thoughts on “Pivot Multiple Worksheets

  1. Doug,
    I like your idea for comparing two data sets in one table – I struggle with this all the time, so I will try and adapt it to my data problem. Unfortunately, your code crashes for me every time at the workbook.close step, perhaps because I have the code in a separate workbook, but not an add-in.
    I think PowerQuery might be adaptable to achieving the same result, so I might give that a try too.

    • Hey AlexJ, sorry to hear that. I did a bunch of shuffling things around in order to close the source before I opened the target – not sure how necessary that was – and the code was erroring out on me at that line as well. I added the line to activate the new sheet and it fixed it for me.

      Yes, I’m pretty sure PowerQuery can do that. I think they’re testing facilities might be better than mine too 🙂 .

  2. Doug,
    So I stumbled on Matt Allington’s post at http://www.powerpivotpro.com/2015/07/consolidated-worksheets-with-power-query/#more-13366
    which automates appending sheet data in Power Query. I think this is an easier approach. I’m still learning PQ, but I’m considering a couple of modifications to his approach. First, I think I’d like a function that only does the data appending from the sheets, a second query could do transforms. Second, I want to add a utility to easily select sheets to use from outside PQ, and read those items into PQ as parameters.

    • AlexJ, that looks interesting, and something I should learn more about. As to being “easier” I guess that depends to some extent on what you already know!

      Keep us posted if you want (and feel free to send a sample).

Leave a Reply to Doug Glancy Cancel reply

Your email address will not be published. Required fields are marked *

To post code, do this: <code> your vba here </code>