Percent of True Items in a Pivot Table Field

Sometimes you may want to use a pivot table to summarize data whose values are either true or false. For example, whether congressional representatives have law degrees, whether cities have chlorinated water supplies, or whether students are taking classes at the honors level. I’m thinking of traits which exist not in opposition to another trait, e.g., blue eyes versus brown, but, in isolation, e.g., “has brown eyes.” In many cases, such as pulling from a database, these types of items are only filled if they’re true, otherwise they are left blank.

While struggling to summarize some data like this in a single column of a pivot table, I had the following realization:

The percentage of True items in a list is the average of zeros and ones, where True is represented by 1 and False by 0.

For example, assume you have a list of students in different classes, some of whom are taking the class at the honors level. This is indicated in an”Honors” column that’s either marked True or left blank. (Note that the following would work exactly the same if the blanks were instead marked False.)

The Clunky Way

To show the percent at the honors level, you could pivot on the Honors column as it is, but you’d have to show both the True and blank values, as in the pivot table below:

Pivot True and False

In this pivot table, the Values field is Students, “Summarize Values By” is set to “Count” and “Show Values As” is set to “% of Row Total”.

With this setup you’re stuck with using two pivot table columns. If you uncheck “(blank)” in the Honors dropdown, the pivot table reports 100% for every class, since it’s now filtered to only the True items.

The Un-Clunky Way

So instead let’s add a helper column to our data. I called it “Honors for Average” in the picture below. It just multiplies the adjacent Honors column cell by one, resulting in either 0 or 1.

helper column added

We can now pivot on the Honors for Average column. In the pivot table below, Class is in the Row area and the Value field is Honors for Average. “Show Values As” is set to the default of “No Calculation” and, most important, “Summarize Values By” is set to “Average.”

pivot-True only

Then, to finish it up, I changed the title to something more meaningful.

title changed

I think this is a much clearer and more concise way to represent this type of data.

Re-Apply Pivot Table Conditional Formatting

I often use conditional formatting in pivot tables, often to add banding to detail rows and highlights to total rows.  I like conditional formatting in XL 2010 for the most part, but sometimes it’s persnickety.  It seems to change its mind from day-to-day about what’s allowed.

One well-known problem is that if you apply conditional formatting to both your row fields and the data items, like this:
pivot table with intact conditional formatting

and then refresh it, the formatting is wiped from the data (values) area, as shown below:

There are a couple of ways to fix this.  One is to specifically apply the formats to the values area(s), a new feature as of Excel 2007.  Conditional formats added this way aren’t cleared by pivot table refreshes:

apply CF to data area

This works fairly well as long as your data area only includes one values field, but if you are pivoting on multiple values fields, you’ll have to add the rule for each one.  And you can’t specify row fields in this dialog, so you’ll have define the formats again for those areas.  And if you alter the formats you’ll have to do it all again.

For these reasons I’d rather just apply the conditional formatting to the row headings and the values area in one fell swoop.  But I don’t want to visit the condtional formatting dialog to re-expand the range each time a pivot table is refreshed.

So, I wrote the code below to expand the condtional formatting from the first row label cell into all the row label and data area cells:

Sub Extend_Pivot_CF_To_Data_Area()
Dim pvtTable As Excel.PivotTable
Dim rngTarget As Excel.Range
Dim rngSource As Excel.Range
Dim i As Long

'check for inapplicable situations
If ActiveSheet Is Nothing Then
    MsgBox ("No active worksheet.")
    Exit Sub
End If
On Error Resume Next
Set pvtTable = ActiveSheet.PivotTables(ActiveCell.PivotTable.Name)
If Err.Number = 1004 Then
    MsgBox "The cursor needs to be in a pivot table"
    Exit Sub
End If
On Error GoTo 0

With pvtTable
    'format conditions will be applied to row headers and values areas
    Set rngTarget = Intersect(.DataBodyRange.EntireRow, .TableRange1)
    'set the format condition's source to the first cell in the row area
    Set rngSource = rngTarget.Cells(1)
    With rngSource.FormatConditions
        For i = 1 To .Count
            'reset each format condition's range to row header and values areas
            .Item(i).ModifyAppliesToRange rngTarget
        Next i
    End With

    'display isn't always refreshed otherwise
    Application.ScreenUpdating = True
End With
End Sub

The key to this code is the ModifyAppliesToRange method of each FormatCondtion. This code identifies the first cell of the row label range and loops through each format condition in that cell and re-applies it to the range formed by the intersection of the row label range and the values range, i.e., the banded area in the first image above.

This method relies on all the conditional formatting you want to re-apply being in that first row labels cell. In cases where the conditional formatting might not apply to the leftmost row label, I’ve still applied it to that column, but modified the condition to check which column it’s in.

This function can be modified and called from a SheetPivotTableUpdate event, so when users or code updates a pivot table it re-applies automatically. I’ve also added this macro to the Pivot Table Context Menu and some days it gets used a lot.