Deadline for receiving credit for this article:
December 31, 2010
JHQ CE 209: Quartile Dashboards: Translating Large Data Sets into Performance Improvement Priorities
Diane Storer Brown, Carolyn E. Aydin, Nancy Donaldson
Keywords:
Benchmarking, Dashboard, Prioritization, Radar diagrams
November/December 2008
Quality professionals are the first to understand challenges of transforming data into meaningful information for frontline staff, operational managers, and governing bodies. To understand an individual facility, service, or patient care unit’s comparative performance from within large data sets, prioritization and focused data presentation are needed. This article presents a methodology for translating data from large data sets into dashboards for setting performance improvement priorities, in a simple way that takes advantage of tools readily available and easily used by support staff. This methodology is illustrated with examples from a large nursing quality data set, the California Nursing Outcomes Coalition.
Understanding Performance Data
Traditionally, large quality data sets have been
summarized using descriptive statistics such as frequencies, averages, and
standard deviations placed in tables, bar graphs, or line graphs to track key
metrics over time. Those operationally accountable for improving patient care
quality and safety depend on quality professionals to translate data into
usable information, which is then used to set performance thresholds for
drill-down analyses, as well as benchmarks and performance goals for
understanding comparative performance. This article uses the following common
definitions for performance metrics from the Merriam-Webster Online
Dictionary (2007): a goal is the end
toward which effort is directed (where you want your performance to be) and is
synonymous with target, a goal to be
achieved; threshold is a level, point, or value
above which something will take place, and below which it will not (the point
where performance has declined and you need to drill down further to understand
why); a benchmark is something that serves as a
standard by which others may be measured or judged (a best practice that you
strive to meet or exceed).
Those new to the field of healthcare quality must learn how to
translate data for benchmarking endeavors based on the data set under review. Raw
data reported out as frequencies (the count or number of
occurrences) has little use in performance monitoring, with the exception of
monitoring rare events. When monitoring patient safety indicators that occur
rarely, monitoring days between occurrences may be an important metric for
frontline staff watching zero-tolerance indicators such as falls with major
injury. The mean or average, calculated
as the sum of all observed values divided by the number of observations, is a
statistic likely reported in all numerical data sets. However, the mean is
known to be sensitive to extreme values or outliers, especially when sample
sizes are small (Dawson & Trapp, 2004). This means that one patient with an
extreme value can pull the mean for the data set and leave the wrong impression
about performance for all patients, which could lead to unnecessary improvement
efforts. The median or middle value may be a better
reference point for data sets when there are extreme values. The median
reflects the middle point of all observations—half the observations are larger
than the median, and half are smaller. The median is also more appropriate to
use for ordinal data—data where there is an
inherent order to the values, but the values themselves may not have meaning.
An example of ordinal data consists of the numeric response choices on a
satisfaction survey where 1 may represent dissatisfaction and 5 may represent
complete satisfaction. The average of these data (response choices of 1–5) may
be distorted or skewed by survey respondents selecting complete satisfaction
(5), and those interpreting the results may not clearly see the distribution of
the patients or staff responses.
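The effect of a single extreme value on the mean can be seen in a short sketch. The lengths of stay below are hypothetical values chosen for illustration, not data from the article.

```python
from statistics import mean, median

# Hypothetical lengths of stay in days; the last patient is an extreme outlier.
lengths_of_stay = [2, 3, 3, 4, 4, 5, 30]

average = mean(lengths_of_stay)   # pulled upward by the single 30-day stay
middle = median(lengths_of_stay)  # depends only on the middle observation

print(f"mean = {average:.2f} days, median = {middle} days")
```

Six of the seven patients stayed 5 days or less, yet the mean exceeds 7 days; the median stays at 4 days.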
Understanding how the data actually spread out is important for
determining performance goals and benchmarks from data sets. Traditionally, the
average may have been used as a goal. However, in today’s competitive
healthcare industry, striving to be average may not be the benchmark that
senior leaders wish to target. Quality professionals have the task of
interpreting the spread of the data to help establish useful benchmarks from
the data set so that leaders can establish realistic targets. Healthcare
quality data are often skewed data—data that
are not symmetrically distributed (bell-shaped or normally distributed) in such
a way that half the data are above the mean and half are below. In symmetrical
data, the mean and the median are numerically equal. This is
important information to confirm when using a mean for a target––when the mean
is pulled by extreme values, it may not be representative. The range
may be included in reports to show where the mean sits in the data set. The
range describes the data spread from the highest to the lowest numbers and is
calculated by subtracting the minimum value from the maximum (Dawson &
Trapp, 2004). The same information is available if data sets provide the
minimum and maximum values.
Most data sets report standard deviations when means are
reported. The standard deviation mathematically
describes how the data spread out around the mean by representing the average
distance of observations from the mean (Dawson & Trapp, 2004). You might
recall from statistics classes that if the observations are symmetrical or
normally distributed (in a bell-shaped curve), then about 68% fall within
1 standard deviation of the mean; about 95%, within 2 standard deviations;
and about 99.7%, within 3 standard deviations. By taking the mean and adding
or subtracting 1, 2, and 3 standard
deviation values from it, you will see the distribution of the data and will
better understand the usefulness of the mean to set performance metrics.
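As a minimal sketch of this check, the bands of 1, 2, and 3 standard deviations around the mean can be computed and compared with the actual observations. The fall rates are hypothetical, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev

# Hypothetical unit-level fall rates per 1,000 patient days.
fall_rates = [2.1, 2.8, 3.0, 3.2, 3.5, 3.6, 3.9, 4.1, 4.4, 5.0]

m = mean(fall_rates)
sd = stdev(fall_rates)  # sample standard deviation (n - 1 in the denominator)

# Lower and upper limits for mean +/- 1, 2, and 3 standard deviations.
bands = {k: (m - k * sd, m + k * sd) for k in (1, 2, 3)}

for k, (low, high) in bands.items():
    inside = sum(low <= x <= high for x in fall_rates)
    print(f"mean +/- {k} SD: {low:.2f} to {high:.2f} "
          f"({inside} of {len(fall_rates)} values)")
```

If the counts inside each band are far from what a bell-shaped curve would suggest, or a band extends into impossible values, the mean may be a poor basis for a performance metric.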
An example of setting performance metrics with service times
(minutes of waiting) follows. When measuring minutes of waiting, negative values
would not be possible (minutes below zero), and if the mean minus 1 standard
deviation produces negative numbers, consider whether there were patients with
extremely long wait times that pulled the group average up (resulting in a
large standard deviation). The average for this data set may not be useful for
performance metrics. Consider pulling the outliers out of the data set after
reviewing the individual data points. A scattergram is an easy way to see the
outliers. By looking at the actual data and pulling out extreme values (e.g.,
more than 3 standard deviations), the average for these data would be lower and
would better reflect actual patient experiences.
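The waiting-time example can be sketched as follows. The wait times are hypothetical, and the cutoff of 3 standard deviations is the one mentioned above as an example; any values flagged this way should still be reviewed individually before being excluded.

```python
from statistics import mean, stdev

# Hypothetical wait times in minutes; one patient waited far longer than the rest.
waits = [5, 8, 10, 10, 12, 12, 12, 15, 15, 15, 15,
         18, 18, 20, 20, 22, 25, 25, 28, 30, 300]

m = mean(waits)
sd = stdev(waits)

# A negative lower limit is impossible for minutes of waiting and signals
# that extreme values are inflating the mean and standard deviation.
lower_limit = m - sd

# Flag observations more than 3 standard deviations from the mean.
outliers = [w for w in waits if abs(w - m) > 3 * sd]
trimmed = [w for w in waits if abs(w - m) <= 3 * sd]
trimmed_mean = mean(trimmed)

print(f"mean = {m:.1f}, mean - 1 SD = {lower_limit:.1f}")
print(f"flagged for review: {outliers}, mean without them = {trimmed_mean:.2f}")
```

In very small data sets an extreme value inflates the standard deviation so much that it may not exceed the 3 standard deviation cutoff, which is one more reason to look at a scattergram of the individual points.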
As benchmarking data sets become more sophisticated, reporting
percentiles is emerging as another way to understand the spread of the data and
to provide more specificity for establishing performance metrics. Percentiles
are easier to explain to those who operationally use the data, and it is easier
to set benchmarks or targets with percentiles. A percentile
is the percentage of a distribution (responses or values) that are equal to or
below that number (Dawson & Trapp, 2004). Percentiles are commonly reported
in healthcare with growth charts for children and in academia with test scores.
For example, in a growth chart, if 60 pounds is the 90th percentile, that
number tells us that 90% of the children at that age weigh 60 pounds or less,
and 10% of the children weigh more. A child at this weight is therefore as
heavy as or heavier than 90% of children the same age.
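The definition above translates directly into a small calculation. This is an illustrative sketch; the function name `percentile_rank` and the weights are assumptions made for the example.

```python
def percentile_rank(values, score):
    """Percentage of values that are equal to or below the given score."""
    at_or_below = sum(1 for v in values if v <= score)
    return 100 * at_or_below / len(values)

# Hypothetical weights in pounds for ten children of the same age.
weights = [42, 45, 47, 48, 50, 52, 53, 55, 60, 66]

rank = percentile_rank(weights, 60)
print(f"60 pounds is at percentile {rank:.0f} in this group")
```

Nine of the ten children weigh 60 pounds or less, so 60 pounds sits at the 90th percentile of this small group.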
When percentiles are available, quartiles and interquartile
ranges describe how the data spread out and thus are extremely valuable for
establishing performance metrics. Quartiles divide the
data set into four quarters, with the 25th percentile as the first or lower
quartile; the 50th percentile as the median or middle, which separates the
second and third quartiles; and the 75th percentile as the upper quartile (Figure
1). The interquartile range
is the spread of data between the 25th and 75th percentiles (the middle values
that represent 50% of the data set). Quartiles demonstrate performance relative
to others in the data set and are used to set meaningful metrics. For example,
if service satisfaction scores are being compared, and your unit or hospital
falls in the lower quartile, this means that 75% of those compared have higher
satisfaction. A meaningful goal might be to reach the 50th percentile for
performance. Setting the 75th percentile or upper quartile as the goal may be a
stretch goal and difficult to achieve, creating frustration for those
accountable to implement improvements. The 50th percentile, or median, could be
a short-term goal; and the 75th percentile, a long-term goal. Another hospital
might already be in the upper quartile at the 85th percentile; quality
professionals at that hospital may wish to set the 75th percentile as the
threshold indicating that their performance has declined (or indicating that
the competition has gotten better).
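A brief sketch of how quartile cut points and the interquartile range might be computed and used to place a unit follows. The satisfaction scores and the unit's own score are hypothetical, and the "inclusive" method of Python's `statistics.quantiles` is one of several accepted conventions for computing quartiles, so values may differ slightly from those a given database reports.

```python
from statistics import quantiles

# Hypothetical satisfaction scores for nine comparable units (higher is better).
scores = [70, 74, 78, 80, 82, 85, 88, 91, 95]

# Three cut points: 25th percentile, median, 75th percentile.
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1  # spread of the middle 50% of the data

our_score = 76  # hypothetical score for the unit being reviewed
if our_score <= q1:
    position = "lower quartile"
elif our_score <= q2:
    position = "below the median"
elif our_score < q3:
    position = "above the median"
else:
    position = "upper quartile"

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
print(f"Our unit: {position}; short-term goal {q2}, long-term goal {q3}")
```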
Use of percentiles and quartiles for benchmarking expands the
toolbox for quality professionals for data display beyond traditional pie
charts, bar graphs, and trend or line graphs. Today, quality professionals can
use the following guidelines in deciding which measure of data spread may be
most appropriate for a given data set (Dawson & Trapp, 2004):
- Standard deviations are appropriate when the
mean is used and the data are symmetrical numerical data.
- Percentiles and the interquartile range are
appropriate when the median is used for ordinal data or the numerical data are
skewed.
- Interquartile ranges can be used to describe
the middle 50% of the data distribution regardless of its shape.
- Ranges are used with numerical data when the
purpose is to understand extreme values.
Where does the quality professional begin to translate data
sets into dashboards and set performance targets, thresholds, and benchmarks?
Armed with a basic understanding of the statistics described earlier, the
quality professional can use quartiles as a more sophisticated methodology to
establish evidence-based
performance metrics. Quartiles or percentiles can be selected as goals for
performance, as thresholds for drill-down analyses if performance is already at
the desired level, or as benchmarks for best practices from high performers.
Understanding Data Set Reports
Databases provide information to users in a variety of
formats. Selecting which format to use may be overwhelming for new quality
professionals. Keeping the purpose of the data review in mind will help make
the selection easier. Typical reports include summaries of multiple indicators
at a point in time, comparison of performance against outside benchmarks,
comparison of performance on an individual or multiple indicators with a
picture, and monitoring performance on individual indicators over time. To
illustrate reports that are commonly available, examples from the California
Nursing Outcomes Coalition (CalNOC) data set are described, with discussion on
how to use the reports to meet the reviewer’s intended purpose.
CalNOC, a regional nursing quality measurement database, is a
collaborative effort of the American Nurses Association–California (ANA/C) and
the Association of California Nurse Leaders to advance improvements in patient
care by sustaining a valid and reliable statewide outcomes database. Voluntary
membership is available to all acute care hospitals in the state of California,
as well as selected hospital groups in other states in the western region of
the United States. In 2007, more than 180 of California’s 366 acute care
hospitals participated in CalNOC, with additional hospitals from Nevada,
Arizona, Oregon, and Hawaii. Nurse-sensitive quality indicators are collected
at the patient care unit level and clustered into categories of variables
related to nurse staffing (hours of care, skill mix, use of contract staff,
staff turnover, and bed turnover); registered nurse (RN) education level,
certification, and years of experience; patient falls; pressure ulcer (PU)
prevalence; restraint prevalence; central line–associated bloodstream
infections; and medication administration accuracy. Hospitals access Web-based
customized reports generated directly from the data set to compare their own
performance with that of like hospitals. CalNOC hospitals develop their own
facility dashboards, combining reports from the Web site with those from other
data sources to display indicators on a single document (Donaldson et al.,
2005). The CalNOC project has been described in detail elsewhere (Aydin et al.,
2004; Brown et al., 2001).
Summary statistic reports
provide a quick reference for aggregated data at a given point in time (e.g.,
the current quarter) to populate dashboards or view indicators tracked over
time. These reports often provide columns of aggregated numeric data without
graphs, and they usually include averages and measures of data spread such as
standard deviations or minimum and maximum values and may provide quartiles.
CalNOC summary statistics reports provide member hospitals with aggregated
statistics for all CalNOC hospitals on all variables. Figure
2 shows an example of summary statistics for staffing and falls by
unit type and hospital average daily census.
Graph reports provide
a visual comparison of performance on select indicators at a point in time
(e.g., the current quarter). Graphs provide a visual representation of
comparative hospital performance, which may quickly provide performance
information. Graphs should not be used to summarize all
data, only those prioritized for performance monitoring. When reports include
pages and pages of graphs, the key messages and analyses from the data set are
lost on those reviewing the reports. Figure 3 shows a sample
comparison graph for falls per 1,000 patient days for all medical/surgical
units in hospitals with an average daily census under 100 patients. This graph
gives hospitals a visual representation of the variation among hospitals,
followed by a report that lists the actual performance for each hospital (not
included).
Trend reports provide
the ability to monitor prioritized indicators over time. These reports often
include graphs as well as a data table for monitoring. Trend charts help
hospitals follow their own performance over time: the slope of the line or
bars shows whether performance is improving, declining, or stable relative to
the hospital's own results in earlier months or quarters. Figure 4 provides an
example of a hospital trend
report for falls per 1,000 patient days for one hospital. Both the facility
average and CalNOC average for the selected time period are shown by lines
across the graph. The report includes the graph shown, followed by a table
listing the actual numeric fall rates for each month (not included).
Be careful when monitoring only trend reports.
Even if performance remains stable (i.e., flat slope), comparison to others is
still important to see whether the benchmark has moved. As the group prioritizes
improvement over time, the group average may improve and raise the bar. Even
if individual performance is stable, relative performance may decline—for
example, from the 90th percentile to the 80th percentile—simply because the
rest of the group in the data set improved. It would be a mistake to monitor
only individual performance over time.
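The point can be illustrated with a small sketch in which one hospital's score stays flat while the comparison group improves. All scores are hypothetical, higher scores are assumed to be better, and `percentile_rank` is an illustrative helper rather than a function from any reporting tool.

```python
def percentile_rank(values, score):
    """Percentage of values that are equal to or below the given score."""
    return 100 * sum(1 for v in values if v <= score) / len(values)

our_score = 88  # unchanged from one period to the next

# Hypothetical group scores (our hospital included) in two reporting periods.
earlier_period = [70, 72, 75, 78, 80, 82, 84, 85, 88, 92]
later_period = [74, 76, 79, 82, 84, 86, 87, 88, 90, 94]

rank_before = percentile_rank(earlier_period, our_score)
rank_after = percentile_rank(later_period, our_score)

print(f"score {our_score}: percentile {rank_before:.0f} -> {rank_after:.0f}")
```

A trend report for this hospital would show a flat line, while its standing in the group slips from the 90th to the 80th percentile.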
Monitoring trends over time for prioritized indicators is very
important in determining whether gains are held. When data are being viewed
over time, line graphs usually make trends easier to see than bar graphs. Figure
5 provides an example of the same data using vertical bar graphs and
line graphs. Although both graphs clearly demonstrate the spike in restraint
use in 2005, the trend of decrease over time is much clearer in the line graph.
Benchmarking reports
provide a succinct summary of performance, together with the performance of
like groups. These reports may be helpful to senior leaders such as the chief
officers or the board of directors when data are at the facility level, and
they may be helpful to individual unit managers when data are at the unit level
of analysis. These reports are usually numeric data in columns and provide
comparisons for the individual performance with other groups such as state or
national averages, or averages of other like facilities based on criteria from
the given database. Data may be similar to summary statistics with averages and
data spread information and may include percentiles or quartile information.
CalNOC’s facility-level benchmarking reports show summary data for the total
facility and by unit type (i.e., critical care, step-down, and medical/surgical
units). Figure 6 shows a facility-level benchmarking
report for prevalence studies. Unit-level data allow managers to compare their
performance within the facility as well as externally. Unit managers can
examine unit performance in detail, including both PU prevention process
variables and patient outcomes. These statistics track the actual number of
patients with ulcers in addition to the percent. Actual numbers may be
meaningful to frontline unit staff when tracking rare events by days between
occurrences. Also included are statistics useful for performance metrics such
as the facility mean by unit type, like hospital mean by unit type, and CalNOC
mean by unit type. Taken together, the statistics on this unit-level report
provide a valuable drill-down into both patient outcomes and the PU prevention
process.
Translating Data into Quartile Dashboards
A six-step process has been developed to guide quality
professionals through the translation process. Continuing with the CalNOC
example, and using the definition for dashboards presented earlier, prioritized
indicators representing structure, process, and outcomes were selected to
demonstrate a simple method to translate quartile information from summary
reports using readily available tools in software products such as Microsoft
Excel or PowerPoint.
Step 1: Prioritization
After reviewing all
the reports available to quality professionals in databases, the next challenge
is one of synthesizing the information to narrow the focus to indicators that
are important to monitor compared to benchmarks. Prioritization should come
from the key stakeholders who manage operations associated with the data set.
Indicators should be limited to the “vital few” and should represent structure,
process, and outcomes. The prioritized indicator list will need to be placed
into a spreadsheet to create the dashboard.
Step 2: Translating Performance into Quartiles
Performance on the
prioritized indicators will next need to be translated into quartiles. Gather
the reports that provide benchmark quartile values with facility performance.
For each indicator, identify the numeric value that defines the range of values
for each quartile in the data set. Next, identify the facility’s individual
performance and where that value falls within the identified quartile range
(this can be done concurrently or as individual steps). Transfer this
information into the spreadsheet. This abstraction from
summary reports can be completed by support staff after training on the
specific reports that will be used and the fundamentals of quartile metrics. Figure 7 shows
a very simple worksheet for capturing performance by indicating which quartile
the hospital fell into for each indicator. Percentile numbers (25, 50, 75) were
assigned in the last column of the worksheet, which will be used to generate
dashboard graphs.
As a practice example for translating quartile information, refer back to
Figure 2, Summary Statistics. Total hours per patient day in medical/surgical
units has the following quartiles: the lower quartile ends at 7.44 (1st to 25th
percentiles), the median value is 8.56 (50th percentile), and the upper
quartile begins at 9.75 (75th to 100th percentiles). Next, identify the
individual hospital’s performance on the same indicator. If the value is 7.44
or less, it is in the lower quartile; if it is 7.45 to 8.56 (the median value),
it is below the median but above the lower quartile; if it is 8.57 to 9.74, it
is above the median but below the upper quartile; and if it is 9.75 or more, it
is in the upper quartile.
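The lookup in this practice example can be written as a short function. The cut points 7.44, 8.56, and 9.75 come from the example above; the function name and the test values are hypothetical.

```python
def quartile_band(value, q1, median, q3):
    """Return 1-4 for the quartile band a value falls into.

    1 = lower quartile, 2 = below the median, 3 = above the median,
    4 = upper quartile, following the boundaries in the practice example.
    """
    if value <= q1:
        return 1
    if value <= median:
        return 2
    if value < q3:
        return 3
    return 4

# Total hours per patient day, medical/surgical units (cut points from the text).
Q1, MEDIAN, Q3 = 7.44, 8.56, 9.75

for hours in (7.0, 8.0, 9.0, 9.75):  # hypothetical hospital values
    print(f"{hours:.2f} hours -> band {quartile_band(hours, Q1, MEDIAN, Q3)}")
```

Support staff abstracting values from summary reports could apply the same comparison by hand or with a spreadsheet formula.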
Step 3: Creating the Dashboard
The next step in the translation process is to use the
quartile data to create a picture that will show performance priorities using
the data in the last column of the worksheet and a readily available software
application, Microsoft Excel or PowerPoint. Again, support staff will be able
to accomplish this translation once the indicators have been selected and the
worksheet has been set up.
Figure 8 shows a traditional way to look at
these data using horizontal bar graphs. The quartiles are demarcated
numerically by the percentiles that define them. A more powerful picture may be
available for quartiles using radar or spider diagrams. Figure
9 provides the same information, but the picture is more powerful
visually. Similar to the bar graph, the quartiles are demarcated numerically by
the percentiles that define them. The center of the diagram represents the
lower quartile, with each quartile moving away from the center progressively,
so that the upper quartile is the outer ring of the diagram, which resembles a
spider web. Performance is identified by coloring of the diagram—with more
color indicating performance reaching out from the center and lower quartile.
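Spreadsheet software draws radar diagrams directly from the worksheet column, so no programming is needed. For readers curious about the geometry, the sketch below shows how each indicator's score becomes a point on its own spoke. The indicator labels, scores, and function name are hypothetical, and the clockwise layout starting at 12 o'clock is an assumption.

```python
import math

def radar_vertices(scores, max_score=100):
    """Return one (x, y) point per indicator for a radar polygon.

    Spokes are evenly spaced, starting at 12 o'clock and moving clockwise;
    a larger score places the point farther from the center.
    """
    n = len(scores)
    points = []
    for i, score in enumerate(scores):
        angle = math.pi / 2 - 2 * math.pi * i / n
        radius = score / max_score
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

# Hypothetical worksheet: indicator label -> score from the last column.
worksheet = {
    "RN hours of care": 50,
    "Patients per RN": 75,
    "Restraint prevalence": 75,
    "Falls with injury": 25,
}

for label, (x, y) in zip(worksheet, radar_vertices(list(worksheet.values()))):
    print(f"{label:22s} x = {x:+.2f}, y = {y:+.2f}")
```

Connecting the points in order and shading the enclosed area produces the colored region described above.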
Step 4: Consolidation to a One-Page Dashboard
Cluster the graphs on a one-page document so that all
information is readily available at a glance. Two examples are provided in Figure
10 and Figure 11, showing the
horizontal bar graphs and the radar diagrams, respectively, using structure,
process, and outcome indicators from the worksheet. Because all the data are on
one page, the end user can quickly visualize comparative performance on prioritized
indicators.
Step 5: Supporting Documentation
Creation of an appendix or supporting document for the
dashboard is based on the end user’s need for additional information. A table
of indicator definitions may be included, which also could provide data sources
and time frames for the data set. When quartiles are used as benchmarks, it is
also helpful to identify the desired direction for performance. For example,
using the indicator data in these dashboards for PUs, process data related to
assessment for PU risk or prevention intervention performance in the upper
quartiles would be desirable, and outcome performance related to acquiring PUs
in the lower quartiles would be desirable. Arrows
indicating the desired direction can be placed on the dashboard as one helpful
tool, as shown in Figure 10. Another option, one requiring further explanation
to the users, is to rescale the dashboard so that low performance is always in
the lower quartile and desired performance is always in the upper quartile. For
the information on PUs, this would require transposing actual quartile
performance data for acquiring ulcers—in this case, being in the lower quartile
is good—and representing that as the upper quartile on the dashboard. The
dashboard must be clearly footnoted so that users understand that good
performance always plots high, even for indicators such as PU prevalence
where a low actual value is the goal.
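The rescaling option could be sketched as below, assuming quartile bands are numbered 1 (lower) to 4 (upper). The indicator labels and the set of lower-is-better indicators are hypothetical and would come from the stakeholders who prioritized the indicator list.

```python
# Hypothetical labels for indicators where a LOW actual value is desirable.
LOWER_IS_BETTER = {"HAPU prevalence", "Restraint prevalence"}

def rescale(indicator, quartile_band):
    """Flip the band (1-4) so that the upper quartile always means good performance."""
    if indicator in LOWER_IS_BETTER:
        return 5 - quartile_band  # 1 <-> 4 and 2 <-> 3
    return quartile_band

# Actual quartile bands for one hypothetical unit.
actual = {
    "PU risk assessment": 3,
    "HAPU prevalence": 1,
    "Restraint prevalence": 4,
}

displayed = {name: rescale(name, band) for name, band in actual.items()}
print(displayed)
```

Here the low actual HAPU prevalence (band 1) is displayed in the top band, which is why the footnote explaining the rescaling is essential.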
Step 6: Interpretation
The final step in the translation process involves
analysis or interpretation of comparative performance to others in the data
set. The key operational stakeholders who prioritized the indicator set must be
involved in this process. Key conclusions must be summarized for senior
leadership.
Continuing with the CalNOC example, the following
interpretation might be drawn by those with operational accountability. (Note
that this dashboard was not rescaled for
desired performance placement in the upper quartile.) Looking at the structure
data, one sees that this hospital has more licensed vocational nurse (LVN)
hours than the median and little LVN turnover (lower
quartile). Unlicensed support staff use is low (lower quartile). RN
hours of care are at the median, but the number of patients for each RN is high
(upper quartile). The number of patients occupying a bed on a given day (bed
turnover) is high (indicating many admissions, discharges, or transfers), which
would require a lot of RN time. RN turnover is also high (perhaps
the unit is too busy), and staffing relies heavily on contract or registry
staff (upper quartile). This unit likely would examine its staffing patterns
because the situation appears to be a difficult one for the RN workforce.
Next, looking at the process and outcome data within the
context of these structure data, one might make the following interpretation.
Restraint use is high (upper quartile), although use of sitters to prevent
restraint or falls is in the lower quartile. Patients at risk for PUs are not
getting prevention interventions (lower quartile), and the risk assessments for
PU development are only at the median. Risk assessments and determination of
appropriate interventions may not be getting accomplished, given the RN
patterns just identified. Although the percent of patients at risk for
hospital-acquired PUs (HAPUs) is low (lower quartile), this hospital is in the
upper quartile for HAPU development. This hospital will want to investigate
these outcomes further by drilling down into the data to better understand
performance. This hospital may be doing well with fall prevention, however:
falls with injury are in the lower quartile. Note that “all falls” are high
(upper quartile), which could be interpreted as good reporting or as a high
rate that needs further investigation. If this hospital has been working on a
culture of safety and responsible reporting, a high fall rate may indicate
success in this area (good reporting).
Based on this dashboard,
quality professionals at this hospital would likely prioritize performance
improvement around PU development and use of restraint; they may wish to set
performance targets of being below the 75th percentile as a short-term goal,
and below the 50th percentile or median as a long-term goal. Given that they
are doing well with injury falls, they may wish to set the median as a
threshold for further analyses should the hospital’s performance decline to
that level. They would also likely investigate further staffing patterns to
support the high volume of patients that are admitted, discharged, or
transferred into this unit daily. Given the high RN staff turnover, they may
also wish to conduct a survey or focus group to better understand the staff’s
perspective on the work environment. They may wish to set a performance target
to be below the median for total voluntary staff turnover.
Summary
This article provides tools for the quality
professional to translate data sets into dashboards and to set performance
targets, thresholds, and benchmarks. Armed with a basic understanding of the
statistics described, the quality professional can use quartiles as a more
sophisticated methodology for
benchmarking. Depending on how data are reported, quartiles or percentiles can
be selected as goals for performance, as thresholds for drill-down analyses if
performance is already at the desired level, or as the benchmarks for best
practices from high performers. Graphs can be used to create powerful visual
tools to quickly inform frontline staff, operational leaders, and governing
bodies on prioritized metrics.
Author's Biography
Diane Storer Brown, PhD RN FNAHQ, is the California Nursing Outcomes Coalition (CalNOC) coprincipal investigator and has been part of the CalNOC research team for more than 10 years. She is currently the clinical practice leader for hospital accreditation programs at Kaiser Permanente Northern California Region in Oakland, CA.
Carolyn E. Aydin, PhD, is a California Nursing Outcomes Coalition (CalNOC) coinvestigator and has been the CalNOC data manager for the past 10 years. She is currently a research scientist at Cedars-Sinai Health System Burns and Allen Research Institute in Los Angeles, CA.
Nancy Donaldson, DNSc RN FAAN, is the California Nursing Outcomes Coalition (CalNOC) coprincipal investigator and has also been part of the CalNOC research team for more than 10 years. She is the American Nurses Association–California (ANA/C) CalNOC project director and coprincipal investigator as well as the director for the Center for Research and Innovation in Patient Care at University of California–San Francisco Stanford Health Care through the University of California–San Francisco School of Nursing.
For more information on this article, contact Diane Storer Brown at Diane.Brown@kp.org.
References
Aydin, C. E., Burnes, B. L., Donaldson, N., Brown, D. S., Buffum, M., & Sandhu, M. (2004). Creating and analyzing a statewide nursing quality measurement database. Journal of Nursing Scholarship, 36(4), 371–378.
Brown, D. S., Donaldson, N., Aydin, C. E., & Carlson, N. (2001). Hospital nursing benchmarks: The California Nursing Outcome Coalition project experience. Journal for Healthcare Quality, 23(4), 22–27.
Dawson, B., & Trapp, R. G. (2004). Basic and clinical biostatistics. Lange Medical Books.
Donaldson, N., Brown, D. S., Aydin, C. E., Bolton, M. L., & Rutledge, D. N. (2005). Leveraging nurse-related dashboard benchmarks to expedite performance improvement and document excellence. Journal of Nursing Administration, 35(4), 163–172.
Gregg, A. C. (2002). Performance management data systems for nursing service organizations. Journal of Nursing Administration, 32(2), 71–78.
Lindenauer, P. K., Remus, D., & Roman, S. (2007). Public reporting and pay for performance in hospital quality improvement. New England Journal of Medicine, 356(5), 486–496.
Merriam Webster online dictionary. (2007). Retrieved October 15, 2007, from www.merriam-webster.com/dictionary.
Rosow, E., Adam, J., Coulombe, K., Race, K., & Anderson, R. (2003). Virtual instrumentation and real-time executive dashboards. Solutions for health care systems. Nursing Administration Quarterly, 27(1), 58–76.