Research Report

Properties of Peak Achievement’s Diagnostic Assessment of Mathematics Gaps

Vonda L. Kiplinger, Ph.D.

Former Director of Assessment at the Colorado Department of Education

Previous research shows that students who score “below proficient” on their state tests have substantial gaps in their mathematical foundations. The short diagnostic assessment (16 items) developed by Peak Achievement LLC (PA) to identify specific gaps in students’ mathematics foundational knowledge has proven to be a very effective tool for this task. Research studies of the interventions since 2006 in three different states indicate that students’ performance on PA’s math diagnostic is highly predictive of performance on their state tests (and on NWEA, as well). While our research also demonstrates that filling the math gaps identified by the diagnostic assessment using PA’s intervention program leads to dramatic improvement in student math scores, that issue is not discussed here. The purpose of this discussion is to provide information on the test characteristics of PA’s math diagnostic assessment.

In the most recent study, the 16-item assessment was administered to 127 sixth-grade students and 53 ninth-grade students in an urban charter school in Colorado in September 2009. Results from the Spring 2009 Colorado state test, the Colorado Student Assessment Program (CSAP), are available for 58 of the sixth-graders and 24 of the ninth-graders. Spring 2009 NWEA-MAP results are also available for 57 sixth-graders and 31 ninth-graders.

Correlations Between Tests. Student scores on the PA pre-intervention diagnostic test and the CSAP and NWEA-MAP assessments administered prior to the intervention are highly correlated. The correlation coefficients are provided in the table below.

Correlations Between the PA Diagnostic Test and Large-Scale Assessments

Grade 6

	PA	CSAP	NWEA
PA	1.00
CSAP	.82**	1.00
NWEA	.77**	.87**	1.00

Grade 9

	PA	CSAP	NWEA
PA	1.00
CSAP	.87**	1.00
NWEA	.84**	.87**	1.00

** Correlation significant at the 0.01 level (2-tailed)
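
As an illustration of the statistic reported in the table, the following is a minimal sketch of a Pearson correlation between diagnostic scores and large-scale assessment scores. The function name `pearson_r` and the five score pairs are hypothetical and are not taken from the study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(ss_x * ss_y)


# Hypothetical scores for five students (illustration only, not study data).
pa_scores = [3, 5, 8, 10, 13]              # number correct out of 16
csap_scores = [420, 455, 470, 510, 540]    # made-up scale scores
r = pearson_r(pa_scores, csap_scores)
print(f"r = {r:.2f}")
```

Only students with scores on both tests can enter such a calculation, which is why the correlations above rest on fewer students than the number who took the diagnostic.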

The high correlations between the PA diagnostic test results and the large-scale assessments are notable for several reasons:

Because the students designated for PA’s math intervention program are relatively low performers on the state mathematics assessment, we are dealing with a restriction of range problem. Restriction of range typically attenuates the correlation between two measures.
The correlations between the short diagnostic test and the much longer assessments are on the order of the year-to-year correlations of the large-scale tests with themselves and of the correlations between the two large-scale tests, CSAP and NWEA. (Correlations between NWEA and CSAP typically run between .86 and .92 in Colorado school districts. Year-to-year correlations of CSAP scores are typically .86 to .94.)
The correlations with the PA test are noteworthy because the tests were designed for completely different purposes and for different cognitive levels.
High correlations between the PA and the state test are indicative of a strong association between students’ math gaps and their performance on the state assessment.
Because the PA diagnostic test is effective at identifying math gaps, it stands to reason that filling those gaps will lead to improved performance on state tests and NWEA.
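
The restriction of range point in the list above can be illustrated with a small simulation. This is only a sketch under stated assumptions: scores are drawn from a bivariate normal distribution with an assumed population correlation of .85, and the "restricted" group is everyone below the median on the simulated state test. None of these values come from the study.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )


RHO = 0.85   # assumed population correlation (hypothetical)
N = 5000     # simulated students (hypothetical)
rng = random.Random(0)  # fixed seed so the run is repeatable

diagnostic, state_test = [], []
for _ in range(N):
    x = rng.gauss(0, 1)
    # Build y so that corr(x, y) equals RHO in the population.
    y = RHO * x + sqrt(1 - RHO ** 2) * rng.gauss(0, 1)
    diagnostic.append(x)
    state_test.append(y)

r_full = pearson_r(diagnostic, state_test)

# Keep only simulated students below the median on the state test,
# mimicking a sample of low performers.
cutoff = sorted(state_test)[N // 2]
low = [(x, y) for x, y in zip(diagnostic, state_test) if y < cutoff]
r_restricted = pearson_r([x for x, _ in low], [y for _, y in low])

print(f"full range: r = {r_full:.2f}; restricted range: r = {r_restricted:.2f}")
```

In a run like this the restricted-range correlation comes out lower than the full-range one, which is the sense in which the correlations observed among low performers are likely to understate the association in the full student population.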

Test properties and item characteristics for the PA diagnostic assessment, including item-test correlations, item difficulty and item discrimination, are also examined. The same 16-item test was administered to students in both the sixth and ninth grades; however, the results are analyzed separately.

Item-Test Correlations. The short diagnostic assessment was designed to identify a range of specific gaps in students’ math foundations with only a few questions each (sometimes only 1 item is sufficient to identify a specific gap). All correlations (except for the easiest item) are significant at the 0.01 level, and most are moderate to high.
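
One common way to compute an item-test correlation is to correlate each item's 0/1 score with the total test score. The sketch below assumes that definition, with the item left in the total (an uncorrected item-total correlation); the report does not say which variant was used. The function name and the response matrix are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; returns None if either variable has no variance."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        return None
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(ss_x * ss_y)


def item_test_correlations(responses):
    """responses: one row per student, one 0/1 entry per item.

    Returns one correlation per item between that item and the total score.
    """
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [
        pearson_r([row[j] for row in responses], totals)
        for j in range(n_items)
    ]


# Hypothetical responses: 4 students x 3 items (not study data).
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
correlations = item_test_correlations(responses)
print([None if c is None else round(c, 2) for c in correlations])
```

An item that nearly every student answers correctly has almost no variance, so its correlation with the total is weak or undefined, which is consistent with the easiest item being the exception noted above.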

Item Difficulty. Estimates of item difficulty are based on p-values, the proportion of examinees who answer a question correctly. P-values for the grade 6 students range from .07 for the most difficult item to .95 for the easiest item. P-values for the grade 9 students on the same 16-item test are slightly higher, ranging from .25 for the most difficult item to .98 for the easiest item[1]. The same item has the highest p-value for both grades and should be dropped or completely revised. While one would expect p-values to be higher for the ninth-graders, it must be emphasized that the difference between the two grades is rather small, indicating that the ninth-grade students demonstrate most of the same math gaps as the sixth-graders. Inspection of the items shows that the ninth- and sixth-graders are missing many of the same questions. It is likely that these students have been carrying very basic, critical gaps along with them since before sixth grade.
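
The p-value defined above is simply the proportion of examinees answering each item correctly. A minimal sketch follows; the function name and the response matrix are hypothetical and are not study data.

```python
def item_p_values(responses):
    """responses: one row per student, one 0/1 entry per item.

    Returns the proportion of students answering each item correctly.
    Higher values mean easier items.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(row[j] for row in responses) / n_students
        for j in range(n_items)
    ]


# Hypothetical responses: 4 students x 3 items (not study data).
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
p_values = item_p_values(responses)
print(p_values)
```

Computing the list separately for each grade and comparing item by item is one way to see whether the two groups are missing the same questions.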

Index of Discrimination. The Index of Discrimination is one indicator of how effectively an item discriminates between examinees of higher ability and those who demonstrate lower ability on a criterion of interest. (The criterion measure can be the total test performance or the score on an external criterion.) The goal is to identify items for which high-scoring examinees have a high probability of answering correctly and low-scoring examinees have a low probability of answering correctly. In this study, examinees are divided into two performance groups: those whose total scores on the criterion are below the median and those whose total scores are at or above the median. Item discrimination is examined relative to two criterion measures: (1) total score on the PA assessment and (2) total test score on the NWEA-MAP. For both grades six and nine, the majority of the items are highly discriminating between lower and higher ability students, as measured by the PA diagnostic assessment and the NWEA-MAP.
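
The median split described above can be sketched as follows. The sketch assumes the usual form of the index, the proportion correct in the upper group minus the proportion correct in the lower group; the report does not spell out the formula, so this is an assumption. The function name and scores are hypothetical.

```python
from statistics import median


def discrimination_index(item_scores, criterion_scores):
    """Proportion correct in the upper group minus that in the lower group.

    item_scores: 0/1 score on one item for each student.
    criterion_scores: each student's score on the criterion measure
    (for example the total diagnostic score or an external test score).
    Upper group: at or above the criterion median. Lower group: below it.
    """
    if len(item_scores) != len(criterion_scores):
        raise ValueError("item and criterion lists must have the same length")
    cut = median(criterion_scores)
    upper = [i for i, c in zip(item_scores, criterion_scores) if c >= cut]
    lower = [i for i, c in zip(item_scores, criterion_scores) if c < cut]
    if not lower:
        raise ValueError("no student scores below the median; index undefined")
    return sum(upper) / len(upper) - sum(lower) / len(lower)


# Hypothetical data for one item and four students (not study data).
item = [1, 1, 0, 0]
criterion = [3, 2, 1, 0]   # e.g. total score on the diagnostic
d = discrimination_index(item, criterion)
print(d)
```

Values near 1 indicate that the item is answered correctly mostly by the higher-scoring group, values near 0 indicate little discrimination, and negative values flag items that lower-scoring students answer correctly more often.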

In summary, the short diagnostic assessment (16 items) developed by Peak Achievement LLC is very effective in identifying specific gaps in students’ mathematics foundational knowledge. Results from the diagnostic assessment are highly correlated with large-scale assessments of mathematics, indicating a strong association between student math gaps and their performance on state assessments. Other test properties (item-test correlations, item difficulty and item discrimination) all indicate that this short diagnostic assessment, which is designed to identify a range of specific gaps in students’ math foundations with only a few questions each, effectively identifies these gaps and discriminates between students who demonstrate varying degrees of foundational gaps in their mathematics knowledge.