Contract No.: ED-04-CO-0112 (09)
An Evaluation of Secondary Math Teachers From Two Highly Selective Routes to Alternative Certification
Part B: Supporting Statement for Paperwork Reduction Act Submission
Submitted to:
Institute of Education Sciences IES/NCEE U.S. Department of Education 555 New Jersey Avenue, NW Washington, DC 20208
Project Officer:
|
Submitted by:
Mathematica Policy Research, Inc. P.O. Box 2393 Princeton, NJ 08543-2393 Telephone: (609) 799-3535 Facsimile: (609) 799-0005
Project Director: |
CONTENTS
PART B: SUPPORTING STATEMENT FOR PAPERWORK REDUCTION ACT SUBMISSION
B. COLLECTION OF INFORMATION EMPLOYING STATISTICAL METHODS
1. Respondent Universe and Sampling Methods
2. Statistical Methods for Sample Selection and Degree of Accuracy Needed
3. Methods to Maximize Response Rates and Deal with Nonresponse
4. Tests of Procedures and Methods to Be Undertaken
5. Individuals Consulted on Statistical Aspects of the Design
REFERENCES
APPENDICES
APPENDIX A: DISTRICT NOTIFICATION PACKAGE
APPENDIX B: DISTRICT TELEPHONE CALL PROTOCOL
APPENDIX C: DISTRICT VISIT PROTOCOL
APPENDIX D: PRINCIPAL NOTIFICATION LETTER
APPENDIX E: SCHOOL TELEPHONE CALL GUIDE
APPENDIX F: SCHOOL VISIT GUIDE
APPENDIX G: TEACHER BACKGROUND FORM
APPENDIX H: CONSENT FORM FOR PILOTING STUDENT ASSESSMENT
APPENDIX I: REQUEST FOR CLASSROOM ROSTER
EXHIBITS
1 Minimum Detectable Effect Sizes
This package requests clearance to recruit school districts and schools for a rigorous evaluation of secondary math teachers who have entered teaching through highly selective routes to alternative certification (HSAC). This evaluation is being conducted by the Institute of Education Sciences (IES), U.S. Department of Education (ED); it is being implemented by Mathematica Policy Research, Inc. (MPR) and its partners—Chesapeake Research Associates LLC and Branch Associates.
The objective of the evaluation is to estimate the impact on secondary student math achievement of teachers who obtain certification via HSAC routes compared with teachers who receive certification through traditional or less selective alternative certification routes. The evaluation design is a randomized experiment in which the researchers will randomly assign secondary school students to a treatment or control group. The treatment group will be taught by an HSAC teacher and the control group will be taught by a non-HSAC teacher. Both teachers must teach the same math class at the same level under the same general conditions. We will compare student math achievement between the treatment and control groups to estimate the impact of HSAC teachers.
The package will be submitted in two stages because the study schedule requires that district and school recruitment begin before all the data collection instruments are developed and tested. In this package, we are requesting approval for recruitment, a teacher background form, a spring 2009 pilot of a test of teacher math content knowledge, and the random assignment of students. This package also provides an overview of the study, including its design and data collection procedures.
An addendum to this package, submitted at a later date, will request clearance for the remaining data collection for the evaluation, including the consent forms. The addendum will also provide a detailed discussion of the data collection activities and copies of the instruments and consent forms.
The respondent universe for the study will consist of secondary school math teachers from two HSAC programs (Teach For America [TFA] and The New Teacher Project [TNTP]), non-HSAC teachers of the same courses in the same schools, and the students in these courses. The sample will be selected in four stages. MPR will (1) identify districts with TFA or TNTP secondary math teachers, (2) identify the schools within these districts that employ secondary math teachers from TFA or TNTP, (3) select at least one HSAC and one non-HSAC secondary math teacher who are teaching the same course in the same school, and (4) randomly assign students between the classrooms taught by HSAC and non-HSAC teachers and include all students in these classrooms in the research sample.
The study will include a total of 450 “classroom matches,” each match consisting of a math class taught by an HSAC teacher and one taught by a non-HSAC teacher in the 2009-2010 school year, for a total of at least 900 classrooms.¹ All classes in the match must be in the same subject (for example, Algebra I), be at the same level (for example, honors, remedial, or regular), and be taught under the same circumstances (for example, English language learners must be evenly distributed across the classrooms rather than clustered with one teacher or the other). Furthermore, it must be possible for researchers to assign students randomly between sections of the match with no disruption to or involvement in school scheduling procedures; this will typically be the case when all sections of the match are taught concurrently.² The same teacher may be in more than one classroom match if he or she teaches more than one eligible class during the school day. Assuming each teacher in the sample teaches an average of three study classes, we anticipate a total sample of 150 HSAC teachers and 150 non-HSAC teachers. We anticipate that this sample will require the participation of approximately 112 schools in 20 districts. Assuming 20 students per classroom, the study will include approximately 18,000 students.
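The sample-size arithmetic in the preceding paragraph can be checked with a short calculation. This is an illustrative sketch only: the variable names are hypothetical, and the inputs are the planning assumptions stated in the text (150 teachers per group, three study classes per teacher, two classrooms per match, 20 students per classroom).

```python
# Planning assumptions taken from the text; variable names are illustrative.
hsac_teachers = 150
non_hsac_teachers = 150
matches_per_teacher = 3       # average number of study classes per teacher
students_per_classroom = 20

# Each match pairs one HSAC class with one non-HSAC class, so the number of
# matches equals the number of HSAC classes in the study.
classroom_matches = hsac_teachers * matches_per_teacher
classrooms = 2 * classroom_matches
students = classrooms * students_per_classroom
teachers = hsac_teachers + non_hsac_teachers

print(classroom_matches, classrooms, students, teachers)
```

The calculation treats every match as having exactly two classrooms; matches with a third classroom would raise the classroom and student counts.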
A summary of our sampling plan and respective respondent universes is as follows:
Identify and Recruit Districts with TFA or TNTP Math Teachers. Each program will provide us with a list of districts that have hired secondary math teachers from their program. From this list, we will prioritize our recruiting efforts to focus first on the districts with the most HSAC math teachers and later on the districts with fewer.
Identify and Recruit Schools with HSAC Teachers. TFA and TNTP will provide the names of the schools that hired math teachers from their programs in the 2008‑2009 and prior school years. When we have obtained the requisite permission from the districts, we will contact these schools to determine whether they anticipate having at least one eligible classroom match in the 2009‑2010 school year; if so, we will attempt to recruit them for the study.
Identify HSAC and Non-HSAC Teachers in Eligible Classroom Matches. Each match will include at least one HSAC teacher and at least one non-HSAC teacher.
Randomly Assign Students to Classrooms. Students signed up for a course that is to be included in the study will be randomly assigned to a classroom taught by an HSAC teacher or a classroom taught by a non-HSAC teacher. All students with appropriate parental consent in these classrooms will be included in the research sample, with the exception of those who are unable to take an assessment or who enter the class after the first two weeks of school.
The respondent universe for the pilot study of the student assessment includes high school students taking General Math, Algebra I, Algebra II, or Geometry. We will purposively select high schools in a low-income district that is similar to the districts that are likely to participate in the evaluation. The students will be selected purposively, based on the willingness of the school, teachers, and parents to participate. In total, we expect to include 160 students, 40 for each of the four assessments.
Recruiting for the Evaluation. Ideally, we would randomly sample teachers from the entire universe of secondary school HSAC math teachers for the evaluation. Impact estimates would then generalize to all secondary school HSAC math teachers. However, random sampling is not possible because the evaluation will necessarily be limited to schools in which the experimental design is feasible—those with eligible classroom matches. Thus, we propose to draw a purposive sample designed to meet a specified statistical standard of precision. Although we will not be able to generalize to all HSAC math teachers, we will obtain valid estimates of the impacts of the set of HSAC math teachers who meet our eligibility requirements. In this section we describe in greater detail how the sample will be drawn. In addition, for each level of data collection we indicate how the data are to be gathered.
Selection of School Districts. Rosters from TFA and TNTP will indicate the districts that have hired teachers from these programs. Because we are collecting a purposive sample, we will not randomly sample these districts. Instead, we will prioritize our recruiting efforts to focus first on the districts with the most math teachers and later on the districts with fewer. We will begin district recruitment efforts with an introductory package mailing. This package will include the following three documents (Appendix A):
Notification Letter. The one-page notification letter on ED letterhead highlights the policy issue to be addressed by the study, introduces the study and study team, and notes that a member of the study team will be calling soon to arrange an in-person meeting to discuss the study in more detail.
Study Summary. The two-page study summary describes the purpose and research objectives of the study, identifies the organizations comprising the evaluation team, and provides contact information for the project director and project officer at ED. It also discusses the eligibility criteria for school participation and the activities required of participating districts and schools.
Letters of Support. We will include letters from top TFA and TNTP officials stating their strong support for the evaluation—their desire to have rigorous research findings on the relative effectiveness of individuals they help bring into teaching—and urging districts and schools to cooperate whenever possible.
Within two days of the delivery date of the FedEx mailing of the introductory package, we will call the superintendent’s office to determine the appropriate contact. We will call the contact person to briefly describe the study and offer to meet with key district officials (see Appendix B for the call protocol). We will attempt to arrange a meeting that includes all relevant district personnel, including representatives of the superintendent’s office, human resources office, and research approval office; the top official for secondary schools; the top official for math instruction; and officials who can discuss the availability of student records.
At the district-level meetings, we will explain many of the details of the study, such as random assignment of students, student testing and other data collection plans, student and parent consent procedures, and the work plan agreement (see Appendix C for the visit protocol). We will comply with all district research requirements. We will request that the district staff help coordinate contacts with individual schools.
Selection of Schools. Within the selected school districts, we will contact schools to invite them to participate in the study and to determine whether they anticipate having an eligible classroom match in the 2009-2010 school year. Once we have approval to contact schools, we will send principals an information packet that will include a notification letter, study summary, and letters of support from TFA and TNTP. The principal notification letter notes that the study team has discussed the proposed study with district officials and has received preliminary approval to investigate the school’s potential eligibility and interest in assisting with the study (Appendix D).
Calls will then be made to the school principals, in which we will describe the policy relevance of the study, the research objectives, and the study design; we will also administer the screening protocol to explore the likelihood that the school will have at least one eligible classroom match in the 2009‑2010 school year (see Appendix E for the school call guide). We will ask about the math courses taught in the school, the number of sections of these courses, the teachers of these courses and whether they are from an HSAC program, and school scheduling procedures.
When school conditions appear favorable to the study, we will offer to meet with the school principal, key individuals responsible for course scheduling, and, at the principal’s discretion, the teachers who might be included in the study. During this meeting, we will discuss the study in more detail and determine the school’s willingness to participate (Appendix F). Principals who express interest in the study will be requested to ask teachers in potentially eligible classroom matches to complete teacher background forms that we will use to verify eligibility (Appendix G). We will encourage the principal to sign a school work plan describing the responsibilities of the school.
Selection of Teachers. At least one classroom match will be selected from each school, with each match consisting of a class taught by an HSAC teacher and one taught by a non-HSAC teacher. In each school, we will include as many eligible classroom matches as possible in order to maximize the statistical power of the study.
Selection of Students. All students who are signed up for classes in a classroom match and who can take an assessment (if in high school) will be included in the study sample. These students will be randomly assigned to either a classroom taught by an HSAC teacher or a classroom taught by a non-HSAC teacher within the match.
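Random assignment within a single classroom match might look like the following sketch. The text does not specify the randomization mechanism, so the function name, the shuffle-then-deal approach, the equal group sizes, and the fixed seed are all assumptions made for illustration.

```python
import random

def assign_students(student_ids, classrooms, seed=None):
    """Randomly split students across the classrooms of one match, as evenly as possible."""
    rng = random.Random(seed)          # a seed makes the assignment reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    assignment = {room: [] for room in classrooms}
    # Deal the shuffled roster out to classrooms in turn.
    for position, student in enumerate(shuffled):
        room = classrooms[position % len(classrooms)]
        assignment[room].append(student)
    return assignment

# Hypothetical roster of 40 students signed up for one course.
roster = [f"student_{n:02d}" for n in range(1, 41)]
result = assign_students(roster, ["HSAC", "non-HSAC"], seed=2009)
print(len(result["HSAC"]), len(result["non-HSAC"]))
```

Passing three classroom labels instead of two would cover a match that includes an additional classroom.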
Pilot Study of Student Assessment. We will select a purposive sample of students for the pilot study of the student assessment. We will identify New Jersey high schools in low-income districts that offer at least General Math, Algebra I, Algebra II, or Geometry classes. We will send a letter to the school principals to describe the study and to ask if we can contact the teachers of the appropriate classrooms. After receiving approval from the principals, we will send the teachers a letter about the pilot study. The letter will describe the pilot study, the payments, and the student assessment. It will also indicate that the principal has provided permission for us to contact the teachers. We will follow up with telephone calls to discuss the pilot in more detail, highlight the benefits of participation, and attempt to schedule the pilot. We will ask the participating teachers to distribute a passive consent notification letter to the parents of the students in their math classes (Appendix H). Students whose parents do not notify MPR to exclude their children from participation in the study will be administered the assessment.
Pilot Study of Student Assessments. We will examine data obtained from the pilot of the student assessments to determine the number of questions students can complete within a class period, how the number of questions answered affects the precision of students’ scores, and whether an assessment may be too difficult or too easy (resulting in floor or ceiling effects).
Evaluation. To estimate the impact of HSAC teachers on secondary student math achievement for the full evaluation, we will treat each classroom match as a separate “mini-experiment.” For each classroom match, we will compare the average outcome math assessment score of students randomly assigned to the class taught by the HSAC teacher to the average score of those assigned to the non-HSAC teacher—the difference in average scores will provide an estimate of the HSAC teacher’s impact in that particular classroom match. We will then average the impact estimates across all classroom matches in the study to come up with an overall estimate of the HSAC teachers’ impact on secondary student math achievement.
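The "mini-experiment" logic described above can be sketched as follows. The function names and the test scores are hypothetical and serve only to show the arithmetic: one difference in means per classroom match, then an equally weighted average across matches.

```python
from statistics import mean

def match_impact(hsac_scores, non_hsac_scores):
    """Difference in mean scores within one classroom match."""
    return mean(hsac_scores) - mean(non_hsac_scores)

def overall_impact(matches):
    """Equally weighted average of the match-level impact estimates."""
    return mean(match_impact(hsac, non_hsac) for hsac, non_hsac in matches)

# Made-up scores for two matches: (HSAC class, non-HSAC class).
matches = [
    ([52.0, 48.0, 50.0], [47.0, 49.0, 48.0]),  # 50 - 48 = 2
    ([61.0, 59.0], [60.0, 62.0]),              # 60 - 61 = -1
]
print(overall_impact(matches))
```

This unadjusted version omits the baseline covariates that the regression model adds to improve precision.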
Primary Impact Analysis. Due to random assignment, the differences in mean outcomes in each classroom match will provide an unbiased estimate of the impact of HSAC teachers. However, the precision of the estimates can be improved by controlling for student-level baseline characteristics that may explain some of the differences in achievement, such as sex, race, free/reduced price lunch eligibility, special education status, whether the student is an English language learner, and prior math achievement. We will therefore estimate the following model of student math achievement for student i in classroom match j:
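A form of Equation 1 that is consistent with the definitions that follow is shown here as a reconstruction, not necessarily in the authors' exact notation. The coefficient vector α on the classroom match indicators is an assumed symbol, and the interaction of the treatment indicator with the match indicators is inferred from the description of δ as a vector of match-level impacts:

```latex
\begin{equation}
Y_{ij} = \alpha' P_j + \beta' X_{ij} + \delta' \left( T_{ij} P_j \right) + \varepsilon_{ij}
\end{equation}
```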
where Yij is the outcome math test score of student i in classroom match j, Pj is a vector of classroom match indicators, Xij is a vector of student-level baseline characteristics, Tij is an indicator for whether the student was in the HSAC teacher’s class in classroom match j, εij is a random-error term that represents the influence of unobserved factors on the outcome, and β and δ are vectors of parameters to be estimated. Because the randomization is done within classroom matches within schools, and schools may differ from each other in student composition, the model includes a vector of classroom match indicators, Pj, to control for differences in the average student characteristics between classroom matches and schools. If a sufficient number of classroom matches contain three teachers instead of two, the estimated standard errors will account for clustering of students within classrooms.
The vector δ represents the experiment-level impacts of the HSAC teachers in each classroom match that can then be aggregated to estimate the overall HSAC impact. The simplest and perhaps most intuitively appealing way to aggregate these impacts is to calculate an equally weighted average of the classroom match-level impacts. In this way, each classroom match will have an equal influence on the overall impact estimate. As a specification check, we will also explore alternative weighting schemes that have the potential to provide greater statistical efficiency and test the robustness of the findings, including giving greater weight to more precisely estimated classroom match-level impacts and weighting proportionally to the size of the sample in each classroom match.
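The three aggregation schemes mentioned above can be compared in a few lines. This is a sketch under stated assumptions: inverse-variance weights are used as one common way of "giving greater weight to more precisely estimated" impacts, and the impacts, standard errors, and sample sizes are invented for illustration.

```python
def weighted_average(impacts, weights):
    """Weighted mean of match-level impact estimates."""
    total = sum(weights)
    return sum(i * w for i, w in zip(impacts, weights)) / total

# Hypothetical match-level results.
impacts = [2.0, -1.0, 0.5]
std_errors = [1.0, 2.0, 0.5]
sample_sizes = [40, 60, 100]

equal = weighted_average(impacts, [1.0] * len(impacts))
# Inverse-variance weighting (an assumed choice of precision weight).
precision = weighted_average(impacts, [1.0 / se ** 2 for se in std_errors])
by_size = weighted_average(impacts, sample_sizes)

print(equal, precision, by_size)
```

Large gaps between the three results would indicate that the overall estimate is sensitive to the weighting choice, which is what the specification check is meant to reveal.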
Subgroup Analyses. In addition to estimating the overall impact of HSAC teachers on secondary student math achievement, we will conduct a limited number of subgroup analyses. Specifically, we will separately estimate the impact of TFA and TNTP teachers, middle and high school HSAC teachers, and novice and experienced HSAC teachers. To calculate subgroup impacts, the classroom match-level impact estimates will be aggregated for each relevant subgroup. For example, to calculate the subgroup impacts for high school and middle school teachers, the impact estimates from experiments in high schools will be aggregated separately from those from the experiments in middle schools. While we will test the statistical significance of the impact for each subgroup, we will not test the significance of differences between subgroups (for instance, between TFA and TNTP teachers), as the sample will not provide adequate statistical power for these comparisons.
Non-Experimental Analysis. If we find that HSAC teachers are more effective than non-HSAC teachers, policymakers will want to understand the reasons they are more effective. To shed light on this, we will investigate whether there are particular observable teacher characteristics that are correlated with the impacts. Because the effects of the teacher characteristics cannot be separated from the HSAC recruiting model experimentally, we will rely on non-experimental methods for this exploratory analysis.
For the non-experimental analysis, we will estimate variations of Equation 1 that introduce within-experiment differences in teacher characteristics:
where Cij represents a vector of observable characteristics of student i's teacher, γ is a vector of parameters to be estimated, and all other variables are defined as above. Since these models include classroom match-level fixed effects, the coefficients in vector γ represent the correlations between the within-match differences in teacher characteristics and the within-match differences in student outcomes. These exploratory analyses will be guided in large part by differences between HSAC and non-HSAC teachers that are observed through the teacher survey and that have been hypothesized to influence student achievement. For example, HSAC teachers are often perceived to be different from non-HSAC teachers in their subject knowledge, the selectivity of their undergraduate colleges, and their experience, all of which have been connected to student achievement in prior research (Clotfelter et al. 2007). Therefore, using data from the teacher survey and teacher math knowledge assessments (if the option is exercised), we will examine how the differences between the HSAC teachers and the non-HSAC teachers along these dimensions are correlated with student outcomes.
Non-Response and Crossovers. Although we will take steps to minimize the amount of missing data, some student non-response for this evaluation is inevitable. This non-response may lead to biased impact estimates if the non-response is correlated with math achievement and with whether the student was assigned to an HSAC teacher. To address this, we will use propensity score matching and create non-response weights that appropriately weight those for whom we have outcome math test scores, so that the weighted sample of students with nonmissing data is representative of the full sample. In addition, some students who are assigned to an HSAC teacher may cross over into a class with a non-HSAC teacher or vice versa. Including crossover students might bias the impact estimates by attributing the performance of the HSAC teacher to a non-HSAC teacher and vice versa. We can adjust the estimates for these crossovers using the students’ assignment status as an instrumental variable for having an HSAC teacher (Angrist et al. 1996).
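The crossover adjustment can be illustrated with the simplest instrumental-variables estimator, the Wald ratio. This is a minimal sketch, not the study's estimation code: it omits the classroom match indicators and baseline covariates, and the function name and data are hypothetical.

```python
from statistics import mean

def wald_estimate(assigned_hsac, taught_by_hsac, scores):
    """Effect of assignment on scores divided by effect of assignment on
    actually having an HSAC teacher (single binary instrument, no covariates)."""
    treat = [k for k, z in enumerate(assigned_hsac) if z == 1]
    ctrl = [k for k, z in enumerate(assigned_hsac) if z == 0]
    intent_to_treat = mean(scores[k] for k in treat) - mean(scores[k] for k in ctrl)
    take_up = mean(taught_by_hsac[k] for k in treat) - mean(taught_by_hsac[k] for k in ctrl)
    return intent_to_treat / take_up

# Made-up data: one assigned-HSAC student and one assigned-control student cross over.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
taught = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [54.0, 52.0, 56.0, 50.0, 50.0, 48.0, 52.0, 50.0]

estimate = wald_estimate(assigned, taught, scores)
print(estimate)
```

In this example the difference by assignment is 3 points and assignment shifts the share taught by an HSAC teacher by 0.5, so the adjusted estimate is twice the unadjusted one.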
The study is designed to achieve a minimum detectable effect (MDE) of 0.10 standard deviations in student math test scores. This target MDE is based on considerations of policy relevance and attainability, balanced against the costs of data collection. It is lower than MDEs from similar studies at the elementary school level because test score gains tend to be lower at the middle and high school levels. Estimates of average annual gains in effect sizes from nationally normed math tests across grade levels presented by Hill et al. (2007) indicate that a 0.10 standard deviation effect of HSAC teachers on test scores would be equivalent to roughly a third of a year of schooling for children in grades 6-10, a policy-relevant effect by most standards. Furthermore, previous research has estimated effects of HSAC teachers as high as 0.11 standard deviations (Boyd et al. 2006; Kane et al. 2006), suggesting that an HSAC impact of 0.10 might be attainable.
Exhibit 1 displays MDE sizes for the full sample and for subgroups of teachers. The MDEs are based on an assumed sample of 112 schools, one-third providing four teachers for the study and the rest providing two teachers, for a total of 300 teachers (150 HSAC and 150 non-HSAC teachers). We assume each teacher on average teaches in three separate classroom matches, for a total of 450 classroom matches or 900 classes. We further assume each class has an average of 20 students, for a total of 18,000 students.
For all calculations, we assume a 5 percent level of statistical significance and an 80 percent level of statistical power. Based on the previous experimental study of TFA (Decker et al. 2004), we assume a “crossover rate” (students switching from the treatment to the control classroom or vice versa) of 5 percent and a sample attrition rate of 10 percent. Also, consistent with the previous experimental TFA study, we assume a teacher-level intracluster correlation (ICC) of 0.15 to account for the correlation of outcomes among students of the same teacher, as well as a correlation of 0.50 between treatment and control group outcomes within a school. We assume that control variables in the impact model, in particular baseline test scores, explain 50 percent of the variance in the test score outcome measure (that is, R² = 0.50).
Exhibit 1
Minimum Detectable Effect Sizes
Subgroup Size | Minimum Detectable Effect | Sample Size (students/teachers)
100 percent (full sample) | 0.10 | 18,000/300
75 percent | 0.11 | 13,500/225
50 percent | 0.14 | 9,000/150
30 percent | 0.18 | 6,000/100
Note: The minimum detectable effects were calculated using the following formula:
where R² (= .50) is the regression R-squared value estimated from previous studies, T is the number of treatment (control) group teachers, N is the total number of students in the treatment (control) group classrooms (assuming 20 students per class), the intracluster correlation (= .15) is the between-classroom variance as a proportion of the total variance of the outcomes based on previous similar studies, and sample attrition is 10 percent.
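To show how the inputs listed in the note interact, the sketch below uses a generic textbook formula for the minimum detectable effect in a two-level design with equal-sized treatment and control groups. It is not the formula used for Exhibit 1: it ignores the crossover rate and the within-school correlation between treatment and control outcomes, and it assumes the same R-squared applies at the teacher and student levels, so its results should not be expected to match the exhibit exactly. The function name and the multiplier of 2.80 (the usual value for a two-sided 5 percent test with 80 percent power) are assumptions.

```python
from math import sqrt

def mde(teachers_per_arm, students_per_arm, icc=0.15, r_squared=0.50,
        attrition=0.10, multiplier=2.80):
    """Generic two-level minimum detectable effect, in standard deviation units."""
    retained = students_per_arm * (1.0 - attrition)   # students left after attrition
    # Between-teacher variance contribution (treatment plus control arm).
    teacher_term = 2.0 * icc * (1.0 - r_squared) / teachers_per_arm
    # Within-teacher (student-level) variance contribution.
    student_term = 2.0 * (1.0 - icc) * (1.0 - r_squared) / retained
    return multiplier * sqrt(teacher_term + student_term)

full_sample = mde(teachers_per_arm=150, students_per_arm=9000)
half_sample = mde(teachers_per_arm=75, students_per_arm=4500)
print(round(full_sample, 3), round(half_sample, 3))
```

The teacher-level term dominates the student-level term at these sample sizes, which is why the number of teachers drives precision far more than the number of students per class.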
We do not anticipate any unusual problems that require specialized sampling procedures.
The pilot study of the student assessment will be conducted only once, and teacher background forms for recruiting will be collected only once. The classroom rosters will be requested four times during the school year (Appendix I).
Recruiting for the Evaluation. We will rely on several strategies to attain the target participation of 450 classroom matches for the evaluation. The recruiters will be trained to present information, address concerns, and respond to questions clearly, quickly, and effectively. We will use ED letterhead for the notification letters, and recruiters will indicate that they are calling on behalf of ED when they speak to representatives of the districts and schools. Recruiters will also note that the study will be reviewed by both the Office of Management and Budget (OMB) and an independent institutional review board. We will leverage the support of TFA and TNTP through the inclusion of letters of support from the programs. The recruitment task leader will monitor recruiting issues daily so as to quickly resolve obstacles to participation.
To identify potentially eligible teachers for the evaluation, we will ask principals to request that teachers complete a teacher background form (Appendix G). If the teachers are present during our school visit, we will hand-deliver the form to them. Because the form is very short, teachers can complete it at that time if they so choose. We will make follow-up calls to nonresponding teachers to maximize response.
Pilot Study of Student Assessment. We will encourage participation in the pilot of the student assessment by highlighting the benefits of participation. We will send the school principals a letter that will describe the study and payments, emphasize the importance of the study, and request their participation. After a few days, we will call the principals to discuss the pilot and emphasize the opportunity for their teachers and students to gain experience with a computerized assessment. After we receive approval from the principals, we will send the selected teachers a letter about the pilot. The letter will discuss the pilot activities, payments, and the experience students will gain using the computerized assessment. We will also stress that we are willing to schedule the pilot administration at a time most convenient to them. A payment of $250 will be offered to each participating school, and a $5 gift will be offered to each student to encourage cooperation.
Participating teachers will be asked to distribute passive consent forms for the student assessment to their students. Teachers will ask the students to request that their parents or guardians review the consent letter (Appendix H). The consent letter will describe the importance of the pilot study and will emphasize that participation is voluntary. The form will include contact information for the person that the parent or guardian can call with any questions about the study. The form will emphasize that the data will be kept confidential, used only for the evaluation, and reported in aggregate form. Parents will be asked to call MPR only if they do not want their child to participate in the student math assessment.
The pilot of the student assessments will provide data that will allow us to determine the number of questions students can complete within a class period,³ how the number of questions answered affects the precision of students’ scores, and whether an assessment may result in floor or ceiling effects. The pilot will also help us address any logistical issues in administering the student assessment for the evaluation.
The teacher background form and the request for classroom rosters were modeled on forms used in a previous study, the Impact Evaluation of Teacher Preparation Models. Because those forms were used effectively in that study for similar purposes, they will not be pretested for this study.
The following individuals were consulted on the statistical aspects of the study:
Name | Title | Telephone Number
Melissa Clark | Senior Researcher, MPR | 609-750-3193
Philip Gleason | Senior Fellow, MPR | 315-781-8495
John Deke | Senior Researcher, MPR | 609-275-2230
The following individuals will be responsible for the data collection and analysis:
Name | Title | Telephone Number
Sheena McConnell | Associate Director of Research and Senior Fellow, MPR | 202-484-4518
Timothy Silva | Senior Researcher, MPR | 202-484-5267
Melissa Clark | Senior Researcher, MPR | 609-750-3193
Kathy Sonnenfeld | Survey Researcher, MPR | 609-275-2293
Eric Zeidman | Survey Researcher, MPR | 609-936-2784
Angrist, Joshua D., Guido W. Imbens, and Donald R. Rubin. “Identification of Causal Effects Using Instrumental Variables.” Journal of the American Statistical Association, vol. 91, 1996, pp. 444-472.
Boyd, Donald, Pamela Grossman, Hamilton Lankford, Susanna Loeb, and James Wyckoff. “How Changes in Entry Requirements Alter the Teacher Workforce and Affect Student Achievement.” Education Finance and Policy, vol. 1, no. 2, Spring 2006, pp. 178-216.
Clotfelter, Charles T., Helen F. Ladd, and Jacob Vigdor. “Teacher Credentials and Student Achievement in High School: A Cross-Subject Analysis with Student Fixed Effects.” National Bureau of Economic Research Working Paper no. 13617, November 2007.
Decker, Paul T., Daniel P. Mayer, and Steven Glazerman. “The Effect of Teach for America on Students: Findings from a National Evaluation.” Princeton, NJ: Mathematica Policy Research, Inc., 2004.
Hill, Carolyn J., Howard S. Bloom, Alison Rebeck Black, and Mark W. Lipsey. “Empirical Benchmarks for Interpreting Effect Sizes in Research.” MDRC Working Papers on Research Methodology, New York, NY: MDRC, 2007.
Kane, Thomas J., Jonah E. Rockoff, and Douglas O. Staiger. “What Does Certification Tell Us About Teacher Effectiveness? Evidence from New York City.” National Bureau of Economic Research Working Paper No. 12155, Cambridge, MA: National Bureau of Economic Research, April 2006.
¹ Where available, an additional class taught by either an HSAC teacher or a non-HSAC teacher may be included in the classroom match, for a total of no more than three classrooms per match.
² Alternatively, if the school assigns students to “teams” in which students take all courses together throughout the school day and the target course is taught by an HSAC teacher in one team and a non-HSAC teacher in the other team, students can be randomly assigned to teams regardless of whether the courses are taught concurrently, with no involvement in the school’s scheduling procedures.
³ The length of a class period can differ among schools, lasting anywhere from 45 minutes to as long as 90 minutes. Our goal is to administer the student assessment in less than 45 minutes.
File Type | application/msword |
File Title | MEMORANDUM |
Author | Nancy Duda |
Last Modified By | Dawn Patterson |
File Modified | 2009-02-09 |
File Created | 2009-02-09 |