Saturday, November 5, 2022

Meta-Analysis of Bilingual Education Research

Research on Effectiveness of Bilingual Education (in the U

2007

The Big Picture in Bilingual Education:

A Meta-analysis Corrected for Gersten’s Coding Error

Kellie Rolstad, Arizona State University

Kate Mahoney, State University of New York at Fredonia

Gene V Glass, Arizona State University

Language minority education is at a peculiar point in its history. Within the last few years, clarity and consensus regarding the effectiveness of bilingual instruction have emerged in the scientific literature, while the political environment has become more hostile than at any time since the passage of the Bilingual Education Act forty years ago.

In Rolstad, Mahoney and Glass (2005) (hereafter, RMG), we reviewed narrative summaries and meta-analyses examining the effectiveness of bilingual education. The meta-analyses and more recent narrative summaries favored the conclusion that bilingual education is an effective approach to raising academic achievement for English Language Learners (ELLs), a conclusion also consistent with the work by the National Literacy Panel (August & Shanahan, 2006) completed after RMG. A puzzling source of data for us was Gersten (1985), which was an outlier in our analysis. In the present paper, we note recent revelations in Rossell and Kuder (2005) that Gersten miscoded program descriptions in his study, and we produce a new meta-analysis corrected for the coding error (see Endnote 1).

Rolstad, Mahoney and Glass’s (2005) Meta-analysis

             RMG used a corpus of 17 studies which were conducted in the years following Willig’s (1985) meta-analysis.  Unlike previous studies, RMG provided comparisons not only for Transitional Bilingual Education (TBE) and English-only approaches, programs in which English acquisition is the primary goal, but also for Developmental Bilingual Education (DBE), programs that promote the development and maintenance of the first language as well as English.  Furthermore, RMG included as many studies as possible in the meta-analysis, without applying selection criteria bearing on study quality, as intended by the original developers of the method (Glass, 1976; Glass, McGaw, & Smith, 1981).

            As an additional methodological contribution, RMG coded program models according to the descriptions provided in the studies rather than the labels themselves, as many studies were found to use program labels adopted by schools but which did not fit conventional definitions. RMG coded programs whose descriptions were more aligned with the conventional definition of TBE as TBE, those more aligned with the conventional definition of DBE as DBE, and those more aligned with the conventional definition of an English-only program as EO. See Crawford (2004) for conventional definitions and discussion of program models.

            RMG showed that TBE was consistently superior to all-English approaches, and that DBE programs were superior to TBE programs. In an analysis controlling for ELL status, RMG found a positive effect for bilingual education of .23 standard deviations, with outcome measures in the native language showing a positive effect of .86 standard deviations. Note that in Table 1 (originally published in RMG) the average effect sizes for Gersten’s three studies contributed negatively to the meta-analysis. More specifically, by individual effect size, Gersten (1985) contributed 3 negative effect sizes; Gersten, Woodward, and Schneider (1992) contributed 10 negative and 2 positive effect sizes; and Gersten and Woodward (1995) contributed 11 negative effect sizes. For further details, please see RMG.
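For readers unfamiliar with the metric, the following is a minimal sketch of how a single effect size of this kind might be computed. It assumes a standardized mean difference scaled by the comparison group's standard deviation; the paper does not spell out its formula, although the note in Table 1 that one effect size used the treatment group's standard deviation suggests the comparison group's was the usual choice. The function name and all scores are hypothetical and invented for illustration only.

```python
from statistics import mean, stdev


def effect_size(treatment_scores, comparison_scores):
    """Standardized mean difference, scaled by the comparison group's sample SD.

    Assumption: this is one common definition of effect size; the source
    does not state the exact formula it used.
    """
    return (mean(treatment_scores) - mean(comparison_scores)) / stdev(comparison_scores)


# Hypothetical test scores, invented purely for illustration.
bilingual = [52, 55, 58, 61, 64]
english_only = [50, 53, 56, 59, 62]

# Positive values favor the first (treatment) group.
d = effect_size(bilingual, english_only)

# Swapping which group counts as "treatment" reverses the sign.
d_swapped = effect_size(english_only, bilingual)

print(round(d, 2), round(d_swapped, 2))
```

Because the sign depends entirely on which program is labeled as the treatment, a mislabeled program can turn a favorable result into an unfavorable one.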

 

[Insert Table 1 about here]

 

Gersten’s Coding Error

            Gersten (1985) had reported that a larger percentage of children enrolled in a structured immersion program (75%) scored at or above grade level on standardized tests than children in a bilingual program (19%) at the end of second grade. Gersten (1985) does not present a description of his comparison group apart from labeling it “the district’s bilingual program.” However, because Gersten has written extensively on bilingual education, consistently expressing a preference for direct instruction in Structured Immersion (SI) over bilingual methods, we included the 1985 study in our analysis along with two other Gersten contributions, even though it lacked an actual definition or description of the bilingual education program. In RMG, we coded Gersten’s SI program as an EO program and what Gersten called TBE was coded as TBE.

            However, Rossell and Kuder (2005) recently reported a personal communication with Gersten revealing that Gersten “now agrees that the district undoubtedly mislabeled their ESL program as a bilingual program” (footnote 7, page 18), and that the comparison was not between EO and TBE, as Gersten originally stated, but rather between SI and ESL Pullout.

            Gersten’s three articles contributed 26 individual effect sizes out of 67 (39% of the sample), which had a substantial influence on the mean effect size. We now have a better understanding of why one of the studies, Gersten (1985), differed so dramatically from the others in the meta-analysis -- rather than comparing TBE with SI, it compared two varieties of English-only programs, namely, ESL Pullout and SI.  Gersten’s (1985) description of the SI program in his study depends on reference to general characteristics of SI outlined in Baker and de Kanter (1983).

The key to a structured immersion is that all academic instruction takes place in English, but at a level understood by the students (Baker & de Kanter, 1983). At the same time, there are always bilingual instructors in the class who understand the children's native language and translate problematic words into the native language, answer questions phrased in the native language, help the children understand classroom routines, show them the bathrooms, lunchroom, and playground, and so forth (p. 189).

            Gersten (1985) appears to define SI as used in the study, then, as involving bilingual teachers who provide minimal help in the native language. While no details are provided regarding the ESL Pullout program, such programs generally do not provide native language support of any kind (Crawford, 2004). Therefore, following the coding convention established in RMG, we regard Gersten’s SI as more aligned with TBE, since it appears to have provided a modicum of native language support, and we take what Gersten has now revealed to have been ESL Pullout as a variety of EO.

A Recalculated Meta-analysis Corrected for Gersten’s Coding Error

            Recalculating the meta-analysis with the corrected coding for Gersten (1985), following these conventions, we see in Table 2 that the mean effect size for all outcome measures increases from .08 to .19, Reading (in English) increases from -.06 to .14, Math (in English) increases from .08 to .17, and all TBE studies increase from -.01 to .10. The revelation of a coding error for Gersten (1985), and the inconsistency of all three of Gersten’s studies with the rest of the work we reviewed, increase our confidence that the “investigator effect” noted in RMG may justify removing all three of the Gersten studies. As shown in Table 2, removing Gersten’s studies yields an effect size for TBE of 0.17, nearly as high as for DBE. Because numerous factors other than language proficiency are known to contribute to lower academic achievement among ELLs (August & Hakuta, 1996; August & Shanahan, 2006), we argued in RMG that the most informative result is the effect size reported for studies involving ELLs in both treatment and control groups; as shown in Table 2, the average effect size for TBE in these studies is 0.23, favoring bilingual approaches.
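The size of these shifts can be checked with a rough back-of-the-envelope sketch. It assumes that recoding simply reverses the sign of each Gersten (1985) effect size listed in Table 1 and that the group means are unweighted; both are assumptions, and the actual recalculation may have differed (for example, by using a different standard deviation as the scaling unit). The helper name `recoded_mean` is hypothetical.

```python
# Effect sizes for Gersten (1985) as listed in Table 1 (original coding).
GERSTEN_1985 = {"reading": -1.53, "mathematics": -0.70, "language": -1.44}


def recoded_mean(old_mean, n_effects, flipped):
    """Unweighted mean after reversing the sign of the listed effect sizes.

    Replacing x with -x changes the sum by -2x, so the mean moves by
    sum(-2x) / n. Assumption: recoding is a pure sign reversal.
    """
    return old_mean + sum(-2 * es for es in flipped) / n_effects


# "Before correction" means and counts are taken from Table 2.
all_outcomes = recoded_mean(0.08, 67, GERSTEN_1985.values())
reading_english = recoded_mean(-0.06, 16, [GERSTEN_1985["reading"]])
math_english = recoded_mean(0.08, 15, [GERSTEN_1985["mathematics"]])

for label, value in [
    ("All outcome measures", all_outcomes),
    ("Reading (in English)", reading_english),
    ("Math (in English)", math_english),
]:
    print(f"{label}: {value:.3f}")
```

Under these assumptions the sketch gives roughly .19 for all outcome measures and .17 for Math, in line with Table 2, and roughly .13 for Reading against the reported .14. The small gap for Reading may reflect rounding in the published means or a recalculation that was not a pure sign reversal.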

 

[Insert Table 2 about here]

 

Conclusions

            Meta-analysis is a useful tool for clarifying variation among studies reporting divergent findings. The original RMG analysis discovered curious effects associated with the Gersten studies, which behaved as outliers in the analysis. The coding error recently reported by Rossell and Kuder (2005) confirmed our suspicion, at least for Gersten (1985), that the results were incorrect. The new analysis reported in Table 2 strengthens the conclusions previously reached in RMG supporting TBE over English-only approaches, and DBE over TBE.

 

 


 

References

August, D. & Hakuta, K. (1998). Improving Schooling for Language-Minority Children:  A Research Agenda. Washington, DC:  National Academy Press.

August, D., & Shanahan, T. (Eds.) (2006). Developing literacy in second-language learners: Report of the National Literacy Panel on Language-Minority Children and Youth. Mahwah, NJ: Lawrence Erlbaum.

Baker, K., & de Kanter, A. A. (1981). Effectiveness of Bilingual Education: A Review of the Literature. Final Draft Report. Washington, DC: Department of Education Office of Planning, Budget, and Evaluation.

Burnham-Massey, M. (1990). Effects of bilingual instruction on English academic achievement of LEP students. Reading Improvement, 27, 129-32.

Carlisle, R. S. (1989). The writing of Anglo and Hispanic elementary school students in bilingual, submersion, and regular programs. SSLA, 11, 257-280.

Carter, T. P., & Chatfield, M. L. (1986). Effective bilingual schools: Implications for policy and practice. American Journal of Education, 95(1), 200-32.

Crawford, J. (2004). Educating English Learners: Language Diversity in the Classroom. Los Angeles, CA: Bilingual Educational Services, Inc.

de la Garza, J. & Medina, M. (1985). Academic achievement as influenced by bilingual instruction for Spanish-dominant Mexican American children. Hispanic Journal of Behavioral Sciences, 7(3), 247-59.

Gersten, R. (1985). Structured immersion for language minority students: Results of a longitudinal evaluation. Educational Evaluation and Policy Analysis, 7, 187-196.

Gersten, R., & Woodward, J. (1995). A longitudinal study of transitional and immersion bilingual education programs in one district. Elementary School Journal, 95(3), 223-239.

Gersten, R., Woodward, J., & Schneider, S. (1992). Bilingual immersion: A longitudinal evaluation of the El Paso program. Washington, DC: READ Institute. (ERIC Document Reproduction Service No. ED389162)

Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5, 3-8.

Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research. Beverly Hills, Calif.: SAGE Publications.

Krashen, S. (2005). A correction -- 20 years later. Stephen Krashen’s Mailing List. Retrieved December 3, 2008 from http://sdkrashen.com/pipermail/krashen_sdkrashen.com/2005-November/000332.html.

Lindholm, K.J. (1991). Theoretical assumptions and empirical evidence for academic achievement in two languages. Hispanic Journal of Behavioral Sciences, 13(1), 3-17.

Medina, M., Jr., & Escamilla, K. (1992). Evaluation of transitional and maintenance bilingual programs. Urban Education, 27(3), 263-90.

Medina, M., Jr., Saldate, M., IV, & Mishra, S. (1985). The sustaining effects of bilingual instruction: A follow-up study. Journal of Instructional Psychology, 12(3), 132-39.

Medrano, M.F. (1986). Evaluating the long-term effects of a bilingual education program: A study of Mexican students. Journal of Educational Equity and Leadership, 6, 129-138.

Medrano, M.F. (1988). The effects of bilingual education on reading and mathematics achievement: A longitudinal case study. Equity and Excellence, 23(4), 17-19.

Ramirez, J.D., Yuen, S.D., Ramey, D.R., Pasta, D.J. & Billings, D. (1990). Final report: Longitudinal study of immersion strategy, early-exit and late-exit transitional bilingual education programs for language-minority children. San Mateo, CA: Aguirre International. (ERIC Document Reproduction Service No. ED330216)

Rolstad, K., Mahoney, K. & Glass, G.V. (2005). The Big Picture: A Meta-Analysis of Program Effectiveness Research on English Language Learners. Educational Policy, 19(4), 572-594.

Rossell, C. (1990). The effectiveness of educational alternatives for limited-English proficient children. In G. Imhoff (Ed.), Learning in Two Languages (pp. 71-121). New Brunswick, NJ: Transaction Publishers.

Rotharb, S. and others (1987). Evaluation of the bilingual curriculum content (BCC) pilot project: A three year study. Final report. Miami, FL: Dade County Public Schools. Office of Educational Accountability. (ERIC Document Reproduction Service No. ED300382)

Saldate, M., IV, Mishra, S., & Medina, M., Jr. (1985). Bilingual instruction and academic achievement: A longitudinal study. Journal of Instructional Psychology, 12(1), 24-30.

Slavin, R. E., & Cheung, A. (2003). Effective Reading Programs for English Language Learners: A Best-Evidence Synthesis. Baltimore, MD: Johns Hopkins University Center for Research on the Education of Students Placed At Risk (CRESPAR).

Texas Education Agency. (1988). Bilingual/ESL  Education: Program evaluation report. Austin, TX: Texas Education Agency. (ERIC Document Reproduction Service No. ED305821)

Thompson, M. S., DiCerbo, K., Mahoney, K. S., & MacSwan, J. (2002). ¿Éxito en California? A validity critique of language program evaluations and analysis of English learner test scores. Education Policy Analysis Archives, 10(7), entire issue. Available at http://epaa.asu.edu/epaa/v10n7/.

Willig, A. C. (1985). A Meta-analysis of selected studies on the effectiveness of bilingual education. Review of Educational Research, 55(3), 269-318.


Endnote

1. We are indebted to Stephen Krashen for bringing this important fact to our attention (Krashen, 2005).

 


Tables

Table 1

Comparisons of Effect Size by Study as They Appeared in Rolstad, Mahoney & Glass (2005)

Study                                   N of ES   Mean ES  SD of ES¹

Burnham-Massey, 1990
Grades 7-8
Range of n's for TBE: 36-115
Range of n's for EO2: 36-115
TBE vs EO2
    Reading                                   3     -0.04      0.07
    Mathematics                               3      0.24      0.14
    Language                                  3      0.16      0.25

Carlisle, 1989
Grades 4, 6
Range of n's for TBE: 23
Range of n's for EO1: 19
Range of n's for EO2: 22
TBE vs EO1
    Writing-Rhetorical Effectiveness          1      0.82
    Writing-Overall Quality                   1      1.38
    Writing-Productivity                      1      0.60
    Writing-Syntactic Maturity                1      1.06
    Writing-Error Frequency                   1      0.50
TBE vs EO2
    Writing-Rhetorical Effectiveness          1     -2.45
    Writing-Overall Quality                   1     -8.25
    Writing-Productivity                      1      0.18
    Writing-Syntactic Maturity                1      0.24
    Writing-Error Frequency                   1      1.01

Carter and Chatfield, 1986
Grades 4-6
Range of n's for DBE: 26-33
Range of n's for EO2: 14-47
DBE vs EO2
    Reading                                   3      0.32      0.24
    Mathematics                               3     -0.27      1.06
    Language                                  3     -0.60      1.54

de la Garza and Medina, 1985
Grades 1-3
Range of n's for TBE: 24-25
Range of n's for EO2: 116-118
TBE vs EO2
    Reading Vocabulary                        3      0.15      0.38
    Reading Comprehension                     3      0.17      0.06
    Mathematics Computation                   3     -0.02      0.15
    Mathematics Concepts                      3     -0.02      0.14

Gersten, 1985
Grade 2
Range of n's for TBE: 7-9
Range of n's for ESL: 12-16
TBE vs ESL
    Reading                                   1     -1.53
    Mathematics                               1     -0.70
    Language                                  1     -1.44

Gersten, Woodward, and Schneider, 1992
Grades 4-6
Range of n's for TBE: 114-119
Range of n's for ESL: 109-114
TBE vs ESL
    Reading                                   4     -0.17      0.12
    Language                                  4     -0.35      0.26
    Mathematics                               4      0.00      0.17

Gersten and Woodward, 1995
Grades 4-7
Range of n's for TBE: 117
Range of n's for ESL: 111
TBE vs ESL
    Reading                                   4     -0.15      0.13
    Language                                  4     -0.33      0.22
    Vocabulary                                3     -0.15      0.12

Lindholm, 1991
Grades 2-3
Range of n's for DBE: 18-34
Range of n's for EO1: 20-21
DBE vs EO1
    Reading                                   1     -0.59
    Language                                  2     -0.14      0.57

Medina and Escamilla, 1992
Grades K-2
Range of n's for DBE: 138
Range of n's for TBE: 123
DBE vs TBE
    language-oral, native                     2      0.64      0.74
    language-oral, English                    1      0.11

Medina, Saldate, and Mishra, 1985
Grades 6, 8, and 12
Range of n's for DBE: 19
Range of n's for EO1: 24-25
DBE vs EO1
  MAT Test
    Total Mathematics                         2     -0.32      0.16
    Problem Solving                           2     -0.24      0.13
    Concepts                                  2     -0.34      0.25
    Computation                               2     -0.13      0.53
    Total Reading                             2     -0.21      0.08
    Reading                                   2     -0.30      0.28
    Word Knowledge                            2     -0.10      0.10
  CAT Test
    Total Mathematics                         1     -0.20
    Concepts/Application                      1     -0.11
    Computation                               1     -0.27
    Total Reading                             1     -0.63
    Comprehension                             1     -0.57
    Vocabulary                                1     -0.41

Medrano, 1986
Grades 1, 6
Range of n's for TBE: 179
Range of n's for EO2: 108
TBE vs EO2
    Reading                                   2     -0.18      0.13
    Mathematics                               2      0.10      0.24

Medrano, 1988
Grades 1, 3
Range of n's for TBE: 172
Range of n's for EO2: 102
TBE vs EO2
    Reading                                   1      0.10
    Mathematics                               1      0.60

Ramirez, Yuen, Ramey, Pasta, and Billings, 1991
Grades 1-3
Range of n's for DBE: 97-197
Range of n's for TBE: 108-193
Range of n's for ESL: 81-226
DBE vs ESL
    Mathematics                               3      0.26      0.22
    Language                                  3     -0.43     -0.97
    Reading                                   3      0.37      0.21
TBE vs ESL
    Mathematics                               3      0.11      0.10
    Language                                  3     -0.17      0.17
    Reading                                   3      0.01      0.16

Rossell, 1990
Grades K-12
Range of n's for TBE: 250
Range of n's for ESL: 326
TBE vs ESL
    oral language                             2      0.36      0.23

Rotharb and colleagues, 1987
Grades 1-2
Range of n's for TBE: 34-70
Range of n's for ESL: 33-49
TBE vs ESL
  Tests in English
    Mathematics                               4      0.13      0.11
    Language                                  2      0.28
    Social Studies                            4      0.20      0.13
    Science                                   4      0.09      0.18
  Tests in Spanish
    Mathematics                               4      0.11      0.14
    Language                                  2      0.10
    Social Studies                            4      0.23      0.22
    Science                                   4      0.16      0.11

Saldate, Mishra, and Medina, 1985
Grades 2-3
Range of n's for DBE: 31
Range of n's for EO1: 31
DBE vs EO1
  Tests in English
    Total Achievement*                        1     -0.29
    Reading                                   1      1.47
    Spelling                                  1      0.50
    Arithmetic                                1      1.16
  Tests in Spanish
    Total Achievement                         1      0.46
    Reading                                   1      2.31**
    Spelling                                  1      3.03
    Arithmetic                                1      1.16

Texas Education Agency, 1988
Grades 1, 3, 5, 7, 9
Range of n's for TBE: approximately 135,000
Range of n's for ESL: approximately 135,000
TBE vs ESL
  Tests in English
    Mathematics                               4     -0.03      0.02
    Reading                                   4     -0.06      0.13
  Tests in Spanish
    Mathematics                               2      0.33      0.06
    Reading                                   2      0.78      0.09

¹“SD of ES” is the standard deviation of the effect sizes.

*Reading, Spelling, and Arithmetic are not constituents of the Total Achievement

**This effect size was calculated with the treatment group's standard deviation

TBE is Transitional Bilingual Education

DBE is Developmental Bilingual Education

ESL is English as a Second Language

EO1 is English Only instruction for Limited English Proficient children

EO2 is English Only instruction for non-Limited English Proficient children

 

 


Table 2

Combining Effect Sizes by Grouping before and after Correcting for Gersten’s Coding Error

                                    Before Correction             After Correction
Grouping                            N of ES  Mean ES  SD of ES    N of ES  Mean ES  SD of ES

All outcome measures                     67     0.08      0.67         67     0.19      0.65
Reading (in English)                     16    -0.06      0.61         16     0.14      0.60
Math (in English)                        15     0.08      0.42         15     0.17      0.39
All outcomes in native language          11     0.86      0.96         11     0.86      0.96
Without Gersten studies                  58     0.17      0.64         58     0.17      0.64
All TBE studies                          35    -0.01      0.45         32     0.10      0.24
All DBE studies                          30     0.18      0.86         30     0.18      0.86
All studies comparing ELLs to ELLs       22     0.23      0.97         22     0.23      0.97

 

 

 

 

 

 
