[page 100]
Appendix 2 Statistical note
Contributed by one of the Department's statisticians
The sample of teachers
1. A stratified random sample of slightly over 300 teachers was drawn from the DES main mechanised record of teachers, a computer file containing data supplied by local education authorities for statistical and superannuation purposes. The population available for sampling was defined as all trained graduate teachers* in maintained primary, middle and secondary schools in England and Wales who started teaching in September 1980. There were three dimensions to the stratification: the type of school in which the teacher was employed (primary, middle or secondary), an eightfold geographical classification of the teacher's LEA based on the seven English Divisions of HM Inspectorate with the addition of Wales and, for secondary teachers only, the first-named subject of the teacher's degree.
2. At the time when the structure of the sample was determined it seemed likely that the population of newly trained graduate teachers would be split between primary and secondary schools in the ratio 1:2. This ratio was used for sampling and is confirmed by the latest available estimates of the intake. After losing 6 of the teachers originally drawn from the sample due to illness or resignation and having distributed 20 teachers teaching in middle schools between primary and secondary depending on the curricular organisation of the schools**, the resulting sample of 93 primary school teachers and 201 secondary school teachers represented a little over 3 per cent of the graduate intake population.
3. There is no discernible geographical bias in the sample, though the marked differences in the rates of recruitment between various LEAs are reflected in it. The qualifications of secondary school teachers are in proportion to those of the population, but there may be small biases in that certain graduate qualifications were not separately identified in the Department's records. Any bias would be too small to invalidate any of the conclusions drawn in this report.
4. Given the composition of the sample there are also some small biases in the lessons observed for teachers in secondary schools. Inspectors visited each school for one day during which they were to observe two lessons and have a number of discussions. With such a tight schedule it was not possible to exercise more than minimal control over what lessons were observed. Inspectors were asked to ensure that at least one of the two lessons was typical of each teacher's timetable but were otherwise free to choose whatever it was convenient to observe. The teachers had been asked to provide certain details of their weekly timetables, and the distribution of their lessons over the six year-groups is set alongside the distribution of the 402 lessons observed in the table opposite:
*ie including both BEd and PGCE-trained teachers.
**Which normally reflected their status and age-range, "deemed primary" or "deemed secondary".
[page 101]
5. Lessons with the sixth year are under-represented, but the balance within the other years is preserved reasonably well. It was not possible to test for other forms of bias, but it is unlikely that they would be severe enough to invalidate any of the conclusions in the report. It is possible that lessons which were atypical of the teachers' timetables were under-represented in the collection of observations, since each Inspector visiting a school had a specialism matched with the school's preliminary indication of the teacher's main teaching subject. The Inspector may have tended to observe his own specialism for both of the lessons if the constraints of the one-day visit allowed of that.
Statistical analyses
6. The material in Chapters 1 to 6 of the report rests to a greater or lesser extent on analyses of the statistical material collected. While most of the analyses simply reported the distribution of each variable separately for teachers in primary and secondary schools, some examined the relationships between these variables. Here it is only possible to show the basis for a few of the more explicit references in the text to probable associations, giving examples from the data and indicating the techniques and tests employed.
7. With much of the data being of an ordinal* rather than continuous nature, it would have been useful to experiment with the newly emerging techniques for handling multivariate ordinal situations. However, the need to produce a timely report precluded this approach. Instead, a variety of bivariate non-parametric methods were used, supplemented by conventional multivariate techniques. Proper caution was exercised in the interpretation of the results of analyses which were not entirely appropriate to the kind of data being studied.
8. Two major classificatory variables used in the questionnaire completed by the teacher show the type of training course undertaken, whether four-year BEd, three-year BEd, PGCE or other, and the type of institution attended. The categorisation of institution types offered in the questionnaire was found to be imperfectly understood by the probationer and not the most fruitful for subsequent analyses, so HMI undertook a recoding of type of institution into a fourfold division of university departments of education (UDEs), polytechnic departments of education (PDEs), LEA-maintained colleges and voluntary colleges or institutes of higher education, based on the name of the institution which every probationer had been asked to supply.
9. These two variables, referred to as TC (Type of Course) and TI (Type of Institution), were set against five assessment variables MSJ, MSK, SAS, SAT1, SAT2. All of these assessments except SAT1 are clearly of an ordinal nature. The findings are summarised in paragraphs 3.50ff of the report.
*Ordinal data is measured on scales whose points have a clear ordering though little or nothing is known about the relative magnitudes of the intervals between the points. Continuous data on the other hand is measured on scales such as age or height whose intervals are in some sense equal. Classificatory data (mentioned in paragraph 8) is not measured on any scale at all, serving only to distinguish various subsets of the sample.
[page 102]
10. MSJ is an Inspector's assessment of the teacher's mastery of the subject-matter of the lessons observed, one of a number of assessments made on the evidence of the work HMI saw in the classroom*. The two ratings for each teacher, which ranged between 1 and 5, are here added together to give an MSJ score ranging between 2 and 10. For ease of display the resulting scores have been placed in three broad bands. Table A1 shows the percentage distribution across MSJ for each category of TC and TI for secondary teachers. Under TC, the "other" type of course has been suppressed since no secondary teachers placed themselves in this category. This analysis is confined to those teachers who were observed teaching a subject for which they had an appropriate qualification; less than 10 per cent of the total sample of 201 secondary teachers had to be excluded to meet this condition.
Table A1 Mastery of subject-matter of lessons observed by type of course and type of training institution
11. The apparent superiority of PGCE-trained secondary teachers seen in the TC block of Table A1 is not confirmed by any statistical test, but those who attended university departments of education were rated significantly better in their mastery of the subject-matter of the lessons observed than those who attended other types of institution. The statistical significance of a Mann-Whitney U-test was confirmed by a median test. For all the associations reported in this appendix the threshold for statistical significance was taken as 95 per cent, allowing only a one in twenty chance that an association reported between two variables would be due to spurious sampling errors.
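As an illustration of the kind of test named in paragraph 11, the following is a minimal sketch of a Mann-Whitney U-test on two groups of ordinal scores. It is not the Department's own computation: the function names and the two lists of scores are hypothetical, and the p-value uses the large-sample normal approximation with a correction for ties and no continuity correction, which is an assumption of this sketch.

```python
from math import erf, sqrt


def midranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over each group of t tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # every value tied: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / sqrt(variance)
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return u1, p


# Hypothetical MSJ scores (2 to 10) for two groups of teachers.
group_a = [4, 5, 5, 6, 7, 7, 8]
group_b = [3, 4, 4, 5, 5, 6, 6]
u, p = mann_whitney_u(group_a, group_b)
significant = p < 0.05  # the 95 per cent threshold used in this appendix
```

With samples as small as those in the sketch an exact test would normally be preferred; the approximation is shown only because it is short.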
12. MSK is HMI's rating on a 5 point scale of the teacher's mastery of a wide range of teaching skills**. For primary teachers MSK was found to be associated with the type of course but not with the type of institution attended, whereas for secondary teachers it was weakly associated with the type of institution but not at all with the type of course. The significant associations are displayed in Table A2.
13. Primary teachers who undertook a four-year BEd have a significantly better mastery of skills than either of the other two groups. Secondary teachers who were trained in maintained colleges appear to have a lesser mastery of skills than those trained elsewhere
*See paragraphs 2.25 and 2.29 and Table 8.
**See paragraphs 3.1ff and Figure 2.
[page 103]
but the significance of the difference for secondary teachers is only apparent using the Mann-Whitney U-test which, while more powerful than other tests, might not be considered appropriate to such data. The finding is not confirmed by other tests.
Table A2 Mastery of teaching skills by type of course and type of training institution
14. SAS reflects the school's own assessment of how well equipped each teacher was. The 5-point scale, ranging from 1 (very well equipped with no stated reservations) to 5 (very ill equipped), was used by a small group of HMI to code the written responses of visiting HMI which resulted from their interviews with senior school staff*. This assessment is consistent with that of HMI (MSK), as is illustrated in Table A3. It is immediately obvious from the marginal totals that SAS, the school's assessment as interpreted by HMI, is more favourable to the teachers than the more direct MSK assessment. Within these marginal constraints it can be seen that the balance of the SAS assessments moves to the right as one moves downwards through the rows of MSK. The rank correlation coefficients between the ordered categories of the pairs of variables attain values of 0.44 for primary teachers and 0.38 for secondary teachers. Bearing in mind the subjectivity which underlies the assessments, these values are quite high.
Table A3 Schools' assessment of how well equipped each teacher was by HMIs' assessment of their mastery of skills
*See paragraphs 5.5ff.
[page 104]
15. SAT1 is a measure of the teachers' satisfaction with certain aspects of their training. The ratings shown in Tables 11, 12 and 13 were averaged for each teacher.* The resulting scale can be treated as if it were continuous. There were no significant differences in the mean SAT1 scores between teachers in primary and teachers in secondary schools but there were significant differences in the mean scores of teachers trained through different types of course, putting primary and secondary teachers together. The mean scores are listed in Table A4 together with their standard errors.
Table A4 Teachers' satisfaction with different types of training course
Since a higher mean score represents a greater dissatisfaction, it is apparent that teachers with PGCE training are significantly more dissatisfied with certain parts of their training than are other teachers, the differences in the mean scores being considerably greater than the standard errors in the differences, which are about 0.083.
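The comparison described here, a difference in means set against the standard error of that difference, can be sketched as follows. The means and standard errors in the example are hypothetical and are not the values of Table A4; the formula assumes the two groups are independent samples, which is an assumption of this sketch rather than a statement in the report.

```python
from math import sqrt


def se_of_difference(se_a, se_b):
    """Standard error of the difference of two independent means."""
    return sqrt(se_a ** 2 + se_b ** 2)


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Difference of the two means expressed in standard errors of the difference."""
    return (mean_a - mean_b) / se_of_difference(se_a, se_b)


# Hypothetical figures: two course types with similar standard errors.
ratio = difference_in_se_units(2.50, 0.06, 2.20, 0.057)
# A ratio beyond roughly 1.96 in absolute value corresponds to
# significance at the 95 per cent level under a normal approximation.
is_significant = abs(ratio) > 1.96
```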
16. There is a smaller but still significant association between SAT1 and the type of institution attended. An analysis of variance to test for the equality of the four mean scores for the four types of institution was almost significant at the 95 per cent level. An inspection of the means revealed that there was little difference between the levels of satisfaction with courses in UDEs, LEA-maintained colleges and voluntary colleges but that there was a somewhat greater level of satisfaction with courses in polytechnic departments of education, the mean being significantly different from the combined mean of the other three types of institution.
17. SAT2 is a measure of the teachers' satisfaction with the balance of time spent on various components of the training course. The first four items shown in Table 14** were averaged after recoding the midpoint of each scale to 1 to indicate satisfaction, points 2 and 4 to 2 and points 1 and 5 to 3 to indicate dissatisfaction. The mean scores are listed in Table A5 together with their standard errors.
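The recoding described in paragraph 17 can be sketched as below. The names `RECODE` and `sat2_score` are illustrative only. The sketch also enumerates every possible combination of four responses, which shows that the averaged score can take just nine distinct values, in line with the remark in paragraph 18 that SAT2 is distributed "on only nine points".

```python
from itertools import product

# Midpoint (3) means satisfied with the balance; the extremes (1 and 5)
# mean most dissatisfied, whichever direction the imbalance lay in.
RECODE = {1: 3, 2: 2, 3: 1, 4: 2, 5: 3}


def sat2_score(ratings):
    """Average the four recoded items; 1.0 is most satisfied, 3.0 least."""
    if len(ratings) != 4:
        raise ValueError("SAT2 is built from exactly four items")
    return sum(RECODE[r] for r in ratings) / 4


# Every score a teacher could obtain, over all 5**4 response patterns.
possible_scores = sorted({sat2_score(combo)
                          for combo in product(range(1, 6), repeat=4)})
```

Because the recoded total runs from 4 to 12 in whole steps, the average moves in steps of 0.25 between 1.0 and 3.0.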
Table A5 Teachers' satisfaction with balance of time within training course, by type of course
*See paragraphs 3.21 to 3.40 for further details
**See paragraphs 3.41 to 3.49 for further details
[page 105]
As for SAT1, the PGCE-trained teachers are different from the other two groups but this time in the opposite direction. They are more satisfied with the balance of their courses.
18. There is also an association between SAT2 and the type of institution attended. The mean scores are shown in Table A6 where the types of institution are reordered in descending order of satisfaction. An F-test for the equality of the four means was highly significant (p = 0.99). The differences between adjacent pairs of institution types in the table are not significant but all other pairwise comparisons show significant differences.
Table A6 Teachers' satisfaction with balance of time within training course, by type of training institution
SAT2 has a somewhat skewed distribution on only nine points. The use of F and T statistics on such data is questionable but the differences for PGCE-trained students displayed in Table A5 were confirmed by suitable chi-square tests on the raw frequency distributions for each of the four variables which were averaged to create SAT2.
19. Where an association between two ordinal scales was needed, Kendall's rank correlation coefficient was computed, together with its associated level of significance. An example of this is seen in the correlation of the HMIs' overall 6-point scale assessment of each lesson observed with their rating of the constraints within the class or within the school which might have affected the lesson*. The severity of each of the four constraints had been rated on a 5-point scale for each lesson. The four ratings were averaged for each lesson and the result correlated with the 6-point scale assessment whose overall distribution is shown in Figure 1.
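A minimal sketch of Kendall's rank correlation for heavily tied ordinal data is given below. The appendix does not say which variant of the coefficient was computed; the tau-b form used here, which adjusts for ties, is an assumption of this sketch, and the lesson data are hypothetical.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences of ordinal scores."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both scales: contributes to neither factor
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    not_tied_x = concordant + discordant + tied_y_only
    not_tied_y = concordant + discordant + tied_x_only
    denominator = sqrt(not_tied_x * not_tied_y)
    if denominator == 0:
        return 0.0  # one scale is constant, so no ordering can be compared
    return (concordant - discordant) / denominator


# Hypothetical lessons: mean of four 5-point constraint ratings, and the
# overall 6-point assessment of the same lesson.
constraint_mean = [1.0, 1.5, 2.0, 2.0, 3.25, 4.0]
lesson_rating = [1, 2, 2, 3, 4, 6]
tau = kendall_tau_b(constraint_mean, lesson_rating)
```

The pairwise loop is quadratic in the number of lessons, which is acceptable for a few hundred observations such as the 402 lessons reported here.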
20. The rank correlation coefficient for lessons in primary schools was 0.43 and for those in secondary schools 0.38, both coefficients being very significantly non-zero. It is not easy to interpret the absolute value of rank correlation coefficients but the impression that constraints on lessons played a major part in the outcome was confirmed by an analysis of variance (more correctly an analysis of deviance within the framework of a Generalised Linear Model which treated the 6-point scale as if it were a continuous variable) under which 36 per cent of the total variance in the 6-point scale ratings for primary lessons and 25 per cent for secondary lessons was explained by a linear combination of the four constraint ratings. There was a
*See paragraphs 5.41ff.
[page 106]
large degree of overlap in the effects of the four constraints with about three-quarters of the explainable variation being accounted for by the difficulty of the class and over half being separately attributable to the absence of clear guidelines or schemes of work.
Table A7 Schools' provision of opportunities for the professional development of teachers by teachers' job satisfaction
21. The Inspectors who visited the schools were asked to record, in descriptive form, their impression of how well satisfied the teachers were with their jobs. As for SAS, a small group of HMI coded these written responses onto a five-point scale, JSAT, whose points were defined as:
1 well satisfied
2 well satisfied except for minor reservations
3 generally satisfied
4 fairly dissatisfied
5 very dissatisfied*
Table A8 Support from heads and other staff by teachers' job satisfaction
*See paragraph 5.57 and Table 40.
[page 107]
Table A9 Teachers' relationships with heads and other staff by teachers' job satisfaction
22. JSAT was found to correlate quite highly with a number of assessments made about the school and relationships between its staff and the probationary teacher*. These correlations are shown in Tables A7, A8 and A9. On both dimensions of each table the lowest two categories of each scale have been combined for ease of display. The rank correlation coefficients (Kendall's Tau) are all very significantly non-zero (with a significance level greater than 99.9 per cent) except for those shown for primary teachers in Table A8 which are nevertheless significant at the 95 per cent level.
23. All the examples quoted above involve some measure of subjective assessment made by HMI, by the schools or by the teachers. Several other analyses looked at the associations between purely factual pieces of data. An example is shown in Figure A1 where the distribution of the total number of weeks of teaching practice is displayed for each of the three types of course. It can be seen that the teaching practice in PGCE courses, with an average of 12.2 weeks, is less than in BEd courses, with an average of 14.5, or in four-year BEd courses, with an average of 14.8. It is also evident that the most common pattern for BEd courses is 15 weeks with 13 weeks as the next most common pattern. For PGCE courses the two most common patterns are 12 and 10 weeks**.
*See paragraphs 5.51-5.86 and Tables 37-39.
**See also paragraph 3.56.
[page 108]
Figure A1 Distribution of length of teaching practice by type of course