Measuring School Engagement: Validation and Measurement Equivalence of the Student Engagement Scale on Angolan Male and Female Adolescents

Department of Educational and Developmental Psychology, Faculty of Psychology, University of Valencia, Spain. Department of Methodology for the Behavioral Sciences, Faculty of Psychology, University of Valencia, Spain. Language and Literature Teaching, Faculty of Teacher Training, University of Valencia, Spain. Department of Psychology and Sociology, Faculty of Social and Human Sciences, University of Zaragoza, Teruel, Spain. Departamento de Ciências da Educação, Instituto Superior de Ciências da Educação, Universidade Katyavala Bwila, Benguela, Angola.


INTRODUCTION
Engagement in school settings is defined primarily in relation to the participation of the student in academic achievement, which leads "to the experiences and desired outcomes such as persistence, satisfaction, learning, and graduation" [1, p. 44]. School engagement is viewed as a multidimensional and integrative construct, or macroconstruct, made up of several dimensions. There is no absolute consensus on either its definition or the number of its dimensions but, at a minimum, engagement is seen as composed of participatory behavior and an affective component [2]. Other authors have added a cognitive component or have divided the behavioral component into two dimensions, academic and behavioral (or participation) [3,4,5]. Nevertheless, the most frequently cited typology recognizes three specific and overlapping dimensions: cognitive, behavioral, and emotional [4,6,7,8].
The cognitive dimension includes the use of sophisticated, deep and personalized learning strategies, the search for conceptual understanding, and the use of self-regulatory strategies [5,9]. It relies on students' investment in learning, encompassing intrinsic motivation and self-regulated metacognitive strategies used in tasks and learning activities, as well as the willingness to exert the effort necessary to comprehend complex ideas [4]. The behavioral dimension focuses on students' persistence, consistency of effort, concentration, determination, involvement in academic tasks and extracurricular activities, and actions and practices related to school and learning [10]. The emotional, or psychological, dimension refers to students' attitudes toward learning, teachers, academics and classmates, and their feelings and sense of belonging to school and schoolwork [5].
Recently, a fourth dimension, personal agency, has been proposed [11,12], which reflects students' constructive engagement with the academic instruction they receive at school. It is defined by an active, intentional, and constructive contribution to the flow of learning, enriching the learning activity rather than passively receiving information [9]. Hence, students' personal agency plays an important role in school-related outcomes throughout their educational career. Personal agency makes students active participants and valued partners in their classroom interactions. The cognitive, affective, behavioral and agentic components of school engagement are thought to be fully embedded within the individual and represent the way in which students act, feel and think at school [13].
Based on the four-dimensional model of school engagement proposed by Reeve and Tseng [11], Veiga developed a self-report measure, the Student Engagement Scale-4 Dimensions (SES-4DS), to be used in Portuguese-speaking contexts. To our knowledge, Veiga's instrument was the first scale in Portuguese to measure the three main components of student engagement (behavioral, emotional and cognitive) while adding the personal agency dimension. Items were derived from instruments developed in previous studies, a literature review, and interviews with middle and high school students and their teachers [14,15]. Sentences were intended to be short and specific and were chosen to refer to the four aforementioned dimensions of engagement. Items tapping the behavioral dimension were derived from previous studies [16]. The affective dimension is composed of items employed in a new scale measuring student engagement in school, proposed to the PISA (Programme for International Student Assessment) project [15]. Cognitive engagement items were derived from studies on learning processes, school motivation, and academic time management [17]. Finally, the items sampling personal agency are based on several works [11,15]. Veiga [12] offers evidence on certain types of validity for the SES-4DS.
A number of variables and outcomes relevant in educational research have been linked to school engagement [2,18,19], and they are therefore useful for validation purposes, either as criteria or as part of a nomological net. For example, according to the existing literature, teachers who promote students' autonomy also affect students' engagement by presenting interesting and relevant learning activities, providing optimal challenges, highlighting meaningful learning goals, and supporting students' volitional endorsement of classroom behaviors [20]. In the same vein, the classroom environment is thought to play an important role in students' motivation, engagement, and achievement at school. Researchers have suggested various ways to conceptualize the characteristics of classroom environments that would be related to students' adaptive engagement [21]. Among them, an influential framework has been achievement goal structures, that is, students' perceptions of the motivational emphasis in their classroom [22,23]. Finally, school engagement has also been linked with satisfaction with school; specifically, school satisfaction is thought to promote engagement. For example, Ladd, Buhs and Seid [25] found that kindergarten students' school satisfaction determined their subsequent school engagement behaviors. Similar results were reported by Elmore and Huebner [26]. Apparently, happiness, in the form of positive school satisfaction, and education are inextricably intertwined [27].
A large body of research has explored engagement differences between females and males. Martin [28] analyzed over 12,000 responses to the Motivation and Engagement Scale-High School (MES-HS) and found that females scored significantly higher than males in valuing of school, mastery orientation, planning, task management, and persistence, while males scored higher than females in self-handicapping and disengagement. Indeed, students' level of engagement with their schools has been found to differ profoundly by gender. These differences have found empirical support regardless of the type of engagement studied [29]. However, most of these studies did not consider measurement equivalence when making gender comparisons [30]; they took equivalence for granted. If measures of engagement are not gender invariant, it is inappropriate to compare levels of engagement across groups. For example, Betts, Appleton, Reschly, Christenson and Scott-Huebner [31] tested for gender invariance and found basic invariance of both factor loadings and residuals for males and females, but they did not test for intercept invariance. Wang, Willett and Eccles [32] tested the measurement invariance of behavioral, cognitive and emotional engagement in ninth-grade students across gender and race/ethnicity groups. Regarding gender, they found measurement equivalence and therefore tested for latent mean differences. They found significant differences favoring females in behavioral (d = .32) and emotional engagement (d = .25), whereas there were no significant gender differences in cognitive engagement.
In the preface of an influential handbook on student engagement, Christenson, Reschly and Wylie [33] argued that methodological rigor and psychometrically sound measurement of engagement have not yet been completely achieved. Although they recognize that psychometrically sound engagement measures exist, they also point out that further conceptual clarity and methodological rigor are needed in this arena. Thus, the aim of this research was threefold: a) to study the validity of the SES-4DS questionnaire, b) to analyze its measurement invariance across gender, and c) to study potential latent mean differences between males and females in student engagement.

Sample and Procedure
The sample was composed of 2034 students enrolled in seventh to twelfth grade in Angola. Their mean age was 17.5 years (SD = 2.31). Males made up 48.9% of the sample (n = 993) and females 50.1% (n = 1035). By grade, 9.5% were in 7th, 22.7% in 8th, 39.5% in 9th, 19.9% in 10th, 6.4% in 11th, and 2.1% in 12th grade. A total of 47.2% lived in rural areas, while the remaining 52.8% lived in urban areas. Students were sampled in their schools during regular lessons. The survey was self-administered, but trained interviewers were also present to answer potential questions. Almost all participants completed the survey; the few students (less than 1%) who did not consistently answer all parts of the survey were excluded from further analyses. All ethical guidelines for research in Angola were followed, and the required permissions were obtained from the authorities. Students took about half an hour to answer the survey.

Instruments
The survey comprised several instruments on engagement and related topics, as well as basic socio-demographic information and academic achievement measures. All of them were rated as Likert-type ordinal measures from 1 (strongly disagree) to 5 (strongly agree). Among these instruments, the ones relevant for the validation of the engagement scales were: a) Students' Engagement at School Scale-4 dimensions (SES-4DS; [12]). A pilot study with 25 students was conducted to verify adequate comprehension of all item contents.

Statistical Analyses
The main statistical analyses included a series of confirmatory factor analyses and the invariance routine to test for psychometric equivalence across sexes. The model specifies four latent variables (cognitive, affective, behavioral and agency engagement), each explaining five items. All models were estimated with EQS 6.1 [38]. Given the non-normality of the data, maximum likelihood estimation with Satorra-Bentler corrections was used, as recommended, for example, by Finney and DiStefano [39].
The equivalence or invariance routine applied was the standard procedure [40]. This routine comprises a hierarchical set of steps. First, the model was tested separately in the two groups until good fit was achieved in both samples (males and females). Then, a configural model, with no parameter constraints across sexes, was tested simultaneously for females and males. This model tested configural equivalence, and its fit indices served as the baseline fit criteria. Then, equality constraints were imposed on the factor loadings across groups, a constrained model that tests for metric invariance, also known as weak factorial invariance.
Next, equality constraints on the item intercepts were added to the model with equal loadings, yielding a model that tested for scalar invariance or strong factorial invariance. Finally, further equality constraints across gender were imposed on the factor correlations. This last model does not test psychometric invariance but is of substantive interest: if it is tenable, it supports the equality of the correlations among the dimensions for males and females. No constraints on errors or factor variances (strict factorial invariance) were imposed, as most researchers omit these constraints as not strictly needed for mean comparisons [41].
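The nested sequence of models can be summarized schematically. The following is a minimal Python sketch of that hierarchy, not the EQS syntax used in the study; the labels and the helper function are hypothetical names introduced only for illustration.

```python
# Illustrative only: hypothetical labels for the nested multi-group models.
# Each step keeps the constraints of the previous step and adds one more set.
INVARIANCE_STEPS = [
    ("configural", None),                           # same structure, no equality constraints
    ("metric", "factor loadings"),
    ("scalar", "item intercepts"),
    ("equal correlations", "factor correlations"),  # substantive, not psychometric
]


def cumulative_constraints(steps):
    """Return {model name: parameter sets held equal across groups}."""
    constrained = []
    result = {}
    for name, added in steps:
        if added is not None:
            constrained = constrained + [added]
        result[name] = list(constrained)
    return result


MODELS = cumulative_constraints(INVARIANCE_STEPS)
for name, equal in MODELS.items():
    print(f"{name}: {', '.join(equal) if equal else 'no constraints'}")
```

Because each model contains all constraints of the one before it, the models are nested and can be compared step by step.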
The plausibility of the models was assessed using several fit criteria [42,43]: (a) the chi-square statistic [44,45]; (b) the comparative fit index (CFI; [46]), with values of more than .90 (and, ideally, greater than .95; [42]); and (c) the root mean squared error of approximation (RMSEA), with values of .08 or less (and, ideally, less than .05) [42]. Hu and Bentler [42] suggested that a CFI of at least .90 and an RMSEA of less than .06 together would indicate a very good fit between the hypothesized model and the data. The models in the invariance routine are nested. When nested models are compared there are two rationales [47]: the statistical and the modeling one. The statistical approach employs χ² differences (Δχ²) to compare constrained to unconstrained models, with non-significant values suggesting multigroup equivalence or invariance. This statistical approach has been criticized [47,48], and its critics recommend the modeling approach, which uses practical fit indices to determine the overall adequacy of a fitted model. From this point of view, if a parsimonious model (such as the ones that posit invariance) shows adequate levels of practical fit, then the sets of equivalences are considered a reasonable approximation to the data. Usually, CFI differences (ΔCFI) are used to evaluate measurement invariance. CFI differences lower than .01 [48] or .05 [47] are usually employed as cut-off criteria.
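As a rough illustration of the cut-offs and of the modeling rationale, the decision rules can be written as two small Python functions. This is a sketch under stated assumptions: the function names and the example values are hypothetical, and the code is not part of the analyses reported here.

```python
# Illustrative helpers; function names and example values are hypothetical.

def practical_fit(cfi, rmsea):
    """Classify model fit using the cut-offs quoted in the text."""
    if cfi > 0.95 and rmsea < 0.05:
        return "good"
    if cfi > 0.90 and rmsea <= 0.08:
        return "acceptable"
    return "poor"


def invariance_tenable(cfi_baseline, cfi_constrained, max_drop=0.01):
    """Modeling rationale: retain the constrained model when the drop in
    CFI relative to the baseline model is smaller than max_drop."""
    return (cfi_baseline - cfi_constrained) < max_drop


# Made-up values, not results from this study
print(practical_fit(0.93, 0.06))
print(invariance_tenable(0.950, 0.945))                # stricter .01 criterion
print(invariance_tenable(0.95, 0.93, max_drop=0.05))   # more lenient .05 criterion
```

The `max_drop` argument makes the choice between the .01 and the .05 criterion explicit, since the two cut-offs can lead to different conclusions for the same pair of models.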
Once adequate fit was achieved in both samples, the set of increasingly constrained multi-group models was estimated and tested. Table 2 shows the sequence of models, their fit indices, and their chi-square and CFI differences relative to the baseline (configural) model. Both the statistical and the practical approaches to model comparison agreed: there were no statistically significant differences among the configural, metric and scalar models, and therefore the most parsimonious model (scalar with equal correlations) could be retained. Indeed, practical fit indices remained extremely similar or, in the case of the RMSEA, even slightly improved. Accordingly, the set of equivalences was considered tenable, and the SES-4DS may be considered equivalent across gender.
Unstandardized and standardized factor loadings in the retained model are presented in Table 3. Table 4 offers covariances and correlations among the four dimensions of engagement for both groups. In general, there are significant correlations among all dimensions. Once scalar equivalence was established, the latent mean differences could be investigated. The latent means were fixed to zero for females, making them the reference group. Males had higher ratings than females in cognitive engagement (α = 0.063, z = 2.46, p < .05, d = 0.33) and in affective engagement (α = 0.108, z = 2.76, p < .05, d = 0.24). There were no significant mean differences in either behavioral (α = 0.004, z = .127, p > .05, d = 0.01) or agency engagement (α = 0.052, z = 1.65, p > .05, d = 0.17).
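The significance statements for the latent mean differences can be checked from the reported z values alone. The sketch below assumes that each z statistic is referred to a standard normal distribution with a two-sided test; the text does not state this explicitly, so it should be read as an assumption, and the function name is hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic (assumed test)."""
    return math.erfc(abs(z) / math.sqrt(2))


# z values as reported in the text for the four latent mean differences
REPORTED_Z = {
    "cognitive": 2.46,
    "affective": 2.76,
    "behavioral": 0.127,
    "agency": 1.65,
}

for dimension, z in REPORTED_Z.items():
    p = two_sided_p(z)
    print(f"{dimension}: z = {z}, p = {p:.3f}, significant at .05: {p < 0.05}")
```

Under these assumptions, the cognitive and affective differences fall below the .05 threshold, while the behavioral and agency differences do not, in line with the pattern reported above.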

Reliability and Nomological Validity
Cronbach's alphas were .60, .60, .72 and .58 for the cognitive, affective, behavioral and agency dimensions of engagement, respectively. With respect to criterion-related and nomological validity, Table 5 offers the correlations with the criterion and the nomological net of constructs. All these correlations were in the hypothesized direction and may be considered adequate.
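For readers less familiar with the coefficient, Cronbach's alpha for a dimension with k items is k/(k - 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The following minimal Python sketch applies this standard formula to made-up responses; the data and the function name are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items for everyone)."""
    k = len(rows[0])
    columns = list(zip(*rows))                              # scores per item
    item_variances = [variance(col) for col in columns]     # sample variances
    total_variance = variance([sum(row) for row in rows])   # variance of sum scores
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up responses (5 respondents x 4 items, 1-5 Likert), NOT data from the study
TOY_RESPONSES = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]

print(round(cronbach_alpha(TOY_RESPONSES), 2))
```

Alpha increases with both the number of items and the average inter-item correlation, so short subscales tend to yield lower values when item quality is held constant.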

DISCUSSION
The SES-4DS is, to our knowledge, the only scale in Portuguese that measures the three common dimensions of school engagement (cognitive, behavioral and affective) plus the recently proposed dimension of agency or agentic engagement [11]. Veiga [12] presented an initial validation of the scale, but an independent replication has never been presented and measurement invariance across gender has not yet been tested. Therefore, a validation of the scale in a sample from a different population seems timely. Reliability, factorial, criterion-related and nomological validity results of the SES-4DS in a large sample of Portuguese-speaking Angolan students are presented. Additionally, measurement invariance across gender is established, and males' and females' engagement means are compared at the latent level. This research is based on a Portuguese-speaking population from a cultural context very different from that of the original applications of the scale, and it can therefore be considered a cross-cultural validation.
With respect to factorial validity, good model fit was achieved for the scale with the aforementioned four-dimensional structure, but only after deleting two items in the affective dimension ("My school is a place where I feel excluded", "My school is a place where I feel alone"). It is worth noting that these two items, which showed inadequate psychometric behavior, were the only negatively worded items in the affective dimension. This structure was found to be scalar invariant for males and females. That is, the main psychometric properties remain the same for males and females, which is important for further gender comparisons. Criterion-related and nomological validity of the four dimensions of engagement can be considered adequate, with all of the relationships in accordance with the hypotheses. Internal consistency, as estimated by Cronbach's alpha, cannot be considered adequate, as three of the four dimensions had alphas around or equal to .60.
These results should be compared to those in the literature. Although validity results can be compared to the broad literature on engagement regardless of the specific scale used (for example, engagement has been systematically related to teachers' autonomy support), reliability comparisons can only be made against results obtained with the same scale. In this regard, Veiga [12,15] presented such data for the SES-4DS in Portuguese samples, and his results showed a reliable scale, as well as reliable dimensions, with alphas always above .70. This is not in line with our results: only the behavioral dimension had a relatively high alpha. Although dropping two items from the affective dimension could well have lowered the reliability of this factor, as measured by alpha, there is no straightforward explanation for the low reliabilities of the cognitive and agentic engagement factors.
Validity results were clearly in line with those found in the literature. Autonomy support is given when teachers promote students' participation in all decisions regarding academic tasks and school governance and allow for class discussion [49]. The relationship of these practices with school engagement may be explained by the fact that they produce personal satisfaction and a sense of responsibility toward the school and the learning environment [10,50]. Indeed, correlational analyses have shown that students' classroom engagement is quite strongly and positively associated with teachers' autonomy support [51]. This research supports such findings, with positive associations between the engagement dimensions and perceived autonomy support, with the exception of behavioral engagement.
Positive relationships between the engagement dimensions and motivational climate were also expected. Specifically, previous results have found that students who perceive that their teachers advance mastery goals are more motivated to learn and tend to engage in deeper cognitive processing, such as metacognitive and self-regulation strategies, than students who report teachers with performance goals [52]. Thus, larger relations were expected between engagement and mastery climate than between engagement and performance climate. This was indeed the case: on the one hand, there were positive relations of mastery climate with cognitive, affective and agentic engagement, and negative relations with behavioral engagement (misbehavior). On the other hand, the relationships between the engagement dimensions and performance climate were lower, and positive in the particular case of behavioral engagement. The same pattern of results was found for students' mastery and performance orientation.
Finally, school satisfaction, defined as the "subjective, cognitive appraisal of the perceived quality of school life" [53, p. 210], was also significantly related to the dimensions of engagement (with the expected negative relation with misbehavior). This was anticipated since, as Noddings [27] argues, "children learn best when they are happy" (p. 2). These results provide some support for this notion.
Last but not least are the results on latent mean comparisons between males and females. The differences found at the latent mean level favored males for cognitive and affective engagement, with no differences found in the other two dimensions. A recent cross-national study of gender differences in 12 countries across the world found larger means for females [54], which is contrary to the results of this study.

CONCLUSION
With respect to the three aims of the study, the conclusions are:
a) Factorial validity was acceptable, with a structure of the expected four dimensions, but only after deleting two items in the affective dimension.
b) Scalar measurement invariance across gender has been supported by the data.
c) The differences found at the latent mean level favored males for cognitive and affective engagement, with no differences found in the other two dimensions.
Among the limitations of the study, all variables included in the survey were self-reported, which can lead to some types of bias. Additionally, further research is clearly needed, from both a psychometric and a substantive point of view.
Given the mixed results found in this research, replications in new samples of Angolan students and in samples from other Portuguese-speaking countries seem necessary to clarify the structure and the reliability of the scale. Given that behavioral engagement is defined as a dimension measuring students' persistence, consistency of effort, concentration, determination, involvement in academic tasks and extracurricular activities, and actions and practices related to school and learning [10], it would be advisable to avoid measuring all indicators of this dimension as lack of behavioral engagement or misbehavior.

COMPETING INTERESTS
Authors have declared that no competing interests exist.

Table 4. Variances, covariances and correlations among the engagement factors
Note: All covariances and correlations were statistically significant (p < .05).