A meta-analysis of the relationship between self-assessment and Mathematics learning achievement

This study uses a meta-analysis to describe the relationship between self-assessment and mathematics learning achievement. The meta-analysis covers articles published from 2011 to 2022 and is restricted to studies published in English and indexed by Google Scholar. Included studies had to be quantitative, correlational research on the relationship between self-assessment and mathematics learning achievement that reported correlation values and sample sizes. Screening against these predetermined criteria yielded 12 articles containing 43 studies for analysis. The meta-analysis uses a random-effects model because the data distribution is heterogeneous. A publication bias test was carried out with the fail-safe N method to ensure the quality of the data. The analysis showed that the data distribution was heterogeneous according to the I² test, so selecting the random-effects model was appropriate. The fail-safe N test indicates that the data are free from publication bias. Thus, the analysis uses a suitable model, and its results can be trusted. The total effect size indicates a significant positive correlation (r = 0.295) between self-assessment and students' mathematics achievement. Therefore, the higher the self-assessment index, the higher the student's mathematics learning achievement.

ess. Students also begin to be involved in "penilaian diri" to practice self-assessment. "Penilaian diri" is used as a support for attitude assessment. "Penilaian diri" is carried out at least once a year at the end of the semester (Ministry of Education and Culture, 2016, p. 22).
Self-assessment is one of the ideas drawn from a growth mindset. This idea is applied in assessment in the hope of developing awareness that the final result is just one of many essential things; the process behind the final result, namely the achievement of learning objectives, matters more than the final result alone (Ministry of Education, Culture, Research, and Technology, 2021). Thus, teachers are expected to familiarize students with self-assessment as part of this effort. Unfortunately, a gap exists between the government's expectations and what teachers face. Very few Indonesian teachers apply self-assessment with their students (Setiadi, 2016). In line with that fact, the application of self-assessment is still limited to a formality tied to the pressure and burden on teachers to fulfill their administrative duties, and many classes rarely implement self-assessment at all (Brown & Harris, 2013). In this regard, the researchers saw a link between the infrequent implementation of self-assessment in the classroom and teachers' lack of conviction about the urgency of self-assessment itself. Teachers need a strong push, or a "strong why", showing that self-assessment is essential in learning and not just an empty appeal from the curriculum. Moreover, self-assessment improves academic performance (Yan et al., 2021) and contributes to higher learning achievement (McMillan, 2013; Brown & Harris, 2014). More specifically, self-assessment training has a positive effect on students' mathematical achievement, as shown statistically by Popelka (2015) and Baiduri (2022).
Learning achievement is still one of the benchmarks for the success of the learning process in the classroom. This achievement is easy to see through the various academic scores obtained by students, and it is this learning achievement that teachers report to parents and other relevant stakeholders. Several previous meta-analytic studies showed a close relationship between self-assessment and academic achievement or performance, with a median effect size between d = 0.40 and d = 0.45 (Brown & Harris, 2013; Panadero et al., 2014). Likewise, there have been studies on the relationship between self-assessment and mathematics learning achievement. Therefore, the researchers are interested in analyzing in more depth, through a meta-analysis, how strong the link between self-assessment and mathematics learning achievement is. Furthermore, the researchers conjecture that education level and region affect the relationship between self-assessment and achievement in learning mathematics. The researchers hope this article will add insight into self-assessment in mathematics education in general and will help readers, especially teachers, take a first step toward a deeper understanding of self-assessment in their mathematics classes.

RESEARCH METHOD
The study focuses on articles published in peer-reviewed journals to maintain the quality of the research methodology of the selected articles (Castro-Alonso et al., 2019). The researchers used Google Scholar as the search engine to find relevant articles with the keywords "correlation", "self-assessment", or equivalent terms such as "self-evaluation", "self-reflection", "self-regulated learning", "metacognitive" (Dent & Koenka, 2016; Youde, 2019), and "mathematics achievement".
Google Scholar is connected to various journal portals and journal indexing institutions, so searches can be conducted as widely as possible to find articles representing global conditions and to avoid bias (Arlinwibowo et al., 2022). The "cited by" feature in Google Scholar was also used to access related studies (Martin-Martin et al., 2017). The selected articles are in English to maintain transparency in the global community (Krieglstein et al., 2022). In addition, the selected articles were published in 2011-2022 to keep the information up to date. Afterward, articles were selected based on the inclusion criteria (Figure 1). This study uses a random-effects model. The model was chosen based on the characteristics of the studies collected as research data. The data analyzed indicated heterogeneity, as shown by the various levels of education, research locations (countries), and instruments used. We chose the random-effects analysis based on these characteristics. Furthermore, the value of I² confirms the appropriateness of the model selection. Data are said to be heterogeneous if I² > 25%.
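The heterogeneity check and random-effects pooling described above can be sketched as follows. This is a minimal illustration rather than the procedure used in this study: it assumes the DerSimonian-Laird estimator of the between-study variance (a simpler stand-in for the Restricted ML method noted under Table 2) and the Q-based definition of I². The function name `random_effects_summary` and its return keys are hypothetical.

```python
import math

def random_effects_summary(effects, std_errors):
    """Pool per-study effects (e.g. Fisher z values) with a
    DerSimonian-Laird random-effects model (assumed estimator)."""
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # I^2: share of total variation attributed to heterogeneity, in percent
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's sampling variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))

    return {"Q": q, "df": df, "I2": i2, "tau2": tau2,
            "pooled": pooled, "se": se_pooled}
```

For three hypothetical studies with effects 0.1, 0.5, and 0.9 and a standard error of 0.1 each, this gives Q of about 32 and I² of about 93.75%, well above the 25% threshold, so the random-effects estimate would be preferred.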
We re-examined the selected articles to ensure that the self-assessment context was in the mathematics subject area. Studies that report a correlation (r) were then selected for further analysis, with r as the effect size in the meta-analysis (Retnawati et al., 2018). Articles that do not report the correlation coefficient and the number of participants were excluded from the sample unless t or F statistics could still be obtained; these were converted into r using Formula (1), with t = √F, where n is the sample size. All included articles were reviewed to retrieve the necessary data. Each correlation was transformed using the Fisher z formula, as shown in Formula (2). Then, to obtain a summary effect, the analyzed data were converted back into a correlation as in Formula (3).
These correlations were used to read the final analysis results (Retnawati et al., 2018). The input data are z as the effect size and SEz as the standard error of the effect size. SEz can be determined in advance using Formula (4).
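Formulas (1) to (4) are not reproduced in this text, so the sketch below assumes the standard textbook forms of these conversions: r = t / √(t² + n − 2), z = ½ ln((1 + r) / (1 − r)), SEz = 1 / √(n − 3), and r = (e^(2z) − 1) / (e^(2z) + 1). The function names are hypothetical, and the F conversion is only meaningful for an F statistic with one numerator degree of freedom (it also cannot recover the sign of the correlation).

```python
import math

def r_from_t(t, n):
    """Correlation from a t statistic, assuming df = n - 2."""
    return t / math.sqrt(t ** 2 + n - 2)

def r_from_f(f, n):
    """Correlation from a one-df F statistic via t = sqrt(F).
    The result is always non-negative; the sign must come from the study."""
    return r_from_t(math.sqrt(f), n)

def fisher_z(r):
    """Fisher z transformation of a correlation (equivalent to atanh)."""
    return 0.5 * math.log((1 + r) / (1 - r))

def se_fisher_z(n):
    """Standard error of Fisher z for a sample of size n (requires n > 3)."""
    return 1.0 / math.sqrt(n - 3)

def r_from_z(z):
    """Back-transform a Fisher z value to a correlation (equivalent to tanh)."""
    return (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)
```

Each study would contribute one pair (`fisher_z(r)`, `se_fisher_z(n)`) to the analysis, and the pooled z would be passed through `r_from_z` to report the summary correlation.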
The data obtained above (z and SEz) were processed to produce a forest plot that summarizes each study's interval values and standard errors together with the overall conclusion (Arlinwibowo et al., 2022). In addition, to ensure the quality of the data and of the conclusions drawn from the analysis, the data were checked for indications of publication bias. The conclusions of a meta-analysis can be trusted only if they are free from publication bias. Indications of publication bias can be identified using the fail-safe N method. Under this method, if the fail-safe N value is greater than 5K + 10, it can be concluded that no publication bias problem exists, where K is the number of studies involved in the analysis.
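The fail-safe N criterion can be sketched as below. The sketch assumes the commonly cited form of Rosenthal's statistic, N = (Σz)² / z_α² − K, with z_α = 1.645 for a one-tailed α of .05, where each z is a study's standard normal deviate; statistical packages may implement variants, so this is not necessarily the computation behind the value reported in this study. The function names are hypothetical.

```python
def rosenthal_fail_safe_n(z_scores, z_alpha=1.645):
    """Number of unpublished null studies needed to make the combined
    result non-significant (assumed Rosenthal form)."""
    k = len(z_scores)
    return (sum(z_scores) ** 2) / (z_alpha ** 2) - k

def passes_fail_safe_criterion(fail_safe_n, k):
    """Apply the rule from the text: fail-safe N must exceed 5K + 10."""
    return fail_safe_n > 5 * k + 10
```

With K = 43 the threshold is 5(43) + 10 = 225, so a fail-safe N such as the 19623 reported in this study passes the criterion by a wide margin.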
Moderator variable analysis was carried out by examining how the descriptors of the selected studies covary with their effect sizes. In the moderator analysis, the study's effect size becomes the dependent variable, and the research descriptor (in this case, the study level and region) becomes the independent variable (Cooper et al., 2009). Dent and Koenka (2016) state that the variation within each group (study level and region) is tested using a within-group goodness-of-fit statistic (Qw). A significant Qw statistic indicates systematic variation among the 43 mean correlations. A significant between-groups goodness-of-fit statistic (Qb) indicates more variation among the mean correlations for the moderator categories than can be explained by sampling error alone. The Qb value can be obtained using Formula (5), Qb = Q − Qw, where Q is the Q value of the overall correlation and Qw is the total Q value of the members of the moderator group. The moderator analysis for student education level was carried out by categorizing students into four groups: Elementary School (ES), Junior High School (JHS), Senior High School (SHS), and Higher Education (College), so that Qw = QES + QJHS + QSHS + QCollege. The region is categorized into East (Eastern) and West (Western). This categorization relates to the finding that many instruments were developed in western countries, and that this turned out to matter when they were used with students in the eastern region (Tian et al., 2018; Fadlelmula et al., 2014).
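The partition of heterogeneity in the moderator analysis can be illustrated with a short sketch. It assumes the relation Qb = Q − Qw described above, with Qw formed by summing the within-group Q values. The function name `between_groups_q` is hypothetical, and because the per-level Q values are not reported in the text, the usage note passes the reported total Qw as a single entry.

```python
def between_groups_q(q_total, q_within_groups):
    """Split total heterogeneity Q into within-group and
    between-group parts: Qb = Q - Qw, with Qw = sum of group Q values."""
    q_within = sum(q_within_groups)
    q_between = q_total - q_within
    return q_between, q_within
```

Calling `between_groups_q(942.670, [586.76])` with the values reported in the findings returns a between-groups value of about 355.91, which matches the Qb reported for education level.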

Findings
A total of 43 studies analyzed in this research came from 12 articles that produced data on the relationship between self-assessment and students' mathematics achievement, as summarized in Table 1. Several articles produced more than one correlation because student achievement tests were differentiated into several types, for example, midterm tests (which many studies call periodic tests) and end-of-semester tests (Baiduri, 2022; Zimmerman et al., 2011). Some studies differentiate between male and female participants (Altun & Elden, 2013). In addition, several studies report t values (Cabedo & Maset-Laudes, 2019) or F values (Lai & Hwang, 2016). A correlation of r = 0.5 indicates a strong relationship, r = 0.3 a moderate relationship, and r = 0.1 a weak one (Li, 2022).
This study analyzes studies that produced data on the relationship between self-assessment and mathematics achievement at various levels of education. In these studies, self-assessment was measured through various existing instruments (MSLQ, MKMQ, Jr. MAI, MSA) and through instruments developed by the researchers with categories related to self-assessment (Rosário et al., 2013). These instruments often relate to students' self-regulated and metacognitive learning (Youde, 2019), while mathematics learning achievement in this study covers learning outcomes for mathematics material (in various domains) in specific periods (Ahmed et al., 2013) or special mathematics tests (namely MAT), as in the research by Özsoy (2011).
From the various studies collected and selected, each effect size and its standard error were analyzed and then processed. The heterogeneity test results obtained are presented in Table 3. Table 2 shows that the 43 study effect sizes analyzed were heterogeneous (Q = 942.670; p < 0.001). In addition, Arlinwibowo et al. (2021) state that one method for testing heterogeneity is I², which expresses the proportion of the variation in the summary effect size on a scale of 0% to 100%. Table 3 shows a value of I² = 95.344% > 25%, so it can be said that there is heterogeneity. Thus, the random-effects model is more suitable for estimating the average effect size of the 43 studies analyzed. The analysis results also indicate the potential to investigate moderator variables influencing the relationship between self-assessment and students' mathematics achievement.

Table 5. File drawer analysis (Rosenthal, 1979): fail-safe N = 19623.000; target significance = 0.050; observed significance < .001.

The results of the analysis using the random-effects model (see Table 4) show a significant positive correlation between self-assessment and students' mathematics achievement (z = 8.814; p < 0.001; 95% CI [0.229; 0.360]). The effect of self-assessment on students' mathematics learning achievement is in the low category, with r = 0.295 (Li, 2022). The effect sizes and the summary effect size of this study are illustrated in the forest plot in Figure 2. It shows that the effect sizes of the selected studies range from -0.20 to 1.09.
Publication bias in this meta-analysis was assessed following Rosenthal (1979), namely by using file drawer analysis. Because K = 43, 5K + 10 = 5(43) + 10 = 225. The fail-safe N value obtained was 19623 (see Table 5), with a target significance level of 0.05 and p < 0.001. Because the fail-safe N value is greater than 5K + 10, it can be concluded that there is no publication bias problem in this meta-analysis.
The moderator variable analysis was carried out by determining each independent sample's mean weighted correlation. All 43 correlations contributed to the overall correlation between self-assessment and mathematics learning achievement. The mean weighted correlation under the random-effects model is significant (Qw = 586.76; p < 0.001). For the education level variable, significant differences were found in the average effect sizes across the ES, JHS, SHS, and college levels (Qb = 355.91; p = 0.00). However, based on the follow-up (post hoc) analysis, the differences lie only between the ES-JHS, ES-SHS, and college-SHS levels (see Figure 3).

Figure 3. Education Level Descriptive Plot
For the regional variable, as shown in Figure 4, there was a significant difference in the average effect size between the eastern and western regions (Qb = 202.548; p = 0.00). However, the follow-up (post hoc) analysis showed no significant difference between the eastern and western regions.

Discussion
A well-designed self-assessment can support improvement in student learning performance (Brown & Harris, 2013). It must also be linked with instruction so that students learn and perceive that they are making progress (Schunk, 1996). Assessment practices that are tightly integrated into daily instruction focus teacher attention on the objectives to be measured and give teachers more useful information for improving students' performance in mathematics (Ross et al., 2002). Through self-assessment, students can carry out meta-tasks to develop their meta-competencies (self-efficacy, self-confidence, motivation). They can encourage themselves to study mathematics to a higher level of understanding (Beumann & Wegner, 2018). The advantages of self-assessment in supporting students' learning achievement are closely tied to the relationship between self-assessment and self-regulated learning (Andrade, 2019; Yan et al., 2021). As previously found, self-regulated learning (SRL) interventions had a good impact on at-risk urban technical college students and positively affected their career paths, based on a mathematics performance benchmark analysis. However, 36% of the SRL group still did not meet the benchmark, so future research is needed on this important group of students (Zimmerman et al., 2011).
The strongest relationship between self-assessment and mathematics learning achievement appears for the self-regulated approach, with r = 0.80. This finding is in line with the meta-analytic study by Panadero et al. (2017), which found that self-assessment interventions had a positive effect on students' SRL strategies and self-efficacy. This indicates the need to use self-assessment within the framework of a self-regulated approach in the classroom to encourage students' mathematics learning achievement. Schunk (1996) states that students who feel good about their self-evaluations are driven to keep working hard and feel effective about their learning. If students feel they are capable of achieving but that their current technique is unsuccessful, low self-judgments of progress and negative self-reactions will not always impair self-efficacy and drive. These students may change their self-regulatory processes by exerting more effort, persevering longer, choosing an approach they think is superior, or asking their classmates and instructors for assistance. Such students can be easily stimulated when learning in a self-regulation scenario. In this situation, teachers who plan to conduct in-class activities could pay more attention to students with lower self-regulation. These students need serious assistance to improve their learning before class starts so that they are better prepared to participate in class (Lai & Hwang, 2016). Their learning performance may then improve, and they could become more engaged in mathematics class activities. This engagement is needed in a self-regulated learning approach to raise their mathematics learning achievement.
Meanwhile, the lowest r-value (r = -0.2) shows a negative correlation between metacognitive knowledge (MK) of self (difficulty/lack of fluency) and mathematics achievement. This negative correlation indicates that the measurement of MK of self (difficulty/lack of fluency) and MK of tasks in the MKMQ instrument may have limitations (Fadlelmula et al., 2014) or may not be appropriate for student participants from China because "difficulty" is perceived differently by these students. Students in China are used to being drilled on difficult questions. A question classified as complex in the context of the MKMQ may be easy for Chinese students (Tian et al., 2018). In other words, the related instrument items need a more thorough review before they are used with students whose context differs from the origin of the MKMQ instrument (Europe). In addition, Fadlelmula et al. (2014) found that assessing students' learning strategies through self-report instruments, such as the MSLQ, may have limitations for interpreting their actual implementation of these strategies. However, in the follow-up test of the moderator analysis, there was no significant difference in the relationship between self-assessment and mathematics achievement between the two regions (eastern and western). Thus, research is needed on the effects on mathematics learning achievement both of self-assessment instruments adapted from the West for use in eastern cultural contexts and of those instruments in their original western form.
A low correlation between self-assessment and students' mathematics learning achievement was found for the "metacognitive student-form" in the context of elementary school self-assessment. Students' age is likely one of the factors behind this result, along with the difficulty of applying on-line measurement instruments such as think-aloud protocols or interviews in crowded classrooms, which are common in Turkey (Özcan, 2014). However, Özcan (2014) also found a strong correlation between the "metacognitive teacher-form" and students' mathematics achievement (r = .78). He adapted the instrument developed by Desoete (2007), namely the Teacher Evaluation Form of Students' Metacognitive Abilities in Mathematics, which is a mathematics domain-specific measurement instrument.
The study by Özsoy (2011) also used an adapted version of the late Desoete's instrument, namely the MSA (Metacognitive Skills and Knowledge Assessment) (Desoete et al., 2001). The instrument's inventory was adapted to Turkish in previous studies in 2009 (Özsoy & Ataman, 2009; Özsoy et al., 2009). He found a high-level, positive, and reasonable relation between mathematics achievement and metacognitive skills (r = .648). In this research, the moderator analysis also showed significant differences in the relationship between self-assessment and achievement in learning mathematics at the elementary school level. Therefore, self-assessment by students of elementary school age needs to be combined with assessment from teachers, considering that self-assessment skills at that age still need to be developed (Nitko & Brookhart, 2014).

CONCLUSION
The selection of the random-effects model in this meta-analysis is appropriate because the characteristics of the sample show heterogeneous data, which an I² value of more than 25% confirms. This meta-analysis shows a positive correlation of 0.295 between self-assessment and mathematics learning achievement. The meta-analysis results are free from publication bias, so the information obtained from the total effect size can be trusted. Even so, self-assessment must be designed properly to provide the optimal effect. The instrument design must be adjusted to the specific context of the students. It must also fit the approach to learning mathematics in the classroom so that the effects of self-assessment on students are sustained. In this context, self-regulated learning is one approach that correlates strongly with mathematics learning achievement. In addition, the implementation of self-assessment needs to take the students' age into account. For students of elementary school age, for example, self-assessment needs to be balanced by an assessment from the teacher or supported by parental assistance.

Table 2. Heterogeneity Residual Test
Note. The model was estimated using the Restricted ML method.

Table 4. Summary Effect/Average Effect Size