Applying the Rasch Rating Scale Model (RSM) to investigate the rating scales function in survey research instrument

Jimmy Chong, Siti Eshah Mokshein, Ramlee Mustapha


The functionality and structural quality of a rating scale are vital in a survey instrument. This article discusses the underlying assumption of a rating scale and applies the Rasch Rating Scale Model (RSM) to diagnose the rating scale structure of a survey instrument. The instrument used to demonstrate the process of diagnosing rating scale functioning in this study was the Vocational Teachers’ Assessment Literacy (VoTAL) instrument. The VoTAL instrument used a five-point Likert-type response format and consisted of 88 items representing three assessment literacy constructs. The data were obtained from 224 vocational teachers at five vocational colleges, which were randomly selected from the state of Selangor and the federal territory of Kuala Lumpur in Malaysia. The rating scale diagnosis showed that the initial five-category rating scale did not function as intended. The Andrich threshold of category three was disordered, as it did not advance monotonically with the categories. Additionally, the distance between the thresholds of category three and category four was too narrow (0.66 logits). The results indicate that the rating scale used in the VoTAL instrument was disordered. Thus, it was suggested that the initial five rating categories be collapsed into four categories. In conclusion, this study highlights the importance of assessing the functionality of the rating scale in a survey instrument to reduce measurement error and collect valid and reliable data.
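The diagnosis described above can be illustrated with a minimal Python sketch. It is not the authors' procedure or the output of their software. It assumes the standard Andrich rating scale formulation, in which the log-odds of choosing category k over category k-1 equal the person measure minus the item difficulty minus the k-th threshold. The function names, the threshold values, the minimum spacing of 1.0 logit (a commonly cited rule of thumb for five-category scales, passed here as a parameter) and the recoding map are all hypothetical. Only the 0.66-logit gap is taken from the abstract, and the abstract does not say which two categories were merged.

```python
import math


def rsm_category_probs(theta, delta, thresholds):
    """Category probabilities under the Andrich rating scale model.

    theta: person measure, delta: item difficulty, thresholds: Andrich
    thresholds tau_1..tau_m (all in logits) for a scale with m + 1 categories.
    """
    # Numerator exponent for category k is the sum of (theta - delta - tau_j)
    # over j = 1..k; the lowest category has an exponent of 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - delta - tau))
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in exponents]
    total = sum(weights)
    return [weight / total for weight in weights]


def diagnose_thresholds(thresholds, min_gap=1.0):
    """Flag adjacent threshold pairs that are disordered or too close.

    Pairs are reported as (k, k + 1), where k is the 1-based threshold number.
    min_gap is an assumed guideline value, not a figure from the study.
    """
    disordered, narrow = [], []
    for k, (lower, upper) in enumerate(zip(thresholds, thresholds[1:]), start=1):
        gap = upper - lower
        if gap <= 0:
            disordered.append((k, k + 1))  # threshold fails to advance
        elif gap < min_gap:
            narrow.append((k, k + 1))  # advances, but by too little
    return {"disordered": disordered, "narrow": narrow}


def collapse(responses, recode):
    """Recode raw responses into a shorter scale using a category mapping."""
    return [recode[response] for response in responses]


# Hypothetical thresholds; the second gap is 0.66 logits, as in the abstract.
hypothetical_taus = [-2.10, -0.40, 0.26, 2.24]
report = diagnose_thresholds(hypothetical_taus)

# Hypothetical recoding that merges original categories 3 and 4.
merge_3_and_4 = {1: 1, 2: 2, 3: 3, 4: 3, 5: 4}
recoded = collapse([1, 2, 3, 4, 5], merge_3_and_4)
```

After recoding, the model would be re-estimated on the collapsed data and the thresholds checked again, since merging categories changes every threshold estimate.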


Rasch measurement model; rating scales; rating scales function; survey instrument









Jurnal Cakrawala Pendidikan by Lembaga Pengembangan dan Penjaminan Mutu Pendidikan UNY is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
