Research Interests
I study how measurement theory and psychometrics can guide the assessment of both humans (e.g., psychological traits, knowledge, and skills) and machines (e.g., AI model capabilities). My research identifies practical questions in these domains that existing statistical tools cannot adequately address, and I develop new statistical tools that address them better.
My current research interests include:
- Developing measurement theory and psychometric tools for unstructured test response data (e.g., process data, constructed responses);
- Adapting LLMs to support development of measurement theory-grounded, evidence-centered assessments for learning (e.g., diagnostic assessments, simulation-based tasks);
- Developing measurement theory and new psychometric tools for AI model evaluation and benchmark design.
Education
Quantitative Psychology, Ph.D., University of Illinois Urbana-Champaign
Applied Mathematics, MS, University of Illinois Urbana-Champaign
Psychology, BA, Bryn Mawr College
Mathematics, BA, Haverford College
Grants
IES R324P230002 (co-PI): Analysis of NAEP Mathematics Process, Outcome, and Survey Data to Understand Test-Taking Behavior and Mathematics Performance of Learners with Disabilities
AERA NSF 112057 (PI): Revision and Review Behavior in Large-Scale Computer-Based Assessments: An Analysis of NAEP Mathematics Process Data
Schmidt Sciences Foundation AI Safety Science Grant (co-PI): Creating Effective Benchmarks for LLMs with Human AI Collaboration
UIUC Campus Research Board Research Support Grant (PI): Improving Ability Estimation via Flexible Copula-Based Modeling of Item Responses and Response Times
Awards and Honors
Alicia Cascallar Award (NCME, 2022)
Excellent Reviewer Award (JEBS, 2020, 2023, 2024)
UIUC List of Teachers Ranked as Excellent by Students (SP 2021, FA 2022, FA 2023, SP 2024, SP 2025)
UIUC LAS Lincoln Excellence for Assistant Professors (LEAP) Scholar (2024–2026)
Courses Taught
- PSYC 490: Measurement and Test Development Lab
- STAT 428: Statistical Computing
- PSYC 593: Statistical Learning for Behavioral Data
- STAT 410: Statistics and Probability II
- NCME ITEMS Digital Module: Process Data
Additional Campus Affiliations
Associate Professor, Psychology
Associate Professor, Statistics
Highlighted Publications
Zhang, S., Wang, Z., Qi, J., Liu, J., & Ying, Z. (2023). Accurate Assessment via Process Data. Psychometrika, 88(1), 76–97. https://doi.org/10.1007/s11336-022-09880-8
Kwon, S., & Zhang, S. (2025). Explaining Performance Gaps with Problem-Solving Process Data via Latent Class Mediation Analysis. Psychometrika, 90(5), 1622–1650. https://doi.org/10.1017/psy.2025.10038
Zhang, S., Liu, J., & Ying, Z. (2023). Statistical Applications to Cognitive Diagnostic Testing. Annual Review of Statistics and Its Application, 10, 651–675. https://doi.org/10.1146/annurev-statistics-033021-111803
Jiang, H., Zhang, S., Zhu, D., Bai, Y., Truong, S. T., Yi, X., Koyejo, S., Xie, X., & Xiao, Z. (2026). AI evaluation should require standardized item-level data releases. arXiv [cs.AI].
Xiao, Z., Zhang, S., Lai, V., & Liao, Q. V. (2023). Evaluating Evaluation Metrics: A Framework for Analyzing NLG Evaluation Metrics using Measurement Theory. In H. Bouamor, J. Pino, & K. Bali (Eds.), EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 10967–10982). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.676
Recent Publications
Chen, L., Zhang, S., & Liu, J. (2026). Reducing Differential Item Functioning via Process Data. Psychometrika, 91(1), 95–130. https://doi.org/10.1017/psy.2025.10072
Fang, L., Wang, S., Chen, Y., Zhang, S., Liu, Z., & Zhong, W. (2026). Analyzing Complex Educational Data: A Data Analytic Framework for Integrating Structured and Unstructured Eye-Tracking Data. Psychometrika. Advance online publication. https://doi.org/10.1017/psy.2026.10096
Jiang, H., Zhang, S., Zhu, D., Bai, Y., Truong, S. T., Yi, X., Koyejo, S., Xie, X., & Xiao, Z. (2026). AI evaluation should require standardized item-level data releases. arXiv [cs.AI].
Zhang, S., He, Q., & Kwon, S. (2026). Digital Module 41: Process Data. Educational Measurement: Issues and Practice, 45(2), e70030. https://doi.org/10.1111/emip.70030
Domingue, B. W., Braginsky, M., Caffrey-Maffei, L., Gilbert, J. B., Kanopka, K., Kapoor, R., Lee, H., Liu, Y., Nadela, S., Pan, G., Zhang, L., Zhang, S., & Frank, M. C. (2025). An introduction to the Item Response Warehouse (IRW): A resource for enhancing data usage in psychometrics. Behavior Research Methods, 57(10), Article 276. https://doi.org/10.3758/s13428-025-02796-y