A training methodology was established to optimize the reliability of outcome measures in the nusinersen clinical trials. The Children's Hospital of Philadelphia Infant Test of Neuromuscular Disorders (CHOP INTEND), Hammersmith Functional Motor Scale Expanded (HFMSE), and Revised Upper Limb Module (RULM) were primary or secondary outcomes.
Video review, quarterly conference calls, and item scoring checks supported evaluator competence. Screening and baseline assessments, along with video review, established intra- and inter-rater reliability.
Inter- and intra-rater reliability were both excellent. Intraclass correlation coefficients (ICCs) ranged from 0.906 to 0.994 across initial training meetings and from 0.824 to 0.996 across annual retraining meetings. Values were similarly high for each scale: CHOP INTEND (ICC = 0.824-0.951), HFMSE (ICC = 0.981-0.996), and RULM (ICC = 0.966-0.990). Intra-rater reliability for the CHOP INTEND, HFMSE, and RULM was ICC = 0.895 (95% CI: 0.852-0.926; n = 116), ICC = 0.959 (95% CI: 0.942-0.971; n = 125), and ICC = 0.948 (95% CI: 0.927-0.963; n = 126), respectively.
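For readers unfamiliar with the statistic, the following is a minimal sketch of how an ICC can be computed from a subjects-by-raters table of scores. The text above does not state which ICC form was used, so the sketch assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc_2_1` and the example scores are hypothetical and are not taken from the trials.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per subject and one column per rater.
    Assumption: this ICC form is chosen for illustration only.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares for subjects, raters, and residual error.
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores: 5 subjects (rows) rated by 3 evaluators (columns).
scores = [
    [40, 41, 39],
    [22, 24, 23],
    [55, 54, 56],
    [31, 30, 33],
    [47, 48, 46],
]
print(round(icc_2_1(scores), 3))
```

Values close to 1 indicate that most of the variance in scores comes from differences between subjects rather than from differences between raters or residual error. The same calculation can be applied to intra-rater reliability by treating repeated scorings from one evaluator as the columns.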
Rigorous evaluator training ensures reliable assessment of subjects with spinal muscular atrophy (SMA) in multicenter international trials.