Objectives: To estimate the intra- and interobserver reproducibility and reliability of assessing color content in adnexal masses at color/power Doppler ultrasound examination among observers with different levels of experience, and to determine whether they change after a consensus meeting.
Methods: Digital clips with color/power Doppler information from 103 adnexal masses were evaluated independently four times, twice before and twice after a consensus meeting, by four experienced and three less experienced ultrasound examiners. The color content of each adnexal mass was estimated using the International Ovarian Tumor Analysis color score and a 100 mm visual analogue scale (VAS score). Intraobserver repeatability was estimated for each observer. Interobserver agreement was estimated for the four most experienced observers (six pairs), for the three less experienced observers (three pairs), and for four other pairs of observers, each consisting of one experienced and one less experienced observer.
Results: Intra- and interobserver agreement for the color score was moderate to very good, with percentage agreement ranging from 48% to 82.5% (kappa 0.52-0.82) before and from 59% to 90% (kappa 0.60-0.88) after the consensus meeting. Interobserver agreement improved after the consensus meeting for seven of 13 pairs of observers. Intraobserver intraclass correlation coefficient (ICC) values for the VAS score ranged from 0.80 to 0.92 before and from 0.75 to 0.94 after the consensus meeting, but limits of agreement were wide (±20-35 mm). For six of the seven observers, ICC values were higher after the consensus meeting than before. Interobserver ICC values for the VAS score ranged from 0.77 to 0.88 before and from 0.77 to 0.91 after the consensus meeting, but limits of agreement were wide (±30-40 mm). ICC values improved after the consensus meeting for ten of 13 pairs of observers.
Conclusions: Intra- and interobserver agreement for the color score was good, especially after the consensus meeting, but there is room for improvement. VAS score results varied substantially within and between observers both before and after the consensus meeting. General consensus needs to be reached about how to interpret color/power Doppler ultrasound findings in adnexal masses.
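For readers unfamiliar with the agreement statistics reported above, the following is a minimal sketch of how percentage agreement, kappa, and limits of agreement can be computed for one pair of ratings. It rests on several assumptions not stated in the abstract: kappa is taken to be unweighted Cohen's kappa (the study may have used a weighted variant), limits of agreement are taken to be the Bland-Altman form (mean difference ± 1.96 standard deviations of the differences), and all function names and ratings are hypothetical illustrations, not study data.

```python
from collections import Counter
from statistics import mean, stdev


def percent_agreement(a, b):
    """Percentage of cases in which the two ratings are identical."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa (assumed variant) for two rating lists."""
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the marginal proportions per category.
    p_exp = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_obs - p_exp) / (1.0 - p_exp)


def limits_of_agreement(x, y):
    """Bland-Altman limits (assumed form): mean difference +/- 1.96 SD."""
    diffs = [p - q for p, q in zip(x, y)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample standard deviation
    return bias - half_width, bias + half_width


# Hypothetical color scores (1-4) from two readings of eight masses.
color_first = [1, 2, 2, 3, 4, 4, 1, 3]
color_second = [1, 2, 3, 3, 4, 3, 1, 3]

# Hypothetical VAS scores (mm) from two readings of four masses.
vas_first = [10, 35, 50, 80]
vas_second = [15, 30, 60, 70]

agreement = percent_agreement(color_first, color_second)
kappa = cohen_kappa(color_first, color_second)
lower, upper = limits_of_agreement(vas_first, vas_second)

print(f"agreement = {agreement:.1f}%, kappa = {kappa:.2f}")
print(f"limits of agreement = {lower:.1f} to {upper:.1f} mm")
```

The sketch covers a single pair of readings; the study repeated such comparisons within each observer and across 13 observer pairs. The ICC is omitted here because the abstract does not specify which ICC model was used.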