Department of Biomedical Sciences for Health, Università degli Studi di Milano, Via Luigi Mangiagalli 31, 20133 Milano, Italy
Unit of Radiology, IRCCS Policlinico San Donato, Via Rodolfo Morandi 30, 20097 San Donato Milanese, Italy
Chest x-ray had an 89 % sensitivity in detecting COVID-19 pneumonia during the pandemic peak.
Experienced radiologists had higher specificity than less-experienced ones.
Overall and per-group sensitivity in detecting COVID-19 pneumonia increased over time.
Overall and per-group accuracy in detecting COVID-19 pneumonia increased over time.
To report real-world diagnostic performance of chest x-ray (CXR) readings during the COVID-19 pandemic.
In this retrospective observational study we enrolled all patients presenting to the emergency department of a Milan-based university hospital from February 24th to April 8th 2020 who underwent nasopharyngeal swab for reverse transcriptase-polymerase chain reaction (RT-PCR) and anteroposterior bedside CXR within 12 h. A composite reference standard combining RT-PCR results with phone-call-based anamnesis was obtained. Radiologists were grouped by CXR reading experience (Group-1, ≥10 years; Group-2, <10 years); diagnostic performance indexes were calculated for each radiologist and for the two groups.
Group-1 read 435 CXRs (77.0 % disease prevalence): sensitivity was 89.0 %, specificity 66.0 %, accuracy 83.7 %. Group-2 read 100 CXRs (73.0 % prevalence): sensitivity was 89.0 %, specificity 40.7 %, accuracy 76.0 %. During the first half of the outbreak (195 CXRs, 66.7 % disease prevalence), overall sensitivity was 80.8 %, specificity 67.7 %, accuracy 76.4 %; Group-1 had sensitivity similar to Group-2 (80.6 % versus 81.5 %, respectively) but higher specificity (74.0 % versus 46.7 %) and accuracy (78.4 % versus 69.0 %). During the second half (340 CXRs, 81.8 % prevalence), overall sensitivity increased to 92.8 %, specificity dropped to 53.2 %, and accuracy increased to 85.6 %. This pattern was mirrored in both groups, with decreased specificity (Group-1, 58.0 %; Group-2, 33.3 %) but increased sensitivity (92.7 % and 93.5 %) and accuracy (86.5 % and 81.0 %, respectively).
Real-world CXR diagnostic performance during the COVID-19 pandemic showed overall high sensitivity, with higher specificity for more experienced radiologists. The increase in accuracy over time strengthens the role of CXR as a first-line examination in suspected COVID-19 patients.
] have repeatedly stated that the diagnosis of SARS-CoV-2 infection should primarily rely on viral testing rather than on chest imaging.
This endorsed reference standard, i.e. reverse transcriptase-polymerase chain reaction (RT-PCR) on nasal or throat swabs, has become essential in the triage and monitoring phases of patients with suspected SARS-CoV-2 infection [
]. Moreover, during the pandemic peak, RT-PCR response times often became incompatible with appropriate triaging and management of the high number of suspected COVID-19 cases simultaneously presenting to emergency departments [
]. The two largest by far are a retrospective review by a single radiologist of 518 CXRs acquired during the first phase of the pandemic peak (from March 1st to March 15th) – with a resulting overall sensitivity of 57 % [
]. In our analysis we instead considered the dichotomized reports of all radiologists on duty during a longer period (i.e., from February 24th to April 8th, 2020), obtaining an overall 89.0 % sensitivity and 60.6 % specificity, using a composite reference standard (RT-PCR supplemented by anamnestic data and patient follow-up, as well as by RT-PCR repetition in negative cases). We now aim to further analyse the radiologists’ real-world performance in CXR reading during the COVID-19 pandemic, stratifying them according to their CXR reading experience.
2. Materials and methods
This retrospective observational study was approved by the local Ethics Committee and performed between February 24th and April 8th, 2020, at IRCCS Policlinico San Donato (San Donato Milanese, Italy), a university hospital mainly focusing on cardiovascular diseases but promptly converted to a primarily COVID-19-dedicated hospital during the pandemic peak.
We included in this study all patients presenting to our emergency department for suspected SARS-CoV-2 infection who underwent both a nasopharyngeal swab for RT-PCR and an anteroposterior bedside CXR within 12 h of admission. At our hospital, CXRs are reported by the on-duty radiologist within about 60–90 min if performed during the day shift (07:00 am – 08:00 pm), and at the beginning of the following working day if performed during the night shift (08:00 pm – 07:00 am). Given the delay in the availability of RT-PCR results, caused by the high number of patients continuously presenting to the emergency department during the pandemic peak in our region, all CXRs in the study period were reported by radiologists who were necessarily blinded to RT-PCR results.
For the purposes of this study, as previously described [
], we then built a composite reference standard to improve RT-PCR sensitivity, by combining RT-PCR results with a phone-call-based complete anamnesis in RT-PCR-negative patients who had not repeated the swab during hospitalization. Considering the rather nonspecific nature of CXR findings in patients with COVID-19 pneumonia, a radiologist with 5 years of experience in CXR interpretation (S.S.) reviewed all routine CXR reports, blinded to the original radiologists’ signatures, in order to classify them dichotomously as positive or negative for COVID-19. The absence of pulmonary abnormalities on a CXR determined its classification as negative, while the presence of interstitial infiltrates, with or without associated alveolar infiltrates, with predominantly bilateral and basal distribution implied its classification as positive [
]. Conversely, CXR findings unrelated to COVID-19, such as lobar alveolar infiltrates (typically associated with bacterial pneumonia), pleural effusion, and pneumothorax, were considered non-COVID-19-related findings for the purpose of this dichotomization.
We grouped the seven radiologists from our department by their CXR reading experience: Group 1 included 4 radiologists (R1, R2, R3, and R4) with 10 or more years of experience in CXR reading; Group 2 included 3 radiologists (R5, R6, and R7) with less than 10 years of experience in CXR reading. All radiologists were board-certified: if a resident was in charge of drafting a first version of the report, the report was always checked by a board-certified radiologist, who also signed the final version. Only one of the seven radiologists (in Group 1) is particularly dedicated to breast imaging, but he practices as a general radiologist for at least half of his time. Overall and patient-sex-specific diagnostic performance indexes were calculated for each radiologist and for the two groups over the 6-week timeframe and according to the first and second half of all CXRs read by each radiologist. Data are presented as sensitivity, specificity, positive predictive value, negative predictive value, accuracy, positive likelihood ratio, negative likelihood ratio, and their 95 % confidence intervals (CI). Statistical analyses were performed using Microsoft Excel 2019 (Microsoft Corporation, Redmond, WA, USA).
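The indexes listed above all derive from a 2×2 table of dichotomized CXR reports against the reference standard. The following is a minimal Python sketch of those definitions, not the authors' Excel workflow: the function names and the example counts are hypothetical, and the Wilson score interval is an assumed choice, since the text does not state which confidence-interval method was used.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method, not stated in the text)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def diagnostic_indexes(tp, fn, fp, tn):
    """Point estimates from a 2x2 table of CXR calls versus the reference standard."""
    sens = tp / (tp + fn)  # positives correctly called positive
    spec = tn / (tn + fp)  # negatives correctly called negative
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "lr_pos": sens / (1 - spec),  # raises ZeroDivisionError if specificity is 1
        "lr_neg": (1 - sens) / spec,  # raises ZeroDivisionError if specificity is 0
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = diagnostic_indexes(tp=90, fn=10, fp=20, tn=80)
sens_low, sens_high = wilson_ci(90, 100)
print(example)
print(f"sensitivity 95 % CI: {sens_low:.3f} to {sens_high:.3f}")
```

Sensitivity and specificity are each a simple proportion, so the same interval function can be applied to predictive values and accuracy by passing the matching numerator and denominator. Confidence intervals for likelihood ratios need a different (log-based) method and are left out of this sketch.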
In the six-week study period, R1 read 180 CXRs, with a 79 % disease prevalence, R2 read 147 CXRs with a 70 % disease prevalence, R3 read 65 CXRs with an 80 % disease prevalence, and R4 read 43 CXRs with an 88 % disease prevalence. Overall, readers from Group 1 read 435 CXRs with a 77.0 % disease prevalence, obtaining an 89.0 % sensitivity (95 % CI 85.2 %–91.9 %), a 66.0 % specificity (95 % CI 56.3 %–74.5 %), an 83.7 % accuracy (95 % CI 79.9 %–86.9 %), an 89.8 % positive predictive value (95 % CI 86.0 %–92.6 %), a 64.1 % negative predictive value (95 % CI 54.5 %–72.7 %), a 2.62 positive likelihood ratio (95 % CI 1.99–3.45), and a 0.17 negative likelihood ratio (95 % CI 0.12–0.23). In Group 2, R5 read 59 CXRs with a 78 % disease prevalence, R6 read 27 CXRs with a 70 % disease prevalence, R7 read 14 CXRs with a 57 % disease prevalence; overall, readers from Group 2 read 100 CXRs with a 73.0 % disease prevalence, obtaining an 89.0 % sensitivity (95 % CI 79.8 %–94.3 %), a 40.7 % specificity (95 % CI 24.5 %–61.0 %), a 76.0 % accuracy (95 % CI 66.8 %–83.3 %), an 80.2 % positive predictive value (95 % CI 70.3 %–87.5 %), a 57.9 % negative predictive value (95 % CI 36.3 %–76.9 %), a 1.50 positive likelihood ratio (95 % CI 1.09–2.08), and a 0.27 negative likelihood ratio (95 % CI 0.12–0.60). Fig. 1 shows an example of a true positive and of a false positive case for both Group 1 and Group 2, Table 1 details the overall performance indexes of all readers, and Table 2 shows the results of the readers’ performance evaluation according to patient subgroups and different timeframes (i.e. the first and second three-week periods).
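As a consistency check on the Group 1 figures above, the underlying 2×2 counts can be back-calculated from the reported sample size, prevalence, sensitivity, and specificity. The sketch below is an inference from the rounded summary values, not data taken from Table 1; the helper name `counts_from_summary` is hypothetical, and rounding means the reconstruction is only the most plausible integer solution.

```python
def counts_from_summary(n, prevalence, sensitivity, specificity):
    """Back-calculate (TP, FN, FP, TN) from rounded summary statistics."""
    diseased = round(n * prevalence)       # reference-standard positives
    healthy = n - diseased                 # reference-standard negatives
    tp = round(diseased * sensitivity)
    tn = round(healthy * specificity)
    return tp, diseased - tp, healthy - tn, tn


# Group 1 summary values as reported in the text.
tp, fn, fp, tn = counts_from_summary(435, 0.770, 0.890, 0.660)

# Indexes recomputed from the inferred counts.
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fn + fp + tn)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
lr_pos = sensitivity / (1 - specificity)
lr_neg = (1 - sensitivity) / specificity

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"accuracy={accuracy:.1%} PPV={ppv:.1%} NPV={npv:.1%}")
print(f"LR+={lr_pos:.2f} LR-={lr_neg:.2f}")
```

Under these inferred counts (298 true positives, 37 false negatives, 34 false positives, 66 true negatives), the recomputed accuracy, predictive values, and likelihood ratios agree with the reported Group 1 values at the precision shown in the text.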
Table 1. Diagnostic performance indexes for chest x-ray reading for each radiologist and for the two experience-tiered groups.
Considering the first half and the second half of all CXRs read by each radiologist, we observed an increase in disease prevalence for 5 out of 7 readers: disease prevalence in the CXR subset increased from 77 % to 81 % for R1, from 64 % to 77 % for R2, from 86 % to 90 % for R4, from 70 % to 86 % for R5, and from 64 % to 77 % for R6, while it decreased from 85 % to 75 % for R3 and from 71 % to 43 % for R7. In the first half of all their reported CXRs, Group 1 readers attained an 87.2 % sensitivity (95 % CI 81.2 %–91.5 %), a 71.4 % specificity (95 % CI 58.5 %–81.6 %), an 83.2 % accuracy (95 % CI 77.7 %–87.5 %), an 89.9 % positive predictive value (95 % CI 84.3 %–93.7 %), a 65.6 % negative predictive value (95 % CI 53.0 %–76.3 %), a 3.05 positive likelihood ratio (95 % CI 2.01–4.64), and a 0.18 negative likelihood ratio (95 % CI 0.12–0.28). In the second half they reached a 90.6 % sensitivity (95 % CI 85.3 %–94.2 %), a 59.1 % specificity (95 % CI 44.4 %–72.3 %), an 84.2 % accuracy (95 % CI 78.7 %–88.5 %), an 89.6 % positive predictive value (95 % CI 84.2 %–93.3 %), a 61.9 % negative predictive value (95 % CI 46.8 %–75.0 %), a 2.22 positive likelihood ratio (95 % CI 1.55–3.17), and a 0.16 negative likelihood ratio (95 % CI 0.09–0.27).
Conversely, in the first half of all their reported CXRs, Group 2 readers had an 82.9 % sensitivity (95 % CI 67.3 %–91.9 %), a 43.8 % specificity (95 % CI 23.1 %–66.8 %), a 70.6 % accuracy (95 % CI 57.0 %–81.3 %), a 76.3 % positive predictive value (95 % CI 60.8 %–87.0 %), a 53.8 % negative predictive value (95 % CI 29.1 %–76.8 %), a 1.47 positive likelihood ratio (95 % CI 0.93–2.33), and a 0.39 negative likelihood ratio (95 % CI 0.16–0.98). In the second half they showed a 94.7 % sensitivity (95 % CI 82.7 %–98.5 %), a 36.4 % specificity (95 % CI 15.2 %–64.6 %), an 81.6 % accuracy (95 % CI 68.6 %–90.0 %), an 83.7 % positive predictive value (95 % CI 70.0 %–91.9 %), a 66.7 % negative predictive value (95 % CI 30.0 %–90.3 %), a 1.49 positive likelihood ratio (95 % CI 0.95–2.34), and a 0.14 negative likelihood ratio (95 % CI 0.03–0.69). Table 3 details performance indexes both overall and for each reader in the first and second half of their CXR subset; sensitivity, specificity, and accuracy are also plotted in Figs. 2, 3 and 4, respectively.
Table 3. Different diagnostic performance indexes for chest x-ray reading between the first and second half of interpreted chest x-rays for each reader and both radiologists’ groups.
], who also warned against potential low diagnostic performance of CXR when reported by non-dedicated chest radiologists. Real-world data from this study, albeit conducted in a high-prevalence region and during a SARS-CoV-2 pandemic peak, seem to provide a better scenario, in which radiologists with less than 10 years of experience matched the 89.0 % sensitivity attained by radiologists with more than 10 years of experience, with similar disease prevalence in the CXR subsets read by each group (73 % versus 77 %, respectively). A non-negligible cost for Group 2 to attain such a sensitivity was a consistently lower specificity (41 %, 95 % CI 25 %–59 %) – a value similar to the pooled specificity reported for chest CT by a meta-analysis of 3 studies from non-high-epidemic areas and 2 studies from high-epidemic areas (37 %, 95 % CI 26 %–50 %) [
] – while Group 1 showed a smaller difference between sensitivity and specificity, with a consistently higher accuracy (Table 2). This pattern was also observed when comparing different timepoints or the total number of CXRs read by each radiologist: between the first and second half of the six-week study period overall accuracy increased from 76 % to 86 %, with corresponding increases both in Group 1 and Group 2; between the first and second half of CXRs read by each reader, overall accuracy increased from 81 % to 84 %, again with corresponding increases in both groups, albeit more pronounced in the less experienced Group 2 (1 % difference for Group 1, 11 % difference for Group 2). This trend was most likely driven in both groups by an adaptation to the escalation of examined cases (from 195 in the first three weeks to 340 in the following three), with an increase in sensitivity and accuracy mirrored by a specificity decrease. Of note, in both groups there was a comparable number of readers exhibiting an inverse tendency towards a decrease in accuracy (Fig. 4) and sensitivity (Fig. 2), reinforced by a decrease in specificity in all but one less-experienced reader (Fig. 3).
Limitations of this study include its retrospective and monocentric nature, the fact that each radiologist read a different subset of images, and the imbalance in the number of CXRs read by Group 1 and Group 2, with the less experienced Group 2 reading 18.6 % of all CXRs. However, the closely proportionate disease prevalence between the two groups substantiates the comparability of subsequent findings and seems to suggest a more pronounced influence of overall radiological experience on the diagnostic performance of each group. Such a hypothesis should be verified with a conventional multi-reader study, to ascertain whether these differences in diagnostic performance are also influenced by the number of COVID-19-positive CXRs read by each radiologist, or indeed result from a combination of these factors. However, we should also consider that any multi-reader study performed after a pandemic outbreak would not reproduce the conditions of the first outbreak, when the new disease first spread in a country. Beyond a conventional multi-reader study, further evaluations of real-world diagnostic performance should also target the potential impact of various types of subspecialty radiological training and of centre-specific contingencies, such as the presence and employment of residents, different radiologists’ workloads, and disparities between CXR reporting conducted during day and night shifts. In addition, the results reported herein should be considered in light of the pandemic peak, with its very high disease prevalence, and might not be reproducible in low-prevalence settings [
]. As this is a real-world data study, our results rely on a practical dichotomization of CXR reports: their potential generalizability must therefore be considered very carefully, especially when, in a case of suspected COVID-19, the CXR is not typical for SARS-CoV-2 pneumonia. Clinical translation of our findings would still result in at least two different scenarios, also taking into account the nonspecific nature of CXR findings in COVID-19 pneumonia and other viral pneumonias. First, when a patient displays suspicious symptoms for COVID-19 that can however be justified by alternative pathological CXR findings pointing to another disease (such as pleural effusion, pneumothorax, bacterial pneumonia), the management of the patient would remain the one normally followed for the detected condition. Otherwise, if in a general situation of increased patient influx to emergency departments a patient presents with suspicious symptoms for COVID-19 but no suggestive CXR findings or other findings that can justify a COVID-19 diagnosis, the use of chest CT could be considered [
] – if the patient’s clinical conditions are stable and it is therefore possible to wait for RT-PCR confirmation of SARS-CoV-2 infection, preventive isolation would remain the safest approach.
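The caveat above about low-prevalence settings follows from how predictive values depend on disease prevalence even when sensitivity and specificity stay fixed. The sketch below illustrates this with Bayes' rule, using the overall 89.0 % sensitivity and 60.6 % specificity quoted earlier. The 77 % prevalence is the Group 1 value used as a stand-in for the study setting, and the 10 % prevalence is a purely hypothetical low-prevalence scenario, not a figure from this study; holding sensitivity and specificity constant across settings is itself a simplifying assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence via Bayes' rule."""
    tp = sensitivity * prevalence               # true-positive fraction
    fn = (1 - sensitivity) * prevalence         # false-negative fraction
    fp = (1 - specificity) * (1 - prevalence)   # false-positive fraction
    tn = specificity * (1 - prevalence)         # true-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


SENS, SPEC = 0.890, 0.606  # overall values quoted in the text

# Study-like prevalence (Group 1 value) versus a hypothetical low-prevalence setting.
ppv_high, npv_high = predictive_values(SENS, SPEC, 0.77)
ppv_low, npv_low = predictive_values(SENS, SPEC, 0.10)

print(f"prevalence 77 %: PPV={ppv_high:.1%}, NPV={npv_high:.1%}")
print(f"prevalence 10 %: PPV={ppv_low:.1%}, NPV={npv_low:.1%}")
```

With these inputs the positive predictive value falls sharply as prevalence drops while the negative predictive value rises, which is why a positive CXR would carry less weight outside a pandemic peak.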
To summarize, the real-world diagnostic performance of CXR during the COVID-19 pandemic peak reached a relatively well-balanced overall accuracy (76 %–86 %), with an 89 % sensitivity and a higher specificity for the more experienced radiologists (66 %) than for the less experienced radiologists (41 %). These data support the use of CXR as a first-line examination when chest imaging is required to aid the triage process of suspected COVID-19 patients during a pandemic peak.
CRediT authorship contribution statement
Andrea Cozzi: Conceptualization, Methodology, Formal analysis, Investigation, Data curation, Writing - original draft, Writing - review & editing, Visualization, Project administration. Simone Schiaffino: Conceptualization, Methodology, Investigation, Data curation, Validation, Writing - original draft, Writing - review & editing, Supervision, Project administration. Francesco Arpaia: Investigation, Data curation, Writing - original draft, Writing - review & editing. Gianmarco Della Pepa: Investigation, Data curation, Writing - original draft, Writing - review & editing. Stefania Tritella: Investigation, Data curation, Writing - original draft, Writing - review & editing. Pietro Bertolotti: Investigation, Data curation, Writing - original draft, Writing - review & editing. Laura Menicagli: Investigation, Data curation, Writing - original draft, Writing - review & editing. Cristian Giuseppe Monaco: Investigation, Data curation, Writing - original draft, Writing - review & editing. Luca Alessandro Carbonaro: Investigation, Data curation, Writing - original draft, Writing - review & editing. Riccardo Spairani: Investigation, Data curation, Writing - original draft, Writing - review & editing. Bijan Babaei Paskeh: Investigation, Data curation, Writing - original draft, Writing - review & editing. Francesco Sardanelli: Conceptualization, Methodology, Validation, Resources, Funding acquisition, Supervision, Project administration, Writing - review & editing.
Declaration of Competing Interest
A. Cozzi, F. Arpaia, G. Della Pepa, S. Tritella, P. Bertolotti, L. Menicagli, C.G. Monaco, L.A. Carbonaro, R. Spairani, and B. Babaei Paskeh, all declare that they have no conflict of interest and that they have nothing to disclose.
S. Schiaffino declares having received travel support from Bracco Imaging and being a member of the speakers’ bureau for General Electric Healthcare.
F. Sardanelli declares having received grants from, or being a member of the speakers’ bureau/advisory board for, Bayer Healthcare, Bracco, and General Electric Healthcare.
This study was partially supported by Ricerca Corrente funding from the Italian Ministry of Health to IRCCS Policlinico San Donato.
The Role of Chest Imaging in Patient Management during the COVID-19 Pandemic: A Multinational Consensus Statement from the Fleischner Society.