Narrative
Pneumonia accounts for roughly 2% of all emergency department (ED) visits in the United States annually and is frequently on an ED physician's differential diagnosis in patients with respiratory symptoms. Yet pneumonia remains a challenging diagnosis to make, given the relatively poor (and highly variable) sensitivity and specificity of both clinical symptoms and chest radiographs. Computed tomography (CT) is the diagnostic reference standard for detecting pneumonia, with the advantage of identifying alternative pathologies, but it is limited by relatively high resource utilization and radiation exposure. If proven accurate, bedside lung ultrasound (LUS) would be a valuable imaging modality for diagnosing pneumonia due to its accessibility, safety, and low cost.
The systematic review and meta-analysis discussed here assessed the operating characteristics of individual LUS findings for the diagnosis of pneumonia in adults. The authors included studies assessing LUS criteria for the detection of community-acquired pneumonia (CAP), hospital-acquired pneumonia (HAP), and ventilator-associated pneumonia (VAP) in adults. Specific LUS signs and algorithms investigated were consolidation, focal B-lines, subpleural consolidation, dynamic air bronchograms, color Doppler, the BLUE (Bedside Lung Ultrasound in Emergency) protocol, and the Lung Ultrasound Clinical Pulmonary Infection Score (LUS-CPIS). All the included studies used either chest radiography or CT as the reference standard for the diagnosis of pneumonia. The quality and the risk of bias in the original trials were assessed using the revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. A total of 26 studies (3454 patients) were included in the final analysis. Of the 26 studies, two were low risk for bias, with the most common concern for bias being that chest radiography was the diagnostic reference standard.
For the detection of pneumonia, the BLUE protocol performed best, with a sensitivity of 88% (84%–92%), a specificity of 94% (87%–98%), a positive likelihood ratio of 15 (6–36), and a negative likelihood ratio of 0.12 (0.09–0.17). Dynamic air bronchograms were less sensitive (46%; 30%–62%) and had a less favorable negative likelihood ratio (0.56; 0.41–0.78) but had a similar specificity (96%; 91%–99%) and positive likelihood ratio (12; 4–39) to the BLUE protocol. Subpleural consolidation, focal B-lines, and consolidation provided less statistical utility for diagnosing pneumonia. Furthermore, the likelihood ratio for a negative test for these findings was not low enough to rule out pneumonia (ranging from 0.26 to 0.7). As such, these LUS findings are not reported in this review.
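As a rough illustration of how the likelihood ratios above relate to sensitivity and specificity, the short Python sketch below applies the standard formulas (LR+ = sensitivity / (1 - specificity); LR- = (1 - sensitivity) / specificity) to the pooled point estimates quoted in the text. The function and variable names are illustrative only. The review's pooled likelihood ratios come from its own meta-analytic model, so small differences from this back-of-the-envelope calculation are expected.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Pooled point estimates quoted in the text above (as proportions).
findings = {
    "BLUE protocol": (0.88, 0.94),
    "Dynamic air bronchograms": (0.46, 0.96),
}

for name, (sensitivity, specificity) in findings.items():
    lr_positive, lr_negative = likelihood_ratios(sensitivity, specificity)
    print(f"{name}: LR+ = {lr_positive:.1f}, LR- = {lr_negative:.2f}")
```

For the BLUE protocol this gives an LR+ of about 14.7 and an LR- of about 0.13, and for dynamic air bronchograms about 11.5 and 0.56, close to the reported values of 15 and 0.12, and 12 and 0.56, respectively.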
Subgroup analysis was also performed on VAP and non-VAP (CAP or HAP) patients. In non-VAP patients, dynamic air bronchograms were highly specific (98%; 87%–100%) and had the highest positive likelihood ratio (31; 3–319). LUS consistently performed worse in VAP patients than in non-VAP patients. In the VAP subgroup, consolidation had the highest sensitivity (78%; 60%–89%), and dynamic air bronchograms had the highest specificity (92%; 79%–97%) and positive likelihood ratio (4; 1–9).
Caveats
This review shares several limitations with previous reviews of LUS for pneumonia, most of which stem from variability and heterogeneity across the included studies. The systematic review highlighted that the included studies varied widely in clinical context and setting. Seventeen studies were based in intensive care units, eight in the ED, and one in the ED or wards. Different clinical settings mean varying patient populations and levels of disease severity. Patients with more severe disease may have more complex lung pathology, obscuring or mimicking sonographic findings used to detect pneumonia. LUS can directly detect pneumonias involving the pleural surface, but infections located deeper within the lung often manifest only as indirect signs such as B-lines, subpleural consolidations, or pleural line abnormalities. In some cases (e.g., bronchopneumonia), the infection may not be detected at all. Pneumonia also exists on a severity spectrum, so LUS findings may vary at different stages of disease.
Furthermore, the prevalence of pneumonia in the analyzed studies ranged from 21% to 91%, which is much greater than the prevalence of pneumonia in the typical ED population. The systematic review authors also highlighted that there was variability in the definitions of LUS findings. For example, what constituted a “subpleural consolidation” varied across studies, making direct comparison difficult. There were also differences in the reference standard, with some studies using chest radiography as a comparison, while others used chest CT. Finally, a key caveat, not only in this review but for ultrasound in general, is user dependence, which can significantly affect diagnostic accuracy and reproducibility. Inconsistencies in technique can lead to missed findings.
In summary, the existing evidence demonstrates that protocols such as the BLUE protocol show strong sensitivity and specificity for CAP and HAP. Recently updated clinical practice guidelines from the American Thoracic Society (ATS) align with these findings. The new guidelines conclude that, despite low-quality evidence and a lack of direct outcome-based studies, LUS is “likely at least as accurate as chest x-ray” in confirming a clinical suspicion of CAP [8]. Both the ATS guidelines and the systematic review discussed here emphasize the importance of operator skill and standardization in ensuring accurate and reproducible results.
The original manuscript was published in Academic Emergency Medicine as part of the partnership between TheNNT.com and AEM.
Author
Caleb Bailie, MD; Edem Adika, MD; Christopher Hanuscin, MD; Kelly Maurelus, MD
Supervising Editors: Kabir Yadav, MD
Published/Updated
May 28, 2026
What are Likelihood Ratios?
Likelihood ratio (LR), pretest probability, and posttest (or posterior) probability are daunting terms that describe simple concepts that we all intuitively understand.
Let's start with pretest probability: that's just a fancy term for my initial impression, before we perform whatever test it is that we're going to use.
For example, if a patient with prior stents comes in sweating and clutching his chest in agony, I have a pretty high suspicion that he's having an MI: let's say, 60%. That is my pretest probability.
He immediately gets an ECG (known here as the "test") showing an obvious STEMI.
Now, I know there are some STEMI mimics, so I'm not quite 100%, but based on my experience I'm 99.5% sure that he's having an MI right now. This is my posttest probability - the new impression I have that the patient has the disease after we did our test.
And likelihood ratio? That's just the name for the statistical tool that converted the pretest probability to the posttest probability. It's just a mathematical description of the strength of that test.
Using an online calculator, the LR+ that got me from 60% to 99.5% works out to 145, which is about as high an LR as you can get (and is the actual LR for an emergency physician who thinks an ECG shows an obvious STEMI).
(Thank you to Seth Trueger, MD for this explanation!)
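For readers who want to see the arithmetic behind the online calculator, the conversion in the STEMI example can be sketched in a few lines of Python. This is a minimal illustration using the standard odds form of Bayes' theorem (posttest odds = pretest odds × LR); the function name is made up for this example and is not taken from any particular calculator.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a posttest probability using an LR."""
    # Probabilities are converted to odds because the LR multiplies odds.
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    # Convert the resulting odds back to a probability.
    return posttest_odds / (1 + posttest_odds)


# The example above: 60% pretest probability and an LR+ of 145.
result = posttest_probability(0.60, 145)
print(f"Posttest probability: {result:.1%}")  # prints "Posttest probability: 99.5%"
```

The same function works for a negative test: passing an LR below 1 lowers the probability instead of raising it.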