Article, Cardiology

The utility of iPhone oximetry apps: A comparison with standard pulse oximetry measurement in the emergency department

Abstract

Objectives: To determine whether the measurements of three iPhone pulse oximetry applications correlate with standard pulse oximetry (SpO2) and whether these applications can accurately detect hypoxia.

Methods: Three applications reportedly measuring SpO2 were downloaded onto an iPhone 5s. Two of these applications, “Pulse Oximeter” (POx) and “Heart Rate and Pulse Oximeter” (Ox), used the onboard light and camera lens, and one (iOx) used an external device that plugged into the iPhone. ED patients with chief complaints of cardiac or pulmonary origin or an SpO2 <= 94% were enrolled. All measurements were compared with control measurements. Concordance correlation coefficients, sensitivity, and specificity were calculated.

Results: A total of 191 patients were enrolled. The concordance correlation of iOx with control was 0.55 (CI 0.46, 0.63), POx was 0.01 (CI -0.09, 0.11), and Ox was 0.07 (CI -0.02, 0.15). 68/191 patients (35%) were found to have hypoxemia. Sensitivities for detecting hypoxia were 69%, 0%, and 7% for iOx, POx, and Ox, respectively. Specificities were 82%, 100%, and 89%. Even with iOx (the most accurate), 21 patients (11%) were incorrectly classified as nonhypoxic and 22 (12%) were incorrectly classified as hypoxic.

Conclusions: While iOx had modest concordance with control, Ox and POx showed almost none. The iOx device was best at correctly identifying hypoxic patients, but almost 1/4 of patients were incorrectly classified. The three apps provided inaccurate SpO2 measurements and had limited to no ability to accurately detect hypoxia. These apps should not be relied upon to provide accurate SpO2 measurements in emergent or even austere conditions.

(C) 2019

Introduction

Pulse oximetry (pulse ox) is an expedient and accurate tool to noninvasively measure the oxygenation status of any patient in whom this might be a clinical concern. It was developed by Glenn Allan Millikan, an American physiologist and mountaineer, during World War II [1]. The term “oximetry” is attributable to him. Pulse oximeters measure saturation of peripheral oxygen (SpO2) by measuring the difference in light absorption by oxygenated vs deoxygenated blood at two different wavelengths (typically red light at 660 nm and infrared at 940 nm). Oxygenated hemoglobin absorbs more infrared light and deoxygenated hemoglobin absorbs more red light. This difference is measured by the diodes on the device and is used to calculate the SpO2 [2]. This tool can be placed on multiple spots on the body to noninvasively obtain an accurate measure of blood oxygenation and detect hypoxemia [3-6]. The finger probe is commonly used for measurements, but multiple other sites can be used with varying accuracy [7,8]. The use of pulse oximetry has been shown to reduce the need for more invasive measurements, such as an arterial blood gas [9]. More recently, portable finger probes have been developed, allowing one to measure oxygen saturation in a variety of environments and situations [10,11]. Small, portable pulse ox devices allow measurement of oxygen saturation in resource-limited environments; however, even these may not be immediately available.

* Corresponding author.

E-mail address: [email protected] (W.A. Schrading).

In the past decade, smartphone technology has become nearly ubiquitous, with usage growing rapidly around the world. Ninety-five percent of Americans own some type of cell phone and 77% own a smartphone [12]. In recent years, applications have been developed that claim to measure vital signs such as heart rate, respiratory rate, and pulse oximetry with nothing but the device’s camera lens and light [13]. Overall, little research has been done to verify the validity of these applications’ claims. Recent studies have found validity in the heart rate measuring capabilities of these applications when compared to control [13-15], yet when their ability to measure pulse ox was analyzed, the SpO2 did not clinically correlate with the monitor [15]. Studies among both pediatric and adult populations assessing the accuracy of these devices have used healthy volunteers with presumably normal SpO2s [16]. There have been no studies of these applications in an undifferentiated or potentially hypoxic patient population.

The objectives of this study were to determine the correlation of three iPhone application pulse ox measurements with that of standard pulse oximetry. We also wanted to test their ability to detect hypoxia as compared to control measurements in an undifferentiated emergency department (ED) patient population who had both normal and abnormal SpO2 readings.

https://doi.org/10.1016/j.ajem.2019.07.020

0735-6757/(C) 2019

T.B. Jordan et al. / American Journal of Emergency Medicine 38 (2020) 925-928

Methods

Study design

This study was approved by the University of Alabama at Birmingham Institutional Review Board. This was a correlational study in which pulse ox and pulse measurements from three iPhone pulse ox applications were compared to measurements from a standard pulse oximeter in the ED. The three applications used were “Oximeter” (Ox) (produced by digiDoc technologies, Egersund, Norway) and “Heart Rate & Pulse Oximeter” (POx) (produced by LIJUN LIU), both of which use the iPhone’s onboard camera and light to record measurements, and “iOx” (produced by Safe Heart USA, Atlanta, GA), which uses an external probe that can be purchased from the Safe Heart company and connects to the iPhone. This device uses red light, similar to a more conventional pulse ox monitor. The ED TRAM 451 (General Electric, New York City, New York) pulse ox monitors were used as control measurements. The applications were purchased and downloaded onto a single decommissioned iPhone 5S.

Patients

The study population was a convenience sample of patients 18 years of age and older presenting to a large university medical center ED. Participants were recruited from the triage area or main ED patient rooms. Inclusion criteria were presentation with a cardio/pulmonary chief complaint or an initial SpO2 of <=94%, both of which could be assessed via the ED’s FirstNet electronic medical record (EMR). Exclusion criteria included the presence of peripheral artery disease (PAD) or anemia of a severity that might affect pulse ox, and inability to give consent. In order to include a reasonable number of patients with hypoxia, an effort was made to include as many participants as possible with an initial SpO2 of <=94%.

Table 1

Population characteristics.

Characteristic                 Mean (SD) or N (%)
Age (years) (mean/SD)          58.7 (15.7)
Race (N/%)
  White                        106 (55.5%)
  Black                        80 (41.9%)
  Other                        5 (2.6%)
Gender (N/%)
  Female                       84 (44.0%)
  Male                         107 (56.0%)
Anemia or PAD not severe (N/%)
  Yes                          13 (6.8%)
  No                           178 (93.2%)
Supplemental oxygen (N/%)
  Yes                          44 (23.0%)
  No                           147 (77.0%)

Measurements

The authors enrolled all patients. After verbal consent and collection of demographic data, experimental measurements were taken from the right index finger using all three iPhone applications. The control was recorded first, followed by the experimental measures in a predetermined, individualized random order. The order was randomized to minimize any effect of second-to-second variations in continuous pulse ox readings. If patients were on supplemental oxygen, a ventilator, or CPAP, this was noted.
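A predetermined, individualized random order of this kind could be prepared before enrollment along the following lines. This is only a minimal sketch and not the authors' procedure: the function name `measurement_order`, the seed value, and the per-patient seeding scheme are assumptions made for illustration.

```python
import random

APPS = ("Ox", "POx", "iOx")  # the three iPhone applications under test


def measurement_order(patient_index: int, seed: int = 2019) -> list[str]:
    """Return a reproducible measurement sequence for one patient.

    The seed and the per-patient seeding scheme are hypothetical; the paper
    states only that the order was predetermined and individually randomized.
    """
    rng = random.Random(seed * 100_003 + patient_index)  # hypothetical scheme
    apps = list(APPS)
    rng.shuffle(apps)
    # The control reading is always taken first, then the shuffled apps.
    return ["Control"] + apps


if __name__ == "__main__":
    for i in range(1, 4):
        print(i, measurement_order(i))
```

Seeding by patient index keeps the schedule reproducible, so the same list can be printed in advance and audited afterwards.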

Data analysis

Descriptive statistics for the study population are reported as means and standard deviations for continuous variables and percentages for categorical variables. To assess differences in the distributions of SpO2s for each of the devices, we reported means, medians, 25th to 75th percentiles, minimum to maximum, and differences relative to the control measures. The level of agreement between each of the devices and control measures was assessed using several measures included in the user-written “concord” Stata package (Pearson’s correlation coefficient, bias correction factor, intercept/slope, average difference with 95% limits of agreement, and the concordance correlation coefficient) and the alpha coefficient for reliability [17-19]. Data were analyzed for the entire population and for the sub-population of patients with SpO2 <= 94%. We also produced a pairplot of the differences by the mean of the measurements (i.e., a Bland-Altman plot) for the entire population [20]. We also reported measures of validity (sensitivity, specificity, positive predictive value, and negative predictive value) for detection of hypoxia, defined as an SpO2 <= 94%. All analyses were performed using Stata 13.1 (Stata Corp, LLC, College Station, TX).
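For readers unfamiliar with the concordance correlation coefficient [18], the sketch below shows how it relates to Pearson's correlation and the bias correction factor. It is written in Python for illustration and is not the Stata “concord” code used in the study; the function name `concordance_stats` and the example readings are hypothetical, and population (1/n) moments are assumed, following Lin's definition.

```python
from math import sqrt


def concordance_stats(x, y):
    """Return (pearson_r, bias_correction, ccc) for paired readings x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and hold at least two readings")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (1/n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("readings must not be constant")
    pearson_r = cov_xy / sqrt(var_x * var_y)
    # The bias correction factor penalizes shifts in location and scale.
    bias_correction = 2 * sqrt(var_x * var_y) / (
        var_x + var_y + (mean_x - mean_y) ** 2
    )
    ccc = pearson_r * bias_correction
    return pearson_r, bias_correction, ccc


if __name__ == "__main__":
    control = [91, 93, 95, 97, 99]   # hypothetical control readings (%)
    app = [92, 94, 96, 98, 100]      # hypothetical app readings, 1 point higher
    print(concordance_stats(control, app))
```

Because the coefficient is the product of Pearson's correlation and the bias correction factor, a device can track the control perfectly and still score below 1 if its readings are systematically offset.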

Results

Overall, 191 patients were evaluated; population characteristics are shown in Table 1. The majority of patients were white (55%) and male (56%). Among these patients, 23.0% required supplemental oxygen at the time of data collection.

Mean and median SpO2 readings were generally higher for Ox and POx relative to control measurements, while iOx readings were lower (Table 2). The minimum readings were 80%, 94%, 95%, and 75% for control, Ox, POx, and iOx, respectively. Control readings indicated that 35.6% of patients were hypoxic (SpO2 <= 94%). For Ox, only 9.4% of patients were classified as hypoxic. No patients were classified as hypoxic by POx. The proportion classified as hypoxic by iOx (36.1%) was slightly higher than by control.

Table 2

Distribution of oxygen saturation and measured differences by instrument.

Measure                            Control        Ox              POx                 iOx
Total patients (N)                 191            191             191                 191
Oxygen saturation (%)
  Mean (SD)                        96.0 (3.2)     98.2 (2.0)      97.2 (1.2)          94.7 (4.5)
  Median (25th, 75th percentile)   97 (94, 99)    99 (97, 100)    97.2 (96.2, 98.1)   95 (93, 98)
  Minimum, maximum                 80, 100        94, 100         95, 100             75, 100
  Hypoxia (<=94%) (N/%)            68 (35.6%)     18 (9.4%)       0 (0%)              69 (36.1%)
Difference (measured oxygen saturation - control) (%)
  Mean (SD)                        –              2.19 (3.78)     1.14 (3.33)         -1.35 (3.56)
  Median (25th, 75th percentile)   –              2 (0, 5)        0.8 (-1.4, 3.1)     -1 (-2, 0)
  Minimum, maximum                 –              -6, 14          -4.9, 15.3          -19, 7

Table 3

Concordance and reliability of instruments compared with control measurements.

Statistic                                          Ox                    POx                    iOx
All patients (N = 191)
  Pearson’s correlation coefficient                0.015                 0.112                  0.613
  Bias correction factor                           0.666                 0.578                  0.898
  Intercept/slope                                  0.61/39.73            0.36/62.38             1.37/-36.95
  Average difference (95% limits of agreement)     2.19 (-5.21, 9.60)    1.14 (-5.39, 7.66)     -1.35 (-8.32, 5.63)
  Concordance correlation coefficient (95% CI)     0.01 (-0.09, 0.11)    0.07 (-0.02, 0.15)     0.55 (0.46, 0.63)
  Reliability (alpha coefficient)                  0.03                  0.13                   0.74
Hypoxia patients (control <= 94%) (N = 68)
  Pearson’s correlation coefficient                0.195                 -0.058                 0.497
  Bias correction factor                           0.178                 0.182                  0.730
  Intercept/slope                                  0.88/17.2             -0.54/146.9            2.26/-116.9
  Average difference (95% limits of agreement)     6.06 (1.03, 11.09)    4.59 (-0.31, 9.49)     -0.63 (-8.89, 7.63)
  Concordance correlation coefficient (95% CI)     0.04 (-0.01, 0.08)    -0.01 (-0.06, 0.03)    0.36 (0.22, 0.51)
  Reliability (alpha coefficient)                  0.32                  0.09                   0.54

Table 4

Measures of validity for identifying hypoxia.

Statistic                          Ox         POx            iOx
Hypoxia (<=94%)
  Control no/measured no (N)       110        123            101
  Control yes/measured no (N)      63         68             21
  Control no/measured yes (N)      13         0              22
  Control yes/measured yes (N)     5          0              47
  Sensitivity (%)                  7.4%       0%             69.1%
  Specificity (%)                  89.4%      100%           82.1%
  PPV (%)                          27.8%      Not defined    68.1%
  NPV (%)                          63.6%      64.4%          82.8%

For the full population of patients, iOx readings showed the highest concordance and reliability with control measurements (Table 3). The iOx device also performed best in the subgroup of patients with hypoxia. Average differences indicated that Ox and POx measurements were frequently higher than control readings, whereas iOx was slightly lower (Fig. 1). Differences relative to control readings were greater for Ox and POx when limited to hypoxic patients. For the whole population, as well as for participants with hypoxia, we found little to no agreement with the control measures for Ox and POx.
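The 95% limits of agreement in Table 3 can be approximately reproduced from the mean and standard deviation of the differences in Table 2. The sketch below assumes the conventional Bland-Altman definition (mean difference plus or minus 1.96 standard deviations) [17]; whether the Stata package applies exactly this multiplier is an assumption, and the function name `limits_of_agreement` is hypothetical.

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Return the (lower, upper) 95% limits of agreement for paired differences."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Mean (SD) of (measured - control) differences, all patients, from Table 2.
TABLE_2_DIFFERENCES = {
    "Ox": (2.19, 3.78),
    "POx": (1.14, 3.33),
    "iOx": (-1.35, 3.56),
}

if __name__ == "__main__":
    for name, (mean_diff, sd_diff) in TABLE_2_DIFFERENCES.items():
        lower, upper = limits_of_agreement(mean_diff, sd_diff)
        print(f"{name}: {mean_diff:+.2f} ({lower:.2f}, {upper:.2f})")
```

The results agree with the limits reported in Table 3 to within rounding of the published means and standard deviations. Even for iOx, the interval spans roughly 14 percentage points, which is wide relative to the 94% threshold used to define hypoxia.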

Of the three instruments examined, iOx demonstrated the highest sensitivity, positive predictive value, and negative predictive value (69.1%, 68.1%, and 82.8%, respectively) (Table 4). Specificity was lower for iOx relative to the other instruments. However, sensitivity and positive predictive value were extremely low for Ox and POx.
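The validity measures follow directly from the four cell counts reported for each instrument in Table 4. The sketch below recomputes them in Python as a worked check; the names `validity` and `TABLE_4_COUNTS` are illustrative, and the counts are copied from Table 4 (control yes/measured yes = true positive, and so on).

```python
from typing import Optional


def pct(numerator: int, denominator: int) -> Optional[float]:
    """Percentage rounded to one decimal, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return round(100 * numerator / denominator, 1)


def validity(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV, NPV, and misclassification rate (%)."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
        "ppv": pct(tp, tp + fp),  # None when no patient is classified hypoxic
        "npv": pct(tn, tn + fn),
        "misclassified": pct(fn + fp, total),
    }


# Cell counts from Table 4, ordered as (tp, fn, fp, tn).
TABLE_4_COUNTS = {
    "Ox": (5, 63, 13, 110),
    "POx": (0, 68, 0, 123),
    "iOx": (47, 21, 22, 101),
}

if __name__ == "__main__":
    for name, counts in TABLE_4_COUNTS.items():
        print(name, validity(*counts))
```

For iOx, 21 false negatives plus 22 false positives out of 191 patients gives a misclassification rate of 22.5%. The positive predictive value for POx is undefined because that app classified no patient as hypoxic.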

Discussion

Of the three approaches for measuring SpO2, iOx performed better than Ox and POx in terms of agreement with control measures. While there was moderate agreement between the external device (iOx) and the ED pulse oximeters used as controls, it was not strong enough to recommend the device to patients or physicians, even in austere environments. Until the technology is capable of obtaining reliable and valid measurements, patients and care providers should be advised to use portable devices that have been shown to be accurate.

Alexander et al. [15] found poor correlation of SpO2, blood pressure, and heart rate among a variety of iPhone applications compared to clinical monitors when assessed in a group of healthy volunteers. Our study also utilized iPhone-only applications, but we included an app with an attached external finger probe that purported to measure pulse ox. We also studied a population of real patients in a clinical environment and included a moderately sized subgroup who were hypoxic. We recruited this sample because we wanted to test these apps not only in persons with normal SpO2, but also in those with low oxygen saturation. Similar to the prior study, we found limited agreement with measures taken on a standard ED pulse oximetry device.

Our findings do not suggest that these resources should be completely disregarded. As technology continues to advance, it is possible that accuracy will improve and portable devices will provide valid measures that are clinically actionable. These instruments should continue to be examined in peer-reviewed research for their possible future use. Similar avenues of research are opening every day as more companies release technologies to the public for vitals measurement and health maintenance. Wearable technology and devices, such as handheld electrocardiograms, are now widely available to the public. It is of vital importance to ensure that these technologies are safe and reliable; only then should they be made available to providers in resource-limited settings.

Fig. 1. Pairplots showing differences in instrument compared with control measurements.

Limitations

There are several limitations that should be considered in the interpretation of these data. First, oxygen was not withheld from patients in hypoxic states, and in some situations this altered the control measurement as data collection progressed. Future researchers should be made aware of this difficulty and prepared to respond rapidly to hypoxic patients. Second, a number of conditions can reduce the accuracy of pulse oximetry devices. Differences in the calibration of the light source, anemia, optical interference from endogenous and exogenous substances in the blood, and even skin color have been documented to potentially reduce the accuracy of an oximetry measurement. Because the same iPhone was used for all of the applications throughout the study, the variance in calibration of the light source was deemed not to be significant. However, a second external adaptor had to be purchased for the iOx application because the first one stopped functioning midway through data collection. This may have added to the variance in light calibration. It was decided not to exclude patients based on their skin tone because patients with light and dark skin tones should be equally likely to use these pulse ox apps.

Conclusions

While iOx had modest agreement with the control measures, Ox and POx showed almost none. The iOx device also showed the best ability to correctly identify patients in low-oxygen states, but almost one-fourth of patients were incorrectly classified. Overall, the three apps provided inaccurate SpO2 measurements and had limited ability to accurately detect hypoxia. In their current state, these devices should not be recommended for use in situations where there is a need to assess oxygenation status. Additionally, patients and healthcare providers alike should be informed that these devices are not equivalent to gold-standard pulse oximetry devices. Anyone wanting to assess oxygenation status in austere environments should be directed to portable fingertip pulse oximetry monitors, which have been found to be accurate in prior studies [3].

Support

NIH Short Term Research Training Program for medical student research. JPD was additionally supported by grant F31-GM122180 from the National Institute of General Medical Sciences and K12-HL138039 from the National Heart, Lung, and Blood Institute.

Prior presentations

Society of Academic Emergency Medicine, Lightning Oral, 5/17/2018, Indianapolis, IN; Southeast ACEP chapters regional meeting, 6/6/2018, Sandestin, FL.

References

  1. Severinghaus JW, Astrup PB. J Clin Monit Comput 1986;2:270.
  2. Pinsky MR. Assessing the circulation: oximetry, indicator dilution, and pulse contour analysis. In: Hall JB, Schmidt GA, Kress JP, editors. Principles of critical care. 4e. NY: McGraw-Hill; 2014.
  3. Sohila S, Alireza K, Gholamreza M, Alireza A, Farid N. Accuracy of pulse oximetry in detection of oxygen saturation in patients admitted to the intensive care unit of heart surgery: comparison of finger, toe, forehead, and earlobe probes. BMC Nurs 2018;17:15.
  4. Bilin N, Behbahan AG, Abdinia B, Mahallei M. Validity of pulse oximetry in detection of hypoxemia in children: comparison of ear, thumb, and toe probe placements. East Mediterr Health J 2010;16(2):218-22.
  5. Sinex JE. Pulse oximetry: principles and limitations. Am J Emerg Med 1999;17(1):59-66.
  6. Wilson BJ, Cowan HJ, Lord JA, Zuege DJ, Zygun DA. The accuracy of pulse oximetry in emergency department patients with severe sepsis and septic shock: a retrospective cohort study. BMC Emerg Med 2010;10(1):9-14.
  7. Blaylock V, Brinkman M, Carver S, McLain P, Matteson S, Newland P. Comparison of finger and forehead oximetry sensor in post anesthesia care patients. J Perianesth Nurs 2008;23(6):379-86.
  8. Berkenbosch JW, Tobias JD. Comparison of new forehead reflectance pulse oximeter with a conventional digit sensor in pediatric patients. Respir Care 2006;51(7):726-31.
  9. Durbin Jr CG, Rostow SK. More reliable pulse oximetry reduces the frequency of arterial blood gas analyses and hastens oxygen weaning after cardiac surgery, a prospective randomized trial of the clinical impact of new technology. Crit Care Med 2002;30(8):1735-40.
  10. Takeshita W, Iwaki LC, Pupim D, Filho L. Evaluation of accuracy of portable fingertip pulse oximeter, as compared to that of a hospital oximeter with digital sensor. Indian J Dent Res 2013;24(5):542.
  11. Costa JC, Faustino P, Lima R, Ladeira I, Guimaraes M. Research: comparison of the accuracy of a pocket versus standard pulse oximeter. Biomed Instrum Technol 2016;50(3):190-3.
  12. Mobile fact sheet. Pew Research Center, Internet and Technology; Feb 2018.
  13. Cheatham SW, Kolber MJ, Ernst. Concurrent validity of resting pulse-rate measurements: a comparison of 2 smartphone applications, the Polar H7 belt monitor, and a pulse oximeter with Bluetooth. J Sport Med 2015;24(2):171-8.
  14. Mitchell K, Graff M, Hedt C, Simmons J. Reliability and validity of a smartphone pulse rate application for the assessment of resting and elevated pulse rate. Physiother Theory Pract 2016;32(6):494-9.
  15. Alexander JC, Minhajuddin A, Joshi GP. Comparison of smartphone application-based vital sign monitors without external hardware versus those used in clinical practice: a prospective trial. J Clin Monit Comput 2016;31(4):825-31.
  16. Tomlinson S, Berhmann S, Cranford J, Louie M, Hashikawa A. Accuracy of smartphone-based pulse oximetry compared with hospital grade pulse oximetry in healthy children. Telemed e-health 2018 Jul;24(7):527-35.
  17. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1(8476):307-10.
  18. Lin LI. A concordance correlation coefficient to evaluate reproducibility. Biometrics 1989;45(1):255-68.
  19. Steichen TJ, Cox NJ. A note on the concordance correlation coefficient. Stata J 2002;2(2):183-9.
  20. Cox NJ. Graphing agreement and disagreement. Stata J 2004;4:329-49.
