False Negatives and Reinfections: the Challenges of SARS-CoV-2 RT-PCR Testing

April 27, 2020

Now that COVID-19 diagnostics are available in most of the United States, media reports are surfacing about false-negative test results and the possibility of reinfection. We break down some of the major challenges of SARS-CoV-2 testing, including issues of test performance and interpretation, to give these reports context. Caveat: this is an evolving topic!

What Tests Are Available for COVID-19 and How Are They 'Approved' in the U.S.?

Generally, lab tests are classified as in vitro diagnostics (IVD) by the Food and Drug Administration (FDA) and go through a rigorous approval process to gain FDA clearance. If a company wants to sell a new diagnostic testing platform, it normally has to complete a series of steps, including test classification, applications and fee payments. If the test is innovative, clinical studies are typically required to demonstrate adequate performance of the test. In cases where a new diagnostic panel is developed but can be used on existing equipment (the Centers for Disease Control and Prevention's SARS-CoV-2 assay, for example), a relatively lengthy FDA review may still be required under normal conditions (i.e., when no pandemic is occurring). Because the test is new, its performance needs to be compared to the performance of a current "gold-standard" test, also known as the "reference standard." There is currently no gold-standard diagnostic test for SARS-CoV-2 because the virus is new to us.

On Feb. 4, 2020, the Health and Human Services (HHS) Secretary determined that there is a public health emergency and that circumstances therefore justify the emergency use authorization (EUA) of in vitro diagnostics for the detection and/or diagnosis of SARS-CoV-2, the virus that causes COVID-19. Clinical and commercial laboratories, as well as test kit manufacturers, may therefore submit expedited requests for FDA EUAs of their diagnostic devices and assays. The FDA's EUA templates require analytical sensitivity and specificity analyses. Analytical sensitivity is the smallest amount of substance in a sample that can be reliably detected by an assay (the limit of detection), and analytical specificity is the assay's ability to detect only the desired substance in a sample without cross-reacting with other substances. However, these measures differ in meaning from clinical sensitivity and specificity (the percentage of infected patients who test positive and the percentage of uninfected patients who test negative, respectively), and a test with good analytical sensitivity and specificity does not necessarily have good clinical sensitivity and specificity. The overall performance of SARS-CoV-2 RT-PCR tests cannot be known until we understand who is truly infected and who isn't.
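To make the clinical definitions concrete, here is a minimal Python sketch of how clinical sensitivity and specificity would be computed once test results can be compared against a reference standard. The function name and the patient counts are hypothetical and are not drawn from any SARS-CoV-2 study.

```python
def clinical_performance(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) as fractions between 0 and 1.

    Sensitivity: share of truly infected patients who test positive.
    Specificity: share of truly uninfected patients who test negative.
    """
    infected = true_pos + false_neg
    uninfected = true_neg + false_pos
    if infected == 0 or uninfected == 0:
        raise ValueError("need at least one infected and one uninfected patient")
    return true_pos / infected, true_neg / uninfected


# Hypothetical counts, for illustration only
sens, spec = clinical_performance(true_pos=80, false_neg=20, true_neg=95, false_pos=5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Note that both inputs depend on knowing who is truly infected, which is exactly the information that is missing while no reference standard exists.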

When we last checked (April 24, 2020), 45 commercial COVID-19 test kits and 10 laboratory-developed tests (tests uniquely developed by the performing laboratory) had received an FDA EUA. ASM has published protocols for clinical laboratories to help them verify commercial SARS-CoV-2 kits. While these protocols do not overcome the test performance uncertainty explored below, they do enable labs to establish that tests work to the manufacturer's specifications within that specific testing environment.

What Factors Influence Test Performance & Could They Cause False-Negative Results?

'False-negative' test results refer to cases where someone who truly has the disease tests negative instead of positive. For RT-PCR tests, like those used to diagnose COVID-19, false negatives occur for a variety of reasons, such as the level of viral RNA being below the limit of detection of the test. We are learning something new every day about the best specimen types, collection methods and testing platforms for SARS-CoV-2 detection. These variables could help explain why some SARS-CoV-2 tests are negative when, in fact, the patient has clinical disease.

Variables in Specimen Collection and Transport

Microbiology dogma dictates that good specimen collection leads to accurate results from the lab, but we aren't even sure what "good specimen collection" means for COVID-19. Nasopharyngeal (NP) swabs are the preferred specimen type for COVID-19 per the Centers for Disease Control and Prevention (CDC), but this topic is rapidly evolving. CDC guidance has been revised in the past few weeks to allow for lower respiratory tract testing, oropharyngeal (OP) swab testing, self-collected swabs and nasal turbinate swabs. How well these specimens perform against each other is difficult to guess, as we have limited data on this topic. There are even debates and active research on the value of using saliva to diagnose COVID-19.

Several studies attempt to nail down when a patient has the highest SARS-CoV-2 viral load and where in the body that occurs, with sometimes contradictory results. Wölfel et al.'s peer-reviewed study of serial samples from 9 hospitalized COVID-19 patients demonstrates that viral load and rates of detection differ across areas of the body, as you would expect. Both NP and OP swabs collected during the first 5 symptomatic days had the highest viral loads of all specimen types and gave positive results by RT-PCR, but detection dropped to 40% after day 5. A larger non-peer-reviewed study (n=213 patients) from China reported that OP swabs detected virus less frequently than NP swabs, particularly 8 days after symptom onset. In another brief peer-reviewed letter to the editor, 17 COVID-19 patients, who were secondary infections from known cases, had sequential samples obtained after symptom onset and were found to have higher viral loads in NP swabs (samples taken from mid-turbinate to nasopharynx) than in OP swabs. The largest study published on detection of SARS-CoV-2 from different clinical specimens describes 205 hospital patients from 3 Chinese hospitals, with 1,070 specimens collected. Nasal swabs had the highest viral load, although the timing of collection relative to illness onset and the associated clinical history were not available for many of these specimens.

The data on the best specimen type are sparse, but we know that in early illness, NP swabs, and possibly OP swabs, are adequate specimens for detection of SARS-CoV-2 and have some of the highest viral loads of any sample type. Conversely, the evidence is pretty definitive that urine and serum samples are not appropriate specimens for COVID-19 testing. Several studies also corroborate that the later a specimen is collected after the day symptoms began, the higher the chance of a false-negative test result. Given current data, specimen collection from the wrong area of the body or at the wrong time is almost certainly contributing to false-negative test results.

Much of the current knowledge regarding specimen transport for respiratory viral testing has not changed as it relates to SARS-CoV-2. However, there is currently a nationwide shortage of universal transport medium, the supportive liquid in which swabs are transported after collection. A recent study demonstrated that collecting an NP swab and placing it in any of 4 alternative liquid media is equally effective. Specific instructions for specimen handling will vary with the transport medium, but in general, if the specimen cannot be sent immediately, it should be refrigerated at 2-8°C for up to 72 hours. If transport is not possible within 72 hours, then the sample should be stored at -70°C or below. Without proper transport medium or storage, specimens degrade. This is especially true for the RNA that is detected by an RT-PCR test. RNA is less stable than DNA, so if a specimen is not transported or stored appropriately, the risk of a false-negative RT-PCR result increases.
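The general storage rule described above can be written as a small decision function. This is only a sketch of the rule of thumb in the text; the function name and return strings are hypothetical, and the instructions for a specific transport medium take precedence.

```python
def storage_recommendation(hours_until_transport):
    """Map the expected delay before transport (in hours) to the general
    storage guidance quoted in the text. Not medium-specific."""
    if hours_until_transport < 0:
        raise ValueError("delay cannot be negative")
    if hours_until_transport <= 72:
        # Short delays: refrigerate until the specimen can be sent
        return "refrigerate at 2-8°C"
    # Transport not possible within 72 hours: freeze
    return "store at -70°C or below"


for delay in (4, 72, 96):
    print(delay, "hours:", storage_recommendation(delay))
```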

Limit of Detection of COVID-19 Tests

In practical terms, the limit of detection (LOD) for an RT-PCR test tells you the minimum amount of RNA that the test will detect 95 out of 100 times. If the LOD for a test is too high, patients who are infected with SARS-CoV-2 may not test positive, leading to a high rate of false-negative results. If the LOD for a test is too low, then contamination can become a major problem, as the test will detect the tiniest amounts of viral RNA, leading to false-positive test results (i.e., people who are not infected with SARS-CoV-2 will test positive instead of negative). Determining how much viral RNA detected in a person is clinically significant, and therefore the target LOD for an accurate test, is just that ... an art.
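As a rough illustration of the "95 out of 100" definition, the sketch below picks the LOD from a dilution series as the lowest concentration at which at least 95% of replicates are detected, walking down from the highest concentration. The function name, the concentrations and the 20 replicates per level are hypothetical choices made for this example; they do not describe any manufacturer's protocol.

```python
def estimate_lod(replicates, required_rate=0.95):
    """replicates maps concentration (cp/mL) to a non-empty list of booleans,
    one per replicate (True = detected). Returns the lowest concentration that
    meets required_rate with every higher concentration also meeting it,
    or None if even the highest concentration fails."""
    lod = None
    for concentration in sorted(replicates, reverse=True):
        results = replicates[concentration]
        detection_rate = sum(results) / len(results)
        if detection_rate < required_rate:
            break  # stop at the first level that is not reliably detected
        lod = concentration
    return lod


# Hypothetical dilution series, 20 replicates per level
dilution_series = {
    1000: [True] * 20,
    500: [True] * 20,
    250: [True] * 19 + [False],      # 19/20 = 95% detected
    125: [True] * 15 + [False] * 5,  # 15/20 = 75% detected
}
print(estimate_lod(dilution_series))  # 250
```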

The LOD is available in the package insert or Instructions for Use (IFU) of authorized SARS-CoV-2 tests. Below is a sampling of the LODs (in copies per milliliter, cp/mL) of tests from the largest commercial laboratories, the point-of-care Cepheid assay frequently mentioned in the press and the CDC test.

Comparison of SARS-CoV-2 RT-PCR tests approved for use by the FDA.

| Test Developer | LOD (cp/mL) | LOD Method | How It Works | Specimen Type |
| --- | --- | --- | --- | --- |
| Abbott (RealTime SARS CoV-2 Assay) | 100 | Serial dilutions of AccuPlex SARS-CoV-2 Reference material (recombinant Sindbis virus particle containing target sequences from SARS-CoV-2 genome) in simulated background matrix | 2 primer and probe sets to detect the RdRp and N genes | NP and OP swabs |
| CDC | 3,000*; 1,000† | Serial dilutions of in vitro transcribed full length RNA into suspensions of cells in viral transport media; tested via 2 different sample extraction methods (Qiagen EZ1 Advanced XL* vs. Qiagen DSP Viral RNA Mini Kit†) | 2 primer and probe sets to detect regions of the N gene | NP or OP swabs, sputum, lower respiratory tract aspirates, bronchoalveolar lavage and nasopharyngeal wash/aspirate or nasal aspirate |
| Cepheid (Xpert Xpress SARS CoV-2 Test) | 250 | Serial dilutions of AccuPlex SARS-CoV-2 (recombinant Sindbis virus particle containing target sequences from SARS-CoV-2 genome) in simulated background matrix | 2 primer and probe sets to detect the E and N2 genes | NP, nasal, or mid-turbinate swab and/or nasal wash/aspirate specimens |
| Labcorp (COVID-19 RT-PCR Test) | 6,250 | Serial dilutions of live SARS-CoV-2 into simulated background matrix | 3 primer and probe sets to detect regions of the N gene | NP or OP swabs, sputum, lower respiratory tract aspirates, bronchoalveolar lavage and nasopharyngeal wash/aspirate or nasal aspirate |
| Quest (Quest SARS-CoV-2 rRT-PCR) | 137 | Serial dilutions of the 1100 nucleotide region of the N gene | 2 primer and probe sets to detect regions of the N gene | NP or OP swabs, sputum, tracheal aspirates and bronchoalveolar lavage |

As you can see, there are a lot of nuances to consider. The LOD, and therefore the overall clinical sensitivity of a test, varies depending on the commercial vendor, how the LOD estimate was determined (some companies used intact virus, which better simulates real samples, whereas others used only nucleotide sequence) and the specimen type submitted (some vendors validated their assays only for certain specimen types). On the bright side, the LODs of current diagnostics should allow them to detect at least early-infection (within 1 week), symptomatic COVID-19 patients if the mean viral loads reported thus far are representative. For example, the specimen studies cited earlier indicate that NP swabs have on the order of 10^5 to 10^6 cp/mL of viral RNA. Even the less sensitive tests have LODs in the thousands of cp/mL, so the levels of RNA in clinical samples during acute infections should be easily detectable. Early studies also suggest that the viral load in the upper respiratory tract may be similar between people who have asymptomatic or pre-symptomatic infection and those who are symptomatic, so being asymptomatic may not increase the chances of having a false-negative test.
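The comparison between viral load and LOD is easiest to see on a log scale. The sketch below uses the LODs listed in the table and the lower end of the NP swab viral load range quoted above; the names are hypothetical, and for the CDC test it assumes the higher (less sensitive) of the two listed values. It ignores specimen type, collection timing and transport, so it is a back-of-the-envelope check rather than a prediction of clinical sensitivity.

```python
import math

# LODs in cp/mL as listed in the table above (CDC: the higher of its two values)
LOD_CP_PER_ML = {
    "Abbott": 100,
    "CDC": 3000,
    "Cepheid": 250,
    "Labcorp": 6250,
    "Quest": 137,
}


def log10_margin(viral_load, lod):
    """How many orders of magnitude the viral load sits above the LOD.
    A negative value means the load is below the LOD."""
    return math.log10(viral_load / lod)


np_swab_load = 1e5  # cp/mL, lower end of the range quoted in the text
for test, lod in LOD_CP_PER_ML.items():
    print(f"{test}: {log10_margin(np_swab_load, lod):.1f} log10 above the LOD")
```

Under these assumptions every listed test has at least one order of magnitude of headroom at 10^5 cp/mL, which is the point the paragraph makes in words.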

How Likely Is It That SARS-CoV-2 Is Reinfecting Recovered Patients?

While research on immunity to SARS-CoV-2 after infection is ongoing, it is unlikely that patients are becoming reinfected shortly after their initial infection. A study performed in Zhejiang, China monitored 13 COVID-19 patients in home quarantine over a period of 4 weeks after they were discharged from the hospital. Of these, 2 patients had SARS-CoV-2 RNA detected consistently in their stool for 14 and 15 days after discharge. Of the sputum samples collected from the quarantined patients during the 4-week period, the rate of a recurring positive test was 31%. Interestingly, these patients showed no other abnormal clinical signs or symptoms when testing positive after their initial discharge. A peer-reviewed study of a large cohort of patients by researchers in Wuhan, China demonstrated that the virus was detectable by RT-PCR in patients who survived COVID-19 for up to 37 days, with a median time of 20 days. For comparison, viral loads for influenza virus have been shown to decrease greatly after 7 days, although this varies based on patient comorbidities and antiviral treatment. Lingering positive results may be explained by viral RNA remaining in tissues for a considerable amount of time, even after the viral particles capable of causing infection have been cleared. In most cases, patients who have tested positive for SARS-CoV-2 after their original infection are asymptomatic or not clinically worse at the time of retesting. If a patient tests positive, then has a negative test followed by a positive test within a short period of time, this is most likely due to a false-negative test occurring between the positive tests (perhaps due to one of the factors described above).

As yet, there is no consensus on how accurate our testing is, and given the potential for asymptomatic carriage and prolonged viral shedding post-infection, we likely have a long road ahead and many lessons to learn. But understanding the limitations and pitfalls of our testing is incredibly important for serving both the public and patients. No test is perfect, and we are all learning together on this one.

Thank you to Rose Lee, M.D., Beth Israel Deaconess Medical Center, for her contributions to this article, including the table comparing tests.


Author: Andrea Prinzi, Ph.D., MPH, SM(ASCP)

Andrea Prinzi, Ph.D., MPH, SM(ASCP) is a field medical director of U.S. medical affairs and works to bridge the gap between clinical diagnostics and clinical practice.