Diagnostic Accuracy of COVID-19 Antibody Tests Authorized by FDA Philippines: A Systematic Review and Meta-Analysis

Introduction: Coronavirus Disease 2019 (COVID-19) is a highly infectious disease caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) that has infected many people all over the world. One of the best ways to lessen its spread is through early detection and diagnosis. Various serological tests are now being used as surveillance tools to detect antibodies produced in response to SARS-CoV-2. This study aims to evaluate the diagnostic accuracy and performance of the available COVID-19 antibody tests authorized by the Food and Drug Administration (FDA) Philippines that make use of Enzyme-Linked Immunosorbent Assay (ELISA), Chemiluminescence Immunoassay (CLIA), and Lateral Flow Immunoassay (LFIA).
Method: Complete published journal articles relevant to the diagnostic accuracy of the three antibody tests were collected using trusted medical journal search engines. The quality of the included studies was assessed using QUADAS-2 to determine the risk of bias and applicability concerns. Forest plots were used to summarize the sensitivity and specificity of LFIA, ELISA, and CLIA in detecting various antibodies. Pooled sensitivity and specificity were estimated using bivariate random-effects models, together with their log-likelihoods, a corresponding chi-square test statistic, and the area under the summary Receiver-Operating Characteristic (sROC) curve, to examine potential heterogeneity in the data and to assess the diagnostic accuracy of the COVID-19 antibody tests.
Results: The pooled sensitivities in detecting IgG based on CLIA, ELISA, and LFIA were 81.7%, 58.7%, and 74.3%, respectively, with an overall pooled sensitivity of 72.0%. For IgM detection, LFIA had a higher pooled sensitivity (69.6%) than CLIA (61.0%), with an overall pooled sensitivity of 68.5%. For IgA detection, only ELISA-based tests were included, with a pooled sensitivity of 84.8%. Pooled sensitivities for combined antibodies based on ELISA and LFIA were 89.0% and 81.6%, respectively, with an overall pooled sensitivity of 82.5%. All tests except ELISA-IgA displayed high pooled specificities, ranging from 94.0% to 100.0%. The diagnostic accuracies of the tests in detecting IgG, IgM, and combined antibodies were almost perfect based on the computed areas under the sROC curve of 0.973, 0.953, and 0.966, respectively.
Conclusion: In this systematic review and meta-analysis, existing evidence on the diagnostic accuracy of antibody tests for COVID-19 was found to be characterized by high risks of bias, consistently heterogeneous sensitivities, and consistently homogeneous, high specificities except in IgA detection using ELISA. The bivariate random-effects models showed no significant differences in sensitivity among CLIA, ELISA, and LFIA in detecting IgG, IgM, and combined antibodies at the 95% confidence level. Nonetheless, CLIA, ELISA, and LFIA were found to have excellent diagnostic accuracies in the detection of IgG, IgM, and combined antibodies, as reflected by their AUC values.


Introduction
In December 2019, a number of pneumonia cases of unknown cause were identified in Wuhan City, China. The pathogen of this pneumonia-like disease was later identified as a novel coronavirus, which was subsequently named Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). The World Health Organization (WHO) named the disease caused by this pathogen Coronavirus Disease 2019 (COVID-19). This disease is highly infectious and has infected many people all over the world. SARS-CoV-2 belongs to the virus family Coronaviridae, whose members cause diseases of the respiratory system. Patients infected with this virus usually present with fever, dry cough, and tiredness. Other symptoms include aches and pains, sore throat, diarrhea, conjunctivitis, headache, and loss of smell. Patients with serious cases of COVID-19 may have difficulty breathing, chest pain or pressure, or loss of speech or movement. However, there are also asymptomatic cases, which may pose a greater danger because these patients show no physical manifestations of infection and may unknowingly spread the virus to other people. As of January 28, 2021, there were 519,575 cases and 10,552 deaths in the Philippines, and 101,400,862 cases and 2,182,193 deaths worldwide [1].
The diagnosis of COVID-19 is done either directly or indirectly. The direct approach involves the molecular detection of the viral genome through nucleic acid amplification techniques [2]. The indirect approach is termed as such because it does not explicitly detect the presence of the virus; rather, indirect tests report the development of antibodies that correlate with former or present infections [3]. Currently, the gold standard in COVID-19 diagnosis is a direct method, real-time reverse transcription PCR (rRT-PCR). Although rRT-PCR is already well documented as an efficient diagnostic system, the significance of indirect methods such as serological testing should not be disregarded, for they play a substantial role in a different but principal aspect of the response: disease surveillance and disease control.
Serological tests detect antibodies, which are part of the body's adaptive defense mechanism against infections. The presence of antibodies, however, is not entirely concurrent with that of a pathogen; instead, it may indicate a past infection. This is crucial for disease mitigation, epidemiology, and even vaccine formulation. With the information gathered from antibody detection through serological means, the nature of infection recurrences can be studied further. Antibody detection can be done through three different testing mechanisms: Enzyme-Linked Immunosorbent Assay (ELISA), Chemiluminescence Immunoassay (CLIA), and Lateral Flow Immunoassay (LFIA). This systematic review and meta-analysis aims to assess the overall diagnostic accuracy and performance of different Food and Drug Administration (FDA) Philippines-authorized antibody diagnostic test kits in terms of their overall diagnostic sensitivity, diagnostic specificity, and area under the summary Receiver-Operating Characteristic (sROC) curve.

Sampling Design
This study is a systematic review and meta-analysis of published articles on the diagnostic accuracy of three types of COVID-19 antibody tests, namely Enzyme-Linked Immunosorbent Assay (ELISA), Chemiluminescence Immunoassay (CLIA), and Lateral Flow Immunoassay (LFIA).

Research Instrument
The search strategy followed the four-phase systematic review process adopted from the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). Articles retrieved from trusted medical search engines were screened and evaluated based on the inclusion and exclusion criteria listed below. This systematic review followed the guidelines provided in the Cochrane Handbook for Diagnostic Test Accuracy Reviews. Table 1 shows the selection criteria for deciding whether an article was to be included in or excluded from this study. In addition to these criteria, sources appearing in Beall's List of Potential Predatory Journals and Publishers were excluded. This study only utilized complete and original published articles. Editorials, narrative reviews, commentaries, and textbook chapters were excluded. This paper is a systematic review and meta-analysis designed and structured to synthesize primary data. Retrospective, cohort, experimental, descriptive, and case series research designs that reported diagnostic sensitivity and specificity were accepted. Primary sources that satisfied the inclusion criteria and focused on the following tests were synthesized: Enzyme-Linked Immunosorbent Assays (ELISA), Chemiluminescence Immunoassays (CLIA), and Lateral Flow Immunoassays (LFIA). Other types of tests were not included in this study.

Data Extraction
To organize and collate the data extracted from all included studies, a custom Google spreadsheet was used. Two review authors, J.F. and S.A.E.O., independently performed the data extraction by collecting the following characteristics: general information about the study, including the author/s, year of publication, study design, and country of origin; the target population, including the age group, case severity, and COVID-19 status; and details about the antibody detection testing kit used, such as the brand name, manufacturer, specimen used, assay classification, sensitivity, and specificity. Conflicts or discrepancies between the data extracted by the two review authors were resolved by a third reviewer (E.D.E.D.S.).

Quality Assessment
Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) is a structured tool used to determine the risk of bias and assess the applicability of diagnostic accuracy studies. Three review authors, C.R.R.C., K.V.H.E., and R.LG.E., evaluated the articles based on four (4) domains: patient selection, index test, reference standard, and flow and timing. To assess the risk of bias, each domain included a set of signaling questions. The first three domains were also assessed with regard to applicability concerns. Conflicts among the authors were resolved through consensus. Review Manager (RevMan), the software provided by Cochrane to manage systematic reviews and meta-analyses, was used by the authors in the assessment of risk of bias and applicability concerns, as well as in the analysis of data.

Data Analysis
To assess the diagnostic accuracy and performance of each antibody testing kit authorized by FDA Philippines, forest plots were used to summarize the performance of the three testing mechanisms, namely LFIA, ELISA, and CLIA, in terms of their specificity and sensitivity in detecting IgG, IgM, and IgA antibodies. To further investigate the reported results, pooled sensitivity and specificity were estimated using bivariate random-effects models, together with their log-likelihoods, a corresponding chi-square test statistic, and the area under the summary Receiver-Operating Characteristic (sROC) curve, to examine potential heterogeneity in the data.
Figure 1 shows the study selection process. In the initial search, the researchers identified thirty-five (35) journal articles about COVID-19 antibody test kits from trusted medical journal search engines. After removal of duplicates, thirty-four (34) studies were screened, and nine (9) of these were excluded for the following reasons: three (3) were published in sources included in Beall's list of potential predatory journals, two (2) were published before December 2019, and the study designs of four (4) fell under the exclusion criteria. Twenty-five (25) full-text articles were further assessed for eligibility. Of these, the researchers excluded three (3) articles that did not provide absolute data for specificity and sensitivity and another three (3) articles that did not use the gold standard RT-PCR to diagnose COVID-19. After careful assessment and screening, a total of nineteen (19) journal articles [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22] were found appropriate for inclusion in this systematic review and meta-analysis.

Risk of Bias within Studies
The following figures display the risk of bias and applicability concerns for the studies included in this systematic review and meta-analysis. The unified judgment on the risk of bias was reached by answering signaling questions tailored by the researchers. This procedure is based on QUADAS-2, the current version of Quality Assessment of Diagnostic Accuracy Studies (QUADAS), a tool used in systematic reviews to assess the risk of bias and applicability of primary diagnostic accuracy studies, as recommended by the University of Bristol.
Figure 2 presents the summary of the risk of bias and applicability concerns for each domain of quality assessment. More than 90% of the included studies have a high risk of bias in terms of patient selection. Most of this can be attributed to the non-randomized or non-consecutive selection of patients or specimens. About 80% of the studies have a high risk of bias in terms of the index test. For most of the studies, the status of the specimens (positive or negative via RT-PCR) was already known before the index test (CLIA, LFIA, or ELISA) was performed. In terms of the reference standard, about half of the studies showed a high risk of bias because they did not state the use of RT-PCR as the reference or gold standard for obtaining a positive result for COVID-19. Lastly, about 90% of the included studies presented a high risk of bias in terms of flow and timing of the test. Bias may have been introduced in studies that did not explicitly state the timing or interval between the index test and the reference standard, as well as in studies in which participants withdrew during the course of the study. The breakdown of quality assessment for each study is shown in Figure 3.
Table 2 shows the summary of the antibody test kits included in the study. It lists the authors of the nineteen (19) included studies [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22], along with their Rapid Diagnostic Test Identification (RDT-ID), the brands of the antibody test kits used per study, the manufacturer, the type of assay (LFIA, CLIA, or ELISA), and the type of antibody tested (IgM, IgG, IgA, or Combined).
Table 3 shows the summary of characteristics of the nineteen (19) studies [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22] included in this systematic review and meta-analysis. It lists the author of each study with the corresponding Rapid Diagnostic Test Identification (RDT-ID) used in Table 2. For every article, the characteristics included were the Location, Study Design, Sample Size, and Reference Standard. There were no limitations on the setting as long as the article evaluated a brand of test kit approved by FDA Philippines. The study designs included were cohort, retrospective, experimental, descriptive, and case series studies. Sample sizes were based on the number of samples used in every article, which could be whole blood, serum, or plasma. Lastly, the reference standard was noted for each article. In accordance with the inclusion criteria, samples must have been confirmed by reverse transcription polymerase chain reaction (RT-PCR).
Table 4 shows the individual sensitivity and specificity of the antibody tests included in the analysis. Data for the number of positive sera confirmed by RT-PCR (N+) and the number of control sera or samples from non-COVID patients (N-) were gathered to compute the true positive, false negative, false positive, and true negative values. Sensitivity and specificity for each brand of assay are indicated based on the type of antibody tested.
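The relationship between N+, N-, and the four cells of the 2x2 table described for Table 4 can be sketched as follows. This is a minimal illustration rather than the authors' actual spreadsheet procedure; the class name `TwoByTwo`, the helper `from_reported`, and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from comparing an antibody test against RT-PCR."""
    tp: int  # RT-PCR positive, antibody test positive
    fn: int  # RT-PCR positive, antibody test negative
    fp: int  # RT-PCR negative, antibody test positive
    tn: int  # RT-PCR negative, antibody test negative

    @property
    def sensitivity(self) -> float:
        # Share of RT-PCR-confirmed positives that the test detects.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of controls that the test correctly reports as negative.
        return self.tn / (self.tn + self.fp)


def from_reported(n_pos: int, n_neg: int, sens: float, spec: float) -> TwoByTwo:
    """Recover the four counts from N+, N-, and reported proportions."""
    tp = round(n_pos * sens)
    tn = round(n_neg * spec)
    return TwoByTwo(tp=tp, fn=n_pos - tp, fp=n_neg - tn, tn=tn)


# Hypothetical study: 100 RT-PCR-positive sera and 200 control sera.
example = from_reported(n_pos=100, n_neg=200, sens=0.80, spec=0.95)
print(example, example.sensitivity, example.specificity)
```

Rounding is needed because reported percentages rarely map back to whole counts exactly; where a study reports the counts directly, those should be preferred.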

Meta-analysis of Diagnostic Accuracy of COVID-19 Antibody Tests
This chapter presents the results of the meta-analysis of the diagnostic accuracy and performance of COVID-19 antibody tests authorized by FDA Philippines, focusing on comparisons of the performance of CLIA, ELISA, and LFIA. This chapter is divided into four (4) parts: (a) forest plots of sensitivity and specificity of individual studies, (b) summary receiver operating characteristic (sROC) curves with their corresponding areas under the curve (AUC), (c) pooled sensitivity and specificity estimated using a bivariate random-effects model, and (d) subgroup analysis.
Table 5 presents a summary of the number of studies that have data on the performance of different COVID-19 antibody tests. A total of sixteen (16) studies reported data for IgG detection, nine (9) studies for IgM, three (3) studies for IgA, and ten (10) studies for combined antibody detection. Note that for LFIA, some studies tested multiple brands of the same test type, which appear as multiple entries from one study in the forest plots. To avoid bias, the results were not pooled within those studies.

Forest Plots of Diagnostic Performance of COVID-19 Antibody Tests
The following figures are the forest plots of the diagnostic performance of COVID-19 antibody tests in the individual studies included in the meta-analysis. The plots include the number of true positives, false positives, false negatives, and true negatives, as well as the computed sensitivities and specificities with their corresponding confidence intervals. Visual plots of the estimated sensitivity and specificity are also shown in the forest plots.
Figure 4 shows the forest plot of the individual studies on the detection of IgG antibodies. For the studies that reported the performance of CLIA, sensitivity ranges from 53.0% to 95.0%. Notice that the two (2) records for CLIA have statistically different sensitivities, as shown by their non-overlapping confidence intervals. In comparison, the range of sensitivities for ELISA is lower at 38.0% to 75.0%, while for LFIA the sensitivity is as low as 44.0% and as high as 100.0%. In terms of specificity, on the other hand, the reported data do not vary much, with a range of 93.0% to 100.0% for all types of tests.
In terms of IgA antibody detection, Figure 6 shows that the sensitivity of ELISA ranges from 83.0% to 87.0%, which is higher and narrower than the reported sensitivities for IgG and IgM. On the other hand, specificities range from 59.0% to 86.0%, which is lower than the results for the other two (2) antibodies.
Figure 7 presents the forest plot of the studies that reported the diagnostic performance of ELISA and LFIA in detecting combined antibodies. For ELISA, sensitivity ranges from 84.0% to 93.0%, which is considerably higher than in the previous reports. Specificities range from 88.0% to 97.0%. Neither the sensitivities nor the specificities of the two (2) included ELISA studies differ statistically at the 95% level of confidence. On the other hand, sensitivities for LFIA range from as low as 57.0% to as high as 100.0%. The reported specificities vary little, ranging from 90.0% to 100.0% for all types of tests.
To summarize, the forest plots of the diagnostic performance of COVID-19 antibody tests in detecting IgG, IgM, and combined antibodies indicate that the data for sensitivity show a low to moderate level of heterogeneity, while the data for specificity are highly homogeneous. On the other hand, the sensitivities in detecting IgA antibodies are more homogeneous than the corresponding specificities. To further test this claim, the authors estimated a bivariate random-effects model and conducted a subgroup analysis to determine whether heterogeneity is present in the data. The results of the models are presented in Tables 6 to 9 of this chapter.
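The comparison of confidence intervals used when reading the forest plots can be illustrated with a short sketch. The text does not state which interval method the forest plots use, so the Wilson score interval below is an assumption chosen for illustration, and the counts (53 of 100 and 95 of 100) are hypothetical values that merely echo the reported CLIA sensitivity range.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical counts: true positives out of RT-PCR-confirmed positives.
ci_low_study = wilson_interval(53, 100)
ci_high_study = wilson_interval(95, 100)
print(ci_low_study, ci_high_study, intervals_overlap(ci_low_study, ci_high_study))
```

Non-overlapping 95% intervals indicate a statistically significant difference, but the converse does not hold: two intervals can overlap while the difference between the estimates is still significant, which is one reason the formal model comparison is reported as well.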

Summary Receiver-operating Characteristics Curves of COVID-19 Antibody Tests
The following figures display the summary ROC (sROC) curves generated from all the included studies of each testing mechanism. In each figure, the testing mechanisms are compared by their performance in detecting the same immunoglobulin type, using the sensitivity and specificity reported by each study of that testing mechanism. The hollow shapes inside the plot indicate the reported sensitivity against (1 - specificity) from every included study and are scaled by the inverse variance of the study, while the solid curved lines are the ROC curves. The Area Under the Curve (AUC) was also computed to summarize the overall diagnostic accuracy of the test. It ranges from 0 to 1, wherein an AUC of 0 pertains to a perfectly inaccurate test whereas an AUC of 1 pertains to a perfectly accurate test [23].
Figure 8 shows the sROC curves generated from studies of antibody test kits that detect IgG antibodies. The black circles and black curved line represent the studies on CLIA test kits, the red diamonds and red curved line represent ELISA, and the green boxes and green curved line represent LFIA. Based on the figure, CLIA and LFIA seem to have almost equal diagnostic performance in detecting IgG due to their nearly overlapping sROC curves. On the other hand, ELISA is shown to have poorer diagnostic performance than the other two antibody tests because its curve is relatively farther from the upper left corner, which indicates a perfect test, and relatively closer to the dotted diagonal line, or line of no effect. Additionally, an AUC value of 0.973 was computed, which denotes that the diagnostic tests can sufficiently differentiate diseased from healthy individuals.
Figure 9 shows the sROC curves for the performance of test kits in detecting IgM antibodies. The black circles and black curved line represent the studies on CLIA test kits, while the red diamonds and red curved line represent LFIA. LFIA is seen to have lower diagnostic performance than CLIA because its sROC curve is relatively closer to the line of no effect. Nonetheless, the performance of the two test types is statistically tied as per the bivariate random-effects model. An AUC value of 0.953 was obtained, which implies that the diagnostic tests can reliably discriminate diseased from healthy individuals.

Figure 9. Summary ROC curve of COVID-19 antibody test for IgM antibodies
The black dots and the black line represent the studies for ELISA. Based on Figure 10, the dots occupy the upper middle portion of the graph, which shows a good balance between the sensitivity and specificity of the test. In contrast, the points in the previous sROC charts are clustered on the far left of the chart, which reflects very high specificities at the expense of sensitivity. The AUC could not be computed due to the limited number of studies available.

Figure 10. Summary ROC curve of COVID-19 antibody test for IgA antibodies
The black dots and line represent the studies for ELISA, while the red dots and line represent the studies for LFIA. Based on Figure 11, LFIA has slightly better diagnostic performance than ELISA. Nonetheless, the performance of the two test types is statistically tied as per the bivariate random-effects model. The AUC value of 0.966 also indicates a high-performing and accurate test.
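To make the AUC values above more concrete, the sketch below integrates a curve of (1 - specificity, sensitivity) points with the trapezoidal rule. It is only an illustration of what an area under an ROC-type curve measures: the text does not describe how the software derived the AUC from the fitted sROC curve, and the function name `trapezoid_auc` and the sample points are hypothetical.

```python
def trapezoid_auc(points: list[tuple[float, float]]) -> float:
    """Area under a curve of (1 - specificity, sensitivity) points.

    The corner points (0, 0) and (1, 1) are added so that the curve
    always spans the whole horizontal axis.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Each adjacent pair of points contributes one trapezoid.
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical curve that rises steeply near the left edge, as in
# tests with very high specificity.
sample_curve = [(0.02, 0.70), (0.10, 0.90)]
print(round(trapezoid_auc(sample_curve), 3))  # 0.926

# Reference cases: the diagonal "line of no effect" and a perfect test.
print(trapezoid_auc([]))            # 0.5
print(trapezoid_auc([(0.0, 1.0)]))  # 1.0
```

The two reference cases show why a curve close to the dotted diagonal indicates poor discrimination, while a curve hugging the upper left corner gives an area close to 1.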

Pooled Sensitivities and Specificities of COVID-19 Antibody Tests from the Bivariate Random Effects Model
The following tables present the estimated pooled sensitivity and specificity of different COVID-19 antibody tests using the bivariate random-effects model, as this is the preferred method for estimating summary values of sensitivity and specificity and their correlation, as well as for evaluating how their expected values may vary with study-level covariates [24].
As presented in Table 6, the pooled sensitivities vary widely. ELISA has the lowest pooled sensitivity at 58.7%, while CLIA has the highest pooled sensitivity at 81.7%, although CLIA also has the widest confidence interval, with a lower bound of 38.4% and an upper bound of 97.0%. Overall, the pooled sensitivity of all COVID-19 antibody tests in detecting IgG antibodies is 72.0% (95% CI: [63.5%, 79.1%]). From the given data, no significant difference is seen in the pooled sensitivities of the antibody tests at the 95% confidence level.
On the other hand, the pooled specificities of the tests are closer to one another than their sensitivities. CLIA has the highest pooled specificity at 99.3%, followed by ELISA at 98.9% and LFIA at 98.8%. Overall, the pooled specificity for all tests is 98.8% (95% CI: [97.8%, 99.3%]). In any case, the COVID-19 antibody tests are very reliable in identifying specimens negative for COVID-19.
Based on Table 7, CLIA has a lower pooled sensitivity (61.0%) than LFIA (69.6%). Overall, the pooled sensitivity of all COVID-19 antibody tests in detecting IgM antibodies is 68.5% (95% CI: [55.4%, 79.2%]). No significant difference is seen in the pooled sensitivities of the antibody tests at the 95% confidence level.
On the other hand, the pooled specificities of the tests are closer to one another than their sensitivities. Both studies for CLIA reported a specificity of 100.0%, hence the pooled specificity stayed at 100.0%, while the pooled specificity of LFIA is 98.4% (95% CI: [96.6%, 99.2%]). Overall, the pooled specificity for all tests is 98.7% (95% CI: [97.2%, 99.4%]). The COVID-19 antibody tests are therefore very reliable in identifying specimens negative for COVID-19.
Based on Table 8, the pooled sensitivity of ELISA in detecting IgA antibodies is 84.8% (95% CI: [80.7%, 88.2%]). Meanwhile, the pooled specificity is 77.5% (95% CI: [61.5%, 88.2%]). These results suggest that ELISA test kits are considerably more sensitive in detecting IgA antibodies than in detecting IgG or IgM antibodies. However, their specificity for IgA is lower than for IgG and IgM.
Table 9 presents the estimated pooled sensitivity and specificity of different COVID-19 antibody tests in detecting a combination of IgG, IgM, or IgA antibodies using the bivariate random-effects model. LFIA has a lower pooled sensitivity (81.6%) than ELISA (89.0%). Overall, the pooled sensitivity of all COVID-19 antibody tests in detecting combined antibodies is 82.5% (95% CI: [73.3%, 89.0%]). From the given data, there is no significant difference in the pooled sensitivities of the antibody tests at the 95% confidence level.
On the other hand, the pooled specificities of the tests are closer to one another than their sensitivities. The pooled specificity for ELISA is lower at 94.0% (95% CI: [84.8%, 97.8%]) compared to the pooled specificity of LFIA at 98.3% (95% CI: [96.5%, 99.2%]). Overall, the pooled specificity for all tests is 97.9% (95% CI: [96.0%, 98.9%]). The COVID-19 antibody tests are therefore very reliable in identifying specimens negative for COVID-19.
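The idea of pooling study-level proportions under a random-effects assumption can be sketched with a deliberately simplified stand-in. The code below is not the bivariate model used for Tables 6 to 9: it pools one proportion at a time on the logit scale with the DerSimonian-Laird estimator and ignores the correlation between sensitivity and specificity. The function name, the 0.5 continuity correction, and all counts are assumptions made for illustration.

```python
import math


def pool_logit_random_effects(events, totals, z=1.96):
    """Univariate DerSimonian-Laird pooling of proportions on the logit scale.

    Returns (pooled, ci_low, ci_high, tau2), where tau2 is the estimated
    between-study variance of the logit proportions.
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        # 0.5 continuity correction keeps the logit finite at 0% or 100%.
        a, b = e + 0.5, n - e + 0.5
        logits.append(math.log(a / b))
        variances.append(1 / a + 1 / b)

    # Fixed-effect (inverse-variance) estimate and Cochran's Q.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))

    # Method-of-moments estimate of the between-study variance.
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(logits) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum_w_re
    se = math.sqrt(1 / sum_w_re)

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - z * se), inv_logit(mu + z * se), tau2


# Hypothetical true-positive counts out of RT-PCR-confirmed positives.
pooled, ci_low, ci_high, tau2 = pool_logit_random_effects([53, 95], [100, 100])
print(pooled, ci_low, ci_high, tau2)
```

With only two very different studies, the between-study variance is large and the interval becomes wide, which mirrors the wide CLIA interval described for Table 6.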

Subgroup Analysis using Bivariate Random-effects Model with Covariates
Since there are large differences in the reported sensitivities for the detection of IgG, IgM, and combined antibodies, the researchers conducted a subgroup analysis to further investigate the potential heterogeneity in the data.
Using the bivariate random-effects model with covariates, wherein the test types are used as covariates, the researchers tested for significant differences in the estimated sensitivity parameter for each test type while holding the specificity fixed. The log-likelihood of the models increased only slightly. In addition, all the p-values of the test statistic are greater than 0.05; hence, there is insufficient evidence to conclude that the models with covariates fit better than the overall model without covariates. Therefore, the type of test does not significantly affect the diagnostic performance of the antibody test, and there is no significant difference between the true sensitivities of CLIA, ELISA, and LFIA in detecting COVID-19 IgG antibodies, of CLIA and LFIA in detecting COVID-19 IgM antibodies, and of ELISA and LFIA in detecting combined antibodies.
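The comparison described above corresponds to a likelihood-ratio test, which can be sketched as follows. The log-likelihood values are hypothetical, since the fitted values are not reproduced in the text, and the degrees of freedom are an assumption: one extra sensitivity parameter for each test type beyond the first, giving 2 for the IgG model and 1 for the IgM and combined models.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail chi-square probability, implemented for df = 1 or 2 only."""
    if x < 0:
        raise ValueError("the statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise NotImplementedError("only df = 1 or 2 are covered in this sketch")


def likelihood_ratio_test(ll_without_cov: float, ll_with_cov: float, df: int):
    """Return (statistic, p-value) for nested models compared by log-likelihood."""
    # Adding covariates cannot lower the maximized log-likelihood, so a
    # negative difference is treated as numerical noise.
    stat = max(0.0, 2 * (ll_with_cov - ll_without_cov))
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for an IgG-style comparison (three test types).
stat, p_value = likelihood_ratio_test(ll_without_cov=-50.0, ll_with_cov=-48.9, df=2)
print(round(stat, 2), round(p_value, 3))  # 2.2 0.333
```

A p-value above 0.05, as in this hypothetical case, means the small gain in log-likelihood does not justify the extra parameters, which is the pattern reported for all three subgroup models.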

Summary
Based on the included studies, the forest plots of the reported diagnostic performance of COVID-19 antibody tests showed that CLIA had its highest sensitivity range in detecting IgG antibodies, while the highest sensitivity ranges for ELISA and LFIA were seen in the detection of combined antibodies. However, the reported sensitivity in detecting IgA antibodies using ELISA is higher and narrower in range than any of the reported sensitivities for IgG and IgM. In terms of specificity, the reported ranges vary little across the types of tests. This indicates that the data for sensitivity show a low to moderate level of heterogeneity while the data for specificity are highly homogeneous.
Summary ROC curves were used to visually assess the heterogeneity of the data for each test type in terms of diagnostic performance. In IgG detection, CLIA and LFIA have almost equal diagnostic performance since their summary ROC curves nearly overlap. For IgM detection, LFIA has slightly lower diagnostic performance than CLIA. The use of ELISA in IgA detection, on the other hand, shows a good balance between the sensitivity and specificity of the test. In contrast, the summary ROC curves for IgG and IgM detection show very high specificities at the expense of sensitivity. Lastly, LFIA has slightly better diagnostic performance than ELISA in the detection of combined antibodies. In addition, the area under the curve was computed to evaluate the diagnostic accuracy of the tests. Most of the values are close to 1, which indicates that the tests are almost perfectly accurate. The detection of IgG has the highest computed value, followed by combined antibodies and lastly by IgM. However, the AUC value for IgA could not be computed due to the limited number of studies available.
To investigate the potential heterogeneity in the data, pooled sensitivity and specificity were estimated using bivariate random-effects models. These showed no significant differences in sensitivity among CLIA, ELISA, and LFIA in detecting IgG, IgM, and combined antibodies at the 95% confidence level. For IgA antibody detection, ELISA showed better sensitivity than in the detection of IgG and IgM antibodies. Since there were large differences in the reported sensitivities, a subgroup analysis was conducted, in which all computed p-values were greater than 0.05. Therefore, the sensitivity of the antibody tests in the detection of IgG, IgM, and combined antibodies does not vary significantly by test type. All tests are considered to have high specificities in identifying specimens negative for COVID-19, although specificity is relatively lower for IgA antibodies despite the higher sensitivity.

Conclusion and Recommendations
In this systematic review and meta-analysis, existing evidence on the diagnostic accuracy of antibody tests for COVID-19 was found to be characterized by high risks of bias. The consistent heterogeneity of sensitivities may be attributed to differences in the number of available studies and in patient characteristics such as the time of sample collection and symptom onset. On the other hand, consistently homogeneous and high specificities were observed except in IgA detection using ELISA, which may be influenced by the number of available studies and the possible presence of other viral infections at the time of sample collection. Based on their AUC values, all test types (CLIA, ELISA, and LFIA) were found to have excellent diagnostic accuracies in the detection of IgG, IgM, and combined antibodies, mostly driven by their outstanding specificities.
Future studies that aim to evaluate the diagnostic accuracy of SARS-CoV-2 antibody test kits through systematic reviews should design a more balanced approach to gathering relevant studies. This can be accomplished by collecting articles across a wider range of journal databases and by implementing more flexible yet structured inclusion and exclusion criteria. It is also notable that the sensitivity and specificity of the CLIA antibody testing kits are the highest among the three testing mechanisms. The results may be skewed in this manner due to the lack of journal articles that represent specific brands for CLIA. This imbalance can be addressed by gathering the same number of articles for each testing mechanism to reduce the disparity in the number of references and its effect on the results.