Simulations of Test Reduction Using Pooled Heavy Metals Analysis in Cannabis

We developed an algorithm to improve heavy metal testing throughput and costs by combining sample batches into pools.

Tom B. Dupree1, Amanda D. Assen1, Eric Janusson1, Amber R. Wise2, Josh M. Swider3, and Markus Roggen1

1 – Delic Labs, 3800 Wesbrook Mall, Vancouver, BC Canada V6S2L9
2 – Medicine Creek Analytics, 3700 Pacific Hwy E #400, Fife, WA 98424, United States
3 – Infinite Chemical Analysis Labs, 8312 Miramar Mall, San Diego, CA 92121, United States

*Corresponding authors email address:,

ORCID Numbers:
Tom B. Dupree: 0000-0002-5158-5833
Eric Janusson: 0000-0002-3207-7067
Amanda D. Assen: 0000-0002-2378-8158
Markus Roggen: 0000-0003-0980-4331
Amber R. Wise: 0000-0001-7179-0685
Josh M. Swider: 0000-0002-8573-2475


Background: Cannabis species have a propensity to bioaccumulate toxic heavy metals from their growth media. Increased testing for these metals is required to improve the safety of the legal medical and recreational cannabis industries. However, the current methods used for mandated heavy metals tests are not efficient at large scale. As a result, there is limited testing capacity, high testing costs, and long wait times for results across North America.

Objective: This study aimed to demonstrate that pooling strategies can be used to increase the throughput in cannabis testing labs and reduce some of the strain on the industry.

Methods: This paper presents an algorithm to simulate different pooling strategies. The algorithm was applied to real-world data sets collected from Washington and California state testing labs.

Results: Using a single pooling method, a pool size of three samples on average resulted in a 23.8% reduction in tests required for 100 samples for the Washington lab. For the California lab, pooling four samples on average resulted in a 54.1% reduction in tests required for 100 samples.

Conclusion: The algorithms generated from the Washington and California lab data demonstrated that pooled testing strategies can be developed on a case-by-case basis to reduce the time, effort, and costs associated with heavy metals tests.

Highlights: The benefits of pooled testing will vary depending on the region and rate of contamination seen in each testing lab. Overall, our results demonstrate that pooled testing has the potential to reduce the fiscal costs of testing through increased efficiency, allowing increased testing and leading to greater safety.


Recent legislation in North America has opened the cultivation and sale of cannabis products for medical and recreational use. With the availability of cannabis as a commercial product comes the need for analysis and regulation of cannabinoid concentration, pesticides, microorganisms, and heavy metals. Cannabis and hemp (both Cannabis sativa) are exceptional bioaccumulators of toxic heavy metals. 1–3 Due to cannabis’ propensity to take up toxic elements, there is concern that cannabis products may contain elevated levels of them. 4

The metals of current regulatory concern in cannabis products are cadmium, arsenic, lead, and mercury. These fall into the FDA’s class 1 category, which defines them as “human toxicants that have limited or no use in the manufacture of pharmaceuticals”. 5 These metals can be toxic at part per billion (ppb) levels due to the body’s inability to remove them after each exposure, resulting in accumulation. 6 In April 2021, a dispensary chain in Colorado recalled multiple harvest batches of cannabis because they contained the heavy metal cadmium. 7 Cadmium is a known carcinogen and can cause degenerative bone disease, kidney failure, and gastrointestinal and lung disease. Despite the risks, Colorado did not start testing cannabis for heavy metals until 2020 7 and there are still states that have legalized cannabis but do not yet have mandates for heavy metal testing. 8 Recalls due to contamination highlight the need for improved quality control standards and many scientists and healthcare workers are calling for increased safety regulations as the industry grows. 9 

There is limited information on the distribution of cannabis crops that fail heavy metal testing across North America, and it is difficult to predict the likelihood of individual cannabis products being contaminated with heavy metals. Arable land contaminated with heavy metals from anthropogenic activities is a global problem; (1, 10–12) however, soil testing alone cannot reliably predict whether cannabis will contain heavy metals. Heavy metal uptake by plants is mediated by a variety of conditions, not just by whether heavy metals are present. 10 Hydroponic systems are also not immune to contamination, as any part of the growth media can be a source of contamination. Air pollutants, fertilizers, pesticides, fungicides, herbicides, contaminated soil and water, cross-contamination during drying and processing, and post-processing additives are all possible sources of heavy metals. 6,10,13,14 Furthermore, cannabis containing heavy metals can go on to contaminate the processing equipment, which in turn can contaminate subsequent batches. 15

Hyperaccumulation of heavy metals in cannabis tissues has no correlation with detectable morphological changes, 14 so analytical testing is the only dependable method to determine whether a plant is safe for processing, and the FDA recommends testing across all potential sources of contamination. 5 However, mandates usually only require heavy metals tests of final products before they are delivered to retailers. 16 As a result, contaminated cannabis may have had the opportunity to pollute processing equipment before it is tested.

To avoid this, some cultivators/producers opt for extra-regulatory heavy metals testing before processing. Many also seek out testing when it is not mandated at any stage, despite the high cost. Prices are usually not advertised by testing labs and instead are generally negotiated on a case-by-case basis. 16 In Maine, we found that tests for potency, residual solvents, pesticides, heavy metals, fungus, and bacteria can cost $550, with a four-week wait time for results. 17

Although some producers are willing to pay extra for testing, wait times, testing costs, and limited capacity stand in the way of increased heavy metal testing in the cannabis industry. To overcome this, we evaluated methods for improving testing capacity in resource-constrained settings. May et al. and Cleary et al. showed that pooling viral test samples allowed for high-throughput testing in resource-constrained settings. 18,19 We propose that a pooled testing approach will similarly increase testing capacity and reduce the wait times and costs associated with heavy metals tests.20


Limits of permissible daily exposure (PDE) for heavy metals in cannabis flower are loosely defined and will vary per jurisdiction. However, all tests involve a certain threshold for each metal that, once exceeded, will result in a failed product. Therefore, a heavy metals test is quantitative in nature. 

May et al. developed a method for quantitative pooled testing to monitor HIV RNA levels in resource-constrained settings. They developed synthetic datasets to represent values of viral loads for an HIV-positive population on antiretroviral therapy (ART) and three algorithms were evaluated. 18 The first algorithm was a two-stage mini-pool approach where a fixed number of samples were combined in a pool. All samples were individually tested if the pool tested above the threshold level for ART failure.  The second algorithm also used a mini-pool approach; however, if a pool tested above the threshold level, samples were tested in random order, with each result deconvoluted (subtracted) from the total pool test result, until the pool value was below the threshold level. Finally, the third approach involved a multidimensional matrix (N × N), with pools formed over N rows and N columns. A sample that intersected the row and column with the highest test result was retested, and that test result was deconvoluted from the results of the row and column until the corresponding row and column were below the cutoff. Depending on the method, rate of failure, and pool sample size, May et al. observed a 26% to 73% reduction in tests required, compared to an individual testing method. We propose that pooled testing can similarly increase testing efficiency for the cannabis industry. 
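As a minimal sketch of the first (two-stage mini-pool) algorithm described above, the following Python function counts the tests needed for one pool. It assumes equal-volume pooling, so that the pool reads the mean of its samples; the function name, the example values, and the threshold of 1.0 are hypothetical and are not taken from May et al. or from this study's data.

```python
import numpy as np

def two_stage_pool_tests(samples, fail_value):
    """Count tests for one pool under the two-stage mini-pool scheme.

    Assumption: the pool reading is the mean of the pooled samples
    (equal-volume pooling, no measurement error).
    """
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    pool_reading = samples.mean()
    # A single sample at or above fail_value forces the undiluted total
    # (pool_reading * n) to be at least fail_value.
    if pool_reading * n >= fail_value:
        return 1 + n  # one pool test plus every sample individually
    return 1          # the pool test alone clears all samples

clean_pool = [0.05, 0.10, 0.02, 0.08]   # hypothetical values
dirty_pool = [0.05, 1.40, 0.02, 0.08]   # hypothetical values
clean_tests = two_stage_pool_tests(clean_pool, fail_value=1.0)
dirty_tests = two_stage_pool_tests(dirty_pool, fail_value=1.0)
```

In this scheme a flagged pool costs one test more than testing every sample individually, which is why the approach only pays off when most pools are clean.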

Inspired by May et al.’s approach, we used a mathematical model that mimics the average rates of samples that test positive for heavy metals, and that considers the sensitivity of an ICP-MS (Table 1), to determine the optimal number of samples in a pool.

Deconvoluted pooled testing approaches are sensitive to sample distribution. With this in mind, we accessed datasets of heavy metal tests from two third-party testing laboratories, one in Washington State and the other in California. The data used to test our algorithm were collected from September 2020 to January 2022 and from January 2020 to April 2022 for the Washington and California labs, respectively (Table 1). While data for extracts, oils, and products were available, we restricted the analysis to plant material, as this is most relevant to the extra-regulatory testing regime. Our pooled testing simulation requires known values for all samples in a pool. Values below the detection limits were therefore estimated with a uniform distribution between 0 and the detection limit for the specific analyte and laboratory method. We then used bootstrap sampling to create synthetic samples for the pooled testing simulations (Figure 1). Using real heavy metal test results from certified laboratories in US states with legal cannabis sales allows us to ensure that any efficiency gains found are applicable in the real world and are not an artifact of a designed synthetic dataset.
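The treatment of values below the detection limit can be sketched as follows. This is an illustration only: it assumes non-detects are stored as NaN, and the function name, the example results, and the detection limit of 0.02 are hypothetical rather than values from either laboratory.

```python
import numpy as np

def impute_below_detection_limit(values, detection_limit, rng):
    """Replace non-detects (stored as NaN here, an assumption) with
    draws from a uniform distribution on [0, detection_limit)."""
    values = np.asarray(values, dtype=float).copy()
    non_detect = np.isnan(values)
    values[non_detect] = rng.uniform(0.0, detection_limit, size=non_detect.sum())
    return values

rng = np.random.default_rng(0)
raw = np.array([0.12, np.nan, 0.40, np.nan, 0.05])  # hypothetical results
filled = impute_below_detection_limit(raw, detection_limit=0.02, rng=rng)
```

Measured values are left unchanged; only the non-detects receive a simulated value, so every sample in a simulated pool has a known concentration.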

Bootstrap Sampling

Bootstrapping is a resampling method that uses random sampling with replacement.21 Using bootstrap sampling, we created synthetic data with distributions similar to the experimental data from the two analytical laboratories. We used a uniform-probability, with-replacement approach. For a sample of size N, a value in the original data is selected at random and assigned to the first position of the new sample. Crucially, that value is not removed from the original data and can be selected again. The process is repeated until N values are drawn. Bootstrap sampling has the advantage of generating a similar sample distribution without having to describe the distribution mathematically. Its drawback is that the bootstrap sample distribution is not continuous: only values that were in the original sample appear in the simulated samples.
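A minimal sketch of this resampling step with numpy is shown below. The function name and the observed values are hypothetical; the sketch is meant to illustrate uniform sampling with replacement, not to reproduce the code used for the paper.

```python
import numpy as np

def bootstrap_sample(data, size, rng):
    """Draw `size` values from `data` uniformly at random, with replacement."""
    data = np.asarray(data, dtype=float)
    # Each index is drawn independently, so a value can be picked repeatedly.
    indices = rng.integers(0, len(data), size=size)
    return data[indices]

rng = np.random.default_rng(42)
observed = np.array([0.01, 0.03, 0.02, 0.15, 0.04, 0.60])  # hypothetical data
synthetic = bootstrap_sample(observed, size=2000, rng=rng)
```

Every value in `synthetic` is one of the observed values, which is the non-continuity noted above.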

The algorithms were developed de novo and implemented in Python using the numpy, pandas, and scipy libraries. The seaborn library was used for plotting. The code used for this paper is available at: (

The first method tested was based on Dorfman’s minipool approach and uses May’s refinement of the minipool algorithm; we call it deconvoluted single pool. 18,22 The second method is an implementation of May’s matrix approach. 18,22

For testing, assume that each sample contains a value V between 0 and ∞ of the analyte of interest, with a fail value of K. If V ≥ K, the sample is considered contaminated.

In deconvoluted single pool testing, N samples are combined into a pool and tested. If the value of that test (Vt) is greater than or equal to the fail value divided by the pool size (Vt ≥ K/N), individual samples that contributed to the pool are selected at random and individually tested. Because the pool test value is known, the contribution of each individually tested sample (Vi) can be subtracted (deconvoluted) from that value, and the residual test value can be assessed against the fail criterion to see whether any of the remaining samples require testing: (Vt – Vi/N) ≥ K/(N-1). Once the deconvoluted pool test value falls below the adjusted fail value, no further samples from that pool need to be tested, as the individually failing samples will have been identified.
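The procedure can be sketched in Python as below. This is an illustrative sketch, not the implementation used for the paper. It assumes equal-volume pooling with no measurement error, so the pool reads the mean of its samples. The stopping rule is also an assumption of the sketch: it compares the analyte not yet attributed to a tested sample, in undiluted units, against K, which corresponds to a residual pool reading of K/N. The adjusted fail value given in the text could be substituted on that one line. All names and example values are hypothetical.

```python
import numpy as np

def deconvoluted_pool_tests(samples, fail_value, rng):
    """Return (tests run, indices found failing) for one pool of samples."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    pool_reading = samples.mean()       # assumed result of the pooled test
    tests = 1
    failures = []
    residual_total = pool_reading * n   # analyte not yet attributed to a tested sample
    for i in rng.permutation(n):        # samples are retested in random order
        if residual_total < fail_value:
            break                       # no untested sample can reach fail_value
        tests += 1                      # individual test of sample i
        if samples[i] >= fail_value:
            failures.append(int(i))
        residual_total -= samples[i]    # deconvolute the measured contribution
    return tests, sorted(failures)

rng = np.random.default_rng(1)
clean_tests, clean_failures = deconvoluted_pool_tests([0.1, 0.2, 0.1, 0.1], 1.0, rng)
dirty_tests, dirty_failures = deconvoluted_pool_tests([0.1, 0.2, 1.5, 0.1], 1.0, rng)
```

Because the retest order is random, the number of tests for a contaminated pool depends on how early the failing sample happens to be drawn, which is why the simulations average over many pools.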

In the Matrix approach, the deconvoluted pooled approach is applied to an N × N matrix of samples and each row and column is tested as a pool (Figure 2). Instead of randomly selecting samples from failed pools for testing, only samples that are at the intersection of failed row and column pools are tested. The algorithm implemented selects the highest pool test value as the primary axis and the highest failing test pool in the secondary axis as the intersection. Again, the deconvolution process is applied to the two appropriate pools and the matrix is reassessed until no intersections of failing pools occur.
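A compact sketch of the matrix procedure follows, under stated assumptions: equal-volume pooling without measurement error, pool values tracked as undiluted totals (N times the pool reading), and deconvoluted totals recomputed from the untested cells rather than by repeated subtraction. The selection rule is our reading of the description above, and the function name and the 5 × 5 example values are hypothetical.

```python
import numpy as np

def matrix_pool_tests(matrix, fail_value):
    """Return (tests run, failing cells) for one N x N matrix of samples."""
    values = np.asarray(matrix, dtype=float)
    n = values.shape[0]
    tested = np.zeros(values.shape, dtype=bool)
    tests = 2 * n                        # one pooled test per row and per column
    failures = []
    while True:
        # Deconvoluted pool totals: contributions of tested cells removed.
        remaining = np.where(tested, 0.0, values)
        row_total = remaining.sum(axis=1)
        col_total = remaining.sum(axis=0)
        failing = (row_total >= fail_value)[:, None] & (col_total >= fail_value)[None, :]
        candidates = np.argwhere(failing & ~tested)
        if len(candidates) == 0:
            break                        # no intersection of failing pools is left
        # Primary axis: highest pool total; secondary axis: highest failing pool.
        r, c = max(
            candidates,
            key=lambda rc: (
                max(row_total[rc[0]], col_total[rc[1]]),
                min(row_total[rc[0]], col_total[rc[1]]),
            ),
        )
        tested[r, c] = True
        tests += 1                       # individual test at the chosen intersection
        if values[r, c] >= fail_value:
            failures.append((int(r), int(c)))
    return tests, sorted(failures)

grid = np.full((5, 5), 0.05)             # hypothetical background level
grid[2, 0] = 1.5                         # hypothetical failing sample
grid[1, 3] = 1.2                         # hypothetical failing sample
tests_run, failing_cells = matrix_pool_tests(grid, fail_value=1.0)
fraction_saved = 1 - tests_run / grid.size
```

An untested failing sample always keeps both its row and its column at or above the fail value, so the loop cannot stop while such a sample remains.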

For each simulation, a minimum of 2000 samples was generated, with the count chosen so that it could be evenly divided into pools of size N for the single pool approach, or into matrices of size N × N for the matrix approach. After allocating the samples, the number of tests required was expressed as a fraction of the sample count (the sample count being the number of tests needed to test each sample individually). Each simulation was then repeated 4000 times with new random seed sequences, representing the testing of ~8,000,000 samples for each combination of pool size, analyte, and testing method.


Simulating Pooled Testing Efficiency

Simulation results for the two pooling methods were graphed to visualize the number of tests saved with each approach, compared to an individual testing approach (Figure 3).  

The simulations to identify optimal pool size ignored the effect of the method’s lower limit of quantification (LLoQ). However, the LLoQ dictates the maximum pool size, as the pool fail value must be above the LLoQ. To obtain the maximum pool size, the regulatory limit should be divided by the LLoQ (Table 1). Because of the relationship between the LLoQ and the regulatory limit, the maximum pool size that could be used in the Washington lab would be 4 (limited by arsenic), while the California lab could use a pool size of 9 with the original method and 20 with the new method (limited by mercury in both cases).
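The pool-size ceiling can be computed as in the sketch below. The limits and LLoQ values are hypothetical placeholders chosen for illustration, not the Table 1 values; they are expressed as whole numbers of ppb so that integer division avoids floating-point rounding.

```python
def max_pool_size(regulatory_limit_ppb, lloq_ppb):
    """Largest N for which the pool fail value (limit / N) is not below the LLoQ."""
    return regulatory_limit_ppb // lloq_ppb

# Hypothetical (regulatory limit, LLoQ) pairs in ppb, for illustration only.
hypothetical_limits = {
    "arsenic": (200, 50),
    "cadmium": (200, 10),
    "lead": (500, 10),
    "mercury": (100, 5),
}
per_metal = {
    metal: max_pool_size(limit, lloq)
    for metal, (limit, lloq) in hypothetical_limits.items()
}
limiting_metal = min(per_metal, key=per_metal.get)
lab_max_pool_size = per_metal[limiting_metal]
```

Because a pooled sample is tested for every metal at once, the usable pool size for a lab is set by the metal with the smallest ratio.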


Our simulation results highlight the interaction between sample distribution and cut-off. The optimal pool size and approach will depend on the typical samples seen by a lab and the relationship of those samples to the regulatory limit and LLoQ. Based on the arsenic simulations (the least efficient metal in single pools), the peak efficiency for the Washington lab would be obtained using the single pool method with a pool size of three, with 23.8 tests saved (or approximately 76 tests required) for 100 samples. The peak efficiency for the California lab would be obtained using the single pool method with a pool size of four, with 51 tests required for 100 samples. Both of these ideal pool sizes are within the LLoQ restriction for these labs. Potential savings for other labs will vary depending on governing regulations and the concentration distribution of contaminated samples. However, even with relatively small pool sizes, we predict the increased testing capacity from pooling could open the door for more mandated and self-elected tests, and in turn, increase safety in the industry.

We modeled our simulations using the limits for ICP-MS because the expected concentration of total metals is low and standard instrumental methods of metals analysis (e.g., FAAS) lack the sensitivity required for trace metals analysis.23 Although purchasing and running this equipment is a significant investment, we estimate the savings from a pooling strategy would make investing in such equipment worthwhile.

We anticipate the greatest challenges with this method will be reaching the number of samples in a pool required for optimal efficiency, without diluting individual samples below the sensitivity of the instrument. To prevent over-dilution, reconcentrating pooled samples or foregoing dilution steps may be necessary prior to injection. Assuming the pooled samples are effectively homogenized, we hypothesize ICP-MS will be sufficiently sensitive to detect the trace quantity of metal atoms present. Nevertheless, procedure optimization is required before this method can be widely adopted in the industry.

Measurement at the part-per-trillion scale requires extremely sensitive instrumentation, which is susceptible to cross-contamination. Cross-contamination is therefore a concern for this analysis, given the low detection limits available with ICP-MS. Those who wish to validate this method will need to be vigilant in taking preventative measures to minimize contamination throughout the laboratory. Appropriate lab separations and designated instruments and equipment should be used to minimize cross-contamination.

As acknowledged by Cleary et al. 19, pooling samples adds a level of complexity in the lab because samples would need to be tracked across multiple pools. To overcome this, testing labs might employ effective tracking software or other methods.


Our results demonstrate that pooled testing has the potential to increase testing efficiency in labs and reduce testing costs. This may open the door for more regulations to be implemented as safeguards for the industry and make non-mandated tests more affordable. For the Washington and California state testing labs, reductions in tests of 23.8% and 54.1% were achieved with pool sizes of three and four, respectively. However, limits for heavy metals will vary per region, as will the rate of contamination. We anticipate that some of the challenges presented by pooled testing will be the added level of complexity in terms of sample tracking, and cross-contamination of heavy metals from the surrounding environment. These challenges should be addressed by those who wish to validate this technique. Overall, we believe pooled testing will add to the efforts to decrease the risks associated with cannabis products on the market.


Thank you to the Washington and California state testing laboratories for supplying their data for this study.


The research was conducted without any funding sources.

Conflict of Interest

All authors declare no conflict of interest.


1. Angelova, V., Ivanova, R., Delibaltova, V. & Ivanov, K. Bio-accumulation and distribution of heavy metals in fibre crops (flax, cotton and hemp). Ind Crops Prod 19, 197–205 (2004).

2. Linger, P., Müssig, J., Fischer, H. & Kobert, J. Industrial hemp (Cannabis sativa L.) growing on heavy metal contaminated soil: fibre quality and phytoremediation potential. Ind Crops Prod 16, 33–42 (2002).

3. Ahmad, R. et al. Phytoremediation Potential of Hemp (Cannabis sativa L.): Identification and Characterization of Heavy Metals Responsive Genes. Clean (Weinh) 44, 195–201 (2016).

4. Mead, A. The legal status of cannabis (marijuana) and cannabidiol (CBD) under U.S. law. Epilepsy & Behavior 70, 288–291 (2017).

5. FDA CDER CBER. Q3D (R1) Elemental Impurities Guidance for Industry. U.S. Department of Health and Human Services, Food and Drug Administration (2020).

6. Wieliczko, M. Heavy Metal Testing to Improve Cannabis Safety & Quality. Analytical Cannabis (2019).

7. Mitchell, T. Heavy Metals Contaminant Sparks Latest Marijuana Recall. WestWord (2021).

8. Leafly Staff. Cannabis testing regulations: A state-by-state guide. Leafly (2020).

9. Sarma, N. D. et al. Cannabis Inflorescence for Medical Purposes: USP Considerations for Quality Attributes. J Nat Prod 83, 1334–1351 (2020).

10. Blume, H.-P. & Brümmer, G. Prediction of heavy metal behavior in soil by means of simple field tests. Ecotoxicol Environ Saf 22, 164–174 (1991).

11. Qingjie, G., Jun, D., Yunchuan, X., Qingfei, W. & Liqiang, Y. Calculating Pollution Indices by Heavy Metals in Ecological Geochemistry Assessment and a Case Study in Parks of Beijing. Journal of China University of Geosciences 19, 230–241 (2008).

12. Hu, Y. et al. Assessing heavy metal pollution in the surface soils of a region that had undergone three decades of intense industrialization and urbanization. Environmental Science and Pollution Research 20, 6150–6159 (2013).

13. Mallampati, S. R., McDaniel, C. & Wise, A. R. Strategies for Nonpolar Aerosol Collection and Heavy Metals Analysis of Inhaled Cannabis Products. ACS Omega 6, 17126–17135 (2021).

14. Bengyella, L., Kuddus, M., Mukherjee, P., Fonmboh, D. J. & Kaminski, J. E. Global impact of trace non-essential heavy metal contaminants in industrial cannabis bioeconomy. Toxin Rev 1–11 (2021) doi:10.1080/15569543.2021.1992444.

15. Dryburgh, L. M. et al. Cannabis contaminants: sources, distribution, human toxicity and pharmacologic effects. Br J Clin Pharmacol 84, 2468–2476 (2018).

16. Valdes-Donoso, P., Sumner, D. A. & Goldstein, R. Costs of cannabis testing compliance: Assessing mandatory testing in the California cannabis market. PLoS One 15, e0232041 (2020).

17. Overton, P. Lack of mandated testing could expose cannabis users to toxins. Pressherald (2018).

18. May, S., Gamst, A., Haubrich, R., Benson, C. & Smith, D. M. Pooled Nucleic Acid Testing to Identify Antiretroviral Treatment Failure During HIV Infection. JAIDS Journal of Acquired Immune Deficiency Syndromes 53, 194–201 (2010).

19. Cleary, B. et al. Using viral load and epidemic dynamics to optimize pooled testing in resource-constrained settings. Sci Transl Med 13, (2021).

20. Dorfman, R. The Detection of Defective Members of Large Populations. The Annals of Mathematical Statistics 14, 436–440 (1943).

21. Efron, B. Second Thoughts on the Bootstrap. Statistical Science 18, (2003).

22. Dorfman, R. The Detection of Defective Members of Large Populations. The Annals of Mathematical Statistics 14, 436–440 (1943).

23. Lewen, N., Mathew, S., Schenkenberger, M. & Raglione, T. A rapid ICP-MS screen for heavy metals in pharmaceutical compounds. J Pharm Biomed Anal 35, (2004).

Figure Captions:

Figure 1. Comparison of real California cadmium data to bootstrap samples generated from that dataset, n = 1884. Box represents 25, 50, 75% quantiles, whiskers are 1.5 times the interquartile range. Dots represent data points that lie outside the range of the whiskers.  


Figure 2. A visual representation of the N×N matrix pooled testing approach. 1: the initial results of testing each row and column in the pool matrix; yellow indicates a member of a pool that contains at least one failed result, and red indicates the intersection of two failing pools, which is thus likely to contain the failure. The first sample to be individually tested belongs to the highest-testing pools in the rows and columns. 2: after the first test and deconvoluting that value from the rest of the pool, the first column and third row are cleared. 3: after retesting the second intersection, the remaining pools are cleared after deconvolution. 4: the underlying sample matrix used to generate the pooled test and deconvoluted pool test scores. In this example each pool in each column and row was tested (10 runs) and two individual samples were tested, for a total of 12 runs/tests. Compared to individually testing every sample (25 tests), this represents a saving of 52%.

Figure 3. Number of tests saved for single pool and matrix pooled testing at a range of pool sizes.  Blue dots for single pool using the data from Washington state. Yellow X for matrix pools using the data from Washington state. Green square for single pool using the data from California. Red cross for matrix pools using the data from California. 

The original article can be found at: Journal of AOAC International
