An introduction to the basic concepts of instrument calibration and a look at how instruments are calibrated, how to measure their quality, and more.
This column introduces the basic concepts of instrument calibration. We then examine how
instruments are calibrated and how to measure their quality, and take a look at the calibration myths in our
industry. The column concludes by dispelling these myths.
Analytical chemistry can be defined as the science of identifying and quantifying the chemical species in matter (this is my definition, happy to entertain your thoughts on this). Calibrate (verb) means to determine, check, or rectify . . . any instrument giving quantitative measurements (1). Many chemical analysis methods are geared towards measurement of concentrations of atoms or molecules in samples, and of course these days this is done with chemical analyzers or instruments (2–5). These instruments do not magically produce correct answers out of the box. Any instrument that measures anything needs to be calibrated, that is, set up to ensure that it is giving the correct readings. To do this one must have standards, which are samples containing known values of the quantity that the instrument to be calibrated will measure. At the end of the day, in science the most fundamental standards are those that define basic units like mass, length, and time. In the U.S., it is a federal agency called the National Institute of Standards and Technology (NIST) that “sets the standard” for standards (6). Amongst its other duties, NIST develops standard reference materials (SRMs) for specific industries. For example, it has developed an SRM for the green tea industry to improve the accuracy and precision of its analysis (7). In a recent conversation with NIST, I discovered they are working on a hemp SRM, which should greatly help with many of the problems in hemp analysis (8).
Once an agreed upon standard exists, it is used to calibrate an instrument. A common example is calibrating a scale that measures mass in grams. This is done by placing standard weights on the scale, noting the value given, and adjusting the scale so the measurement is correct. The standard weights are often what is called “NIST traceable,” which means they have been developed in accordance with NIST’s standards. Another example of a calibration is the use of Beer’s Law, which relates the amount of light absorbed by a sample to concentration (9). For example, when a high performance liquid chromatograph (HPLC) with an ultraviolet-visible (UV-Vis) detector is calibrated, solutions of known concentration of the analyte, such as tetrahydrocannabinol (THC), are injected into the system. The amount of light absorbed is measured and plotted versus the known concentration in the standards, generating what is called a calibration line. An example Beer’s Law calibration line is seen in Figure 1.
Once a calibration line is generated, the concentration in unknown samples can be determined by measuring the absorbance and using the calibration line equation to predict concentration in the unknown.
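The calibration-line procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard concentrations and absorbances are made-up numbers, the function names `fit_line` and `predict_concentration` are hypothetical, and the fit is an ordinary least-squares straight line, which is one common way to build a Beer's Law calibration but not necessarily what any given instrument's software does.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_concentration(absorbance, slope, intercept):
    """Invert the calibration line A = slope * c + intercept to solve for c."""
    return (absorbance - intercept) / slope


# Hypothetical standards: known concentration (mg/mL) and measured absorbance.
standard_conc = [0.0, 1.0, 2.0, 3.0, 4.0]
standard_abs = [0.01, 0.21, 0.39, 0.61, 0.80]

slope, intercept = fit_line(standard_conc, standard_abs)

# Hypothetical absorbance measured for an unknown sample.
unknown_conc = predict_concentration(0.30, slope, intercept)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, unknown={unknown_conc:.2f} mg/mL")
```

The standards are plotted with concentration on the x-axis and absorbance on the y-axis, so predicting an unknown means inverting the fitted line rather than evaluating it.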
The scale and HPLC examples above are what are called primary calibrations: the instruments were calibrated directly using known standards. In a recent article, I discussed the use of infrared (IR) spectroscopy for cannabis potency analysis (9). One of the advantages of this technique is that it can look at cannabis and other samples directly with little or no sample preparation. That is, one does not have to break down the sample into pure components to measure concentrations (9). Despite what some people claim, IR spectroscopy is quantitative, and in combination with Beer’s Law IR spectrometers can be used to determine concentrations of chemical species in samples (for more details I refer you to my book on the subject [10]). IR spectrometers are capable of measuring absorbances with high accuracy and precision (11); however, the peak size and area cannot be used by themselves to determine concentration directly. Instead, the spectrometer must be calibrated with samples of known analyte concentration. For cannabis potency measurements, IR spectrometers are calibrated by running a set of cannabis samples on a spectrometer, measuring the absorbances, and then running the same samples by HPLC at a third party, state licensed, ISO certified laboratory (12). One then constructs a correlation plot of the concentrations as determined by HPLC versus those determined by IR spectroscopy to determine the quality of the IR spectrometer’s calibration. An example of this for the determination of total THC in hemp by mid-IR spectroscopy is seen in Figure 2 (12). Since IR spectrometers must be calibrated using data from primary methods such as HPLC, it is correct to say that IR spectroscopy is a secondary method.
Which brings me to the first calibration myth I need to dispel. I have heard people in this industry say, “Primary methods are always better than secondary methods.” This statement is unscientific on its face. What criteria are being used to judge the methods? Based on whose peer-reviewed scientific work? Remember from the last column the Golden Triangle of chemical analysis, where the criteria used to judge a method are accuracy, speed, and cost (9). This means the “goodness” of a method must be determined within the context of its application. As I pointed out in the last column, chromatographic measurements are accurate but are slow and expensive, and spectroscopic measurements are typically not as accurate but are faster and cheaper. Within the context of cannabis potency testing, I argued that chromatography is great for laboratory testing, and that IR spectroscopy is better suited for in-field testing (9). There are no absolutes about one technique being “better” than another; it’s all about how fit a method is for its anticipated purpose. Thus, primary methods are not necessarily better than secondary methods, and secondary methods are not necessarily better than primary methods. They each must be judged on their own merits.
Figures of merit for a calibration include accuracy and precision, which we have discussed previously (13). Additionally, when plotting lines as in Figures 1 and 2, the agreement between the x-axis and y-axis data is measured using the square of the correlation coefficient, R² (10). If the two sets of data are in perfect agreement, that is, if the two techniques are in perfect agreement, an R² of 1.0 is achieved. This is of course an impossibility because of the existence of error in all measurements (13). If there is absolutely no correlation between the data, an R² of 0 is obtained. Of course, the closer the R² is to 1, the better. If you look closely at Figure 1, the R² value is labeled on the plot as 0.99, whereas in Figure 2 it is labeled as 0.95. This means that the data sets in Figure 1 agree better with each other than those in Figure 2, but this does not necessarily mean one calibration is better than the other. Again remember . . . speed, accuracy, and cost, not just accuracy, are the criteria that should be used when evaluating calibration quality.
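For readers who want to see how such an agreement metric is computed, here is a minimal sketch. It assumes the metric is the squared Pearson correlation between the two data sets, which for a straight-line fit is the same as the R² reported by most plotting software. The function name `r_squared` and the paired HPLC and IR values are hypothetical and are not the data behind Figures 1 or 2.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length data sets."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical total THC values (wt %) for the same samples by two methods.
hplc_values = [0.10, 0.25, 0.40, 0.55, 0.70]
ir_values = [0.12, 0.22, 0.43, 0.52, 0.71]

print(f"R^2 = {r_squared(hplc_values, ir_values):.3f}")
```

A value near 1 says the two sets of numbers track each other closely. It says nothing about speed or cost, which is why R² alone should not decide which calibration is fit for purpose.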
Recall that accuracy is a measure of how far off a measurement is from its true value (13). The true values are typically provided by NIST standards and NIST standard reference materials. For example, for the analysis of green tea there exists a standard sample of known composition that can be used to check an instrument’s calibration and can be used as a comparison for round-robin studies across different laboratories (7). Thus, the determination of accuracy in green tea analysis is possible because there exists a standard reference material of known composition; that is, there exists a true value.
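As a small illustration of what checking accuracy against a reference material looks like numerically, the sketch below computes the signed error and the percent error of a measurement relative to a certified value. The function names and both numbers are hypothetical; they are not taken from any actual SRM certificate.

```python
def bias(measured, true_value):
    """Signed error: positive means the instrument reads high."""
    return measured - true_value


def percent_error(measured, true_value):
    """Absolute error expressed as a percentage of the true value."""
    return 100.0 * abs(measured - true_value) / true_value


# Hypothetical certified value of a reference material and an instrument reading.
certified_value = 50.0  # mg/g
measured_value = 48.5   # mg/g

print(f"bias = {bias(measured_value, certified_value):+.2f} mg/g")
print(f"percent error = {percent_error(measured_value, certified_value):.1f}%")
```

Note that neither function can be evaluated without `certified_value`, which is the whole point: no true value, no accuracy figure.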
The opposite is true with cannabis potency analysis. There are no standard reference cannabis plant materials or oils. This means there are no true values, and no way to check the calibration of chromatographs and spectrometers, or to legitimately compare potency results across different laboratories. This in part explains the inter-laboratory variation problem in cannabis analysis; that is, different laboratories obtaining significantly different results on the same samples (14–16). Given that there are no “true” values in cannabis analysis, there is no way to ascertain the accuracy of any potency analyses regardless of instrumental technique. That is not to say that precision cannot be determined. Recall that precision is how much spread there is in the determination of a value for a given sample or set of samples (13). For example, precision can be determined by running the same sample three times and looking at the variation in the results. Therefore, my recommendation to cannabis businesses is to send multiple portions of the same sample to multiple laboratories, determine which one has the best precision, and then use that laboratory exclusively going forward. This way you can trust changes in values are real, and that comparisons of data over time are legitimate. This problem is solvable by development of NIST traceable cannabis reference materials (8). However, until that happens, potency accuracy in cannabis analysis is a myth.
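The laboratory comparison recommended above can be sketched as follows. The assumptions are mine: precision is summarized as the percent relative standard deviation (RSD) of replicate results, which is one common choice among several, and the laboratory names and potency values are invented for illustration.

```python
from statistics import mean, stdev


def relative_std_dev(values):
    """Percent relative standard deviation (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical replicate total THC results (wt %) on portions of the same sample.
lab_results = {
    "Lab A": [18.2, 19.5, 17.1],
    "Lab B": [20.1, 20.3, 19.9],
    "Lab C": [16.8, 18.9, 17.6],
}

rsd_by_lab = {lab: relative_std_dev(vals) for lab, vals in lab_results.items()}
most_precise_lab = min(rsd_by_lab, key=rsd_by_lab.get)

for lab, rsd in rsd_by_lab.items():
    print(f"{lab}: RSD = {rsd:.2f}%")
print(f"Most precise: {most_precise_lab}")
```

The laboratory with the smallest RSD is the most repeatable. That does not make its numbers the most accurate, since without a reference material there is no true value to compare against.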
Since state regulations require compliance testing be performed by third party laboratories, some people in this industry believe that they must do all their testing at third party laboratories. I guess this stems in part from the idea that the methods used by third party laboratories are the only ones allowed, and that an in-house tester will never be as accurate as a third party laboratory.
This is simply not true. In other regulated industries, such as the pharmaceutical industry, in-house testing is commonly done, and it is both required and blessed by the U.S. Food and Drug Administration (FDA). This is part of my “Cannabis is Medicine . . . Test it Like Medicine” mantra I have been trying to instill in the industry (17). When running a cannabis business, relying solely on third party laboratory testing is impractical because it is slow and expensive; you can’t run an extraction process or optimize a grow if it takes two weeks to get results back. Last column I compared and contrasted laboratory versus in-house testing (12). Remember it’s not just about accuracy, but speed, accuracy, and cost. If your in-house testing method is accurate enough, fast enough, and affordable enough for you to make business decisions, it is fit for its purpose.
The problem faced by in-house cannabis analyzer makers (18) is what reference values to use for calibration given the problem of inter-laboratory variation (14–16). Similar to my advice for cannabis businesses, in this case instrument manufacturers need to find a laboratory they trust and use its readings going forward until the inter-laboratory variation problem goes away. Until cannabis SRMs are available, we should as an industry decide upon a “golden” HPLC potency method and standardize around this. I believe the method of Giese, Lewis, and Smith should be used (19).
The science of instrument calibration was introduced, and the importance of standards and standard reference materials was emphasized. When evaluating a calibration for use, accuracy, speed, and cost should all be considered. The correlation coefficient as a measure of calibration quality was introduced. The idea that primary calibration methods are always better than secondary methods was dispelled. Given the current lack of SRMs in this industry, the myth of accuracy in cannabis potency measurements was dispelled. Finally, the importance of in-house testing for the cannabis industry was established.
Brian C. Smith, PhD, is Founder, CEO, and Chief Technical Officer of Big Sur Scientific in Capitola, California. Dr. Smith has more than 40 years of experience as an industrial analytical chemist, having worked for such companies as Xerox, IBM, Waters Associates, and Princeton Instruments. For 20 years he ran Spectros Associates, an analytical chemistry training and consulting firm where he helped clients improve their chemical analyses. Dr. Smith has written three books on infrared spectroscopy, and earned a PhD in physical chemistry from Dartmouth College.
B.C. Smith, Cannabis Science and Technology 3(1), 10-24 (2020).