Analytic Rarefaction help
Overview
Rarefaction is used to estimate diversity for a smaller sample size than was measured. It is commonly used to compare diversity in samples of different size by reducing (rarefying) all samples to the same number of individuals. Rarefaction also places a confidence interval on its estimate of diversity.
Analytic Rarefaction is free. If you use it for a published paper or talk, please list Analytic Rarefaction in your references or mention it in the acknowledgements section. In either case, indicate that the program is available at this website or through the Mac App Store. Journals differ in citation format, but this is a sample citation with the primary information most journals will require:
Holland, S.M., 2021. Analytic Rarefaction, version 2.0. www.huntmountainsoftware.com
How rarefaction works
Analytic Rarefaction uses the rarefaction equations for the estimated number of species (E), given by Hurlbert (1971), and for the variance in E (Var), given by Heck et al. (1975). These are the same equations used by Raup (1975) and Tipper (1979). In particular, Analytic Rarefaction uses the formulation of Tipper (1979), as his equations (1) and (2) are easy to program and avoid the overflow errors associated with the large combinatorials.
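The two quantities can be sketched in Python as follows. This is an illustration only and is not the formulation of Tipper (1979) that the program uses: it sidesteps overflow by working with logarithms of the binomial coefficients (via the log-gamma function) rather than with his equations. The function names log_choose and rarefy are hypothetical. E is computed as the sum over species of the probability that each species appears in a subsample of n individuals drawn without replacement, and Var as the variance of that same sum.

```python
from math import exp, inf, lgamma


def log_choose(a, b):
    """Natural log of the binomial coefficient C(a, b); -inf when C(a, b) is zero."""
    if b < 0 or b > a:
        return -inf
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)


def rarefy(abundances, n):
    """Return (E, Var) for a subsample of n individuals (illustrative sketch)."""
    total = sum(abundances)
    if not 1 <= n <= total:
        raise ValueError("n must be between 1 and the total number of individuals")
    log_total = log_choose(total, n)

    # absent[i]: probability that species i is missing from the subsample
    absent = [exp(log_choose(total - a, n) - log_total) for a in abundances]

    # Expected number of species: sum of the probabilities of being present
    expected = sum(1.0 - p for p in absent)

    # Variance: single-species terms plus covariance terms for every pair
    var = sum(p * (1.0 - p) for p in absent)
    for i in range(len(abundances)):
        for j in range(i + 1, len(abundances)):
            # probability that species i and species j are both missing
            both = exp(log_choose(total - abundances[i] - abundances[j], n) - log_total)
            var += 2.0 * (both - absent[i] * absent[j])

    # Rounding can leave a tiny negative value where the true variance is zero
    return expected, max(var, 0.0)
```

For example, rarefy([13, 45, 2, 1, 18, 99, 174], 100) returns the estimate and its variance for a subsample of 100 individuals. When n equals the total number of individuals, E equals the observed number of species and Var is zero.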
The results of this program have been cross-checked with analytic solutions supplied by Michael Foote and with resampling calculations by Michael and me. Note that if you test the results of this program by using Table 3 of Raup (1975), the values of Var will differ for low values of n. Dave Raup, Michael, and I have found a coding error in Raup’s original program, which caused his published values of Var to be inflated at low values of n.
See the recommended reading at the bottom for more details about how analytic rarefaction works and how the results should be interpreted.
Preparing your data file
In ecologic sampling, each individual is identified and tallied, resulting in a list of the abundances of each species. For example, species A through G may have the following abundances:
- A 13
- B 45
- C 2
- D 1
- E 18
- F 99
- G 174
To perform a rarefaction on these data, you need to list just the abundances (not the species) in a text file. You can do this in several different ways, depending on which is easiest for you. You could list the abundances, one on a line, separated by carriage returns:
- 13
- 45
- 2
- 1
- 18
- 99
- 174
You could also list all of them on one line, separated by commas:
- 13,45,2,1,18,99,174
or separated by either tabs or spaces:
- 13 45 2 1 18 99 174
Whatever you choose, just be consistent and do not mix tabs, spaces, commas, and returns in one file.
The list of abundances can be in any order; the order of values will not affect the results. All abundance values should be integers (no decimals).
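If you want to check a file before loading it, the rules above can be sketched in Python. This is not the program's own parser; the function name read_abundances is hypothetical, and the strictness (rejecting any file that contains more than one kind of separator) is an assumption that simply follows the advice not to mix separators.

```python
import re


def read_abundances(text):
    """Return the abundances in `text` as a list of integers, or raise ValueError."""
    body = text.strip()
    if not body:
        raise ValueError("file is empty")

    # Record which kinds of separator occur anywhere in the file
    found = [
        name
        for name, pattern in (
            ("comma", r","),
            ("tab", r"\t"),
            ("space", r" "),
            ("return", r"[\r\n]"),
        )
        if re.search(pattern, body)
    ]
    if len(found) > 1:
        raise ValueError("mixed separators: " + ", ".join(found))

    # Split on the separator and require every value to be a whole number
    tokens = [t for t in re.split(r"[,\s]+", body) if t]
    bad = [t for t in tokens if not re.fullmatch(r"[0-9]+", t)]
    if bad:
        raise ValueError("not whole numbers: " + ", ".join(bad))
    return [int(t) for t in tokens]
```

Calling read_abundances("13,45,2,1,18,99,174") returns the seven abundances as integers, while a value such as 13.5 or a file that mixes tabs and spaces raises a ValueError.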
This list should be saved as a plain text file (.txt) or comma-delimited text file (.csv). TextEdit ships on all Macs and is probably the easiest tool to use, although any text editor, such as BBEdit, Atom, or TextMate, works just as well (and these are more capable as full-featured text editors). Analytic Rarefaction cannot read Microsoft Word (.doc or .docx), Excel (.xls or .xlsx), or Rich-Text (.rtf) files.
Rarefaction calculations may be made in any increment of individuals. Specify this number in the main window prior to performing your rarefaction. Small increments make the rarefaction calculations slower, but they also offer more precision. Just be sure that you need that precision if the runs are slow. In most cases, an increment of about 1/100 to 1/1000 of the total number of individuals will be sufficient.
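To see what an increment means in practice, the following Python sketch lists the sample sizes that a given increment would produce and applies the 1/100 to 1/1000 rule of thumb. Both function names are hypothetical, and ending the list at the full sample size is an assumption; the program may handle the final step differently.

```python
def sample_sizes(total, increment):
    """Sample sizes stepping by `increment` up to `total`, ending at `total`."""
    if total < 1 or increment < 1:
        raise ValueError("total and increment must be positive integers")
    sizes = list(range(increment, total + 1, increment))
    # Assumption: finish with the full sample even if the increment overshoots it
    if not sizes or sizes[-1] != total:
        sizes.append(total)
    return sizes


def suggested_increments(total):
    """(smallest, largest) increment from the 1/1000 to 1/100 rule of thumb."""
    return max(1, total // 1000), max(1, total // 100)
```

With the 352 individuals in the example data, suggested_increments(352) gives increments between 1 and 3, and sample_sizes(352, 100) gives 100, 200, 300, and 352.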
Interpreting your results file
Analytic Rarefaction will display the results in the window. Check the first line of the output to make sure the number of individuals and the number of species that were read match what you think they should be.
Seven columns of results are displayed:
- n: rarefied number of individuals
- E: rarefied diversity estimate
- Var: variance of rarefied diversity estimate
- Upper95: upper 95% confidence limit on diversity
- Lower95: lower 95% confidence limit on diversity
- Upper99: upper 99% confidence limit on diversity
- Lower99: lower 99% confidence limit on diversity
The confidence limits are calculated as E plus or minus Z times the square root of variance, where Z is 1.96 for the 95% confidence interval and 2.58 for the 99% confidence interval.
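As a small Python sketch of that calculation (the function name confidence_limits is hypothetical, and the values are returned in the same order as the output columns):

```python
from math import sqrt

Z95 = 1.96  # multiplier for the 95% confidence interval
Z99 = 2.58  # multiplier for the 99% confidence interval


def confidence_limits(expected, variance):
    """Return (Upper95, Lower95, Upper99, Lower99) for one row of results."""
    half95 = Z95 * sqrt(variance)
    half99 = Z99 * sqrt(variance)
    return (expected + half95, expected - half95,
            expected + half99, expected - half99)
```

For example, with E = 5 and Var = 4, the square root of the variance is 2, so the 95% limits are 5 plus or minus 3.92 and the 99% limits are 5 plus or minus 5.16.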
To save the results, go to the File menu, choose Save, and give the file a name of your choosing. The results will be saved as a .txt file, which can be opened by many programs, including Word, Excel, R, text editors, and most plotting programs. For a program that can perform a rarefaction and plot the data (and much more), check out Taxon by Hunt Mountain Software.
Troubleshooting
ERROR: Insufficient memory for this data set
This error is unlikely to arise and should occur only for extraordinarily large data sets. If you encounter this, e-mail me at huntmountainsoftware@me.com
Recommended reading
Heck, K.L., Jr., G. Van Belle, and D. Simberloff, 1975. Explicit calculation of the rarefaction diversity measurement and the determination of sufficient sample size. Ecology 56:1459–1461.
Hurlbert, S.H., 1971. The nonconcept of species diversity: a critique and alternative parameters. Ecology 52:577–586.
Raup, D.M., 1975. Taxonomic diversity estimation using rarefaction. Paleobiology 1:333–342.
Tipper, J.C., 1979. Rarefaction and rarefiction - the use and abuse of a method in paleontology. Paleobiology 5:423–434.