Experimental design and statistical power of fish growth studies  

Helgi Thorarensen, Godfrey Kubiriza, Albert K. Imsland
Department of Aquaculture and Fish Biology
Holar University College
551 Sauðárkrókur
Iceland

Numerous studies are published every year that compare the effects of different factors on the growth of aquaculture fish. Little attention has been given to the experimental design of these studies, for example how many rearing units each treatment should be tested in, how many fish each tank should hold (n), and how the data should be analyzed. A survey of recent publications suggests that most (80%) aquaculture growth studies apply each treatment in triplicate, with an average of 26 fish in each tank (range: 4 to 100).

The null hypotheses of statistical tests assume no effect of treatment, and they are rejected when the probability of Type I error (rejecting a true null hypothesis) is less than 0.05. In contrast, failure to reject an incorrect null hypothesis is called Type II error. The probability of Type II error is denoted as β, and 1-β is the statistical power of the test, i.e. the probability of finding a difference where one truly exists. Statistical power increases with the number of replicate rearing units per treatment, with n, with the difference between the mean responses to treatments, and with reduced variance of the data.
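As a rough illustration of how these factors determine power, the following Python sketch estimates power by simulation for the simplest case of two treatments, with tanks as the experimental units and a t-test on tank means. It is a minimal sketch and not the procedure used in the present study: the function name, the baseline mean of 100, and all standard deviations are hypothetical, and the critical values are the standard two-sided 5% points of Student's t.

```python
import random
import statistics

# Two-sided 5% critical values of Student's t, indexed by degrees of freedom.
T_CRIT_05 = {2: 4.303, 4: 2.776, 6: 2.447, 8: 2.306, 10: 2.228}


def simulate_power(diff, sd_tank, sd_fish, n_tanks=3, n_fish=25,
                   n_sims=2000, seed=1):
    """Estimate the power of a two-treatment comparison based on tank means.

    diff     true difference between treatment means (hypothetical units)
    sd_tank  standard deviation between tanks within a treatment
    sd_fish  standard deviation between fish within a tank
    """
    rng = random.Random(seed)
    df = 2 * (n_tanks - 1)          # tanks, not fish, supply the error df
    t_crit = T_CRIT_05[df]
    rejections = 0
    for _ in range(n_sims):
        groups = []
        for mu in (100.0, 100.0 + diff):   # assumed baseline mean of 100
            tank_means = []
            for _ in range(n_tanks):
                tank_effect = rng.gauss(0.0, sd_tank)
                fish = [rng.gauss(mu + tank_effect, sd_fish)
                        for _ in range(n_fish)]
                tank_means.append(statistics.fmean(fish))
            groups.append(tank_means)
        a, b = groups
        # Equal group sizes, so the pooled variance is the plain average.
        pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
        se_diff = (2 * pooled_var / n_tanks) ** 0.5
        t = (statistics.fmean(b) - statistics.fmean(a)) / se_diff
        if abs(t) > t_crit:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    for d in (0, 10, 20, 30):
        print(d, simulate_power(d, sd_tank=5.0, sd_fish=20.0))
```

With a true difference of zero, the estimated rejection rate should stay close to the nominal 0.05; it rises towards 1 as the difference grows, as replication increases, or as the variance components shrink.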

The minimum difference that can reliably be detected in an experiment depends on the statistical power of the design. In the present study, we accumulated information on the variance of data in aquaculture growth studies to estimate the statistical power of different designs and the minimum detectable difference with 80% statistical power. The results suggest that the variance is similar for different aquaculture species and, therefore, that the same designs (level of replication and n) are suitable for studies on different species of fish.

The minimum difference in mean body mass between treatment groups that can be detected in a typical aquaculture study (triplicates, 25 fish in each tank and average variance) with 80% statistical power (less than 20% chance of Type II error) is around 26% of the grand mean. Increasing n from 25 to 100 will reduce the minimum detectable difference to 19%, while a further increase in n will have comparatively little effect. Detecting differences smaller than 10% of the grand mean is only possible when care is taken to limit the size range of the fish used in the study to reduce variance.
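One way to see why additional fish per tank give diminishing returns is to split the variance of a tank mean into a between-tank component and a within-tank component divided by n. The sketch below assumes this standard nested-design decomposition; the function name and the two standard deviations (expressed as percent of the grand mean) are hypothetical and are not the estimates obtained in the present study.

```python
import math


def se_treatment_mean(sd_tank, sd_fish, n_fish, n_tanks=3):
    """Standard error of a treatment mean when tanks are the replicates.

    Variance of one tank mean = sd_tank**2 + sd_fish**2 / n_fish;
    averaging over n_tanks tanks divides that variance by n_tanks.
    """
    var_tank_mean = sd_tank ** 2 + sd_fish ** 2 / n_fish
    return math.sqrt(var_tank_mean / n_tanks)


# Hypothetical variance components, in percent of the grand mean.
SD_TANK, SD_FISH = 5.0, 30.0

for n in (25, 50, 100, 200, 400):
    print(n, round(se_treatment_mean(SD_TANK, SD_FISH, n), 2))
```

Under these assumed values, the standard error falls more between n = 25 and n = 100 than between n = 100 and n = 400, and it can never drop below sd_tank / sqrt(n_tanks) however many fish are sampled. Beyond that point, only more tanks per treatment or lower variance will improve the design.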

Simulations were performed in which samples were repeatedly drawn from artificial populations with identical distributions, using the same experimental designs as are used in growth studies. Two of the populations had dose-dependent responses to treatment, while one population showed no response to treatment. The resulting data were analyzed with a mixed-model ANOVA and by fitting either polynomials or asymptotic models to the data. Contrary to earlier suggestions, the critical treatment (the minimum treatment that generates the maximum response) estimated with the ANOVA was closer to the true population response than the critical treatments estimated with the non-linear models.