DEVELOPMENT OF GROWTH MODELS FOR HATCHERY PRODUCTION USING DATA MINING AND MACHINE LEARNING METHODS

Konstantinos Bovolis*, Kostas Seferis, Gerasimos Antzoulatos, Antonio Coli, Evagelia Sbiliri, Konstantinos Tzakris, Ioannis Despotopoulos
 
Integrated Information Systems SA.
Mitropoleos 43, 15122 Athens, Greece.
kbovolis@i2s.gr
 

Aquaculture companies are drowning in data but starving for knowledge. Data can reveal a great deal about the parameters that influence the success of production, from water quality and feed types to feeding rates and practices, management strategies and more. Data can also be used to identify patterns, trends and the causes of problems, and to develop models.

Data mining and machine learning can help convert data into knowledge, which can be used to dramatically improve performance. Whether it is used to drive new business, reduce current costs or gain a competitive edge, data mining can be a highly transformational asset for every fish farming organization, large or small. Secure and unobtrusive collation of data enables the analysis of huge volumes of historical records, and models built for prediction, estimation and other inferences involving uncertainty turn that analysis into informed, business-driven knowledge. In aquaculture, this knowledge can be used to support smarter decisions, better production and more efficient management.

The work to be presented concerns the use of advanced machine learning methods to develop growth models for sea bream, sea bass and meagre for Selonda SA. The company is one of the biggest producers of sea bass and sea bream worldwide, producing more than 30,000 tons per year in 55 farms. It operates six hatcheries that produce over 150 million juveniles per year. The work was conducted using large datasets of average weight measurements. These datasets were explored using descriptive statistics techniques in order to:

  • Preprocess data, so as to exclude misleading and faulty entries from the analysis
  • Detect outliers in terms of growth and fish density and remove them from the analysis
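
The two screening steps above can be sketched as follows. This is a minimal illustration only: the abstract does not state which outlier rule was applied, so Tukey's interquartile-range fences are assumed here, and the field names (`growth`, `density`) and sample values are hypothetical.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) Tukey fences for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def clean_records(records):
    """Drop faulty entries, then drop growth and density outliers."""
    # Step 1: exclude entries with missing values or an impossible density.
    valid = [
        r for r in records
        if r.get("growth") is not None
        and r.get("density") is not None
        and r["density"] > 0
    ]
    # Step 2: keep only records inside the fences for both variables.
    g_lo, g_hi = iqr_bounds([r["growth"] for r in valid])
    d_lo, d_hi = iqr_bounds([r["density"] for r in valid])
    return [
        r for r in valid
        if g_lo <= r["growth"] <= g_hi and d_lo <= r["density"] <= d_hi
    ]


# Hypothetical sample: growth in % body weight per day, density in kg per m3.
records = [
    {"growth": 2.0 + 0.1 * i, "density": 10 + i} for i in range(8)
] + [
    {"growth": 9.0, "density": 13},    # implausible growth
    {"growth": 2.3, "density": 60},    # implausible density
    {"growth": None, "density": 12},   # missing measurement
    {"growth": 2.4, "density": -5},    # faulty entry
]

cleaned = clean_records(records)
print(len(records), "records in,", len(cleaned), "records kept")
```

The fences are computed separately for each variable; a real pipeline might instead screen outliers per species, per hatchery or per size class.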

Then, machine learning methodologies, such as Generalized Linear Models (GLMs), Generalized Additive Models (GAMs) and Support Vector Machines (SVMs), were used to create models able to predict fish growth as a function of fish density, average weight and temperature. After 10-fold cross-validation, the best models were selected and used to predict growth and create production plans.
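
The model selection step can be illustrated with a short sketch of 10-fold cross-validation. It is not the authors' implementation: to stay self-contained it compares two deliberately simple stand-in candidates (a mean baseline and a straight line in temperature) rather than GLMs, GAMs or SVMs, it uses a single predictor, and the data are synthetic. All function names are hypothetical.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(xs, ys):
    """Baseline candidate: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Least-squares straight line through the training points."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cv_rmse(fit, xs, ys, k=10, seed=0):
    """Root mean squared error over all held-out predictions."""
    squared_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(train_x, train_y)  # fit on the other k-1 folds only
        squared_errors.extend((model(xs[i]) - ys[i]) ** 2 for i in held_out)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Synthetic data: growth rises linearly with temperature, plus noise.
rng = random.Random(42)
temps = [14 + 0.2 * i for i in range(60)]
growth = [0.5 + 0.1 * t + rng.gauss(0, 0.05) for t in temps]

candidates = {"mean baseline": fit_mean, "linear in temperature": fit_line}
scores = {name: cv_rmse(fit, temps, growth) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "-> selected:", best)
```

Every candidate is scored on the same folds, so the comparison reflects how well each model predicts records it was not fitted on, rather than how closely it fits the training data.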

The results of this work are new, more accurate models that support informed and precise strategic business decisions that were not previously possible, giving the company a competitive edge. The new models are presented, together with the ways in which data mining can support knowledge-based production and management methods and richer decision-making capabilities, based on a holistic data framework.