Multivariate Data Analysis for Advancing the Interpretation of Bioprocess Measurement and Monitoring Data
The advances in measurement techniques, the growing use of high-throughput screening and the exploitation of ‘omics’ measurements in bioprocess development and monitoring increase the need for effective data pre-processing and interpretation. The multi-dimensional character of the data requires the application of advanced multivariate data analysis (MVDA) tools. An overview of both linear and non-linear MVDA tools most frequently used in bioprocess data analysis is presented. These include principal component analysis (PCA), partial least squares (PLS) and their variants as well as various types of artificial neural networks (ANNs). A brief description of the basic principles of each of the techniques is given with emphasis on the possible application areas within bioprocessing and relevant examples.
Author information
Authors and Affiliations
- Jarka Glassey, School of Chemical Engineering and Advanced Materials, Merz Court, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK