Multivariate Data Analysis for Advancing the Interpretation of Bioprocess Measurement and Monitoring Data

The advances in measurement techniques, the growing use of high-throughput screening and the exploitation of ‘omics’ measurements in bioprocess development and monitoring increase the need for effective data pre-processing and interpretation. The multi-dimensional character of the data requires the application of advanced multivariate data analysis (MVDA) tools. An overview of both linear and non-linear MVDA tools most frequently used in bioprocess data analysis is presented. These include principal component analysis (PCA), partial least squares (PLS) and their variants as well as various types of artificial neural networks (ANNs). A brief description of the basic principles of each of the techniques is given with emphasis on the possible application areas within bioprocessing and relevant examples.

Author information

Authors and Affiliations

  1. Jarka Glassey, School of Chemical Engineering and Advanced Materials, Merz Court, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK