Bioinformatics
BMJ 2002; 324 doi: https://doi.org/10.1136/bmj.324.7344.1018 (Published 27 April 2002) Cite this as: BMJ 2002;324:1018
Sir - The field of bioinformatics is ever expanding. This paper
highlights only one aspect of bioinformatics, namely its application to
the study of genomics (1). In this post-genomic era, where the study of
proteomics and epigenetics is equally important to the understanding of
the fundamental mechanisms of disease processes, the broad scope of
the application of bioinformatics in the medical field needs to be
appreciated. The utility of bioinformatics in imaging for the
management of cancers, for the purposes of screening, diagnosis, staging
and treatment, has been well described (2).
The contribution of bioinformatics to improving the role of imaging
in clinical practice is potentially tremendous (2, 3). A new dimension of
the role of bioinformatics, expanding the potential of biological
imaging as a diagnostic and therapeutic tool unlimited by time and
space, has been created recently (4). The advent of information technology
has opened up immense possibilities for innovation and creativity in
information-rich modern medical practice, ranging from diagnostic
to therapeutic processes. However, it is very apparent that little
reference has been made to the application of bioinformatics in the field
of electron microscopy as a powerful imaging tool (4). This is
particularly relevant in the context of obtaining quantitative data sets
from image databases. Experience with an electron microscopy
workstation constructed with an in-built energy-dispersive
X-ray (EDAX) facility, which provides parallel quantitative data
on the imaged elements and compounds, could offer some insight in that
direction (5).
There is an earnest effort in the Open Microscopy Environment (OME)
project to make image informatics parallel the well-established genomics
databases, against which reference can be made to identify any novel
gene (4). The lack of congruence between the two
systems was ascribed to the failure of current image bioinformatics to
yield quantitative data. Furthermore, it should be remembered that the
extensively available DNA databases are the “spill-over” of the courageous
Human Genome Project over so many years. OME has had no equivalent
forerunner since its inception, and it is doubtful whether
it can become a versatile reference database against which any novel image
discovery can be checked to assist rare diagnoses. By contrast, the genomic
databases available are instrumental in guiding research the world over with
regard to gene identification.
In conclusion, bioinformatics has become an essential composite
research tool across the broadest range of biomedical research and will
provide enormous opportunity and impetus for the future. It is hoped that
future articles on bioinformatics published in a journal as widely read as
the BMJ will cover other biological areas apart from genomics.
References
1. Bayat A. Science, medicine, and the future: Bioinformatics.
BMJ 2002;324:1018-1022 (27 April).
2. Haque S, Mital D, Srinivasan S. Advances in Biomedical Informatics
for the management of Cancer. Ann. N.Y. Acad. Sci. 2002;980:287-297.
3. http://www2.dimag.com/pacsweb/
4. Swedlow JR, Goldberg I, Brauner E, Sorger PK.
Informatics and Quantitative Analysis in Biological Imaging. Science
2003 Apr 4:100-102.
5. Che Ghazali FB, Mat Sain AH, Mat Asan J. High-resolution
Visualization And Microstructure Characterization of Mixed Biliary Stones.
Annals of Microscopy 2003 March;3:111-116.
Competing interests:
None declared
Re: Bioinformatics
An interdisciplinary analytical science that was born to play a supportive or complementary role in various molecular experimental analyses has finally emerged as a discipline in its own right.
The growing volume of biological data generated by high-throughput techniques created an immense need for fast, accurate and reliable analytical techniques.
Has bioinformatics moved beyond its perceived role as a supportive discipline over the past decades? We would say, definitely yes.
The amount of literature regularly published on core bioinformatics work is proof of its emergence as an important discipline in its own right. In this context, some areas of bioinformatics that have grown tremendously are discussed below.
Dynamic database management systems with workbench and analytical tool support:
Biological databases have moved beyond merely depositing and retrieving experimentally or theoretically derived scientific records. Thousands of biologically important, specialized dynamic databases come online regularly; these can be tracked in issues of journals such as Nucleic Acids Research and other database-related journals (1).
The features of NCBI described earlier (2) have also changed drastically, along with allied databases and tools such as BioProject, BioSystems, ClinVar, Epigenomics, GEO, SNP and SRA. Present-day databases are also coupled with analytical tools that are useful for comparative analysis and information retrieval. For instance, the HIV database (http://www.hiv.lanl.gov/content/sequence/HIV/mainpage.html) is a highly specialized resource associated with an array of analytical servers and tools.
Workbenches and software:
Bioinformatics now spans several important subdisciplines, such as sequence analysis, phylogenetic analysis and comparative genomics. Underlying this success is the development of many software tools for better and more robust analysis. Broadly, these tools are of two types: online servers and offline executables (those that are installed and run locally on a system). They can be further categorized as workbenches and specific software. Workbenches or suites, such as GCG, CLC and Galaxy for sequence analysis, are large collections of tools that serve as a platform for several related analyses. Specific software addresses a particular analysis; PHYLIP, MEGA and PAUP, for example, are used for phylogenetic analysis.
High-throughput data generation in genomics, transcriptomics, proteomics, metabolomics and other “omics” fields has compelled the bioinformatics sector to develop large-scale, accurate and robust analytical tools with better visualization and data interpretation abilities. Such tools include microarray and NGS data analysis platforms, machine learning platforms and systems biology workbenches.
Microarray data analysis:
Large-scale gene expression data generated through microarray techniques require extensive analysis capacity and computing efficiency. Generating data through Affymetrix or other chips is one half of the work; the other half is the extensive and sensitive computation, in which rigorous statistical calculations and stepwise analyses are performed using efficient tools such as R and Bioconductor. Detecting disease-causing genes or proteins among a huge pool of candidates is now feasible through this technology (3, 4). Microarray methodologies allow large-scale genomic and proteomic analyses to be performed in less time.
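To make the statistical step concrete, the following is a minimal sketch of ranking genes by differential expression between disease and control samples. It is written in Python purely for illustration and is not the R or Bioconductor workflow itself; the gene names, intensity values, the `rank_genes` helper and the cut-off of 4.0 are all hypothetical. A real analysis would also normalize the chips and correct for multiple testing, which this sketch omits.

```python
from math import log2, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rank_genes(expression, min_abs_t=4.0):
    """Return (gene, log2 fold change, t) for genes with |t| >= min_abs_t,
    sorted so that the strongest signal comes first."""
    hits = []
    for gene, (disease, control) in expression.items():
        t = welch_t(disease, control)
        if abs(t) >= min_abs_t:
            fold_change = log2(mean(disease) / mean(control))
            hits.append((gene, fold_change, t))
    return sorted(hits, key=lambda hit: abs(hit[2]), reverse=True)


# Toy intensities per gene: (disease samples, control samples).
# These values are invented for illustration, not real chip data.
expression = {
    "GENE_A": ([820.0, 790.0, 845.0, 810.0], [205.0, 198.0, 215.0, 190.0]),
    "GENE_B": ([300.0, 310.0, 295.0, 305.0], [298.0, 312.0, 301.0, 299.0]),
    "GENE_C": ([100.0, 95.0, 110.0, 105.0], [400.0, 420.0, 390.0, 410.0]),
}

for gene, fold_change, t in rank_genes(expression):
    print(f"{gene}\tlog2FC={fold_change:+.2f}\tt={t:+.1f}")
```

With these toy values, GENE_A (raised in disease) and GENE_C (lowered in disease) pass the cut-off, while GENE_B, whose two groups have the same mean, does not.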
Next generation Sequencing (NGS) data analysis:
Like microarray data analysis, NGS data analysis is a recent addition to the arsenal of bioinformatics techniques, in which gigabases of sequence are read in parallel and analyzed using sophisticated bioinformatics protocols. This technology is a stepping stone towards the successful implementation of personalized medicine in the future. Although cost remains a barrier, it is now possible to sequence the genome of a patient and identify disease trends through this kind of analysis (5). We hope these techniques will become available to ordinary people at a lower cost in the near future.
Impact of modern machine learning techniques in Biological data analysis:
Machine learning techniques have emerged as among the most effective tools for classifying and clustering complex and highly overlapping biological datasets (6, 7), and for assessing the relationships among important parameters (8). Bio-inspired algorithms such as artificial neural networks (ANN) (9) and ant colony optimization (ACO) are extensively used for the classification and prediction of genes, proteins and other entities. Other techniques, such as support vector machines (SVM), decision tree based approaches and self-organizing maps (SOM), are also used for classifying and clustering important biological data, including disease-causing genes and proteins. Now and in the future, these computational techniques, integrated with experimental methods, may be used to identify hitherto unknown genes and proteins in humans and in disease-causing pathogens.
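As a small illustration of classification from protein sequence-derived features, the sketch below turns each sequence into its amino acid composition and assigns a new sequence to the class with the nearest average composition. A nearest-centroid classifier is used here only because it is short and transparent; it is a stand-in for, not an example of, the ANN, SVM or decision tree methods named above. The sequences, class labels and function names are hypothetical.

```python
from collections import Counter
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Fraction of each of the 20 standard amino acids in a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS) or 1
    return [counts[aa] / total for aa in AMINO_ACIDS]


def train_centroids(labelled_sequences):
    """Average the composition vectors of the training sequences per class."""
    grouped = {}
    for sequence, label in labelled_sequences:
        grouped.setdefault(label, []).append(composition(sequence))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify(sequence, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    features = composition(sequence)
    return min(centroids, key=lambda label: dist(features, centroids[label]))


# Invented training sequences: hydrophobic-rich versus charged-rich peptides.
training = [
    ("LLVVAIILFLGAVL", "hydrophobic"),
    ("AILVFMLLIVAGLL", "hydrophobic"),
    ("DEKRKEEDRKDEKR", "charged"),
    ("KKEDDRREEKDKER", "charged"),
]
centroids = train_centroids(training)

print(classify("LIVLAAFLIVGL", centroids))
print(classify("EEKKDDRRKE", centroids))
```

Real datasets overlap far more than these two toy classes, which is where the more sophisticated methods earn their place.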
Systems biology: towards understanding whole organisms:
Understanding an organism completely remains out of reach even for today’s sophisticated methodologies and protocols. The strength of predictive or computational approaches lies in enabling scientific exploration where experimental techniques either cannot reach or are highly expensive and time consuming. Systems biology is one such discipline, raising the hope of understanding a complete biochemical pathway or a small pathogenic organism as a whole. Various systems biology workbenches have been developed, along with dedicated languages such as SBML (Systems Biology Markup Language).
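Because SBML is an XML-based format, a model written in it can be read with general-purpose tools. The sketch below uses Python's standard XML parser to list the species and reactions of a one-reaction toy model. The model, its identifiers and the `summarise` helper are invented for illustration, and the fragment is deliberately stripped down: it omits attributes that a complete, valid SBML Level 3 file would need. Dedicated SBML libraries would normally be preferred for real models.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core documents.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# A stripped-down, hypothetical model: glucose is converted to g6p.
MODEL = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_pathway">
    <listOfSpecies>
      <species id="glucose" compartment="cell"/>
      <species id="g6p" compartment="cell"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="hexokinase">
        <listOfReactants><speciesReference species="glucose"/></listOfReactants>
        <listOfProducts><speciesReference species="g6p"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def summarise(sbml_text):
    """Return (model id, species ids, {reaction id: (reactants, products)})."""
    root = ET.fromstring(sbml_text)
    model = root.find("sbml:model", SBML_NS)
    species = [
        s.get("id")
        for s in model.findall("sbml:listOfSpecies/sbml:species", SBML_NS)
    ]
    reactions = {}
    for reaction in model.findall("sbml:listOfReactions/sbml:reaction", SBML_NS):
        reactants = [
            ref.get("species")
            for ref in reaction.findall(
                "sbml:listOfReactants/sbml:speciesReference", SBML_NS
            )
        ]
        products = [
            ref.get("species")
            for ref in reaction.findall(
                "sbml:listOfProducts/sbml:speciesReference", SBML_NS
            )
        ]
        reactions[reaction.get("id")] = (reactants, products)
    return model.get("id"), species, reactions


model_id, species, reactions = summarise(MODEL)
print(model_id, species, reactions)
```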
The journey continues:
Bioinformatics today provides extensive analytical support and tools to molecular bioscience while establishing itself as a discipline in its own right. Compared with previous decades, this interdisciplinary applied science has shown very fast growth and a strong capacity for developing important tools. At present it is not possible to capture the full real-time complexity of biological or medical science on a desktop system and simulate it in realistic depth, but the growing availability and capability of specialized hardware, software, internet connectivity and algorithms may make this possible in the near future.
References:
1. Fernández-Suárez XM, Galperin MY. The 2013 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection. Nucl. Acids Res. 2013;41(D1):D1-D7. DOI: 10.1093/nar/gks1297
2. Bayat A. Bioinformatics. BMJ 2002; 324:1018–22.
3. Olson NE. The microarray data analysis process: from raw data to biological significance. NeuroRx. 2006;3(3):373-83.
4. Allison DB, Cui X, Page GP, Sabripour M. Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet. 2006;7(1):55-65.
5. Rizzo JM, Buck MJ. Key Principles and Clinical Applications of "Next-Generation" DNA Sequencing. Cancer Prev Res 2012; 5:887-900.
6. Banerjee AK, Ravi V, Murty US, Shanbhag AP, Prasanna VL. Keratin protein property based classification of mammals and non-mammals using machine learning techniques. Comput Biol Med. 2013;43(7):889-99. DOI: 10.1016/j.compbiomed.2013.04.007.
7. Banerjee AK, Ravi V, Murty US, Sengupta N, Karuna B. Application of intelligent techniques for classification of bacteria using protein sequence-derived features. Appl Biochem Biotechnol. 2013; 170(6):1263-81. DOI: 10.1007/s12010-013-0268-1.
8. Banerjee AK, Manasa BP, Murty US. Assessing the relationship among physicochemical properties of proteins with respect to hydrophobicity: a case study on AGC kinase superfamily. Indian J Biochem Biophys. 2010;47(6):370-377.
9. Banerjee AK, Kiran K, Murty US, Venkateswarlu Ch. Classification and identification of mosquito species using artificial neural networks. Comput Biol Chem. 2008;32(6):442-7. DOI: 10.1016/j.compbiolchem.2008.07.020.
Competing interests: No competing interests