One: Metabolomics analysis process
Generally speaking, metabolome analysis proceeds as follows. First, the metabolites are pretreated; the pretreatment method is determined by the measurement and analysis method. If mass spectrometry is used, the metabolites need to be separated and ionized beforehand. Qualitative and quantitative analysis of the pretreated components is then carried out.
In pretreatment, commonly used separation methods include gas chromatography (GC) and high-performance liquid chromatography (HPLC). Gas chromatography has high resolution, but it requires the metabolic components to be vaporized and places certain restrictions on their molecular mass. High-performance liquid chromatography is also widely used in metabolome analysis. Because it separates the metabolic components in the liquid phase, there is no need to vaporize them, and compared with gas chromatography it has the advantages of a wider measurement range and higher sensitivity. In addition, capillary electrophoresis can also separate metabolic components; it is used less often, but its separation efficiency is theoretically higher than that of high-performance liquid chromatography.
During pretreatment, internal standards are often added to facilitate subsequent monitoring and comparison of sample quality. Since different experimental batches and the order in which samples are run also affect subsequent measurements, blank controls and mixed (pooled) samples are added as well and compared for quality monitoring.
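The role of the internal standard and the mixed samples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the peak areas, the metabolite names, and the key "IS" for the internal standard are hypothetical, and dividing by the internal-standard area followed by a relative standard deviation (RSD) check on repeated injections of the mixed sample is one common convention, not a method prescribed by this article.

```python
from statistics import mean, stdev


def normalize_to_internal_standard(peak_areas, internal_standard):
    """Divide each metabolite peak area by the internal-standard peak area
    measured in the same sample, and drop the internal standard itself."""
    is_area = peak_areas[internal_standard]
    if is_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return {
        name: area / is_area
        for name, area in peak_areas.items()
        if name != internal_standard
    }


def relative_std_dev(values):
    """Relative standard deviation (coefficient of variation) in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical peak areas for three injections of the same mixed (pooled) sample.
qc_injections = [
    {"IS": 1000.0, "glucose": 5000.0, "lactate": 2000.0},
    {"IS": 1100.0, "glucose": 5500.0, "lactate": 2310.0},
    {"IS": 900.0, "glucose": 4500.0, "lactate": 1710.0},
]

normalized = [normalize_to_internal_standard(s, "IS") for s in qc_injections]

# RSD of each metabolite across the repeated injections; a low value
# suggests the measurement was stable over the run.
qc_rsd = {
    name: relative_std_dev([sample[name] for sample in normalized])
    for name in normalized[0]
}
```

In this made-up example the raw glucose areas drift between injections, but after normalization they agree, which is the kind of correction the internal standard is meant to provide. The RSD threshold above which a metabolite would be discarded is a laboratory decision and is not fixed here.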
Methods for the qualitative and quantitative analysis of metabolic components include mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy. Mass spectrometry has the advantages of high sensitivity and strong specificity and is widely used to detect metabolites. It can identify and quantify metabolites after they have been separated and ionized. Ionization methods include atmospheric-pressure chemical ionization (APCI), electron ionization (EI), and electrospray ionization (ESI), among others, and need to be chosen according to the separation method. Electrospray ionization, for example, is commonly used for components separated by liquid chromatography.
However, since conventional mass spectrometry cannot directly analyze biological solutions or tissues, its application has been limited. To improve on the sensitivity of the original technique, simplify sample preparation, and reduce background interference, several newer mass spectrometry-related techniques have been developed. These include secondary-ion mass spectrometry (SIMS) and nanostructure-initiator mass spectrometry (NIMS), both of which are matrix-independent desorption/ionization methods. SIMS uses a high-energy ion beam to desorb material from the sample surface; it offers high spatial resolution and, coupled with mass spectrometry, is a powerful technique for organ and tissue imaging. NIMS can be used to detect small molecules. Matrix-assisted laser desorption/ionization (MALDI) is a milder ionization method that can yield mass spectra of intact macromolecules that conventional ionization methods tend to break into fragments, such as DNA, proteins, peptides, and sugars.
Desorption electrospray ionization (DESI) is a direct ionization technique that can be coupled with mass spectrometry to analyze samples directly under atmospheric conditions. It works by directing a fast-moving stream of charged solution at the sample surface to extract material from it, and it can be used for forensic analysis and for the analysis of drugs, plants, biological tissues, polymers, and so on. Laser ablation electrospray ionization (LAESI) is a direct ionization technique that combines mid-infrared laser ablation with secondary electrospray ionization. It can be used for a wide range of samples, including plants, tissues, cells, and even untreated biological fluids such as blood and urine, and it has been applied in food supervision, drug supervision, and other fields.
NMR spectroscopy does not require pre-separation of metabolic components. Compared with mass spectrometry, it has the advantages of better reproducibility, simpler sample preparation, and less damage to the sample. Its sensitivity is lower than that of mass spectrometry (this is disputed; some believe it is due to an incorrect sample pretreatment workflow), but it is nevertheless widely used because of its ease of use.
Other detection methods are also available. Ion-mobility spectrometry (IMS) separates and analyzes ionized molecules based on their migration through a carrier gas. It is highly sensitive and can be used alone or coupled with mass spectrometry, gas chromatography, or liquid chromatography. High-performance liquid chromatography with electrochemical detection (HPLC-ECD) can measure low-level components in complex matrices and is easy to use, sensitive, and selective; it is used in clinical research, food testing, drug testing, and other fields. Raman spectroscopy is a vibrational spectroscopy technique that can detect the structure of compounds and small changes in it. It has the advantages of being non-destructive, requiring only simple sample pretreatment, and offering high spatial resolution, and it has been used in clinical pathology research, the classification and detection of microorganisms, compound analysis, and other fields.
Two: Metabolomics-related databases (and commonly used software)
Commonly used metabolome-related databases include the Human Metabolome Database (HMDB), the KEGG database, and the Reactome database, among others. They are introduced as follows. The Human Metabolome Database (HMDB) is one of the most popular databases in metabolomics. It contains detailed information on small-molecule metabolites found in the human body, with no fewer than 79,650 metabolite entries. The SMPDB database is linked to the HMDB and contains pathway maps for approximately 700 human metabolic and disease pathways. The KEGG database is another popular metabolome database, containing information on metabolic pathways and interaction networks. The Reactome database mainly collects information on the major metabolic pathways and important reactions of the human body. The MassBank database mainly collects high-resolution mass spectra of low-molecular-weight metabolites.
The METLIN database, a commercial metabolome and tandem mass spectrometry database, contains about 43,000 metabolites and 22,000 MS/MS spectra. The FiehnLib database is a commercial metabolome database containing EI spectra of about 1,000 conserved metabolites.
The NIST/EPA/NIH Mass Spectral Library is also a commercial database, containing more than 190,000 EI spectra. The BioCyc database collects pathway and genomic data and is freely available. The MetaCyc database is a comprehensive collection of information on metabolic pathways and enzymes from many different organisms, drawing on more than 51,000 articles. The MMCD database collects information on more than 10,000 metabolites together with their MS and NMR data; most of them are Arabidopsis metabolites.
Three: Integrating metabolomics with other omics data
How best to integrate different omics data remains a major challenge for the biological community, and it is sometimes compounded by imperfect experimental design and by the need to combine data from different experimental platforms. Commonly used approaches include metabolic pathway-level analysis, biological network analysis, and empirical correlation analysis. Several software packages and websites provide ready-to-use analyses that integrate multiple omics data types.
For metabolic pathway enrichment analysis, the IMPaLA website uses information on more than 3,000 metabolic pathways from 11 databases and can be used for integrated multi-omics analysis; the iPEAP software and the MetaboAnalyst website also provide pathway enrichment analysis. For biological network analysis, the SAMNetWeb website provides pathway enrichment analysis and network analysis of transcriptome and proteome data; pwOmics is an R package that can construct networks from transcriptome and proteome data that change over time; similar tools include MetaMapR (an R package with a user interface), MetScape (a Cytoscape plug-in), and Grinn (an R package). For empirical correlation analysis, WGCNA (an R package) can integrate and analyze multiple kinds of omics data based on correlation and network topology; other R packages include mixOmics, DiffCorr, qpgraph, and huge.
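To make the idea of pathway enrichment concrete, the following is a minimal sketch of an over-representation test based on the hypergeometric distribution, applied to a pooled list of gene and metabolite identifiers. It is a generic textbook calculation and is not claimed to be the exact procedure used by IMPaLA, iPEAP, MetaboAnalyst, or SAMNetWeb. The function name, the identifiers, and the toy pathway are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background_size, pathway_size, hits_total, hits_in_pathway):
    """Probability of seeing at least `hits_in_pathway` pathway members when
    `hits_total` items are drawn without replacement from a background of
    `background_size` items, `pathway_size` of which belong to the pathway."""
    upper = min(pathway_size, hits_total)
    total = comb(background_size, hits_total)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, hits_total - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return tail / total


# Hypothetical pathway containing both genes and metabolites.
pathway = {"gene:HK1", "gene:PFKM", "met:glucose", "met:pyruvate"}
# Hypothetical background: everything that was measured in either omics layer.
background = pathway | {f"other:{i}" for i in range(6)}
# Hypothetical set of significantly changed genes and metabolites.
changed = {"gene:HK1", "met:glucose", "other:0"}

p_value = hypergeom_enrichment_p(
    len(background), len(pathway), len(changed), len(changed & pathway)
)
```

Pooling genes and metabolites into one background is only one possible design; an alternative is to test each omics layer separately and then combine the resulting p-values. In a real analysis many pathways are tested, so the p-values would also need a multiple-testing correction.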
Four: Statistical analysis methods and strategies for metabolomics data
After the metabolomics data have been acquired, software is needed to read and process the raw data in order to determine which metabolic components are present and in what amounts. Many software tools can read and analyze NMR spectra and mass spectrometry data. XCMS is a commonly used free tool for reading and analyzing raw mass spectrometry data. Similar commonly used tools include MZmine 2, MetAlign, MathDAMP, and LCMStats.
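What "determining composition and content" means at the lowest level can be illustrated with a toy peak-picking routine: find local maxima in an intensity trace and integrate the area under each peak. This is a deliberately simplified sketch and is not the algorithm used by XCMS or any of the other tools named above, which rely on considerably more robust methods. The function names, the threshold, and the trace values are hypothetical.

```python
def pick_peaks(intensities, min_height):
    """Return the indices of local maxima whose intensity is at least min_height."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        left, here, right = intensities[i - 1], intensities[i], intensities[i + 1]
        # Strictly higher than the left neighbour so a flat top is reported once.
        if here >= min_height and here > left and here >= right:
            peaks.append(i)
    return peaks


def peak_area(times, intensities, start, end):
    """Trapezoidal area under the trace between index start and index end."""
    return sum(
        (times[i + 1] - times[i]) * (intensities[i] + intensities[i + 1]) / 2.0
        for i in range(start, end)
    )


# Hypothetical chromatogram: retention time (arbitrary units) and intensity.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
intensities = [0.0, 1.0, 5.0, 20.0, 6.0, 1.0, 3.0, 12.0, 2.0]

peak_indices = pick_peaks(intensities, min_height=10.0)
first_peak_area = peak_area(times, intensities, start=1, end=5)
```

Here the peak position stands in for the identity of a component (via its retention time) and the area for its amount. Real data additionally require noise filtering, baseline correction, and alignment of retention times across samples.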
Once the metabolite composition and abundances have been obtained, the data can be analyzed statistically. Commonly used methods include principal component analysis (PCA), partial least squares regression, cluster analysis, and differential expression analysis. The results can also be analyzed for functional and pathway enrichment using the databases described above.
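As an illustration of the differential analysis step, the sketch below computes a log2 fold change and Welch's t statistic for one metabolite measured in two groups. The abundance values and group names are hypothetical, and Welch's test is only one reasonable choice; the article does not prescribe a specific test. Turning the statistic into a p-value requires the t distribution, which is not in the Python standard library, so that step is left out here.

```python
from math import log2, sqrt
from statistics import mean, variance


def log2_fold_change(case, control):
    """log2 of the ratio of group means; positive means higher in the case group."""
    return log2(mean(case) / mean(control))


def welch_t(case, control):
    """Welch's t statistic, which does not assume equal variances in the two groups."""
    standard_error = sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / standard_error


# Hypothetical normalized abundances of one metabolite in two groups.
control_group = [8.0, 10.0, 12.0]
case_group = [18.0, 20.0, 22.0]

fold_change = log2_fold_change(case_group, control_group)
t_statistic = welch_t(case_group, control_group)
```

In practice the same calculation is repeated for every metabolite, and the resulting p-values are corrected for multiple testing before metabolites are passed on to enrichment analysis.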