Colorectal cancer (CRC) is one of the most common malignancies of the digestive system worldwide. In China, nearly one fifth of CRC patients have liver metastasis (LM) at first diagnosis, and half of CRC patients develop LM during disease progression. LM is a very important factor in the prognosis of CRC (1,2). It is caused by the interaction of multiple factors, and its clinical phenotype, characteristics, and molecular mechanism are currently hot topics in both clinical and basic research. Variation in the tumor cell genome is an important factor in the occurrence and metastasis of tumors (3-5). In recent years, whole-exome sequencing has been widely used in the study of various diseases, especially in oncology research. Several functional genes associated with targeted therapy or immunotherapy have been identified in non-small cell lung carcinoma (NSCLC) (6,7). Whole-exome sequencing has also repeatedly confirmed that tumor protein p53 (TP53), Kirsten rat sarcoma viral oncogene homolog (KRAS), adenomatous polyposis coli (APC), and other genes play an important role in the occurrence and progression of CRC (8,9). However, the genomic characteristics of LM from CRC are still unclear. Although there are many next-generation sequencing (NGS) studies of CRC, relatively few have examined gene alterations in CRC with LM, and exome sequencing data on CRC with LM are even scarcer in China.
In this study, primary tumor tissues, matched normal tissues, and liver metastasis specimens were collected from 4 CRC patients with LM at the General Surgery Department of the First Affiliated Hospital of Soochow University, and high-depth whole-exome sequencing was performed to analyze the mutated genes, tumor mutation burden (TMB), molecular functions, and signaling pathways related to CRC with LM. We hope that this study can deepen the understanding of CRC with LM and provide potential marker molecules and therapeutic targets for its diagnosis and treatment. We present the following article in accordance with the MDAR reporting checklist (available at http://dx.doi.org/10.21037/jgo-21-9).
Participants and samples
CRC patient sample acquisition
We collected primary CRC tumor tissues, matched normal tissues, and liver metastatic tumor tissues of 4 patients from the Department of General Surgery of the First Affiliated Hospital of Soochow University from September 2019 to April 2020. Informed written consent was provided by all patients before inclusion in this study. All tumor tissue samples were histologically confirmed as adenocarcinoma by two molecular pathologists and met the inclusion criteria. This study was conducted in accordance with the Declaration of Helsinki (as revised in 2013) and was approved by the Ethics Committee of the First Affiliated Hospital of Soochow University (No. 2021-023).
DNA was extracted from primary tumor tissues, liver metastatic tissues, and matched normal tissues using the E.Z.N.A. Tissue DNA kit (Omega Bio-Tek, Norcross, GA, USA) and the QIAamp DNA FFPE Tissue kit (Qiagen Sciences, Venlo, Netherlands). The DNA extracted from normal tissues was used as the germline DNA control. Pathologists estimated the cancer cell content of each tumor sample to ensure that more than 75% of cells were cancer cells. A bioanalyzer (Agilent, Palo Alto, CA, USA) was used to assess DNA quantity.
Whole-exome sequencing and data processing
Sample genomic DNA was fragmented using NEBNext dsDNA Fragmentase (NEB, Ipswich, MA, USA), followed by DNA end repair. End-repaired DNA fragments were dA-tailed and ligated with the NEBNext adaptor (NEB, Ipswich, MA, USA). Biotinylated RNA library baits and magnetic beads were mixed with the barcoded library to select the targeted regions using the SureSelect Human All Exon V6 Kit (Agilent Technologies, Palo Alto, CA, USA). The captured sequences were further amplified for 150 bp paired-end sequencing on the Illumina X-ten system (Illumina, San Diego, CA, USA).
Somatic single nucleotide variant and insertion/deletion identification
To identify somatic single nucleotide variants (SNVs) and insertions/deletions (InDels), the Burrows-Wheeler Aligner (BWA) (10) was used to align the clean reads from each sample against the human reference genome (GRCh38). Picard (http://picard.sourceforge.net/index.shtml) was used to remove duplicate reads arising from library polymerase chain reaction (PCR). Somatic SNV and InDel calling across multiple samples was performed with MuTect (https://software.broadinstitute.org/cancer/cga/mutect) (11). Tumor mutation burden (TMB) was calculated as the number of somatic mutations (SNVs and InDels) per megabase (Mb).
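The TMB definition above can be illustrated with a minimal Python sketch. The function name, its parameters, and the example counts are hypothetical and are not taken from the study; in particular, the text does not state which covered-region size was used as the denominator, so `covered_bases` is an assumption supplied by the caller.

```python
def tumor_mutation_burden(n_snvs: int, n_indels: int, covered_bases: int) -> float:
    """Return somatic mutations (SNVs + InDels) per megabase of covered sequence.

    covered_bases is the size of the region in which variants could be called
    (an assumption here; the study does not report the value it used).
    """
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    megabases = covered_bases / 1_000_000
    return (n_snvs + n_indels) / megabases


# Hypothetical example values, chosen only to show the arithmetic.
example_tmb = tumor_mutation_burden(n_snvs=500, n_indels=100, covered_bases=60_000_000)
print(f"{example_tmb:.1f} mutations/Mb")
```

Because the result scales inversely with the denominator, TMB values are only comparable between samples when the same covered-region size is used for all of them.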
The screened gene sets were functionally annotated using an in-house script based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) (12) and Gene Ontology (GO) (13) databases. The significance of gene set enrichment was assessed with a modified Fisher’s exact test, and a P value <0.05 was considered statistically significant.
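As an illustration of this kind of enrichment test, the sketch below computes a one-sided Fisher's exact (hypergeometric) P value for over-representation of a term in a gene list. It is not the in-house script: the text does not specify which modification of the test was applied, so the standard unmodified test is shown, and the function name, parameters, and example counts are all hypothetical.

```python
from math import comb


def enrichment_p_value(hits: int, list_size: int, term_size: int, background_size: int) -> float:
    """One-sided Fisher's exact P value for over-representation.

    hits            -- mutated genes annotated to the term
    list_size       -- total mutated genes tested
    term_size       -- background genes annotated to the term
    background_size -- total genes in the background
    Returns P(X >= hits) under the hypergeometric distribution.
    """
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 100 mutated genes fall in a 200-gene term,
# against a background of 20,000 genes.
p = enrichment_p_value(hits=8, list_size=100, term_size=200, background_size=20_000)
print(f"P = {p:.3g}, significant at 0.05: {p < 0.05}")
```

Exact integer arithmetic through `math.comb` avoids the rounding problems of factorial-based formulas. When many terms are tested at once, a multiple-testing correction would normally be applied on top of the raw P values.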
The software SPSS (IBM Corp., Armonk, NY, USA) (14) was used to analyze all the correlated biological and clinical variables.
The mean age of the 4 CRC patients with LM was 64.5 years (range, 46–77 years); 3 were male and 1 was female. All 4 were diagnosed with adenocarcinoma, and none received chemoradiotherapy before surgery. All 4 participants had stage IV disease. Two tumors were histological grade 2 and the other 2 were histological grade 3. All 4 participants had lymphatic invasion and perineural invasion. Carcinoembryonic antigen (CEA) was elevated in all 4 participants (Table 1).
We collected the primary tumor tissues, matched paracancerous normal tissues, and LMs of 4 patients admitted to the Department of General Surgery of the First Affiliated Hospital of Soochow University, and performed high-depth whole-exome sequencing, with the sequencing depth of each sample reaching 123× on average (Table S1).
Identification of somatic single nucleotide variants and InDels
The normal tissue sample of each participant served as the matched control for that participant's primary tumor sample. To allow comparison with the LM samples, the first participant's normal tissue was named CRC01C (CRC01 control), the primary tumor sample was named CRC01CvsT (CRC01 control vs. tumor), and the LM sample was named CRC01CvsLM (CRC01 control vs. liver metastasis); the samples of the remaining participants were named in the same way. After filtering, comparing, and analyzing the sequencing data, 8,565 SNVs and 429 InDels were detected across all samples; CRC04CvsT had the largest number of SNVs (5,940), followed by CRC01CvsLM (1,094) (Table S2). Significantly mutated genes (SMGs) are genes whose mutation frequency is significantly higher than the background mutation rate. Somatic SNVs and InDels are generally analyzed together. MuSiC software (The Genome Institute, Washington University, USA) (15) was used to search for genes with a higher mutation frequency in tumor samples than in control samples. An SMG test was performed for each type of mutation using the convolution test (CT). The genes with the highest mutation frequency were titin (TTN), obscurin (OBSCN), and homeodomain-interacting protein kinase 2 (HIPK2). We identified the 30 genes with the highest mutation frequency. We also calculated the TMB of each sample; the average TMB was over 16 mutations/Mb (range, 5–60 mutations/Mb) (Figure 1).
KEGG and GO analysis
We conducted GO and KEGG pathway analysis of the mutated genes. The GO analysis showed that the mutated genes were mainly enriched in the terms cell, cell part, and cellular process (Figure 2). The KEGG pathway analysis showed that the mutations were mainly distributed in circadian entrainment, insulin secretion, and glutamatergic synapse (Figure 3).
Gene mutation and KEGG and GO analysis in LM samples
We further analyzed the most frequently mutated genes in the LM samples, which were TTN, OBSCN, and hydrocephalus-inducing protein homolog (HYDIN) (Figure 4). The GO analysis showed that the mutated genes in LM tissues were mainly enriched in the terms cell, cell part, and cellular process (Figure 5). The KEGG pathway analysis showed that the frequently mutated genes were concentrated in gastric acid secretion, bile secretion, and melanogenesis (Figure 6).
Tumor metastasis is an important factor in tumor progression, and distant metastasis is often the main reason for treatment failure. The process of LM in CRC is extremely complex and involves many tumor-related molecules, which are induced and regulated by their respective signaling pathways. With the continuous progress of research techniques in recent years, a growing body of evidence has shown that tumor development and metastasis are a dynamic, complex process involving interaction and cross-regulation among the cancer genome (16), transcriptome (17), epigenetics (18), proteome (19), metabolome (20), and tumor microenvironment (21). In this study, we collected the primary tumor tissues, matched paracancerous normal tissues, and metastatic tissues of 4 patients admitted to the General Surgery Department of the First Affiliated Hospital of Soochow University and conducted high-depth whole-exome sequencing in an attempt to explore the molecular mechanism of CRC LM at the genomic level. The average sequencing depth was 123× per sample. By filtering, comparing, and analyzing the data, we identified 8,565 SNVs and 429 InDels in primary tumors and hepatic metastases, and we found that the genes with the highest mutation frequency were TTN, OBSCN, HIPK2, and HYDIN.
TTN was the most frequently mutated gene and was mutated in all 4 participants with LM. TTN is known to encode titin, and mutations in this gene are also associated with familial hypertrophic cardiomyopathy (HCM) 9 (22). It has also been identified as a chromosomal structural protein, and TTN has been reported to promote bone metastasis in breast cancer (23). We speculate that variation in TTN might lead to chromosome instability and thus promote the occurrence and metastasis of tumors.
The OBSCN gene is over 150 kb long and contains over 80 exons encoding a protein of approximately 720 kDa that belongs to the sarcomeric signaling protein family (24,25). It is an important signaling protein involved in the modulation of multiple cellular signals. We hypothesize that OBSCN mutation might lead to abnormal intracellular signal transduction that drives changes in cell function.
Because of the small number of patients with CRC LM, it is difficult to obtain suitable LM tissue samples for sequencing. In future work, we will increase the number of CRC LM samples subjected to whole-exome sequencing to obtain more data. In addition, owing to financial constraints, this study was not combined with other omics analyses such as transcriptomics and proteomics. We will try to address these issues in the future to further explore the molecular mechanism of CRC LM.
In this study, we found some candidate genes related to the occurrence of CRC and LM through whole-exome sequencing of relevant tissues in CRC patients with LM. These findings are expected to provide us with new markers and therapeutic targets for CRC patients with LM.
Funding: This work was supported by Project of Nature Science Foundation of China (81672348, 81702048), National Science Foundation of Jiangsu Province of China (BK20191172), Program of Postgraduate Research Innovation in University of Jiangsu Province (KYCX19_1993), Project of Key laboratory of Clinical pharmacy of Jiangsu Province of China (XZSYSKF2020027), Project of Science and Technology Development Plan of Suzhou City of China (SYS2019007), Project of Gusu Medical Key Talent of Suzhou City of China (GSWS2020005), Project of Medical Science and Technology Development Foundation of Suzhou City of China (SYS201734), The Science and Technology Project Foundation of Suzhou (SS201852, SS202093), the Science and Education for Health Foundation of Suzhou for Youth (KJXW2019074) and Project of Medical Specialty and Community of Baoshan District of Shanghai City of China (BSZK-2018-A08).
Reporting Checklist: The authors have completed the MDAR checklist. Available at http://dx.doi.org/10.21037/jgo-21-9
Data Sharing Statement: Available at http://dx.doi.org/10.21037/jgo-21-9
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at http://dx.doi.org/10.21037/jgo-21-9). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki (as revised in 2013) and was approved by the Ethics Committee of the First Affiliated Hospital of Soochow University (No. 2021-023). Informed written consent was provided by all patients before inclusion in this study.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
1. Weitz J, Koch M, Debus J, et al. Colorectal cancer. Lancet 2005;365:153-65.
2. Engstrand J, Nilsson H, Strömberg C, et al. Colorectal cancer liver metastases - a population-based study on incidence, management and survival. BMC Cancer 2018;18:78.
3. Bhullar DS, Barriuso J, Mullamitha S, et al. Biomarker concordance between primary colorectal cancer and its metastases. EBioMedicine 2019;40:363-74.
4. van Huizen NA, Coebergh van den Braak RRJ, Doukas M, et al. Up-regulation of collagen proteins in colorectal liver metastasis compared with normal liver tissue. J Biol Chem 2019;294:281-9.
5. Alves JM, Prado-López S, Cameselle-Teijeiro JM, et al. Rapid evolution and biogeographic spread in a colorectal cancer. Nat Commun 2019;10:5139.
6. Haratani K, Hayashi H, Tanaka T, et al. Tumor immune microenvironment and nivolumab efficacy in EGFR mutation-positive non-small-cell lung cancer based on T790M status after disease progression during EGFR-TKI treatment. Ann Oncol 2017;28:1532-9.
7. Lusk CM, Watza D, Dyson G, et al. Profiling the Mutational Landscape in Known Driver Genes and Novel Genes in African American Non-Small Cell Lung Cancer Patients. Clin Cancer Res 2019;25:4300-8.
8. Boutin AT, Liao WT, Wang M, et al. Oncogenic Kras drives invasion and maintains metastases in colorectal cancer. Genes Dev 2017;31:370-82.
9. Yang J, Lin Y, Huang Y, et al. Genome landscapes of rectal cancer before and after preoperative chemoradiotherapy. Theranostics 2019;9:6856-66.
10. Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 2010;26:589-95.
11. Cibulskis K, Lawrence MS, Carter SL, et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 2013;31:213-9.
12. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27-30.
13. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000;25:25-9.
14. Weaver B, Wuensch KL. SPSS and SAS programs for comparing Pearson correlations and OLS regression coefficients. Behav Res Methods 2013;45:880-95.
15. Dees ND, Zhang Q, Kandoth C, et al. MuSiC: identifying mutational significance in cancer genomes. Genome Res 2012;22:1589-98.
16. Birkbak NJ, McGranahan N. Cancer Genome Evolutionary Trajectories in Metastasis. Cancer Cell 2020;37:8-19.
17. Berglund E, Maaskola J, Schultz N, et al. Spatial maps of prostate cancer transcriptomes reveal an unexplored landscape of heterogeneity. Nat Commun 2018;9:2419.
18. Nebbioso A, Tambaro FP, Dell'Aversana C, et al. Cancer epigenetics: Moving forward. PLoS Genet 2018;14:e1007362.
19. Álvarez-Chaver P, De Chiara L, Martínez-Zorzano VS. Proteomic Profiling for Colorectal Cancer Biomarker Discovery. Methods Mol Biol 2018;1765:241-69.
20. Kumar A, Misra BB. Challenges and Opportunities in Cancer Metabolomics. Proteomics 2019;19:e1900042.
21. Vitale I, Manic G, Coussens LM, et al. Macrophages and Metabolism in the Tumor Microenvironment. Cell Metab 2019;30:36-50.
22. Neidhardt J, Fehr S, Kutsche M, et al. Tenascin-N: characterization of a novel member of the tenascin family that mediates neurite repulsion from hippocampal explants. Mol Cell Neurosci 2003;23:193-209.
23. Chiovaro F, Martina E, Bottos A, et al. Transcriptional regulation of tenascin-W by TGF-beta signaling in the bone metastatic niche of breast cancer cells. Int J Cancer 2015;137:1842-54.
24. Russell MW, Raeker MO, Korytkowski KA, et al. Identification, tissue expression and chromosomal localization of human Obscurin-MLCK, a member of the titin and Dbl families of myosin light chain kinases. Gene 2002;282:237-46.
25. Carlsson L, Yu JG, Thornell LE. New aspects of obscurin in human striated muscles. Histochem Cell Biol 2008;130:91-103.
(English Language Editor: J. Jones)