Publications


Co-authored papers are tagged "(co-authored)". Corresponding-, co-corresponding-, first-, and co-first-author papers are not tagged. Please click the link for each paper to see its full list of authors.
2021:
  • SARS‐CoV‐2 biology and variants: anticipation of viral evolution and what needs to be done
    Environmental Microbiology.  [Environ. Microbiol.] [Traditional Chinese Translation] [Simplified Chinese Translation] 
    [SARS-CoV-2 Cytosine Attenuation Tracking]
  • High Prevalence and Mechanism Associated With Extended Spectrum Beta-Lactamase-Positive Phenotype in Laribacter hongkongensis
    Frontiers in Microbiology.  [Front. Microbiol.]
  • RENET2: High-Performance Full-text Gene-Disease Relation Extraction with Iterative Training Data Expansion
    bioRxiv.  [bioRxiv]
  • ECNano: A Cost-Effective Workflow for Target Enrichment Sequencing and Accurate Variant Calling on 4,800 Clinically Significant Genes Using a Single MinION Flowcell
    bioRxiv.  [bioRxiv]
  • SENSV: Detecting Structural Variations with Precise Breakpoints using Low-Depth WGS Data from a Single Oxford Nanopore MinION Flowcell
    bioRxiv.  [bioRxiv]
  • DNA methylation affects pre-mRNA transcriptional initiation and processing in Arabidopsis
    bioRxiv.  [bioRxiv]
  • (co-authored) Multi-tissue integrative analysis of personal epigenomes
    bioRxiv.  [bioRxiv]
  • (co-authored) Distinct disease severity between children and older adults with COVID-19: Impacts of ACE2 expression, distribution, and lung progenitor cells
    Clinical Infectious Diseases.  [CID]
  • (co-authored) Clinical analysis and pluripotent stem cells-based model reveal possible impacts of ACE2 and lung progenitor cells on infants vulnerable to COVID-19
    Theranostics.  [Theranostics]
2020:
  • Exploring the limit of using a deep neural network on pileup data for germline variant calling
    Nature Machine Intelligence.  [Nat. Mach. Intell.] [PDF] [GitHub]
  • CONNET: Accurate Diploid Genome Consensus in de novo Assembly of Nanopore Sequencing Data via Deep Learning
    iScience.  [iScience] [GitHub]
  • Skyhawk: An Artificial Neural Network-based discriminator for reviewing clinically significant genomic variants
    International Journal of Computational Biology and Drug Design.  [IJCBDD] [PDF] [GitHub]
  • MegaPath: sensitive and rapid pathogen detection using metagenomic NGS data
    BMC Genomics.  [BMC Genomics] [SourceForge]
  • MegaPath-Nano: Accurate Compositional Analysis and Drug-level Antimicrobial Resistance Detection Software for Oxford Nanopore Long-read Metagenomics
    IEEE BIBM 2020.  [PDF] [Conference]
  • ChromSeg: Two-Stage Framework for Overlapping Chromosome Segmentation and Reconstruction
    IEEE BIBM 2020.  [PDF] [Conference]
  • Tracking cytosine depletion in SARS-CoV-2
    bioRxiv.  [bioRxiv] [Website]
  • (co-authored) High-quality bacterial genomes of a partial-nitritation/anammox system by an iterative hybrid assembly method
    Microbiome.  [Microbiome]
  • (co-authored) Identification of Cooperative Gene Regulation Among Transcription Factors, LncRNAs, and MicroRNAs in Diabetic Nephropathy Progression
    Frontiers in Genetics.  [Front. Genet.]
  • (co-authored) Translocator: local realignment and global remapping enabling accurate translocation detection using single-molecule sequencing long reads
    ACM-BCB 2020.  [PDF] [Conference]
  • (co-authored) MC-Explorer: Analyzing and Visualizing Motif-Cliques on Large Networks
    ICDE 2020.  [PDF] [Demo]
2019:
  • RENET: A Deep Learning Approach for Extracting Gene-Disease Associations from Literature
    RECOMB 2019.  [Springer]
  • Clairvoyante: a multi-task convolutional deep neural network for variant calling in Single Molecule Sequencing
    Nature Communications.  [Nat. Comm.] [GitHub]
2018:
  • Restricted Boltzmann Machine and its Potential to Better Predict Cancer Survival
    Biomed J Sci & Tech Res.  [PDF]
  • (co-authored) Transcriptome Analysis of Acute Phase Liver Graft Injury in Liver Transplantation
    Biomedicines.  [PubMed]
  • (co-authored) AC-DIAMOND v1: Accelerating large-scale DNA-protein alignment
    Bioinformatics.  [PubMed] [GitHub]
  • (co-authored) MegaPath: Low-Similarity Pathogen Detection from Metagenomic NGS Data (Extended Abstract)
    ICCABS 2018.  [IEEE]
2017:
  • First Draft Genome Sequence of the Pathogenic Fungus Lomentospora prolificans (formerly Scedosporium prolificans)
    G3: Genes, Genomes, Genetics.  [PubMed]
  • Serine peptidase inhibitor Kazal type 1 (SPINK1) as novel downstream effector of the cadherin-17/β-catenin axis in hepatocellular carcinoma
    Cellular Oncology.  [PubMed]
  • LRSim: a Linked Reads Simulator generating insights for better genome partitioning
    Computational and Structural Biotechnology Journal.  [PubMed] [GitHub]
  • 16GT: a fast and sensitive variant caller using a 16-genotype probabilistic model
    GigaScience.  [PubMed] [GitHub]
  • (co-authored) MegaGTA: a sensitive and accurate metagenomic gene-targeted assembler using iterative de Bruijn graphs
    BMC Bioinformatics.  [PubMed]
2016:
  • MEGAHIT v1.0: A fast and scalable metagenome assembler driven by advanced methodologies and community practices
    Methods.  [PubMed]
  • BASE: a practical de novo assembler for large genomes using long NGS reads
    BMC Genomics.  [PubMed]
  • (co-authored) AC-DIAMOND: Accelerating Protein Alignment via Better SIMD Parallelization and Space-Efficient Indexing
    IWBBIO.  [Springer]
2015:
  • database.bio: a web application for interpreting human variations
    Bioinformatics.  [PubMed]
  • De novo assembly of a haplotype-resolved human genome
    Nature Biotechnology.  [PubMed]
  • MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph
    Bioinformatics.  [PubMed] [GitHub]
  • MICA: A fast short-read aligner that takes full advantage of Intel Many Integrated Core Architecture (MIC)
    BMC Bioinformatics.  [PubMed] [SourceForge] [GitHub]
  • (co-authored) Genome-Wide Mapping of Structural Variations Reveals a Copy Number Variant That Determines Reproductive Morphology in Cucumber
    Plant Cell.  [PubMed]
2014:
  • SOAPdenovo-Trans: De novo transcriptome assembly with short RNA-Seq reads
    Bioinformatics.  [PubMed] [SourceForge] [GitHub]
  • BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU
    PeerJ.  [PubMed] [SourceForge]
  • Exome sequencing of tumor cell lines: Optimizing for cancer variants
    Cancer Research.  [AACR]
  • GPU-Accelerated BWT Construction for Large Collection of Short Reads
    arXiv.  [PDF]
2013:
  • SOAP3-dp: Fast, Accurate and Sensitive GPU-based Short Read Aligner
    PLoS ONE.  [PubMed] [GitHub]
2012:
  • SOAPdenovo2: An empirically improved memory-efficient short-read de novo assembler
    GigaScience.  [PubMed] [SourceForge] [GitHub]
  • COPE: an accurate k-mer-based pair-end reads connection tool to facilitate genome assembly
    Bioinformatics.  [PubMed] [SourceForge]
  • The oyster genome reveals stress adaptation and complexity of shell formation
    Nature.  [PubMed]
  • (co-authored) Single-base resolution maps of cultivated and wild rice methylomes and regulatory roles of DNA methylation in plant gene expression
    BMC Genomics.  [PubMed]
  • (co-authored) Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species
    GigaScience.  [PubMed]
  • (co-authored) An integrated map of genetic variation from 1,092 human genomes
    Nature.  [PubMed]
2011:
  • Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly
    Nature Biotechnology.  [PubMed]
  • (co-authored) Mapping copy number variation by population-scale genome sequencing
    Nature.  [PubMed]
  • (co-authored) Assemblathon 1: A competitive assessment of de novo short read assembly methods
    Genome Research.  [PubMed]
2010:
  • Building the sequence map of the human pan-genome
    Nature Biotechnology.  [PubMed]
  • (co-authored) Sequencing of 50 Human Exomes Reveals Adaptation to High Altitude
    Science.  [PubMed]
  • (co-authored) The DNA Methylome of Human Peripheral Blood Mononuclear Cells
    PLoS Biology.  [PubMed]
  • (co-authored) International network of cancer genome projects
    Nature.  [PubMed]
  • (co-authored) A map of human genome variation from population-scale sequencing
    Nature.  [PubMed]