Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Page Not Found

Page not found. Your pixels are in another canvas.

Jupyter notebook markdown generator

Posts

Future Blog Post

less than 1 minute read

Published: January 01, 2199

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published: August 14, 2015

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published: August 14, 2014

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published: August 14, 2013

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published: August 14, 2012

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Portfolio item number 1

Short description of portfolio item number 1

Portfolio item number 2

Short description of portfolio item number 2

Publications

Text mining of gene–phenotype associations reveals new phenotypic profiles of autism-associated genes

Published: July 21, 2021

Given the abundance of autism-related literature, we were motivated to develop Autism_genepheno, a text mining pipeline that uses natural language processing methods to identify sentence-level mentions of autism-associated genes and phenotypes in the literature.

Recommended citation: Li, S., Guo, Z., Ioffe, J.B. et al. Text mining of gene–phenotype associations reveals new phenotypic profiles of autism-associated genes. Sci Rep 11, 15269 (2021). http://oliiverhu.github.io/files/sr2021.pdf

An ensemble deep learning framework to refine large deletions in linked-reads

Published: December 09, 2021

In this work, we propose AquilaDeepFilter to filter large deletion SVs from Aquila and Aquila_stLFR. AquilaDeepFilter relies on a deep learning ensemble approach by integrating several state-of-the-art CNN backbones.

Recommended citation: Y. Hu, S. V. Mangal, L. Zhang, X. Zhou. An ensemble deep learning framework to refine large deletions in linked-reads. The IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2021) http://oliiverhu.github.io/files/bibm2021.pdf

Automated filtering of genome-wide large deletions through an ensemble deep learning framework

Published: October 01, 2022

After extending the algorithm to short-read datasets, we tested the performance of AquilaDeepFilter on all five linked-read and short-read libraries sequenced from the well-studied NA24385 sample, validated against the Genome in a Bottle benchmark. To demonstrate the filtering ability of AquilaDeepFilter, we used the SV calls from three upstream SV detection tools (Aquila, Aquila_stLFR, and Delly) as the baseline.

Recommended citation: Yunfei Hu, Sanidhya Mangal, Lu Zhang, Xin Zhou, Automated filtering of genome-wide large deletions through an ensemble deep learning framework, Methods, Volume 206, 2022, Pages 77-86. http://oliiverhu.github.io/files/methods2022.pdf

Haplotyping-Assisted Diploid Assembly and Variant Detection with Linked Reads

Published: November 07, 2022

Motivated by current limitations in generating high-quality diploid assemblies and detecting variants, a new suite of software tools, Aquila, was developed to fully take advantage of linked-read sequencing technology. The overarching goal of Aquila is to exploit the strengths of linked-read technology including long-range connectivity and inherent phasing of variants for reference-assisted local de novo assembly at the whole-genome scale.

Recommended citation: Hu, Y., Yang, C., Zhang, L., Zhou, X. (2023). Haplotyping-Assisted Diploid Assembly and Variant Detection with Linked Reads. In: Peters, B.A., Drmanac, R. (eds) Haplotyping. Methods in Molecular Biology, vol 2590. http://oliiverhu.github.io/files/mimb2022.pdf

ADEPT: Autoencoder with Differentially Expressed Genes and Imputation for a Robust Spatial Transcriptomics Clustering

Published: March 01, 2023

To harness both the spatial context and the transcriptional profiles in spatial transcriptomics (ST) data, we develop ADEPT, a novel graph-based multi-stage framework for robust clustering. To control and stabilize data quality, ADEPT relies on selection of differentially expressed genes (DEGs) and imputation of the multiple DEG-based matrices for the initial and final clustering of a graph autoencoder backbone that minimizes the variance of clustering results.

Recommended citation: Y. Hu, Y. Zhao, C. T. Schunk, Y. Ma, T. Derr*, X. M. Zhou*. ADEPT: autoencoder with differentially expressed genes and imputation for a robust spatial transcriptomics clustering. (Recomb-seq 2023). http://oliiverhu.github.io/files/recombseq2023.pdf

MaskGraphene: Advancing joint embedding, clustering, and batch correction for spatial transcriptomics using graph-based self-supervised learning

Published: March 15, 2024

We introduce MaskGraphene, a method for better aligning and integrating different ST slices using both self-supervised and contrastive learning. MaskGraphene learns joint embeddings that capture the geometric information efficiently. It further facilitates spatially aware data integration and the simultaneous identification of shared and unique cell/domain types across different slices.

Recommended citation: Hu, Yunfei, et al. "MaskGraphene: Advancing joint embedding, clustering, and batch correction for spatial transcriptomics using graph-based self-supervised learning." bioRxiv (2024): 2024-02. https://www.biorxiv.org/content/10.1101/2024.02.21.581387v1.abstract

CNVeil enables accurate and robust tumor subclone identification and copy number estimation from single-cell DNA sequencing data.

Published: March 15, 2024

To address these challenges, we introduce CNVeil, a robust quantitative algorithm designed to accurately reveal CNV profiles while overcoming the inherent noise and bias in scDNA-seq data. CNVeil incorporates a unique bias correction method using normal cell profiles identified by a PCA-based Gini coefficient, effectively mitigating sequencing bias.

Recommended citation: Yuan, W., Luo, C., Hu, Y., Zhang, L., Wen, Z., Liu, Y. H., ... & Zhou, X. M. (2024). CNVeil enables accurate and robust tumor subclone identification and copy number estimation from single-cell DNA sequencing data. bioRxiv, 2024-02. https://www.biorxiv.org/content/10.1101/2024.02.21.581409.abstract

Benchmarking clustering, alignment, and integration methods for spatial transcriptomics

Published: July 15, 2024

Spatial transcriptomics (ST) is advancing our understanding of complex tissues and organisms. However, building a robust clustering algorithm to define spatially coherent regions in a single tissue slice and aligning or integrating multiple tissue slices originating from diverse sources for essential downstream analyses remains challenging. Numerous clustering, alignment, and integration methods have been specifically designed for ST data by leveraging its spatial information. The absence of comprehensive benchmark studies complicates the selection of methods and future method development. In this study, we systematically benchmark a variety of state-of-the-art algorithms with a wide range of real and simulated datasets of varying sizes, technologies, species, and complexity. We analyze the strengths and weaknesses of each method using diverse quantitative and qualitative metrics and analyses, including eight metrics for spatial clustering accuracy and contiguity, uniform manifold approximation and projection visualization, layer-wise and spot-to-spot alignment accuracy, and 3D reconstruction, which are designed to assess method performance as well as data quality. The code used for evaluation is available on our GitHub. Additionally, we provide online notebook tutorials and documentation to facilitate the reproduction of all benchmarking results and to support the study of new methods and new datasets. Our analyses lead to comprehensive recommendations that cover multiple aspects, helping users to select optimal tools for their specific needs and guide future method development.

Recommended citation: Hu, Yunfei, et al. "Benchmarking clustering, alignment, and integration methods for spatial transcriptomics." Genome Biology 25.1 (2024): 212. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03361-0

Talks

BIBM2021 oral presentation

Published: December 09, 2021

More information here

RECOMB-seq2023 oral presentation

Published: April 27, 2023

Gave a talk on our paper “ADEPT: Autoencoder with Differentially Expressed Genes and Imputation for a Robust Spatial Transcriptomics Clustering”

RECOMB-seq2024 oral presentation

Published: April 27, 2024

Gave a talk on our paper “MaskGraphene: Advancing joint embedding, clustering, and batch correction for spatial transcriptomics using graph-based self-supervised learning”

Yunfei
