SAIGEgds - Scalable Implementation of Generalized mixed models using GDS files in Phenome-Wide Association Studies
A scalable implementation of generalized mixed models, written in highly optimized C++ and integrated with Genomic Data Structure (GDS) files. It is designed for single-variant tests and set-based aggregate tests in large-scale Phenome-Wide Association Studies (PheWAS) with millions of variants and samples, controlling for sample structure and case-control imbalance. The implementation is based on the SAIGE R package (v0.45; Zhou et al. 2018 and Zhou et al. 2020) and extends it with the state-of-the-art ACAT-O set-based tests. Benchmarks show that SAIGEgds is significantly faster than the SAIGE R package.
Last updated 2 days ago
software genetics statisticalmethod genomewideassociation gds gwas mixed-model phewas
6.15 score 7 stars 15 scripts 254 downloads
SCArray - Large-scale single-cell omics data manipulation with GDS files
Provides large-scale single-cell omics data manipulation using Genomic Data Structure (GDS) files. It combines dense and sparse matrices stored in GDS files and the Bioconductor infrastructure framework (SingleCellExperiment and DelayedArray) to provide out-of-memory data storage and large-scale manipulation using the R programming language.
Last updated 2 days ago
infrastructure datarepresentation dataimport singlecell rnaseq
5.32 score 1 stars 1 packages 9 scripts 188 downloads
SCArray.sat - Large-scale single-cell RNA-seq data analysis using GDS files and Seurat
Extends the Seurat classes and functions to support Genomic Data Structure (GDS) files as a DelayedArray backend for data representation. It relies on the GDS-based DelayedMatrix implementation in the SCArray package to represent single-cell RNA-seq data. The common optimized algorithms that leverage the GDS-based, single-cell-specific DelayedMatrix (SC_GDSMatrix) are implemented in the SCArray package. SCArray.sat introduces a new SCArrayAssay class (derived from the Seurat Assay), which wraps the raw counts, normalized expressions, and scaled data matrices, each based on the GDS-specific DelayedMatrix. It is designed to integrate seamlessly with the Seurat package and to support common data analyses in SeuratObject-based workflows. Compared with Seurat, SCArray.sat significantly reduces memory usage without downsampling and can be applied to very large datasets.
Last updated 24 days ago
datarepresentation dataimport singlecell rnaseq
4.48 score 1 stars 3 scripts 127 downloads