Fine-tuning LLMs for DNA Sequence Prediction
How researchers are utilizing transformer architectures to identify non-coding RNA motifs and predict splice sites with 99% accuracy.
I specialize in bridging the gap between complex biological data and actionable insights. My work focuses on genomic sequencing and phylogenetics. Currently, I am exploring the intersection of machine learning and structural biology.
Automated variant calling pipeline for large-scale WGS data using Nextflow and GATK.
A Streamlit dashboard for 3D visualization of protein-ligand docking results.
Browser-based bioinformatics tools powered by WebAssembly. No installation is required; all processing happens locally in your browser.
View and analyze alignment files using samtools
Process sequences with seqtk
Analyze variant files with bcftools
Genome arithmetic with bedtools
Quality control with fastp
Align sequences with minimap2
Visualize structural variants and alignments
Download these sample files to test the bioinformatics tools below.
SRR5924196_1.fastq.gz (77 MB)
Use this with Seqtk or Fastp to test sequence manipulation and quality control.
SRR5924196_1.fastq.gz.subread.BAM (183 MB)
Use this with the SAM/BAM Viewer to test alignment viewing and statistics generation.
SRR5924196_1.fastq.gz.subread.BAM.indel.vcf (244 KB)
Use this with the VCF Analyzer to test variant viewing and querying.
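If you want to sanity-check the sample FASTQ before loading it into the browser tools, a few lines of Python are enough. This is only a minimal sketch: the function names are hypothetical, and it assumes plain four-line FASTQ records with Phred+33 quality encoding, which may not hold for every file.

```python
import gzip


def fastq_stats(handle):
    """Summarize a FASTQ text stream.

    Assumes four-line records and Phred+33 qualities (an assumption,
    not something verified against the sample file).
    """
    reads = 0
    total_bases = 0
    total_qual = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        reads += 1
        total_bases += len(seq)
        # Phred+33: quality score = ASCII code of the character minus 33
        total_qual += sum(ord(c) - 33 for c in qual)
    return {
        "reads": reads,
        "mean_length": total_bases / reads if reads else 0.0,
        "mean_quality": total_qual / total_bases if total_bases else 0.0,
    }


def fastq_stats_from_gz(path):
    """Open a gzip-compressed FASTQ file in text mode and summarize it."""
    with gzip.open(path, "rt") as handle:
        return fastq_stats(handle)
```

Calling fastq_stats_from_gz("SRR5924196_1.fastq.gz") on the downloaded file returns the read count, mean read length, and mean base quality, which you can compare against what Fastp reports.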
Exploring the intersection of high-performance computing, molecular biology, and artificial intelligence.
An analysis of how dynamic conformational changes are being modeled to accelerate drug discovery for intrinsically disordered proteins.
Optimizing AWS Batch and Kubernetes configurations to handle petabyte-scale metagenomics datasets without breaking the budget.
Interested in collaboration or have questions about my work?
ashifraza4142@gmail.com