Omics Nexus Summer School 2026

From Biological Data to
Bioinformatics Mastery

Master Bioinformatics, Transcriptomics, Protein Design & AI
for Modern Biological Research

6 Weeks | July - August

Hands-on Training

About the Program

The Omics Nexus Summer School 2026 is an intensive training program designed for students, researchers, and professionals who want to build expertise in computational biology, transcriptomics, machine learning, and AI-driven biological discovery.

Participants will work with real-world biological datasets and industry-standard tools while developing practical skills that are directly applicable to research, graduate studies, and careers in bioinformatics.

MODULE 1

Foundations for Bioinformatics

From Raw Data to Biological Insight

Build a strong computational foundation by learning how to navigate Linux environments, write scripts in Python and R, and handle real biological datasets. Develop the core skills researchers rely on to process, analyze, and visualize biological data efficiently and reproducibly.

Topics Covered

•Introduction to the Linux operating system and command-line navigation
•Writing and executing Bash scripts for automation
•Python programming fundamentals for biological data
•Reading, parsing, and managing biological file formats (FASTA, FASTQ, BED, GFF/GTF, VCF, SAM/BAM)
•Combining Bash and Python for data processing pipelines
•R programming fundamentals and data structures
•Data wrangling and exploratory data analysis using the tidyverse
•Data visualization and interpretation using ggplot2
•Best practices for reproducible computational workflows
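
As a small taste of the file-format topics above, here is a minimal sketch of FASTA parsing in plain Python. The function names `read_fasta` and `gc_content` and the toy records are illustrative assumptions rather than course material; in practice a dedicated library such as Biopython would usually handle this.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    # Emit the final record once the stream is exhausted.
    if header is not None:
        yield header, "".join(chunks)


def gc_content(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy data invented for this example (not a real dataset).
EXAMPLE_FASTA = """>seq1 toy record
ATGCGC
GGAT
>seq2 toy record
ATATAT
"""

records = list(read_fasta(StringIO(EXAMPLE_FASTA)))
for name, seq in records:
    print(f"{name}\t{len(seq)}\t{gc_content(seq):.2f}")
```

The parser joins multi-line sequences into one string per record, which is the detail that most often trips up first attempts at reading FASTA by hand.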

Hands-On Projects

Independent Linux and Python-based processing of real biological data files
End-to-end data handling and visualization workflow spanning Bash, Python, and R
Research-style presentation of computational workflows and findings

Outcomes

›Work confidently in Linux-based environments
›Write basic scripts in Python and R
›Handle and parse common biological file formats
›Create publication-quality visualizations to identify and interpret meaningful patterns in biological data

MODULE 2

Advanced Transcriptomics & AI

Bulk & Single-Cell RNA-Seq • Machine Learning

Master the analysis of modern transcriptomic datasets by combining bulk and single-cell RNA sequencing with machine learning approaches. Learn how researchers uncover cellular heterogeneity, identify biomarkers, and build predictive models from gene expression data.

Topics Covered

•Introduction to Bulk RNA-seq and Single-Cell RNA-seq
•Understanding Cell Ranger outputs and transcriptomic data structures
•End-to-end scRNA-seq analysis using Scanpy
•Quality control, normalization, and feature selection
•Cell clustering, dimensionality reduction, and cell-type annotation
•Marker gene discovery and biological interpretation
•Fundamentals of machine learning for biological data
•Gene expression preprocessing and feature selection
•Building disease-classification models from transcriptomic data
•Model evaluation, biomarker identification, and result interpretation
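
To make the normalization step above concrete, here is a minimal sketch of library-size normalization followed by a log transform, written in plain Python on a toy count matrix. It mirrors the idea behind Scanpy's `sc.pp.normalize_total` and `sc.pp.log1p`, but it is not Scanpy's implementation; the helper names, the `target_sum` default of 10,000, and the counts themselves are assumptions made for illustration.

```python
import math


def normalize_total(counts, target_sum=1e4):
    """Scale each cell (row) so that its counts sum to target_sum.

    Cells with zero total counts are left as all zeros.
    """
    normalized = []
    for row in counts:
        total = sum(row)
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([value * scale for value in row])
    return normalized


def log1p_transform(matrix):
    """Apply log(1 + x) element-wise to compress the dynamic range."""
    return [[math.log1p(value) for value in row] for row in matrix]


# Toy matrix: rows are cells, columns are genes (invented values).
toy_counts = [
    [10, 0, 90],  # deeply sequenced cell
    [1, 1, 2],    # shallow cell
    [0, 0, 0],    # empty droplet
]

norm = normalize_total(toy_counts)
logged = log1p_transform(norm)

for cell in logged:
    print([round(value, 3) for value in cell])
```

After scaling, the deep and shallow cells are comparable because each sums to the same total, so differences reflect relative expression rather than sequencing depth.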

Hands-On Projects

Independent analysis of a real-world scRNA-seq dataset
Development of a disease-classification model using bulk RNA-seq data
Research-style presentation of findings and biological insights

Outcomes

›Analyze single-cell transcriptomic datasets using industry-standard tools
›Identify biologically relevant cellular populations
›Build machine learning models for disease prediction
›Translate complex gene expression data into actionable biological insights

MODULE 3

Teaching Machines to Speak Protein

Protein Language Models • AI-Driven Design

A deep dive into protein language models (PLMs), AI-driven protein design, and hands-on PLM training, from foundational theory to a fully trained model, all built around a real cancer drug target: the EGFR kinase.

Topics Covered

•Amino acids as alphabet tokens; the sequence-to-structure-to-function mapping
•How protein language models (e.g., ESM-2) learn via masked residue prediction
•Embeddings encoding secondary structure, evolutionary signals, and functional sites
•Forward problem (structure prediction) vs. inverse problem (sequence design)
•Design strategies: Fixed-backbone (ProteinMPNN), De novo generation (RFdiffusion), Hallucination
•ML architectures: Transformer self-attention, GNN message passing, Diffusion denoising
•Data retrieval from open-source protein databases and sequence preprocessing
•Model selection, configuration, training, and fine-tuning in PyTorch
•Evaluation metrics (pLDDT, iPAE, Perplexity, Log-Likelihood, RMSD)
•Self-consistency validation (designed vs. natural sequences)
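
Two of the evaluation metrics listed above, log-likelihood and perplexity, are closely related: perplexity is the exponential of the negative mean per-residue log-likelihood. The sketch below shows that relationship in plain Python. The function names and the probability values are hypothetical; in a real workflow the probabilities would come from a trained model's output for the true residue at each position.

```python
import math


def mean_log_likelihood(token_probs):
    """Average natural-log probability assigned to the true residues."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    return sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs):
    """Perplexity = exp(-mean log-likelihood); lower is better."""
    return math.exp(-mean_log_likelihood(token_probs))


# A model guessing uniformly over the 20 standard amino acids
# assigns probability 1/20 to every residue.
uniform_probs = [1 / 20] * 8

# Invented probabilities for a model that is more confident.
confident_probs = [0.9, 0.8, 0.95, 0.7]

print(f"uniform:   {perplexity(uniform_probs):.2f}")
print(f"confident: {perplexity(confident_probs):.2f}")
```

A uniform guess over 20 amino acids gives a perplexity of 20, which makes a handy baseline: a useful model should score well below it on held-out sequences.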

Hands-On Projects

Fetch EGFR kinase, run ColabFold for 3D structure prediction, visualize pLDDT
Run ProteinMPNN on backbone, score sequences with ESM-2, validate with RMSD
Engineer sequence dataset from open databases, train & fine-tune a transformer-based PLM, evaluate with perplexity & log-likelihood

Outcomes

Students leave with a complete end-to-end skillset:

›A clear mental model of protein language and AI design
›A notebook producing their first AI-designed protein candidate against EGFR kinase
›A trained PLM generating novel sequences validated by fold self-consistency

Investment & Price Plan

🎁 Save 3,000 PKR / 15 USD with the Early Bird Summer School Bundle

Early Bird Special

Bundle (All 3 Modules)

Local Participants: 12,000 PKR

International Participants: 60 USD

Individual Module

Local Participants: 4,000 PKR

International Participants: 20 USD

Deadline: 30 June 2026

Regular Pricing

Bundle (All 3 Modules)

Local Participants: 15,000 PKR

International Participants: 75 USD

Individual Module

Local Participants: 5,000 PKR

International Participants: 25 USD

Last Date to Apply: 10 July 2026

Secure Your Spot

Select Program Option

Choose Option *

1. Participant Details

Full Name *

Email *

WhatsApp Number *

City *

Country

University / Organization *

Level *

Field *

2. Payment Instructions

Please submit the applicable fee based on your selected program option. After transferring the amount, enter your transaction details in the next section.

Account Title

NADEEM KHAN

Bank Name

Allied Bank

Account Number

12890010058958870016

3. Verify Payment

Payment Method Used *

Sender Account Title *

Transaction ID (TID) *

Amount Paid *

Payment Date & Time *

Note: Screenshot upload is not required. Our team verifies payments directly using the Transaction ID and Sender Title.
