Shaheen Nabi

Research Engineer (Independent) · Bengaluru, India

Hi! 👋 I'm Shaheen, an independent Research Engineer based in Bengaluru, India. My research is centered on reasoning and thinking models — understanding what makes a model reason well, and how to make it reason better at lower cost.

I approach this from the ground up, studying how decisions made during pre-training and mid-training (data composition, continual pre-training, and architectural choices like hybrid MoE and state space models) shape the reasoning capabilities that emerge downstream. I'm interested in how these foundation-level choices propagate through to post-training, and what they enable or constrain.

The core of my work lives in post-training: RLVR, SFT, preference optimization pipelines, and test-time compute — how to get a model to think better, for less. I actively follow and engage with new research in reasoning: small reasoning architectures, hierarchical reasoning, and emerging approaches to efficient thinking. The longer-term question I care about is how these directions can push the next generation of frontier reasoning models — cracking harder benchmarks while becoming more compute-efficient at inference.

I contribute to open source through code and writing, and am working toward contributing to large-scale LLM infrastructure and post-training frameworks.

Research Interests

Outputs

Open-source repositories, writing, and models — in lieu of publications.

2025 · Writing · Substack
Reinforcement Learning Foundations
A technical introduction to the mathematical foundations of RL — MDPs, Bellman equations, policy gradients, and value functions — written for researchers entering the field from a supervised learning background.
2025 · Open-Source · GitHub
Reinforcement Learning: Zero to Hero
A maintained RL repository — MDPs through PPO and DDPG — with clean implementations and math annotations. Written for practitioners building the intuition needed for modern post-training research.
2025 · Implementation · Architecture
Attention Variants from Scratch — GQA & MLA
PyTorch implementations of Grouped-Query Attention and Multi-Head Latent Attention. GQA shares KV heads across query groups to reduce cache memory. MLA (as in DeepSeek) compresses KV into a low-rank latent space and up-projects at inference, decoupling cache footprint from model width.
2024 · Fine-Tuning · Open Source
Instruction Fine-Tuning of LLaMA 3.2 3B — Kannada
LoRA + 4-bit QLoRA fine-tune of Meta LLaMA 3.2-3B Instruct on 390K Kannada instruction pairs. Merged to FP16 for deployment. Released on Hugging Face Hub; used by hundreds of developers monthly for regional NLP applications.
2025 · Applied ML · Dataset
LeafLogic — Agricultural Object Detection & Multi-Agent Pipeline
YOLOv5 detection pipeline (NVIDIA A100) for 100+ crop species. Multi-agent framework for autonomous post-detection research and reporting. Deployed on AWS ECR/EC2. Open-sourced a 25K annotated image dataset on Hugging Face, used by researchers globally.
Open source. I contribute through code and writing, am working toward contributions to large-scale LLM infrastructure frameworks and post-training pipelines, and document my journey so it can help others along the way.

Experience

2025 — Present
Research Engineer, Independent
GitHub (open source) · Bengaluru, India
Post-training pipelines, reinforcement learning for LLMs, reasoning model research, and inference-time compute. Implementing training objectives and architectures from frontier research. Building toward contributions to large-scale LLM infrastructure.
Jan – Mar 2025
Data Science Intern
iNeuron.ai · Bengaluru, India
Object detection pipelines on NVIDIA A100. Open-sourced a 25K annotated image dataset. Designed cloud inference infrastructure using AWS ECR and EC2 with Jenkins.
2022
Founder
Lasso Pacific Pvt Ltd · Anantnag, J&K, India
AI and robotics edtech platform for rural learners. Reached 2M+ annual visitors organically. Closed after one year; proceeds reinvested into free tech literacy programs.

Education

2025 – 2028
Bachelor of Arts
Indira Gandhi National Open University
2021 – 2022
Full Stack Data Science
iNeuron Intelligence
2021 – 2023
High School Diploma — Mathematics & Computer Science
J&K Board of School Education