Applied scientist and research engineer with 6+ years spanning generative AI, reinforcement learning, causal modeling, and multimodal systems — shipping at scale while developing novel methods when existing ones fall short. Novel contributions include C-GRPO (83.7% zero-shot coordination vs. 71.4% FCP), IRM-NECIL (+22 pp continual learning with zero stored examples), and Port-Hamiltonian NNs with a custom Triton kernel. Production track record: shipped LLM alignment at Informatica (60% self-service triage, 10× triage reduction), led $8M+ programs with up to 35 engineers, and founded an AI function from scratch. M.S. in Data Science at Columbia (2025–2026), available December 2026.

Education

M.S. in Data Science Aug 2025 – Dec 2026
Columbia University — New York, NY
  • Coursework: Probabilistic Machine Learning, Reinforcement Learning, Causal Inference, Advanced Deep Learning & Generative AI, Analysis of Algorithms, Continual Learning & Memory Models
  • Teaching Assistant: Causal Inference (graduate), Applied Risk Analytics, Advanced Analytics, Statistical Data Analysis
B.E. in Control Engineering 2015 – 2019
Anna University — Chennai, India
  • Focus: Control Systems, Model Predictive Control, Dynamical Systems, Signal Processing, Neural Networks & Evolutionary Algorithms

Research Interests

  • Machine Learning
  • World Models
  • Causal Inference
  • Continual Learning
  • Model-Based Reinforcement Learning
  • Deep Learning
  • Probabilistic Modeling
  • Cognitive Science
  • Complexity Science
  • Social Sciences

Technical Skills

Machine Learning & Deep Learning PyTorch, JAX, TensorFlow, Transformers, CNNs, GNNs, VAEs, Normalizing Flows, Diffusion Models, Energy-Based Models, State-Space Models
RL & Alignment RLHF, PPO, GRPO, DPO, ORPO, Multi-Agent RL, Curriculum Learning, Invariant Risk Minimization (IRM)
Generative AI & LLMs Large Language Models, Fine-tuning (LoRA/QLoRA, instruction tuning), RAG, model evaluation & benchmarking, LangGraph, Vision-Language Models (Qwen, MiniCPM, LLaMA)
Interpretability & Causal Inference Explainable AI, Mechanistic Interpretability, Structural Causal Models, Do-Calculus, IPW, G-Computation, CATE, A/B Testing, Causal Representation Learning
Computer Vision & NLP SAM, YOLOv8, U-Net, FPN/RCNN, GANs, PINNs, BioBERT, Named Entity Recognition (NER), Multimodal OCR, Segmentation Transformers
ML at Scale & Infrastructure Distributed Training (DeepSpeed ZeRO-3, FSDP), CUDA/Triton, AWS SageMaker, Docker, FastAPI, Redis, Dask, MLflow, PySpark

Experience

Machine Learning Engineering Lead (Consulting) — Medical AI & Insurance Document Intelligence Dec 2024 – Jul 2025
Stealth Startup — Bengaluru, India
  • Built the company's AI capability from scratch — defined the engineering roadmap, hired and mentored the team, and led the transition from a fully manual document-processing workflow to an AI-driven operation.
  • Adapted Qwen-2.5 VLM for medical document intelligence as a multi-task structured generation problem (entity extraction, classification, grounded summarization): two-stage fine-tuning — domain-adaptive warm-up followed by supervised LoRA on 50K+ annotated records via FSDP (bf16, Flash Attention). INT4-quantized serving achieved a 3× cost reduction; benchmarked against LLaMA-3, SmolVLM, MiniCPM, and commercial OCR APIs.
  • Designed long-context RAG pipelines — semantic chunking, cross-encoder re-ranking, query decomposition — to extract medical chronologies, drug side-effect patterns, and injury causation chains for legal case analysis and clinical decision-making.
  • Evaluated IBM WatsonX and Discovery against the in-house VLM stack on factual grounding, retrieval recall, and cost of ownership; the analysis shaped the company's long-term AI platform strategy.
Senior Machine Learning Engineer Jan 2024 – Jul 2025
Informatica — Enterprise Data Governance & AI — Bengaluru, India
  • Reframed enterprise troubleshooting as preference-ranked generation: constructed synthetic preference datasets via RLAIF, trained reward models for Phi-3 and Mistral, then applied DPO and ORPO policy optimization via DeepSpeed ZeRO-3. Deployed GPTQ-quantized checkpoints with continuous batching and speculative decoding, cutting per-query latency by 41%.
  • Architected a knowledge-graph-backed RAG system over 100K+ support tickets, product docs, and wikis that auto-generates structured troubleshooting playbooks on ticket creation, cutting mean resolution time and accelerating new-engineer onboarding.
  • Framed log triage as few-shot sequence classification: fine-tuned sentence-transformer encoders with a supervised contrastive loss to surface golden failure signals from noisy high-volume logs, reducing false-positive triage rate.
  • Designed causal inference experiments — IPW, stratified regression, doubly-robust estimation — to isolate the true effect of AI chatbot deployment on ticket resolution time; counterfactual evidence shaped the automation roadmap.
  • Developed interpretable churn-prediction models via Kalman filter latent SSMs, decomposing account consumption into trend, seasonal, and idiosyncratic components to give Customer Success an explainable early-warning signal of at-risk renewals.
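The preference-optimization step above can be sketched with the DPO objective on sequence log-probs. This is a minimal illustrative form, not the production code; the function name and numbers are made up for the example:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    # DPO on sequence log-probs: push the policy's chosen-vs-rejected log-ratio
    # above the reference model's; beta controls how hard the policy is pulled
    # away from the reference
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# at zero margin the loss is log 2; any positive margin drives it lower
loss = dpo_loss(pi_chosen=-1.0, pi_rejected=-3.0, ref_chosen=-2.0, ref_rejected=-2.0)
```

ORPO follows the same preference-margin idea but folds the penalty into the supervised objective, removing the separate reference model.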
Data Scientist II Nov 2022 – Dec 2023
Captain Fresh — Satellite-Based Aquaculture Intelligence — Bengaluru, India
  • Treated aquaculture pond mapping as weakly supervised semantic segmentation — self-supervised SAM encoder pre-training paired with a U-Net decoder fine-tuned on Sentinel-2 imagery — reaching F1 of 0.92 across 1,200+ hectares, with Dask-powered pipelines for near-real-time water quality monitoring.
  • Built a time-series 3D Transformer fusing Sentinel-1 SAR with Sentinel-2 optical bands via cross-modal attention and temporal positional encodings, framing cloud removal as conditional masked reconstruction and improving cloud-free analysis coverage by 20.5%.
  • Ran randomized causal field experiments (propensity matching, CATE estimation) to identify the true drivers of shrimp growth, translating findings into optimized harvest cycles. Deployed on AWS SageMaker with auto-scaling for a 37% latency reduction; results presented at OSICON-2023.
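The stratified effect estimation used in the field experiments can be sketched as a per-stratum difference in means. This is a toy illustration under made-up data, not the deployed analysis; `stratified_cate` is a hypothetical name:

```python
import numpy as np

def stratified_cate(y, t, strata):
    # crude CATE estimate: after grouping units into propensity/covariate
    # strata, compare treated vs. control outcome means within each stratum
    effects = {}
    for s in np.unique(strata):
        m = strata == s
        effects[s] = y[m & (t == 1)].mean() - y[m & (t == 0)].mean()
    return effects

# two strata, one treated and one control unit each
y = np.array([2.0, 0.0, 5.0, 1.0])
t = np.array([1, 0, 1, 0])
cate = stratified_cate(y, t, np.array([0, 0, 1, 1]))
```

Averaging the per-stratum effects weighted by stratum size recovers an overall ATE estimate.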
Decision Scientist · Apprentice Leader Jan 2019 – Nov 2022
Mu Sigma — Decision Sciences and Machine Learning at Scale — Bengaluru, India
  • Led an $8M supply chain transformation for a Saudi petrochemical firm — demand sensing, inventory optimization (MIP), distribution routing (graph algorithms) — managing 35 engineers and analysts; won a $500K follow-on contract.
  • Designed a hybrid physics + ML forecasting system for a Japanese renewable energy client: solar geometry ODEs model atmospheric trends; sliding-window CatBoost corrects residuals on meteorological data. Achieved 4.5% nMAPE, enabling $6M/year in reduced bidding risk.
  • Built a multi-task defect detection system (shared FPN + RCNN backbone, task-specific heads, focal loss) achieving mAP of 0.93; extended with Pix2Pix GANs for synthetic augmentation and PINNs for process-parameter optimization.
  • Fine-tuned BioBERT with domain-adaptive pre-training and a CRF output layer for pharmaceutical NER, achieving F1 of 0.94 and accelerating bio-pharma R&D literature reviews.
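The hybrid physics + ML pattern above (physics model for the trend, learned model for the residual) can be sketched as follows. The clear-sky curve and linear residual fit are deliberate toy stand-ins for the solar-geometry ODEs and the sliding-window CatBoost corrector:

```python
import numpy as np

def physics_forecast(hour):
    # toy clear-sky signal from solar geometry (stand-in for the ODE trend model)
    return np.maximum(0.0, np.sin(np.pi * (hour - 6.0) / 12.0))

hours = np.arange(6.0, 18.0, 0.5)
# observations include attenuation the physics model misses
observed = 0.8 * physics_forecast(hours) + 0.05

# residual correction: fit the physics-vs-observation gap on simple features
# (a linear least-squares stand-in for the gradient-boosted corrector)
X = np.column_stack([physics_forecast(hours), np.ones_like(hours)])
coef, *_ = np.linalg.lstsq(X, observed - physics_forecast(hours), rcond=None)
corrected = physics_forecast(hours) + X @ coef
```

The design choice is the same as in the production system: the physics model carries the extrapolatable structure, and the learned component only has to model the (much easier) residual.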

Research & Projects

Zero-Shot Coordination in Multi-Agent Systems via C-GRPO 2025–2026 · JAX · PyTorch · Dec-POMDPs · RLVR
  • Developed C-GRPO, a critic-free policy optimization algorithm for Dec-POMDPs using group-relative advantage normalization as an implicit counterfactual baseline — an instance of RLVR where verifiable coordination outcomes serve as reward signals, enabling stable cooperative credit assignment under sparse rewards without a value function.
  • Architecture: residual CNN encoder → multi-head spatial attention → global–local fusion → GRU recurrence. Training parallelized across 64 JAX workers (vmap/pmap) for 12× throughput.
  • Achieved 83.7% zero-shot coordination rate on Overcooked-V2 vs. 71.4% (FCP) and 69.1% (TrajeDi), with 28% lower variance across unseen agent pairings.
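The group-relative advantage at the heart of C-GRPO can be sketched in a few lines; this is the generic GRPO-style normalization, not the project's JAX implementation:

```python
import numpy as np

def group_relative_advantage(rewards, eps=1e-8):
    # normalize each sampled trajectory's reward against its own group,
    # so the group mean serves as an implicit counterfactual baseline
    # in place of a learned critic / value function
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

adv = group_relative_advantage([1.0, 2.0, 3.0])
```

Because the baseline is computed per group of rollouts, credit assignment stays stable even when the verifiable coordination reward is sparse.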
IRM-NECIL: Eliminating Catastrophic Forgetting via Invariant Feature Learning 2025–2026 · PyTorch · IRM · Continual Learning
  • Eliminated the replay buffer entirely by treating catastrophic forgetting as a causal feature-extraction problem: standard backbones entangle class identity Ck with task-irrelevant nuisance Z, causing prototype drift at task boundaries — IRM forces the backbone to extract only Ck, keeping old prototypes stable across all subsequent tasks.
  • IRM-CMD-NSM achieves 74.2% average accuracy on Split-CIFAR-10 vs. 52.0% for DER++ (which keeps a 200-image replay buffer) — a +22 pp gain with zero stored training examples. On CIFAR-100, the simpler NSM baseline beats DER++ by +31 pp.
  • Introduced prototype interference as a new diagnostic metric quantifying how much a prototype's projection shifts when the subspace expands, enabling principled ablation of subspace expansion strategies.
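The invariance constraint above can be sketched with the IRMv1 penalty. For squared loss the dummy-classifier gradient has a closed form, which the toy code below uses; this is an illustration of the generic penalty, not the project code:

```python
import numpy as np

def irmv1_penalty(pred, target):
    # IRMv1 penalty for squared loss: the squared gradient of the environment
    # risk w.r.t. a dummy scale w on the predictor, evaluated at w = 1;
    # it is zero only when the predictor is simultaneously optimal per environment
    g = 2.0 * np.mean(pred * (pred - target))
    return g ** 2

def irm_objective(envs, lam):
    # sum of per-environment risks plus the weighted invariance penalty
    risk = sum(np.mean((p - t) ** 2) for p, t in envs)
    return risk + lam * sum(irmv1_penalty(p, t) for p, t in envs)
```

A backbone driven to zero penalty across task environments has no incentive to encode the nuisance Z, which is what keeps the class prototypes from drifting.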
Hierarchical Causal SSM for Counterfactual Simulation 2025–2026 · PyTorch · Normalizing Flows
  • Built a hierarchical causal recurrent SSM that decomposes the latent space into identity, baseline dynamics, and intervention response channels, with hard-gated causal isolation and sequential g-computation conditioning to prevent post-treatment leakage.
  • Posterior inference uses conditional normalizing flows for non-Gaussian regimes; achieved PEHE score of 0.21 vs. 0.34 (CRN) and 0.31 (CT) on a semi-synthetic longitudinal cohort.
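The PEHE metric quoted above is just the RMSE over individual-level treatment effects; a minimal reference implementation (illustrative, with a hypothetical function name):

```python
import numpy as np

def pehe(tau_hat, tau_true):
    # Precision in Estimation of Heterogeneous Effects: root-mean-square error
    # between predicted and true individual-level treatment effects
    d = np.asarray(tau_hat, dtype=float) - np.asarray(tau_true, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))
```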
Energy-Based Port-Hamiltonian Neural Networks 2026 · JAX · Triton · CUDA
  • Designed a Port-Hamiltonian NN with compositional kinetic/potential energy factorization aligned to the interaction graph and a symplectic leapfrog integrator implemented as a custom Triton kernel.
  • Training combines derivative supervision, energy calibration, and contrastive transition losses; reduced long-horizon Hamiltonian drift by 91% vs. NeuralODE — energy error 0.8% vs. 8.9% over 1000 steps on coupled dynamical benchmarks.
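The symplectic leapfrog scheme behind the kernel can be sketched for a separable Hamiltonian; this is the textbook integrator on a toy harmonic oscillator, not the Triton implementation:

```python
def leapfrog(q, p, grad_V, grad_T, dt, steps):
    # symplectic leapfrog for separable H(q, p) = T(p) + V(q):
    # half kick -> drift -> half kick preserves phase-space volume,
    # so energy error stays bounded instead of drifting over long horizons
    for _ in range(steps):
        p -= 0.5 * dt * grad_V(q)  # half kick from the potential
        q += dt * grad_T(p)        # full drift from the kinetic term
        p -= 0.5 * dt * grad_V(q)  # second half kick
    return q, p

# harmonic oscillator: V = q^2/2, T = p^2/2, initial energy 0.5
q, p = leapfrog(1.0, 0.0, lambda q: q, lambda p: p, dt=0.01, steps=1000)
```

This bounded-energy property is exactly what a non-symplectic NeuralODE solver lacks, and it is why the drift comparison in the bullet above favors the Hamiltonian structure.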
Persona-Aligned Conversational Agents & EchoLogics 2025–2026 · PyTorch · ORPO+DPO · FastAPI
  • Fine-tuned MiniCPM with joint ORPO + DPO preference alignment to embed personality traits (wit, confidence, playfulness) derived from expert coaching patterns. GPT-4-as-judge validation achieved 73% preference rate vs. 61% baseline (QLoRA, A100); live A/B confirmed 18% engagement lift; deployed via FastAPI + Redis + Docker.
  • Extended into EchoLogics: a production multi-agent platform generating census-aligned synthetic consumer panels with Big Five personality traits (OCEAN) for rapid market research — with a causal inference evaluation layer (A/B testing, Cohen's d, 95% CI, statistical power) and multi-model routing (Gemini for simulation, Claude for analysis) for cost-efficiency.
FAIRE — Frontiers in AI Research and Engineering 2025–ongoing · Sprint-based Research Program
  • Independent sprint-based research program targeting open problems in frontier AI. Each sprint begins with a precisely defined question and concludes with a concrete artifact: working code, a quantified finding, or a documented failure. Active tracks: frontier model engineering (aops-fms), causal structure for continual learning (causal-continual), deep learning on dynamical systems (67systems), and mechanistic interpretability applied to social agents (interp-exp1-socialagents).
  • Four objectives structure every sprint as a loop: Discovery · Evidence · Inference · Optimization. All work is version-controlled and published; results are reported regardless of sign.
Thursday Learning Hours 2025–ongoing · Weekly Self-Run Seminar
  • A weekly seminar I design and run for myself: each session covers one foundation, frontier, or framework in AI/ML/DS — slides, reading list, experiments, and notes. Topics range from theoretical foundations (random matrix theory, information geometry) to research frontiers (mechanistic interpretability, diffusion flows, world models, causal representation learning). The discipline is the point. Consistency compounds.
Exploratory Research: Persona Graph Architecture · Causal LLM Interpretability · RSSM World Model 2025–2026 · PyTorch · JAX · Variational Inference
  • Proposed Persona Graph: explicit psychometric graph (Big Five, MBTI nodes) as cognitive core with LLM as stateless execution engine; validated via node ablations measuring behavioral drift under controlled personality perturbations (architecture realized in EchoLogics).
  • Applied Pearl's causal hierarchy to multi-turn LLM negotiation traces: 7-tuple causal graph extraction over agent reasoning to test motif consistency, counterfactual stability under do-calculus interventions, and cross-scenario invariance.
  • Built a Hierarchical Causal RSSM disentangling customer behavior into time-invariant identity (zI), natural dynamics (zN), and treatment response (zC) channels with hard causal gates; normalizing flows generate individual-level counterfactuals. Achieved 2.4× reduction in confounding bias and +8.2% sales lift on Dunnhumby retail data.
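The hard causal gating in the RSSM can be sketched as one latent transition; the dynamics below are a deliberately tiny stand-in (toy matrices, hypothetical `rssm_step` name), showing only the channel-separation idea:

```python
import numpy as np

def rssm_step(z_I, z_N, z_C, A_N, A_C, u):
    # one latent transition of the hierarchical causal RSSM sketch:
    #   z_I (identity)           — time-invariant, passed through untouched
    #   z_N (natural dynamics)   — always evolves
    #   z_C (treatment response) — hard-gated: stays at zero unless treated
    gate = float(u != 0.0)
    return z_I, np.tanh(A_N @ z_N), gate * np.tanh(A_C @ z_C + u)

A = 0.5 * np.eye(2)
z_I, z_N, z_C = np.ones(2), np.ones(2), np.zeros(2)
zi0, zn0, zc0 = rssm_step(z_I, z_N, z_C, A, A, u=0.0)  # factual: untreated
zi1, zn1, zc1 = rssm_step(z_I, z_N, z_C, A, A, u=1.0)  # counterfactual: treated
```

Because the untreated rollout cannot excite z_C, factual and counterfactual trajectories differ only through the response channel, which is what blocks post-treatment leakage into identity and natural dynamics.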