A Letter to My Fellow Grad Students - Navigating the AI Era Together
Starting our MS journey in Fall 2025 - what it means and how to make the most of it (transcribed)
Hey everyone,
I’ve been thinking a lot about our cohort lately. We’re this diverse group of people from different backgrounds and expectations, all diving into MS programs across data science, computer science, electrical engineering, and computer engineering. From what I’ve seen in our interactions so far, most of us are pretty interested in AI and machine learning, which makes sense given where we are in 2025.
But here’s the thing that’s been on my mind - we’re starting these programs at this wild moment in history. We already have claims about artificial general intelligence floating around, zero-shot and few-shot models that can learn things incredibly quickly, and agentic AI that’s handling coding and all sorts of mundane tasks. So what does that actually mean for us? How do we navigate this and make our journey not just rewarding, but set us up for the best possible outcomes?
I wanted to share some of my thoughts and observations - partly as a way to think through this myself, and partly because I think it might be helpful for others. It’s also kind of a journal for me, you know? I want to look back after this semester and see how far I’ve traveled and what impact these months had on my thinking.
The Current Landscape - More Opportunity Than You Think
When you look at the current AI landscape, especially open source, there are incredible benchmark models everywhere. We’ve got Qwen, LLaMA, DeepSeek - all doing amazing work in the LLM space, vision-language models, coding models, you name it. Then there are these fantastic AI applications and tools like Perplexity, Manus AI, and voice agents from ElevenLabs. Pick any domain and you’ll find strong models competing on everything.
Take text-to-speech, for example - Microsoft’s VibeVoice is fascinating because you can generate entire podcasts with music and all sorts of features. Meanwhile, Gemini and other models are handling image generation and editing. Everywhere you look, there are models focusing on different modalities.
The industry is also getting really interested in reinforcement learning for reasoning models, different RL environments, benchmarks testing AGI capabilities like ARC, and research into why models hallucinate. There’s this whole conversation about how RL-based fine-tuning forgets less of what a model was originally trained on than supervised fine-tuning does.
But here’s what I find encouraging - we already have these good benchmarks and North Stars to focus on, which means the industry hasn’t reached saturation. There’s still massive scope for us.
Beyond all these consumer-facing applications, there’s incredible research happening in weather forecasting, population dynamics, causal modeling, probabilistic modeling, diffusion models, and risk management in finance. The scope is broad because it’s not just applied research - core research is being developed everywhere. Even something as fundamental as moving from correlational systems (which is what most LLMs are) to truly causal knowledge - there are labs and professors working on those foundational frameworks.
And think about this - maybe 1% of companies, or let’s say safely 5%, have really good AI infrastructure and research setups. That means there’s another 10-15% that will try to adopt and climb this AI innovation curve over the next 15 months. We still have enormous scope in companies focusing on medical informatics, customer-related applications like recommendation systems and pricing, robotics, materials discovery - the industry is accelerating across applied research, fundamental research, and product innovation. It’s not stopping anytime soon.
What Should We Actually Focus On?
So given all this, what should we be focusing on that will actually be rewarding for us?
The machine learning side of things is huge, especially taking specific problem statements and coming up with machine learning solutions. I’m not even worried about XGBoost or classical machine learning models anymore - it’s all about deep learning in the post-transformer era. After DeepSeek’s release, there’s so much happening around reasoning, vision language models, speech.
I think taking a sequence of courses that complement each other - maybe deep learning for NLP first, then pivoting into scaling LLM-based applications, agentic AI, or deep learning systems - could be the best choice. Alongside that, taking a couple of papers and trying to implement them, or dissecting them and playing around with them, would be great.
We have to approach things from both ends - what’s currently happening and what foundational knowledge we need to acquire. For example, I’m quite interested in probabilistic models right now, but I can’t just stick with old techniques like latent Dirichlet allocation or hidden Markov models. I have to think about how probabilistic models help me approach problems in LLMs and diffusion models - after all, diffusion models are probabilistic models, and so are normalizing flows. We have to think from both ends.
It’s not like everyone has to start from research and then move to applied research and product innovation. You can start on any side, but it should be complementary and interrelated. Like how you might start with deep learning and NLP, then take scaling and agentic AI alongside that coursework.
Now, some people might be really interested in the performance side of things - ML performance, not necessarily applied research or building new models, but more about scaling, GPU programming, and optimization. For that, learning Triton is something I see as really valuable, because CUDA and Triton are what most startups are using to improve their GPU performance.
Think about it - when a model comes out of a frontier lab and lands on Hugging Face, not every company will just take it and host it as-is with vLLM. They might need to work on the model architecture to make it run faster on GPUs and save on latency and memory consumption. That’s what infrastructure-focused teams work on.
If someone’s really interested in software roles, I’d suggest focusing on how agents actually work from an engineering perspective - distributed computing, building sandboxes, different protocols - rather than just learning backend frameworks. There are two different things here: developing agents with frameworks like LangGraph, versus developing the frameworks themselves or the backend infrastructure that can run these different systems. That’s agentic engineering, which is different from ML performance engineering.
And you might ask, what if I’m not made for machine learning but I’m interested in data science applications? In that case, I’d say don’t limit yourself to basic statistics or basic visualization. Anyone interested in data science who’s not comfortable with LLMs or current gen AI should seriously consider taking econometrics courses. Almost all big enterprises have marketing science or revenue growth management teams that want to optimize their decisions quarterly, studying different campaign strategies, supply chain stuff.
I’d suggest taking econometrics courses plus operational research - optimization, causal inference, and predictive modeling. This combination should really help so you’re not just doing classification or regression models, but moving toward building causal inference solutions for measuring campaign performance or different nudges and choice architectures that companies use.
I read this paper about Apple Watch where they studied the nudge effect of notifications sent every 55 minutes telling people to stand. How does this nudge actually impact people’s behavior to stand? The econometric approach of causal inference, the potential outcomes framework, is heavily used in experimental sciences, marketing sciences, econometrics teams. AI isn’t taking over everywhere - we need both intelligence and the capability to make informed decisions, which requires specific methodologies.
Even with coding agents, someone without statistical foundations can’t build good solutions. We need methodological understanding of how to take samples, construct hypotheses, build models and evaluate them. That’s why people not comfortable with LLMs can focus on causal inference applications across different industries, or optimization-focused industries.
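To make that concrete, here’s a minimal sketch of the kind of methodological basics I mean - a two-sample test for a difference in means, written from scratch in plain Python. The data and the function name are invented for illustration, and it uses a normal approximation to the null distribution rather than the exact Student-t version, which is only reasonable for decent sample sizes:

```python
from statistics import NormalDist, mean, variance

def two_sample_test(a, b):
    # Welch-style test statistic: difference in means over its standard
    # error, using each group's own sample variance. The exact test would
    # use the Student-t CDF with Welch's degrees of freedom; the normal
    # approximation below is fine for reasonably large samples.
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p

# Invented A/B metric samples: control vs. treatment.
control   = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7]
treatment = [10.9, 11.2, 10.8, 11.0, 11.3, 10.7, 11.1, 10.9]
z, p = two_sample_test(treatment, control)
print(f"z = {z:.2f}, p = {p:.4g}")
```

The point isn’t this particular test - it’s knowing why the standard error looks like that and when the approximation breaks down, instead of just calling a library function.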
There’s also the finance and operations research side - mostly algorithmic development, but some people might be interested in the data science aspects. Recommendation systems and personalization are mostly deep learning-based now, so I’d still suggest getting that deep learning understanding of how things work and how to develop and scale models.
What we really need is strong commitment to a specific area of interest so courses can complement each other.
The Optimization and Decision Making Opportunity
There’s this really interesting area that very few people are exploring, especially coming from operations research backgrounds - optimization and decision making. The scope is there, and enterprises are constantly trying to optimize supply chains, inventory operations, and pricing points.
Having optimization and reinforcement learning as focus areas is a great choice - learning how mixed-integer linear programming handles optimization scenarios like inventory operations, and how reinforcement learning works for sequential decision making. The core ML folks interested in LLMs will take up reinforcement learning for LLMs as a research direction anyway, because it prepares them for new challenges and new dimensions of knowledge.
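As a toy version of the selection problems mixed-integer programming handles, here’s a 0-1 knapsack - the same formulation a MILP solver would receive (maximize total value subject to a capacity constraint, with binary decision variables), solved by dynamic programming since the instance is tiny. All the numbers are invented; real instances go to a solver like CBC or Gurobi:

```python
def knapsack(values, weights, capacity):
    # Same problem as the MILP: maximize sum(v_i * x_i)
    # subject to sum(w_i * x_i) <= capacity, with x_i in {0, 1}.
    # best[c] = max total value achievable with capacity c.
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Iterate capacity downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

# Invented example: 4 candidate SKUs, value vs. shelf-space cost, capacity 9.
print(knapsack([10, 13, 7, 8], [4, 5, 3, 2], 9))
```

Writing the tiny version by hand is exactly the kind of "underlying work" that makes the solver-based tooling easy to pick up later.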
I strongly recommend developing breadth, but you need depth in at least two related subjects that cover a specific scope in the industry. If you ask why I’m not telling you to learn agent frameworks first - those are things you’ll obviously be able to pick up, because you’re preparing yourself with the essential, fundamental concepts. Developing with frameworks becomes much easier when you can do the underlying work.
The Interdisciplinary Approach
Another thing I’m quite interested in is approaching these topics from an interdisciplinary perspective - ML plus X, or data science plus X. Taking a focus on industries that need machine learning understanding or specific conceptual knowledge. For finance, medical, real-world evidence - it’s not just knowing statistical or machine learning concepts, but understanding how data and methodologies work in those specific domains.
In pharmaceutical or medical domains, there are different types of data collection methods, different problem statements. There’s health economics, which focuses on economic impact and economic burdens that might need better statistical inference or causal inference approaches. If you take finance - derivatives, credit risk modeling - you need to understand different data formats and domain-specific mechanisms to understand how that particular industry works.
I wouldn’t say we should be chasing frameworks from day one. That’s not what a master’s is meant for. We should build things from the middle ground where concepts are evolving, understanding the foundational ideas that build current concepts and current developments.
For example, if you’re really good at Markov decision processes, you’ll be able to understand how to frame any reinforcement learning problem statement. If you don’t know the foundations, RL for LLMs doesn’t make sense - you’re just thinking “RL, RL, RL” without understanding that it all evolved from MDPs, then partially observable MDPs, then the algorithmic approaches built on them. And it’s not just deep Q-learning anymore - everything is policy gradients and their variants.
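Here’s what I mean about MDP foundations, as a minimal sketch - value iteration on a made-up two-state MDP. The state names, rewards, and transition probabilities are all invented for illustration:

```python
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-10):
    """Bellman optimality backups until convergence.
    P[s][a] = list of (probability, next_state); R[s][a] = reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Best one-step lookahead over actions.
            v = max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                    for a in actions)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V

# Toy problem: a process is in a "low" or "high" productivity state;
# "work" earns more and can shift the state, "wait" keeps it put.
states, actions = ["low", "high"], ["wait", "work"]
R = {"low": {"wait": 0, "work": 1}, "high": {"wait": 1, "work": 2}}
P = {
    "low":  {"wait": [(1.0, "low")],  "work": [(0.7, "high"), (0.3, "low")]},
    "high": {"wait": [(1.0, "high")], "work": [(0.6, "high"), (0.4, "low")]},
}
V = value_iteration(states, actions, P, R)
print(V)
```

Once this loop feels obvious, policy gradients and RLHF read as answers to "what breaks when the state space is too big to enumerate like this?"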
The same goes for other areas. In predictive machine learning for credit risk or customer lifetime value modeling, as grad students we shouldn’t just learn XGBoost. We should learn what industry actually needs. Do they just need predictions, or do they need uncertainties quantified with probabilistic models or conformal prediction?
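For instance, the simplest flavor of conformal prediction - split conformal - takes only a few lines. Everything here (data, names) is invented for illustration; the guarantee is marginal coverage of at least 1 - alpha, assuming calibration and test points are exchangeable:

```python
import math

def conformal_interval(cal_actual, cal_pred, new_pred, alpha=0.1):
    # Nonconformity scores: absolute residuals on a held-out calibration set.
    scores = sorted(abs(y - p) for y, p in zip(cal_actual, cal_pred))
    n = len(scores)
    # Conservative finite-sample quantile rank: ceil((n + 1) * (1 - alpha)).
    k = min(math.ceil((n + 1) * (1 - alpha)), n)
    q = scores[k - 1]
    # Any point prediction becomes an interval with coverage >= 1 - alpha.
    return new_pred - q, new_pred + q

# Invented calibration data: model's point predictions vs. what happened.
cal_pred   = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
cal_actual = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.3, 8.0, 9.1, 9.9]
lo, hi = conformal_interval(cal_actual, cal_pred, new_pred=6.5)
print(lo, hi)
```

That’s often exactly what a credit risk or CLV stakeholder wants: not just a number, but how wrong the number might plausibly be.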
If we’re working with economists on marketing science, A/B testing shouldn’t just be simple t-tests and lift calculations. We should understand current best practices like regression discontinuity or difference-in-differences methods and their further refinements.
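The canonical 2x2 difference-in-differences fits in one line once you see the logic: the control group’s pre/post change stands in for the trend the treated group would have followed anyway. A sketch with invented campaign numbers - and note the estimate is only valid under the parallel-trends assumption, which the code cannot check for you:

```python
from statistics import mean

def diff_in_diff(treat_pre, treat_post, ctrl_pre, ctrl_post):
    # (observed change in treated) minus
    # (counterfactual trend, proxied by the control group's change)
    return ((mean(treat_post) - mean(treat_pre))
            - (mean(ctrl_post) - mean(ctrl_pre)))

# Invented weekly sales: the treated region got the campaign, control didn't.
effect = diff_in_diff(
    treat_pre=[100, 102, 98], treat_post=[115, 117, 113],
    ctrl_pre=[90, 92, 88],    ctrl_post=[95, 97, 93],
)
print(effect)
```

A naive pre/post comparison on the treated region alone would credit the campaign with the whole +15; netting out the control’s +5 is the entire trick.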
The Real Purpose of Graduate Studies
What I’m underlining is this - if you just chase frameworks and keep courses just for the sake of courses, I seriously doubt the value of graduate studies. Graduate studies should impact our knowledge acquisition and help us apply that knowledge in the right way, rather than having knowledge acquisition decoupled from applied experience. Both should complement each other so we can be confident and relevant to any changes in the industry.
We should understand different scope areas and how to approach them. Another common misconception I see is people just focusing on applying to internships en masse, trying to automate applications. But we should also focus on how to prepare for these opportunities, how to build a better portfolio.
A portfolio isn’t just something you do for the sake of doing it - it’s about taking industrial problem statements, solving them, building solutions, and showing that work. Currently, I’m working on probabilistic modeling and another area, trying to see where the two intersect. That’s the kind of work I can really see paying off.
Keep Your Standards High
These are the things I’m quite interested in and wanted to share. We should all keep our standards high, keep our goals ambitious. A master’s degree is about mastering foundational concepts, not chasing frameworks.
The key is developing both intelligence and inference - the capability to understand, and the methodological foundation to make informed decisions. Whether you’re interested in cutting-edge ML research, the infrastructure and performance side, the data science applications, or the interdisciplinary approaches, the foundation remains the same: a deep understanding of principles that you can apply and adapt as the field evolves.
All the best to everyone as we navigate this incredible journey together.
This is as much a reflection for me as it is advice for others. I’m curious to hear your thoughts and see how we all grow throughout this program. Let’s keep the conversation going.