My First Week at Columbia - Crafting the Perfect ML/Data Science Course Portfolio
How my first week at Columbia went, and the rigor of Courseworks (transcribed)
So yeah, I’m just going to talk about my first week at Columbia for the fall 2025 semester. This is more about the courses I went for, what I’m trying to take this semester, and what could be the best composition of courses that would really help anyone doing their master’s with a focus on machine learning and data science, right? I’m just sharing my experience here.
The Strategic Approach: Why I Chose Foundational Over Applied
When I started planning for my master’s, I was pretty sure I should be focusing more on foundational, theoretical, and algorithmic science-centered courses. Why? Because for years, I’ve worked on applied data science and applied machine learning areas. To really build that rigor, I wanted to select some of the best electives available.
Now, here’s one caveat with Columbia’s master’s program - you have to take some core courses that are compulsory. But honestly, some of these courses sounded very redundant to me. Courses like probability, statistics, exploratory data analysis, and visualization felt quite entry-level, especially since I’ve taught probability and statistics and utilized them across my years in the industry. I mean, EDA is like the bread and butter thing we always end up doing, right?
I’m not saying these courses aren’t necessary - they’re quite important for people who are getting started and want a strong understanding. But since I had the background, I explained my experiences, submitted for waivers, and got them approved. So now I’m able to take three electives with one core course, which is algorithms.
The Course Lineup: Four Courses for Credit
1. Algorithms with Professor Eleni
I really like this algorithms coursework by Professor Eleni. It’s backed by the work of Cliff Stein and Tim Roughgarden - and they’re actually at Columbia! Having the OG professors and pioneers in the analysis of algorithms as part of the Data Science Institute is quite a good vibe. We end up studying and learning from their papers directly.
I see everything as an ecosystem - how doing a course within a specific ecosystem and alignment with my personal vibe is going to help. The fact that Cliff Stein and Timothy Roughgarden are associated with this, and maybe we can connect with them for wonderful conversations, makes it really exciting.
Personally, I believe having this algorithmic knowledge will help in many places - optimizing solutions and thinking of coming up with better algorithms. In my bachelors, I worked on evolutionary algorithms, using genetic algorithms, particle swarm optimization, and ant colony optimization for process control optimizations. But I never had a chance to go in-depth with different analysis paradigms of algorithms.
Also, I always stayed away from LeetCode kind of conversations in the past. So this is setting up a very good proper foundation for me. I really enjoyed Professor Eleni’s classes and how she teaches algorithm analysis. When I read the CLRS book along with that, it makes more sense, and since I come from an industry background, I can imagine where I can utilize these concepts.
I’m trying to make this semester about benchmarking - how I can take any algorithm and benchmark it against my own algorithms. I don’t need to build a winning algorithm, but one that is properly benchmarked, where I have the ability to validate and build my own. Data scientists typically may not think this way, but I’m trying to bring in how I can design my own algorithms, especially since I’m quite interested in bandit algorithms. I’ve been reading Tor Lattimore’s book on bandit algorithms because bandits are quite a tough playground for making better decisions.
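To make that benchmarking idea concrete, here’s a minimal sketch of an epsilon-greedy bandit measured against a simple regret baseline. The arm win-rates and parameters are made-up illustration values, not anything from the course or the book:

```python
import random

def eps_greedy(arm_probs, rounds=2000, eps=0.1, seed=0):
    """Epsilon-greedy on Bernoulli arms: explore with prob eps, else exploit."""
    rng = random.Random(seed)
    k = len(arm_probs)
    counts = [0] * k       # pulls per arm
    values = [0.0] * k     # running mean reward per arm
    total_reward = 0.0
    for _ in range(rounds):
        if rng.random() < eps:
            arm = rng.randrange(k)                        # explore uniformly
        else:
            arm = max(range(k), key=lambda a: values[a])  # exploit best estimate
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total_reward += reward
    # regret benchmark: shortfall vs. always pulling the best arm in expectation
    regret = max(arm_probs) * rounds - total_reward
    return counts, values, regret

# hypothetical arm win-rates, purely for illustration
counts, values, regret = eps_greedy([0.2, 0.5, 0.8])
```

Swapping the exploit step for a UCB index and comparing the regret curves is exactly the kind of “benchmark my algorithm against a baseline” exercise I mean.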
2. Causal AI with Professor Elias
I chose Columbia specifically because Professor Elias is taking sessions and has his causal AI lab here. I’m quite interested because from 2022 onwards, I’ve been reading and trying to use Professor Pearl’s causal inference frameworks. Being in Professor Elias’s session is exciting because he and Professor Pearl collaborate - they’re the pioneers in causal intelligence.
For me, it’s not just learning what is what - it’s more about having very good conversations with Elias during sessions. It’s thinking from philosophical, statistical, and computer science perspectives and learning how a new framework evolved. It’s more about unleashing the intuition behind causal intelligence and how these people are approaching it.
This aligns with my interest in decision engineering - you’re not just predicting something, but also going to the operationalization end of things, right?
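To give a flavor of what “not just predicting” means here, this is a toy back-door adjustment in Pearl’s framework, comparing the observational P(y|x) with the interventional P(y|do(x)) when a confounder z drives both treatment and outcome. All the probabilities are made-up numbers for illustration, not from any lecture:

```python
# binary z (confounder), x (treatment), y (outcome); illustration values only
P_z = {0: 0.6, 1: 0.4}                      # P(z)
P_x1_z = {0: 0.2, 1: 0.8}                   # P(x=1 | z): z drives treatment
P_y1_xz = {(0, 0): 0.3, (1, 0): 0.5,        # P(y=1 | x, z)
           (0, 1): 0.6, (1, 1): 0.8}

def p_x_given_z(x, z):
    return P_x1_z[z] if x == 1 else 1 - P_x1_z[z]

def p_y1_obs(x):
    """Observational P(y=1 | x): z is not controlled, so it confounds."""
    num = sum(P_y1_xz[(x, z)] * p_x_given_z(x, z) * P_z[z] for z in (0, 1))
    den = sum(p_x_given_z(x, z) * P_z[z] for z in (0, 1))
    return num / den

def p_y1_do(x):
    """Interventional P(y=1 | do(x)) via back-door adjustment: sum_z P(y|x,z) P(z)."""
    return sum(P_y1_xz[(x, z)] * P_z[z] for z in (0, 1))
```

With these numbers the observational estimate overstates the treatment effect, which is precisely the gap between predicting and deciding what to operationalize.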
3. Reinforcement Learning with Professor Shipra Agrawal
I’m taking another course from Shipra Agrawal on reinforcement learning because currently, all the agent AI systems and LLMs are focused on RL. Though I’ve worked on RLHF, direct preference optimization, and PPO, I want to go back and get those foundations of RL properly.
I’ve tried reading Sutton and Barto’s book and experimented with it, but I believe Shipra’s classes are very good for conversations, and the lab sessions and assignments are quite engaging. So far, I’ve looked at the first assignment and lab-related stuff, and it covers a breadth of things related to RL.
Professor Shipra’s background in dynamic pricing and operations research-based algorithms aligns with what I’m trying to pursue. It shouldn’t be just using RL for large language models, but understanding how RL problems can be formulated. Since I come from a control theory background and have used model predictive control, it’s quite fun to cover different aspects of reinforcement learning. I’m particularly interested in industrial or enterprise analytics - supply chain optimization, pricing, personalization.
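As a toy illustration of formulating an RL problem outside of LLMs, here’s value iteration on a hypothetical two-state pricing/retention MDP. The states, actions, and numbers are all invented for the sketch, not course material:

```python
def value_iteration(mdp, gamma=0.9, tol=1e-8):
    """Tabular value iteration: V(s) <- max_a sum_s' P(s'|s,a) [r + gamma V(s')]."""
    V = {s: 0.0 for s in mdp}
    while True:
        delta = 0.0
        for s, actions in mdp.items():
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

# hypothetical customer-retention MDP; each outcome is (prob, next_state, reward)
mdp = {
    "loyal": {
        "high_price": [(0.8, "loyal", 10), (0.2, "churned", 10)],
        "low_price":  [(1.0, "loyal", 6)],
    },
    "churned": {
        "high_price": [(1.0, "churned", 0)],
        "low_price":  [(0.3, "loyal", 0), (0.7, "churned", 0)],
    },
}
V = value_iteration(mdp)
```

The point is the formulation: once a pricing or supply chain question is written as states, actions, and transitions, the same machinery applies whether or not a language model is anywhere in sight.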
4. Probabilistic Models and Machine Learning with Professor David Blei
Along with that, I’m taking Professor David Blei’s probabilistic models and machine learning course. It gives a Swiss Army knife kind of feel - understanding the unified framework of probabilistic modeling so you can take on any data analysis project or core data science problem statement.
This also intertwines with my bachelor’s degree because I worked on Kalman filters and extended Kalman filters for control engineering, and six years later in informatics, I worked on Kalman filter-based frameworks with state space models to model customer behavior. We were able to do that, but I still hope there are a lot of things we can do better.
The interesting thing is latent representation learning, or latent variable modeling. When you look at the usual machine learning models like random forest, XGBoost, or LightGBM, they’re designed for forecasting or pattern recognition, but they don’t give you explainability in a principled way. We use SHAP and LIME, but how can we craft it better?
Professor David Blei is a co-author of Latent Dirichlet Allocation, and when I read his papers, I see how they’re making it a unified framework. You can use this understanding to model hidden Markov models, Kalman filters, LDA, and other probabilistic models. Even language models and diffusion models can be represented this way. It gives me a new perception of how you can take a problem and present it probabilistically.
The Audit Courses: Three Additional Learning Experiences
I’m also auditing three courses - not taking them for grades, but to find new nuances, have better conversations, and relearn certain things:
Deep Learning with Professor Andrei - covers everything from logistic regression to LLMs, reinforcement learning for LLMs, and how to scale them. It’s going to be a quick refresher for me.
Applied Machine Learning with Professor Spencer Luo - another fantastic course that covers a range of things. For someone who needs a proper start, both Andrei’s and Spencer’s courses are fantastic because they cover such a breadth.
Deep Learning and Systems with Professor Parajit Dubey from IBM Research - focuses on operationalization and system aspects like GPU programming, multi-node GPU programming, scaling, performance optimization, and deployment with Kubernetes and Docker. It also focuses on current GenAI things.
I attended his session yesterday - it was quite neat and interesting, particularly because I never had a chance to get in-depth with GPUs and how sharding works conceptually. I’ve used fully sharded data parallelism (FSDP) and model parallelism, but never got the chance to really understand the internals.
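For intuition, here’s a toy, single-process sketch of the sharding idea - splitting a flat parameter list across ranks and gathering it back before use. This is only conceptual; real FSDP shards tensors on actual devices and overlaps communication with compute:

```python
def shard(params, world_size):
    """Split a flat parameter list into equal per-rank shards (zero-padded),
    mimicking how FSDP keeps only 1/world_size of the weights per GPU."""
    per = -(-len(params) // world_size)  # ceil division
    padded = params + [0.0] * (per * world_size - len(params))
    return [padded[r * per:(r + 1) * per] for r in range(world_size)]

def all_gather(shards, n):
    """Reassemble the full parameter list from every rank's shard, dropping
    padding - conceptually what each rank does right before a layer's forward."""
    return [x for s in shards for x in s][:n]

params = [float(i) for i in range(10)]  # stand-in for a flattened weight tensor
shards = shard(params, world_size=4)
```

The memory win is that between gathers, each rank only holds its own shard; the cost is the communication to gather the full weights on demand.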
The Columbia Ecosystem: Why It’s More Than Just Courses
One thing I realized is that Columbia doesn’t have all courses published on YouTube or online, so people outside Columbia may not know the quality and rigor of what’s happening here. But it’s really on par with what we see from Stanford, like Andrew Ng’s CS 229 and the recent agent AI-related offerings.
I could see courses like scaling LLMs from the IBM research team and Columbia professors, high performance machine learning, and agent AI sessions which are purely seminar and presentation-based. There are also application-focused courses in banking data science, insurance, finance, deep learning in finance - the breadth is incredible.
But I personally feel a better start for grad school students would be building breadth in the first semester. Many people come straight out of college or with one to two years of software engineering experience but are pretty new to machine learning. I’d strongly suggest keeping courses like machine learning and deep learning along with probability and algorithms - that’s a great combination.
The Research Environment: Where Real Innovation Happens
Yesterday I visited the research fair, and I discussed with PhD students from different backgrounds. I shared my thoughts on various problem statements. The resources are quite good, and you’re almost always welcome to collaborate with anyone.
I discussed one problem statement with a PhD student from chemical engineering who’s doing material or drug discovery with foundational models. They’re building their own networks - completely different from large language models. Such things help you think differently from the hype.
There were projects on global fish counting or fish activity monitoring with computer vision, and style analysis from artworks with VLMs. I was able to share my past experiences and readings. It was quite a good conversation - I was actually longing for this for quite some time because in India, it’s quite tough to have such level of conversations.
The TA Opportunity: Bridging Different Worlds
I’ve gotten a TA opportunity with Quantitative Methods in the Social Sciences through the Institute for Social and Economic Research and Policy. This is great for me because while I’m always running around with machine learning and current AI, I’m quite interested in how people make their choices.
Working with Professor Eirich on their statistical analysis course is quite different. The statistical way of looking at problems in social science is completely different from probability stats classes because you get interesting problems to explore and always come up with different questions.
Around 200 students have registered, and I’m one of four TAs. It’s quite satisfying and motivating to apply my readings from books like “Freakonomics,” “Nudge,” and Malcolm Gladwell’s “Outliers” and “The Tipping Point” - a lot of things. It keeps the personality I built over time intact - I’m not limiting myself to just coding. I’m covering end-to-end problem solving, and what I inculcated in my previous work stays intact while seeing a new transformation.
The Philosophy: Transformation, Not Just Transition
This isn’t just a transition for me - I’m trying to do a transformation. How can I improvise? How can I make more things? How can I make myself come up with something new, something innovative?
I follow this control theory approach: observe, understand the pattern, simulate, optimize, and keep iterating. That’s what you can do here.
There are wonderful courses, wonderful professors, very good setup, and almost every day you interact with very good minds - motivated people. I’m happy to be here and quite motivated to see the outcomes.
I’m not rushing to get an internship or full-time offers because I truly believe that if I properly focus on this, employment opportunities in research or industrial roles will naturally follow. All we need is persistence, patience, and real focus to build something new.
Final Thoughts: The Best Option Framework
People might argue that I could have gone to MIT or Stanford, which are “the best in the world.” But there’s no single “best” - there are best options. And anyone could agree Columbia is a best option. I can validate this based on conversations and experiences so far.
It’s like Bayesian subjectivity - my opinion will always change, and so far, it’s only strengthening. I don’t see any great disappointment happening.
The campus connections that DSI helps with are incredible because DSI focuses on applied science with strong ties to the New York City ecosystem. They’re working on child health, sexual health, medical sciences, civil engineering, landscape architecture - it proves data science is a fluid department. You know how to solve problems, but you might not find interesting problems to focus on. The campus connections enable that discovery.
The only thing is we have to learn some nuances of communication, prioritizing, and how proactive we need to be. Those are skills people automatically learn in this ecosystem.
I’ll keep sharing my progress and further developments. This is just the beginning of what I believe will be a transformative journey.
What resonates most with you about this approach to graduate school? I’d love to hear your thoughts on balancing foundational learning with applied experience.