Behind the scenes of my first computer vision project (as a first-time team lead)
3 lessons I learned that you can apply to your ML projects
Hello friends,
Today I want to share some lessons I learned from leading my first computer vision project. I learned computer vision through an internal bootcamp I did while at Amazon. As a data scientist, you have to keep learning on the job - building the plane while you're flying it. Today, a lot of data scientists want to learn LLMs from the ground up, so before I start today's story, I have an LLM-related workshop to share -- “How to Build a Llama 2 Fully-Private GenAI App.” You'll not only learn the inner workings of Llama 2 but also discover how to create a fully-private and air-gapped GenAI application. Get access here.
Alright, let's get started!
I still remember the nervous excitement on my birthday in 2020. My manager asked me to lead a computer vision (video recognition) project for an AWS customer. I knew I would learn a ton in this project, but the feeling of uncertainty was palpable.
Why? Because this was my first computer vision project, and video recognition is more challenging than image recognition. On top of that, I was leading two very experienced data scientists as a first-time team lead.
In today's letter, I'll share some challenges I faced, how I tackled those, and some lessons learned.
1. The first challenge: data labeling
The ML task was to predict whether there might be a soccer goal in the next few seconds, which meant cutting clips before the goals actually happened. We were provided with full videos of soccer games, so we needed to identify the timestamps a few seconds before each goal. The only thing we had was a small time interval around when each goal happened.
Since we wanted to pinpoint the exact timestamps leading up to the goals, we spent the first two weeks just labeling those pre-goal moments.
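To make that a bit more concrete, here's a minimal sketch of what cutting pre-goal clips could look like, assuming you have rough goal timestamps and use ffmpeg from Python. The window lengths, file names, and timestamps below are made up for illustration - this isn't our exact pipeline.

```python
import subprocess

# Hypothetical labels: approximate goal times (in seconds) for one match video.
goal_times = [754.0, 2310.5, 4120.0]

PRE_GOAL_WINDOW = 8   # how many seconds before the goal the clip starts (illustrative)
CLIP_LENGTH = 5       # length of each labeled clip in seconds (illustrative)

def extract_pre_goal_clips(video_path, goal_times):
    """Cut a short clip ending just before each goal using ffmpeg."""
    for i, goal_t in enumerate(goal_times):
        start = max(goal_t - PRE_GOAL_WINDOW, 0)
        out_path = f"pre_goal_{i:03d}.mp4"
        subprocess.run(
            [
                "ffmpeg", "-y",
                "-ss", str(start),       # seek to a few seconds before the goal
                "-i", video_path,
                "-t", str(CLIP_LENGTH),  # keep only a short window of play
                "-c", "copy",            # cut without re-encoding
                out_path,
            ],
            check=True,
        )

extract_pre_goal_clips("match_01.mp4", goal_times)
```

Even with a script like this, the slow part is deciding where each window should start - that's the part we had to eyeball, clip by clip.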
The best (or maybe the worst) part of the job was that we each watched about 200 soccer goal moments within a few days 🤣 After that, I didn't feel like watching soccer for a while.
At the same time, I gained more empathy for the data labelers at YouTube and Meta who flag violent content - imagine watching that as a full-time job. Compared to that, ours really wasn't bad at all.
We had told our customer we'd have our first model within the first two weeks, and we had to delay that because of the data labeling process. As a data scientist, you never know what kind of data labeling process you'll need to go through. It's important to budget enough time for the data pre-processing phase of an ML project - there is almost always something unexpected.
2. Betting on the Right Strategy
Once our data was prepped, we needed to figure out the model architecture.
A team member proposed an ambitious strategy, but it was more of a "moonshot", and we couldn't afford to fail the customer. Since there isn't much literature on this type of problem, I sought insights from colleagues who had worked on similar projects. They recommended a tried-and-tested method, albeit less glamorous than the moonshot: we might not achieve really high performance metrics, but it was a safe bet that works about 70% of the time.
Which strategy should I choose?
Rather than opting for a single approach, I treated our project strategy as a diversified portfolio. The bulk of our effort went to the reliable return, similar to investing in stable stocks and ETFs. At the same time, we allocated a portion of our resources to exploring the moonshot, akin to investing in high-risk, high-reward assets. So one team member and I worked on the tried-and-tested method, while the team member who proposed the "moonshot" got to work on it.
The outcome? Our "safe" strategy worked, and while the "moonshot" didn’t pan out, it offered invaluable insights into our data.
There is real benefit to having people tackle the same project with different approaches and giving someone the freedom to explore.
The key is to accept up front that it might fail, not blame the person, treat it as a learning opportunity, and have a baseline solution that works.
3. Translating a complex problem to simple solutions
One part of the solution required identifying different types of soccer activities during the game: players entering the field, taking breaks, talking to referees, etc. Ideally, we'd identify all of those activities, but that requires more data labeling effort, and we weren't sure we had enough time - even if we did, those rarer activities need more examples than regular soccer activities. And since we were still at the beginning of the ML project, we wanted to quickly build a baseline model to test whether our model would work for action recognition and could differentiate between activities.
So instead of trying to identify all those activities, we simplified the multi-class classification problem into a binary one: we only trained the model on clips from the few seconds before a goal, where players are attacking the opponent's side, and clips where players are just walking around the field (not in any intense activity).
My thesis was that if the model couldn't even differentiate those two significantly different actions, it wouldn't be able to recognize the other classes I mentioned.
So we started with the binary problem after labeling the two classes of clips, and our first model reached ~70% accuracy. Later, we improved this binary classifier, and it became our final model.
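This wasn't our exact architecture, but as an illustration, here's a minimal sketch of how you might frame that binary baseline with a pretrained video model in PyTorch. The model choice, weights name, clip shapes, and labels are assumptions for the example, not what we shipped.

```python
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

# Two classes for the simplified baseline: "walking around" vs. "pre-goal attack".
NUM_CLASSES = 2

# Start from a video model pretrained on a generic action-recognition dataset
# and swap the final layer for a binary head.
model = r3d_18(weights="KINETICS400_V1")
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# A fake batch of clips: (batch, channels, frames, height, width).
clips = torch.randn(4, 3, 16, 112, 112)
labels = torch.tensor([0, 1, 0, 1])  # 0 = walking around, 1 = pre-goal attack

# One training step on the toy batch.
logits = model(clips)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```

The point isn't the specific model - it's that a two-class setup like this is cheap to label, quick to train, and tells you early whether the approach can distinguish actions at all before you invest in more classes.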
There is always a more perfect solution you could go for, but we always have limited time and resources. Start simple with your first baseline model; find a low-effort way to quickly test your strategy before you go for the more complicated solution. A lot of times, you'll realize the simple solution works best.
Take-aways
Allocate more time for data preparation
Diversify the risk of your ML strategies - treat them as a portfolio
Figure out a low-risk, simple solution before diving into the complex one
(Also, don't miss the GenAI workshop, get access here)
That's it for today! I'm in New York right now writing this newsletter to you, and I'll be here until early September. I'm thinking about organizing a meetup if you're interested. Reply to this email if you'll be around.
What do you think of today's newsletter? I read every reply and would love to get your input!
Until next time,
Daliana