The Machine Learning Interview
A new role in the tech industry has been forming over the last few years: the Machine Learning Engineer (MLE). What is a machine learning engineer, and what should you expect in your interviews? It really depends on the team and role, but there are some general guidelines. The term MLE isn't fully standardized yet, so it's important to understand what your recruiter and hiring manager want from the position. This guide will help you have that conversation and walk into your interview prepared.
Machine Learning Engineer vs Data Scientist
These terms are sometimes used interchangeably. We define them primarily by their work product, or artifacts. A machine learning engineer produces a production system that performs some machine learning task. This may be a computer vision solution for self-driving cars, a recommender system for a content product like YouTube, a regressor for stock market predictions, or a similar type of system. The key here is the production of systems. The resulting artifact is generally long-lived and needs to be maintainable and supportable by future engineers. An MLE's day-to-day work therefore includes analysis, feature engineering, model selection, and productionizing those things at scale. Their skillset is a blend of applied statistics and software engineering fundamentals, with a strong bias towards engineering.
A data scientist's artifacts are generally knowledge. A data scientist is a domain expert, usually with a stronger theoretical background than an MLE. They do analysis to inform business decisions and to explore bleeding-edge approaches to existing problems, and the end work product will likely be a Jupyter notebook or similar documentation that tells the story of what they've learned. Their skillset is much more focused on statistics and ML theory, with less emphasis on software engineering. Competency rubrics for data science roles sometimes include things like “coding skills equivalent of L-1 of software engineering peers.” Here L-1 means “the level below.” In other words, a senior data scientist is expected to code at the level of a software engineer one level more junior.
The real world isn’t that black and white, but this distinction does give us some guidelines on what to expect in a day-to-day role, and what you might be interviewed on.
Machine Learning Engineer Interview Modules
The description of a machine learning engineer above hints at the types of modules you may see in an interview. Take LinkedIn's machine learning engineer modules as an example:
Data Coding
Coding and Algorithms
Machine Learning Design
Machine Learning Theory or System Design (depending on the candidate's background)
Note that there are two coding modules. Data Coding is a coding problem from a data domain. Popular examples seen at many companies include “implement dot product on sparse vectors” or “count the maximum number of points on a line”. Coding and Algorithms is exactly what it sounds like, and you should practice for it as you would for traditional coding interviews. Depending on the candidate's background, this module may be at a medium or hard level.
Machine Learning Design is a product and systems design problem. If you’re interviewing for a recommender systems team, you may be asked to design the Netflix or YouTube homepage and be expected to talk about the trade-offs between different model families, the features you’d explore, how you’d validate them, and A/B testing. You would also be expected to touch on your training pipelines, how you’ll serve inferences, and your feature stores.
Finally, depending on the candidate's background, they’ll get either a traditional systems design interview or a deep ML Theory session focused on their area of expertise. If you claim to have deep knowledge of support vector machines, you’ll get fundamental questions on the topic here. Some companies have been known to ask candidates to derive simple logistic regression on the board from first principles.
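The two Data Coding examples named above can be sketched in Python as follows. This is a minimal illustration, not any company's reference solution. The function names sparse_dot and max_points_on_line are assumptions made for this example, as is the choice to represent a sparse vector as a dict mapping index to nonzero value and a point as an (x, y) tuple of integers.

```python
from collections import defaultdict
from math import gcd


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {index: nonzero value}.

    Assumption for this sketch: any index missing from a dict is zero.
    """
    # Iterate over the smaller vector so the cost is O(min(len(a), len(b))).
    if len(a) > len(b):
        a, b = b, a
    return sum(value * b[index] for index, value in a.items() if index in b)


def max_points_on_line(points):
    """Largest number of points that lie on one straight line.

    Assumption for this sketch: points are (x, y) tuples of integers.
    Runs in O(n^2) time by bucketing directions from each anchor point.
    """
    n = len(points)
    if n <= 2:
        return n  # zero, one, or two points are always collinear

    best = 0
    for i in range(n):
        x1, y1 = points[i]
        directions = defaultdict(int)  # reduced (dx, dy) -> count
        duplicates = 0                 # later points identical to the anchor
        for j in range(i + 1, n):
            dx = points[j][0] - x1
            dy = points[j][1] - y1
            if dx == 0 and dy == 0:
                duplicates += 1
                continue
            # Reduce the direction to lowest terms instead of using a float
            # slope, which avoids rounding errors and handles vertical lines.
            g = gcd(dx, dy)
            dx, dy = dx // g, dy // g
            # Normalize the sign so opposite directions share one key.
            if dx < 0 or (dx == 0 and dy < 0):
                dx, dy = -dx, -dy
            directions[(dx, dy)] += 1
        busiest = max(directions.values(), default=0)
        best = max(best, 1 + duplicates + busiest)
    return best
```

For instance, sparse_dot({0: 1.0, 3: 2.0}, {3: 4.0, 5: 1.0}) returns 8.0, because index 3 is the only one the two vectors share. In an interview, expect follow-up questions on the trade-offs, such as a sorted list of (index, value) pairs with a two-pointer merge when one vector is far larger than the other.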
Don’t let that intimidate you. MLE is a new role and companies need people from all different backgrounds. The key is to present yourself as you are. Are you a self-taught machine learning practitioner with a strong background in distributed systems? I guarantee there’s a role for you. Be up front about your strengths and weaknesses and you’ll likely get an interview panel tailored to them.
Preparing for Your Other Interviews
Don’t make the mistake too many people make and prepare only for coding interviews. Be sure to spend some time preparing for System Design Interviews too! After you’ve done a few practice sessions on your own, schedule a mock interview with someone who can simulate the high-pressure environment of a real interview and give you, at the end, the feedback that would normally go to the recruiter.