
Data Scientist interview questions

Data science interviews in 2026 are noticeably more applied than they were five years ago. SQL fluency and product sense are now table stakes; the differentiator is whether you can ship a model that actually moves a business metric, and explain why.

What interviewers look for
  • SQL fluency on real-world schemas, not toy tables
  • Statistical reasoning, especially around causality vs. correlation
  • Knowing when ML is the wrong tool
  • Communicating model output to non-technical stakeholders
  • Awareness of data quality, bias, and ethical risks

Real questions with model answers

SQL

1. Write a query to find the top 3 products by revenue each month.

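Aggregate revenue per product per month first, then rank with a window function: ROW_NUMBER() OVER (PARTITION BY month ORDER BY revenue DESC), and filter where rn ≤ 3. Mention that ties change the answer: ROW_NUMBER breaks them arbitrarily and returns at most three rows per month, while RANK or DENSE_RANK keep tied products and can return more than three. Interviewers love when you flag this unprompted.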
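
The window-function approach above can be sketched against an in-memory SQLite database. The orders table, its columns, and the sample rows are hypothetical, and the helper name top_products is made up for illustration. Window functions need SQLite 3.25 or newer, which most current Python builds bundle.

```python
import sqlite3

# Hypothetical schema: one row per order line, dated with an ISO string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, product TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2026-01-05", "A", 100.0),
        ("2026-01-09", "B", 80.0),
        ("2026-01-12", "C", 80.0),
        ("2026-01-20", "D", 50.0),
        ("2026-01-25", "A", 20.0),
        ("2026-01-28", "E", 50.0),
        ("2026-02-02", "B", 70.0),
        ("2026-02-10", "D", 90.0),
    ],
)

QUERY_TEMPLATE = """
WITH monthly AS (
    -- Step 1: total revenue per product per month.
    SELECT strftime('%Y-%m', order_date) AS month,
           product,
           SUM(revenue) AS revenue
    FROM orders
    GROUP BY strftime('%Y-%m', order_date), product
),
ranked AS (
    -- Step 2: rank products within each month.
    SELECT month, product, revenue,
           {rank_fn}() OVER (PARTITION BY month ORDER BY {order_by}) AS rnk
    FROM monthly
)
SELECT month, product, revenue
FROM ranked
WHERE rnk <= ?
ORDER BY month, rnk, product
"""


def top_products(conn, rank_fn="ROW_NUMBER", n=3):
    """Return (month, product, revenue) rows for the top n ranks per month."""
    if rank_fn not in {"ROW_NUMBER", "RANK", "DENSE_RANK"}:
        raise ValueError(f"unsupported ranking function: {rank_fn}")
    # ROW_NUMBER gets an explicit tie-breaker so the result is deterministic;
    # RANK and DENSE_RANK order by revenue only so that ties share a rank.
    order_by = "revenue DESC, product" if rank_fn == "ROW_NUMBER" else "revenue DESC"
    query = QUERY_TEMPLATE.format(rank_fn=rank_fn, order_by=order_by)
    return conn.execute(query, (n,)).fetchall()


row_number_rows = top_products(conn, "ROW_NUMBER")
dense_rank_rows = top_products(conn, "DENSE_RANK")
```

In this sample data, January has two ties (B and C at 80, D and E at 50), so ROW_NUMBER keeps three January products while DENSE_RANK keeps all five. Which behavior is correct depends on what the business means by "top 3".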

Statistics

2. How would you design an A/B test for a new homepage?

Define the primary metric. Compute the required sample size from the baseline conversion rate, the minimum detectable effect, the significance level, and the desired power. Randomize at the user level, run for at least one full business cycle, and check for novelty effects. Cover guardrail metrics, not just the primary one.
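
The sample-size step can be sketched with the standard normal-approximation formula for comparing two proportions. This is a minimal sketch, not a full power analysis: the function name, the two-sided test, and the default alpha and power values are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per arm for a two-sided two-proportion z-test.

    baseline: control conversion rate, e.g. 0.10
    mde_abs:  minimum detectable effect in absolute terms, e.g. 0.02
    """
    if not 0 < baseline < 1 or not 0 < baseline + mde_abs < 1 or mde_abs == 0:
        raise ValueError("rates must lie strictly between 0 and 1, with a nonzero effect")
    p1 = baseline
    p2 = baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical homepage test: 10% baseline conversion, detect a 2-point lift.
n_per_arm = sample_size_per_arm(baseline=0.10, mde_abs=0.02)
```

Halving the minimum detectable effect roughly quadruples the required sample, which is usually the trade-off worth saying out loud in the interview.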

ML

3. When would you choose a simple model over a complex one?

When interpretability matters (regulated industries), when training data is limited, when latency or cost is constrained, or when the simple model is within 1-2% of the complex one on the metric that matters. "Always start with the simplest thing that works" is the right framing.

Product

4. How would you measure the success of a recommendation system?

Measure at several layers: offline metrics (precision@k, recall@k), online metrics (CTR, dwell time, downstream revenue), and qualitative diversity / fairness audits. Name the trade-off: optimizing CTR can collapse diversity over time.
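
The offline metrics can be sketched for a single user as follows. The function names and the sample item IDs are hypothetical. Precision here divides by k even when fewer than k items were recommended, which is one common convention, not the only one.

```python
def hits_at_k(recommended, relevant, k):
    """Count how many of the first k recommended items are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for item in recommended[:k] if item in relevant)


def precision_at_k(recommended, relevant, k):
    """Share of the top-k slots filled with relevant items."""
    return hits_at_k(recommended, relevant, k) / k


def recall_at_k(recommended, relevant, k):
    """Share of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0  # convention chosen here: no relevant items means zero recall
    return hits_at_k(recommended, relevant, k) / len(relevant)


# Hypothetical ranked list for one user and the items they actually engaged with.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"b", "d", "z"}

p_at_3 = precision_at_k(recommended, relevant, 3)  # 1 hit in 3 slots
r_at_3 = recall_at_k(recommended, relevant, 3)     # 1 of 3 relevant items found
```

In practice these per-user values are averaged across users, and they say nothing about diversity, which is why the online and qualitative layers still matter.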

Behavioral

5. Describe a model that did not perform in production and what you did.

Distribution shift, training/serving skew, feedback loops — pick one. Be specific about how you detected it, what you changed, and what monitoring you added so it would not happen again.
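
If you pick distribution shift, it helps to name a concrete detection check. The text above does not prescribe one; the sketch below assumes the Population Stability Index (PSI) on a single numeric feature, with equal-width bins taken from the training sample. The function name, bin count, and smoothing constant are illustrative choices.

```python
from math import log


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """Compare a feature's training distribution with its serving distribution.

    expected: feature values seen at training time
    actual:   feature values seen in production
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all training values are equal

    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clip out-of-range values to the edge bins
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e_shares = bin_shares(expected)
    a_shares = bin_shares(actual)
    return sum((a - e) * log(a / e) for e, a in zip(e_shares, a_shares))


# Hypothetical feature: serving values have drifted upward by 0.3.
train_values = [i / 100 for i in range(100)]
serve_values = [v + 0.3 for v in train_values]

psi_same = population_stability_index(train_values, train_values)
psi_shifted = population_stability_index(train_values, serve_values)
```

A frequently quoted rule of thumb treats PSI above roughly 0.25 as a significant shift, but the threshold is a convention, not a guarantee; in the interview, say what alert threshold you chose and why.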

Causal Inference

6. How do you know if a feature causes a metric change or just correlates?

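A randomized experiment is the gold standard. If that is impossible, use difference-in-differences, regression discontinuity, or instrumental variables, whichever fits the natural experiment in the data. Acknowledge what each assumes: parallel trends for difference-in-differences, no manipulation of the running variable around the cutoff for regression discontinuity, and an instrument that is relevant and affects the outcome only through the treatment for instrumental variables.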
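
Of the quasi-experimental methods named above, difference-in-differences is the easiest to show in a few lines. This is the simplest two-group, two-period version with made-up numbers; the function name and the data are hypothetical, and the estimate is only meaningful if both groups would have followed the same trend without the feature.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate.

    Subtracts the control group's change from the treated group's change,
    so any trend shared by both groups cancels out.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical weekly metric for regions with and without the new feature.
treated_pre = [10.0, 12.0, 11.0]
treated_post = [15.0, 17.0, 16.0]
control_pre = [9.0, 10.0, 11.0]
control_post = [11.0, 12.0, 13.0]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
# Treated rose by 5, control by 2, so the estimated effect is 3.
```

A real analysis would add standard errors and check pre-period trends, usually through a regression with a treatment-by-period interaction term.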

Prep tip

Spend a third of your prep on SQL drills, a third on stats fundamentals, and a third on the company's actual product — interviewers reward candidates who already understand the business. Bring two real projects you can talk about deeply.
