SQL
1. Write a query to find the top 3 products by revenue each month.
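Aggregate revenue per product per month first, then apply ROW_NUMBER() OVER (PARTITION BY month ORDER BY revenue DESC) in a CTE or subquery and filter on rn <= 3 in the outer query, since a window function cannot be filtered in the same WHERE clause. Mention that ROW_NUMBER breaks ties arbitrarily and returns at most three rows per month, while RANK or DENSE_RANK can return more than three when revenues tie. Interviewers love it when you flag this unprompted.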
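A minimal sketch of that query, run through Python's built-in sqlite3 module so it is self-contained. The sales table, its columns, and the sample rows are all hypothetical, and it assumes the bundled SQLite is 3.25 or newer (required for window functions).

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (product TEXT, month TEXT, amount REAL);
INSERT INTO sales VALUES
  ('A', '2024-01', 100), ('A', '2024-01', 50),
  ('B', '2024-01', 120), ('C', '2024-01', 120),
  ('D', '2024-01', 90),
  ('A', '2024-02', 30),  ('B', '2024-02', 80);
""")

QUERY = """
WITH monthly AS (            -- step 1: revenue per product per month
  SELECT month, product, SUM(amount) AS revenue
  FROM sales
  GROUP BY month, product
),
ranked AS (                  -- step 2: rank within each month
  SELECT month, product, revenue,
         ROW_NUMBER() OVER (PARTITION BY month
                            ORDER BY revenue DESC, product) AS rn,
         DENSE_RANK() OVER (PARTITION BY month
                            ORDER BY revenue DESC) AS dr
  FROM monthly
)
SELECT month, product, revenue   -- step 3: filter in the outer query
FROM ranked
WHERE {rank_col} <= 3
ORDER BY month, revenue DESC, product
"""

def top_products(rank_col):
    # rank_col is one of the two fixed column names above, never user input.
    assert rank_col in ("rn", "dr")
    return conn.execute(QUERY.format(rank_col=rank_col)).fetchall()

by_row_number = top_products("rn")
by_dense_rank = top_products("dr")
```

In this sample, B and C tie in January, so the DENSE_RANK version also lets D through while the ROW_NUMBER version cuts off at three rows. The extra `product` key in the ROW_NUMBER ordering is a design choice that makes the tie-break deterministic.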
Statistics
2. How would you design an A/B test for a new homepage?
Define the primary metric, compute the required sample size from the baseline conversion rate, the minimum detectable effect, the significance level, and the desired power, randomize at the user level, run for at least one full business cycle, and check for novelty effects. Cover guardrail metrics, not just the primary one.
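The sample-size step can be sketched with the standard normal-approximation formula for comparing two proportions. This is one common approximation, not the only valid one; the function name, the 10% baseline, and the 2-point lift are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(baseline, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per arm for a two-sided two-proportion test.

    baseline: control conversion rate, e.g. 0.10
    mde_abs:  minimum detectable effect as an absolute lift, e.g. 0.02
    """
    p1 = baseline
    p2 = baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of per-arm variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# Hypothetical inputs: 10% baseline conversion, detect a lift to 12%.
n = sample_size_per_arm(0.10, 0.02)
n_smaller_effect = sample_size_per_arm(0.10, 0.01)
```

With these inputs the result is a few thousand users per arm, and halving the detectable effect roughly quadruples it, which is the point worth making out loud in the interview.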
ML
3. When would you choose a simple model over a complex one?
When interpretability matters (regulated industries), when training data is limited, when latency or cost is constrained, or when the simple model is within 1-2% of the complex one on the metric you care about. "Always start with the simplest thing that works" is the right framing.
Product
4. How would you measure the success of a recommendation system?
Multi-layer: offline metrics (precision@k, recall), online metrics (CTR, dwell time, downstream revenue), and qualitative diversity / fairness audits. Name the trade-off: optimizing CTR can collapse diversity over time.
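For the offline layer, precision@k and recall@k are easy to sketch by hand. The function names and the toy item IDs below are hypothetical; real evaluations usually average these over many users.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(relevant)

# Toy example for one user: ranked recommendations vs. items they engaged with.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"b", "e", "z"}

p3 = precision_at_k(recommended, relevant, 3)  # 1 hit in top 3
r3 = recall_at_k(recommended, relevant, 3)     # 1 of 3 relevant items found
p5 = precision_at_k(recommended, relevant, 5)  # 2 hits in top 5
r5 = recall_at_k(recommended, relevant, 5)     # 2 of 3 relevant items found
```

Neither metric says anything about diversity or long-term engagement, which is why the online and qualitative layers are still needed.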
Behavioral
5. Describe a model that did not perform in production and what you did.
Distribution shift, training/serving skew, feedback loops — pick one. Be specific about how you detected it, what you changed, and what monitoring you added so it would not happen again.
Causal Inference
6. How do you know if a feature causes a metric change or just correlates?
Randomized experiment is the gold standard. If that is impossible, use difference-in-differences, regression discontinuity, or instrumental variables — whichever fits the natural experiment in the data. Acknowledge what each assumes.
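As a worked example of one of these methods, difference-in-differences reduces to simple arithmetic on four group means. The numbers below are made up, and the estimate is only causal under the parallel-trends assumption (absent the feature, both groups would have moved by the same amount).

```python
from statistics import mean

def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change

# Hypothetical metric values per period for each group.
treated_before = [10.0, 11.0, 9.0]    # mean 10.0
treated_after = [13.0, 14.0, 12.0]    # mean 13.0
control_before = [9.0, 10.0, 8.0]     # mean 9.0
control_after = [10.0, 11.0, 10.5]    # mean 10.5

effect = diff_in_diff(treated_before, treated_after,
                      control_before, control_after)
```

Here the treated group rose by 3.0 and the control group by 1.5, so the estimated effect is 1.5; the control group's change stands in for what would have happened anyway.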