AI/ML Technical Interview Questions

Sample questions organized by role — with evaluation context explaining what each question is designed to surface.

AI Engineer

1. Walk me through how you would design a RAG pipeline for a production search feature.

Senior

Tests system design thinking, knowledge of retrieval architecture, and understanding of the trade-offs between precision and recall in a real product context.

2. How do you evaluate whether a prompt change has actually improved output quality?

Any level

Surfaces whether the candidate thinks rigorously about evaluation, or just uses subjective impressions — a key signal for AI engineers who ship changes at scale.

3. What are the main failure modes you have seen in LLM-powered applications, and how did you address them?

Senior

Tests production experience and the ability to diagnose real-world AI system failures, not just academic knowledge of LLMs.

4. How would you decide whether to fine-tune a model versus relying on prompting or RAG?

Any level

Reveals depth of understanding of the trade-offs between fine-tuning cost, data requirements, and the flexibility of retrieval-based approaches.

5. Describe a time you reduced latency or cost in an LLM-based system without meaningfully degrading quality.

Senior

Tests practical optimization experience — a strong AI engineer has navigated the cost and speed constraints that real LLM applications face.

6. How do you handle context window limitations in a document processing pipeline?

Mid-level

Tests practical knowledge of chunking strategies, summarization trade-offs, and retrieval approaches — a common real-world challenge.
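
As a concrete illustration of one chunking strategy a strong answer might describe, here is a minimal sliding-window chunker with overlap. This is a sketch, not a reference implementation: it approximates tokens by whitespace-separated words, whereas a production pipeline would count tokens with the target model's tokenizer.

```python
def chunk_text(text, max_tokens=200, overlap=50):
    """Split text into overlapping chunks so no chunk exceeds the
    context budget; the overlap preserves continuity across chunk
    boundaries so retrieval does not lose sentences split in half."""
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks
```

A good candidate can also explain when this simple approach fails (tables, code blocks, semantic boundaries) and what they would use instead.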

7. What does good guardrail design look like for an LLM feature in a customer-facing product?

Senior

Assesses the candidate's understanding of AI safety, output filtering, and the operational concerns that matter when AI systems interact with real users.

8. How would you set up observability for an LLM-powered feature in production?

Any level

Tests whether the candidate thinks like a production engineer — tracing, logging, latency tracking, and anomaly detection are all relevant here.
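
One minimal pattern a candidate might sketch: wrap every model call so it emits a structured log event with a trace id, latency, and outcome. The decorator and field names below are illustrative, not a specific vendor's API.

```python
import json
import logging
import time
import uuid

logger = logging.getLogger("llm_observability")

def observe_llm_call(fn):
    """Decorator that emits one structured log event per model call,
    giving downstream dashboards latency, success/failure, and a
    trace id to correlate with the rest of the request."""
    def wrapper(*args, **kwargs):
        event = {"trace_id": str(uuid.uuid4()), "fn": fn.__name__}
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
            event["status"] = "ok"
            return result
        except Exception as exc:
            event["status"] = "error"
            event["error"] = repr(exc)
            raise
        finally:
            event["latency_ms"] = round((time.perf_counter() - start) * 1000, 2)
            logger.info(json.dumps(event))
    return wrapper
```

Stronger answers go beyond logging: sampling outputs for quality review, tracking token costs per feature, and alerting on latency or refusal-rate anomalies.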

ML Engineer

1. Walk me through how you would design an end-to-end ML pipeline for a recommendation system.

Senior

Tests system design capability, data flow understanding, and awareness of the operational concerns that arise in production ML — not just model design.

2. How do you detect and respond to feature drift in a production model?

Any level

Reveals whether the candidate has operated ML systems in the real world, where data distributions shift and models degrade over time.
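
A common concrete technique candidates mention here is the Population Stability Index. The sketch below is one simple formulation, assuming numeric features and a binned comparison against the training baseline; the thresholds in the docstring are a widely used rule of thumb, not a standard.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare a feature's serving-time distribution against its
    training baseline. Rule of thumb: PSI < 0.1 stable, 0.1-0.25
    moderate shift, > 0.25 significant drift worth investigating."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_counts, _ = np.histogram(expected, bins=edges)
    a_counts, _ = np.histogram(actual, bins=edges)
    # Smooth to avoid division by zero in empty bins.
    e_pct = (e_counts + 1e-6) / (e_counts.sum() + 1e-6 * bins)
    a_pct = (a_counts + 1e-6) / (a_counts.sum() + 1e-6 * bins)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))
```

The better signal in an interview is not the formula itself but what the candidate does with the alert: triage whether the shift is upstream data breakage, seasonality, or genuine distribution change.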

3. How would you decide when a model needs to be retrained versus when it is a data issue?

Senior

Tests diagnostic reasoning and the ability to distinguish between model degradation, data pipeline failures, and distribution shifts.

4. Describe how you have managed experiments across multiple modeling approaches for the same problem.

Any level

Surfaces experiment discipline — good ML engineers track experiments systematically rather than relying on memory or scattered notebooks.

5. What are the trade-offs between batch inference and real-time inference, and how did you decide for a specific project?

Any level

Tests practical infrastructure awareness and the ability to reason about latency, throughput, and operational complexity in context.

6. How do you validate a model before deploying it to production?

Mid-level

Assesses rigor around offline evaluation, shadow mode testing, canary deployments, and online metric monitoring — the full deployment safety chain.
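
To make the shadow-mode step concrete, here is a minimal sketch of the comparison loop a candidate might describe: the production model's answer is served, the candidate model's answer is only logged, and disagreement is measured before promotion. The function names and the single `tolerance` threshold are illustrative simplifications.

```python
def shadow_compare(candidate_predict, production_predict,
                   requests, tolerance=0.02):
    """Run a candidate model in shadow mode: serve production's
    answer, record the candidate's, and measure disagreement to
    decide whether the candidate is safe to promote."""
    disagreements = 0
    for req in requests:
        served = production_predict(req)
        shadowed = candidate_predict(req)  # logged, never served
        if served != shadowed:
            disagreements += 1
    rate = disagreements / max(len(requests), 1)
    return {"disagreement_rate": rate, "safe_to_promote": rate <= tolerance}
```

A strong answer also notes what raw disagreement cannot tell you: the candidate may disagree because it is better, which is why shadow results feed into labeled review or an online canary rather than an automatic promotion.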

7. Walk me through a feature engineering decision that had a meaningful impact on model performance.

Any level

Tests practical feature engineering judgment, not just knowledge of techniques — strong ML engineers can explain why a specific feature choice mattered.

8. How would you build a training pipeline that is reproducible and auditable six months later?

Senior

Surfaces awareness of versioning, artifact tracking, and the operational discipline needed to reproduce ML results in a real team environment.

Data Scientist

1. Walk me through how you would design an A/B test for a new feature with a small user base.

Any level

Tests experiment design fundamentals — power analysis, metric selection, and the ability to reason about statistical validity when sample size is constrained.
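
The power-analysis step can be made concrete with a per-arm sample-size calculation for a two-proportion test. This sketch uses the standard normal-approximation formula with only the Python standard library; it assumes equal arms and an absolute minimum detectable effect.

```python
from math import sqrt
from statistics import NormalDist

def required_sample_size(p_base, mde, alpha=0.05, power=0.8):
    """Per-arm sample size for a two-proportion z-test (normal
    approximation). p_base is the baseline conversion rate and
    mde is the absolute lift you need to be able to detect."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_treat = p_base + mde
    p_bar = (p_base + p_treat) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p_base * (1 - p_base)
                                 + p_treat * (1 - p_treat))) ** 2
    return int(numerator / mde ** 2) + 1
```

With a small user base the interesting part of the answer is what the candidate does when this number is unreachable: relax the MDE, lengthen the test, use variance-reduction techniques, or choose a more sensitive metric.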

2. How do you communicate a null result to a stakeholder who expected a positive outcome?

Any level

Surfaces communication maturity and the ability to present uncertainty and negative findings honestly — a key skill in a business-facing data science role.

3. Describe a time you found a significant issue in a dataset that changed the direction of an analysis.

Any level

Tests data quality awareness and intellectual honesty — strong data scientists catch issues before they become misleading conclusions.

4. How do you decide between a simple model and a complex one for a business prediction task?

Mid-level

Reveals the candidate's instinct for interpretability, maintenance cost, and the actual business need — not just model performance metrics in isolation.

5. What does good experiment design look like when you cannot randomize perfectly?

Senior

Tests understanding of quasi-experimental methods, causal inference approaches, and the practical limitations of real-world data — advanced signal for senior roles.

6. How would you identify whether a metric decline is a real business problem or a measurement artifact?

Any level

Assesses analytical rigor and the ability to distinguish signal from noise — a common challenge when dashboards show unexpected drops.

7. How do you handle a situation where your analysis contradicts the intuition of a senior stakeholder?

Senior

Surfaces intellectual confidence and communication judgment — the ability to advocate for data-driven conclusions while remaining open to new information.

8. Describe how you structure an exploratory data analysis when you are unfamiliar with a new dataset.

Mid-level

Tests systematic thinking, data intuition, and whether the candidate starts with questions and hypotheses rather than jumping straight to techniques.

Want a complete evaluation framework?

These sample questions are a starting point. Our full interview process includes calibrated scoring anchors, role-specific competency frameworks, and evidence-based evaluation criteria — not just a question list.

Buy an interview

Ready to hire with more confidence?

Get a structured technical evaluation delivered by a practitioner who knows the domain — not a generic screener.