Karat

Karat Software Engineer Interview Questions

31+ questions from real Karat Software Engineer interviews, reported by candidates.

31
Questions
2
Round Types
8
Topic Areas
2020-2026
Year Range

Round Types

Coding 4 OA 3

Top Topics

Questions

Hello guys, I applied to be a karat interviewer some time ago and they've finally reached out. The issue is I've been on a break and I am really rusty for interviews. Has anyone been through their JS

I just completed a karat technical coding interview for a unicorn company. I prepared only leetcode questions for this interview and felt pretty confident going in. However, the interview was going ov

Hi all, The company I'm interviewing with is using [Karat](https://karat.com/) to conduct the technical assessment. From what I can gather, this is what I'm in for: * Timed (1 hour) * Live video with

LeetCode #2133: Check if Every Row and Column Contains All Numbers. Difficulty: Easy. Topics: Array, Hash Table, Matrix. Asked at Karat in the last 6 months.

LeetCode #1160: Find Words That Can Be Formed by Characters. Difficulty: Easy. Topics: Array, Hash Table, String, Counting. Asked at Karat in the last 6 months.

LeetCode #68: Text Justification. Difficulty: Hard. Topics: Array, String, Simulation. Asked at Karat in the last 6 months.

LeetCode #383: Ransom Note. Difficulty: Easy. Topics: Hash Table, String, Counting. Asked at Karat in the last 6 months.

## Problem Schedule academic classes or events avoiding time conflicts, likely a greedy interval scheduling or graph coloring problem. ## Tags arrays, sorting, greedy

## Problem Process or find specific endings of book titles or text entries, likely involving string matching or suffix analysis. ## Tags strings, arrays, sorting

## Problem You have `n` camping sites, each with a maximum capacity. You have `m` groups of campers, each with a size and an ordered list of preferred site IDs. Assign each group to a site such that: - The site's remaining capacity accommodates the group size. - Each group gets the highest-ranked available preferred site. Return a dict mapping group ID to assigned site ID. If a group cannot be placed, map it to `null`. ```python def assign_sites(sites: dict[int, int], groups: list[dict]) -> dict[int, int | None]: # sites: {site_id: capacity} # groups: [{"id": int, "size": int, "preferences": [site_id, ...]}] pass ``` **Example:** ``` sites = {1: 4, 2: 2, 3: 6} groups = [ {"id": 10, "size": 3, "preferences": [2, 1, 3]}, {"id": 11, "size": 2, "preferences": [1, 3]} ] Group 10: pref site 2 (cap=2) cannot fit size 3 -> try site 1 (cap=4) -> fits. Assign 10->1. Group 11: pref site 1 (remaining=1) cannot fit size 2 -> try site 3 (cap=6) -> fits. Assign 11->3. Output: {10: 1, 11: 3} ``` ## Follow-ups 1. How would you handle groups that must be placed together on the same site? 2. If groups can be split across sites, how does the problem change? 3. Model this as a stable matching / assignment problem. When is greedy suboptimal? 4. How would you make assignments fair when multiple groups want the same top-ranked site?

## Problem You are given a list of score submissions. Each submission has a `player_id`, `score`, and `timestamp`. A submission is suspicious if the player's score increased by more than `max_delta` points compared to their previous submission. Return a list of suspicious `(player_id, score, timestamp)` tuples, sorted by timestamp ascending. ```python def find_suspicious( submissions: list[dict], max_delta: int ) -> list[tuple]: # submissions: [{"player_id": str, "score": int, "timestamp": int}] pass ``` **Example:** ``` submissions = [ {"player_id": "p1", "score": 100, "timestamp": 1}, {"player_id": "p1", "score": 102, "timestamp": 2}, {"player_id": "p1", "score": 200, "timestamp": 3}, {"player_id": "p2", "score": 50, "timestamp": 4} ] max_delta = 10 p1 goes 100->102 (delta=2, ok) then 102->200 (delta=98, suspicious). Output: [("p1", 200, 3)] ``` ## Follow-ups 1. How would you handle submissions arriving out of order? How does this affect your per-player last-seen-score tracking? 2. What if a player can submit from multiple devices simultaneously? How does your model change? 3. Add a second rule: flag any player whose score exceeds the 99th percentile of all scores in a rolling 1-hour window. 4. How would you design this as a streaming pipeline (Kafka + Flink) at 500K submissions/second?

## Problem Implement encode and decode for two ciphers: **Part 1 - Caesar Cipher:** Shift each letter in `text` by `shift` positions in the alphabet (wrap around). Non-letter characters are unchanged. Decoding shifts in the opposite direction. ```python def caesar_encode(text: str, shift: int) -> str: ... def caesar_decode(text: str, shift: int) -> str: ... ``` **Example:** ``` caesar_encode("Hello, World!", 3) -> "Khoor, Zruog!" caesar_decode("Khoor, Zruog!", 3) -> "Hello, World!" ``` **Part 2 - Vigenere Cipher:** Use a repeating keyword. Each letter is shifted by the corresponding keyword letter's position (a=0, b=1, ...). ```python def vigenere_encode(text: str, key: str) -> str: ... def vigenere_decode(text: str, key: str) -> str: ... ``` **Example:** ``` vigenere_encode("ATTACKATDAWN", "LEMON") A+L=L, T+E=X, T+M=F, A+O=O, C+N=P, K+L=V, ... -> "LXFOPVEFRNHR" ``` ## Follow-ups 1. Why is the Caesar cipher trivially broken? How many keys exist? 2. How would you perform a frequency-analysis attack on a Caesar-encoded message? 3. What is the index of coincidence, and how does it help determine the key length of a Vigenere cipher? 4. How does your implementation handle mixed-case text and Unicode input?

## Problem A delivery robot moves on an `m x n` grid. It starts at `(0, 0)` and must reach `(m-1, n-1)`. Some cells are blocked (obstacles). The robot can move up, down, left, or right one cell per step. Find the shortest path (minimum steps). If no path exists, return -1. If a path exists, also return one valid path as a list of (row, col) tuples. ```python def shortest_delivery( grid: list[list[int]] # 0=open, 1=obstacle ) -> tuple[int, list[tuple[int, int]]]: # returns (steps, path) or (-1, []) pass ``` **Example:** ``` grid = [ [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0] ] Output: (5, [(0,0),(0,1),(1,1),(1,2),(2,3)]) -- one valid path ``` ## Follow-ups 1. Why is BFS guaranteed to find the shortest path on an unweighted grid? 2. How would you modify this for a weighted grid where each cell has a traversal cost? (Dijkstra) 3. What if the robot can break through at most `k` obstacles? How does your state space change? 4. If the grid is dynamically updated (obstacles appear/disappear), how would you replan efficiently?

## Problem Parse and analyze domain names from URLs, extracting components such as TLD, subdomain, and root domain. ## Tags strings, arrays, parsing

## Problem Traverse a generational graph (e.g., family tree or dependency graph) to find relationships or generations between nodes. ## Tags graph, BFS, DFS

## Problem Implement the core logic for a two-player turn-based game on a 3x3 grid (like Tic-Tac-Toe generalized to N x N). ```python class MiniGame: def __init__(self, n: int): # n x n board, players are 1 and 2 pass def make_move(self, player: int, row: int, col: int) -> str: # returns "continue", "win", "draw", or "invalid" pass def get_board(self) -> list[list[int]]: pass ``` **Example:** ``` game = MiniGame(3) game.make_move(1, 0, 0) -> "continue" game.make_move(2, 1, 1) -> "continue" game.make_move(1, 0, 1) -> "continue" game.make_move(2, 2, 0) -> "continue" game.make_move(1, 0, 2) -> "win" (player 1 fills row 0) ``` ## Follow-ups 1. How do you check for a win in O(1) per move using running row/column/diagonal counts? 2. How would you implement an AI opponent using minimax with alpha-beta pruning? 3. Extend to a 4-in-a-row (Connect Four) variant on a 6x7 grid with gravity. What changes? 4. How would you serialize and restore game state so players can pause and resume?

## Problem You have a collection of cards. Each card has a `mana_cost` and an `attack_power`. You can play cards whose total mana cost does not exceed `max_mana`. Maximize total attack power. Return the maximum total attack power achievable. ```python def max_attack(cards: list[dict], max_mana: int) -> int: # cards: [{"mana_cost": int, "attack_power": int}] pass ``` **Example:** ``` cards = [ {"mana_cost": 3, "attack_power": 7}, {"mana_cost": 2, "attack_power": 5}, {"mana_cost": 4, "attack_power": 8}, {"mana_cost": 1, "attack_power": 3} ] max_mana = 5 Play card 1 (cost=3, atk=7) + card 4 (cost=1, atk=3) = mana 4, atk=10 Play card 2 (cost=2, atk=5) + card 4 (cost=1, atk=3) = mana 3, atk=8 Play card 3 (cost=4, atk=8) + card 4 (cost=1, atk=3) = mana 5, atk=11 Output: 11 ``` ## Approach This is the 0/1 knapsack problem. Each card can be played at most once. ```python dp = [0] * (max_mana + 1) for card in cards: for m in range(max_mana, card["mana_cost"] - 1, -1): dp[m] = max(dp[m], dp[m - card["mana_cost"]] + card["attack_power"]) return dp[max_mana] ``` ## Follow-ups 1. What is the time and space complexity of this DP solution? 2. What if cards can be played multiple times (unbounded knapsack)? How does the DP direction change? 3. Add a constraint: at most 3 cards total. How do you modify the DP state? 4. If max_mana is 10^9, the DP table is too large. What alternative approach would you use?

## Round 1 - ML / System Design Design a movie recommendation system for a streaming platform with 10M users and 100K movies. **Approach 1 - Collaborative Filtering (user-based):** - Build a user-movie rating matrix (sparse). - For user U, find the top-k most similar users by cosine similarity of their rating vectors. - Recommend movies highly rated by similar users that U has not seen. **Approach 2 - Matrix Factorization (ALS/SVD):** - Decompose the sparse matrix into user factors (U, d) and item factors (V, d). - Predicted rating: `r_ui = U_u . V_i` - Train with alternating least squares or SGD. ```python def cosine_similarity(vec_a: list[float], vec_b: list[float]) -> float: dot = sum(a*b for a,b in zip(vec_a, vec_b)) norm_a = sum(a**2 for a in vec_a) ** 0.5 norm_b = sum(b**2 for b in vec_b) ** 0.5 return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0 ``` ## Follow-ups 1. How do you handle the cold start problem for a new user with no ratings? 2. What is the difference between explicit feedback (star ratings) and implicit feedback (watch time)? 3. How do you evaluate your recommender offline before A/B testing? What metrics do you use? 4. How does your system handle a movie released today with no ratings yet?

## Problem A museum has `n` exhibits. Each exhibit has a `start_time`, `end_time`, and `value`. You can visit at most one exhibit at a time and cannot overlap. Maximize the total value of exhibits visited. ```python def max_value_visits(exhibits: list[dict]) -> int: # exhibits: [{"start": int, "end": int, "value": int}] pass ``` **Example:** ``` exhibits = [ {"start": 1, "end": 3, "value": 5}, {"start": 2, "end": 5, "value": 6}, {"start": 4, "end": 6, "value": 4}, {"start": 6, "end": 8, "value": 3}, {"start": 1, "end": 8, "value": 9} ] Option A: exhibit 0 (val=5) + exhibit 2 (val=4) + exhibit 3 (val=3) = 12 Option B: exhibit 4 alone (val=9) Option C: exhibit 1 (val=6) + exhibit 3 (val=3) = 9 Output: 12 ``` ## Approach Sort by end time. Use DP: `dp[i]` = max value using exhibits 0..i where i is included. Use binary search to find the latest non-overlapping exhibit. ## Follow-ups 1. What is the time complexity of this DP with binary search? 2. If all exhibits have equal value, does this reduce to the classic activity selection problem? What is the greedy solution? 3. What if you can visit at most `k` exhibits regardless of value — how does the DP change? 4. Extend to multiple visitors who can attend different exhibits simultaneously.

## Problem You are given a sequence of events: `{entity_id, checkpoint_id, timestamp}`. An entity "completes the passage" if it passes through all required checkpoints in the correct order (by timestamp). Given the list of events and an ordered list of required checkpoints, return all entity IDs that completed the passage, sorted ascending. ```python def completed_entities( events: list[dict], checkpoints: list[str] ) -> list[str]: # events: [{"entity_id": str, "checkpoint_id": str, "timestamp": int}] pass ``` **Example:** ``` checkpoints = ["A", "B", "C"] events = [ {"entity_id": "e1", "checkpoint_id": "A", "timestamp": 1}, {"entity_id": "e1", "checkpoint_id": "B", "timestamp": 3}, {"entity_id": "e2", "checkpoint_id": "A", "timestamp": 2}, {"entity_id": "e1", "checkpoint_id": "C", "timestamp": 5}, {"entity_id": "e2", "checkpoint_id": "C", "timestamp": 4} -- e2 skips B ] Output: ["e1"] ``` ## Follow-ups 1. What is the time complexity of your approach? 2. How would you handle an entity that passes through a checkpoint multiple times? 3. If checkpoints can be passed in any order (set membership, not sequence), how does your solution simplify? 4. How would you scale this to 100M events per day using a streaming pipeline?

What Karat Looks for in Software Engineer Interviews

Karat Software Engineer interviews are calibrated against the level and scope expected of the role. Across 31+ verified candidate reports on LeakCode, the consistent signals interviewers look for: clear problem decomposition before coding, explicit complexity reasoning, structured handling of edge cases, and the ability to articulate trade-offs between two reasonable approaches.

The discriminator between candidates who advance and candidates who do not is rarely the final correctness of the solution. It is the path to the solution: did you ask clarifying questions, did you state your approach before coding, did you handle edge cases without prompting, and did you communicate your reasoning throughout. Reports tagged "no hire" frequently cite a working solution with poor communication; reports tagged "strong hire" cite clear thinking even when the final solution was incomplete.

How To Use This Question Set

Real interview reports are a calibration tool, not a memorization target. Companies update their question pools every 2-4 months; memorizing exact problems risks misleading you when the interviewer uses a variant. The high-leverage use: identify the patterns that appear repeatedly in Karat Software Engineer reports, practice those patterns on similar (not identical) problems, and use the reports to understand the interviewer's typical follow-up depth.

Filter the questions below by round type, difficulty, and recency. Focus first on reports from the past 6-12 months; older reports may reference questions that have since rotated out of Karat's pool. Reports tagged with quantified difficulty (e.g., "medium-hard") are higher-signal than reports without difficulty tags.

Round-by-Round Expectations

Karat Software Engineer loops typically span 4-6 rounds across phone screens and on-site or virtual on-site interviews. The structure varies by company: some run 1 recruiter screen + 1 technical phone + 3-4 on-site rounds; others run 1 recruiter screen + 1 OA + 4-5 on-site rounds. The recruiter screen is logistics and culture-light; the technical phone screen is medium-difficulty coding; the on-site loop covers coding, system design (at L4+ levels), and behavioral rounds.

Each round is designed to surface a specific signal. Coding rounds: correctness, code quality, complexity reasoning, communication. System design rounds: requirements clarification, design judgment, operational thinking. Behavioral rounds: ownership scope, leadership, ambiguity tolerance, conflict navigation. Strong candidates explicitly hit each signal dimension out loud during the round; weak candidates focus only on solving the prompt.

Common Interview Mistakes At This Combination

Reports tagged "no hire" at Karat Software Engineer commonly cite: jumping into code without clarifying requirements, coding silently for 10+ minutes without verbalizing approach, missing edge cases (empty input, single element, very large input, overflow), and producing a working solution that the candidate cannot explain or refactor when probed. Strong candidates avoid these patterns by following a consistent template: clarify, verbalize approach, code with narration, test with examples.

Behavioral and design rounds have their own failure modes. Behavioral: stories that use "we" instead of "I" diluting individual signal, stories with no quantified outcome, defensiveness when probed about failure. Design: not asking clarifying questions, not stating requirements out loud, designing for a single server when the prompt clearly implies scale, ignoring operational concerns (deployment, monitoring, rollback). These show up in roughly half of Karat Software Engineer interview retrospectives on LeakCode.

See All 31 Karat Software Engineer Questions

Full question text, answer context, and frequency data for subscribers.

Get Access