How To Prepare for a Site Reliability Engineer Interview [Part 1]

I've worked as an SRE for over five years now. In that time I've interviewed hundreds of candidates and been interviewed myself dozens of times. Having gone through the process so often, I've become very familiar with where candidates usually falter, and I'd like to share my roadmap for getting to an offer.

1. The Interview Process

The interview process for most companies follows a set workflow:

  • Recruiter reaches out to the candidate via LinkedIn, email, etc. The recruiter typically looks for a 30-minute block of time to discuss the candidate's work experience. If there's interest, the candidate is passed along to the first-round technical phone screen.
  • 1st Technical Phone Screen - this is almost always a coding exercise. The expectation is that good SRE candidates are proficient at solving LeetCode-style problems. In smaller companies, where there is usually no dedicated SRE team, the interviewer will most likely be a member of the software engineering team, and you will typically be held to the same hiring bar as other software engineers.
  • 2nd Technical Phone Screen - sometimes a second technical phone screen will be scheduled, covering either other SRE domain-specific topics or another LeetCode problem.
  • Onsite (Virtual or otherwise) - the onsite will be scheduled with multiple members of either the development team (smaller companies) or the SRE/DevOps team (larger companies). Depending on the maturity of the interview process, this can be more LeetCode problems, domain-specific questions about system design or scaling services, a troubleshooting simulation, or variations on these. Expect a meeting with the hiring manager where they'll ask you about projects that you've worked on and some behavioral questions.
  • Follow-up with Hiring Manager - for very senior roles there may be yet another meeting with the hiring manager to ensure expectations and fit are appropriate.

The top of the hiring funnel has the greatest number of candidates. Each step in the process eliminates a high percentage of them, and the rejection rate typically goes down as candidates progress through the funnel. Because most candidates are cut early, the best way to improve your chances is to put your effort into passing the technical phone screens.
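To make the funnel arithmetic concrete, here is a rough Python sketch. The pass rates are invented purely for illustration and are not measurements from any company; the function name is also just a placeholder.

```python
from math import prod


def offer_probability(pass_rates):
    """Chance of clearing every stage, treating stages as independent."""
    return prod(pass_rates)


# Hypothetical pass rates (made up for illustration):
# recruiter call, technical phone screen, onsite.
baseline = [0.5, 0.2, 0.4]

# Same funnel, but with the phone screen pass rate doubled through practice.
improved = [0.5, 0.4, 0.4]

print(f"baseline: {offer_probability(baseline):.2f}")
print(f"improved: {offer_probability(improved):.2f}")
```

Because the overall chance is a product of the stage pass rates, the stage where you currently pass least often is usually where practice has the most room to pay off.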

2. Preparing for Coding Technical Phone Screens

If your work experience is geared heavily toward software engineering, the technical phone screen is usually fairly easy. If, like many people, you're coming to SRE from a sysops background, it can be more of a struggle, as you likely don't have as much day-to-day experience writing software.

The technical phone screen interview (coding) is kind of a fake exercise.

Opinions may differ on this, but most of the signal from the technical phone screen is a measure of how many LeetCode problems a candidate has worked on. There is some correlation between being good at these problems and being a proficient software engineer, but in a real-life setting we're usually not trying to come up with a clever mathematical abstraction in under 45 minutes. There is a reason why Stack Overflow exists.

That said, this is the game, and if you want to progress through the funnel you'd best get good at playing it.

How do I get good?

Do lots of problems.

There's no way around it. Sign up for LeetCode and start working on problems. I'd recommend paying for Premium, but it's not necessary.

Use Python if you're not a software engineer already. The problems that you'll encounter in a phone screen are generally going to be of easy to medium difficulty. I find it helpful to start out with easy problems and learn some tricks and patterns before progressing to more difficult ones.

As you're progressing through problems, work on them for 30-45 minutes. If you haven't solved the problem by then, look at the answers. You'll find the solutions along with an explanation beside the problem if you paid for premium, but you can also go look in the attached problem discussion for a solution. Often the implementations and explanations provided in the discussion section are easier to understand.

Read the answers, then go back and try to work through the logic of the solution by manually typing the code in.

Another approach to LeetCode is to use some of their paid tracks, which teach one algorithm or problem-solving technique at a time. For technical phone screens it's less common to need to solve a particularly difficult problem. (Unless you're going for a top-tier company, in which case, yes, it will be difficult.)

Either way, the best way to get more proficient at the problems is to:

  1. see lots of problems
  2. start looking for patterns between the problems since you've done so many

Most companies add tweaks or small variations to the problems they ask, which can make a problem more difficult (or, in some cases, easier). Pattern recognition from seeing lots of problems, along with rote memorization of "how to do xyz," is the crucial thing to acquire from this routine.
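As one hedged illustration of what a "pattern" looks like, here is a minimal Python sketch of the "remember what you've already seen" trick applied to two easy-level problems. The function names and inputs are illustrative assumptions, not questions from any particular interview.

```python
def two_sum(nums, target):
    """Return indices of two numbers that add up to target, or None."""
    seen = {}  # value -> index of each number visited so far
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:
            return (seen[complement], i)
        seen[n] = i
    return None


def first_duplicate(items):
    """Return the first item that repeats, or None if all are unique."""
    seen = set()  # same idea: track what has been visited
    for item in items:
        if item in seen:
            return item
        seen.add(item)
    return None


print(two_sum([2, 7, 11, 15], 9))
print(first_duplicate([1, 2, 3, 2, 1]))
```

Both functions trade a little memory for a single pass over the input. Once you recognize that shape, a variation such as "return the values instead of the indices" is a small tweak rather than a new problem.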

3. Preparing for Operations Technical Phone Screens

If the company is bigger, they may do another "follow-up" technical phone screen, often geared toward operations-focused technical problems. You can expect these questions to be similar to "What happens when you curl Google?"

GitHub - alex/what-happens-when: An attempt to answer the age old interview question “What happens when you type google.com into your browser and press enter?”

The pivot to this type of question is supposed to catch "those kids who only know how to LeetCode," but hilariously neglects the fact that the question space is dramatically narrower than LeetCode's; anyone with a bit of foresight can read through that GitHub link and provide a solid answer (and possibly fill in a few gaps in their own knowledge at the same time).
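For a feel of the steps such an answer walks through, here is a simplified Python sketch of what an HTTP client does: parse the URL, resolve the name and open a TCP connection, negotiate TLS, send a request, and read the response. It assumes HTTPS on port 443 and skips redirects, proxies, caching, and HTTP/2, so treat it as a study aid rather than a description of how curl itself is implemented. The function names are my own placeholders.

```python
import socket
import ssl
from urllib.parse import urlsplit


def build_request(url):
    """Step 1: parse the URL and build a minimal HTTP/1.1 GET request."""
    parts = urlsplit(url)
    host = parts.hostname
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, request.encode("ascii")


def parse_status_line(response):
    """Step 5: pull the version, status code, and reason from the response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, *reason = status_line.split(" ", 2)
    return version, int(code), (reason[0] if reason else "")


def fetch(url, timeout=5.0):
    """Steps 2-4: DNS lookup + TCP connect, TLS handshake, send and receive."""
    host, request = build_request(url)
    # create_connection resolves the hostname and performs the TCP handshake.
    with socket.create_connection((host, 443), timeout=timeout) as tcp:
        context = ssl.create_default_context()
        # wrap_socket performs the TLS handshake and verifies the certificate.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            tls.sendall(request)
            chunks = []
            while True:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                chunks.append(chunk)
    return b"".join(chunks)
```

Calling `parse_status_line(fetch("https://www.google.com/"))` on a machine with network access would exercise every step. In an interview, the value is in being able to narrate what happens at each of those comments.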

Outside of these types of questions you could expect generalities like:

  • How would you monitor a web service?
  • What tools could you use to do [task]?
  • What does the /proc/ directory do?

Work experience usually makes these simple to answer; if you don't have it, make sure you're at least familiar with Linux.
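As a small hands-on example for the /proc question, here is a hedged Python sketch that reads memory statistics the kernel exposes as plain text in /proc/meminfo. The sample values below are made up, the helper names are my own, and the real file only exists on Linux.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Key: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])  # most fields are reported in kB
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read the live values from the kernel (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())


# Made-up sample in the same layout as the real file, for illustration.
SAMPLE = (
    "MemTotal:       16384000 kB\n"
    "MemAvailable:    8192000 kB\n"
    "HugePages_Total:       0\n"
)

stats = parse_meminfo(SAMPLE)
print(stats["MemAvailable"] / stats["MemTotal"])
```

The point to bring to an interview is that /proc is a virtual filesystem: nothing here is stored on disk, and reading a file asks the kernel for its current state.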

Conclusion

The technical phone screen is the more difficult hurdle to surmount when trying to get to a job offer. Spend time preparing here and you'll obtain more opportunities to interview in the onsite round, where your performance usually depends on other factors.

To be continued in the next part.