Lessons from 5 Years in Software Engineering

2026-02-10

Engineering

I've spent five years in software, starting out as a full-stack engineer and gradually moving toward backend. I didn't plan the transition; it just happened. I found myself more drawn to the infrastructure side: distributed systems, APIs, and what happens under the hood. A few years in, backend was where most of my focus, and most of my mistakes, lived. Here's what stuck with me.

Understand the problem before picking the technology

Early in my career I'd reach for whatever was exciting. Kafka for everything. Redis everywhere. gRPC because it felt serious. It took a few production incidents to realize that every technology you add is something your team has to operate, debug, and understand at 2 a.m. when something breaks.

Before introducing anything, I ask: can a plain PostgreSQL query do this? More often than I expected, the answer is yes.
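A concrete case where the answer surprised me: a lightweight job queue, which is exactly the kind of thing that tempts people toward Kafka or Redis. Postgres handles it with one query using `FOR UPDATE SKIP LOCKED`. A minimal sketch, assuming a hypothetical `jobs` table (`id`, `payload`, `status`) and any DB-API connection such as psycopg:

```python
# Sketch of a Postgres-backed job queue. Assumes a table like:
#   jobs(id serial primary key, payload text, status text default 'queued')

CLAIM_SQL = """
UPDATE jobs
SET status = 'running'
WHERE id = (
    SELECT id FROM jobs
    WHERE status = 'queued'
    ORDER BY id
    FOR UPDATE SKIP LOCKED
    LIMIT 1
)
RETURNING id, payload;
"""

def claim_next_job(conn):
    """Atomically claim one queued job, or return None when the queue is empty.

    SKIP LOCKED means concurrent workers skip rows another worker is
    holding, so two workers never claim the same job.
    """
    with conn.cursor() as cur:
        cur.execute(CLAIM_SQL)
        row = cur.fetchone()
    conn.commit()
    return row
```

It won't match a dedicated broker at every scale, but it's one table and one query, with no new system to run.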

Design for failure, not the happy path

The happy path is easy to build. The real work is handling everything else: timeouts, queue backlogs, slow database queries, third-party services going down.

I've seen services that had no retry logic, no circuit breakers, and no fallback behavior. When something upstream hiccupped, they just… died. Building for failure isn't pessimistic; it's realistic.
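The simplest of those missing pieces is a retry with backoff. A minimal sketch (names are illustrative, not from any particular library): the delay doubles each attempt and jitter spreads out competing callers, so a retrying fleet doesn't hammer a dependency that's already struggling.

```python
import random
import time

def call_with_retries(fn, max_attempts=3, base_delay=0.1):
    """Retry a flaky call with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: fail loudly instead of hanging
            # Double the delay each attempt, plus jitter to avoid
            # synchronized retry storms across callers.
            delay = base_delay * (2 ** (attempt - 1))
            time.sleep(delay + random.uniform(0, delay))
```

In practice you'd also cap the total delay and only retry errors that are actually transient, but even this much is a big step up from no retry logic at all.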

Observability actually matters

You can't fix what you can't see. I used to underestimate how much time gets lost just figuring out what happened when something goes wrong. Good logs, metrics, and traces cut that time dramatically.

One rule I try to follow: log at boundaries (incoming requests, outgoing calls, and errors) and not everywhere. Logging too much is its own problem; it buries the useful signal.
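What "log at boundaries" looks like in practice: wrap the outgoing call once, record the name, duration, and outcome, and log nothing inside the call itself. A minimal sketch using Python's standard `logging` module (the decorator name is my own):

```python
import functools
import logging
import time

log = logging.getLogger("service")

def log_boundary(fn):
    """Log once per call at the boundary: name, duration, and outcome."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            result = fn(*args, **kwargs)
            log.info("%s ok in %.1fms", fn.__name__,
                     (time.monotonic() - start) * 1000)
            return result
        except Exception:
            # One ERROR record with the traceback, at the boundary,
            # instead of scattered logs inside the call.
            log.exception("%s failed after %.1fms", fn.__name__,
                          (time.monotonic() - start) * 1000)
            raise
    return wrapper
```

One log line per call tells you what ran, how long it took, and whether it worked, which is usually exactly what you need at 2 a.m.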

Boring APIs are good APIs

Consistent field names, predictable error codes, versioning from day one. That's it. The goal is that someone reading your API contract never has to guess.

I've inherited APIs that returned 200 OK for error responses. I've seen endpoints where the same field had three different names depending on which route you called. It's more common than it should be.
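The fix for both problems is a single error envelope that every endpoint uses, with the HTTP status always matching the outcome. A minimal sketch; the envelope shape and helper name are my own convention, not any particular framework's:

```python
from http import HTTPStatus

def error_response(status: HTTPStatus, code: str, message: str) -> tuple[int, dict]:
    """Build a uniform error body so clients only ever parse one shape.

    The status is always a real error status: no more 200 OK with an
    error hidden in the body.
    """
    assert status >= 400, "error responses must use an error status, never 200"
    return status.value, {
        "error": {
            "code": code,        # stable, machine-readable identifier
            "message": message,  # human-readable detail, free to change
        }
    }
```

The split between `code` and `message` is the important design choice: clients branch on the stable `code`, while the `message` stays free to change without breaking anyone.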

Read the code you depend on

Not all of it, but the parts that matter for your use case. I've spent hours debugging issues that could've been avoided if I'd just read how the library actually worked instead of assuming it matched the docs. This goes especially for connection pools, ORM internals, and anything that does retries on your behalf.

The best code is the code you don't write

Every line is something that can break, something that needs to be read by the next person, something that might need to change. The solutions I'm most happy with are usually the ones where I ended up deleting more than I added.

Always have a fallback

This one I learned the hard way. No matter how reliable you think a dependency is (a third-party API, an internal service, a cache layer), assume it will fail at some point. What does your system do when it does?

Sometimes a fallback is returning cached data. Sometimes it's a degraded response that still gives the user something useful. Sometimes it's just failing fast and clearly instead of hanging for 30 seconds before timing out. The specific answer depends on the context, but having no answer is never the right choice.
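Those three options can live in one small function: try the live source with a tight deadline, fall back to the last good value, and fail fast and clearly when there's nothing to fall back to. A minimal sketch with illustrative names; the `(data, is_fresh)` return shape is my own convention:

```python
def get_prices(fetch_live, cache, timeout=0.5):
    """Fetch live data with a tight deadline, falling back to cached data.

    Returns (data, is_fresh) so a degraded response is explicit to the
    caller rather than silently stale.
    """
    try:
        data = fetch_live(timeout=timeout)  # fail fast, don't hang for 30s
        cache["prices"] = data              # refresh the fallback for next time
        return data, True
    except Exception:
        if "prices" in cache:
            return cache["prices"], False   # degraded, but still useful
        raise                               # no fallback: fail fast and clearly
```

Returning the freshness flag instead of hiding it is deliberate; the caller can show a "data may be out of date" notice instead of pretending nothing happened.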

The worst incidents I've been part of were ones where one thing went down and took five other things with it, purely because nobody had thought about what happens when that dependency is unavailable. A bit of thinking upfront saves a lot of pain later.


None of this is groundbreaking, and I knew some of it before I started. But there's a difference between knowing something and having actually felt the pain of ignoring it. The second kind sticks around longer.