Claude Code after a year: an honest review

APR 18, 2026 · 7 min read

I've been using Claude Code as my primary coding environment for just over a year. Before that, Cursor. Before that, Copilot glued into VS Code with a lot of hope and a lot of tab-accepting. I didn't switch because Claude Code was hyped. I switched because I kept losing afternoons to tools that felt clever but couldn't hold a thought across more than two files.

This is the review I wish existed when I was deciding. No hype. Real tradeoffs. What I actually use it for, and what I still open Cursor for.

The workflow that actually changed my life

Before Claude Code, my prompts lived in scratch buffers, Notion, half-written GitHub issues. They'd drift. I'd re-explain the same context three times a week.

The shift was stupidly simple: I started writing prompts as .md files in the repo. prompts/refactor-auth.md. prompts/ship-billing-v2.md. Each one is a small essay about what I want, what's in scope, what's out, and what done looks like. Then I hand the file to Claude Code and it goes.

This sounds trivial. It's not. Three things happen when your prompts are files:

  1. They get versioned. I can see how a spec evolved alongside the code it produced. Git blame on a prompt file is one of the more satisfying things in software.
  2. They become reusable. The prompt I used to wire up Stripe in one project is 80% the same prompt in the next. Copy, tweak, run.
  3. They force clarity. Writing a good prompt file is the same skill as writing a good PR description. If I can't describe what I want, I can't expect the model to.

I now treat these prompt files as first-class project artifacts. I commit them. I review them. Occasionally, my teammates send PRs on my prompts before they send PRs on my code. That felt weird for about a week and then felt obviously correct.
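For what it's worth, my prompt files all follow roughly the same shape. This skeleton is illustrative only — the section names are my own convention, not anything Claude Code requires, and the task details here are made up:

```markdown
<!-- prompts/refactor-auth.md — illustrative skeleton; headings are my convention -->
# Refactor auth middleware

## Goal
Replace the session-cookie check in the auth middleware with JWT verification.

## In scope
- The middleware itself and its unit tests.

## Out of scope
- Login/logout routes. Do not touch them.
- Any refactor I didn't ask for.

## Done looks like
- All existing auth tests pass.
- No files changed outside the middleware directory.
```

The "Out of scope" and "Done looks like" sections do most of the work. They're the parts a teammate will push back on in review, which is exactly the point.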

Long context is the feature I didn't know I needed

The marketing line is "1M tokens of context" and the marketing line undersells it. The actual experience is: I can dump a 40-file directory into context and have a reasonable conversation about it.

Concrete example. Last month I was untangling a migration that touched our ORM layer, three service files, a bunch of Zod schemas, and the GraphQL resolvers. Old workflow in Cursor: paste four files, ask, get a decent answer that hallucinates the fifth file I didn't paste. New workflow in Claude Code: @include the whole directory, ask the actual question, get an answer that correctly references the real signatures.

The difference is not "better answers." The difference is I stopped having to be the RAG system. I used to spend half my coding time deciding which files the model needed to see next. Now I let it read what it wants.

This is the biggest gap between Claude Code and every other tool I've used, and it's the thing that would make me switch back to Cursor tomorrow if Cursor caught up. It hasn't, yet.

Instruction following

People compare models on benchmarks. I compare them on the answer to one question: when I say "don't refactor anything I didn't ask you to," does the model refactor anything I didn't ask it to?

Claude Opus 4.7 is better at this than anything else I've tried. Not perfect. Still occasionally decides a for loop deserves to become a reduce. But if I put the instruction in the prompt file and remind it in CLAUDE.md, it sticks.
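The CLAUDE.md instruction is nothing fancy, for the record. CLAUDE.md is just a markdown file Claude Code reads at the start of a session; mine includes something like this (the wording is my own):

```markdown
<!-- CLAUDE.md, excerpt — wording is mine, not required syntax -->
## Editing rules
- Change only what the prompt asks for. Do not refactor, rename,
  or "clean up" anything else, even if it looks wrong.
- If a change outside the stated scope seems necessary, stop and
  ask before making it.
```

Stating the rule once in the prompt file and once here is redundant on purpose. Redundancy is cheap; reviewing an unwanted refactor is not.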

GPT-5 is close on raw capability but has this persistent tic where it "improves" code. I'd be editing a test file and GPT-5 would reach over and "clean up" an unrelated test. Not a huge deal, but it means I can't trust the diff I'm about to accept, which means I have to review every line anyway, which means I lose the speed-up I was paying for.

Where Claude Code is actually worse

Let me be specific.

The UI is not as nice as Cursor. I say this as someone who actively prefers Claude Code's terminal-first philosophy. But if you hand both to a junior engineer and ask which is friendlier, Cursor wins by a mile. The inline diff view, the multi-file highlight, the keybindings — Cursor has spent real product resources on this and it shows. Claude Code is CLI-first and that's a deliberate choice, but it is a choice with costs.

Token costs sneak up on you. My last month of Claude API bills was bigger than my last month of groceries. Part of that is that I ran some agentic experiments that don't reflect normal use. Most of it is just: when the tool is this good, you use it more. I don't begrudge the bill but I do notice it. Cursor's flat-rate plan hides this from you in a way that is, ironically, worse for your instincts.

It can over-edit. This is the biggest real complaint. If you ask Claude Code for a small change in a large file, it will sometimes return a diff that touches the small thing plus four other things it thought were tangentially relevant. On good days this is a gift. On bad days you revert half the patch and re-ask. I've gotten better at asking: "change only this function; do not touch anything else." It mostly listens.

Plan mode is still uneven. When it's good, the step-by-step plan I get back is better than the one I'd have written myself. When it's bad, it's a ten-bullet restatement of my prompt. I haven't figured out the pattern yet.

Where Cursor still wins

I still open Cursor for two things:

  1. Quick local edits in unfamiliar code. If I just need to change a CSS value in a huge component, Cursor's inline completion is faster than conversing with a model. Claude Code wins for thinking. Cursor wins for typing.
  2. Pair programming with a junior. When I'm walking someone through a bug, the visual diff and the inline comments in Cursor are genuinely the better UX. I'll switch to Claude Code when they're off the call.

These are real. I don't pretend I use one tool for everything.

Things people ask me about

"Is it worth the money?" Depends what you compare it to. Against Cursor, it's more expensive, but I still think the upgrade is worth it if you're writing complex code. Against a wasted afternoon, it's nothing. Against a junior engineer's hour, it's laughably cheap. If you're shipping software for a living, the cost is not the bottleneck.

"Does it replace junior engineers?" No, and anyone who tells you it does hasn't tried to hand it an ambiguous ticket. What it replaces are the specific parts of my own job that I've always been worst at: boilerplate I've written a hundred times, test cases I can enumerate but don't want to type, migration scripts, small refactors. It makes me a better senior engineer, not a smaller team.

"Is Anthropic going to price me out?" Honestly, maybe. The model is better, the API costs more, and I expect that gap to widen before it narrows. I've built my workflow around being able to swap models, which feels prudent.

The verdict

If you're serious about shipping software and you haven't tried Claude Code yet, try it. Use it for a week. Write your next feature as a prompt file and let the model do the first pass.

If you try it for a week and feel nothing, go back to Cursor. Your instincts are probably right about your work. Not every job needs the long-context muscle, and if yours doesn't, you're paying for a feature you don't use.

For me, a year in, this is the first tool that has actually moved the ceiling of what I can ship alone. Copilot made me faster. Cursor made me faster. Claude Code made me bigger. That's the distinction that matters.

I'm not sure I'd recommend it to everyone. I'm very sure I'd recommend it to anyone who has ever had the experience of holding a whole codebase in their head and wishing they didn't have to.

That's who it's for. If that's you, you already know.