Code Review in the AI Era: What Senior Engineers Should Look For
The Code Review Game Has Changed
Code review has always been one of the most important practices in software engineering. It catches bugs, spreads knowledge, maintains quality, and builds team culture. But AI-generated code is fundamentally changing what reviewers need to focus on.
I've reviewed thousands of pull requests over my career. In the past year, a growing percentage of the code I review was partially or fully generated by AI. And I've had to adapt my review process significantly — because AI-generated code has a different failure profile than human-written code.
Here's what I mean: a junior developer might write code with obvious syntax errors, misuse APIs, or forget basic error handling. Those are easy to catch. AI-generated code rarely has these issues. Instead, it has subtler problems — code that looks correct, passes tests, but has hidden issues that only surface in production or at scale.
This article is about what senior engineers should look for when reviewing AI-assisted code.
The New Failure Patterns
Pattern 1: Confidently Wrong
AI-generated code never signals uncertainty. It doesn't add comments like "I'm not sure if this is the right approach." It reads with the same confidence whether it's implementing a well-documented API or hallucinating a function that doesn't exist.
I've seen AI generate code that calls array.findLast() in a codebase targeting an older browser that doesn't support it. The code was syntactically perfect. It passed TypeScript compilation because the tsconfig targeted a newer version. It even passed unit tests running in Node. It just didn't work in the actual browser.
What to look for: Don't trust confidence. Verify that APIs, functions, and methods actually exist in your target environment. Check import statements — is the AI importing from a package version that's different from what you have installed?
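As a concrete sketch of the `findLast()` scenario above (the interface and data are invented for illustration): `Array.prototype.findLast` ships in ES2023, so with a modern `lib` setting it compiles cleanly and runs in current Node, yet throws "findLast is not a function" on older engines. A reverse loop is a safe equivalent on any target:

```typescript
interface Order { id: number; status: string; }

// Safe equivalent of Array.prototype.findLast that works on any ES5+
// target: walk the array backwards and return the first match.
function findLastOrder(
  orders: Order[],
  predicate: (o: Order) => boolean
): Order | undefined {
  for (let i = orders.length - 1; i >= 0; i--) {
    if (predicate(orders[i])) return orders[i];
  }
  return undefined;
}

const orders: Order[] = [
  { id: 1, status: "shipped" },
  { id: 2, status: "pending" },
  { id: 3, status: "shipped" },
];

// Finds the last shipped order: { id: 3, status: "shipped" }
const last = findLastOrder(orders, (o) => o.status === "shipped");
```

The point isn't to avoid modern APIs; it's that the reviewer, not the compiler, is the one who knows the real target environment.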
Pattern 2: Plausible but Inefficient
AI often generates code that works correctly but performs terribly. It might use a nested loop where a hash map would reduce complexity from O(n^2) to O(n). It might create new objects inside render cycles in React. It might issue one database query per item in a loop where a single batch operation would do.
The code passes every test. It handles every edge case. It's just slow when your dataset goes from 100 items to 100,000.
What to look for: Think about the data profile. How many items will this actually handle? What's the expected load? Does the AI's approach scale to your real-world numbers?
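Here's a minimal sketch of the O(n^2)-vs-O(n) shape described above, using an invented users/orders join. The nested-lookup version is the form AI tends to produce; both return identical results, and the difference only shows up when the inputs grow:

```typescript
interface User { id: number; name: string; }
interface Order { userId: number; total: number; }

// O(n * m): a linear scan of `users` for every order. Fine at 100 items,
// painful at 100,000 -- and it passes exactly the same tests.
function joinNaive(users: User[], orders: Order[]) {
  return orders.map((o) => ({
    ...o,
    userName: users.find((u) => u.id === o.userId)?.name,
  }));
}

// O(n + m): build a Map once, then do constant-time lookups per order.
function joinWithMap(users: User[], orders: Order[]) {
  const byId = new Map(users.map((u) => [u.id, u.name] as const));
  return orders.map((o) => ({ ...o, userName: byId.get(o.userId) }));
}

const users: User[] = [{ id: 1, name: "Ada" }, { id: 2, name: "Lin" }];
const orders: Order[] = [{ userId: 2, total: 40 }, { userId: 1, total: 15 }];
const joined = joinWithMap(users, orders);
```

The review question is not "is this correct?" but "which of these two did the AI hand me, and does it matter at our data size?"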
Pattern 3: Missing Context
AI doesn't know that your team decided last month to deprecate a particular pattern. It doesn't know that the user service is being migrated to a new API next sprint. It doesn't know that this component needs to work with the upcoming redesign.
AI generates code based on patterns it sees in your codebase and its training data. But it lacks the institutional knowledge that exists in your team's heads, Slack channels, and planning documents.
What to look for: Does this code align with the team's current direction? Is it using patterns we're moving away from? Does it account for upcoming changes that only a human would know about?
Pattern 4: Over-Engineering
AI loves to abstract. Ask it for a simple function, and you might get a full-blown strategy pattern with dependency injection. It's not wrong — it's just unnecessary for a function that will only ever have one implementation.
I've seen AI generate a factory pattern for creating error responses when a simple function would do. The code was technically correct, well-structured, and completely over-engineered for the use case.
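To make the factory example concrete, here's a sketch of both shapes side by side (all names are hypothetical). The first is the kind of structure AI tends to emit unprompted; the second is what the use case actually needed:

```typescript
interface ErrorResponse { code: number; message: string; }

// The over-engineered shape: a strategy interface plus a factory for a
// "family" of error responses that will only ever have one member.
interface ErrorResponseStrategy {
  build(message: string): ErrorResponse;
}

class NotFoundStrategy implements ErrorResponseStrategy {
  build(message: string): ErrorResponse {
    return { code: 404, message };
  }
}

class ErrorResponseFactory {
  constructor(private strategy: ErrorResponseStrategy) {}
  create(message: string): ErrorResponse {
    return this.strategy.build(message);
  }
}

// What the use case actually needed: one function, no indirection.
function notFound(message: string): ErrorResponse {
  return { code: 404, message };
}

const viaFactory = new ErrorResponseFactory(new NotFoundStrategy()).create("no such user");
const viaFunction = notFound("no such user");
```

Both produce the same value; only one of them will need to be explained to the next person who reads the file.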
What to look for: Is this code simpler than it needs to be? (Rarely.) Is it more complex than it needs to be? (Often.) Apply YAGNI aggressively to AI-generated code.
Pattern 5: Copy-Paste Logic
AI sometimes generates similar-looking code blocks that should be a shared utility. It doesn't have the same instinct for DRY that an experienced developer has — or rather, it sometimes applies DRY where it shouldn't and ignores it where it should.
What to look for: Look for repeated patterns in AI-generated code. Could this be extracted into a utility? Is there already a utility in the codebase that does this?
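A small illustration of the extraction instinct, with invented field names: suppose the AI emitted `` `${u.firstName ?? ""} ${u.lastName ?? ""}`.trim() || "Unknown" `` in one file and the same expression over `givenName`/`familyName` in another. Both are the same idea and belong in one utility:

```typescript
// Shared utility extracted from two near-duplicate inline expressions.
// Empty or missing parts collapse to a single fallback label.
function displayName(first?: string, last?: string): string {
  return `${first ?? ""} ${last ?? ""}`.trim() || "Unknown";
}

const full = displayName("Ada", "Lovelace"); // "Ada Lovelace"
const none = displayName();                  // "Unknown"
```

The reviewer's job is twofold: spot the duplication, and check whether the codebase already has this utility before approving a new one.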
The Updated Code Review Checklist
Here's the checklist I use for reviewing AI-assisted code. It supplements, not replaces, your existing review process.
Correctness Layer
- [ ] Does the code actually solve the stated problem, or a subtly different one?
- [ ] Are all API calls using correct endpoints, methods, and parameters for YOUR API?
- [ ] Are there imported packages or functions that don't exist in your project?
- [ ] Do the types match your actual data shapes, not hypothetical ones?
- [ ] Are error messages accurate and helpful for debugging?
Performance Layer
- [ ] What's the time complexity? Is it appropriate for your data size?
- [ ] Are there unnecessary re-renders, re-computations, or re-fetches?
- [ ] Is there N+1 query potential in database operations?
- [ ] Are large objects being created inside loops or render functions?
- [ ] Would this cause performance issues at 10x your current scale?
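The N+1 item from this checklist is easiest to see with a counter. Below is a sketch against a fake in-memory "database" that tallies round trips; the function names (`fetchUser`, `fetchUsersByIds`) are hypothetical stand-ins for whatever your data layer exposes:

```typescript
// Fake in-memory "DB" that counts round trips so the N+1 cost is visible.
let queryCount = 0;
const db = new Map([[1, "Ada"], [2, "Lin"], [3, "Mo"]]);

function fetchUser(id: number): string | undefined {
  queryCount++; // one round trip per call
  return db.get(id);
}

function fetchUsersByIds(ids: number[]): (string | undefined)[] {
  queryCount++; // one round trip for the whole batch
  return ids.map((id) => db.get(id));
}

const ids = [1, 2, 3];

// N+1 shape: one query per item. Three round trips here; at 10,000
// items, 10,000 round trips.
const oneByOne = ids.map(fetchUser);
const naiveQueries = queryCount;

// Batched shape: a single round trip regardless of list size.
const batched = fetchUsersByIds(ids);
const batchQueries = queryCount - naiveQueries;
```

Both paths return the same data; the only difference a test suite won't show you is the query counter.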
Security Layer
- [ ] Is user input validated and sanitized?
- [ ] Are there SQL/NoSQL injection vectors?
- [ ] Does the code expose sensitive data in logs, responses, or error messages?
- [ ] Are authentication and authorization checks in place?
- [ ] Are there SSRF risks from AI-generated HTTP calls?
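For the input-validation item above, the safest pattern to look for in a review is an explicit allowlist rather than sanitization-by-regex. A minimal sketch, with invented field names, for a sort parameter that would otherwise be interpolated into a query:

```typescript
// Allowlist of sort fields the query layer actually supports.
const ALLOWED_SORT_FIELDS = new Set(["createdAt", "name", "status"]);

// Validate untrusted input against the allowlist before it reaches a
// query or a log line. Unknown values fall back to a safe default --
// never splice raw input into an ORDER BY clause.
function parseSortField(raw: unknown): string {
  if (typeof raw !== "string" || !ALLOWED_SORT_FIELDS.has(raw)) {
    return "createdAt";
  }
  return raw;
}

const safe = parseSortField("name");                    // "name"
const rejected = parseSortField("1; DROP TABLE users"); // "createdAt"
```

The review question for AI-generated handlers is where this check lives: AI frequently validates in one code path and forgets the other.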
Architecture Layer
- [ ] Does this follow the team's current patterns and conventions?
- [ ] Is it using patterns we're actively moving away from?
- [ ] Does the abstraction level match the use case (not over/under-engineered)?
- [ ] Are there existing utilities that this code should use instead of reimplementing?
- [ ] Does it handle the integration points with adjacent systems correctly?
Maintainability Layer
- [ ] Could another developer understand this code without AI context?
- [ ] Are variable and function names descriptive and consistent with the codebase?
- [ ] Is the error handling strategy consistent with the rest of the application?
- [ ] Are edge cases handled or explicitly documented as out of scope?
- [ ] Would this code be easy to modify when requirements change?
How to Give Feedback on AI-Generated Code
Reviewing AI-generated code also changes how you give feedback. Here are principles I follow:
Focus on the "why," not the "what." Instead of saying "change this to use a Map instead of an object," explain why: "A Map is better here because we're using non-string keys and need insertion-order iteration."
When the developer used AI to generate the code, they might not understand why a change is needed. Your job as a reviewer is to teach, not just correct.
Ask about intent. "What was the goal of this function?" is a powerful question. If the developer can't explain what the AI-generated code does, that's a red flag. You should understand every line of code you submit, regardless of who or what wrote it.
Don't review the tool, review the output. Whether code was written by a human, generated by AI, or copied from Stack Overflow, the review standard is the same. The code either meets your quality bar or it doesn't.
Be specific about AI-related risks. If you spot a pattern that's common in AI-generated code (like hallucinated imports or over-abstraction), name it. "This looks like the AI may have hallucinated this API — can you verify it exists in our version of the library?"
The Time Investment Shift
Here's the uncomfortable truth: reviewing AI-assisted code takes more time per line, not less. AI-generated code requires deeper scrutiny because it lacks the telltale signs of human error that experienced reviewers use as shortcuts.
When a human writes code, you can often predict where bugs will be based on the complexity and the developer's experience level. AI code is unpredictable — it might be perfect for 50 lines and subtly wrong on line 51.
The good news: there's less code to review per PR overall if the AI is handling boilerplate, and the high-scrutiny code tends to be the logic-heavy parts where your review adds the most value anyway.
Invest the time. The cost of a bug in production always exceeds the cost of thorough review.