Sometime in early 2026, we crossed an invisible line: more than half of all code committed to GitHub was either generated or substantially assisted by an AI tool. Not a projection from a VC pitch deck. An actual measurement from GitHub's own telemetry.
I've been watching these numbers climb for two years, and honestly, 51% still hit different when I saw it confirmed. But the more interesting story isn't the headline — it's what happens when you look at the quality of all that machine-generated code flowing through the world's largest development platform.
The Adoption Curve Hit Escape Velocity
The raw scale is staggering. The AI coding tools market reached $12.8 billion this year, up from $5.1 billion in 2024. Seventy-eight percent of Fortune 500 companies now run some form of AI-assisted development in production. GitHub Copilot X holds roughly 37% market share, but Cursor, Windsurf (the Codeium rebrand), Amazon Q Developer, and Gemini Code Assist are all chipping away at its lead.
What changed isn't just adoption numbers — it's the fundamental shift from autocomplete to agentic workflows. Cursor's Composer, Claude Code, and Copilot Workspace don't merely suggest the next line. They plan multi-file changes, run tests, and iterate on failures. Claude Code leads on SWE-bench Verified at 80.8%, with Copilot at 56% and Cursor at 51.7%. The pattern I see most among experienced devs: Cursor or Copilot for daily editing, Claude Code when something touches six files and a database migration.
Eighty percent of new developers joining GitHub use Copilot within their first week, per Octoverse data. That's not early adoption. That's the water supply.
The Quality Gap Is Getting Harder to Ignore
Here's where the mood shifts. CodeRabbit analyzed 470 pull requests and found that AI-authored PRs contain roughly 1.7x as many issues as human-written ones. Not marginally more. Almost twice as many problems per PR.
The breakdown is worse than the headline:
| Category | AI vs Human | The Pattern |
|---|---|---|
| XSS vulnerabilities | 2.74x | String interpolation in unsafe contexts |
| Performance issues | ~8x | Excessive I/O, missing caching, naive loops |
| Code readability | 3x | Naming drift, formatting inconsistency |
| Insecure object references | 1.91x | Direct ID exposure without authz checks |
| Improper password handling | 1.88x | Plaintext comparisons, weak hashing |
| Critical issues overall | 1.4x | The category that actually wakes people up at 3am |
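Two of these rows — unsafe string interpolation and improper password handling — are easy to show concretely. This is an illustrative sketch, not code from the study; the function names and parameters are my own invention:

```python
import hashlib
import hmac
import html
import os

def render_comment_unsafe(comment: str) -> str:
    # The XSS pattern from the table: user input interpolated
    # straight into an HTML context.
    return f"<p>{comment}</p>"

def render_comment_safe(comment: str) -> str:
    # Escaping neutralizes injected markup before interpolation.
    return f"<p>{html.escape(comment)}</p>"

def hash_password(password: str) -> tuple[bytes, bytes]:
    # Salted, deliberately slow hash instead of storing plaintext.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # Constant-time comparison; the naive pattern the table flags is
    # `password == stored_plaintext`.
    return hmac.compare_digest(candidate, digest)
```

The fixes are one import away in most languages, which is exactly why reviewers keep catching models skipping them.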
And the trajectory? Not improving. A study of 6,275 public repositories found unresolved technical debt from AI-authored commits climbed from a few hundred issues in early 2025 to over 110,000 surviving issues by February 2026. March alone saw 35 new CVEs directly attributed to AI-generated code — up from six in January. That's not a blip. That's a trend with a slope.
One in four AI code samples contains a confirmed security vulnerability, per an AppSec study that tested 534 samples across six major LLMs. GPT-5.2 was the "safest" at a 19.1% vulnerability rate. Llama 4 Maverick and Claude Opus 4.6 tied for worst at 29.2%. Production incidents per pull request increased 23.5% between December and early April.
The Platform Started Buckling Under Its Own Creation
Here's the irony that writes itself: the same platform measuring 51% AI code also experienced outages in early April because AI agents flooded its infrastructure. Autonomous coding tools don't just write code — they push branches, open PRs, trigger CI pipelines, poll for results, retry on failure. At scale, that traffic pattern looks suspiciously like a DDoS attack to any load balancer.
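The standard client-side mitigation is to stop polling in a tight loop and back off with jitter between retries. A minimal sketch — the `poll` callable and the timing constants are assumptions for illustration, not GitHub's actual API:

```python
import random
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    # Exponential backoff with "full jitter": each retry waits a random
    # amount between 0 and min(cap, base * 2^attempt) seconds, which
    # spreads agent retries out instead of synchronizing them into spikes.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))

def poll_until_done(poll, max_attempts: int = 8, sleep=time.sleep) -> bool:
    # `poll` is a hypothetical zero-arg callable returning True when the
    # CI run has finished. A naive agent calls it in a while-True loop;
    # this version waits a growing, jittered interval between attempts.
    for attempt in range(max_attempts):
        if poll():
            return True
        sleep(backoff_delay(attempt))
    return False
```

A thousand agents in tight loops synchronize into traffic spikes; a thousand agents with jittered backoff look like background noise.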
So Where Does This Leave You?
If you build software for a living, the takeaway isn't "AI code bad" or "AI code good." It's that code review just became the single most valuable skill in software engineering. Full stop.
The volume of code is exploding, but the quality floor is dropping. The developers who thrive won't be the ones generating the most output — they'll be the ones who read a diff and catch the XSS the model introduced, or spot the O(n²) loop buried inside an otherwise clean PR. That's the skill that compounds.
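That O(n²) loop is usually mundane when you finally see it: a membership check against a list inside a loop. A sketch of the shape — the duplicate-finding task here is invented for illustration:

```python
def find_duplicates_quadratic(items: list) -> list:
    # The shape reviewers learn to spot: `x in items[:i]` is a linear
    # scan inside a loop, making the whole function O(n^2).
    dupes = []
    for i, x in enumerate(items):
        if x in items[:i] and x not in dupes:
            dupes.append(x)
    return dupes

def find_duplicates_linear(items: list) -> list:
    # Same behavior with set lookups: O(n) overall.
    seen, dupes, ordered = set(), set(), []
    for x in items:
        if x in seen and x not in dupes:
            dupes.add(x)
            ordered.append(x)
        seen.add(x)
    return ordered
```

Both versions pass the same unit tests, which is precisely why generated code like the first one survives review when the reviewer is only checking correctness.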
Automated review tooling is adapting to close this gap. Repositories using AI-assisted code review show 32% faster merge times and 28% fewer post-merge defects, per Octoverse data. The emerging stack isn't "AI writes, human reviews" — it's "AI writes, AI reviews, human makes the judgment calls." CodeRabbit, Codacy, and GitHub's own Copilot review features are all racing to be that second layer.
My honest read: we're in the awkward adolescence of AI-assisted development. The technology is clearly capable enough to generate half of GitHub — that's undeniable. But it hasn't internalized the engineering judgment that prevents the subtle, expensive bugs. The 8x performance issue multiplier isn't a model capability problem. It's a context problem. Models generate code that's locally correct and globally careless.
That gap — between syntactically valid and production-worthy — is where the next wave of tooling, and the next generation of senior engineers, will be defined.
The code writes itself now. Keeping it from catching fire is still a human job.