🧠 AI Coding Assistants Face Off: Claude 3.7, Claude 3.5, Copilot, GPT-4o, DeepSeek R1, and Gemini 2.5 — Real Project Comparison with .NET 9
In this one-hour deep-dive, I evaluated the practical performance of various AI coding assistants using a real-world development setup:
A .NET 9 API project based on Jason Taylor’s Clean Architecture template.
Here's how I tested each model:
GitHub Copilot (Claude 3.5 Sonnet): Started with some minor bug fixing in Visual Studio. Copilot handled basic syntax fixes well, but it showed clear limitations when faced with architectural or deeper logic problems.
Cursor + Claude 3.7 Deep Thinking: This was the most "senior-like" assistant. I gave it the task of building a full “Book” feature (entities, DTOs, commands, validation, controller, etc.) and it completed everything in ~10 minutes.
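To give a sense of what that feature involves: in Jason Taylor's template, a feature like this lives in the Application layer as a MediatR command with a FluentValidation validator. Below is a minimal sketch of that shape; CreateBookCommand, Book, and IApplicationDbContext are illustrative names following the template's conventions, not the exact code generated in the video.

```csharp
// Application layer: minimal sketch of a "create" command in the template's
// CQRS style (MediatR + FluentValidation). All names are illustrative.
using FluentValidation;
using MediatR;
using Microsoft.EntityFrameworkCore;

// Domain entity (simplified).
public class Book
{
    public int Id { get; set; }
    public string Title { get; set; } = string.Empty;
    public string Author { get; set; } = string.Empty;
}

// Abstraction over the EF Core context, in the style the template uses.
public interface IApplicationDbContext
{
    DbSet<Book> Books { get; }
    Task<int> SaveChangesAsync(CancellationToken cancellationToken);
}

public record CreateBookCommand(string Title, string Author) : IRequest<int>;

// Validation runs in the MediatR pipeline before the handler executes.
public class CreateBookCommandValidator : AbstractValidator<CreateBookCommand>
{
    public CreateBookCommandValidator()
    {
        RuleFor(c => c.Title).NotEmpty().MaximumLength(200);
        RuleFor(c => c.Author).NotEmpty();
    }
}

public class CreateBookCommandHandler(IApplicationDbContext context)
    : IRequestHandler<CreateBookCommand, int>
{
    public async Task<int> Handle(CreateBookCommand request, CancellationToken cancellationToken)
    {
        var book = new Book { Title = request.Title, Author = request.Author };
        context.Books.Add(book);
        await context.SaveChangesAsync(cancellationToken);
        return book.Id;
    }
}
```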
When a migration error appeared, instead of debugging endlessly, Claude 3.7 generated a working migration script manually.
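For context, writing a migration "manually" means authoring the class that `dotnet ef migrations add` would normally scaffold. A minimal sketch of what that looks like, with illustrative table and column names; note that a real hand-written migration also has to keep the model snapshot consistent, which scaffolding normally handles:

```csharp
// Minimal sketch of a hand-written EF Core migration. The [Migration] id is
// what EF records in the __EFMigrationsHistory table; the id below is made up.
using Microsoft.EntityFrameworkCore.Migrations;

[Migration("20250101000000_AddBookTable")] // illustrative id
public partial class AddBookTable : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.CreateTable(
            name: "Books",
            columns: table => new
            {
                Id = table.Column<int>(nullable: false)
                    .Annotation("SqlServer:Identity", "1, 1"),
                Title = table.Column<string>(maxLength: 200, nullable: false),
                Author = table.Column<string>(nullable: false)
            },
            constraints: table =>
            {
                table.PrimaryKey("PK_Books", x => x.Id);
            });
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropTable(name: "Books");
    }
}
```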
It even adjusted password strength requirements in the registration logic when asked.
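Since the template's registration flow is built on ASP.NET Core Identity, adjusting password strength comes down to changing IdentityOptions at startup. A minimal sketch; the specific values Claude 3.7 applied in the video are not reproduced here:

```csharp
// Program.cs fragment: password strength is configured via IdentityOptions.
// The values below are illustrative, not the ones used in the video.
using Microsoft.AspNetCore.Identity;

var builder = WebApplication.CreateBuilder(args);

builder.Services.Configure<IdentityOptions>(options =>
{
    options.Password.RequiredLength = 8;             // minimum length
    options.Password.RequireDigit = true;            // at least one digit
    options.Password.RequireUppercase = true;        // at least one uppercase letter
    options.Password.RequireNonAlphanumeric = false; // no special character required
});
```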
Everything worked perfectly when tested via Swagger UI. Truly impressive.
VS Code + Claude 3.5 Sonnet: I gave it a “Car” feature task. However, when a migration error occurred, the model spent over 25 minutes stuck on it, repeatedly attempting fixes without ever examining the actual root of the problem (a method inside Program.cs).
It resorted to odd, junior-level workarounds and never considered writing a migration file by hand. This hurt its reliability.
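To explain why the root cause was easy to miss: the template runs a database initialiser from Program.cs in development, which applies pending migrations and seeds data. If that method throws, the failure looks like a broken migration even though the fix belongs in startup code. A sketch of the pattern, with the template's extension method inlined for clarity and ApplicationDbContext as an assumed name:

```csharp
// Program.cs (simplified). The Clean Architecture template wraps this logic
// in an extension method; it is inlined here to show where the failure lived.
using Microsoft.EntityFrameworkCore;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddDbContext<ApplicationDbContext>(); // provider config omitted

var app = builder.Build();

if (app.Environment.IsDevelopment())
{
    // If migration or seeding throws here, every retry of `dotnet ef` looks
    // like a migration problem, even though this startup method is the cause.
    using var scope = app.Services.CreateScope();
    var db = scope.ServiceProvider.GetRequiredService<ApplicationDbContext>();
    await db.Database.MigrateAsync(); // apply pending migrations
    // ... seed data ...
}

app.Run();

// Minimal stand-in for the template's EF Core context.
public class ApplicationDbContext(DbContextOptions<ApplicationDbContext> options)
    : DbContext(options)
{
    // DbSets omitted for brevity.
}
```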
Cursor + Claude 3.5: Tried the same “Car” task here. The code worked, but it didn't follow Clean Architecture properly; it felt rushed and inconsistent with the original structure.
Cursor + DeepSeek R1: It attempted to generate the migration and the feature, but failed to integrate either into the project's structure, producing a controller and layering that didn't match the existing code.
Gemini 2.5 Pro in Agent Mode: It didn't engage with the code meaningfully; it mostly observed instead of acting. Not effective.
GPT-4o in Cursor: Surprisingly weak. It created unrelated code, injected repository logic into the controller, and didn’t respect the project structure.
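To make “repository logic in the controller” concrete: in this template, a controller is expected to do nothing but dispatch to MediatR, with validation and data access living in the Application layer. A minimal sketch of that expected shape, reusing the hypothetical CreateBookCommand from the Book feature sketch above, with the anti-pattern shown as a comment:

```csharp
// Web layer: the thin-controller style the template expects. Endpoint and
// type names are illustrative, carried over from the earlier sketch.
using MediatR;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/[controller]")]
public class BooksController : ControllerBase
{
    private readonly ISender _sender;

    public BooksController(ISender sender) => _sender = sender;

    // Template style: the controller only dispatches; business rules and
    // data access stay in the Application layer handler.
    [HttpPost]
    public async Task<ActionResult<int>> Create(CreateBookCommand command)
        => await _sender.Send(command);

    // Anti-pattern (what GPT-4o produced): the controller talks to the
    // DbContext directly, bypassing commands, validators, and the
    // Application layer entirely.
    //
    // [HttpPost]
    // public async Task<int> Create(BookDto dto)
    // {
    //     var book = new Book { Title = dto.Title };
    //     _dbContext.Books.Add(book);
    //     await _dbContext.SaveChangesAsync();
    //     return book.Id;
    // }
}
```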
Final Thoughts:
Best performer: Claude 3.7 (Deep Thinking) via Cursor
Most persistent (but inefficient): Claude 3.5
Most disconnected: GPT-4o and Gemini 2.5
Least structured: DeepSeek R1
This video is for anyone curious about real-world AI code generation, especially within a professional architecture such as Clean Architecture on .NET 9.
00:00 - Bug Fixing with GitHub Copilot in Visual Studio
02:35 - API Structure & Project Review
05:00 - Book Feature Implementation (Cursor + Claude 3.7 Sonnet Deep Thinking)
11:33 - Migration Fix (Cursor + Claude 3.7 Sonnet)
16:17 - Register Password Strength Adjustment (Claude 3.7)
20:00 - Testing Book Feature via Swagger (Success)
25:10 - Car Feature Development (VS Code + Claude 3.5 Sonnet)
30:19 - Migration Issue Begins (Claude 3.5)
44:30 - Misleading Fix Attempt in Program.cs (Claude 3.5)
46:50 - 25 Minutes of Failed Migration Attempts (Claude 3.5)
50:35 - Car Feature by Cursor + Claude 3.5 (Bad Architecture)
59:20 - DeepSeek R1: Irrelevant Migration + Strange Structure
01:00:40 - Gemini 2.5 Pro Agent Mode: No Effective Interaction
01:01:11 - GPT-4o: Weird Output, Wrong Layered Approach
01:05:00 - DeepSeek R1 Final Attempt: No Meaningful Intervention
01:05:56 - End