1 result found
Current AI coding benchmarks measure one-shot performance but ignore quality erosion from repeated edits. This oversight could lead to unmaintainable codebases at scale.