OpenAI is racing to generate better code to dominate the software industry

Artificial intelligence systems are moving ahead of human programmers, as OpenAI and other tech groups roll out tools that can write, fix, and explain code, reshaping the way software is built.

San Francisco‑based OpenAI released new versions of its language model line this week and said independent tests rank those models among the best for coding.

The firm says GPT‑4.1, o3 and o4‑mini solve tough coding tasks more often than earlier software because the last two models are allowed extra “reasoning” time to think through a query.

On Wednesday, OpenAI also unveiled Codex CLI, a free command‑line helper, calling it an AI agent that taps the same models to speed up day‑to‑day coding chores.

The moves mirror pushes from Anthropic, Google, Meta, and a host of start‑ups, all betting that code generation is one of the clearest early jobs for large language models.

Leaders at these firms say the focus on programming is “one of the most tangible examples” of how the technology could transform whole industries, with thousands of developers already using the new systems at work.

“This is the year . . . that AI becomes better than humans at competitive code forever,” OpenAI chief product officer Kevin Weil said on the Overpowered podcast this week. He likened the shift to the moment computers beat chess champions, but said the new breakthrough will have a greater effect “on the world if everybody can create software”.

LLMs are becoming better at spotting patterns in code

Developers now feed a few plain sentences into a prompt and receive whole blocks of working code, which industry figures say already speeds up software work. The same systems scan for bugs and try fixes before a human sees the output.

Over the past twelve months, the models have grown better at spotting patterns, reasoning about a problem, and laying out a logical answer. In 2023, AI could crack only 4.4 percent of issues in the industry benchmark known as SWE‑bench. This year, the success rate jumped to 69.1 percent.

GitHub, the Microsoft‑owned coding hub, says 92 percent of U.S. developers now rely on AI coding aids.

Competition is spreading. Meta launched Code Llama in 2023, a model that takes text prompts to write and discuss code. Anthropic followed in February with Claude Code. Mike Krieger, chief product officer at Anthropic, predicted the engineer’s role will increasingly involve “understanding the requirements, working as a team, and figuring out that what you built was actually the right thing to build”.

“It is more about advocating for your idea,” he added, describing future programmers as “a puppet master or an orchestra conductor” guiding these AI agents.
