OpenAI’s New GPT-4.1: Do the Pros Outnumber the Cons?

Thursday, April 17, 2025, 12:14 AM, from eWeek

OpenAI announced on Monday the release of GPT-4.1, the newest successor to its GPT-4o series of AI language models. According to OpenAI, GPT-4.1 excels at instruction following, coding, and long-context processing.
The GPT-4.1 large language model brings improved features and enhanced coding capabilities to the lineup that succeeds GPT-4o. The GPT-4.1 series also includes the GPT-4.1 mini and GPT-4.1 nano models. OpenAI confirmed the mini will receive supervised fine-tuning support. In addition, OpenAI stated the company plans to retire the GPT-4.5 Preview model in the API.
GPT-4.1 will not be available through the consumer ChatGPT interface; it is accessible exclusively via the developer API.
Standout features of GPT-4.1
The GPT-4.1 AI model is trained to support developers through its coding and instruction-following capabilities, which are intended to improve agentic workflows and speed up development work.
GPT-4.1 retains the same API capabilities as the GPT-4o model family, like structured outputs and tool calling. Additionally, it has improvements compared to previous models, including:

Longer context window: GPT-4.1 supports up to one million tokens, roughly eight times GPT-4o’s 128,000-token limit, allowing it to ingest and understand more context in a single interaction.
Improved coding: The model can process complex technical and coding problems with greater consistency and accuracy.
Enhanced instruction following: GPT-4.1 is optimized to follow detailed instructions more closely, which makes it more intuitive and collaborative to work with.
Updated knowledge base: GPT-4.1 has a more recent “knowledge cutoff” of June 2024, so it is aware of more recent events than its predecessors.
Higher benchmark scores: GPT-4.1 scored 54.6% on the SWE-bench Verified industry coding benchmark, a significant improvement over GPT-4.5’s 38.0% score.
Better video comprehension: When using Video-MME to measure the model’s ability to understand video content, GPT-4.1 reached an impressive 72% accuracy on the “long, no subtitles” video category.
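To give a feel for what a one-million-token window means in practice, here is a minimal Python sketch that checks whether a piece of text is likely to fit. The one-million figure comes from the article; the four-characters-per-token ratio is only a rough rule of thumb for English text, not a real tokenizer, and the function names and the `reserved_for_output` parameter are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # limit quoted in the article
CHARS_PER_TOKEN = 4                # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (not a real tokenizer)."""
    # Ceiling division so a partial token still counts as one.
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 0) -> bool:
    """Return True if the estimated prompt size plus the tokens set aside
    for the model's reply stays within the context window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    sample = "word " * 100_000  # 500,000 characters of filler text
    print(estimate_tokens(sample), fits_in_context(sample))
```

For real workloads, an actual tokenizer will give a more dependable count than this character-based estimate.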

At $2 per million input tokens and $8 per million output tokens, GPT-4.1 offers reasonably competitive performance for its price, based on OpenAI’s internal testing. Those evaluations covered a range of benchmarks, which are detailed in OpenAI’s announcement.
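The per-token prices translate into request costs with simple arithmetic. The Python sketch below uses only the two list prices quoted above; it ignores any discounts or other pricing tiers, and the function name and the example token counts are hypothetical.

```python
# Prices quoted in the article, in US dollars per one million tokens.
INPUT_PRICE_PER_MILLION = 2.00
OUTPUT_PRICE_PER_MILLION = 8.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars of one request at the list prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 200,000 tokens of input, 5,000 tokens of output.
    print(f"${estimate_cost(200_000, 5_000):.2f}")
```

At these rates, filling the entire one-million-token window once would cost about $2 in input tokens before any output is generated.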
Drawbacks of GPT-4.1
OpenAI acknowledged some of GPT-4.1’s drawbacks, including:

The model tends to require more specific prompts than GPT-4o.
The model becomes less reliable as the number of input tokens it has to process increases. On one OpenAI-MRCR test, the model’s accuracy decreased from around 84% with 8,000 tokens to 50% with 1 million tokens.

OpenAI competitors
OpenAI is one of many tech companies that are training AI coding models to aid in complex software engineering tasks. OpenAI’s industry competitors are similarly working toward developing their own sophisticated programming models, including Anthropic’s Claude 3.7 Sonnet, Google’s Gemini 2.5 Pro, and DeepSeek’s upgraded V3.