Gru.ai Ranks First in OpenAI's Latest SWE-Bench Verified Evaluation

Gru.ai

2024-09-09 22:00

SAN FRANCISCO, Sept. 9, 2024 /PRNewswire/ -- On September 3, Gru.ai ranked first with a score of 45.2% in the latest results released for SWE-Bench Verified, an authoritative benchmark for AI model evaluation. SWE-Bench Verified, designed to reliably evaluate AI models' ability to solve real-world software issues, was developed through a collaboration between OpenAI and the SWE-Bench authors.

Bug Fix Gru, one of four agents offered by Gru, took part in the SWE-Bench Verified evaluation. According to the Gru team's blog, giving Bug Fix Gru a comprehensive operating environment and a rich set of development tools laid the foundation for its high score. Workflow enhancements, multimodal support, and the addition of retrieval-augmented generation (RAG) capabilities further boosted the score. The Gru team also emphasized that it has an evaluation process in place to assess the impact of every change.

Gru.ai, a company that builds AI developers, provides four types of software engineering agents:

Assistant Gru: Helps users solve standalone technical issues; now publicly available.
Test Gru: Automatically generates unit test code.
Bug Fix Gru: Automatically fixes bugs based on user-reported issues.
Babel Gru: Assists in building end-to-end projects.

Gru.ai previously secured a $5.5 million angel investment. Several other players in the sector, including Devin, Factory, Cosine.sh and Codium.ai, have also announced funding. As large-model capabilities mature, the coding agent field is seeing a surge of investment and innovation, pointing to a bright future for this evolving industry.

Source: Gru.ai

Keywords: Computer Software Computer/Electronics Artificial Intelligence
