Qodo, an AI-powered code quality platform formerly known as Codium, has announced the release of Qodo-Embed-1-1.5b, a new open-source code embedding model that delivers state-of-the-art performance while being significantly smaller and more efficient than competing solutions.
Designed to enhance code search, retrieval and understanding, the 1.5-billion-parameter model achieves top results on industry benchmarks, outperforming larger models from OpenAI and Salesforce.
For enterprise development teams that manage large and complex code bases, Qodo’s innovation represents a leap in AI-driven software engineering workflows. By enabling more accurate and efficient code retrieval, Qodo-Embed-1-1.5b solves a key challenge in AI-assisted development: context awareness in large-scale software systems.
Why code embedding models are important for enterprise AI
Traditionally, AI-driven coding solutions have focused on code generation, with large language models (LLMs) attracting attention for their ability to write new code.
But, as Qodo CEO and co-founder Itamar Friedman explained in a video call interview earlier this week: “Enterprise software can have tens of millions, if not hundreds of millions, of lines of code. Code generation alone is not enough: you need to ensure that the code is of high quality, works properly and integrates with the rest of the system.”
Code embedding models play a crucial role in AI-assisted development by allowing the system to effectively search and retrieve relevant code snippets. This is especially important for large organizations where software projects span millions of lines of code across multiple teams, repositories, and programming languages.
“Context is king for anything right now related to building software with models,” Friedman said. “Specifically, to get the right context from a very large code base, you have to go through some search mechanism.”
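The search mechanism Friedman describes can be illustrated with a minimal sketch: embed the query and every snippet as vectors, then rank snippets by cosine similarity. The `toy_embed` function below is a deliberately crude, hypothetical stand-in (a bag-of-tokens count) for a real embedding model, and the snippet names are invented for illustration; it is not Qodo's model or API.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-tokens count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, snippets: dict, top_k: int = 1) -> list:
    """Return the names of the top_k snippets most similar to the query."""
    query_vec = toy_embed(query)
    ranked = sorted(
        snippets,
        key=lambda name: cosine(query_vec, toy_embed(snippets[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Illustrative snippets only; in practice these would come from a repository.
SNIPPETS = {
    "parse_config": "def parse_config(path): return json.load(open(path))",
    "send_email": "def send_email(to, subject, body): smtp.sendmail(sender, to, body)",
    "retry_request": "def retry_request(url, attempts): return http.get(url)",
}

print(search("load json config from path", SNIPPETS))
```

A production system would swap `toy_embed` for a trained code embedding model and store the vectors in an index rather than re-embedding every snippet per query.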
Qodo-Embed-1-1.5b provides performance and efficiency
Qodo-Embed-1-1.5b stands out for its balance of efficiency and accuracy. While many state-of-the-art models rely on billions of parameters (OpenAI’s text-embedding-3-large has 7 billion, for example), Qodo’s model achieves excellent results with just 1.5 billion parameters.
On the Code Information Retrieval Benchmark (CoIR), an industry-standard test of code retrieval across multiple languages and tasks, Qodo-Embed-1-1.5b scored 70.06, outperforming Salesforce’s SFR-Embedding-2_R (67.41) and OpenAI’s text-embedding-3-large (65.17).

This level of performance matters to businesses seeking cost-effective AI solutions. Because the model can run on low-cost GPUs, a wider range of development teams can access advanced code retrieval, reducing infrastructure costs while improving software quality and productivity.
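A rough back-of-the-envelope calculation shows why parameter count matters for hardware cost. The sketch below assumes weights stored at 16-bit precision (2 bytes per parameter) and counts only the weights, ignoring activations, batch size and runtime overhead. Those are assumptions for illustration, not figures published by Qodo, and the 7 billion value is simply the parameter count quoted above for the larger model.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in gigabytes (10**9 bytes).

    Assumes 2 bytes per parameter (16-bit weights) unless told otherwise.
    """
    return num_params * bytes_per_param / 1e9


small = weight_memory_gb(1.5e9)  # 1.5 billion parameters
large = weight_memory_gb(7e9)    # 7 billion parameters

print(f"1.5B parameters: ~{small:.1f} GB of weights")
print(f"7B parameters:   ~{large:.1f} GB of weights")
```

Under these assumptions the smaller model needs roughly 3 GB for its weights versus roughly 14 GB for a 7-billion-parameter model, which is the difference between fitting on a modest GPU and needing a considerably larger one.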
Addressing the complexity, nuance and specificity of different code snippets
One of the biggest challenges in AI-powered software development is that similar-looking code can have very different functions. Friedman illustrates this with a simple but telling example:
“One of the biggest challenges with embedding code is that two nearly identical functions, such as ‘withdraw’ and ‘deposit’, may differ only by a plus or minus sign. They need to be close in vector space, but also clearly distinct.”
A key issue with embedding models is ensuring that functionally distinct code is not grouped together incorrectly, which can lead to major software errors. “You need an embedding model that understands code well enough to fetch the right context without bringing in similar but incorrect functions, which could cause serious problems.”
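Friedman's example is easy to reproduce. The sketch below uses two hypothetical functions that differ only in one operator and their names, and measures their surface similarity with a simple token-overlap (Jaccard) score. The score is a stand-in for any purely lexical measure, not a description of how Qodo's model works.

```python
import re

# Two hypothetical functions that differ only in name and in one operator.
WITHDRAW = "def withdraw(balance, amount):\n    return balance - amount\n"
DEPOSIT = "def deposit(balance, amount):\n    return balance + amount\n"


def tokens(code: str) -> set:
    """Split code into identifiers and single-character operators/punctuation."""
    return set(re.findall(r"[A-Za-z_]+|[^\sA-Za-z_]", code))


def jaccard(a: set, b: set) -> float:
    """Share of distinct tokens the two snippets have in common."""
    return len(a & b) / len(a | b)


similarity = jaccard(tokens(WITHDRAW), tokens(DEPOSIT))

# Run both definitions to show that their behavior moves in opposite directions.
namespace = {}
exec(WITHDRAW + DEPOSIT, namespace)

print(f"token overlap: {similarity:.2f}")
print("withdraw(100, 30) =", namespace["withdraw"](100, 30))
print("deposit(100, 30)  =", namespace["deposit"](100, 30))
```

Two-thirds of the distinct tokens are shared, yet one function subtracts from the balance and the other adds to it. A retrieval system that relies on surface overlap alone would struggle to tell them apart.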
To address this problem, Qodo developed a unique training method that combines high-quality synthetic data with real-world code samples. The model is trained to recognize subtle differences between functionally similar pieces of code, so that when developers search for relevant code, the system retrieves the correct results, not just similar-looking ones.
Friedman noted that the training process was conducted in partnership with Nvidia and AWS, both of which are writing technical blogs about Qodo’s methodology. “We collected a unique dataset that simulates the delicate properties of software development and fine-tuned the model to recognize these nuances. That’s why our model outperforms general-purpose embedding models on code.”
Multi-programming language support and plans for future expansion
The Qodo-Embed-1-1.5b model has been optimized for the 10 most commonly used programming languages, including Python, JavaScript and Java, with additional support for a long tail of other languages and frameworks.
Future iterations of the model will build on this foundation, offering deeper support for enterprise development tools and additional languages.
“Many embedding models struggle to differentiate between programming languages, sometimes mixing up snippets from different languages,” Friedman said. “We specifically trained our model to prevent this, focusing on the top 10 languages used in enterprise development.”
Enterprise deployment options and availability
Qodo is making its new model widely accessible through multiple channels.
The 1.5B-parameter version is available on Hugging Face under the OpenRAIL++-M license, allowing developers to freely integrate it into their workflows. Businesses that require additional capabilities can access larger versions under a commercial license.
For companies looking for a fully managed solution, Qodo offers an enterprise-level platform that automates embedding updates as the code base grows. This addresses a key challenge in AI-driven development: ensuring that search and retrieval models remain accurate over time.
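One common way to keep a retrieval index in step with a changing code base is to re-embed only the files whose contents have changed. The sketch below illustrates that general idea with content hashes; the function names, the index layout and the `fake_embed` placeholder are all hypothetical, and nothing here describes how Qodo's platform is actually implemented.

```python
import hashlib


def content_hash(source: str) -> str:
    """Stable fingerprint of a file's contents."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def refresh_index(files: dict, index: dict, embed) -> list:
    """Re-embed new or changed files and drop entries for deleted files.

    files: path -> current source text
    index: path -> {"hash": ..., "vector": ...}, updated in place
    embed: any callable that maps source text to a vector
    Returns the list of paths that were re-embedded.
    """
    updated = []
    for path, source in files.items():
        digest = content_hash(source)
        entry = index.get(path)
        if entry is None or entry["hash"] != digest:
            index[path] = {"hash": digest, "vector": embed(source)}
            updated.append(path)
    for path in list(index):
        if path not in files:
            del index[path]
    return updated


def fake_embed(source: str) -> list:
    """Placeholder embedding for the example; a real model call would go here."""
    return [float(len(source))]


index = {}
files = {"a.py": "x = 1", "b.py": "y = 2"}
print(refresh_index(files, index, fake_embed))  # first run embeds every file

files["a.py"] = "x = 10"  # one file edited
del files["b.py"]         # one file deleted
print(refresh_index(files, index, fake_embed))  # only the edited file is re-embedded
```

Because unchanged files are skipped, the cost of each refresh scales with the size of the change rather than the size of the repository.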
Friedman sees this as a natural step in Qodo’s mission. “We are releasing Qodo as a first step. Our goal is to continuously improve along three dimensions: accuracy, support for more languages, and better handling of specific frameworks and libraries.”
In addition to Hugging Face, the model will also be available through Nvidia’s NIM platform and AWS SageMaker JumpStart, making it easier for enterprises to deploy it and integrate it into their existing development environments.
The future of AI in enterprise software development
AI-driven coding tools are rapidly evolving, but the focus is shifting beyond code generation toward code understanding, retrieval and quality assurance. As enterprises integrate AI more deeply into their software engineering processes, tools such as Qodo-Embed-1-1.5b will play a crucial role in making AI systems more reliable, efficient and cost-effective.
“If you’re a developer at a Fortune 15,000 company, it’s not just about using Copilot or Cursor. You have workflows and internal initiatives that require a deep understanding of large code bases. That’s where high-quality code embedding models become essential,” Friedman said.
Qodo’s latest model is a step toward a future in which AI not only helps developers write code, but also helps them understand, manage and optimize it across complex, large-scale software ecosystems.
For enterprise teams looking to leverage AI for smarter code search, retrieval and quality control, Qodo’s new embedding model offers a compelling, high-performance alternative to larger, more resource-intensive solutions.