BEIJING, March 11, 2024 /PRNewswire/ -- 01.AI, a Beijing-based generative AI unicorn founded by global AI thought leader Dr. Kai-Fu Lee, announced the successful development of a new type of vector database based on a fully navigable graph. 01.AI's vector database, named "Descartes," has topped the rankings in six dataset evaluations on the mainstream ANN-Benchmarks.
In the Large Language Model (LLM) technology stack, a vector database is an important component connecting external information with LLMs. The Descartes vector database will serve as a layer of 01.AI's AI infrastructure software, powering the company's AI products; it will also be made available to developers as a tool in the future.
In offline tests on the global evaluation platform ANN-Benchmarks, 01.AI's Descartes vector database achieved the top score on six datasets, with significant performance gains over other industry players; on some benchmarks, throughput more than doubled.
Vector Databases Emerge as Infrastructure for GenAI, Gaining Capital Attention
Vector databases, also known as information retrieval technologies of the AI era, are one of the core technologies of retrieval-augmented generation (RAG). With the expanded capabilities of LLMs, the volume of multimodal unstructured data such as images, videos, and texts has increased exponentially, differing from traditional databases used to handle structured data. Vector databases are specifically designed to store, manage, query, and retrieve vectorized unstructured data. They function as an external memory disk that can be called upon by LLMs at any time to form a "long-term memory." For developers of large model applications, vector databases are a crucial infrastructure that, to a certain extent, affects the performance of large models.
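The "external memory" role described above can be sketched with a minimal in-memory vector store. This is an illustrative toy, not Descartes itself: the hard-coded 2-dimensional vectors stand in for embeddings a real model would produce.

```python
# Minimal sketch of a vector store acting as an LLM's "long-term memory":
# unstructured items are embedded as vectors, stored, then retrieved by
# similarity at query time. Vectors here are hypothetical stand-ins for
# real embedding-model output.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class VectorStore:
    def __init__(self):
        self.items = []  # list of (vector, payload) pairs

    def add(self, vector, payload):
        self.items.append((vector, payload))

    def search(self, query, k=3):
        # Brute-force scan; production systems replace this with an ANN index.
        scored = sorted(self.items, key=lambda it: cosine(it[0], query), reverse=True)
        return [payload for _, payload in scored[:k]]

store = VectorStore()
store.add([1.0, 0.0], "doc about cats")
store.add([0.9, 0.1], "doc about kittens")
store.add([0.0, 1.0], "doc about cars")
print(store.search([1.0, 0.05], k=2))  # → ['doc about cats', 'doc about kittens']
```

An LLM application would call `search` with the embedded user query and splice the returned payloads into the prompt as retrieved context.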
LLMs have four commonly known deficiencies that vector databases are able to address: hallucination, knowledge cutoffs, limited context windows, and lack of access to private domain data.
The generative AI platform shift further strengthens the role of vector databases. Tech giants such as Google, Microsoft, and Meta have introduced products, and startups like Zilliz, Pinecone, Weaviate, and Qdrant have emerged. In 2023, Pinecone, a vector database partner of OpenAI, completed a $138 million Series B funding round, and Fabarta ArcNeural, a Chinese startup, completed a Pre-A round of around $15 million.
01.AI's Vector Database Achieves Top Rankings on ANN-Benchmarks, Paving Way to Advanced RAG
01.AI's Descartes vector database has achieved first place in all six dataset evaluations of ANN-Benchmarks, which showcases the performance of different algorithms across various real-world datasets.
The benchmark graphs cover six evaluation datasets: glove-25-angular, glove-100-angular, sift-128-euclidean, nytimes-256-angular, fashion-mnist-784-euclidean, and gist-960-euclidean. The horizontal axis represents recall; the vertical axis represents QPS (queries per second, the number of requests processed per second). The closer a curve sits to the upper-right corner, the better the algorithm performs. The 01.AI Descartes vector database ranks highest on all six datasets.
Throughput (QPS) is a crucial metric for evaluating the query processing capabilities of information retrieval systems such as search engines and databases. Compared with the previous top performer on the benchmark, the 01.AI Descartes vector database delivers significant improvements, more than doubling throughput on some datasets and leading by 286% on gist-960-euclidean.
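The two benchmark axes can be made concrete with a small sketch. `recall_at_k` and `measure_qps` are illustrative helpers written for this article, not part of ANN-Benchmarks itself:

```python
# Sketch of the two axes in the ANN-Benchmarks plots: recall (fraction of
# true nearest neighbors the index actually found) and QPS (queries served
# per second). Ground-truth IDs here are hypothetical.
import time

def recall_at_k(approx_ids, true_ids):
    # Fraction of ground-truth neighbors recovered by the ANN result.
    return len(set(approx_ids) & set(true_ids)) / len(true_ids)

# A hypothetical query where the index returned 9 of the 10 true neighbors.
true_neighbors = list(range(10))
ann_result = [0, 1, 2, 3, 4, 5, 6, 7, 8, 99]
print(recall_at_k(ann_result, true_neighbors))  # → 0.9

def measure_qps(query_fn, queries):
    # Wall-clock throughput over a batch of queries.
    start = time.perf_counter()
    for q in queries:
        query_fn(q)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed
```

Because approximate indexes trade accuracy for speed, each algorithm traces a recall-vs-QPS curve; "topping the chart" means achieving higher QPS at the same recall.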
RAG (Retrieval-Augmented Generation) is a technology that combines retrieval and generation, enhancing the generative capabilities of LLMs by retrieving relevant information from vast amounts of data. Like traditional retrieval methods, vector retrieval in RAG primarily addresses two issues: reducing the set of candidates examined during retrieval by building indexing structures, and reducing the cost of computing distances for individual vectors.
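For the second issue, cheapening individual vector computations, scalar quantization is one common generic technique: compressing float components into single bytes reduces memory traffic and arithmetic cost. This is an illustrative example, not necessarily the method Descartes uses.

```python
# Generic scalar quantization sketch: map float components in [lo, hi]
# onto uint8 codes, shrinking each vector 4x versus float32 and making
# distance arithmetic cheaper. Not a description of Descartes internals.

def quantize(vec, lo=-1.0, hi=1.0):
    # Map each float in [lo, hi] to an integer code in [0, 255].
    scale = 255.0 / (hi - lo)
    return [int(round((x - lo) * scale)) for x in vec]

def dequantize(codes, lo=-1.0, hi=1.0):
    # Recover an approximation of the original floats.
    scale = (hi - lo) / 255.0
    return [lo + c * scale for c in codes]

v = [0.5, -0.25, 1.0, -1.0]
codes = quantize(v)
approx = dequantize(codes)
# Each component is recovered to within one quantization step.
assert all(abs(a - b) <= 2.0 / 255.0 for a, b in zip(v, approx))
```

Distances computed on the integer codes approximate the true distances well enough for candidate generation, with exact re-ranking applied to the few survivors.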
The 01.AI Descartes Vector Database demonstrates significant advantages over the industry in handling complex queries, enhancing retrieval efficiency, and optimizing data storage. To address the first issue, reducing the candidate set, the 01.AI team relies on two key indexing strategies.
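The first issue, pruning the candidate set with an index, can be sketched as greedy search on a small navigable proximity graph, which is the general idea behind graph-based ANN indexes like the one the announcement describes. The graph and vectors below are toy values chosen for illustration.

```python
# Toy greedy search on a proximity graph: start at an entry node and hop
# to whichever neighbor is closer to the query, stopping at a local
# minimum. Only a handful of nodes are examined instead of every vector.
# Hand-built toy graph, not Descartes' actual index.
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# node id -> (vector, neighbor ids)
graph = {
    0: ((0.0, 0.0), [1, 2]),
    1: ((1.0, 0.0), [0, 3]),
    2: ((0.0, 1.0), [0, 3]),
    3: ((1.0, 1.0), [1, 2, 4]),
    4: ((2.0, 2.0), [3]),
}

def greedy_search(query, entry=0):
    current = entry
    hops = 0
    while True:
        vec, neighbors = graph[current]
        # Hop to the closest neighbor if it improves on the current node.
        best = min(neighbors, key=lambda n: dist(graph[n][0], query))
        if dist(graph[best][0], query) < dist(vec, query):
            current = best
            hops += 1
        else:
            return current, hops

node, hops = greedy_search((1.9, 2.1))
print(node, hops)  # → 4 3 (reaches the nearest node in 3 hops)
```

Production graph indexes layer refinements on this skeleton (beam search, multiple entry points, hierarchical levels), but the candidate-pruning principle is the same.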
Full Stack Vector Technology: Higher Precision, Faster Performance
Supported by a full stack of vector technology, the 01.AI Descartes Vector Database also delivers its core advantages of higher precision and stronger performance in real application scenarios.
01.AI's Descartes Vector Database currently focuses on the high-performance segment, which typically means vector datasets at a scale of tens of millions or fewer (such as 20 million 128-dimensional floating-point vectors). Such databases can easily handle 80%-90% of everyday scenarios, such as helping enterprise customers build private-domain knowledge bases and intelligent customer service systems. In the field of autonomous driving, high-performance vector databases can be used to accelerate model training for autonomous cars.
The advantages of the 01.AI high-performance vector database are best illustrated with a concrete scenario.
Take a large e-commerce platform's recommendation scenario as an example. The number of items on the shelf can reach tens of millions, with each item represented by a vector. Even when the number of vectors in the database is limited, the system faces performance pressure during peak hours, when user requests can reach hundreds of thousands or even millions of QPS. A high-performance vector database can effectively improve recommendation quality in the platform's search and advertising businesses, keeping shoppers engaged.
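One generic optimization for such high-QPS serving, offered here as an illustration rather than a description of Descartes internals, is pre-normalizing item vectors at index-build time so that each query's cosine ranking reduces to plain dot products:

```python
# High-QPS serving sketch: normalize item vectors once when the index is
# built, so per-request work is just one query normalization plus dot
# products. Item names and embeddings are hypothetical.
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

items = {  # hypothetical item id -> embedding
    "sneakers": [0.9, 0.1],
    "sandals": [0.8, 0.3],
    "blender": [0.1, 0.95],
}
index = {iid: normalize(v) for iid, v in items.items()}  # built once

def recommend(user_vec, k=2):
    q = normalize(user_vec)  # only the query is normalized per request
    scores = {iid: sum(a * b for a, b in zip(q, v)) for iid, v in index.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend([1.0, 0.2]))  # → ['sneakers', 'sandals']
```

Real deployments combine this with an ANN index and batched hardware-friendly math, but the principle of shifting work from query time to build time is the same.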
01.AI's Descartes Vector Database is the first release in the team's RAG technology stack. Its capabilities will be applied in the company's AI productivity consumer product launching soon, and it will also be made available to developers as a tool in the future.