Latam-GPT: The Free, Open Source, and Collaborative AI of Latin America

Latam-GPT is a new large language model being developed in Latin America. The project, led by the National Centre for Artificial Intelligence (CENIA), aims to help the region achieve technological independence by developing open source AI models trained on the languages and contexts of Latin America.

“This cannot be done by just one group or one country in Latin America: it’s a challenge,” CENIA Director Álvaro Soto said in an interview with WIRED en Español. “Latam-GPT is a project aimed at creating open, free and, most importantly, collaborative AI models. We have been working on a bottom-up process for two years, bringing together people from different countries who want to work together. Recently, there have also been more top-down initiatives, with governments taking an interest and starting to participate in the project.”

The project stands out for its collaborative spirit. “We don’t want to compete with OpenAI, DeepSeek or Google. We want to have a model for Latin America and the Caribbean, one that is aware of the cultural requirements and challenges that this brings, such as understanding different dialects, the history of the region, and its unique cultural aspects,” Soto explained.

Thanks to 33 strategic partnerships with Latin American and Caribbean institutions, the project has collected more than eight terabytes of text, equivalent to millions of books. This data foundation enables the development of a language model with 50 billion parameters, a scale comparable to GPT-3.5, which gives it a medium-to-high capacity to perform complex tasks such as reasoning, translation and association.

Latam-GPT is being trained on a regional database that compiles information from 20 Latin American countries and Spain, with a total of 2,645,500 documents. The data is heavily concentrated in the region's largest countries: Brazil leads with 685,000 documents, followed by Mexico with 385,000, Spain with 325,000, Colombia with 220,000 and Argentina with 210,000. The numbers reflect the size of these markets, their digital development, and the availability of structured content.
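To put that concentration in perspective, the short Python sketch below works out each country's share of the corpus from the figures quoted above. It assumes the 2,645,500 total and the five per-country counts are exact and comparable; the variable and function names are illustrative and do not come from the project.

```python
# Document counts as quoted in the article (assumed exact for this sketch).
TOTAL_DOCUMENTS = 2_645_500
top_five = {
    "Brazil": 685_000,
    "Mexico": 385_000,
    "Spain": 325_000,
    "Colombia": 220_000,
    "Argentina": 210_000,
}


def share(count, total=TOTAL_DOCUMENTS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage of the whole corpus contributed by each of the five countries.
shares = {country: share(count) for country, count in top_five.items()}

# Combined weight of the five largest contributors, and what is left over.
top_five_total = sum(top_five.values())
top_five_share = share(top_five_total)
remaining = TOTAL_DOCUMENTS - top_five_total

for country, pct in shares.items():
    print(f"{country}: {pct}%")
print(f"Top five combined: {top_five_total:,} documents ({top_five_share}%)")
print(f"All other countries: {remaining:,} documents")
```

On these numbers, Brazil alone contributes about a quarter of the documents and the five countries together account for roughly 69 percent, leaving 820,500 documents for the rest of the region.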

“Initially, we will launch a language model. We want its performance on general tasks to be close to that of the large commercial models, but with excellent performance on topics specific to Latin America. If we ask it about topics related to our region, its knowledge will be deeper,” Soto explained.

The first model is a starting point for the future development of a family of more advanced technologies, including models that work with images and video, as well as scaling up to larger models. “Since this is an open project, we want other institutions to be able to use it. A group in Colombia can adapt it to the school education system, or a group in Brazil can adapt it for the health sector. The idea is to open the doors for different organizations to generate specific models for specific areas such as agriculture, culture, etc.,” the CENIA director explained.
