Meta opens access to its large language model to AI researchers

Meta said the scientific community must be able to work together to advance AI research and probe its vulnerabilities.

Facebook parent company Meta is sharing its large language model, which contains 175 billion parameters and was trained on publicly available datasets, with AI researchers.

The social media giant said it shared access to pre-trained models and the code needed to train and use them. It added that this will allow “more community engagement in understanding this fundamental new technology”.

“Access to the model will be granted to academic researchers, those affiliated with government organizations, civil society and academia, and industrial research laboratories around the world,” Meta AI said in a blog post yesterday (May 3).

Large language models are natural language processing (NLP) systems that are trained on a massive volume of text. These models are able to answer reading comprehension questions, solve basic math problems, and generate text.
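The idea of training on text and then generating text can be illustrated with a deliberately tiny sketch. This is not Meta’s OPT code or a transformer: real large language models learn billions of parameters with neural networks, while this toy bigram model simply counts which words follow which in a corpus and samples from those counts. All function names here are illustrative.

```python
import random

def train_bigrams(corpus: str) -> dict:
    """Count, for each word in the corpus, the words that follow it."""
    words = corpus.split()
    model = {}
    for prev, nxt in zip(words, words[1:]):
        model.setdefault(prev, []).append(nxt)
    return model

def generate(model: dict, start: str, length: int, seed: int = 0) -> str:
    """Generate text by repeatedly sampling an observed next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break  # no observed continuation for this word
        out.append(rng.choice(followers))
    return " ".join(out)

corpus = "the model reads text and the model generates text and the model answers questions"
model = train_bigrams(corpus)
print(generate(model, "the", 4))
```

Because words that occur more often after a given word appear more often in its follower list, frequent continuations are sampled more often – the same statistical intuition, at a vastly smaller scale, that underlies models like OPT-175B.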

Meta said that full access to research on large language models is typically limited to “a few well-resourced labs,” hampering efforts to increase their “robustness” and eliminate problems such as bias and toxicity within models.

“For AI research to advance, the wider scientific community must be able to work with state-of-the-art models to effectively explore their potential while probing their vulnerabilities at the same time,” the company said.

“Meta AI believes that collaboration between research organizations is essential to the responsible development of AI technologies.”

The social media company said it designed its model – called OPT-175B – to be energy efficient, with training producing around 14pc of the carbon footprint used to train OpenAI’s GPT-3 model.

Meta also said it was releasing a suite of “smaller-scale reference models” trained on the same dataset, using settings similar to those of OPT-175B.

Meta has been investing in AI research for some time. In February, the company shared some of the AI research projects it is focusing on, including universal speech translation, AI that can learn like a human, and a more conversational AI assistant.

In January, Meta also revealed that its AI research team had been working for years on a supercomputer that could be the “biggest and fastest” in the world when fully built, which it hoped to achieve by mid-2022.

Meta isn’t the only company looking at large language models. Last October, tech giants Microsoft and Nvidia teamed up to create a language model with 105 layers and 530 billion parameters, around three times as many parameters as OpenAI’s GPT-3.


James G. Williams