
India to develop its own foundational model for AI: Report

The government is planning to develop its own version of a foundational model for artificial intelligence (AI). According to a report by The Economic Times, the proposed AI foundational model will be customised for use by Indian companies, entrepreneurs, academics and researchers.

Citing people aware of the matter, the report states that the union government has earmarked an outlay of Rs 2,000 crore for the ambitious project that is likely to be launched after the ongoing parliamentary elections.

India’s foundational model may be led by the IndiaAI Innovation Centre to be set up by the ministry of electronics and information technology under the Rs 10,000 crore IndiaAI Mission, the sources said.

“The government will likely tap eminent higher education institutes and prominent researchers working on AI in the private sector to work on foundational model,” a senior official said. It could be a large action model (LAM) or large multimodal model (LMM) so that the output can be used for a wide range of applications and services, he said.

What are foundational models?
Foundational models, as explained by an Amazon Web Services page, are a form of generative artificial intelligence (generative AI). They generate output from one or more inputs (prompts) in the form of human language instructions.

“The needs and specific demands of India are very different from other companies globally. This (foundational model) will aim to provide output in more than one native language, borrowing from all the work that has been done so far on projects such as Bhashini,” a senior official said.

A foundational model can be developed by private companies and governments alike. According to data from Stanford's Center for Research on Foundation Models, more than 330 foundational models have been developed by private companies and governments to date.

Govt to use public data to train the foundational model

The report states that the Indian government plans to train the model on publicly available data and digitised records of books, journals and research papers from public libraries. It will also use any other anonymised non-personal data volunteered by companies, startups or researchers.

“There are very obvious privacy concerns as well as copyright issues that come with data (used to train foundational models). So historically accurate data from books that are peer-reviewed, scientific research journals can be utilised. We may also look at a platform exclusively for Indian startups where non-personal and anonymised data can be volunteered for training of the model,” an official cited in the ET report said.

The foundational model will also be trained on publicly available global datasets, using open-source machine learning tools.