Why It's Essential To Grow Indian-Language Wikipedias

Indian Wikipedia 10th anniversary graphic by Praveenp (own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=12684320)

On January 15, 2016, Wikipedia, the free online encyclopaedia, celebrated its 15th birthday, reaching this milestone with 36 million articles in more than 290 languages (the English-language Wikipedia alone has crossed the 5-million-article mark). Here I want to address some major questions that Indians need to ask themselves. First, what is the state of Indian-language Wikipedia projects? And second, what does India have to take from and give to Wikipedia?

With the growth of free and open source software in India, people have more freedom than ever. Especially since the recent federal policy-level changes, the nation has been enjoying better collaboration among people of different cultures who speak different languages.

However, there is a huge gap in access to knowledge on the internet. Of a population of about 1.26 billion, only about 15-18% of people are online, largely through mobile devices. The technical community comprises a tiny fraction of this population. It would be useful to have a metric for how much this community contributes to growing the country's languages and its cultural heritage.

Wikipedia's family

Wikipedia is not just an encyclopaedia. It is also part of a “family” comprising several other Open Knowledge members. Wikipedia itself is available in over 290 languages, but it also has multilingual sister projects such as Wikisource (an online library of public domain and other important texts), Wikimedia Commons (the world's largest repository of media files and documents), Wikibooks (a free library of educational textbooks), Wikivoyage (a free and open travel guide) and Wiktionary (a free multilingual dictionary). These projects don't just house millions of images, videos, documents and texts; they allow anyone to contribute their knowledge to this ever-deepening pool of information. Four Indian languages made an early entry into the Wiki-world back in 2002: Assamese, Malayalam, Odia and Punjabi. Some people might not have noticed that the “en” in a Wikipedia URL, which denotes the language code for English, can be replaced with “or” for Odia Wikipedia or “pa” for Punjabi Wikipedia.
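For readers who want to try the language-code swap described above, here is a minimal Python sketch. It assumes that each language edition lives at `https://<code>.wikipedia.org/`, and the helper name `wikipedia_home_for` is hypothetical, made up for this illustration. It returns only the home page of the other edition, because article titles differ from language to language and cannot simply be carried over.

```python
from urllib.parse import urlsplit


def wikipedia_home_for(url: str, lang_code: str) -> str:
    """Given any Wikipedia URL, return the home page of another language edition.

    Hypothetical helper for illustration. It assumes a host of the form
    "<code>.wikipedia.org" and rejects anything else (including mobile
    hosts such as "en.m.wikipedia.org").
    """
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    if len(labels) != 3 or labels[1:] != ["wikipedia", "org"]:
        raise ValueError(f"not a language-edition Wikipedia URL: {url!r}")
    # Swap only the language code; the article path is dropped on purpose,
    # since titles are not the same across language editions.
    return f"https://{lang_code}.wikipedia.org/"


if __name__ == "__main__":
    english = "https://en.wikipedia.org/wiki/India"
    print(wikipedia_home_for(english, "or"))  # Odia Wikipedia
    print(wikipedia_home_for(english, "pa"))  # Punjabi Wikipedia
```

The same swap can of course be done by hand in the browser's address bar; the sketch only makes the pattern explicit.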

A map of dying Indian languages. Source: UNESCO Interactive Atlas of the World's Languages in Danger.

Language neutrality

According to UNESCO, 197 of the 1,652 Indian languages are dying, despite having a long literary and linguistic heritage. That's quite a shocking statistic. In a blog post on content localisation, social entrepreneur Rajesh Ranjan asks if free and open source software can help save these dying languages. In the context of Wikipedia, there are already 23 South Asian-language projects. Of these, 20 are in languages listed in the Eighth Schedule of the Constitution of India, which signals the government's obligation to develop these languages.

Most Indian-language Wikipedia projects are relatively small compared to their counterparts in other languages. But the Wikimedia communities are thriving. While only parts of government websites are available in Hindi, Hindi Wikipedia has already crossed 100,000 articles. The Tamil and Malayalam Wikipedia communities have played a central part in getting Wikipedia basics taught as part of the state-run school syllabus. These communities have also played a significant role in popularizing free and open source software by pushing for policy-level change. Many more Indian languages are in the pipeline to become active Wikipedia projects under the Wikimedia Incubator.

Maithili Wikipedia and Goan Konkani Wikipedia are two Indian-language Wikipedias that have gone live in recent years. The world has seen how digital activism has brought new life to languages such as Hebrew, and a large number of native speakers are waiting to access knowledge in their own languages. Wikipedia could be a great tool for supporting digital activism with openness and sharing.

About 9% of Wikipedia editors are female, according to a 2011 survey by the Wikimedia Foundation. Image by User Goran tek-en (Source: Wikimedia Commons)

Addressing gender bias in Wikipedia: Implications for India

India tops South Asia in the gender inequality index. The literacy rate for women is alarmingly low at 65.46%, compared to 82.14% for men. This disparity is evident in many other sectors of society, as well as in politics.

But gender bias is not a problem only in India. The global free and open source software (FOSS) community has long been concerned about the low participation of women, who make up 2-5% of contributors. The Wikimedia Foundation's former executive director admitted that Wikipedia, like many other collaborative and open projects, does not offer an environment conducive to women's participation. But the Wikimedia community and the Wikimedia Foundation are both working on improving this state of affairs. Indian-language Wikipedia projects are part of this global drive, with initiatives such as the Women's History Month edit-a-thons and the Lilavati's Daughters project, in which biographies of Indian women scientists were created and enriched in Wikipedia projects.

Complementing Digital India

India, with its 354 million netizens, still has a long way to go towards increasing Indian-language content on the web. The government's Digital India campaign aims at digital literacy and at making digital resources and services available in Indian languages. This is closely aligned with the Wikimedia movement's goal of providing free access to the sum of all human knowledge.

In addition to Wikipedia, many other open educational resources and free knowledge projects are not yet part of the Digital India campaign, which signals the need for the federally run campaign to be more collaborative and open. Community-government collaborations have gained massive traction and helped build the volume of Indian-language content online. Examples include the NROER project, which makes National Council of Educational Research and Training books available under Creative Commons licenses, and the IT@School project in the state of Kerala, which provides education using free and open tools.

A version of this article previously appeared on The Huffington Post.


  • catinthehat

    Agree! India has a tremendously rich linguistic history. Range of grammatical structures. Real loss if these languages die. There’s a need for something like a National Linguistics Foundation to protect, promote and save languages– to bring people together in this effort.

  • Islamic Origin Language Hindi

    And yet our Muslim Prime Minister Narendra Modi refuses to speak to Indian using an Indian languages, and uses the Pakistani language Hindi instead. The conspiracy to convert India into a Muslim country through the imposition of the Muslim country outsourced Pakistani language Hindi needs to come to an end. WIkipedia is an example of language neutrality, whereas the Pakistani language speaking Hindian govt is not.
