Annie Rajan’s recently developed spell checker for Konkani ensures that your written communication is accurate and error-free

RAMANDEEP KAUR

As part of the research for her ongoing Ph.D. programme (a machine translation tool for Konkani to Hindi) at the University of Mumbai, Annie Rajan needed a spell checker for Konkani. She quickly realised that no such tool was available.

So, the associate professor at Dhempe College of Arts & Science, Miramar, decided to build one. “Having a spell checker for a language is the need of the hour as it is an elementary linguistic tool required,” says Rajan, who then came across a corpus for morphological analysis developed by Dr. Shilpa Dessai for her Ph.D. work. “She was kind enough to share the corpus for the initial development of the spell checker. On getting the corpus (i.e., the data set), the data was cleaned and processed as per the requirement of the spell checker,” says Rajan.

A major challenge was the identification of correct words as suggestions, a process that can usually be simplified if the phonetic structure of a language is referred to. “We have seen phonetics of Mangalorean Konkani but we did not find the same for Goan Konkani on any online platform. Head of the Department of Indian Language at Dhempe College Anju Sakardande volunteered to do the validation of the suggested words and was of great help in the development of the tool,” she says.
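To give a rough idea of what generating suggestions involves, here is a minimal sketch that ranks dictionary words by plain string similarity using Python's standard difflib module. It is an illustration only: the article does not describe the tool's actual suggestion method, this is not phonetic matching, and the three sample words are placeholders rather than entries from the real corpus.

```python
import difflib

# Placeholder word list; the real tool draws on a much larger corpus.
KNOWN_WORDS = ["कोंकणी", "गोंय", "घर"]


def suggest(word, known_words=KNOWN_WORDS, limit=3):
    """Return up to `limit` known words that look similar to `word`.

    Similarity here is difflib's character-level ratio, an assumption
    made for this sketch and not the method used by the actual tool.
    """
    return difflib.get_close_matches(word, known_words, n=limit, cutoff=0.6)


if __name__ == "__main__":
    # A mistyped form of "घर" with one repeated character.
    print(suggest("घरर"))
```

Candidates produced this way would still need the kind of manual validation described above, since character similarity alone cannot tell whether a suggestion is the word the writer intended.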

While working on any computational linguistic tool, she says, a few parameters need to be taken care of: the corpus should contain a comprehensive set of correctly spelt words, the morphological dictionary should be regularly updated, and the spell checker should be fast and efficient. “Since we were building a Konkani spell checker from scratch, we looked into all these basic requirements so that it works efficiently,” she says, adding that an email address has been provided for user feedback. The spell checker is designed keeping in mind the multiple scripts the language has. Researchers working on other Konkani scripts can contribute corpora in those scripts so that spell checkers for the various Konkani scripts can be built, says Rajan.

The algorithm for the tool is written in Python, and the website is built using the Django framework for Python. All the words are looked up in a single dictionary stored in XML format. The tool does not query a database, so searching for a word is much faster than with the traditional database-backed method.
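The idea of checking words against an XML dictionary without a database can be sketched as follows. This is a simplified illustration, not the tool's code: the XML layout (a dictionary element holding word elements), the function names and the sample words are all assumptions, since the article does not describe the real file format.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real dictionary's schema is not given in the article.
SAMPLE_XML = """<dictionary>
  <word>कोंकणी</word>
  <word>गोंय</word>
  <word>घर</word>
</dictionary>"""


def normalise(token):
    """Normalise Unicode so visually identical Devanagari strings compare equal."""
    return unicodedata.normalize("NFC", token.strip())


def load_words(xml_text):
    """Parse the XML once and keep every word in an in-memory set."""
    root = ET.fromstring(xml_text)
    return {normalise(el.text) for el in root.iter("word") if el.text}


def misspelt(text, words):
    """Return the whitespace-separated tokens that are not in the word set."""
    return [t for t in text.split() if normalise(t) not in words]


WORDS = load_words(SAMPLE_XML)

if __name__ == "__main__":
    # The second token carries a deliberate typo.
    print(misspelt("कोंकणी घरर", WORDS))
```

Because the word list is loaded into a set once, each lookup is a single in-memory membership test rather than a database query, which is one plausible reading of why the tool is described as fast.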

The website is aimed at anyone who uses Konkani and types in Devanagari script on a mobile phone, laptop or desktop. Going forward, the challenge is to add new words to the corpus and check that the words generated are correct. “A few days ago, I read about a dictionary for library words being developed. These words will also be added at a later stage,” says Rajan, adding that the whole task is an ongoing process which needs patience, commitment, and constant attention to what is happening in the domain of Konkani as a language.

“I am thankful to my M.Phil. guide Dr. Jyoti Pawar, a professor at Goa Business School, Goa University, for introducing me to natural language processing, and my Ph.D. guide Dr. Ambuja Salgaonkar, who encouraged me in developing this computational linguistic tool, even though it was not a part of my Ph.D. work,” she says.

The online tool is now public and can be accessed at www.konkanispellcheck.com, and Rajan is in the process of writing a paper about it.