Jalaj

March 20, 2007

How’s Hindi Transliteration in Blogger

Filed under: Ajax, Blog, Blogging, Blogs, Google, Indian Languages, News, Technology, Web, Writing — Jalaj @ 10:16 am

I have been using transliteration for over 10 years, and it’s the only way I can type in Hindi, for I hate learning another keyboard layout. Now that Google has added a transliteration facility to Blogger, let’s see where Google’s implementation stands in comparison to others.

My first experience with transliteration was with C-DAC’s Leap Lite. Then I got to use the online transliteration at rediffmail.com. ITRANS received overwhelming support from the vast community that powers sanskritdocuments.org (earlier Sanskrit.gde.to). That site, to my knowledge, holds the honor of having the largest collection of text encoded in a single encoding scheme, ITRANS. Other software such as Baraha and Takhti has its own share of devoted users. Now Google has entered the scene with an initial implementation on Blogger.

All these transliteration services and software have been using what I will call Basic transliteration schemes, where each Devanagari (or other Indian script) character is represented by a Roman character (or a series of characters). Among all the transliteration schemes I find C-DAC’s the best, as it’s the outcome of thorough research on the Indian languages and existing texts in different languages.
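
To make the idea of a Basic scheme concrete, here is a minimal Python sketch of a table-driven transliterator. The mapping tables are a tiny made-up subset chosen only for illustration (they are not the C-DAC, ITRANS or Google tables), and the function name `transliterate_basic` is hypothetical. It reads the Roman input left to right and always prefers the longest key that matches.

```python
# Illustrative tables only: a small, made-up subset, not any real scheme.
CONSONANTS = {
    "k": "क", "kh": "ख", "g": "ग", "t": "त", "d": "द", "n": "न",
    "b": "ब", "bh": "भ", "m": "म", "r": "र", "l": "ल", "s": "स", "h": "ह",
}
# Each vowel has an independent form and a dependent sign (matra)
# that is used when the vowel follows a consonant.
VOWELS = {
    "a": ("अ", ""),           # inherent vowel: no sign after a consonant
    "aa": ("आ", "\u093e"),
    "i": ("इ", "\u093f"),
    "ii": ("ई", "\u0940"),
    "u": ("उ", "\u0941"),
    "uu": ("ऊ", "\u0942"),
    "e": ("ए", "\u0947"),
    "o": ("ओ", "\u094b"),
}
VIRAMA = "\u094d"  # joins two adjacent consonants into a conjunct


def transliterate_basic(word: str) -> str:
    out = []
    i = 0
    after_consonant = False
    while i < len(word):
        # Try the two-letter key first, then the one-letter key.
        for size in (2, 1):
            key = word[i:i + size]
            if key in CONSONANTS:
                if after_consonant:
                    out.append(VIRAMA)
                out.append(CONSONANTS[key])
                after_consonant = True
                break
            if key in VOWELS:
                independent, sign = VOWELS[key]
                out.append(sign if after_consonant else independent)
                after_consonant = False
                break
        else:
            # Unknown character (upper case included, in this toy table):
            # pass it through unchanged.
            key = word[i]
            out.append(key)
            after_consonant = False
        i += len(key)
    return "".join(out)
```

With these toy tables, `transliterate_basic("bhaarat")` gives भारत. Real schemes are far larger and usually give upper-case letters their own meaning, which is why a slip of the Shift key changes the output.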

Let’s dig into Google’s implementation on Blogger. It consists of two layers of transliteration and one layer of manual editing.

The foremost layer is the advanced transliteration, which works on the main editor screen, where each word you type makes an AJAX call. The text you type is searched in a database for matching words, and up to five matches are sent back as the response. The first match replaces the word you typed; the others are available when you click on the transliterated word, so you can select one of them if the result differs from what you intended. If none of the matches is the text you intended, you can click on the “Edit…” option, which takes you to the “Edit Mode”, the next layer of transliteration. Text in the advanced transliteration is treated as case-insensitive, and since the matches come from a dictionary, you are protected from the errors that arise with case-sensitive transliteration software. The problem, however, is that you don’t always get the intended result, as shown in the image below, where the intended text appeared as the first option in the list and not as the default text.

gtrans1.jpg
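
The behaviour of this first layer, as far as it can be observed from the editor, can be sketched in Python as below. This is only a guess at the visible behaviour, not Google’s server code: `WORD_DB`, `lookup` and `transliterate_word` are hypothetical names, the word list is made up, and leaving an unmatched word untouched is my assumption.

```python
MAX_MATCHES = 5  # the editor shows at most five matches per word

# Hypothetical stand-in for the server-side word database:
# lower-case Roman key -> candidate words, best match first.
WORD_DB = {
    "hindi": ["हिंदी", "हिन्दी", "हिंडी"],
    "bharat": ["भारत", "भरत"],
}


def lookup(typed: str) -> list[str]:
    """Return up to MAX_MATCHES candidates, ignoring the case of the input."""
    return WORD_DB.get(typed.lower(), [])[:MAX_MATCHES]


def transliterate_word(typed: str) -> tuple[str, list[str]]:
    """Return (default replacement, alternatives offered on click)."""
    matches = lookup(typed)
    if not matches:
        return typed, []  # assumption: an unmatched word is left as typed
    return matches[0], matches[1:]
```

Because the key is lower-cased before the lookup, “Bharat”, “bharat” and “BHARAT” all give the same candidates. The weakness described above is also visible here: whatever is ranked first becomes the default, whether or not it is the word you meant.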

The “Edit Mode” brings you the Basic Transliteration that is also available in the other software mentioned above. Google has defined its own encoding scheme, available at http://help.blogger.com/bin/answer.py?answer=58228, for use in “Edit Mode”. Case-sensitive strings are transliterated as per the supported encoding scheme. The “Edit Mode” also guides you with possible keystrokes; however, these guides may not always be helpful, as with the text from the Hanuman Chalisa shown in the image below. Whenever you feel helpless in this mode you can switch to the third layer, the manual editing or “On screen keyboard”, which can be started by clicking the keyboard icon by the side.

gtrans2.jpg

The “On screen keyboard” allows you to click on each character to type it, and also shows what the keystroke would have been had you done the editing in “Edit Mode”.

gtrans3.jpg

The disadvantage of the last two layers is that they work in word mode and disappear as soon as you finish a word, so for most of your work you have no choice other than the “Advanced Transliteration”, which has its own disadvantage. If you have a dial-up internet connection and choose to close the connection while you type, transliteration will stop for words other than those already transliterated, showing you the error below. What is worse, the other two modes also don’t show up for new words (though the responsible code still exists offline).

gtrans4.jpg

Now for how well the bilingual interface works. If you are going for a completely Hindi or completely English page then everything’s fine: you can just enable or disable the transliteration, respectively, and go on. But if you go for a hybrid document, as I did for Khushwant Singh’s poem “Hindi bolna hamari duti hai”, you will have a tough time, as the interface doesn’t keep track of which words were typed in Hindi and which in English. Try to add Hindi text beside an English word and you are prone to transliterate the English word too…

gtrans5.jpg

The interface allows you to paste bulk text and transliterate all of it (max 300 words) by selecting the text and then pressing the Transliteration button. This facility works, but it took me 3-4 rounds of pasting and pressing the icon to achieve this. This doesn’t seem to be a browser problem, as I have had the same experience at two different locations and on different OSs.

So that was all about what Google has in store for you with the transliteration facility. Below are my suggestions for what could make for a better user experience.

1. There could be an additional icon for turning off the Advanced Transliteration, which would help those who are comfortable with Basic Transliteration and find Advanced Transliteration irritating when too many wrong matches show up. This would also help when the user chooses to close the net connection while typing and reconnect when the content is ready for posting.

2. At least the “On screen keyboard” mode could allow multiple words to be keyed in without the keyboard disappearing at the end of each word.

3. The bulk transliteration facility could work without explicitly pressing the Transliteration button, if the editor is already in Transliteration mode.

4. The interface may be made to remember words that were keyed in while out of transliteration mode to prevent their accidental transliteration.
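
Suggestion 4 could be realised along the lines of the Python sketch below. It is only an illustration of the idea, not how Blogger’s editor stores text: the `Word` record, its `typed_in_roman_mode` flag and the `transliterate_unprotected` function are all hypothetical, and `convert` stands for whatever transliteration routine the editor uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Word:
    text: str
    typed_in_roman_mode: bool  # True if keyed in while transliteration was off


def transliterate_unprotected(
    words: list[Word], convert: Callable[[str], str]
) -> list[str]:
    """Convert only the words that were typed with transliteration switched on."""
    return [w.text if w.typed_in_roman_mode else convert(w.text) for w in words]
```

If the editor kept such a flag for every word, selecting a mixed paragraph and pressing the Transliteration button would leave the English words alone and convert only the rest.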

Further, a similar interface in Gmail would be a boon for users outside the Blogger’s Park…

9 Comments »

  1. can you tell me how i can download this software

    Comment by Amar — April 2, 2007 @ 1:35 pm

  2. You cannot download the software. You can only use it if you have a blogger account. I hope Google will, in future, extend this feature to all its services including Google Search. Let’s pray.

    Comment by Jalaj — April 5, 2007 @ 4:05 am

  3. “…I have been using transliteration for over 10 years and it’s the only way I can type in Hindi language, for I hate learning another keyboard layout….”

    Wow!, then I must request you to start a technical blog in Hindi.

    Welcome to the Hindi Blog world. Have a glimpse at-
    http://narad.akshargram.com/

    There aren’t many - and we need more Hindi blog, specially technical…

    Comment by raviratlami — April 21, 2007 @ 9:03 am

  4. I know it enables users to type in Hindi pretty easily.
    But still, don’t you think it is kind of derogatory to use English to write ‘Hindi’? Hindi is an independent language like German, French etc., all of which can be written without using transliteration, so why does Hindi need it? Shouldn’t we think about something better than this transliteration thing?

    Comment by amit — April 27, 2007 @ 8:37 pm

  5. Hi Amit,
    Yes, that’s kind of derogatory, and I’m sure you would feel even worse if you visited this page from Google where all country names are given in their own national languages except ours. (You will be able to view most names if you have the “Arial Unicode MS” font installed.) It’s not Google’s fault; it mirrors what we are up to. We take pride in calling our country “India” instead of “भारत”. It’s a matter of political debate and we should leave it here…

    About using transliteration, I feel it’s the easiest way for a person who is not deeply involved in typing in Hindi and wants to use existing English typing skills. Those willing to learn a layout should learn Inscript. Old layouts such as Remington would not be fruitful in this age of Unicode (though Microsoft IME supports a couple of them too).

    Comment by Jalaj — May 2, 2007 @ 5:27 am

  6. That’s true; in a lot of places you see things like this. For example, Orkut does not list Hindi or Sanskrit among the languages when creating a profile. I think what happened because of our ignorance should be forgotten, and now we should try to repair the damage it caused by contributing, however small, like asking Google/Orkut to make the changes. Now we are more aware and more empowered, and since the web is going to be the most powerful communication medium going forward, this can repair the damage faster than we might imagine. Also, we have to give something to our coming generations from what we received from our forefathers. Anyway, it was good visiting your site and I wish you all the best. If you want to know more about me, mail me at amitshuk//g mail,com; I will be happy to know about you as well, as I couldn’t find your profile anywhere on the site.

    Comment by amit — May 2, 2007 @ 7:02 pm

  7. Pretty good and interesting piece on Blogger’s transliteration facility.
    You might also want to add the easter egg they put there - try copy-pasting Hindi content from other websites and click on the pasted words.
    Also your 4-point suggestions on what they should do further are very incisive.

    Comment by krolb — May 5, 2007 @ 4:28 pm

  8. sir i want to develop english to hindi word conveter plz help me.
    thanks
    anshuman bhagat

    Comment by Anshuman bhagat — November 3, 2007 @ 8:33 am

  9. By “word converter” do you mean something like a dictionary? If you mean something like Google’s transliterator, then that would require a lot of explanation!

    Comment by Jalaj — November 5, 2007 @ 11:48 am
