Multilingual support (Indic)


Editor-In-Chief: C. Michael Gibson, M.S., M.D. [1]

Overview

Several pages on Wikipedia use Indic scripts to illustrate the native representation of names, places, quotes and literature. Unicode is the encoding used on Wikipedia and it contains support for a number of Indic scripts. However, before Indic scripts can be viewed or edited, support for Complex Text Layout must be enabled on your operating system. Some older operating systems do not support complex text rendering and you should not use such systems to edit Indic scripts.

This page lists the methods for enabling complex text rendering based on the operating environment or browser you are using. Many of the methods highlighted can be used for non-Indic complex scripts such as Arabic.

Check for existing support

The following table compares how a correctly enabled computer would render the following scripts with how your computer renders them:

Script | Correct rendering | Your computer
Bengali | File:Examples.of.complex.text.rendering.Bengali.png | ক + ি → কি
Devanagari | File:Examples.of.complex.text.rendering.Devanagari.png | क + ि → कि
Gujarati | File:Examples.of.complex.text.rendering.Gujarati.png | ક + િ → કિ
Gurmukhi | File:Examples.of.complex.text.rendering.Gurmukhi.png | ਕ + ਿ → ਕਿ
Kannada | File:Examples.of.complex.text.rendering.Kannada.png | ಕ + ಿ → ಕಿ
Malayalam | File:Examples.of.complex.text.rendering.Malayalam.png | ക + െ → കെ
Oriya | File:Examples.of.complex.text.rendering.Oriya.png | କ + େ → କେ
Tibetan | File:Examples of complex text rendering Tibetan.png | ར + ྐ + ྱ → རྐྱ
Tamil | File:Examples.of.complex.text.rendering.Tamil.png | க + ே → கே
Telugu | File:Examples.of.complex.text.rendering.Telugu.png | య + ీ → యీ

If the rendering on your computer matches the rendering in the images for the scripts, then you have already enabled complex text support! You should be able to view text correctly in that script. However, this does not mean you will be able to edit text in that script. To edit such text you need to have the appropriate text entry software on your operating system.
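The Devanagari row of the table can also be inspected from a terminal. The sketch below is only an illustration: the helper name utf8_hex is hypothetical, and it assumes a POSIX shell with od and tr available. It prints the UTF-8 bytes of a string, showing that the consonant is stored first and the vowel sign second, even though a correctly enabled system draws the vowel sign to the left of the consonant.

```shell
#!/bin/sh
# utf8_hex (hypothetical helper): print the bytes of a string as lowercase hex.
utf8_hex() {
    # od -An -tx1 dumps one hex pair per byte without the offset column;
    # tr removes the spaces and line breaks that od inserts.
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
    printf '\n'
}

# The Devanagari test pair from the table, written as octal escapes so the
# script does not depend on the encoding of this file:
#   \340\244\225 = U+0915 (the consonant), \340\244\277 = U+093F (the vowel sign)
utf8_hex "$(printf '\340\244\225\340\244\277')"   # prints e0a495e0a4bf
```

The stored order is the same on every system; only the drawn order depends on complex text layout support.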

Platform Independent support on Mozilla Firefox

Indic IME, a plugin for Firefox 1.0+, can help you write in many Indian languages on web pages. It is easy to install and works on all platforms where Firefox or other Mozilla-based browsers run.

The Indic IME toolbar project was started to address the need for typing in Indian languages in web forms, emails, blogs, search boxes and so on.

Padma, a plugin for Firefox 2.0+, converts text in several Indic fonts to Unicode. This lets several popular Indian vernacular websites render correctly without any additional font installation.

Windows 95, 98, ME and NT

These operating systems have no built-in support for Indic scripts. Indic scripts display properly only in Internet Explorer, and you also need an appropriate Unicode font for the script installed on your system. Internet Explorer 6.0 is recommended because it has better support for Indic scripts.

Mozilla Firefox does not support Indic scripts properly on these operating systems unless a modified version of the program is used, such as the one found here. This is due to a bug in Firefox [2], [3]. The bug has been fixed in Firefox 3 Alpha, but Firefox 3 does not support Windows 98/ME.

No Unicode keyboard driver engines (such as Indic IME or BarahaIME) are available for these older systems. You can use either online typing tools or offline text editors made for this purpose. A list of such tools is given here.

Windows 2000

Supports: Devanagari, Tamil

Complex text support needs to be manually enabled.

File:Install indic languague windows2000.PNG
This is where Indic options are set up in Windows 2000

Viewing Indic text

  • Go to Start > Settings > Control Panel > Regional Options > General [Tab].
  • In the "Language settings for this system" frame, check the box next to "Indic".
  • Copy the appropriate files from the Windows 2000 CD when prompted.
  • If prompted, reboot your computer once the files have been installed.

If you don't have the Windows CD to hand, you can download this zip file and extract its contents to a folder. When prompted for the Windows CD, point to this folder using the 'Browse' option of the prompt window.

Inputting Indic text

You must follow the steps above before you perform the remaining steps.

  • Select "Input Locale" [Tab].
  • Click the "Add" button in the "Installed input locales" frame.
  • Select the desired language in the "Input Locale" drop-down box on the "Add Input Locale" dialogue box.
  • Now select the appropriate keyboard you wish to use.
  • If you cannot use the InScript keyboard above, you can use the phonetic keyboards from Baraha. Baraha Direct, included in the Baraha package, supports both ANSI and Unicode, while BarahaIME supports only Unicode.
  • If you cannot download the above software, or you are on the move, dboard is an Indian language sandbox that provides an online virtual (visual) keyboard. Type your text there, copy it to the clipboard and then paste it into the Wikipedia editing box.

Windows XP and Server 2003

File:Regional and Language Options Windows XP 2003.PNG
This is where we install Complex Scripts in Windows XP & 2003

Supports: Bengali (XP SP2), Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam (XP SP2), Tamil, Telugu

Complex text support needs to be manually enabled.

Viewing Indic text

  1. Go to Start > Control Panel.
  2. If you are in "Category View", select the icon that says "Date, Time, Language and Regional Options" and then select "Regional and Language Options".
  3. If you are in Classic View select the icon that says "Regional and Language Options".
  4. Select the "Languages" tab and make sure you select the option saying "Install files for complex script and right-to-left languages (including Thai)". A confirmation message should now appear - press "OK" on this confirmation message.
  5. Allow the OS to install necessary files from the Windows XP CD and then reboot if prompted.

Inputting Indic text

Windows XP has built-in InScript keyboards for nearly all Indian languages. You can add them via Control Panel. You must follow the steps above before you perform the remaining steps.

  • In the "Regional and Language Options", click the "Languages" tab.
  • Click the "Details" button.
  • Click the "Add" button to add a keyboard for your particular language.
  • In the drop-down box, select your required Indian language.
  • Make sure the check box labelled "Keyboard layout/IME" is selected and ensure you select an appropriate keyboard.
  • Now select "OK" to save changes.

You can use the combination ALT + SHIFT to switch between different keyboard layouts (e.g. from a UK Keyboard to Gurmukhi and vice-versa). If you want a language bar, you can select it by pressing the "Language Bar..." button on the "Text Services and Input Languages" dialog and then selecting "Show the language bar on my desktop". The language bar enables you to visually select the keyboard layout you are using.

  • If you cannot use the InScript keyboard above, other keyboard drivers are available. BarahaIME is suggested for phonetic typing and Indic IME for Remington typing.

Baraha is phonetic-based software that covers nearly all Indic languages. Baraha Direct, included in the Baraha package, supports both ANSI and Unicode, while BarahaIME supports only Unicode.

  • Indic IME 1 (v5.0) is available from Microsoft Bhasha India. It supports Hindi scripts, Gujarati, Kannada and Tamil, and gives the user a choice of keyboards including Phonetic, InScript and Remington.

If you do not have the Windows CD, a modified version of the installer for Hindi, named Hindi Toolkit, automatically installs Indic support as well as the Hindi Indic IME.

  • If you cannot download the above software, or you are on the move, dboard is an Indian language sandbox that provides an online virtual (visual) keyboard. Type your text there, copy it to the clipboard and then paste it into the Wikipedia editing box.
  • MyMyanmar Projects provides the MyMyanmar Unicode System for inputting Myanmar (Burmese) text.[1]

Windows Vista

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Sinhala, Tamil, Telugu, Tibetan

Complex text support is automatically enabled.

Viewing Indic text

You do not need to do anything to enable viewing of Indic text.

Inputting Indic text

Windows Vista, like Windows XP, has built-in InScript keyboards for nearly all Indian languages. You can add them via Control Panel.

BarahaIME is suggested for phonetic typing and Indic IME for Remington typing.

Mac OS 9 and earlier

The Indian Language Kit, available from Apple at additional cost,[4] provides support for Devanagari, Gujarati and Gurmukhi. No third-party Unicode solutions are known, though numerous custom-encoded fonts exist.

Mac OS X

Viewing Indic text

You do not need to do anything to enable viewing of Indic text as long as you use Safari or most other Cocoa applications, which fully support rearrangement and substitution for AAT-based fonts. Firefox up to version 2.0 does not support Indic script rendering at all because it does not use ATSUI (Firefox renders little rectangles instead). Opera also provides some support, although considerable bugs remain as of version 9.2 (though Opera at least renders the glyphs).

Carbon software such as Microsoft Word, Adobe Photoshop and their siblings do not generally support Indic scripts, due to broken or non-existent ATSUI implementations.

Inputting Indic text

Specific keyboard layouts can be enabled in System Preferences, in the International pane. Switching among enabled keyboard layouts is done through the input menu in the upper right corner of the screen. The input menu appears as an icon indicating the current input method or keyboard layout — often a flag identified with the country, language, or script. Specific instructions are available from the "Help" menu (search for "Writing text in other languages").

Mac OS X 10.4 comes with two installable keyboard input options for Tamil: Murasu Anjal and Tamilnet 99. Follow these steps to activate them:

i) Open "international" located within System Preferences and select "language". Select the "edit list", select "Tamil" from the list of languages shown and click OK.

ii) Select "input menu" to see a list of keyboard options available. Select "Anjal" and "Tamilnet99" keyboards under Murasu Anjal Tamil and Click OK.

iii) Anjal and Tamilnet99 keyboard icons appear immediately in the list of keyboards to select under the country flag in the top menu bar.

An alternate way to activate the keyboard(s) for Devanagari (Hindi etc.):

i) Open "International" located within System Preferences and select the "Input Menu" tab.

ii) Check the option for "Devanagari" and/or "Devanagari - QWERTY".

iii) Check the "Show input menu in menu bar" option at the bottom of the "International" panel. Close the panel, and the new keyboard(s) should be available for selection when you click on the menu bar icon (upper right corner).

SIL distributes Ukelele, a freeware tool that lets anyone design their own keyboard layout for Mac OS X.

GNOME

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Tibetan

Viewing Indic text

You do not need to do anything to enable viewing of Indic text in GNOME 2.8 or later. Older versions may have support for some, but not all Indic scripts. Ensure you have appropriate Unicode fonts for each script you wish to view or edit.

Some web browsers may require you to enable Pango rendering to view Indic text properly.

  • For Epiphany, Pango rendering can be enabled in GConf. Press Alt+F2 to bring up the Run Application dialog, then enter gconf-editor and click Run. The Configuration Editor window will appear. In the left pane, unfold apps > epiphany and click the web section. In the right pane, check the box next to the enable_pango option, then restart Epiphany.
  • When using Mozilla or Firefox, you can enable Pango rendering by opening xterm and typing MOZ_ENABLE_PANGO=1 mozilla or MOZ_ENABLE_PANGO=1 firefox. A variable set on the command line this way applies only to the browser session started by that command.
    • This will work only on Firefox compiled with --enable-pango. Only the firefox binaries supplied by Fedora Core 4 and 5, Ubuntu Linux, and Kate OS are compiled with this build option.
    • For Ubuntu 6.06, this support has been turned off due to speed issues. To enable support, you must type MOZ_DISABLE_PANGO=0 firefox. Future sessions do not remember this setting, so it must be repeated.
    • For SUSE 10.1, you have to add "MOZ_ENABLE_PANGO=1" to your .profile to make the effect permanent.
      1. Go to your home directory and edit the .profile file (it is a hidden file).
      2. Scroll down to the last line of the file and add: export MOZ_ENABLE_PANGO=1
      3. Save the .profile file. Restart for the change to take effect.
    • The easiest way to check whether --enable-pango was used in your copy of Firefox is to type about:buildconfig in the address bar and to look for the string (--enable-pango).
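The .profile steps above can be scripted. This is a minimal sketch, not a distribution-supplied tool: the function name enable_pango_in_profile is hypothetical, and it assumes a POSIX shell whose login profile is ~/.profile. It appends the export line only when it is not already present, so running it twice does no harm.

```shell
#!/bin/sh
# enable_pango_in_profile (hypothetical helper): add the export line from the
# steps above to a profile file, unless the file already contains it.
enable_pango_in_profile() {
    profile="${1:-$HOME/.profile}"        # default to ~/.profile
    line='export MOZ_ENABLE_PANGO=1'
    # grep -q: no output, -x: match the whole line, -F: fixed string, not a regex
    if [ -f "$profile" ] && grep -qxF "$line" "$profile"; then
        return 0                          # already configured, nothing to do
    fi
    printf '%s\n' "$line" >> "$profile"
}

# Usage: enable_pango_in_profile            (edits ~/.profile)
#        enable_pango_in_profile /tmp/test  (edits another file, e.g. for a trial run)
```

As noted above, the variable has an effect only in a Firefox build compiled with --enable-pango.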

Inputting Indic text

  • Go to Applications > Preferences > Keyboard.
  • Select the "Layouts" tab.
  • Select the keyboard for the language or script you wish to use from the "Available Layouts" frame and then press "Add".
  • Press "Close" to dismiss the dialogue box.
  • Right click on the main menu on your desktop and select "Add to Panel...".
  • Select "Keyboard Indicator" and click "Add".
  • Position the keyboard indicator on your menu bar and click it to switch between keyboard layouts.

KDE

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu.

Viewing Indic text

You do not need to do anything to enable viewing of Indic text. Ensure you have appropriate Unicode fonts for each script you wish to view or edit.

Inputting Indic text

  • In the Control Center, go to Regional & Accessibility, Keyboard Layout
  • In the tab Layout, click on Enable keyboard layouts
  • Choose the layout you want in Available layouts
  • Click on Apply
  • Now, you will have an icon for the KDE Keyboard Tool in your panel, in which you can choose the layout you want

Debian Based GNU/Linux Distributions

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Tibetan, Punjabi.

Viewing Indic text

Enter as root:

apt-get install ttf-indic-fonts

and when the installation is complete restart the X server.

For Tibetan script:

apt-get install ttf-tmuni

For Mozilla and Firefox, see the comments above under "GNOME". Rendering should work correctly "out of the box" as of Debian 4.0 (etch).
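After installing the font packages, you may want to confirm that the system can actually see a font for a given language. The sketch below assumes the fontconfig command-line tool fc-list is installed, which this page does not state. The function name fonts_for_lang is hypothetical, and the language codes (for example hi for Hindi, ta for Tamil, bo for Tibetan) are ISO 639 codes as used by fontconfig.

```shell
#!/bin/sh
# fonts_for_lang (hypothetical helper): list the font families that fontconfig
# reports as covering a language. Empty output means no suitable font was found.
fonts_for_lang() {
    if ! command -v fc-list >/dev/null 2>&1; then
        echo "fc-list not found; is fontconfig installed?" >&2
        return 2
    fi
    # ":lang=CODE" is a fontconfig pattern; "family" limits the output
    # to the family name of each matching font.
    fc-list ":lang=$1" family | sort -u
}

# Usage: fonts_for_lang hi    (Hindi, Devanagari script)
#        fonts_for_lang bo    (Tibetan)
```

If the list is empty after installation, restarting the X server as described above is the first thing to try.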

Fedora Core 6 and Fedora 7 Linux Distribution

Supports: Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, Punjabi among others.

Installing Indic Fonts

For example, to install Kannada fonts, run the following command as root on the console:

yum install fonts-kannada

This downloads the Kannada fonts from the repositories and installs them.

Similarly, for Hindi, run the following command as root:

yum install fonts-hindi
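To install fonts for several languages in one go, the package names can be generated from a list. This is a sketch under one assumption: only fonts-kannada and fonts-hindi are named on this page, and the fonts-LANGUAGE pattern is assumed, not confirmed, for any other language. The function name fedora_font_packages is hypothetical.

```shell
#!/bin/sh
# fedora_font_packages (hypothetical helper): turn language names into
# package names of the form fonts-LANGUAGE, separated by spaces.
fedora_font_packages() {
    pkgs=""
    for lang in "$@"; do
        # ${pkgs:+$pkgs } adds a separating space only when pkgs is non-empty
        pkgs="${pkgs:+$pkgs }fonts-$lang"
    done
    printf '%s\n' "$pkgs"
}

# Usage, as root:
#   yum install $(fedora_font_packages kannada hindi)
# which expands to: yum install fonts-kannada fonts-hindi
```

Printing the names first, without running yum, is a cheap way to check the list before installing anything.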

Keyboard Support for Indic texts

Start the Add/Remove Software applet. In KDE, for example, navigate to System and then Add/Remove Software. In the applet window, select Languages in the list box on the left. In the list box on the right, select the Indian languages of interest to you.

For example, for Kannada keyboard support, check the box for Kannada Support; for Hindi, check the box for Hindi Support.

It has been observed that for Kannada, Fedora installs not only Kannada keyboard support but also transliteration support and support for KGP (Kannada Ganaka Parishad) keyboards. With this feature, you can type Kannada words in Roman script and have them transliterated to Kannada text in the application of your choice, such as your browser, text editor, document editor or email client. You can also use native Kannada keyboards, KGP-based or otherwise, to type Kannada text directly.

Gentoo Linux

Supports: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.

Installing Indic fonts

emerge fonts-indic

(The mozilla-*-bin packages shipped by Gentoo are taken directly from Mozilla's FTP servers and are not built with Pango support. If you notice a rendering problem with them, you need to compile your own version with USE="-moznopango". Firefox 3 will ship with Pango enabled by default.)

Inputting Indic text

emerge -av scim-tables scim-m17n

Study the USE flags and the LINGUAS flags and set them according to your desktop environment and the language support you need. The following must be set whenever you log in (append it to your .xinitrc or .xsession).

export XMODIFIERS=@im=SCIM    #case matters for this variable!
export GTK_IM_MODULE=scim
export QT_IM_MODULE=scim

Mozilla apps and precompiled software such as acroread might not play well with scim (C++). In such cases, make use of scim-bridge (C - avoiding C++ ABI issues) [5].

emerge scim-bridge

and start Firefox as:

% GTK_IM_MODULE=scim-bridge firefox

You might have to start the scim daemon manually. (Add it to your session's startup.)

scim -d

SCIM is a unified frontend for currently available input method libraries.
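The environment variables and the daemon start described above can be combined in one fragment for .xinitrc or .xsession. This is a sketch, not an official Gentoo snippet: the function names setup_scim_env and start_scim are hypothetical, and the check for an installed scim binary is an added precaution.

```shell
#!/bin/sh
# setup_scim_env (hypothetical helper): export the three variables listed above.
setup_scim_env() {
    export XMODIFIERS=@im=SCIM    # case matters for this variable
    export GTK_IM_MODULE=scim
    export QT_IM_MODULE=scim
}

# start_scim (hypothetical helper): start the daemon only if scim is installed.
start_scim() {
    if command -v scim >/dev/null 2>&1; then
        scim -d                   # -d runs scim as a daemon
    else
        echo "scim not found; run the emerge command above first" >&2
        return 1
    fi
}

# In .xinitrc or .xsession, before the window manager is started:
#   setup_scim_env
#   start_scim
```

Keeping the two steps separate lets you source the variables in a plain terminal session without starting a second daemon.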

Unicode OpenType fonts

This section lists OpenType fonts, supported by Microsoft Windows and most Linux distributions. For AAT fonts (required for the Apple Macintosh), see the Mac OS X section above.

If you have followed the instructions for your computer system as mentioned above and you still cannot view Indic text properly, you may need to install a Unicode font:

The Department of Information Technology, India, provides Unicode Indic fonts for most Indian languages.

WAZU JAPAN's Gallery of Unicode Fonts is an excellent resource for all Indic scripts.
