Grant Cox

Kantipur to Unicode

I’ve updated our Flash framework here at OLE Nepal to use the Flash CS5 TLFTextfields, which can now display Unicode Nepali fonts correctly on the Linux Flash Player. This allows us to start localising some of the activities – hopefully enabling other OLPC projects around the world to use our content and to translate it into their local languages.

Up until now the activities have not been using Unicode, as the font support for Nepali ligatures on the Linux Flash Player was rubbish. Instead they have been using the “Kantipur” font and a western character set. Kantipur is basically a Wingdings-style font: it visually remaps the standard ASCII set to the various characters and accents used in the Devanagari script, so it all looks like Nepali, but in the background it is just ASCII. The mash of western characters itself is referred to as “Preeti”, and there are a few fonts that remap in this way. Calling it Wingdings might be a little harsh. It’s basically a different character set, just without actually using a different character set (which at least would be some kind of standard)… So yeah, Wingdings.
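To make the remapping idea concrete, here is a minimal Python sketch. The handful of table entries is an illustrative assumption about how a Preeti-style layout maps keys; it is not the table from the actual converter, which has to cover far more characters and some reordering rules as well.

```python
# Illustrative subset of a Preeti-style mapping (assumed, not the real table).
PREETI_TO_UNICODE = {
    "g": "\u0928",  # न
    "]": "\u0947",  # े  (vowel sign e)
    "k": "\u092a",  # प
    "f": "\u093e",  # ा  (vowel sign aa)
    "n": "\u0932",  # ल
}


def preeti_to_unicode(text: str) -> str:
    """Swap each legacy ASCII character for its Devanagari code point.

    Characters that are not in the table pass through unchanged.
    """
    return "".join(PREETI_TO_UNICODE.get(ch, ch) for ch in text)


# With the assumed table above, the ASCII string "g]kfn" becomes नेपाल.
print(preeti_to_unicode("g]kfn"))
```

The point is that the stored text really is just `g]kfn`; only the font makes it look like Nepali, and a converter has to swap in the actual Unicode code points.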

Now I’m at the stage of converting a whole bunch of the old activities to the new framework, and the main thing to do is to update the XML configuration to use Unicode text. The in-activity text can stay as Kantipur for now (we’re not recompiling old activity SWFs at this point), but the stuff that appears in the framework itself (activity title, summary, learning goals, help text) needs to be converted, as the new framework isn’t going to have a shred of Kantipur / Preeti in it. There are a few online tools that can convert Preeti to Unicode, albeit not perfectly, and our configuration files have a few little hacks to support a mix of Preeti and actual English, so all the copy + paste + convert + undo hackery was getting tedious.
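As a rough sketch of what this kind of config conversion could look like in Python: only the text of selected elements is converted, and English runs are left alone. Everything specific here is hypothetical: the element names, the `{braces}` convention for marking English, and the tiny mapping table are assumptions for illustration, not our actual configuration format or hacks.

```python
import re
import xml.etree.ElementTree as ET

# Tiny illustrative mapping (assumed); a real table is far larger.
MAPPING = {"g": "\u0928", "]": "\u0947", "k": "\u092a", "f": "\u093e", "n": "\u0932"}

# Hypothetical convention: English runs are wrapped in {braces}.
ENGLISH_RUN = re.compile(r"\{([^{}]*)\}")


def convert(text):
    """Map legacy characters to Unicode; unknown characters pass through."""
    return "".join(MAPPING.get(ch, ch) for ch in text)


def convert_mixed(text):
    """Convert the legacy parts of a string, keeping {English} runs as-is."""
    parts = []
    pos = 0
    for match in ENGLISH_RUN.finditer(text):
        parts.append(convert(text[pos:match.start()]))  # legacy text before the run
        parts.append(match.group(1))                    # English, braces dropped
        pos = match.end()
    parts.append(convert(text[pos:]))                   # trailing legacy text
    return "".join(parts)


def convert_config(xml_source, tags=("title", "summary")):
    """Convert the text of the named elements (hypothetical tag names)."""
    root = ET.fromstring(xml_source)
    for element in root.iter():
        if element.tag in tags and element.text:
            element.text = convert_mixed(element.text)
    return ET.tostring(root, encoding="unicode")


sample = "<activity><title>g]kfn {OLE}</title><id>g]kfn</id></activity>"
print(convert_config(sample))
```

Restricting the conversion to a named set of elements keeps identifiers and other plain-ASCII values in the file from being mangled.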

So, I took the JavaScript conversion code from here, cleaned it up and fixed a few bugs, and made it into a Notepad++ plugin (using the NppScripting plugin, which lets you write JavaScript-based plugins). Now I just select the existing Kantipur garbage, hit Ctrl + Shift + C, and it’s all shiny and Unicode, with a mix of English and Nepali as needed. Tada! I love it when a half day of work actually makes life much easier.

I’ve put this code up as a standalone page, so you can quickly convert text if you need to, and the source is also available in Python, JavaScript (including for Notepad++) and ActionScript on GitHub.
