Incremental search/replace - Katakana alphabet in Japanese text


    Sep 09, 2015 #1

    THE BACKGROUND:
    I have some Japanese text that uses only the Katakana alphabet, and I have a bitmap font system to display it. However, the bitmap font system does not support Unicode in any form, not even UTF-8. I can only use single-byte Windows-1252 encoded text, which is obviously of no use for displaying Katakana.

    THE TASK:
    I have a text file that lists each character (and character combination) of the Katakana alphabet, one per line. What I would like to do is create a look-up table from it and then use that table to perform a search/replace in the main text.
    I want to keep the standard ASCII characters from space (32) to Z (90). So, starting at character code 91 and going up, I need each Katakana character to be assigned a substitute single-byte character. Then I want to use that look-up table to search my main Katakana text file and replace every Katakana character with its valid Windows-1252 substitute.

    Back in the bitmap font system I will read (for example) the character å and know that it should be replaced by the Katakana character オ.
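The look-up idea described above can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions that are not stated in the post: the input files are assumed to be UTF-8, the function names (`build_table`, `encode`, `decode`, `convert_file`) are made up for illustration, and code 127 plus the five byte values that Windows-1252 leaves undefined are skipped when assigning codes.

```python
import re

# Assumption: skip DEL (127) and the bytes Windows-1252 leaves undefined.
UNUSABLE = {0x7F, 0x81, 0x8D, 0x8F, 0x90, 0x9D}


def build_table(entries, first_code=91):
    """Assign one Windows-1252 character to each Katakana entry, in order."""
    codes = (c for c in range(first_code, 256) if c not in UNUSABLE)
    table = {}
    for entry in entries:
        entry = entry.strip()
        if not entry or entry in table:
            continue  # ignore blank lines and duplicates
        try:
            code = next(codes)
        except StopIteration:
            raise ValueError("more entries than free single-byte codes") from None
        table[entry] = bytes([code]).decode("cp1252")
    return table


def encode(text, table):
    """Replace every Katakana entry in text with its substitute character."""
    if not table:
        return text
    # Longest entries first, so a combination such as キャ wins over キ.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: table[m.group(0)], text)


def decode(text, table):
    """Reverse look-up, as the bitmap font system would do it per character."""
    reverse = {sub: kana for kana, sub in table.items()}
    return "".join(reverse.get(ch, ch) for ch in text)


def convert_file(list_path, src_path, dst_path):
    """Hypothetical driver: all three paths are placeholders."""
    with open(list_path, encoding="utf-8-sig") as f:
        table = build_table(f)
    with open(src_path, encoding="utf-8-sig") as f:
        text = f.read()
    with open(dst_path, "w", encoding="cp1252") as f:
        f.write(encode(text, table))
```

Note that the reverse look-up is only unambiguous if the main text contains no genuine characters at code 91 or above (lowercase letters, brackets and so on), which matches the stated wish to keep only space (32) to Z (90).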

    I don't even know if what I am trying to do is possible and if it is, I don't know where to start.
    Any suggestions gratefully received.

    Thanks,
    Craig


      Sep 09, 2015 #2

      I don't really understand what you want to do.

      Wikipedia has articles about Katakana and the Japanese writing system. When Japanese text is not stored as Unicode (UTF-8 is also a Unicode encoding), Windows code page 932 is often used, although it is not recommended for Japanese text; see also Windows Codepage: 932 (Japanese Shift-JIS).
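To illustrate the difference between these encodings, here is a small Python sketch that reports how many bytes a piece of text needs in each of them. The helper name `byte_lengths` is made up for this example; the codec names are the ones Python's standard library uses.

```python
def byte_lengths(text, encodings=("utf-8", "cp932", "cp1252")):
    """Return the encoded size of text per encoding, or None if not encodable."""
    sizes = {}
    for name in encodings:
        try:
            sizes[name] = len(text.encode(name))
        except UnicodeEncodeError:
            sizes[name] = None  # the encoding has no representation for this text
    return sizes


# A full-width Katakana character needs several bytes in UTF-8 and in
# code page 932, and cannot be represented in Windows-1252 at all.
print(byte_lengths("オ"))
```

A system limited to one byte per character in Windows-1252 therefore cannot take code page 932 text directly either.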

      I don't know much about editing and displaying Japanese text. But I know that displaying Japanese text requires a font whose name starts with @, like @Arial Unicode MS, which is designed for vertical text like Japanese. If the Japanese text file is not a Unicode file, it is necessary to open View - Set Font in UltraEdit, select an appropriate font like @Arial Unicode MS, and select Japanese for Script, which selects code page 932 in the font. Next, the code page for the active file should also be set to 932 with View - Set Code Page or via the encoding selector in the status bar (in case of a later conversion to Unicode).

      UltraEdit has View - Views/Lists - ASCII Table, which has the button Select Font at the bottom. Clicking this button and selecting @Arial Unicode MS and Japanese for Script displays the Japanese characters, including those from Katakana, independent of which font, script and code page are set for the active file.

      There is one more ASCII Table view, which is no longer present by default in the menus, toolbars, or key mapping configuration dialog. But I still have a hotkey assigned to the command with the internal command identifier ID_ASCII_TABLE to open this deprecated ASCII Table. I also still have this command in my customized toolbar so that I can open this alternate ASCII Table with a mouse click. So at the moment I have both ASCII Table views open in UltraEdit, one showing the characters with the Japanese code page and the other with the Windows-1252 code page.

      The toolbar customization dialog offers ASCII Window, which opens the same view as View - Views/Lists - ASCII Table, and ASCII Table, which opens the alternate view of the ASCII table that exists mainly for backward compatibility.

      Would it help you to use those two views, one set to Western (Windows-1252) with a font you like and the other set to Japanese (Windows-932) with @Arial Unicode MS?

      Do you know that in UltraEdit you can also configure a separate font for files depending on their file extension?

      I use this feature for editing *.bat and *.cmd files; see Different font depending on file extension.
      Best regards from an UC/UE/UES for Windows user from Austria