Capital letter change in the middle of word

Newbie

    Sep 10, 2005#1

    Can someone help me?
    My problem is:

    I use a video OCR program, but in some places it confuses the characters I and l (167 subtitle files).
    I need an idea for how to detect and fix this with UltraEdit (maybe an UltraEdit macro) wherever a capital letter is not the first character of a word and is preceded by a lowercase letter in the same word.

    Example:

    21
    00:03:43,223 --> 00:03:46,434
    No matter how i may Iook,
    I'm actuaIIy very stubborn.

    (instead of)

    21
    00:03:43,223 --> 00:03:46,434
    No matter how i may look,
    I'm actually very stubborn.

    How can I fix these errors?

    Grand Master

      Sep 11, 2005#2

      If you execute Replace All with the following regular expressions in UltraEdit style, with Match Case enabled, several times until nothing is replaced anymore, every uppercase I inside a word will be replaced by a lowercase l. The first replace is for words starting with a lowercase character, the second one for words starting with an uppercase character, i.e. at the start of a new sentence. Words that wrongly start with an uppercase I are not corrected. You have to search for such words and replace them manually.

      Find What: ^([a-z]+^)I
      Replace With: ^1l

      Find What: ^([A-Z][a-z]+^)I
      Replace With: ^1l
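
For readers without UltraEdit, the two expressions above can be approximated in Python. This is only a sketch: the Python patterns and the name `replace_all_once` are assumptions, not a verified one-to-one translation of the UltraEdit regex engine. It shows why one Replace All pass is not enough for a word with two wrong I characters in a row.

```python
import re

# Assumed Python counterparts of the UltraEdit-style expressions.
# Python's re module is case sensitive by default, like Match Case.
INNER_AFTER_LOWERCASE = re.compile(r"([a-z]+)I")         # ^([a-z]+^)I
INNER_AFTER_CAPITALIZED = re.compile(r"([A-Z][a-z]+)I")  # ^([A-Z][a-z]+^)I


def replace_all_once(text):
    """Run one Replace All pass with each of the two expressions."""
    text = INNER_AFTER_LOWERCASE.sub(r"\g<1>l", text)
    return INNER_AFTER_CAPITALIZED.sub(r"\g<1>l", text)


line = "I'm actuaIIy very stubborn."
once = replace_all_once(line)   # only the first of the two I is fixed
twice = replace_all_once(once)  # the second pass fixes the remaining I
print(once)
print(twice)
```

The second I is missed in the first pass because the search continues after the previous match, so the lowercase letters in front of it have already been consumed.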

      This macro does the same:

      InsertMode
      ColumnModeOff
      HexOff
      UnixReOff
      Loop
      Top
      Find MatchCase RegExp "^([a-z]+^)I"
      IfFound
      Find MatchCase RegExp "^([a-z]+^)I"
      Replace All "^1l"
      Else
      ExitLoop
      EndIf
      EndLoop
      Loop
      Top
      Find MatchCase RegExp "^([A-Z][a-z]+^)I"
      IfFound
      Find MatchCase RegExp "^([A-Z][a-z]+^)I"
      Replace All "^1l"
      Else
      ExitLoop
      EndIf
      EndLoop
      Top
      Find MatchCase RegExp "^([~.!^? ^t^p][ ^t^p]+^)I^([a-z]^)"
      Replace All "^1l^2"
      Top

      UnixReOn

      Remove the last command (UnixReOn) if you use regular expressions in UltraEdit style by default instead of Unix style - see Advanced - Configuration - Find - Unix style Regular Expressions. UnixReOn/UnixReOff modifies this setting.

      Normally OCR software can be trained by the user to correctly distinguish an I from an l.
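
The loop structure of the macro (find, Replace All, repeat until nothing is found) could be sketched in Python roughly as follows. The helper name `replace_until_stable` is hypothetical, and the patterns are assumed Python counterparts of the UltraEdit expressions, so treat this as an illustration rather than an exact port of the macro.

```python
import re
from typing import Tuple


def replace_until_stable(pattern: str, replacement: str, text: str) -> Tuple[str, int]:
    """Repeat a replace-all until it finds nothing more.

    Returns the final text and the number of passes that changed something.
    The loop ends only if each pass removes matches, which holds here because
    every replacement turns one uppercase I into a lowercase l.
    """
    passes = 0
    while True:
        text, count = re.subn(pattern, replacement, text)
        if count == 0:  # corresponds to Else / ExitLoop in the macro
            return text, passes
        passes += 1


# First loop of the macro: I after lowercase letters.
fixed, passes = replace_until_stable(r"([a-z]+)I", r"\g<1>l", "actuaIIy")
print(fixed, passes)
```

Using `re.subn` keeps the "was anything found" check and the replacement in one call, whereas the macro needs a separate Find before each Replace All.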
      Best regards from an UC/UE/UES for Windows user from Austria

      Newbie

        Sep 11, 2005#3

        Mofi wrote: If you execute Replace All for following regular expressions in UltraEdit style with Match Case several times until nothing is replaced anymore, the uppercase I inside of words will be replaced by lowercase l. ...........
        Thanks for the advice.
        I have one more question.
        How can I modify your macro so that it also searches for and replaces a capital I at the start of a word that is not the first word of the sentence? In other words, an I that follows a space character.

        Example:

        Well, shall we cross and Iook for other lakes?

        ----------------
        And sorry for posting the question twice, also in another topic.

        Grand Master

          Sep 12, 2005#4

          As I have already written, if a word starts with I, it is hard to detect whether it is a wrongly detected l. I have found an expression in UltraEdit style which hopefully can detect wrongly capitalized words.

          Find MatchCase RegExp "^([~.!^? ^t^p][ ^t^p]+^)I^([a-z]^)"
          Replace All "^1l^2"

          This expression works as follows:

          Find 1 character which is not a period '.', exclamation mark '!', question mark '?', space ' ', tab or line ending CR+LF. You can add further characters that mark the start of a new sentence/paragraph, for example a double quote " or a colon :.

          Next, the string must contain 1 or more spaces and/or tabs and/or line endings CR+LF.

          The next character after the whitespace character(s) must be an uppercase I.

          The character following the I must be a lowercase letter.


          Here is the example which I used for testing:

          Hello, I'm not happy about that. Ice tastes well.
          Ice is cold. Well, shall I cross and Iook for other lakes?

          Idea, I need an idea.
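
The expression and the test text above can also be tried in Python. The pattern below is an assumed counterpart of the UltraEdit expression (the line ending is written as the two characters `\r` and `\n` inside the sets), and `fix_word_initial_i` is a hypothetical name.

```python
import re

# Group 1: one character that does not end a sentence and is not whitespace,
#          followed by one or more spaces, tabs or line-ending characters.
# Then an uppercase I, and group 2: one lowercase letter.
WORD_INITIAL_I = re.compile(r"([^.!? \t\r\n][ \t\r\n]+)I([a-z])")


def fix_word_initial_i(text):
    """Replace a word-initial I by l unless it follows the end of a sentence."""
    return WORD_INITIAL_I.sub(r"\g<1>l\g<2>", text)


sample = (
    "Hello, I'm not happy about that. Ice tastes well.\n"
    "Ice is cold. Well, shall I cross and Iook for other lakes?\n"
    "\n"
    "Idea, I need an idea."
)
print(fix_word_initial_i(sample))
```

In this sample only Iook is changed. The pronoun I is kept because no lowercase letter follows it, and Ice and Idea are kept because they follow a sentence end or start the text.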

          Newbie

            Sep 12, 2005#5

            Mofi wrote:......Idea, I need an idea.
            Look at these sentences.
            In 99 percent of the cases the wrong I is in the middle of a word, or it is the first character of a word and is followed by another vowel.

            Examples:

            are the Iast one Ieft who is able

            Well, shall we cross and Iook for other Iakes?

            You just have to Iend me a boat
            ----------------------------------

            For better visibility I show the space as the _ character here and use the | character to mark the part to be changed.

            _Io|ok, _Ie|ad, _Ie|nd, _Ia|ke, _Ii|ke, _Iu|re (these letters need to change)

            _lo|ok, _le|ad, _le|nd, _la|ke, _li|ke, _lu|re


            In English, a word that is not the first word of the sentence almost never has a capital letter followed by another vowel, and 99% of the errors in my texts are of this kind.
            Unfortunately this OCR program is the only one I know that can detect hard-coded (permanent) subtitles in AVI files, but its output contains these errors in many places (I and l swapped).
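
The vowel rule proposed in this post could be sketched in Python as follows. This is a hypothetical variant, not the expression used in the macro: it changes a word-initial I only when a vowel follows and the word does not come directly after a sentence end. The name `fix_i_before_vowel` is made up for this illustration.

```python
import re

# Same context check as for any word-initial I (previous character is not
# '.', '!', '?' or whitespace), but the letter after the I must be a vowel.
WORD_INITIAL_I_BEFORE_VOWEL = re.compile(r"([^.!? \t\r\n][ \t\r\n]+)I([aeiou])")


def fix_i_before_vowel(text):
    """Replace I by l at the start of a mid-sentence word if a vowel follows."""
    return WORD_INITIAL_I_BEFORE_VOWEL.sub(r"\g<1>l\g<2>", text)


for line in (
    "are the Iast one Ieft who is able",
    "Well, shall we cross and Iook for other Iakes?",
    "You just have to Iend me a boat",
):
    print(fix_i_before_vowel(line))
```

The trade-off of this variant is that a mid-sentence name such as Ian or Iowa would still be changed by mistake, while a wrong I in front of a consonant would be missed.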

            Grand Master

              Sep 12, 2005#6

              Your example is no problem for the third search and replace, which I wrote in my previous reply. Add the two commands to the macro immediately before the last command (UnixReOn) and every wrong I will be replaced by l - see the edited macro code above.

              Newbie

                Sep 12, 2005#7

                Mofi wrote:Your example is no problem for the third search and replace I wrote in the second reply. Add the two commands to the macro immediately before last command (UnixReOn) and all wrong I will be replaced by l - see edited macro code again.
                Hm.
                Interesting, it is not working for me.
                The change works only in the middle of words.
                Nothing happens to an I that is at the start of a word which is not the first word of the sentence.

                See
                Original line:

                10
                00:02:47,625 --> 00:02:50,712
                are the Iast one Ieft who is abIe
                to protect Hijiri IsIand

                With the modified macro:

                10
                00:02:47,625 --> 00:02:50,712
                are the Iast one Ieft who is able
                to protect Hijiri Island

                The words Iast and Ieft were not changed, but the macro worked fine on the words able and Island.

                Grand Master

                  Sep 13, 2005#8

                  Iast and Ieft are correctly modified to last and left by the modified macro. Make sure you really have copied the macro code correctly - see the macro code above.

                  And Hijiri Island is an example of why I wrote at first that you have to manually replace an uppercase I at the start of a word. The modified macro now replaces IsIand with lsland, because it cannot know that this is a proper name.

                  Newbie

                    Sep 14, 2005#9

                    I used the correct macro code, but it is not working for me. I checked at least ten times.

                    Grand Master

                      Sep 14, 2005#10

                      Have you enabled the option Continue if a Find with Replace not found in the properties of this macro?

                      If you have not enabled this option, only the first search and replace loop will be executed by the macro.