How best to convert words/strings in ALL CAPS to Title or Sentence Case

2
Newbie

    17:15 - May 25#1

    I've spent the last couple of days reading various highly knowledgeable solutions (All Hail @Mofi) for this task, but I'm more confused now than I was at the start. I'm just starting a hobby as an amateur American English subtitle corrector. You would not believe all the stupendously stupid subtitle renditions of spoken film and video dialogue out there! When I've finished correcting the errors, I plan to upload the revisions to subtitle hosts for free.

    What follows is not in error, but it is an example from the subtitle file I'm currently working on: Star Trek: First Contact...

    2
    00:02:59,440 --> 00:03:01,311
    [DISTORTED BORG VOICES]
    
    3
    00:03:32,299 --> 00:03:33,256
    [GASPING]
    This particular file is quite good, but I hate reading text in all caps. And sadly, a significant fraction of all English subtitle files are entirely in caps! So what I need is a macro or script or whatever is most appropriate that will convert text in all caps to either title case (with each word capitalized) or sentence case (only the first letter of each sentence capitalized). From what I've read so far, apparently I need to use the Perl regular expression syntax to perform this task most effectively (which is somewhat unfortunate since I'm much more comfortable with the UltraEdit legacy regexps).

    But I just don't know how to do this. I had a career in systems software development (even developing a few full-power custom text editors), but all that is 20 years behind me now, and I'm no longer anywhere near the top of my game.

    Would you wise folks help me with this, please?

    ETA: I don't expect the macro/script to know when to convert to sentence case and when to choose title case. That's my job...

    Ovg
    115261
    Master

      17:59 - May 25#2

      It seems that UE has built-in functions for that.
      Select the desired text or the entire file and choose the desired action:

      20230525_205231.png (75.05KiB)
      It's impossible to lead us astray for we don't care even to choose the way.

      AmbyT47
      2
      Newbie

        18:37 - May 25#3

        Ovg wrote:
        17:59 - May 25
        It seems that UE has built-in functions for that.
        Select the desired text or the entire file and choose the desired action:

        20230525_205231.png
        Wow! I feel like a dunce for missing that, but I feel gratitude more strongly. Thank you!

        Ovg
        115261
        Master

          19:05 - May 25#4

          You're welcome!

          Mofi
          6,52749433
          Grand Master

            5:38 - May 26#5

            Reformatting text according to English grammar rules is not easy. The following macro code can be used as a starting point.

            InsertMode
            ColumnModeOff
            HexOff
            PerlReOn
            Top
            Find MatchCase RegExp "^(\[?\w)([^\r\n.!?]*)"
            Replace All "\U\1\E\L\2\E"
            Find MatchCase RegExp "([.!?][\t \xA0]+)(\w)([^.!?]*)"
            Replace All "\1\U\2\E\L\3\E"
            Find RegExp "(\bi\.?e\.,? \w[^\r\n.!?]*)"
            Replace All "\L\1\E"
            Find MatchCase RegExp "\<i\>(?!\.)"
            Replace All "I"
            
            The first Perl regular expression searches for "sentences" that begin at the start of a line, optionally after an opening square bracket, with a word character (as defined by the Unicode specification) and that end before a carriage return, line-feed, dot, exclamation mark or question mark. It converts the first word character to upper case and all other characters to lower case.

            The second Perl regular expression searches for "sentences" that begin after a dot, exclamation mark or question mark followed by one or more normal spaces, horizontal tabs or no-break spaces, and that end before the next dot, exclamation mark or question mark. Unlike the first expression, it does not stop at a carriage return or line-feed, so a sentence that continues on the next line is matched as a whole. It converts the first word character to upper case and all other characters to lower case.
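
For anyone who wants to try the first two replaces outside of UltraEdit, here is a minimal sketch in Python. It is only an approximation under assumptions: the function name `sentence_case` is made up for illustration, and Python's `re` module is not the same engine as UltraEdit's Perl regular expressions, so the `\U...\E` and `\L...\E` case conversions are emulated with small replacement functions.

```python
import re


def sentence_case(text: str) -> str:
    """Rough Python equivalent of the first two macro replaces (hypothetical helper)."""
    # Step 1: at the start of each line, uppercase the first word character
    # (optionally preceded by "[") and lowercase everything up to the next
    # dot, exclamation mark, question mark or line end.
    text = re.sub(
        r"^(\[?\w)([^\r\n.!?]*)",
        lambda m: m.group(1).upper() + m.group(2).lower(),
        text,
        flags=re.MULTILINE,
    )
    # Step 2: after . ! ? and one or more spaces, tabs or no-break spaces,
    # uppercase the next word character and lowercase everything up to the
    # next . ! ? (this part may run across line breaks).
    text = re.sub(
        r"([.!?][\t \xA0]+)(\w)([^.!?]*)",
        lambda m: m.group(1) + m.group(2).upper() + m.group(3).lower(),
        text,
    )
    return text


print(sentence_case("[DISTORTED BORG VOICES]"))
print(sentence_case("STOP! WHO ARE YOU? I AM LOCUTUS."))
```

Subtitle timestamp lines contain no letters, so both steps leave them unchanged. Names such as "Borg" or "Locutus" end up in lower case and still need a later fix.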

            The problem now is deciding which dot is a punctuation mark and which one is used otherwise, such as in a date, after a number, in an abbreviation, ...

            The third Perl regular expression handles ie. and ie., as well as i.e. and i.e., which are hopefully always found in the middle of a sentence and never at the beginning of one. It converts to lower case the abbreviation and everything that follows up to the next carriage return, line-feed, dot, exclamation mark or question mark.
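
The i.e. repair can be sketched in Python in the same spirit. This is an illustration only, with an assumed helper name `lowercase_ie_clause`. The sketch anchors the pattern with `\b` so that it cannot start inside a word ending in "ie." such as "Charlie.", and it uses `re.IGNORECASE` because the macro's Find runs without MatchCase.

```python
import re


def lowercase_ie_clause(text: str) -> str:
    """Lowercase 'ie.' / 'i.e.' (with optional comma) and the clause after it.

    Hypothetical helper mirroring the third macro replace.
    """
    return re.sub(
        # \b keeps the match from starting inside a word such as "Charlie."
        r"(\bi\.?e\.,? \w[^\r\n.!?]*)",
        lambda m: m.group(1).lower(),
        text,
        flags=re.IGNORECASE,
    )


# After the sentence-case steps, the word following "i.e." is wrongly capitalized:
print(lowercase_ie_clause("Use a regex, i.e. A pattern."))
```

The dot inside the abbreviation is exactly the kind of dot the earlier steps mistake for the end of a sentence, which is why this repair has to run after them.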

            Many more such Perl regular expression replaces are necessary for extra grammar rules like this one. Applying all special grammar rules is really difficult. Think about abbreviations such as etc. or ..., which can occur in the middle of a sentence, where the (last) dot is just part of the abbreviation, as well as at the end of a sentence, where the (last) dot is part of the abbreviation and also the punctuation mark of the sentence. So the text will always need final proofreading by a human who has learned the complex grammar rules of the English language over years.

            The last Perl regular expression converts to upper case the single character i, which in English must always be written as I, but only if the next character is not a dot as in i.e., where upper case would be wrong again.
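
A comparable Python sketch for the standalone i, again only as an illustration: the name `capitalize_pronoun_i` is an assumption, and `\b` is used here in place of the `\<` and `\>` word-boundary escapes from the macro.

```python
import re


def capitalize_pronoun_i(text: str) -> str:
    """Uppercase a standalone lowercase 'i' unless a dot follows it (as in 'i.e.').

    Hypothetical helper mirroring the last macro replace; the match is
    case-sensitive, like the macro's MatchCase search.
    """
    return re.sub(r"\bi\b(?!\.)", "I", text)


print(capitalize_pronoun_i("Then i said i.e. nothing, and i'm done."))
```

One side effect of the dot check is that an i directly before a sentence-ending dot, as in "so am i.", is left in lower case, so that case still needs proofreading.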

            More Perl regular expressions should also be added to make words like weekdays, countries, languages and other well-known names begin with an uppercase character.
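
One possible way to handle such names is a word list, sketched below in Python. The list `PROPER_NOUNS` and the function name are purely illustrative assumptions; a real list would be much longer and would need care with ambiguous words (for example "may" or "march", which are also ordinary words).

```python
import re

# Hypothetical, deliberately short sample list; extend as needed.
PROPER_NOUNS = [
    "monday", "tuesday", "wednesday", "thursday", "friday",
    "saturday", "sunday", "austria", "english", "borg",
]

# One alternation of whole words, matched regardless of current case.
_PROPER_NOUN_RE = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in PROPER_NOUNS) + r")\b",
    re.IGNORECASE,
)


def capitalize_proper_nouns(text: str) -> str:
    """Give every listed name an initial capital and lowercase the rest."""
    return _PROPER_NOUN_RE.sub(lambda m: m.group(1).capitalize(), text)


print(capitalize_proper_nouns("[Distorted borg voices]"))
print(capitalize_proper_nouns("See you on monday in austria."))
```

This only restores names that are in the list, so the final human proofreading step is still required.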
            Best regards from an UC/UE/UES for Windows user from Austria