I ran into a very strange situation.
After copying and pasting text from a web page, I realized that I would need to do some regular expression tweaking to fix the problem.
The page is this:
https://www.letras.mus.br/michel-legran ... ducao.html
When I copy the lyrics of the song together with its translation and paste them into UltraEdit, each line of the original lyrics ends up glued to the last letter of the translation, all on the same line.
The original line should appear below the translation.
Like this:
Code: Select all
Como uma pedra que se atira
Comme une pierre que l'on jette
Na corrente de um riacho
Dans l'eau vive d'un ruisseau
E que deixa atrás dela
Et qui laisse derrière elle
But it comes out like this:
Code: Select all
Como uma pedra que se atiraComme une pierre que l'on jette
Na corrente de um riachoDans l'eau vive d'un ruisseau
E que deixa atrás delaEt qui laisse derrière elle
I'm struggling to find a regular expression that fixes this.
I came up with this one:
Code: Select all
^([A-Z])([a-z|\s]*)([A-Z])
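It captures the first and second groups up to the first capital letter, to keep that part on the first line.
Then it should capture the third and fourth groups up to the end of the line, to move them to the line below.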
I would write the rest of the expression after I had the groups defined correctly.
If I test it on regex101.com, it appears to be fine: https://regex101.com/r/OlzOly/
But UltraEdit finds a different match.
There, my expression captures everything up to the apostrophe (').
Code: Select all
Como uma pedra que se atiraComme une pierre que l
It ignores the second capital letter.
What's wrong?
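One possible explanation, offered as an assumption rather than a confirmed diagnosis: the UltraEdit result is exactly what this expression returns when it is evaluated case-insensitively, for example if a match-case option is switched off in the search dialog. The Python sketch below reproduces both results with the same expression and the first sample line. Python's `re` module only stands in for UltraEdit's engine here.

```python
import re

# The expression from the post, unchanged.
PATTERN = r"^([A-Z])([a-z|\s]*)([A-Z])"
LINE = "Como uma pedra que se atiraComme une pierre que l'on jette"

# Case-sensitive: [a-z|\s]* stops before the second capital letter.
sensitive = re.match(PATTERN, LINE)

# Case-insensitive: [a-z] also accepts capitals and [A-Z] also accepts
# lowercase letters, so the match only stops at the apostrophe.
insensitive = re.match(PATTERN, LINE, re.IGNORECASE)

print(repr(sensitive.group(0)))
print(repr(insensitive.group(0)))
```

The case-sensitive call matches `Como uma pedra que se atiraC`, as on regex101, while the case-insensitive call matches `Como uma pedra que se atiraComme une pierre que l`, which is the text UltraEdit selects.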
I also realized that I have to account for another situation: lines that contain an apostrophe.
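A minimal sketch of one way to cope with apostrophes and accented letters, written in Python purely as an illustration: instead of listing every character allowed on the first line, it looks for the position where a lowercase letter is immediately followed by a capital letter and inserts a line break there. The helper name `split_glued_line` and the accented character ranges are assumptions of this sketch. It depends on case-sensitive matching and on lookaround support, and it would miss a join where the translation ends in punctuation.

```python
import re

# Position between a lowercase letter and a capital letter with nothing
# in between. The accented (Latin-1) ranges are an assumption of this sketch.
BOUNDARY = re.compile(r"(?<=[a-zà-öø-ÿ])(?=[A-ZÀ-ÖØ-Þ])")


def split_glued_line(line: str) -> str:
    """Insert a line break at the first lowercase-to-capital boundary."""
    return BOUNDARY.sub("\n", line, count=1)


# The three glued sample lines from the post.
glued = [
    "Como uma pedra que se atiraComme une pierre que l'on jette",
    "Na corrente de um riachoDans l'eau vive d'un ruisseau",
    "E que deixa atrás delaEt qui laisse derrière elle",
]

for line in glued:
    print(split_glued_line(line))
```

Because the apostrophe is never part of the pattern, it no longer affects where the split happens.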
I also have another question:
Why do certain web pages not respect line breaks?