I am working on identifying consonant combos in an English-to-IPA dictionary which I have compiled.
A short sample is given below.
As can be seen, the data structure is:
What I need is a macro or regex replace which will identify all consonant conjuncts i.e. two consonants coming together under the following conditions:
Explanation:
Thus a combo of the type b'l in the middle of the word such as
but
At present I am doing the cleanup by hand. A macro which would identify such combinations and separate the consonant clusters by a + would be most useful as in:
would become
Three-consonant combos are rare and, if found, could also be separated with a + sign.
A short sample is given below.
Code: Select all
ability=ə'bɪlɪti:
abject='æbʤekt
abjection=æb'ʤekʃən
abjections=æb'ʤekʃənz
abjectly='æbʤekt-li:
abjuration=,æbʤʊə'reɪʃən
abjurations=,æbʤʊə'reɪʃənz
abjure=əb'ʤʊər
abjured=əb'ʤʊərəd
abjures=əb'ʤʊəz
abjuring=əb'ʤʊərɪŋg
ablative='æb'lətɪv
ablatives='æb'lətɪvz
ablaut='æb'l aʊt
ablauts='æb'l aʊts
ablaze=ə'bleɪz
able='eɪbəl
able-bodied=,eɪbəl-'bɔːdi:d
abler='eɪblər
ablest='eɪblest
ablution=ə'bluːʃən
ablutions=ə'bluːʃənz
ably='eɪb-li:
abnegation=,æbnɪ'geɪʃən
Code: Select all
English word=IPA
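To make the structure concrete, here is a minimal Python sketch of how one line could be split into its two columns. The function name parse_entry is only illustrative, and I am assuming that the first = on a line is always the separator.

```python
def parse_entry(line):
    """Split 'English word=IPA' at the first '='.

    Returns (word, ipa), or None when the line has no '=' at all.
    Assumption: the first '=' on the line is always the column separator.
    """
    word, sep, ipa = line.rstrip("\n").partition("=")
    if not sep:
        return None
    return word, ipa


# Two lines taken from the sample above.
sample = ["ability=ə'bɪlɪti:", "able-bodied=,eɪbəl-'bɔːdi:d"]
entries = [parse_entry(s) for s in sample]
```

Only the second element of each pair (the IPA column) would then be searched for conjuncts.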
- Only in the IPA column i.e. the column to the right.
- Only when not in initial or final position i.e. in the middle of the IPA string.
- Conjuncts are consonants which come together in the middle of the word and are not separated by a ' or a , [stress markers]. 2 or 3 consonants can come together. 3 consonant combos are rare.
- All the consonants which can form combos in medial position are given below.
Code: Select all
[bdfghjklmnprstvwyzðŋʃʒʤʧθ]
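The conditions above could be sketched in Python roughly as follows. This is only my reading of the rules, not a finished solution: I am assuming that "initial" means the cluster starts the IPA string (ignoring any leading ' or ,) and that "final" means the cluster ends the string. The name find_medial_clusters is made up for illustration.

```python
import re

# The consonant set given above, used as a regex character class.
CONSONANTS = "bdfghjklmnprstvwyzðŋʃʒʤʧθ"
# A conjunct is a run of two or more of these with nothing in between,
# so a ' , - or space between two consonants breaks the run.
CLUSTER_RE = re.compile(f"[{CONSONANTS}]{{2,}}")
STRESS_MARKS = "',"


def find_medial_clusters(ipa):
    """Return (position, cluster) for every conjunct that is neither
    at the start nor at the end of the IPA string."""
    # Index of the first character after any leading stress markers.
    first = len(ipa) - len(ipa.lstrip(STRESS_MARKS))
    found = []
    for m in CLUSTER_RE.finditer(ipa):
        is_initial = m.start() <= first
        is_final = m.end() == len(ipa)
        if not is_initial and not is_final:
            found.append((m.start(), m.group()))
    return found
```

With this reading, æb'ʤekʃənz reports only kʃ: the b and ʤ are separated by a stress marker, and nz is in final position.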
Thus a combo of the type b'l in the middle of the word such as
Code: Select all
'æb'lətɪv will not be considered since the consonants are separated by a '
able-bodied=,eɪbəl-'bɔːdi:d will not be considered either, since l and b are separated by a hyphen and also a '
Code: Select all
'ɔːdliː will be identified since the two consonants immediately come together.
Code: Select all
,æbnɪ'geɪʃən
Code: Select all
,æb+nɪ'geɪʃən
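The replacement I have in mind could be sketched in Python like this. It is a rough sketch under my own assumptions: the names split_clusters and rewrite_line are hypothetical, the first = is taken as the column separator, a cluster counts as initial or final only when it touches the start (after any leading stress markers) or the end of the IPA string, and a three-consonant combo gets a + between every consonant.

```python
import re

CONSONANTS = "bdfghjklmnprstvwyzðŋʃʒʤʧθ"
CLUSTER_RE = re.compile(f"[{CONSONANTS}]{{2,}}")


def split_clusters(ipa):
    """Insert '+' between the consonants of every medial conjunct."""
    # Index of the first character after any leading ' or , markers.
    first = len(ipa) - len(ipa.lstrip("',"))

    def repl(m):
        # Leave initial and final clusters exactly as they are.
        if m.start() <= first or m.end() == len(ipa):
            return m.group()
        # Assumption: a 3-consonant cluster becomes c+c+c.
        return "+".join(m.group())

    return CLUSTER_RE.sub(repl, ipa)


def rewrite_line(line):
    """Apply split_clusters to the IPA column only (right of the first '=')."""
    word, sep, ipa = line.partition("=")
    if not sep:
        return line  # not a dictionary entry; leave untouched
    return word + sep + split_clusters(ipa)


print(rewrite_line("abnegation=,æbnɪ'geɪʃən"))  # abnegation=,æb+nɪ'geɪʃən
```

Note that under these assumptions ə'bleɪz would become ə'b+leɪz, because the bl is not at the start of the string; whether a cluster directly after a stress marker should be skipped is something the rule would still need to settle.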
Many thanks for solving this conundrum.