My problem is as described below.
I am working on name variants, and one of the tools developed in C is a Metaphone engine. The engine looks for name variants and conjoins them to provide similar homographs. It is designed for Indian languages, but the examples given below are in English. The engine is based on a large database of homographs with the following structure:
Code:
name=name variant
Each line thus holds a name and one variant of it, separated by an equals sign. An example will make this clear:
Code:
Mark=Marc
Mark=Marque
Since the database was prepared manually, duplicates have often been created where the left-hand variant and the right-hand variant are cross-linked, as in the example below:
Code:
Mark=Marc
Marc=Mark
Such additions have bloated the database, and they confuse the engine, which goes into a loop.
What I need is a script or macro which will remove such cross-linked duplicates.
Example of input and output.
Input:
Code:
Mark=Marc
Mark=Marque
Marc=Mark
Marque=Mark
Expected output after removal of duplicates:
Code:
Mark=Marc
Mark=Marque
The cross-linked data is removed.
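To make the expected behaviour concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under a few assumptions: each line has the form name=variant, matching is case-sensitive, the first occurrence of a pair is the one to keep, and the function name remove_cross_links is just a placeholder, not part of the engine.

```python
def remove_cross_links(lines):
    """Keep the first occurrence of each name/variant pair, in either order."""
    seen = set()   # unordered pairs already kept
    kept = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        left, sep, right = entry.partition("=")
        if not sep:
            kept.append(entry)  # no "=" on the line: pass it through unchanged
            continue
        # frozenset makes Mark=Marc and Marc=Mark compare as the same pair
        key = frozenset((left.strip(), right.strip()))
        if key in seen:
            continue  # cross-linked (or exact) duplicate: drop it
        seen.add(key)
        kept.append(entry)
    return kept


sample = ["Mark=Marc", "Mark=Marque", "Marc=Mark", "Marque=Mark"]
print("\n".join(remove_cross_links(sample)))
```

On the sample input this prints only Mark=Marc and Mark=Marque. Note that, as written, it would also drop a line that is repeated verbatim, which I assume is acceptable for this database.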
Many thanks in anticipation for your help. And Happy Holidays to all members of the forum.