Department of Computer Science
Browsing Department of Computer Science by Subject "Algorithm"
Efficient name search algorithm of Pakistani names (UNIVERSITY OF MANAGEMENT AND TECHNOLOGY, 2015) Yousaf, Nadia

Name matching plays a crucial role in many applications. In every profession, information is stored in and retrieved from repositories in English. This data may be the names of the people working there or any other type of data. Because names introduce unavoidable variations and errors in every application, many algorithms have been developed to match them. These algorithms address spelling, pattern, and phonetic variations. Most existing techniques cover the English language. Pakistan’s official language is English and its mother language is Urdu, so all government documents, data storage, and business and professional activities use English. Storing Urdu names in English is therefore mandatory. Matching Pakistani names against names stored in computer databases or files (written in Roman Urdu) can produce a large variety of possible spelling variations. For example, the Muslim Pakistani name “Mohamed” can be represented as “Mohammed,” “Muhhamad,” “Muhamud,” etc. More sophisticated techniques are required to accommodate this large range of spelling variations. This research discusses different methods for string matching and proposes an efficient approach, PPNM (Pakistani Phonetic Name Matching), which phonetically matches name strings using a set of preprocessing rules for the Urdu language proposed in this thesis. No technique has so far been designed and implemented specifically for Pakistani names. Another contribution of this thesis is a new dataset of Pakistani names that covers their spelling variations. The approach is implemented and then validated through a number of experiments on the created dataset. Compared with the edit distance technique for name searching, it can be called an efficient approach for Pakistani names.
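The contrast the abstract draws between edit distance and phonetic matching can be illustrated with a minimal Python sketch. The abstract does not list the PPNM preprocessing rules, so `toy_phonetic_key` below is a hypothetical stand-in (lowercase the name, keep the first letter, drop later vowels, collapse repeated letters) and should not be read as the thesis's method. `edit_distance` is the standard Levenshtein distance, used here as the baseline the abstract mentions.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def toy_phonetic_key(name: str) -> str:
    """Hypothetical normalization, NOT the PPNM rules from the thesis."""
    s = re.sub(r"[^a-z]", "", name.lower())
    if not s:
        return ""
    # Assumption: vowels after the first letter carry little signal.
    key = s[0] + re.sub(r"[aeiou]", "", s[1:])
    # Assumption: doubled letters are spelling noise, so collapse them.
    return re.sub(r"(.)\1+", r"\1", key)


if __name__ == "__main__":
    variants = ["Mohamed", "Mohammed", "Muhhamad", "Muhamud"]
    for v in variants:
        d = edit_distance(variants[0].lower(), v.lower())
        print(v, "distance:", d, "key:", toy_phonetic_key(v))
```

Under these toy rules, the four spellings from the abstract reduce to one key, while their raw edit distances differ, which is the kind of variation a phonetic approach is meant to absorb. A real rule set for Roman Urdu would need to be far more careful than this.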