MS DEPARTMENT OF INFORMATION SYSTEM
Browsing MS DEPARTMENT OF INFORMATION SYSTEM by Issue Date
Predicting employee attrition by using data science (UMT, Lahore, 2018) Saud Bin Tahir
Employee attrition is related to the recruitment and selection process of human resource management. Attrition is basically of two types, voluntary and involuntary. In the first, an employee quits the job himself or herself; in the second, the employer lets employees go, for whatever reason. Companies that can afford it are now trying to develop systems that are integrated into the human resource management system and can predict different trends, especially those related to attrition of employees from the company. The purpose of this research is to apply methods of predictive analytics to solve such human resource management issues. In the end, this research highlights the top factors contributing to attrition. The research tackles the issue in several ways, one of which is to obtain the data and apply a decision tree algorithm, which in turn suggests the important factors contributing to high turnover. We have used the C5.0 algorithm. Data mining is an emerging field that can help answer questions related to turnover analysis, labor planning and recruitment analysis.

COMPARATIVE ANALYSIS OF LINK PREDICATION TECHNIQUES (UMT, Lahore, 2018) Haseeb Ahmad
In data mining, prediction is among the most attractive and beneficial tasks for making the right decision. Recently, link prediction has proved its importance to many researchers in general, and especially in social network analysis, bioinformatics, complex interconnected networks, and chemical interconnection networks. By finding missing links, many complex patterns in big data have been uncovered, which has added value to the old data present in our archives.
Different kinds of algorithms have been proposed to find links in graph-based data. They fall into three main categories: maximum-likelihood-based algorithms, probability-based algorithms and similarity-based algorithms. Each is best in its own context, and many researchers have worked on link mining or link prediction within each of these categories. In this research I survey the proposed algorithms belonging to these categories, carry out a comparative analysis based on the survey, close with results and discussion, and suggest further directions based on the survey results.

IMAGE COMPLETION WITH DEEP LEARNING (UMT, Lahore, 2018) Aqeel Shamas
Image recognition is very important these days. It is widely used in security systems to identify and track objects. Much of the time we have incomplete images, images with missing parts, or low-quality images, all of which are very difficult to process. In this research, I try to implement GANs to complete the missing parts of images. We also look closely at these networks to understand how they work and how we can make them more efficient and more accurate. I will use TensorFlow to implement these networks in Python. I will interpret images as samples, generate fake images, and use the discriminative network to select the best-fitting fake image for completion. Generative adversarial networks (GANs), the approach used in this research, are an artificial intelligence technique used in unsupervised machine learning. They were introduced by Ian Goodfellow et al. in 2014. Using these networks we can generate photographs that look superficially authentic to human observers and have many realistic characteristics. A GAN uses two neural networks that contest with each other in a zero-sum game framework.
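For reference, the two-network contest described in this abstract is usually written as the minimax objective from Goodfellow et al. (2014). The symbols below are not in the abstract and are introduced here only for illustration: G is the generator, D the discriminator, p_data the true data distribution and p_z the noise distribution fed to the generator.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Because the discriminator maximizes V while the generator minimizes the same quantity, whatever one network gains the other loses, which is the sense in which the setup is zero-sum.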
“A zero-sum game is a mathematical representation of a situation in which each participant's gain or loss is balanced by the losses or gains of the other participants. If the total gains of the participants are added up and the total losses are subtracted, the sum is always zero.” In GANs, one neural network generates the output and the other evaluates it. The second, discriminative network discriminates between samples from the true data distribution and outputs produced by the generator.

Impact of intellectual capital efficiency (ICE) on Financial & Market Performance: Evidence from Pakistan (UMT, Lahore, 2018) Maryam Khan

EMOTIONS DETECTION FROM TEXTUAL DATA USING MACHINE LEARNING TECHNIQUES (UMT, Lahore, 2018) Muhammad Yar
Emotion detection from textual data is a comparatively new classification task. People are identified through their expressions and emotions. Emotions can be expressed through various modalities including face, voice, body language, physiology, brain imaging and text. The objective of this thesis is to identify emotions from text using machine learning techniques. The text can be a sentence, a paragraph, a book, a news article or a written speech, and emotions can be detected in any text. According to science, humans have twenty-seven different types of emotions. In this research we built the emotion vocabularies ourselves, because the available emotion vocabularies cover only two to four types, which is neither sufficient for valuable results nor satisfactory for further research. We obtained the text of ten speeches by different personalities. We applied and compared five classifiers on these speeches one by one: Naïve Bayes Classifier (NBC), Support Vector Machine Classifier (SVMC), Linear Support Vector Machine Classifier (LSVMC), Logistic Regression Classifier (LRC) and Stochastic Gradient Descent Classifier (SGDC).
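As a rough illustration of the first classifier in that list, the sketch below trains a tiny multinomial Naïve Bayes model on labelled sentences using only the Python standard library. It is not the thesis implementation: the function names, the two emotion labels and the toy sentences are all hypothetical, and the thesis's own vocabularies and speeches are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # token frequencies per label
    vocabulary = set()
    for text, label in samples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy sentences; not the speeches used in the thesis.
TRAINING = [
    ("we are proud and happy today", "joy"),
    ("this victory fills us with joy", "joy"),
    ("we mourn this terrible loss", "sadness"),
    ("our hearts are heavy with grief", "sadness"),
]

model = train_naive_bayes(TRAINING)
print(predict(model, "a happy and proud victory"))
print(predict(model, "grief and loss"))
```

The other four classifiers named in the abstract would replace only the scoring step; the tokenisation and the labelled training pairs would stay the same.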
We found that the Linear Support Vector Machine Classifier (LSVMC) achieved the highest accuracy, 59.70%, and the Logistic Regression Classifier (LRC) the second highest, 59.49%.

FAKE NEWS DETECTION USING MACHINE LEARNING TECHNIQUES (UMT, Lahore, 2018) ADNAN HAIDER
With the expansion of social media, fake news detection has gained a lot of popularity among researchers around the world. Fabricated and false information is spread on online networks to manipulate people's views. Misleading wording makes it easy for individuals to be taken in by fake news and to share it without verification. The wide spread of false information in recent years has raised great concern. Fake news gained significant attention during the 2016 United States presidential election. To eliminate the harmful impact of fake news, it is necessary to design a plan or system to stop such misinformation on online networks. In our research, we propose a systematic identification of fake news using machine learning techniques. We obtain fake news data from Kaggle and real news data from the websites of popular news agencies. We implement and compare the results of six different machine learning algorithms and two different feature extraction techniques. We extract sentiment features from the dataset and examine the correlation among the sentiments of each news item. Our results show that the support vector machine classifier performs best, with an accuracy of 93% and an F1 score of 94%. The results also show that TF-IDF is a good technique for feature extraction from text.

Study on temperature rise and weather changes in Lahore, Pakistan. (UMT, Lahore, 2018) Fakhar Abbas Mehar
Recent studies and research tell us that the temperature of the earth is increasing day by day. The temperature of the planet was lower in the 1980s than it is today.
In this study I try to learn temperature patterns through statistical analysis and to identify the most important variables that directly influence the temperature in Lahore, Pakistan. I obtained weather data for the last 30 years. For this investigation, monthly mean minimum and maximum temperatures have been examined. Rain, humidity, cloud amount, atmospheric pressure, vapour pressure and wind speed have also been studied.

RESUME PARSER AUTOMATION USING NATURAL LANGUAGE PROCESSING (UMT, Lahore, 2018) Muneeb Ul Haque
Recruitment is a tough and tiring procedure for the HR team of any company. Sometimes the whole HR team has to devote its time and effort to filling only one vacancy. This makes the task difficult and mistakes inevitable. In such a situation, a resume parser can be a blessing for the company, as it saves not only time but also money, because fewer resources are required for the recruitment process. The resume is parsed by separating every section using multiple techniques, notably NLTK, PDFMiner and other NLP libraries. The process is automated: the computer divides the features of a human-written resume into subsections, analyses the job description requirements, and shortlists the most appropriate candidates within a few minutes. Hence, this resume parser turns the overall recruitment process into a task of merely a few days.

Impact of suicide attacks on Karachi Stock Exchange (UMT, Lahore, 2019) Irum Noor
Suicide attacks in Pakistan have adversely affected its economic system as well as its overall reputation in the world. This thesis aims to identify the impact of the major suicide attacks that took place from 2010 to 2018 on the performance of the Karachi stock index.
A variety of quantitative methods have been used to identify this impact and the extent to which these attacks damaged the overall performance of the KSE.

Sequence-Based Identification of DNA Replication Proteins and DNA Replication Inhibitors using Statistical Moments and PseAAC (UMT, Lahore, 2019) Muhammad Arqam Amin
DNA is undoubtedly important for all living beings; a DNA molecule holds a great deal of hereditary information and can also indicate whether an individual is at risk for certain diseases. Double-helix DNA consists of two intertwined strands. These strands are separated during the copying process. Each strand of the original DNA molecule then functions as a template and generates its counterpart. This process is known as semi-conservative replication. Because of semi-conservative replication, each new helix is composed of one original DNA strand and one newly synthesized strand. Cellular proofreading and error-checking mechanisms ensure near-perfect fidelity of DNA replication. DNA replication can also be performed in the laboratory. DNA synthesis can be initiated from a known sequence of template DNA molecules using DNA polymerases isolated from cells and artificial DNA primers. Examples include the polymerase chain reaction, the ligase chain reaction and transcription-mediated amplification, but these can be very costly and time-consuming. Similarly, the identification of DNA replication proteins and DNA replication inhibitor proteins is crucial and requires a reliable and comprehensive computational method that can precisely predict and discriminate these proteins. This study aims to identify DNA replication proteins and DNA replication inhibitors. It follows Chou's 5-step rule and uses different techniques to obtain efficient prediction results with an artificial neural network algorithm. The study comprises the construction of a novel prediction model to serve the proposed purpose.
A prediction model was developed based on an artificial neural network by integrating position-relative features and sequence statistical moments into PseAAC for training the neural network. 10-fold cross-validation and the leave-one-out method were used for validation at different levels, such as overall accuracy, sensitivity, and specificity. The results suggest that the proposed strategy may play a fundamental role alongside other existing strategies for predicting DNA replication inhibitors and proteins. Hence the proposed prediction method can assist in predicting DNA replication proteins and inhibitors in a productive and accurate way. Our experimental results demonstrated that the proposed predictor surpasses the existing models and can serve as a time- and cost-effective strategy for identifying DNA replication proteins and inhibitors.

iTSP-PseAAC: Identify Tumor Suppressor Proteins by Using Fully Connected Neural Network and PseAAC (UMT, Lahore, 2019) Muhammad Awais
Tumor suppressor genes (TSGs), like normal genes, control cell-related functions from cell production to cell death; when they work properly, they control cell division, the repair of DNA mistakes and many other functions. There are a number of other tumor suppressor proteins that suppress a gene so that it is not encoded and does not produce cells. To act as a tumor suppressor, a gene undergoes transcription and translation and produces the relevant proteins, which bind to DNA and perform tumor suppression activities to control unwanted cell growth or other activities that are part of tumor production. This study aims to propose a new and more accurate tumor suppressor protein predictor and to make it easy to use, user-friendly and publicly available to experimental biologists.
The predictor model uses an input feature vector (IFV) calculated from the physicochemical properties of proteins and is based on a fully connected neural network (FCNN); accuracy, sensitivity, specificity, and MCC were computed. The proposed model was validated with several exhaustive validation techniques, i.e. self-consistency and cross-validation. With self-consistency testing the accuracy is 99%; cross-validation and independent testing give 99.80% and 100% accuracy respectively. The overall accuracy of the proposed model is 99%, with sensitivity of 98%, specificity of 99% and an F1-score of 0.99. The study concludes that the proposed model can predict tumor suppressor proteins efficiently, but it still has room for computational improvement as the number of protein sequences increases rapidly day by day.

Spatial Data Analysis of Vehicle Accidents in Victoria Australia from 2006 to 2016 using ANOVA and T- Statistics (UMT, Lahore, 2019) Zain Asif
Accidents and their analysis have always been an area of interest, and are of prime interest not only to passengers but also to vehicle manufacturers and the government. The analysis of accidents helps expose the relationships between the different attributes involved in causing them. Accidents can involve airplanes, ships, road vehicles, etc., but in this thesis we consider road accidents for different vehicles and focus on cars. By analysing the various types of accidents in the given dataset, we can gather information, help find the attributes that can cause accidents, and determine how we can use these attributes to decrease the number of accidents. In our study we use statistical data analysis on spatial data for the State of Victoria, Australia.
Laws have been made over the past decades to reduce vehicle speed limits, and the focus of this research is to find out which vehicles are most prone to accidents and which driving rules and policies the government has made to control them. Accidents are something everyone tries to avoid, and we consider what policies and safety measures can be taken to prevent them. Our research extends work on data exploration [4] by finding the vehicles most prone to accidents, then identifying the government policies [3] that were applied to those vehicles and whether accidents decreased as a result of those policies.

Leveraging Data Analytics to Maximize Business Outcomes; a Comparison across ICT SMEs in Pakistan (UMT, Lahore, 2019) Rukham Bashir
Data analytics and its contribution to improving business performance are generating hype in discussion, research and practice. With the emergence of IT-enabled services, consumer-generated data on online platforms is increasing day by day, bringing challenges of measurement and analysis. This study identifies that measuring marketing performance to track business results has remained a challenge for marketers who need to demonstrate campaign performance and scale business gains, especially in the resource-constrained SME sector. The study aims to explore the contribution of digital marketing efforts, combined with a data analytics approach, to maximizing business outcomes, in terms of measurement and its impact on those outcomes. A qualitative approach is taken to analyze data from 20 respondent companies in different ICT services. The study concludes that a measurement process with a refined approach to meeting targets can lead to standardized metrics that reduce market tensions. The study suggests that the components of a data-driven approach, i.e. integration of data sources, appropriateness of measurement techniques, selection of metrics and analysis of impact, can influence overall business performance. The analysis of marketing efforts is therefore divided into four stages. The study identifies similar practices in measuring digital marketing efforts with data techniques and tools, and identifies a gap in data-driven mindset in light of the results obtained. It was found that ICT SMEs are growing with a data-driven mindset and planning for future improvements; however, e-realty in Pakistan lags behind other industries in knowledge of and approach to data-driven marketing, which is limited to setting acquisition targets rather than creating personalized marketing for long-term retention. The future directions and implications of this study are discussed at the end; the most important is testing the model with another research method and on a larger population to refine the results and improve the validity of the findings.

Selecting A Better Classifier Using Machine Learning For COVID-19 (UMT, Lahore, 2019) MUHAMMAD IMRAN
Nowadays the world is confronting a severe problem known as coronavirus, officially named COVID-19. No clinically approved vaccines or medicines are available for this infection. Antibiotics are given to relieve affected patients because no proper vaccine has been discovered. COVID-19 resembles previous infectious diseases such as Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS). The world needs rapid precautionary measures to handle this outbreak. The Chinese city of Wuhan is the hub of this infection. To assess outcomes and forecast the future course of COVID-19, we analyze COVID-19 records and datasets with machine learning algorithms. For this purpose, we used various algorithms to construct classifiers: Support Vector Machine (SVM), Decision Tree, K-Nearest Neighbor (K-NN), Naïve Bayes and Random Forest.
These algorithms are implemented in Python and applied directly to the datasets. In our research we discuss two types of classification: binary and multinomial. Support Vector Machine and Decision Tree gave precise results; the other classifier models gave satisfactory outcomes. The outcomes may help to predict the future circumstances of COVID-19.

Modeling Influence of Other Countries on Pakistan (UMT, Lahore, 2020) Qumer Mumtaz

DEVELOP A BAYESIAN FRAMEWORK FOR PREDICTING LIKELIHOOD OF HIT MOVIES. (UMT, Lahore, 2020) MUHAMMAD USMAN MANZOOR
This study proposes a framework for predicting the likelihood of producing a hit film by means of probabilistic inference. We argue that the properties of Bayesian networks make them well suited to the problem at hand. We implemented a Bayesian network model to build a star recommendation system for a domain that is very uncertain in nature. The Bayesian network is custom-made for the problem. We examine the process through which stars affect the chances of winning an award at film fairs, both individually and as a group. We applied the Bayesian network model to data sets of Lollywood movies and Lux Style Awards from 2002 to 2019; the data for the analysis consisted of all Urdu movies released between 2002 and 2019. The author prepared the data sets from three different sources, IMDB, PAKDB and PAKMAG, and then verified all the data. The initial data set contained a total of 239 movies. The training data set consisted of 214 movies and 619 stars, and the test data set consisted of 25 movies that were part of the Lux Style Awards 2019. The authors validated the model by applying it to all 25 movies in the test data set to predict whether or not they obtained an award.
The authors also examine the process through which stars affect the chances of winning a Lux Style Award, that is, whether they influence a movie at least to be selected as a nominee or, in the best case, to win an award. They find that star power influences the success of a film. The chemistry between stars also plays a key role on screen and is a significant factor in the success or failure of a film. The authors also generated a co-star network for all the stars and reported degree centrality and closeness centrality.

An Implementable System for Detection and Identification of License Plates in Pakistan (UMT, Lahore, 2020) Muhammad Bilal Nayyar
Automatic number plate recognition (ANPR) is a large-scale monitoring system that photographs vehicles and recognizes their license numbers. ANPR can help detect stolen vehicles, which can then be traced effectively. This research provides a way to use an ANPR system on highways. A rear-view image of each vehicle is captured and processed by an algorithm. The license plate area is located using a new license plate detection function that combines multiple algorithms. The vehicle plate image is captured by cameras and processed to extract the license plate information. The system is intended not only to reduce manual effort but also to support human operators, given the power and potential uses of automatic license plate recognition. The identification system will bring greater efficiency to vehicle monitoring; number plate identification systems are used commercially, both abroad and locally. The system is implemented using the Python Image Processing Toolbox, which uses optical character recognition to read vehicle license plates. The data was collected from Safe City and by the author locally, and is presented in image form.
A corresponding model is developed for the detection and recognition of license plates and attains a recognition accuracy of at least 95 percent. Significant computing power is required for license plate recognition to achieve a satisfactory level of recognition with a neural network. This research is a step towards Pakistan's smart city plan. In today's world, where basic electronics find their place in areas such as home automation, automotive automation, automatic water storage systems and so on, it will take us a little further along the smart city plan.

Machine Generated Deep Image Captioning with Style (UMT, Lahore, 2020) Muhammad Aftab
A powerful tool for perceiving the physical world is sight. The study of computer vision aims to provide sight to artificial agents, enabling them to understand complex visual scenes. As a core topic in artificial intelligence and machine learning it has been the focus of extensive research, but is far from solved, with humans still outperforming artificial vision systems in most tasks. Communication between humans is primarily through language. Designing an agent that can communicate via language is an important goal for human-agent interaction and for building agents that can learn from the vast repositories of human knowledge. With these aims, natural language processing is a core topic in artificial intelligence and machine learning. Like computer vision, natural language processing has been the focus of extensive research, but remains an open problem. This thesis seeks to connect two core topics in machine intelligence: vision and language. Although several topics exist at this intersection, this research focuses on automatic image captioning: generating natural language descriptions of image content. Automatic captioning involves both the image understanding problem from computer vision and the natural language generation problem from natural language processing.
To improve communication, the researcher endeavours to add an extra layer to automatic captioning in the form of linguistic style. Stylistic variations in language have a range of useful applications, such as reaching a broad audience, reducing misinformation, and engaging viewers. With these applications in mind, the research develops and evaluates novel methods capable of generating stylised captions for natural images. Previous research into image caption generation has focused on generating purely descriptive captions; in this research the focus is on generating visually relevant captions with a distinct linguistic style. Captions with style have the potential to ease communication and add a new layer of personalisation. First, the researcher considers naming variations in image captions, and proposes a method for predicting context-dependent names that takes into account visual and linguistic information. This method makes use of a large-scale image caption dataset, which the researcher also uses to explore naming conventions and to report naming conventions for hundreds of people. Next, the researcher proposes the SentiCap model, which relies on recent advances in artificial neural networks to generate visually relevant image captions with positive or negative sentiment. To balance descriptiveness and sentiment, the SentiCap model dynamically switches between two recurrent neural networks, one tuned for descriptive words and one for sentiment words. As the first published model for generating captions with sentiment, SentiCap has influenced a number of subsequent works. The researcher then investigates the sub-task of modelling styled sentences without images. The specific task chosen is sentence simplification: rewriting news article sentences to make them easier to understand. For this task the researcher designs a neural sequence-to-sequence model that can work with limited training data, using novel adaptations for word copying and sharing word embeddings.
Finally, the researcher presents SemStyle, a system for generating visually relevant image captions in the style of an arbitrary text corpus. A shared term space allows a neural network for vision and content planning to communicate with a network for styled language generation. SemStyle achieves competitive results in human and automatic evaluations of descriptiveness and style. As a whole, this thesis presents two complete systems for styled caption generation that are the first of their kind and demonstrate, for the first time, that automatic style transfer for image captions is achievable. Contributions also include novel ideas for object naming and sentence simplification. This thesis opens up inquiries into highly personalised image captions; large-scale visually grounded concept naming; and, more generally, styled text generation with content control.

Analytical Modeling for Predicting Winning team combinations for Pakistan Super League (PSL) (UMT, Lahore, 2020) Mehak Fatima
T20 cricket is the most popular and exciting form of the game. Since its creation, PSL has been very successful and has created a billion-dollar industry. This is of interest to researchers in various disciplines such as data science, economics and finance. Various statistical techniques have been used in sports, affecting not only the audience but also the athletes. Predictive models for player selection have been created using various data mining techniques. However, no substantially accurate work has been published so far. Furthermore, the T20 league of Pakistan (PSL) has not yet been studied on the basis of individual player profiles and winning team combinations. The present study was conducted to address this issue. The research develops a model that can help franchise owners bid for talented players and build a winning team with minimum spending. The framework comprises three main aspects, i.e. data collection, data processing and player statistics calculation, and probability calculation. The data was collected from ESPNcricinfo and subjected to various statistical analyses. Based on these analyses, the probabilistic model was developed. The model achieved 90% accuracy when validated against the actual winner and runner-up teams of PSL 2019. On the basis of these results, it is concluded that the proposed model can be a beneficial tool for PSL squad selection and bidding. The model supports the process of creating teams and selecting participants in PSL. Since this study specifically targets PSL, which has not previously been studied in this way, it helps team managers select and create winning team combinations. The results of this study will benefit the cricket, T20 and PSL domains, and will open up new directions for research on cricket prediction.

Changing of objects into words using image captioning (UMT, Lahore, 2020) Muhammad Umair Tariq Chohan
Image captioning models usually follow an encoder-decoder design that uses images and feature vectors as input to the encoder. Some algorithms use feature vectors extracted from region proposals obtained from an object detector. This study uses the Object Relation Transformer, which extends this approach by explicitly incorporating information about the spatial relationships between the detected input objects through geometric attention. The results obtained by qualitative and quantitative approaches show the significance of such geometric attention for image captioning, leading to improvements on all common captioning metrics on the MS-COCO dataset.
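To make the idea of spatial relationships between detected objects more concrete, the sketch below computes one common encoding of the relative geometry of two bounding boxes: log-scaled centre offsets and log size ratios. This is an assumption made for illustration only; the abstract does not state which geometric features the thesis uses, and the function name, box format and `eps` parameter are hypothetical.

```python
import math


def relative_geometry(box_m, box_n, eps=1e-3):
    """Return a 4-value displacement vector from box_m to box_n.

    Each box is (center_x, center_y, width, height) in pixels.
    Offsets are scaled by the size of box_m and log-transformed, so the
    features do not depend on the absolute image scale.
    """
    xm, ym, wm, hm = box_m
    xn, yn, wn, hn = box_n
    return (
        # eps keeps the logarithm finite when the centres coincide.
        math.log(max(abs(xm - xn), eps) / wm),
        math.log(max(abs(ym - yn), eps) / hm),
        math.log(wn / wm),
        math.log(hn / hm),
    )


# Hypothetical detections: a small box and a larger box below and to its right.
features = relative_geometry((0.0, 0.0, 2.0, 2.0), (2.0, 4.0, 4.0, 8.0))
print(features)
```

In a geometry-aware attention layer, a vector like this would typically be embedded and combined with the appearance-based attention weight for each pair of objects, so that objects that are close together or similar in size can attend to each other more strongly.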