Recent Submissions
Item SENTIMENT ANALYSIS OF ROMAN URDU REVIEWS OF PSL ANTHEMS (UMT, Lahore, 2021) MUJAHID BASHIR
With easy access to and affordable availability of computers, tablets, smartphones, and high-speed internet, people now use the web for social interaction and business correspondence. People have become accustomed to posting reviews of the products and entities they use. These reviews are helpful to both users and sellers. Initially, reviews were few enough to be analyzed simply by reading them. As their volume continues to grow, they need to be analyzed, and useful patterns discovered, through automated means. This need gave rise to a research field known as “Sentiment Analysis”. Sentiment Analysis is the study of people’s opinions, sentiments, attitudes, and emotions expressed in written language; it can also be described as the process of categorizing the opinions expressed in a piece of text, especially to determine whether the writer’s attitude towards a particular topic or product is positive, negative, or neutral. The PSL anthems are released every year before the start of the league. No prior work on PSL sentiment analysis was found that examines how listeners respond to the PSL anthems. This research targets the sentiment analysis of reviews of PSL anthems and proposes a model for analyzing Roman Urdu reviews. In this thesis, five different machine learning algorithms are used for text classification of the reviews with the RapidMiner tool. The thesis presents a sentiment analysis of Roman Urdu reviews of PSL anthems available on YouTube. The reviews are scraped, pre-processed, and analyzed using Naïve Bayes, Gradient Boosted Trees, Support Vector Machine, K-Nearest Neighbors, and Artificial Neural Network (ANN). The Roman Urdu sentiment analysis is performed on 7000 bilingual, manually annotated reviews.
Naïve Bayes and Logistic Regression correctly predicted 68.86% of the reviews. ANN achieved 68.86% on the testing dataset and 69.71% on validation.

Item PREDICTING STUDENT FUTURE INTERACTION USING SELF ATTENTION MECHANISM WITH RANDOMIZATION (UMT, Lahore, 2021) MUHAMMAD USMAN
In the modern era, finding quality education and testing students’ ability against defined standards has become a primary challenge. Several methods have been designed to address this problem; some are manual and some use technology. The more recent, technology-based methods are more useful and reliable. During the pandemic, technology-based education has served better than older systems. Among recent methods, many researchers have proposed machine-based methods for predicting student learning and future results. In this thesis, an education model is proposed to predict a student’s future results as well as the appropriate domain or field selection for the student. A KCT (Knowledge Component Theory) based model is proposed, with satisfactory results. The main goal is to introduce a tracking model based on Deep Knowledge. The EDNet dataset is used to test the proposed model. The research proceeds in phases: data collection, feature engineering, and model implementation. The main contribution is to accurately predict how students will perform in future interactions.

Item Forecasting Air Quality and the impact of Meteorological factors on Air Quality in multiple cities of Pakistan (UMT, Lahore, 2021) KHAWAJA HASSAN WASEEM
In recent years, due to the rapid increase in urbanization and industrialization, overall air quality has taken a turn for the worse. Clean air is necessary for leading a healthy life. Many respiratory illnesses have their root in poor air quality across the region.
Because air quality has such a large impact on people's lives, it is essential to have a mechanism for forecasting future values of air quality or its pollutants (PM 2.5, NOx, COx, SOx, etc.). However, forecasting air quality and its pollutants is complicated, because air quality depends on several factors such as weather and vehicular and power plant emissions. There has been little to no research in this domain in the context of Pakistan. This thesis aims to determine the impact of weather on PM 2.5 concentration and to forecast the daily and hourly PM 2.5 concentration for the next 30 days and 72 hours, respectively, for Lahore, Islamabad, and Karachi in Pakistan. Forecasting is done with state-of-the-art time series models, namely SARIMAX, FbProphet, LSTM, and LSTM Encoder-Decoder; these models are also compared against each other on MAPE and processing speed. Through this research, we determined that weather conditions such as temperature, precipitation, UV index, humidity, pressure, wind speed, visibility, dew point, and cloud cover had a negative correlation with PM 2.5 concentration. The thesis also successfully forecast the proposed daily and hourly PM 2.5 concentrations. In the model comparison, LSTM Encoder-Decoder was found to be the most accurate, with MAPE values of 28.2, 15.07, and 42.1 for daily forecasting and 11.75, 9.5, and 7.4 for hourly forecasting for Lahore, Islamabad, and Karachi, respectively. In terms of speed, however, it was also among the slowest. This indicates that a data-driven approach is as essential as knowledge-based methods for addressing air pollution in Pakistan.

Item AI AND DEEPFAKE SYNTHETIC MEDIA (UMT, Lahore, 2021) Wasim Abbas
The last two to three years have seen rapid growth in DeepFake synthetic videos. The biggest challenge for the research community is the detection of DeepFake videos.
The aim of this research is to classify videos as real or fake, in a way that can robustly identify the face image in videos. A deep convolutional neural network (CNN) model is reframed with a Multitask-Cascading Neural Network (MTCNN) and trained on the face regions extracted from video frames. Each video has 300 frames of face images. A pre-trained model based on structural similarity is used for classification. In training, the model reaches an accuracy of 80% using 400 videos. A larger dataset is needed to overcome the model's overfitting and increase its accuracy. When sufficient classification accuracy is reached, smart picking methods can be implemented to handle DeepFake videos efficiently.

Item BREAST CANCER PROGNOSIS USING DATA ANALYTICS (UMT, Lahore, 2021) SHAHZAD ALI
Cancer is the second leading cause of death among women worldwide, according to the World Health Organization. Given what data science and machine learning models can do, it is considered good practice to build a model that helps with early detection of breast cancer. According to clinical research, cancer is a vast field of study and is best controlled if diagnosed and treated at an early stage. In this research, we studied cancerous tumors from the perspective of machine learning algorithms, using a Fine Needle Biopsy dataset. After obtaining the publicly available dataset, we performed feature selection and data cleaning. Feature selection was carried out to select those features of the dataset that may be associated with malignancy. After feature selection, we cleaned the data and removed the unwanted features from the dataset.
Keeping in mind the limitations of different machine learning models, we applied a linear regression model and used Principal Component Analysis to obtain the best accuracy and computational performance.

Item GOING DEEPER WITH RUMOR DETECTION (UMT, Lahore, 2021) Abdul Rahim
The most trending topic of 2020 was the presidential election of the United States of America; all eyes were on this election because it sets new rules and has deep impacts on the whole world. In this event, Twitter, one of the most popular apps, played an important role. Candidates used this social media platform to interact with their followers and to strengthen their election campaigns. Meanwhile, digital media was choked with false claims and rumors, which had a great impact on the sympathies and inclinations of voters. A comprehensive analysis of rumors in tweets from across the world was undertaken, focusing on Donald Trump. We used a dataset containing tweets with the hashtag_donaldtrump keyword and matched this data against well-known fact-checking websites and articles using a deep learning technique. We proposed the BERT model, which is pretrained on a larger corpus; this is a form of transfer learning that is well suited to training on a small dataset and predicting on large data. To overcome the difficulty of labelling our dataset, we scraped around 900 rumor and non-rumor items from different fact-checking websites and around 450 tweets from the official Twitter account of Donald Trump, merged them into one file, and annotated the collected data manually. We trained our BERT model on this dataset and predicted labels for around one million tweets. The results answer several major questions about rumors: how rumors exert influence, how typical manipulation takes place, which countries are the main sources of rumors, which US state's followers spread rumors the most, and which Twitter application was used most for posting rumor tweets.
The insights of this research helped us understand how rumors were generated and how they affected mindsets in the recent US elections.