2022

  • Item
    NOVEL APPROACH TO ANALYZE EXCEL MACRO DATA
    (UMT, Lahore, 2025) MUHAMMAD SHAHARYAR ZAFAR
    Massive advances in storage architecture over the last decade have made it easy for users to store heterogeneous data on their systems, and extracting insights from that data has become correspondingly more complex. This study presents a novel approach to analyzing Excel macro data. GCSE results datasets are used as sample Excel macro data to evaluate the approach. A web application was designed and developed that offers visual analytics and prediction on these Excel macro-based GCSE results. Because the raw data source is in an Excel macro structure, data wrangling and ETL are applied to transform it into a structure that can be analyzed.
  • Item
    ONLINE DATA MONITORING FOR ASSET PREDICTIVE MAINTENANCE
    (UMT, Lahore, 2022) Farhan Ahmad
    A number of studies have examined transformer oil behaviour and the faults a transformer can develop because of gases mixed in the oil. In this research, our main aim is to identify and predict faults on the basis of the gases mixed in the transformer oil. The research is conducted on a dataset collected from real-time equipment installed in the field for the purpose of early fault identification. There is a gap in electrical engineering practice: no data-based prediction is in use, trending is done manually, and authorities act on expert opinion. SVM and KNN will be applied to obtain the predictions. These two models were selected because the dataset is balanced, so more accurate results can be obtained. Using this dataset and these models, we will identify which model gives higher accuracy and support our results with the literature.
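As a rough illustration of the KNN side of the comparison described above, the following is a minimal sketch of k-nearest-neighbour classification by majority vote. The abstract does not say which library, distance measure, features or value of k were used; the Euclidean distance, the default `k=3` and the function name `knn_predict` are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Distance is Euclidean (an assumption; the abstract does not specify one).
    """
    # Pair each training point with its label and sort by distance to the query.
    neighbours = sorted(
        zip(train_x, train_y), key=lambda pair: dist(pair[0], query)
    )[:k]
    # Count the labels among the k closest points and return the most frequent.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

In a setting like the one in the abstract, each training point would be a vector of gas measurements and each label a fault class; an SVM trained on the same split could then be compared on held-out accuracy.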
  • Item
    Se-Bert Emotion Based Sentiment Analysis
    (UMT, Lahore, 2022) Adil Rehman
    Nowadays, a great deal of data is produced owing to the popularity of social networking sites such as Twitter, Facebook and Instagram. Because microblogging uses brief and straightforward terms, millions of people share their opinions every day. I describe a technique for extracting sentiment from Twitter, a well-known microblogging platform where users can express their opinions on virtually everything. In this paper, I apply different models to two sentiment datasets: a Twitter sentiment dataset and an emotional sentiment dataset. First, I extracted the sentiments and text from the Twitter dataset and, after extraction and preprocessing, applied three models: RNN, BiLSTM and a pre-trained BERT model. Second, I compared RNN, BiLSTM and BERT on user sentiment to determine which gives the most accurate results. The first dataset has two categories, “positive” and “negative”; the second has six categories of feeling: “joy”, “sadness”, “love”, “fear”, “anger” and “surprise”. The pre-trained BERT (Bidirectional Encoder Representations from Transformers) model is mainly used to capture the relation between the previous and next sentence, but here I use it for emotional sentiment analysis, and I propose a model named Se-Bert that builds on the pre-trained BERT implementation. In the experiments, I found that differences in tweet sentiment had a significant impact on understanding user behavior. The experiments demonstrate that the Se-Bert model proposed in this paper achieves accuracies of 97.29% and 86.77% on the tweet sentiment and emotional sentiment datasets respectively, outperforming both RNN and BiLSTM.
  • Item
    Ur-SL CNN: Convolutional Neural Networks for Pakistan Sign Language
    (UMT, Lahore, 2022) Khushal Das
    Deaf people and people with hearing difficulties can communicate using sign language (SL), a visual language. To translate Urdu SL, this study proposes Ur-SL CNN, a deep learning model based on a convolutional neural network (CNN). Experiments were performed on two datasets containing 1,500 and 78,000 images. Although deep learning has been popular among researchers for some time, convolutional neural networks still perform best on images. The proposed study comprises four modules: data collection, pre-processing, classification, and prediction. To achieve high prediction accuracy, each sign image is scaled, converted to greyscale, and filtered for noise. Classification and prediction are performed using both deep learning and machine learning approaches. For the machine learning approach, I applied support vector machine, Gaussian Naive Bayes, random forest and k-nearest neighbors (KNN), obtaining 0.81, 0.75, 0.88, and 0.68 respectively on the Urdu sign language alphabets dataset. The proposed CNN-based system outperformed the state of the art, with an accuracy of 0.95.
  • Item
    Multi-Lingual Hate Speech Detection using Laser, Muse with deep learning
    (UMT, Lahore, 2022) Khalil ul Rehman
    According to the National Institute of Standards and Technology, hate speech is "any communication that disparages a person or a group based on any attribute such as race, ethnicity, gender, sexual orientation, nationality, religion, or other trait." Our study differs from previous efforts in that we conduct the experiment on a considerably broader variety of languages (11) and with more datasets (11). We conduct a total of four analyses using the following models: Logistic Regression (LR), mBERT, BERT, and CNN-GRU, with the help of LASER and MUSE embeddings. We conclude that in the low-resource setting, a machine learning model combining LASER embeddings with LR gives the best results, while in the high-resource setting, BERT-based models give the highest performance according to our observations. Among low-resource languages, Italian and Portuguese get the best results. Our research aims to use current hate speech resources to create models that detect hate speech using LASER, MUSE and deep learning models.
  • Item
    Training Optimization of Graph Convolution Neural Network for Document Classification
    (UMT, Lahore, 2022) Muhammad Wajeeh Ahmed Raza Khan
    Long-document classification is a long-standing problem that was addressed by the Graph Convolution Neural Network (GCN), which classifies data without any pre-embedding and can capture the global co-occurrence of words that was missing in traditional DNN models. The remaining problem is that the model takes a long time to train, and this is addressed by optimizing the activation function. The test results reveal that employing the quantitative measurements of LReLU’s modest negative gradient significantly outperforms LReLU and ReLU on problems including the classification of text.
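For readers unfamiliar with the activation functions named above, the sketch below shows the standard definitions of ReLU and Leaky ReLU (LReLU) and the gradient of the latter. It is not the thesis's optimized variant, which the abstract does not define; the slope value `0.01` is a commonly used default and an assumption here.

```python
def relu(x: float) -> float:
    """Standard ReLU: negative inputs are clamped to zero."""
    return x if x > 0 else 0.0


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Leaky ReLU: negative inputs are scaled by a small slope, not zeroed."""
    return x if x > 0 else negative_slope * x


def leaky_relu_grad(x: float, negative_slope: float = 0.01) -> float:
    """Derivative of Leaky ReLU; it stays non-zero for negative inputs."""
    return 1.0 if x > 0 else negative_slope
```

The small non-zero gradient on the negative side is what the abstract calls the "modest negative gradient": units that receive negative inputs still get a weight update, whereas with plain ReLU their gradient is zero.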
  • Item
    Prediction of Covid-19 using Machine Learning Framework-Logistic Regression.
    (UMT, Lahore, 2022) SAIRA TARIQ
    Many pandemics have afflicted mankind throughout history, each with its own characteristics. The most recent big outbreak occurred in the 1920s in San, Florida, and was the deadliest ever reported. A century later, under the name COVID-19, a fresh outbreak akin to the Spanish flu emerged. The WHO declared COVID-19 a pandemic in mid-December, and it has subsequently spread globally. The outbreak was one of the worst ever recorded, affecting individuals across the world. COVID-19 is now one of the most widely researched and popular study topics. COVID-19 is now considered the most common human sickness. Since the World Health Organization declared COVID-19 a pandemic in April, many articles on the virus have been released. These publications cover a wide range of studies and investigations related to COVID-19 and are available online. However, little research has been done on using machine learning to predict and examine COVID-19. This research sought to better understand COVID-19 by analyzing machine learning's function, performance, and usefulness in predicting different symptoms and concerns stated by patients. Using logistic regression, the researchers found that machine learning could accurately predict a wide variety of symptoms and concerns. Prior to analyzing the material, it was necessary to categorize it. The research included 137 people, 78% of whom were men and 22% women, so men dominated the sample. The study found that 25% of respondents were married, more than the national average. When asked about their symptoms during COVID-19, 24% had a fever and 20% had a dry cough. These two symptoms are the most common in COVID-19 patients. Shortness of breath and bodily pains were less prevalent, with just 15% reporting one or both. The survey found that just 3% of those questioned experienced a loss of smell, making it the least probable symptom of COVID-19. COVID-19 also causes sore throats; the study revealed that 13% of people had a sore throat. Hyperglycemia was identified as one of the most important patient concerns at the time. Contrary to expectations, liver diseases were not as frequent as previously thought. The most frequent disease among COVID-19 patients was hypertension, which affected 19% of those studied.
  • Item
    PROGNOSIS OF BREAST CANCER USING MACHINE LEARNING TECHNIQUES AND ANALYZING FOOD HABITS OF PAKISTANI WOMEN
    (UMT, Lahore, 2022) RABBIA IBRAR
    Breast cancer can occur in women as a result of poor eating habits. The present study examined dietary risk factors for breast cancer, their association with quality of life, and changes in eating habits. The research included data on 200 women with histologically confirmed invasive breast cancer. The data cover the different food types consumed by the patients. Several machine learning algorithms are used in this study: LR, SVM, CNN, Perceptron, GB, AdaBoost, DT, RF, and multi-layer perceptron. Each gave a different accuracy; the AdaBoost classifier had the highest, 87.5%, given the small size of our dataset.
  • Item
    An E-Commerce Based Loan Prediction Through User Profiling
    (UMT, Lahore, 2022) Ammara Ihsan
    Online shopping has become popular and convenient in recent years, and online business is developing rapidly in the retail industry. Because users must pay before trying a product, whether or not it meets their satisfaction criteria, customer churn can result. Therefore, some eCommerce stores offer a “try before you buy and pay over time” facility (n.d.) to reduce the churn rate. In such cases, verifying the credibility of the customer is crucial, particularly when no personal information, credit history, zip code, salary or bank details are available, and when there is no label information about whether the customer can pay back. Hence, the retail industry is looking for ways to automate the process and predict customer credibility more efficiently. Credit scoring has proven to be an effective technique for eCommerce companies to identify prospective churn customers and default debtors. The purpose of this research is to combine unsupervised and supervised techniques with analytics to obtain the most accurate results possible. In this thesis, we introduce two risk-scoring ensemble prediction models that combine different algorithms to analyze various hypotheses and form a new hypothesis for credit assessment. First, the model predicts customers' retention scores using the TabNet classification model and then uses these probability scores to predict customers' credit scores through user profiling. Customers with a low predicted probability are likely to be dissatisfied and to have low credibility. To predict user credibility, we use the unsupervised GraphSAGE DBSCAN embedding model, map the resulting embeddings into a graph network, and find demographic-based similarities between users to segment them. Six popular evaluation metrics (accuracy, area under the ROC curve (ROC-AUC), F1 score, precision, recall and the KS statistic) are used to evaluate the churn prediction model, which achieves 96% accuracy on the test set. The Silhouette Coefficient, Calinski-Harabasz Index, and Davies-Bouldin Index are used to assess the proposed unsupervised clustering approach, and the results can be reviewed by human analysts. This research examines consumer purchasing, churning, and credibility patterns using graph-based embedding techniques. The study then analyses the trends behind the factors that contributed to the decline in consumer validation in the retail industry by comparing these to the different eCommerce datasets in place.
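The six evaluation metrics named above have standard definitions for a binary classifier, sketched below in plain Python. This is an illustrative implementation, not the thesis's evaluation code; the function names are hypothetical, labels are assumed to be 0/1 with 1 as the positive class, and the KS statistic is taken to be the maximum gap between the two classes' cumulative score distributions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def _split_scores(y_true, scores):
    """Separate predicted scores by true class."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    return pos, neg


def roc_auc(y_true, scores):
    """Chance that a random positive outranks a random negative (ties = 0.5)."""
    pos, neg = _split_scores(y_true, scores)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true, scores):
    """Largest gap between the cumulative score distributions of the classes."""
    pos, neg = _split_scores(y_true, scores)
    best = 0.0
    for threshold in sorted(set(scores)):
        cdf_pos = sum(1 for s in pos if s <= threshold) / len(pos)
        cdf_neg = sum(1 for s in neg if s <= threshold) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best
```

Accuracy, precision, recall and F1 depend on thresholded predictions, while ROC-AUC and KS are computed from the raw predicted probabilities, so a model can score well on one group and poorly on the other.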
  • Item
    GENERAL ELECTION FORECASTING MODEL FOR PAKISTAN: LEVERAGING MACHINE LEARNING FOR POLITICS
    (UMT, Lahore, 2022) Ali Ehtsham
    The purpose of this work was to test how far current machine learning models can be applied to election data in Pakistan. It evaluates the forecasting approaches in practice globally and in Pakistan, and analyzes the models and parameters used to forecast elections globally. Aggregation models were the most effective in forecasting elections, while models based on sentiment analysis performed below average. The lack of effectiveness of sentiment analysis is due to the use of complete tweet data instead of tweets targeted to the geographical constituency area. The dataset used for the research covers the Pakistan General Elections of 2002, 2008, and 2013, which were selected because of their uniform constituency delimitations. The data was cross-verified with the Gazette of Pakistan. This work proposes a methodology for predicting elections in Pakistan: a forecasting model that predicts the winner or loser at the constituency level from past election data. It uses classification to label each candidate as the winner or a loser in a particular constituency. The supervised machine learning algorithms used for classification are Logistic Regression and Support Vector Machine. Multiple experiments were conducted with different parameters and different manipulations of the input data. The first experiment used a Logistic Regression model with 25,000 iterations and achieved 99.82 percent accuracy. In the second experiment, the iterations were reduced to 15,000; the accuracy was again 99.82 percent, unchanged from the first experiment. In the third experiment, the input values of the independent variables were scaled for the Logistic Regression model, with iterations kept at 15,000; the accuracy decreased to 91.01 percent. In the fourth experiment, the scaled features were passed through a pipeline function as input to the Logistic Regression model, and the accuracy increased to 98.63 percent. The fifth experiment used the training data as input to a Support Vector Machine with a linear kernel; the accuracy was 1.0. The sixth experiment used a Support Vector Machine with a radial basis function kernel; the accuracy fell to 0.91, from 1.0 with the linear kernel. The dataset comprised all constituencies of Pakistan for all three General Elections; therefore, the proposed model is generalizable to the whole country. The proposed model conveys the evolution of voter intentions, as the training data covered the whole of Pakistan. The experiments in this work validate the proof of concept. This methodology can be extended to all elections to create a complete dataset and election model.
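The third and fourth experiments above involve scaling the input features, in one case through a pipeline. The abstract does not say which scaler or library was used, so the following is only a sketch of one common choice, standardization, in which the scaling parameters are learned from the training rows and then reused unchanged on new rows, which is the behaviour a pipeline typically enforces. The function names are hypothetical.

```python
from statistics import mean, pstdev


def fit_standardizer(rows):
    """Learn per-column mean and population standard deviation from rows."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid
    # dividing by zero when scaling.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_standardizer(rows, means, stds):
    """Scale rows using statistics learned earlier from the training split."""
    return [[(value - m) / s for value, m, s in zip(row, means, stds)]
            for row in rows]
```

A typical use would call `fit_standardizer` on the training split only and then pass both the training and the test rows through `apply_standardizer` before fitting or evaluating the classifier.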