News Classification w/ Multinomial Naive Bayes

In this notebook, we classify news articles into categories (such as tech, entertainment, and sports) using machine learning. The dataset, collected from BBC News, contains news articles along with their headlines and categories. The categories covered in this dataset are: Sports, Business, Politics, Tech, and Entertainment. We have used the Multinomial Naive Bayes model for the… Continue reading “News Classification w/ Multinomial Naive Bayes”
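As a rough illustration of the technique named in the title, here is a minimal pure-Python sketch of a Multinomial Naive Bayes text classifier with Laplace smoothing. It is not the notebook's code: the function names `train` and `predict`, the smoothing value `alpha`, and the four toy headlines are all assumptions made up for this example, not rows from the BBC News dataset.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per class and word occurrences per class.

    docs is a list of (tokens, label) pairs.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens, alpha=1.0):
    """Return the label with the highest log posterior for the tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior: share of training documents with this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen during training
            # Laplace-smoothed log likelihood of the word given the class.
            numerator = word_counts[label][tok] + alpha
            denominator = total_words + alpha * len(vocab)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training data (hypothetical headlines, illustration only).
toy_docs = [
    ("the team won the match".split(), "sport"),
    ("player scores in final match".split(), "sport"),
    ("new phone chip released".split(), "tech"),
    ("software update for phone".split(), "tech"),
]
model = train(toy_docs)
print(predict(model, "team wins match".split()))
print(predict(model, "phone software".split()))
```

In practice a library implementation operating on word-count vectors would normally replace this hand-rolled version; the sketch only shows the counting and scoring steps.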

EDA and NLP with Disaster Tweets

Twitter has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they’re observing in real time. Because of this, more agencies (e.g. disaster relief organizations and news agencies) are interested in programmatically monitoring Twitter. But it’s not always clear whether a person’s words are actually announcing a… Continue reading “EDA and NLP with Disaster Tweets”

EDA for Store Sales Data

Link to competition and data: https://www.kaggle.com/competitions/store-sales-time-series-forecasting/overview

In this competition, participants predict sales for the thousands of product families sold at Favorita stores located in Ecuador. The training data includes dates, store and product information, whether the item was being promoted, and the sales numbers. Additional files include supplementary information that may be… Continue reading “EDA for Store Sales Data”

Balanced Classification Rate: An Explanation

Balanced Classification Rate is used to evaluate the performance of a binary classifier, especially when the classes are imbalanced, as in anomaly detection or detecting the presence of a disease. We start our discussion from the confusion matrix. Here, TP is True Positive, FP is False Positive, FN is False Negative, and TN is True Negative. Balanced Classification Rate is… Continue reading “Balanced Classification Rate: An Explanation”
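The excerpt is cut off before the formula. A commonly used definition, also known as balanced accuracy, is the mean of sensitivity, TP/(TP+FN), and specificity, TN/(TN+FP). Assuming that is the definition the full post uses, a minimal sketch looks like this; the function name and the confusion-matrix counts are made up for illustration.

```python
def balanced_classification_rate(tp: int, fp: int, fn: int, tn: int) -> float:
    """Mean of sensitivity and specificity (assumed definition of BCR)."""
    # Sensitivity: share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced case: 10 positives, 90 negatives.
tp, fn, tn, fp = 5, 5, 90, 0
accuracy = (tp + tn) / (tp + fp + fn + tn)
bcr = balanced_classification_rate(tp, fp, fn, tn)
print(f"accuracy = {accuracy:.2f}, BCR = {bcr:.2f}")
```

With these counts, plain accuracy looks high because the negative class dominates, while the balanced rate is pulled down by the missed positives.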

Vector Auto Regression for Multivariate Time Series on Air Quality Data

A multivariate time series has more than one time-dependent variable, where each variable depends both on its own past values and on the other variables. Example: a dataset has perspiration percent, dew point, wind speed, cloud cover percentage, etc., along with temperature values for the past two years. So there are multiple variables to consider for optimal temperature prediction. For multivariate… Continue reading “Vector Auto Regression for Multivariate Time Series on Air Quality Data”

Artificial Intelligence vs. Machine Learning vs. Deep Learning

Artificial Intelligence, Machine Learning, and Deep Learning are in vogue in today’s commercial world for creating intelligent applications. Because the terms are often used interchangeably, many people have trouble differentiating between them. Accordingly, a popular Google search query is “are artificial intelligence and machine learning the same thing?”. Let’s see what some famous personalities in tech have to say about… Continue reading “Artificial Intelligence vs. Machine Learning vs. Deep Learning”

Precision and recall

Simple definition: Precision (positive predictive value) is the fraction of relevant instances among the retrieved instances. Precision = TP/(TP+FP), where TP = true positives (the number of items correctly labelled as belonging to the positive class) and FP = false positives (items incorrectly labelled as belonging to the class). Recall (sensitivity) is the fraction of the total number of… Continue reading “Precision and recall”
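A small sketch of the two metrics. Precision follows the formula in the excerpt; recall uses the standard definition TP/(TP+FN), where FN (false negatives) is not defined in the truncated excerpt and is introduced here. The function names and counts are made up for illustration.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP); 0.0 if nothing was predicted positive."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN); 0.0 if there are no actual positives."""
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0


# Hypothetical counts: 8 correct positives, 2 false alarms, 4 misses.
p = precision(tp=8, fp=2)
r = recall(tp=8, fn=4)
print(f"precision = {p:.3f}, recall = {r:.3f}")
```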

Neural Network ChatBot

GitHub link: https://github.com/rukshar69/Datascience-Projects/tree/master/ChatBot

In this article, we create a dense neural network and train it on a custom dataset to build a chatbot. The dataset includes several user patterns, their responses, and associated tags. There are 6 tags in our data, representing 6 types of user queries and their responses. They are: [‘age’, ‘goodbye’, ‘greeting’, … Continue reading “Neural Network ChatBot”

Collaborative Filtering Recommendation System on Movie Lens Dataset

GitHub link: https://github.com/rukshar69/Datascience-Projects/tree/master/RecommendationSystem

In this article, we use the Movie Lens dataset to build a recommendation system that uses collaborative filtering to recommend 10 movies to a user based on their previous ratings. For this experiment, we use movies.csv and ratings.csv to create our model. A glimpse at movies.csv: A glimpse inside ratings.csv: We combine… Continue reading “Collaborative Filtering Recommendation System on Movie Lens Dataset”
