ECS171-Project/Final-project

ECS171 2019Fall Project

Comparing the Robustness of Machine Learning Approaches on Spam Filtering Problems

Project objective:

Apply several methods to spam message detection, comparing KNN, the Naive Bayes classifier, SVM, and neural network classifiers, and find the one that gives the best precision.
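Since precision is the comparison metric, a minimal sketch of how it can be computed may help. It treats "spam" as the positive class; the function name, the zero-division convention, and the labels and predictions are illustrative assumptions, not taken from the project code.

```python
def precision(y_true, y_pred, positive="spam"):
    """Fraction of messages predicted as `positive` that really are `positive`."""
    true_pos = sum(
        1 for t, p in zip(y_true, y_pred) if p == positive and t == positive
    )
    pred_pos = sum(1 for p in y_pred if p == positive)
    # Assumed convention: return 0.0 when nothing was predicted positive.
    return true_pos / pred_pos if pred_pos else 0.0


# Hypothetical labels and predictions, for illustration only.
y_true = ["ham", "spam", "ham", "spam", "ham"]
y_pred = ["ham", "spam", "spam", "spam", "ham"]
print(precision(y_true, y_pred))
```

Here three messages are predicted as spam and two of them really are spam, so the precision is 2/3. A high precision means few legitimate messages are wrongly flagged as spam.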

Data source:

The dataset for this project comes from Kaggle. link to Kaggle!

The SMS Spam Collection is a set of tagged SMS messages that have been collected for SMS spam research. It contains one set of 5,574 SMS messages in English, each tagged as either ham (legitimate) or spam.
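As a rough sketch of how the tagged messages could be read, the snippet below parses a CSV with one label column and one text column. The column names `v1` and `v2` are an assumption about the Kaggle export, and the two sample rows are made up for illustration; adjust both to match the file you actually download.

```python
import csv
import io


def read_messages(handle, label_col="v1", text_col="v2"):
    """Return a list of (label, text) pairs from an open CSV file handle."""
    reader = csv.DictReader(handle)
    return [(row[label_col], row[text_col]) for row in reader]


# Made-up sample standing in for the real file (assumed column names).
sample = io.StringIO(
    "v1,v2\n"
    "ham,See you at lunch\n"
    'spam,"WINNER! Claim your prize now, call today"\n'
)
messages = read_messages(sample)

# Count how many messages carry each tag.
counts = {}
for label, _ in messages:
    counts[label] = counts.get(label, 0) + 1
print(counts)
```

For the real dataset, pass an opened file in place of `sample`; the correct encoding depends on the file you downloaded.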

Data preprocessing

Data preprocessing

Project code:

KNN

Support Vector Machine

Random Forest

Naive Bayes

LSTM

Gated Recurrent Units: To run the gated recurrent unit code with the pre-trained word embedding layer, you must first download the pre-trained word vectors "glove.6B.zip" from https://nlp.stanford.edu/projects/glove/. Unzip the archive once it has downloaded, then put "glove.6B.300d.txt" in your working directory.
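To show what the GloVe file contains, here is a minimal sketch that reads it into a dictionary. Each line of the file is a word followed by its space-separated vector components. The helper names `load_glove` and `embed`, the optional `vocab` filter, and the example vocabulary are hypothetical and are not taken from the project notebooks.

```python
import os


def load_glove(lines, vocab=None):
    """Map each word to its vector; keep only words in `vocab` if it is given."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        if vocab is None or word in vocab:
            vectors[word] = [float(x) for x in parts[1:]]
    return vectors


def embed(tokens, vectors, dim):
    """Look up each token (lowercased); unknown words get a zero vector."""
    zero = [0.0] * dim
    return [vectors.get(tok.lower(), zero) for tok in tokens]


# Runs only if the file has been placed in the working directory.
path = "glove.6B.300d.txt"
if os.path.exists(path):
    with open(path, encoding="utf-8") as handle:
        # Hypothetical vocabulary; in practice, use the words in the dataset.
        glove = load_glove(handle, vocab={"free", "call", "prize"})
    print(len(glove), "word vectors loaded")
```

Filtering by vocabulary keeps memory use small, since the full file holds far more words than a short-message dataset is likely to use. Tokens are lowercased on the assumption that the 6B vectors are uncased.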

Model comparisons

Final Report

Report

Contributors 9