wfuu/ML

Coding Assignments and Practice

Based on "The Elements of Statistical Learning"

This repository contains implementations of various machine learning algorithms from the QMSS 4058 and STAT 5241 courses at Columbia. All R code is accompanied by explanatory narration.

EM for Image Segmentation: Implements the EM algorithm for a mixture of multinomials from scratch, applies it to an image segmentation task, and visualizes the result.
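For orientation, EM for a mixture of multinomials can be sketched in base R as below. This is a minimal illustration, not the repository's code: the function name `em_multinomial`, its arguments, the smoothing constants, and the simulated histograms are all assumptions made for the example.

```r
# Illustrative sketch only (hypothetical names; not the repository's implementation).
# H: n x d matrix of counts, one histogram per row. K: number of mixture components.
em_multinomial <- function(H, K, max_iter = 200, tol = 1e-6, seed = 1) {
  set.seed(seed)
  n <- nrow(H)
  # Initialise each component from a randomly chosen row, smoothed to avoid zeros.
  theta <- H[sample(n, K), , drop = FALSE] + 1
  theta <- theta / rowSums(theta)
  w <- rep(1 / K, K)
  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    # E-step: log(weight * multinomial likelihood), up to a per-row constant.
    log_p <- H %*% t(log(theta)) + matrix(log(w), n, K, byrow = TRUE)
    m <- apply(log_p, 1, max)
    lse <- m + log(rowSums(exp(log_p - m)))   # log-sum-exp for numerical stability
    resp <- exp(log_p - lse)                  # responsibilities; each row sums to 1
    # M-step: update mixture weights and component probabilities.
    w <- colMeans(resp)
    theta <- t(resp) %*% H + 1e-8             # small constant keeps log(theta) finite
    theta <- theta / rowSums(theta)
    loglik <- sum(lse)
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(weights = w, theta = theta, resp = resp,
       cluster = max.col(resp, ties.method = "first"), loglik = loglik)
}

# Usage on simulated histograms drawn from two different multinomials.
set.seed(42)
p1 <- c(0.7, 0.1, 0.1, 0.1)
p2 <- c(0.1, 0.1, 0.1, 0.7)
H <- rbind(t(rmultinom(50, size = 100, prob = p1)),
           t(rmultinom(50, size = 100, prob = p2)))
fit <- em_multinomial(H, K = 2)
table(fit$cluster, rep(1:2, each = 50))
```

For segmentation, each row of `H` would typically be a local histogram around a pixel or patch, and `fit$cluster` would be reshaped to the image grid for plotting.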

Optimization, PCR, Classification: The assignment tasks are to optimize a loss function in R, fit a principal components regression (PCR) model, and find the best model for classifying a binary outcome.
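A rough base-R sketch of the first two tasks follows. It assumes a squared-error loss and simulated data, neither of which comes from the assignment itself, and it builds PCR by hand from `prcomp()` and `lm()` rather than using a dedicated package.

```r
# Illustrative sketch only: simulated data and a squared-error loss are assumptions.
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 5), n, 5)
X[, 2] <- X[, 1] + rnorm(n, sd = 0.1)         # two nearly collinear predictors
y <- 1 + 2 * X[, 1] - X[, 3] + rnorm(n)

# 1. Minimise the loss numerically with optim(); beta[1] is the intercept.
sq_loss <- function(beta, X, y) sum((y - cbind(1, X) %*% beta)^2)
opt <- optim(rep(0, 6), sq_loss, X = X, y = y, method = "BFGS")
ols <- lm(y ~ X)                              # closed-form fit, for comparison
c(optim_loss = opt$value, ols_rss = sum(resid(ols)^2))

# 2. Principal components regression: regress y on the first k PC scores.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
k <- 3
Z <- pca$x[, 1:k]
pcr_fit <- lm(y ~ Z)
# Map component coefficients back to slopes on the original predictor scale.
beta_std <- pca$rotation[, 1:k] %*% coef(pcr_fit)[-1]
beta_orig <- as.vector(beta_std) / pca$scale
beta_orig
```

In practice `k` would be chosen by cross-validation; with all five components retained, the PCR fit coincides with ordinary least squares.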

Smoothing, Trees: The assignment tasks are to fit a generalized additive model (GAM), and to run and compare tree-based classification models (bagging, boosting, random forest).

Neural Nets, bartMachine: The assignment tasks are to fit neural network models while varying the number of hidden layers, and to compare them with other models (including bartMachine) on prediction accuracy, measured by mean squared error (MSE).
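The comparison step can be sketched as a held-out MSE calculation. To stay in base R, the example below swaps in two `lm()` fits for the neural network and bartMachine models; the simulated data, the 200/100 split, and the helper name `mse` are assumptions for illustration.

```r
# Illustrative sketch only: lm() fits stand in for the neural net and bartMachine models.
mse <- function(y, yhat) mean((y - yhat)^2)

set.seed(1)
n <- 300
dat <- data.frame(x = runif(n, -2, 2))
dat$y <- sin(2 * dat$x) + rnorm(n, sd = 0.3)

# Hold out 100 observations for testing.
train_idx <- sample(n, 200)
train <- dat[train_idx, ]
test <- dat[-train_idx, ]

models <- list(
  linear = lm(y ~ x, data = train),
  cubic  = lm(y ~ poly(x, 3), data = train)
)

# Test-set MSE for each model; lower is better.
test_mse <- sapply(models, function(m) mse(test$y, predict(m, newdata = test)))
test_mse
```

Any model with a `predict()` method that accepts `newdata` could be added to the `models` list and scored the same way.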

Predictive Classification (Final Project): Final project for QMSS 4058: Data Mining. The data set is from the Second International Knowledge Discovery and Data Mining Tools Competition (KDD Cup 98). The task is a classification problem whose goal is to estimate the response rate (donate vs. no donate) to a direct mailing program. Collaborator: Arnold Lau. Data and documentation: http://kdd.ics.uci.edu/databases/kddcup98/kddcup98.html

About

Statistical machine learning practice and explorations in R
