Using Boosted Trees as Input in a Logistic Regression in R

Recently I encountered an interesting paper from the Facebook research team that outlines a method for using decision trees (specifically boosted trees) to transform data for use as input to a final logistic regression. I thought this was really cool and wanted to try to recreate the method... [Read More]

Generate College Football Team Quality Metrics

To build a prediction model for next season, I need to know how good each team was in the previous season(s). A decent and simple way to measure team quality is to use the Massey rankings, which basically just find the points above average that each team contributes... [Read More]

College Football Point Differential Charts

I have seen some great visualizations lately that show the play-by-play point differential for the NBA and the NFL, and I wanted to recreate that for College Football. You can view all the source code on GitHub. I also need to give a huge thanks to the r/CFBAnalysis subreddit for... [Read More]

Staying Current

Data Science is a fast-moving industry where new predictive algorithms, software for data analysis, and big data tools are constantly being released. For me, it’s tough to stay current on the latest and greatest in Data Science, so this is my attempt to help with that. At the Staying... [Read More]