Data Science is a fast-moving industry where new predictive algorithms, data analysis software, and big data tools are constantly being released. I find it tough to stay current on the latest and greatest in Data Science, so this is my attempt to help with that. On the Staying Current page of this blog I’ll keep a running list of new papers and blog posts that I think belong in a Data Science Best Practices toolkit. For example, what is already up there covers the optimal way to create ensemble models, confidence intervals for Random Forest models, and Impact Coding, a creative way to model character variables with many levels (that one isn’t actually new; I just hadn’t seen it formalized until I read the paper). These aren’t data science silver bullets, but I think they are interesting techniques that I would like to incorporate into my skillset, and I hope you enjoy them as well.