We’ve been talking a lot about what it takes to build a sophisticated model and get the best results. We looked at the Titanic challenge on Kaggle, but no matter what we tried, the accuracy plateaued. So what can we do?
Whatever problem you’re dealing with, domain expertise gives you a head start. Taking one step back, to the pre-pre-processing phase: one way of getting a better, more intuitive understanding of your data is to visualize it. There are many ways to visualize data (see previous posts), and Google recently published a new one: Facets.
Facets: An Open Source Visualization Tool for Machine Learning Training Data
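Before reaching for a dedicated tool, it can help to see what a first look at the data amounts to. The following is a minimal sketch in plain Python of a per-column summary and a crude text histogram. It does not use Facets or its API. The rows are hypothetical toy values made up for illustration (not real Titanic records), and the helper names `summarize` and `text_histogram` are my own.

```python
from collections import Counter

# Hypothetical toy rows, made up for illustration (not real Titanic records).
rows = [
    {"age": 22.0, "sex": "male", "survived": 0},
    {"age": 38.0, "sex": "female", "survived": 1},
    {"age": None, "sex": "male", "survived": 0},
    {"age": 35.0, "sex": "female", "survived": 1},
    {"age": 54.0, "sex": "male", "survived": 0},
    {"age": 4.0, "sex": "female", "survived": 1},
]


def summarize(rows, column):
    """Return row count, missing count, and numeric stats or value counts."""
    values = [row[column] for row in rows]
    present = [v for v in values if v is not None]
    summary = {"count": len(values), "missing": len(values) - len(present)}
    if present and all(isinstance(v, (int, float)) for v in present):
        # Numeric column: report range and mean over the non-missing values.
        summary["min"] = min(present)
        summary["max"] = max(present)
        summary["mean"] = sum(present) / len(present)
    else:
        # Categorical column: report how often each value occurs.
        summary["values"] = dict(Counter(present))
    return summary


def text_histogram(values, bin_width):
    """Build one line per non-empty bin, with one '#' per value in that bin."""
    bins = Counter(
        int(v // bin_width) * bin_width for v in values if v is not None
    )
    return [
        f"{start:>3}-{start + bin_width - 1:<3} {'#' * bins[start]}"
        for start in sorted(bins)
    ]


if __name__ == "__main__":
    for column in ("age", "sex", "survived"):
        print(column, summarize(rows, column))
    for line in text_histogram([row["age"] for row in rows], bin_width=20):
        print(line)
```

Even on a handful of rows, a summary like this exposes missing values and skewed distributions, which is the sort of thing an interactive visualization makes visible at scale.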
Tech and Art
Visualization is powerful. It’s only loosely related, but I recently stumbled upon a beautiful website, R2D3. It combines statistics with interactive design and gives a nice visual introduction to Machine Learning. Another helpful place to look for additional insights is Andy Kirk’s Visualising Data website, a curated collection of resources.