Thoughtful Machine Learning in Python


Thoughtful Machine Learning is a book designed for software engineers. Since my training also targets the software-to-data-science route, you won’t be surprised to hear that I wholeheartedly agree with the advice and direction of the book.

The word “thoughtful” in the title refers to the goal of producing deterministic behaviour (a particularly difficult problem in data science) through the use of unit tests. That testing is there from the outset is testament to the software-driven approach. I’m not sure that “thoughtful” is quite the right word, or that it describes the content accurately, but I appreciate the sentiment.
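To make that concrete, here is a minimal sketch of my own (not taken from the book) of what a determinism-checking unit test might look like. The helper names `sample_training_set` and `nearest_neighbour`, the toy data and the seed value are all hypothetical, chosen only for illustration. It uses nothing beyond the Python standard library.

```python
import random
import unittest


def sample_training_set(points, labels, fraction, seed):
    """Pick a random subset of the data; the seed makes the pick repeatable."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    k = max(1, int(len(points) * fraction))
    indices = sorted(rng.sample(range(len(points)), k))
    return [points[i] for i in indices], [labels[i] for i in indices]


def nearest_neighbour(train_points, train_labels, query):
    """Return the label of the closest training point (squared Euclidean distance)."""
    def distance(point):
        return sum((a - b) ** 2 for a, b in zip(point, query))

    # min() returns the first minimum, so ties are broken the same way every run.
    best = min(range(len(train_points)), key=lambda i: distance(train_points[i]))
    return train_labels[best]


class DeterminismTest(unittest.TestCase):
    # Hypothetical toy data: two loose clusters labelled "a" and "b".
    POINTS = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (0.2, 0.1), (1.1, 0.9)]
    LABELS = ["a", "a", "b", "b", "a", "b"]

    def test_same_seed_gives_same_sample(self):
        first = sample_training_set(self.POINTS, self.LABELS, 0.5, seed=42)
        second = sample_training_set(self.POINTS, self.LABELS, 0.5, seed=42)
        self.assertEqual(first, second)

    def test_prediction_is_repeatable(self):
        predictions = []
        for _ in range(2):
            points, labels = sample_training_set(self.POINTS, self.LABELS, 0.5, seed=42)
            predictions.append(nearest_neighbour(points, labels, (0.95, 1.0)))
        self.assertEqual(predictions[0], predictions[1])
```

Saved as a test module, this can be run with `python -m unittest`. The point is only that the randomness is passed in explicitly through a seed, so the same inputs always produce the same sample and the same prediction, and a test can check that.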

Despite this, I began to tire of the code listings. I agree that we (as readers and engineers) need reference code to learn from and use. But I don’t think that we should be printing code on dead trees. Out-of-date libraries, bugs and changes of heart all require that code be easily updated, which is hard to do when it is fixed in print.

Furthermore, the code tends to obfuscate the ideas and best practices that are so important in data science. If data science is to become a true discipline of engineering, which I believe is the goal of this book, then we need to codify the best practices and the abstractions, not the specific implementations.

However, I really did enjoy the book, mainly because I agreed with the core tenets of production-quality data science. It’s always enjoyable to read something that aligns with your core principles.

The contents lead the reader through a range of different models. The book takes the non-classical approach of discussing nearest neighbour and then Bayesian classifiers first, then moves on to tree-based methods and, quite surprisingly, Markov models. I can’t remember seeing classification models introduced in this order before. And I’m not sure that I agree with the order, which seems to indicate that these complex models (by which I mean complexity of the model, not the algorithm) are somehow preferred.

It ends by introducing support vector machines and a brief section on neural networks.

It doesn’t give the preprocessing of data a chapter of its own, although the topic is discussed throughout the book. Given its importance, I would have preferred a dedicated chapter.

But most importantly, it doesn’t talk about regression, clustering or dimensionality reduction techniques at all. And, maybe surprisingly, I like this. If it had, it would have diluted the content even further. But still, if you are particularly interested in these topics then you will not find help here.

Overall, if you are an engineer who intends to take models into production, then this book will improve your chances of a good night’s sleep, safe in the knowledge that your models are robust.