Programming Collective Intelligence

Programming Collective Intelligence, 1st Ed. (2007)
Author: Toby Segaran, Published by O’Reilly Media Inc.

From the “long-overdue” dept., here’s my review of Programming Collective Intelligence, which I received for free through the O’Reilly user group distribution channel. Opinions are entirely my own, and probably wrong.

I have a weird bias when it comes to reading technical books. I’m a self-taught programmer, but I’ve taken coursework in the formal mathematical basis for computation, and I’ve always enjoyed seeing an elegant proof or clever algebraic formulation of a hairy coding problem.

Given that perspective, my impressions of Programming Collective Intelligence were mixed. Approached as a “cookbook,” with recipes for analysis applied to the data available from popular social networking sites and web services, it lets you see real results quickly, without having to understand the underlying mathematics.

Unfortunately, that also means you miss out on a lot of the potential “Eureka!” moments that can come from thinking a little more deeply about an algorithm. This is especially apparent in the last couple of chapters, where active, rich areas of CS research like support vector machines and genetic programming are covered in a few dozen pages each, which often leaves room to do little more than tease the reader with the power a technique offers.

I think there’s a wide gap between the “blog article” and “peer-reviewed journal” levels of formalism, and I admire Segaran’s effort to span that divide and bring some of the fruits of AI research into the pragmatic domain for us 9-to-5ers. On the other hand, I walked away less with any great new perspective on machine learning than with some cool examples of how the author had applied it.

Ideally, I think this book, or another like it, should cover fewer real-world services or problems, and apply more of its algorithms and techniques to each data set. Every page spent describing how to interface with a particular web service (or worse yet, scrape a single site’s HTML structure) screamed “planned obsolescence” to me — APIs change, and sites that are popular today may be dead and gone in a few years, but the analytical tools being discussed are far more timeless.

Overall, I’d give the book a B-, at least for my own uses. If the Chapter 12 and Appendix B content were moved into the main body of the text, and the random dating site scraping techniques dropped or themselves demoted to appendices, I would quickly raise it to a B+ or A-.

I would still recommend the book as-is to anyone who is more interested in getting results tomorrow than in gaining a deep understanding of the problem space, as well as to those with less background in Internet application programming, for whom the concrete code examples would have more value.