Category: Data Science

  • Doing Data Science, Helping People Get Jobs @Indeed

    Having just marked my 1-year anniversary at Indeed, it occurs to me that I have not yet blogged about my not-so-new job as a data scientist helping people get jobs. In addition to ending a long (7+ month) drought in my blogging practice, I'm also hoping that in sharing a bit about my work at…

  • Notes from #PyData Seattle 2015

    I was among 900 attendees at the recent PyData Seattle 2015 conference, an event focused on the use of Python in data management, analysis and machine learning. Nearly all of the tutorials & talks I attended last weekend were very interesting and informative, and several were positively inspiring. I haven't been as excited to experiment with…

  • Python for Data Science: A Rapid On-Ramp Primer

    In my last post, I was waxing poetic about an IPython Notebook demonstration that was one of my highlights from Strata 2014: "this one got me the most excited about getting home (or back to work) to practice what I learned." Well, I got back to work and learned how to create an IPython Notebook. Specifically, I…

  • Hype, Hubs & Hadoop: Some Notes from Strata NY 2013 Keynotes

    I didn't physically attend Strata NY + Hadoop World this year, but I did watch the keynotes from the conference. O'Reilly Media kindly makes videos of the keynotes and slides of all talks available very soon after they are given. Among the recurring themes were haranguing against the hype of big data, the increasing utilization…

  • The Scientific Method: Cultivating Thoroughly Conscious Ignorance

    Stuart Firestein brilliantly captures the positive influence of ignorance as an often unacknowledged guiding principle in the fits and starts that typically characterize the progression of real science. His book, Ignorance: How It Drives Science, grew out of a course on Ignorance he teaches at Columbia University, where he chairs the Department of Biological Sciences…

  • An Excellent Primer on Data Science and Data-Analytic Thinking and Doing

    O'Reilly Media is my primary resource for all things Data Science, and the new O'Reilly book on Data Science for Business by Foster Provost and Tom Fawcett ranks near the top of my list of their relevant assets. The book is designed primarily to help businesspeople understand the fundamental principles of data science, highlighting the…