Becoming a Data Head - Alex J. Gutman

So What?

If you understand the restaurant example, you're well on your way to becoming a Data Head. Let's reveal what you learned, little by little:

You performed classification by predicting the label (chain or independent) on a new restaurant by training an algorithm using a set of data (restaurants’ location and their chain/independent label).

This is precisely machine learning! You just didn't build the algorithm on a computer—you used your head.

Specifically, this is a type of machine learning called supervised learning. It was “supervised” because you knew the existing restaurants were (C) chain or (I) independent. The labels directed (i.e., supervised) your thinking about how restaurant location is related to whether it's a chain or not.

Even more specifically, you ran a supervised learning classification algorithm called K-nearest-neighbor.⁵ If K = 1, look at the closest restaurant and use its label as your prediction. If K = 7, look at the 7 closest restaurants and predict the majority label. It's an intuitive and powerful algorithm. And it's not magic.
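To make the K-nearest-neighbor idea concrete, here is a minimal Python sketch. It is an illustration only: the coordinates, the `restaurants` list, and the `knn_predict` function are hypothetical and are not taken from the book's images. It assumes each restaurant is a point on a flat map and that "closest" means straight-line (Euclidean) distance.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+

def knn_predict(train, new_point, k):
    """Predict a label for new_point from its k nearest labeled points.

    train is a list of ((x, y), label) pairs. This helper is hypothetical,
    written only to illustrate the idea.
    """
    # Sort the labeled restaurants by distance to the new one, keep the k closest.
    neighbors = sorted(train, key=lambda item: dist(item[0], new_point))[:k]
    # Count the labels among those neighbors and return the most common one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Made-up map coordinates: "C" = chain, "I" = independent.
restaurants = [
    ((1.0, 1.0), "C"), ((1.5, 2.0), "C"), ((2.0, 1.0), "C"), ((2.5, 2.5), "C"),
    ((6.0, 6.0), "I"), ((6.5, 7.0), "I"), ((7.0, 6.0), "I"), ((7.5, 7.5), "I"),
    ((3.0, 3.0), "I"),
]
new_restaurant = (3.2, 2.8)

print(knn_predict(restaurants, new_restaurant, k=1))  # prints I
print(knn_predict(restaurants, new_restaurant, k=7))  # prints C
```

With this made-up data, K = 1 follows the single nearest restaurant (an independent), while K = 7 is outvoted by the surrounding chains, 4 to 3. Using an odd K with two labels avoids tied votes.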

You also learned you need data to make informed decisions. Realize, however, that you need more than that. After all, this book is about critical thinking. We want to show how stuff works but also how it fails. If we asked you to predict, given the data in this Introduction's images, whether the new restaurant would be kid-friendly, you wouldn't be able to answer, because nothing in that data describes kid-friendliness. To make informed decisions, not just any data will do. You need data that is accurate, relevant, and sufficient.

Remember the technobabble we wrote earlier? “… supervised learning analysis of the binary response variable …”? Congratulations, you just did a supervised learning analysis of a binary response variable. Response variable is another name for label, and it's binary because the label can take only two values, (C) and (I).
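As a small illustration of what "binary response variable" means in practice, the Python sketch below checks that a column of labels has exactly two distinct values and maps them to 0 and 1. The `labels` list is made up for the example and the 0/1 encoding is just one common convention, not something the book prescribes.

```python
# Hypothetical labels for five restaurants: "C" = chain, "I" = independent.
labels = ["C", "I", "I", "C", "C"]

# The distinct values the response variable takes.
classes = sorted(set(labels))        # ['C', 'I']

# "Binary" simply means there are exactly two possible values.
is_binary = len(classes) == 2        # True

# One common convention: encode the two classes as 0 and 1.
encoded = [classes.index(label) for label in labels]

print(classes, is_binary, encoded)   # ['C', 'I'] True [0, 1, 1, 0, 0]
```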

You learned a lot in this section, and you did it without even realizing it.
