Machine Learning 101 Course Evaluation

Sat 26 February 2011

I've just finished the five-week Machine Learning 101 class taught at Hacker Dojo.

I didn't get what I had hoped out of the course. I'm quick to acknowledge that I could have gotten more out of it by applying more effort, but I also think there were some unnecessary roadblocks in the course that stymied my (and many other students') attempts to dig into the material. My purpose in this review is to identify the things that I think worked well, identify the things that didn't work well, and offer some suggestions for improving the course in the future. I want to emphasize that this is only one person's opinion; there are surely students whose experience was better (or worse) than mine.

Things That Worked

  • The Instructors. Mike Bowles and Patricia Hoffman are excellent lecturers, and show genuine concern for helping their students to grasp the material.
  • The Homework Assignments. When I was able to complete them, the homework questions did a good job helping me to grasp the concepts they covered.
  • The Venue. Hacker Dojo is just a fun place to visit. Even when there were other events going on at the Dojo during our class time, we had no problems with disruptive noise or contention for space.

Things That Didn't Work So Well

  • Sign Up Process. Initially it appeared that to sign up for the course you needed to join the group, and that if you didn't get in under the cap of 70 students you'd be put on a waiting list. The page indicated that payment for the course should be made by check at the first class session. A few days before the start of the course, however, this was changed so that signup and payment happened on Eventbrite. While I imagine this delighted the wait-listed students who were able to register under the new system, it seemed like a trap for students who thought they were already safely registered. Suggestion: Get the signup and payment systems in place weeks before the start of the class.
  • Classroom Management. During the first three weeks of the course, the lectures were excessively interrupted by students asking repetitive questions or making pedantic arguments about basic concepts. Suggestion: Designate a chat channel on Freenode or Convore where students can ask and answer questions during class time without disrupting the lecture.
  • Prerequisites. The course was advertised as being appropriate for students who knew how to program in some language, but not specifically R. I assumed this meant that the instructors would teach the parts of R needed to apply the concepts taught in class and complete the homework assignments. This did not turn out to be the case. Instead of giving in-class instruction on R fundamentals, the instructors announced on the first day of class that they would hold a special one-off class on R fundamentals the following morning (Sunday). While I applaud the instructors for dedicating extra time to help with R instruction, doing it with a surprise session outside of class time meant that I (and I assume many other students) could not attend. It wasn't until several weeks into the course, and after pestering my R-using colleagues at work, that I was able to clearly understand the distinction between arrays, vectors, and data frames, for example. Suggestion: Spend less of the first class session on the 100-mile-per-hour overview of machine learning, and more on R fundamentals such as data types and indexing.
  • Four Hour Class Sessions. Unlike the twice-a-week Machine Learning 201 course, Machine Learning 101 met only once per week, for four hours on Saturday morning. I didn't mind the weekend morning start time (I think it did a good job filtering out the uncommitted), but by the end of four hours the attention spans of even the most diligent students had been exhausted. Suggestion: Split the class time into two sessions per week, of two hours each. More frequent meetings would also help keep the subject matter fresher in students' minds.
  • The Homework Assignments. The homework assignments were frequently insurmountable given the combination of sparsely commented code samples, unfamiliarity with R, unfamiliarity with the substantive topics we were supposed to be learning, and the absence of instructor-provided solutions. Actually completing the homework assignments required heroic effort, like that of the guy who spent five hours Googling and puzzling over how he was supposed to re-scale the coefficients after doing a ridge regression, which was just one sub-part of one homework question. Suggestions: Diligently comment the code samples. Provide example solutions to the homework assignments against which students can compare their answers as they're working on the problems. Of all the suggestions offered in this evaluation, I think these would have the largest benefit.
  • Student Groups. On the first day of class, the instructors had the students count off into groups of roughly six students each. The members of these groups exchanged email addresses, and were supposed to submit their homework assignments as a group. While this may work well in a university setting where students have mostly similar schedules and no full-time jobs, it was mostly ineffective for a class of people who have work and family obligations. Suggestion: Instead of randomly assigning students to groups, provide a means for students to organize themselves into groups based on when and where they're able to meet to work on the problems. The mailing list or a wiki page might be sufficient for this.
  • $300 Registration Fee. On the first day of class, organizer Doug Chang apologetically stated that the only reason there was a charge for the class was to weed out uncommitted students. I'm skeptical that the fee is achieving that goal, however, and strongly suspect that it has some serious unintended consequences. Rather than filtering for committed students, the registration fee seems to filter for students whose employers are willing to pay the fee for them. Compared to most Hacker Dojo events, the attendees of the machine learning class skew tremendously towards employees of huge companies like Microsoft and Symantec. I have no problem with these students attending a Hacker Dojo class, but it's a shame if other committed students are being priced out because their companies have less generous budgets for employee training and education. Another problem: the more you charge for the class, the more students will feel that they're supposed to be getting highly polished lectures and materials, as opposed to just taking part in a collaborative effort where the students are largely responsible for helping to teach each other. Suggestion: If the fee is really just to weed out uncommitted students rather than to raise money for the Dojo or instructors, then reduce it to something more affordable like $50 or $100. Another Suggestion: If you really want to filter for which students are committed to sticking with the class and contributing to it, then you could take the suggestion about student groups above and make it part of the registration process. If students couldn't register for the class until they had organized themselves into study groups, you would better filter out those who expect the instruction to be served to them on a silver platter. I suspect you'd also end up with much more effective study groups.

Mike and Tricia deserve applause for all the time and effort they continually put into their Hacker Dojo machine learning classes. My complaints about my course's effectiveness are minor in comparison to the benefit bestowed on the community by Mike and Tricia teaching these classes. By following the suggestions offered above, I think this benefit could be multiplied, and students' frustration greatly reduced.