I Have Verified the Randall Munroe Wikipedia Philosophy Hypothesis for ‘cheese sandwich’ and ‘Gorillaz’.

Context from the hover text on today’s XKCD:

Wikipedia trivia: if you take any article, click on the first link in the article text not in parentheses or italics, and then repeat, you will eventually end up at “Philosophy”.
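The procedure in the hover text is just a walk along a chain of first links, so it is easy to sketch. The version below makes no network requests and says nothing about real Wikipedia content. The `TOY_LINKS` table, the article names in it, and the `follow_first_links` helper are all invented for illustration. The loop check is my own addition, since a chain that circles back on itself would never reach "Philosophy".

```python
def follow_first_links(start, first_link, target="Philosophy", max_hops=100):
    """Follow first links from `start` until `target` is reached.

    `first_link` maps an article title to the title of its first
    qualifying link. Returns the list of titles visited, ending at
    `target`, or None if the chain dead-ends, loops, or needs more
    than `max_hops` hops.
    """
    path = [start]
    seen = {start}
    current = start
    while current != target:
        nxt = first_link.get(current)
        # Give up on a dead end, a repeat visit (a loop), or too many hops.
        if nxt is None or nxt in seen or len(path) > max_hops:
            return None
        path.append(nxt)
        seen.add(nxt)
        current = nxt
    return path


# Hypothetical link table: made-up articles, not real Wikipedia data.
TOY_LINKS = {
    "Article A": "Article B",
    "Article B": "Article C",
    "Article C": "Philosophy",
    "Loop X": "Loop Y",
    "Loop Y": "Loop X",
}

print(follow_first_links("Article A", TOY_LINKS))
print(follow_first_links("Loop X", TOY_LINKS))
```

To check a real article you would have to replace the toy table with something that fetches each page and picks out the first link that is not in parentheses or italics, which is the fiddly part.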

You Should Study Accounting

Not necessarily for a career, but as part of being a well-rounded member of society. This quote from Swarthmore’s Timothy Burke via The Atlantic’s Megan McArdle explains why:

The most awesome development project I’ve ever seen personally was a former small businesswoman in the Peace Corps who was teaching a handful of small business owners in a small African city how to do double-entry accounting with handwritten ledgers. The smart insight here was that most of them didn’t want to hire anyone who wasn’t kin because they couldn’t track the flow of money through their business, and that most of them couldn’t really invest anything they accumulated or expand their business for the same reason. I’d have given her a donation.
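If you have never seen double-entry bookkeeping, here is a rough sketch of the core idea: every transaction is written down twice, as a debit to one account and a credit of the same amount to another, so the money can always be traced. This is my own toy illustration, not anything from Burke's post. The `Ledger` class, the account names, and the amounts are all made up.

```python
from collections import defaultdict


class Ledger:
    """A toy double-entry ledger; amounts are whole currency units."""

    def __init__(self):
        # Each entry: (memo, debited account, credited account, amount)
        self.entries = []

    def post(self, memo, debit, credit, amount):
        """Record one transaction as a matching debit and credit."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.entries.append((memo, debit, credit, amount))

    def net(self, account):
        """Total debits minus total credits for one account."""
        total = 0
        for _memo, debit, credit, amount in self.entries:
            if debit == account:
                total += amount
            if credit == account:
                total -= amount
        return total

    def all_nets(self):
        """Net position of every account that appears in the ledger."""
        nets = defaultdict(int)
        for _memo, debit, credit, amount in self.entries:
            nets[debit] += amount
            nets[credit] -= amount
        return dict(nets)


# Hypothetical transactions for a small shop.
ledger = Ledger()
ledger.post("Owner puts in start-up money", "Cash", "Owner equity", 500)
ledger.post("Buy stock for the shop", "Inventory", "Cash", 200)
ledger.post("Cash sale", "Cash", "Sales", 120)

print(ledger.net("Cash"))               # cash on hand: 500 - 200 + 120
print(sum(ledger.all_nets().values()))  # debits and credits cancel out
```

Because every entry touches two accounts, the nets across all accounts always sum to zero. If the cash box disagrees with the ledger's cash figure, the owner knows to go looking for the difference, which is the kind of tracking the business owners in the quote lacked.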

Machine Learning 101 Course Evaluation

I’ve just finished the five-week Machine Learning 101 class taught at Hacker Dojo.

I didn’t get what I had hoped for out of the course. I’m quick to acknowledge that I could have gotten more out of it by applying more effort, but I also think there were some unnecessary roadblocks in the course that stymied my (and many other students’) attempts to dig into the material. My purpose in this review is to identify the things that I think worked well, identify the things that didn’t work well, and offer some suggestions for improving the course in the future. I want to emphasize that this is only one person’s opinion; there are surely students whose experience was better (or worse) than mine.

Things That Worked

  • The Instructors. Mike Bowles and Patricia Hoffman are excellent lecturers, and show genuine concern for helping their students to grasp the material.
  • The Homework Assignments. When I was able to complete them, the homework questions did a good job helping me to grasp the concepts they covered.
  • The Venue. Hacker Dojo is just a fun place to visit. Even when there were other events going on at the Dojo during our class time, we had no problems with disruptive noise or contention for space.

Things That Didn’t Work So Well

  • Sign Up Process. Initially it appeared that to sign up for the course you needed to join the meetup.com group, and if you didn’t get in under the cap of 70 students you’d be put on a waiting list. The meetup.com page indicated that payment for the course should be made with a check at the first class session. A few days before the start of the course, however, this was changed so that sign-up and payment happened on Eventbrite. While I imagine this delighted the wait-listed students who were able to register under the new system, it seems like a trap for students who thought they were safely registered. Suggestion: Get the sign-up and payment systems in place weeks before the start of the class.
  • Classroom Management. During the first three weeks of the course, the lectures were excessively interrupted by students asking repetitive questions or making pedantic arguments about basic concepts. Suggestion: Designate a chat channel on Freenode or Convore where students can ask and answer questions during class time without disrupting the lecture.
  • Prerequisites. The course was advertised as being appropriate for students who knew how to program in some language, but not specifically R. I assumed this meant that the instructors would teach the parts of R needed to apply the concepts taught in class and complete the homework assignments. This did not turn out to be the case. Instead of giving in-class instruction on R fundamentals, the instructors announced on the first day of class that they would have a special one-off class on R fundamentals the following morning (Sunday). While I applaud the instructors for dedicating extra time to help with R instruction, doing it with a surprise session outside of class time meant that I (and I assume many other students) could not attend. It wasn’t until several weeks into the course, and after pestering my R-using colleagues at work, that I was able to clearly understand the distinction between arrays, vectors, and data frames, for example. Suggestion: Spend less of the first class session on the 100-mile-per-hour overview of machine learning, and more on R fundamentals such as datatypes and indexing.
  • Four-Hour Class Sessions. Unlike the twice-a-week Machine Learning 201 course, Machine Learning 101 met only once per week, for four hours on Saturday morning. I didn’t mind the weekend morning start time (I think it did a good job filtering out the uncommitted), but at the end of four hours the attention spans of even the most diligent students were exhausted. Suggestion: Split the class time into two sessions per week, of two hours each. More frequent meeting times will also have the benefit of helping to keep the subject matter fresher in students’ minds.
  • The Homework Assignments. The homework assignments were frequently insurmountable given the combination of sparsely-commented code samples, unfamiliarity with R, unfamiliarity with the substantive topics we were supposed to be learning, and the absence of instructor-provided solutions. Actually completing the homework assignments required heroic effort, like the guy who spent 5 hours Googling and puzzling over how he was supposed to re-scale the coefficients after doing a ridge regression, which was just one sub-part of one homework question. Suggestions: Diligently comment the code samples. Provide example solutions to the homework assignments against which students can compare their answers as they’re working on the problems. Of all the suggestions offered in this evaluation, I think these would have the largest benefit.
  • Student Groups. On the first day of class, the instructors had the students count off into groups of roughly six students each. The members of these groups exchanged email addresses, and were supposed to submit their homework assignments as a group. While this may work well in a university setting where students have mostly similar schedules and no full-time jobs, it was mostly ineffective for a class of people who have work and family obligations. Suggestion: Instead of randomly assigning students to groups, provide a means for students to organize themselves into groups based on when and where they’re able to meet to work on the problems. The mailing list or a wiki page might be sufficient for this.
  • $300 Registration Fee. On the first day of class, organizer Doug Chang apologetically stated that the only reason there was a charge for the class was to weed out uncommitted students. I’m skeptical that the fee is achieving that goal, however, and strongly suspect that it has some serious unintended consequences. Rather than filtering for committed students, the registration fee seems to filter for students whose employers are willing to pay the fee for them. Compared to most Hacker Dojo events, the attendees of the machine learning class skew tremendously towards employees of huge companies like Microsoft and Symantec. I have no problem with these students attending a Hacker Dojo class, but it’s a shame if other committed students are being priced out because their companies have less generous budgets for employee training and education. Another problem: the more you charge for the class, the more students will feel that they’re supposed to be getting highly polished lectures and materials, as opposed to just taking part in a collaborative effort where the students are largely responsible for helping to teach each other. Suggestion: If the fee is really just to weed out uncommitted students rather than to raise money for the Dojo or instructors, then reduce it to something more affordable like $50 or $100. Another Suggestion: If you really want to filter for which students are committed to sticking with the class and contributing to it, then you could take the suggestion about student groups above and make it part of the registration process. If students couldn’t register for the class until they had organized themselves into study groups, you would better filter out those who expect the instruction to be served to them on a silver platter. I suspect you’d also end up with much more effective study groups.

Mike and Tricia deserve applause for all the time and effort they continually put into their Hacker Dojo machine learning classes. My complaints about my course’s effectiveness are minor in comparison to the benefit bestowed on the community by Mike and Tricia teaching these classes. By following the suggestions offered above, I think this benefit could be multiplied, and students’ frustration greatly reduced.

Gutless Comparisons

If you’re going to write an article comparing X and Y, try to have an opinion. Don’t end it with this:

The best way to determine which is right for you is to download both and put each through a comprehensive evaluation.

The Server Side: Comparing MySQL and Postgres 9.0 Replication

Time for a New Core Curriculum

There are some things that schools and colleges really focus on that I just don’t think are very relevant anymore. There are other things that kids really need to know, but that are only taught as electives if they’re taught at all.

  • Less useful: Cursive writing. You don’t need to know this anymore. It is far more important that you know how to type. Yet typing is taught much later and with a lot less time devoted to it.
  • Less useful: Shop classes. When the most common post-school career was to go work in a factory somewhere, it made a lot of sense for schools to emphasize shop classes. My high school had separate classes for both wood and metal shop. The only “computers” class they had actually consisted of working through some painfully basic tutorials on how to use spreadsheets. (They used Quattro Pro.) I can’t remember how, but I managed to drop the class and weasel into getting the credit hours some other way. I think the policy has changed since then and the school now offers Java classes and an AP prep class in computer science. It’s a start.
  • Less useful: Foreign languages. This one’s kind of a sacred cow. I know lots of people will think “We really need students to learn a foreign language so they can learn new ways of thinking and be able to interact with the rest of the world.” Phooey. The rest of the world is learning English as fast as they can for this very reason. High school and college foreign language requirements usually go one of two ways in my experience: 1) the student takes a language that they already know (often Spanish) and breezes through the classes, or 2) the student picks a romantic-sounding but increasingly-irrelevant language like Russian or French, spends a couple years learning how to ask where the bathroom is, and then forgets it soon after leaving college.
  • More useful: HTML. A couple weeks back I had to write a painstakingly-detailed tutorial on how to create HTML hyperlinks for someone at work who had never done it before. I consider this basic literacy these days. You don’t need to know a bunch of fancy HTML and make complete page layouts, but everyone should know enough to be able to write a blog post. This should be taught in elementary school.
  • More useful: Statistics. We are surrounded by more and more data every day. Yesterday I went to the SF Bay ACM’s Data Mining Camp. I was struck by how many of the attendees were from India, Russia, and a smattering of other foreign countries. Probably half the attendees were not US natives. This says to me that there’s a big demand for these skills, and that the US is just not meeting it.
  • More useful: Graphic/UI Design. It takes a special kind of person to be able to design something attractive and then also know enough geekery to make it interactive and useful. There was a time about 10 years ago when I thought graphic design was a “wishful thinking” major, with a lot more people going into it than there were jobs available. Maybe that’s still true of print design, but in the tech world there’s a desperate need for these skills.
  • Less useful: Law. The legal economy shrank even more than the economy as a whole in the recent downturn. The traditional view of the law as one of the distinguished, learned professions is woefully out of date. Law school teaches very little that is of practical value in a legal career. All law students take on roughly the same debt, but only a small proportion of lawyers from the top schools will earn big law firm money. Also, watch this.
  • More useful: Accounting. Whether just for personal use, or because you’re starting a small business, or because you want to work for an accounting firm, I think everyone should have at least one accounting class in high school or college. You will use this.

Check out the current graduation requirements from my high school.

Geoffrey Pullum Imposes Death Penalty for Leaving Lame Blog Comments

Over at Language Log.

I would add “anyone who walks by a janitor cleaning windows at the JSMB and says ‘Can you come do mine next?!’”

Secret Mormons!

That insightful guy on the religious discussion forum might be more than he says he is.

Ben reports.