I drafted this one back in May when the controversy described was live, but never quite got around to pushing it out the door. There haven't been any real developments in the news since then, and I still believe in the points I made, so I'm publishing it now before it gets any more stale.
A course I used to teach -- cs50 -- has seen some on-campus news (and editorial) coverage recently in the wake of a leak that 60 students in the course were reported to the Administrative Board on suspicions of academic dishonesty. I don't have much to say, not having any relationship with the course since 2016 and not having any particularly relevant inside information as a former Teaching Fellow. (As a relatively junior member of the course staff, I wasn't asked to work on any cases of academic dishonesty, and what little I could reveal has already been shared with the press.) But there's one thing that's come up in the ensuing commentary that I would like to put a finer point on.
The Crimson's Editorial Board has called for the course to lay out a more explicit collaboration policy:
The course has an open-ended and amorphous honor policy, under the banner of being "reasonable." We understand that the collaborative nature of computer science requires students to work together, and may thus make promulgating strict rules about plagiarism more difficult...
Though the course staff have undoubtedly given this matter some thought, we urge them to more explicitly delineate what is allowed and what is unacceptable. That many of the accused students allege that they were unaware of having done anything wrong may suggest that the honor policy was simply not reasonable enough...
Drawing these distinctions is especially difficult for freshmen, who do not have experience with Harvard's norms of appropriate collaboration and make up many of the students enrolled in cs50...
We would encourage cs50 to adopt a collaboration policy that is more similar to that of other courses in the Computer Science department. For example, they could consider continuing to encourage students to discuss the concepts while prohibiting viewing another student's code. Academic dishonesty is an issue that is always paramount in an institution of higher learning, and we hope cs50 and the Honor Council can work together to better define the limits of the reasonable for students. (...)
I think the Crimson's got this one slightly sideways. The piece claims "That many of the accused students allege that they were unaware of having done anything wrong may suggest that the honor policy was simply not reasonable enough." But where, exactly, do they think the ambiguity lies? And how seriously have they considered how best to resolve it?
There are cases where a policy of "be reasonable" suffices and cases where it doesn't. Here are some of each:
- A student uses handout code to complete the assignment. They understand the handout code and could reproduce it on their own, given time.
- A student uses handout code to complete the assignment. The handout code is an enigma to them and they couldn't possibly reproduce it on their own.
- A student asks a TF questions in English about the ideas of the problem set. They talk, and clear up confusion.
- A student asks a TF why their code doesn't work, and they go over it together. At the end, the student ends up with code they understand, and could probably reproduce on their own.
- A student asks a TF why their code doesn't work, and they go over it together. At the end, the student ends up with code they don't understand, and couldn't reproduce on their own.
- A student asks a classmate questions in English about the ideas of the problem set. They talk, and clear up confusion.
- A student asks a classmate why their code doesn't work, and they go over it together. At the end, the student ends up with code they understand, and could probably reproduce on their own.
- A student asks a classmate why their code doesn't work, and they go over it together. At the end, the student ends up with code they don't understand, and couldn't reproduce on their own.
- A student searches sources on the Internet to clear up confusion about ideas of the problem set.
- A student searches sources on the Internet to figure out why their code doesn't work. Eventually, the student ends up with code they understand, and could probably reproduce on their own.
- A student searches sources on the Internet to figure out why their code doesn't work. Eventually, the student ends up with code they don't understand, and couldn't reproduce on their own.
- A student submits, for some piece of their assignment, code that their TF dictated to them at office hours.
- A student submits, for some piece of their assignment, code that a classmate dictated to them when working together.
- A student submits, for some piece of their assignment, code that a classmate wrote for them.
- A student submits, for some piece of their assignment, code that they copied from an Internet source.
- A student submits an outright solution that a classmate dictated to them when working together.
- A student submits an outright solution that a classmate wrote for them.
- A student submits an outright solution that they copied from an Internet source.
and: Does the 'reasonableness' of these change for code that's the real heart of the problem, versus helper code, debugging code, or testing code?
The answers aren't universal -- they should be determined by the pedagogical purpose of the assignment at hand! I expect that a member of the teaching staff, familiar with a course's educational philosophy, should be able to tell which of these are 'being reasonable' and which are 'being unreasonable'. I certainly don't expect that every student will.
For one, there's the issue that reasonableness is necessarily in the context of what the course is trying to achieve, and the teaching staff are more aware of that than the students. And for another, many if not most students in an introductory course will have a fuzzy view of which of the difficulties they face are essential, and which are accidental (as Fred Brooks uses those terms in "No Silver Bullet"). My former student Diondra Dilworth put it very well in remarks to the Crimson:
For younger users, it might be more difficult to discern what is information that's pushing them in the right direction, and information that's just actually solving the problem... I don't think people are, like, trying to cheat or anything, it's encouraged to look for help elsewhere, and sometimes that help just gives you too much of the answer. (...)
And former teaching fellow Mark Grozen-Smith:
I think [the collaboration policy] is vague, but I think it's also intentionally vague because in certain situations it's okay to share a couple lines of code; in other situations, this one line of code is the entire problem, so you shouldn't be talking or even looking at what the person did. (...)
The ability to tell the difference between one line of code that's the entire problem and a couple of lines that any computer scientist would agree there's no value in re-writing yourself is a hard thing to learn -- and one of the most important things that cs50 could teach. (I say 'could' because I can't remember ever receiving guidance as a teaching fellow on how to teach students the difference.)
Experienced programmers -- and even those fresh out of cs50 who did well enough to end up in the teaching corps -- have very likely forgotten what it was like to struggle for hours trying to fix a bug in some unenlightening side-problem far from the actual heart of the assignment. And they're likely to underestimate just how much that's going to happen, and how much time crunching through hard, essential problems gets lost because students are bashing their heads against the incidentals.
Similarly, consider the student's assumed unfamiliarity with the course's educational priorities and the intended design of the assignments -- if that's accurate, isn't it...a problem? Regardless of subject, education should be a participatory process, not something that is done to a student, and unless the student has figured out for themselves the operative design of the course, they'll only get value from the assignments by accident.
So I disagree with the Crimson that alleged misunderstandings of the collaboration policy and the bounds of 'reasonableness' are cause for writing a more explicit policy. They're certainly grounds for change, but in the direction of more education, not more legislation.
Besides, what the heck does "a collaboration policy that is more similar to that of other courses in the Computer Science department" mean? Most courses in the computer science department have even more vague collaboration policies.
I had to go back and check the official policy for the other course I served as a TF for -- Operating Systems:
- Web work is to be completed individually and should not be discussed prior to class. If you find questions on the web work confusing, please email Professor Seltzer, who will either answer directly or dedicate a few minutes of class time to clarifying confusing points.
- The documents you turn in for assignment 1 should be the work of an individual, however, you are allowed and encouraged to discuss the assignment and problems with other students. Those conversations should be limited to a spoken language and not code.
- Assignments 2-4 will be completed in pairs. You may discuss the assignment with other groups and as indicated below, you will be giving and receiving feedback on your design by another group. These design reviews must be documented in your design document. The code you turn in should have been produced only by team members (design documents might contain pseudo code, but should not contain real code). If you receive significant design advice or feedback from anyone, please document that in your design document.
- Exams are to be completed entirely independently. Details on what materials will be allowed for the two tests will be specified on the test itself.
- When in doubt: Ask!
The teaching staff didn't have to think about that even once, among our 30 students. Of course, they were all experienced, sharp programmers and we worked hard to make the assignments reflect essential difficulties in operating systems design, rather than accidental ones. (The official line is that students build an operating system kernel in the course of the semester, but there's an awful lot of annoying, incidental filler that we hand out to them.)
Anyway, here's cs50's policy:
The essence of all work that you submit to this course must be your own. Collaboration on problem sets is not permitted except to the extent that you may ask classmates and others for help so long as that help does not reduce to another doing your work for you. Generally speaking, when asking for help, you may show your code to others, but you may not view theirs, so long as you and they respect this policy’s other constraints. Collaboration on the course’s test and quiz is not permitted at all. Collaboration on the course’s final project is permitted to the extent prescribed by its specification.
Below are rules of thumb that (inexhaustively) characterize acts that the course considers reasonable and not reasonable. If in doubt as to whether some act is reasonable, do not commit it until you solicit and receive approval in writing from the course’s heads. Acts considered not reasonable by the course are handled harshly...
Reasonable:
- Communicating with classmates about problem sets' problems in English (or some other spoken language).
- Discussing the course’s material with others in order to understand it better.
- Helping a classmate identify a bug in his or her code at office hours, elsewhere, or even online, as by viewing, compiling, or running his or her code, even on your own computer.
- Incorporating a few lines of code that you find online or elsewhere into your own code, provided that those lines are not themselves solutions to assigned problems and that you cite the lines' origins.
- Reviewing past semesters' quizzes and solutions thereto.
- Sending or showing code that you’ve written to someone, possibly a classmate, so that he or she might help you identify and fix a bug.
- Sharing a few lines of your own code online so that others might help you identify and fix a bug.
- Turning to the course’s heads for help or receiving help from the course’s heads during the quiz or test.
- Turning to the web or elsewhere for instruction beyond the course’s own, for references, and for solutions to technical difficulties, but not for outright solutions to problem set’s problems or your own final project.
- Whiteboarding solutions to problem sets with others using diagrams or pseudocode but not actual code.
- Working with (and even paying) a tutor to help you with the course, provided the tutor does not do your work for you.
Not reasonable:
- Accessing a solution to some problem prior to (re-)submitting your own.
- Asking a classmate to see his or her solution to a problem set’s problem before (re-)submitting your own.
- Decompiling, deobfuscating, or disassembling the staff’s solutions to problem sets.
- Failing to cite (as with comments) the origins of code or techniques that you discover outside of the course’s own lessons and integrate into your own work, even while respecting this policy’s other constraints.
- Giving or showing to a classmate a solution to a problem set’s problem when it is he or she, and not you, who is struggling to solve it.
- Looking at another individual’s work during the test or quiz.
- Paying or offering to pay an individual for work that you may submit as (part of) your own.
- Providing or making available solutions to problem sets to individuals who might take this course in the future.
- Searching for or soliciting outright solutions to problem sets online or elsewhere.
- Splitting a problem set’s workload with another individual and combining your work.
- Submitting (after possibly modifying) the work of another individual beyond the few lines allowed herein.
- Submitting the same or similar work to this course that you have submitted or will submit to another.
- Submitting work to this course that you intend to use outside of the course (e.g., for a job) without prior approval from the course’s heads.
- Turning to humans (besides the course’s heads) for help or receiving help from humans (besides the course’s heads) during the quiz or test.
- Viewing another’s solution to a problem set’s problem and basing your own solution on it.
cs50's problem is not that it has a vague written policy.
Seriously, the vaguest parts are the 'reasonable' item "[t]urning to the web or elsewhere for instruction beyond the course’s own, for references, and for solutions to technical difficulties, but not for outright solutions to problem set’s problems or your own final project" and the 'unreasonable' item "[f]ailing to cite (as with comments) the origins of code or techniques that you discover outside of the course’s own lessons and integrate into your own work, even while respecting this policy’s other constraints." What is a 'technical difficulty'? What is an 'outright solution'?
But, again, the course is failing at its pedagogical task if students don't know that they don't know the answers to those questions -- or if they don't have anyone to ask.