Sunday, July 17, 2011

Why I will never pursue cheating again

Update

You can read my commentary in my new blog post: A tale about parking.

The discussion on Hacker News was good as well. Also see the response I posted on the Business Insider website and the coverage in Inside Higher Education.


============================================================
TL;DR: Cheating is not a 'bad apple' problem when incentives and assessment design make it cheap and low-risk. Detection tools help, but the scalable fix is redesign.



Last Fall was my first semester of teaching as a tenured professor. It was also the semester that I realized how pervasive cheating is in our courses. After spending a tremendous amount of time fighting and pursuing all the cheating cases, I decided that it makes no sense to fight it. The incentive structures simply do not reward such efforts. The Nash equilibrium is to let the students cheat and "perform well"; in exchange, I get back outstanding evaluations. Cheating cannot be fought through policing. We need to consider alternative approaches to evaluating students that are structurally cheating-proof.

But let me give you the complete story, as it contains tidbits that I found, in retrospect, highly entertaining.

Clarification 1: Before you jump to the conclusion that I just gave up, though, please go to "The future" section at the end of the article and read my final thoughts.

Clarification 2: The point of this blog post is not to show that "business students cheat" or that our own university is in any way different from others. I have no reason to believe that other institutions suffer any less from cheating. If you want to think that NYU, Stern, or business schools are the only places where cheating happens, then you are turning a blind eye to the problem. The fact that nobody is putting significant effort into detecting or combating cheating does not mean cheating does not exist. The main point I want to make is that cheating happens because we, structurally, put the right incentives in place for it to occur. I propose a pedagogically correct solution. And I welcome comments and feedback.



How it all started: Tenure and Turnitin Integration

There were two new things in the Fall 2010 semester:

First, it was my first semester teaching as a tenured faculty member. This allowed me to be more relaxed and stricter on things related to cheating.

Second, for the first time, our Blackboard installation had full integration with Turnitin. For those unfamiliar with these, Blackboard is a course management system, and Turnitin is a plagiarism-detection service. The integration meant that when students submitted assignments, the uploaded documents were automatically processed by Turnitin to produce originality reports.

Turnitin has a vast database of assignments (all submitted assignments are added to its database) and also checks the Internet to identify parts of the assignment that may be copied from a website. For those curious about the technicalities, detection occurs by checking for unusual n-grams that appear in two or more documents. For essay-based assignments, you can be assured that Turnitin will detect most cases of plagiarism.
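To make the n-gram idea concrete, here is a minimal sketch in Python. It is not Turnitin's actual algorithm, which I have not seen. It simply measures what fraction of a submission's word 5-grams also appear in another document, and it ignores the "unusual" part (a real system would presumably discount common phrases). The function names and the sample sentences are made up for illustration.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of word n-grams (as tuples) found in lowercased text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_fraction(submission, source, n=5):
    """Fraction of the submission's n-grams that also occur in the source."""
    sub = word_ngrams(submission, n)
    if not sub:
        return 0.0
    return len(sub & word_ngrams(source, n)) / len(sub)


# Hypothetical sample texts, invented for this sketch.
old = ("WiMax is a wireless standard designed to deliver "
       "broadband access over long distances")
new = ("As we know, WiMax is a wireless standard designed to deliver "
       "broadband access to users")
unrelated = ("Carriers choose technologies based on spectrum holdings "
             "and existing network investments")

print(overlap_fraction(new, old))        # large fraction: mostly copied
print(overlap_fraction(unrelated, old))  # no shared 5-grams
```

Even this crude version shows why light editing does not help much: a few words added at the start or changed at the end leave most of the 5-grams in the middle intact.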

So, given the ease of deployment, I decided to use Turnitin for the first time. I uploaded all my past assignments to Turnitin from prior semesters and configured Blackboard to automatically submit all new assignments through Turnitin.



First assignment out: Essay about WiMax, LTE, and the future of wireless communications

The first assignment of the semester asked students to study the technologies for "4G" wireless data transfer and to understand how the wireless carriers' choice of underlying technologies can affect their strategies. To make the assignment different from the one distributed last year, I also added LTE questions, in addition to the WiMax questions we were using before.

The assignments came back, and here is how the Turnitin report looked:


Yep. 20 assignments appeared to have more than 20% plagiarized content. Some were false positives, but most actually contained plagiarized content.

Trying to understand what is going on, I studied the reports in detail. Here is how one assignment looked, with the highlighted parts indicating parts that have been copied from other Internet sources (e.g., bbtantenna.com, moopz.com, and so on):



This student created a report by using three buttons on his keyboard: Find site on Internet, copy, paste; Find site on Internet, copy, paste; Find site on Internet, copy, paste. Although it was not a blatant case of cheating, it demonstrated an alarming practice. Students get used to preparing reports by simply looking things up on the Internet and then just pasting everything together, with minimal further editing. Even more alarming: no citations to the original sources.

I decided not to punish the students who engaged in this practice, but I had to discuss in class at length why it is a nasty habit. This is not "research" as some students call it. Plagiarism is habitual and can have dire consequences for one's professional life. A quick check on the news of that week revealed two articles about such a type of plagiarism:


Not sure if the message came across, but I tried to educate the class about what plagiarism is. Some of the students actually protested that I did not punish this behavior (they felt they had been educated enough about it in the past). Still, I decided to be lenient, since it was just a couple of cases like that. In retrospect, I was being stupid. At least one of these students cheated again in a later assignment.



The blatant cheaters

But what I considered a deep problem was not this copy-and-paste behavior. At least these students were learning how to find information online, which was admittedly relevant. With a bit of practice in properly citing their sources and some effort, these issues could be resolved. The deep problem was with students who were really cheating.

Here is the report for one offender, with 95% of the content copied from a student who took the class in Fall 2009:


There were other similar cases, but this was the most extensive. 95% of the assignment was copied word-for-word.

The student, after receiving the notification that the assignment was processed by Turnitin (but without knowing whether it was marked as plagiarized), sent me the following, highly entertaining email (emphasis is mine):

Sorry for the confusion but the assignment which I handed in online was not the correct assignment. I was away for the weekend and wrote my homework on a different lab top not my own and when finished emailed it to myself. Yesterday after class i heard the news that my best friends grandmother had a stroke and was in the hospital and i went there to help out. Then i remembered that i had the homework to hand in. I asked my roommate to turn in the work for me. Since it was not written in correct program he had to transfer into a word documentI asked someone who had already done the assignment to send him theirs to he could format my answers into the correct format. In this process he accidentally copied the other persons work into document and not mine. The only way i realized is when i looked at the Turnitin receipt and saw it was not mine. Attached is my correct work and i am sorry for the confusion.

You cannot blame the student for lack of creativity in the excuse, can you?

What are the chances of the given excuse being true? Well, let's see another page of the report:


Everything was indeed cut and pasted from an old homework, but (surprise!) the number "2009" from the old assignment was changed to "2010". I am wondering what the OS was on his "lab top" that could do such a smart copy and paste.



Blatant cheating, attempt #2

I could not really believe that he would try cheating again. But what the heck, I decided to run the newly submitted assignment through Turnitin anyway. Here is what came back:


Yep, the "revised" assignment was actually 57% copied from a Fall 2009 assignment. And from which one? From the very same assignment from which the student copied to start with! You cannot make this stuff up.

At that point, I had to suspend him from the class and refer him to the honor council for further punishment. If not for plagiarism, the student should have been penalized just for being stupid.



The class announcement: "Who cheated?"

For processing the remaining cases, I decided not to confront students directly: the case above took about 3 hours of my time to get the student to admit what he had done, despite the overwhelming evidence.

Instead, I sent an email to the class. I just said that plagiarism was detected, and whoever cheated could come find me. For the rest, I would report the case to the Dean's office, provide the evidence, and let them decide what to do and whether to pursue the case.

The result? Many more students than I was expecting were waiting outside my office during office hours. While nobody was willing to admit wrongdoing, most of them readily conceded that they "took a look at my roommate's assignment," or "got some help from my fraternity brothers," and so on. Of course, Turnitin allowed me to easily find the name of the person who "helped" them. At that point, most students just gave up and admitted that they copied.

One interesting observation: Cheating clustered in tight social networks. Not just among international students "borrowing" from their compatriots (we do not have that many in the undergrad program), but also among US-born students. A result of socializing in similar student groups? Same fraternities and sororities? I do not have enough data points to make statistical claims, but the pattern seemed very strong.



Excel-based assignment: The party continues

A few weeks later, I posted an assignment requiring students to perform Excel-based analysis. To make it easier to detect cheaters, I added some extra features that would make it difficult to just copy and paste from another assignment. (Font choice, re-sizing random cells in non-visible parts of Excel, defining variables with slightly different names, and many other small tricks.) I also modified past assignments by slightly changing the required formulas, and by adjusting the parameter values in very slight ways (e.g., from price = 10.467, I used price = 10.468; in Excel, rounded to 2 decimal digits, both showed up as 10.47).
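Here is a minimal sketch of why the parameter trick works, written in Python rather than Excel. Only the two prices come from the actual assignment. The quantity, the revenue formula, and the function names are hypothetical, chosen just to show the effect: the two prices look identical when displayed with two decimals, but any number computed from them reveals which version of the assignment was used.

```python
from decimal import Decimal, ROUND_HALF_UP

# Prices from the post; the semester labels are illustrative.
PRICE_BY_SEMESTER = {
    "previous": Decimal("10.467"),
    "current": Decimal("10.468"),
}
UNITS = Decimal("100000")  # hypothetical quantity, not from the assignment


def displayed(value):
    """Round to 2 decimals, as a cell formatted with 2 decimals would show it."""
    return Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def revenue(price):
    """Hypothetical downstream formula that depends on the price."""
    return price * UNITS


def semester_of(submitted_revenue):
    """Return which semester's parameters produce the submitted number."""
    for semester, price in PRICE_BY_SEMESTER.items():
        if revenue(price) == submitted_revenue:
            return semester
    return None


for semester, price in PRICE_BY_SEMESTER.items():
    print(semester, displayed(price), revenue(price))
```

A submission whose result matches the "previous" parameters was, in all likelihood, built on an old solution, even though the visible price cell looks exactly the same.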

When the results came back, it was a big mess. First, students submitted Excel spreadsheets containing their classmates' names. Or the author names of past PhD students who prepared the solution keys in 2006. (Keys that contain an incorrect solution as well.) It was also easy to spot students who used layouts from past solutions, as some of them did not even remove the border formatting from the Excel cells. (Yes, if you double underline cells E5 to E9, and use a Garamond font just for that part of the assignment, there is a strong suspicion that you copied and pasted the solution from 2008, which had exactly these characteristics.)

One of the offenders was actually a repeat offender from the prior assignment and was also dismissed from the class.

Another student had a nervous breakdown in my office, crying loudly and uncontrollably for 2 hours. It was awkward. On the one hand, I wanted to prevent the student from being embarrassed, and I tried to close my office door. On the other hand, I did not even want to think of being in my office behind closed doors with an undergraduate student who is crying loudly.

A complete and utter mess...



The wasted time

By the end of the semester, 22 of the 108 students enrolled in the class had admitted to cheating.

The process of discussing all the detected cases was not only painful but also extremely time-consuming.

Students would come to my office and deny everything. Then I would present the evidence to them. They would soften but continue to deny it. Only when I said, "Enough, I will just give the case to the honor council, who will decide," did most students admit wrongdoing. But every case was at least 2 hours of wasted time.

With 22 cases, that was a lot of time devoted to cheating: More than 45 hours in completely unproductive discussions, when the total lecture time for the course was just 32 hours. This is simply too much time.
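For those who want to check the arithmetic, here is a quick back-of-the-envelope calculation in Python. All the numbers are the ones quoted in this post. The 2 hours per case is a lower bound, so the total below is a floor rather than the actual figure (the first case alone took about 3 hours).

```python
cases = 22               # students who admitted to cheating
enrolled = 108           # students enrolled in the class
min_hours_per_case = 2   # "at least 2 hours" per case
lecture_hours = 32       # total lecture time for the course

min_total_hours = cases * min_hours_per_case
share_cheating = cases / enrolled
ratio_to_lectures = min_total_hours / lecture_hours

print(f"Lower bound on hours spent: {min_total_hours}")
print(f"Share of the class involved: {share_cheating:.1%}")
print(f"Cheating hours per lecture hour: {ratio_to_lectures:.2f}")
```

Even at the minimum of 2 hours per case, the time spent on cheating exceeds the total lecture time by more than a third.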



The overall experience

When 1 out of 5 students in the class is involved in a cheating case, the lectures and class discussions become awkward. For the rest of the semester, there was a palpable sense of anxiety in class. Instead of having friendly talks, the discussions became contentious. Not a pleasant environment.

This, of course, had a direct effect on my teaching evaluations. Instead of the usual assessments in the range of 6.0 to 6.5 out of 7, this time my ratings went down by almost a point: 5.3 out of 7.0. Instead of being in the upper percentiles as a teacher, I was now below average.




Will I do it again in the future?

Was it worth it? Absolutely not. Emotionally, pedagogically, and financially, it was a bad decision to be so vigilant.

The usual mode of catching and punishing only the egregious cases was much better. Why pursue every case where there is evidence of cheating? Yes, I was able to produce proof for every case that I pursued, but at what cost!

I also did not like the overall teaching experience, and this was the most important thing for me. Teaching became annoying and tiring. There was a very different dynamic in class, which I did not particularly enjoy. It was a feeling of "me-against-them," rather than the much more pleasant "these things that we are learning are really cool!"

Adding insult to injury, my yearly evaluation came back with my first "average" rating for my annual performance as a professor. And, together with the "average" rating, I received the lowest salary increase I have ever received. (It was actually below inflation, so effectively it was a salary decrease.) Not that it would matter much if my salary increase were higher, but it would signify that there is some incentive to actively pursue cheating cases. The lousy increase was the nail in the coffin.

Will I pursue cheating cases in the future? Never, ever again!




The future: How to deal with cheating?

So, how to deal with cheating in the future?

I doubt that I will be checking again for cheaters. First, this is a losing battle: as I use more advanced cheating detection schemes, the cheaters will adapt. I am not a policeman fighting crime. My role is to educate, not to enforce honest behavior. This is a university, not a kindergarten. Second, when a couple of students cheat, they have a problem. When 22 students cheat, well, the problem is mine!

Suggestions to completely change the assignments from year to year are appealing at first glance, but they create other problems: it is tough to know in advance whether an assignment will be too easy, too hard, or too ambiguous. Even small-scale testing with TAs and other faculty does not help. You need to "test" the new assignment by giving it to students. If it is a good one, you want to keep it. If it is a bad one, you just gave the students a useless exercise.

What I came to realize is that the assignments' style was an inherent part of the problem. The solution is not to detect cheating. The solution is to create assignments that are inherently not amenable to cheating:
  • Public projects: The database projects that use NYC Data Mine data (see the projects from 2009 and 2010) are one approach: they are public, and it would be meaningless to copy a project from a past semester. The risk of public embarrassment is a significant deterrent.
  • Peer reviewing: The other successful project is one in which students research a new technology and present their findings in class; the only grade they receive comes from their peers. The social pressure is so high that most of the presentations are of excellent quality. This year, the student presentation on augmented reality was so impressive that, for an MBA class, we simply showed the recorded presentation to the MBA students.
  • Competitions: To teach students how the web works, I ask them to create a website and attract at least 100 unique visitors. The student with the most visitors at the end of the semester receives an award (most often an iPod). I had some great results with this project (e.g., one student created a website on "How to Kill Nefarian" and got 150,000 visitors over 8 weeks) and some highly entertaining incidents.

These types of assignments work well for specific types of problems. I am still not sure how I can teach students, for example, to write database queries, without some "boring," well-defined assignments with pre-determined "correct" outcomes.

In other words, my theory is that cheating (on a systematic level) happens because students try to gain an edge over their peers/competitors. Even top-notch students cheat to ensure a perfect grade. Fighting cheating is not something that professors can do well in the long run, and it is counterproductive by itself. By channeling this competitive energy into creative activities, in which you cannot cheat, everyone is better off.

Any other suggestions are greatly appreciated. I am interested in what others are doing to deal with the problem.