Wikipedia is the most popular site for copying and pasting content into student papers, a new study reports, and social media and content-sharing sites are not far behind.
The report is called “Plagiarism and the Web: Myths and Realities.” It’s written by Chris Harrick, vice president of marketing for Turnitin, a popular service designed to detect content matches against other student papers, Web sites, and an entire library of academic journal and book material.
Analyzing nearly 40 million submitted high school and college papers between June 2010 and March 2011, the company detected 140 million content matches and classified them based on their source:
- Social Networking and Content Sharing (33.0%)
- Homework and Academic sites (25.0%)
- Paper Mills and Cheat sites (14.8%)
- News and Portals (13.6%)
- Encyclopedias (9.5%)
- Other (4.1%)
Harrick believes that digital media has created a cultural shift among our youth, who need to be educated to value originality in academic thought and writing. He writes:
A digital culture that promotes sharing, openness and re-use is colliding with one of the fundamental tenets of education – the ability to develop, organize and express original thoughts. For many students who have grown up sharing music, retweeting thoughts and downloading free software, the principle of originality in research and writing can seem antiquated.
For a study of 40 million papers and 140 million content matches, I found this report excessively lean. We are not told how much content is typically matched in student papers, the typical number of content sources per paper, or the percentage of papers that don’t suffer from any content match, nor do we have a breakdown based on education type, subject, or grade level.
More importantly, while the company is clear about distinguishing plagiarism (a deliberate attempt to claim ownership of another’s intellectual contribution) from content matching (the simple overlap of text), Harrick often equates the two in the report, and most explicitly in its title (“Plagiarism and the Web”).
The service does not detect nor determine plagiarism – it detects patterns of matching text to help instructors determine if plagiarism has occurred.
Unfortunately, stories using the “P-word” have already started showing up in the media: The Chronicle of Higher Education (“Plagiarism Goes Social”), Inside Higher Ed (“The Sources of Plagiarism”), PR Newswire (“Turnitin Debunks Myths Surrounding Plagiarism on the Web”), The Washington Post (“Study: 8 top sites for potential plagiarism”), and US News & World Report (“Plagiarists Turn to Academic Sites, Not Paper Mills”).
While I have no doubt that plagiarism is present in many of the 140 million content matches in the dataset, the study did not attempt to investigate plagiarism, which is why I take issue with the use of the “P-word” in this context. Many of the content matches may simply be attributed pieces of text (such as a quotation that is found on Wikipedia), or a block of text that is followed with a citation or footnote. Even academics charged with plagiarism find ways of attributing the match to a missing quotation mark or the loss of a few footnotes, as in the case of the late American historian, Stephen Ambrose. In academia, the P-word is more offensive than the F-word, which is why it should be used carefully.
At present, the Turnitin study provides some novel and informative data about where students are getting content for their papers. For this reason alone, the report is valuable to high-school and college educators, librarians, and academic publishers who may all attempt to steer their students to more authoritative sources of content (or alternatively, start populating these sites with the content they want students to read). But without knowing whether the content students use has been properly attributed with a reference, it’s a stretch to make strong claims about student plagiarism.
Students, you can quote me on this: Just make sure to include a full citation.