Thoughts on Turnitin’s database

I was reviewing some originality reports generated by Turnitin for some first-year undergraduate work today, and it struck me that their database has grown to such an extent that I no longer trust it. I never thought this would be the case. I assumed that as the database of harvested web pages and student papers grew, the accuracy of detection would increase. Bigger is better, right? I didn’t think about noise, though. There is some filtering in the Turnitin system, but increasingly it seems it isn’t good enough.

An example: one student has a list of 6 different sources, all matching small parts of student essays deposited as ‘reference material’ on and related websites all owned by the same company and, in all likelihood, sharing the same database. Do I believe that this student copied from this site? I’m not sure. Do I believe that each of the essays on the site may have originally been copied from Wikipedia? More likely. Do I believe that all content on Wikipedia is original? Not really.

Recycling and repurposing of text online is becoming so ubiquitous that the noise is causing a problem in the interpretation of originality reports. They used to save us time in investigating cases of plagiarism; now I’m not so sure.

8 thoughts on “Thoughts on Turnitin’s database”

  1. Thanks for the observations, and I agree with you. I’ve suspected that T doesn’t always link to the original source, and so many websites contain Wikipedia information you can never be sure where the student has obtained the stuff from. I think the coursework black market must be massive now, with students selling on work. I think our only solution is to design smarter assessments, after we’ve hit the gin bottle of course.

  2. Interesting and timely view, as pressure for Turnitin to be made available is high in many of the programme committees I’m currently attending – especially from external examiners’ remarks. It’s very likely we will be providing it soon (pending policy decisions, not technical delays), and it will be interesting to see how it is perceived when it is.

    1. @Nick – I don’t doubt Turnitin as a tool to be able to detect the presence of non-original text. Interpretation of the reports has always required academic judgement. What is new is that the level of noise is such that academics need to be aware that they are not looking at matches to the original source of the work anymore.

      1. Understood – I did get that from the post, I probably didn’t make that clear; typing on a phone screen doesn’t make for very eloquent replies at times 🙂

  3. This has been suggested in some of the comments, but the message to get across is that Turnitin does not detect plagiarism – what it does do is find possible matches to other sources that are on the internet or in their own submission database. It is then the tutor who uses this information to determine if it is or isn’t plagiarism.

    When I run training sessions on TurnItIn I make it clear that I am not interested in detecting plagiarism, what I want to do is to deter it from happening, and for that it is a superb tool.

  4. I’d agree with Dave’s key point about T – something I stress in my own training is that ‘T doesn’t detect plagiarism: academics do’.

    Sure, there’s a lot (and indeed a growing amount) of noise on T, but it does give you quite powerful tools to cut through it. The default view is ‘show highest matches together’ and in this view you can click on the little cross in the top right hand corner of the matches column to exclude these from your search.

    But if you change this view to ‘show matches one at a time’ it allows you to see deeper into the levels of matches to find where the original source might be.

    Hope this helps cut down the ‘noise’ in your plagiarism checking!
