New Test for Computers - Grading Essays at College Level - NYTimes.com

Can computer software evaluate student papers? The debate has raged in higher education for about three years, and lately it has taken on new urgency. This article from 2012 raised the question:
According to a new study [http://dl.dropbox.com/u/44416236/NCME%202012%20Paper3_29_12.pdf] by the University of Akron, computer grading software is just as effective in grading essays on standardized tests as live human scoring. After testing 16,000 middle school and high school test essays graded by both humans and computers, the study found virtually identical levels of accuracy, with the software in some cases proving to be more reliable than human grading. While the results are a blow to technology naysayers, the software is still controversial among certain education advocates who claim the software is not a cure-all for grading student essays.
http://www.aaeteachers.org/index.php/blog/714-study-robo-readers-more-accurate-in-scoring-essays, April 23, 2012
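How do studies like this measure "accuracy"? The comparison is rater agreement, and the standard statistic in the automated-essay-scoring literature is quadratic weighted kappa, which penalizes large disagreements between two raters more heavily than near-misses. Here is a minimal sketch of the computation in Python; the scores are invented for illustration, and this is my own sketch, not code from the study.

import numpy as np

def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Quadratic weighted kappa between two raters' integer scores
    on the same set of essays. 1.0 is perfect agreement; 0.0 is chance."""
    n = max_score - min_score + 1
    a = np.asarray(rater_a) - min_score
    b = np.asarray(rater_b) - min_score
    # Observed joint distribution of (rater A score, rater B score).
    observed = np.zeros((n, n))
    for i, j in zip(a, b):
        observed[i, j] += 1
    observed /= observed.sum()
    # Expected joint distribution if the two raters were independent.
    expected = np.outer(np.bincount(a, minlength=n),
                        np.bincount(b, minlength=n)).astype(float)
    expected /= expected.sum()
    # Quadratic weights: a 2-point disagreement costs 4x a 1-point one.
    weights = np.array([[(i - j) ** 2 for j in range(n)]
                        for i in range(n)]) / (n - 1) ** 2
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

human   = [4, 3, 5, 2, 4, 4, 1, 3]   # hypothetical human scores, 0-6 rubric
machine = [4, 3, 4, 2, 5, 4, 1, 3]   # hypothetical machine scores
print(quadratic_weighted_kappa(human, machine, 0, 6))  # ~0.91, strong agreement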
Now, a recent New York Times article explores the role software will have in grading papers in large online courses (and large lecture courses). The simple reality is that no teacher can grade 1,000 or more papers. A team of graders is also expensive, and any "cost benefits" of online courses are lost. (However, many of us — myself included — don't see online education as a money-saving tool. Instead, we see it as a way to teach non-traditional students.)

The NYT article begins with a hypothetical:
New Test for Computers: Grading Essays at College Level
http://www.nytimes.com/2013/04/05/science/new-test-for-computers-grading-essays-at-college-level.html
Imagine taking a college exam, and, instead of handing in a blue book and getting a grade from a professor a few weeks later, clicking the "send" button when you are done and receiving a grade back instantly, your essay scored by a software program.

And then, instead of being done with that exam, imagine that the system would immediately let you rewrite the test to try to improve your grade.

EdX, the nonprofit enterprise founded by Harvard and the Massachusetts Institute of Technology to offer courses on the Internet, has just introduced such a system and will make its automated software available free on the Web to any institution that wants to use it. The software uses artificial intelligence to grade student essays and short written answers, freeing professors for other tasks.
We might not call it "grading" or "evaluating," yet our software already helps us with instant feedback. I use spellcheck and grammar check. I use automatic formatting tools for academic citations. Many of us in education do use plagiarism detection software to screen student papers, though those applications are far from perfect.

Instant feedback helps me, as a writer. I'm sure it will help many students. But I'm not yet ready to rely on a computer to do more than offer suggestions or highlight potential problems. Often, my word processor's spellcheck is wrong.
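To show what I mean by "suggestions, not verdicts," here is a toy sketch of a checker that flags potential problems without assigning any score. It is entirely my own illustration; no word processor or EdX tool works from these particular rules.

import re

def feedback(text):
    """Flag possible problems in each sentence; never assign a grade."""
    notes = []
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    for n, sentence in enumerate(sentences, 1):
        words = sentence.split()
        if len(words) > 30:
            notes.append(f"Sentence {n}: long ({len(words)} words); consider splitting.")
        # Crude passive-voice heuristic: a form of "to be" plus an -ed word.
        if re.search(r'\b(?:is|are|was|were|been|being|be)\s+\w+ed\b', sentence):
            notes.append(f"Sentence {n}: possible passive voice; check who acts.")
        if re.search(r'\b(?:very|really|quite)\b', sentence, re.IGNORECASE):
            notes.append(f"Sentence {n}: weak intensifier; a stronger word may help.")
    return notes or ["Nothing flagged, which is not the same as nothing wrong."]

for note in feedback("The essay was graded very quickly. It is scored by a machine."):
    print(note)

Like my spellchecker, these rules will sometimes be wrong; the point is that the software highlights, and a human decides.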

I certainly endorse the feedback model:
Anant Agarwal, an electrical engineer who is president of EdX, predicted that the instant-grading software would be a useful pedagogical tool, enabling students to take tests and write essays over and over and improve the quality of their answers. He said the technology would offer distinct advantages over the traditional classroom system, where students often wait days or weeks for grades.

"There is a huge value in learning with instant feedback," Mr. Agarwal said. "Students are telling us they learn much better with instant feedback."
The software used to evaluate writing is an example of artificial intelligence. It builds a statistical model from sample essays scored by humans. Over time, as human readers double-check the feedback, the system "learns" and "refines" its criteria.
The EdX assessment tool requires human teachers, or graders, to first grade 100 essays or essay questions. The system then uses a variety of machine-learning techniques to train itself to be able to grade any number of essays or answers automatically and almost instantaneously.

The software will assign a grade depending on the scoring system created by the teacher, whether it is a letter grade or numerical rank. It will also provide general feedback, like telling a student whether an answer was on topic or not.

Dr. Agarwal said he believed that the software was nearing the capability of human grading.

"This is machine learning and there is a long way to go, but it's good enough and the upside is huge," he said. "We found that the quality of the grading is similar to the variation you find from instructor to instructor."
I do not like the "good enough" notion, nor do I like grading in general. From a computer system, I want feedback and guidance for students, not grades.
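To make the quoted training step concrete: humans grade a seed set of essays, each essay is reduced to features, and a model is fit to predict the human scores. The sketch below is my own minimal reconstruction of that description using scikit-learn and invented data; the article does not publish EdX's pipeline, so this is not their actual code.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Seed set: essays already scored by human graders (toy examples).
essays = [
    "The industrial revolution transformed labor, cities, and trade...",
    "Machines change jobs. People moved. It was big.",
    "Urbanization accelerated as factories drew workers from farms...",
    "Stuff happened in history and it mattered a lot.",
]
human_scores = [5, 2, 5, 1]  # hypothetical scores on a 0-6 rubric

# Word and word-pair frequencies plus ridge regression: a common
# baseline for essay scoring, not EdX's (unpublished) feature set.
scorer = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
scorer.fit(essays, human_scores)

# "Instant feedback": score a new essay the moment it is submitted,
# rounded and clipped to the rubric's range.
new_essay = "Factories reshaped where people lived and how they worked."
raw = float(scorer.predict([new_essay])[0])
print(f"Predicted score: {min(6, max(0, round(raw)))} / 6")

In a real deployment, the seed set would be the 100 teacher-graded essays the article describes and the features would be far richer, but the shape of the loop is the same: humans grade first, the model imitates, and humans keep auditing.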

Still, AI is going to improve. I wonder how much it will improve in the next five years. The potential is there for software to mimic human readers to the point that it will be impossible to tell a software-analyzed essay from human-analyzed work.
Last year the Hewlett Foundation, a grant-making organization set up by one of the Hewlett-Packard founders and his wife, sponsored two $100,000 prizes aimed at improving software that grades essays and short answers. More than 150 teams entered each category. A winner of one of the Hewlett contests, Vik Paruchuri, was hired by EdX to help design its assessment software.

"One of our focuses is to help kids learn how to think critically," said Victor Vuchic, a program officer at the Hewlett Foundation. "It's probably impossible to do that with multiple-choice tests. The challenge is that this requires human graders, and so they cost a lot more and they take a lot more time."

Mark D. Shermis, a professor at the University of Akron in Ohio, supervised the Hewlett Foundation's contest on automated essay scoring and wrote a paper about the experiment. In his view, the technology — though imperfect — has a place in educational settings.

With increasingly large class sizes, it is impossible for most teachers to give students meaningful feedback on writing assignments, he said. Plus, he noted, critics of the technology have tended to come from the nation's best universities, where the level of pedagogy is much better than at most schools.

"Often they come from very prestigious institutions where, in fact, they do a much better job of providing feedback than a machine ever could," Dr. Shermis said. "There seems to be a lack of appreciation of what is actually going on in the real world."
The final two paragraphs should worry everyone. The best teachers I know are at community colleges and state universities. This false dichotomy between elite pedagogy and the "real world" is disheartening.

However (you know that was coming), I know that mediocrity is a problem in higher education. Mediocre schools will struggle to compete in the future. Many of them will be endangered. Online education will not save them — it will probably help kill them. And I fear these are the schools at which administrators will rush to adopt automation, further diluting their institutions' roles in higher education.
