New Test for Computers - Grading Essays at College Level - NYTimes.com

Can computer software evaluate student papers? The debate has raged for about three years in higher education, and lately it has taken on a new urgency. This article from 2012 raises the question.
According to a new study [http://dl.dropbox.com/u/44416236/NCME%202012%20Paper3_29_12.pdf] by the University of Akron, computer grading software is just as effective in grading essays on standardized tests as live human scoring. After testing 16,000 middle school and high school test essays graded by both humans and computers, the study found virtually identical levels of accuracy, with the software in some cases proving to be more reliable than human grading. While the results are a blow to technology naysayers, the software is still controversial among certain education advocates who claim the software is not a cure-all for grading student essays.
http://www.aaeteachers.org/index.php/blog/714-study-robo-readers-more-accurate-in-scoring-essays, April 23, 2012
Now, a recent New York Times article explores the role software will have in grading papers in large online courses (and large lecture courses). The simple reality is that no teacher can grade 1,000 or more papers. A team of graders is also expensive, and any "cost benefits" of online courses are lost. (However, many of us, myself included, don't see online education as a money-saving tool. Instead, we see it as a way to teach non-traditional students.)

The NYT article begins with a hypothetical:
New Test for Computers: Grading Essays at College Level
http://www.nytimes.com/2013/04/05/science/new-test-for-computers-grading-essays-at-college-level.html
Imagine taking a college exam, and, instead of handing in a blue book and getting a grade from a professor a few weeks later, clicking the "send" button when you are done and receiving a grade back instantly, your essay scored by a software program.

And then, instead of being done with that exam, imagine that the system would immediately let you rewrite the test to try to improve your grade.

EdX, the nonprofit enterprise founded by Harvard and the Massachusetts Institute of Technology to offer courses on the Internet, has just introduced such a system and will make its automated software available free on the Web to any institution that wants to use it. The software uses artificial intelligence to grade student essays and short written answers, freeing professors for other tasks.
We might not call it "grading" or "evaluating," yet our software already helps us with instant feedback. I use spellcheck and grammar check. I use automatic formatting tools for academic citations. Many of us in education do use plagiarism detection software to screen student papers, though those applications are far from perfect.

Instant feedback helps me, as a writer. I'm sure it will help many students. But, I'm not yet ready to rely on a computer to do more than offer suggestions or to highlight potential problems. Often, my word processor's spellcheck is wrong.

I certainly endorse the feedback model:
Anant Agarwal, an electrical engineer who is president of EdX, predicted that the instant-grading software would be a useful pedagogical tool, enabling students to take tests and write essays over and over and improve the quality of their answers. He said the technology would offer distinct advantages over the traditional classroom system, where students often wait days or weeks for grades.

"There is a huge value in learning with instant feedback," Mr. Agarwal said. "Students are telling us they learn much better with instant feedback."
The software used to evaluate writing is an example of artificial intelligence. The software builds a database based on samples. Over time, as human readers double check the feedback, the system "learns" and "refines" its criteria.
The EdX assessment tool requires human teachers, or graders, to first grade 100 essays or essay questions. The system then uses a variety of machine-learning techniques to train itself to be able to grade any number of essays or answers automatically and almost instantaneously.

The software will assign a grade depending on the scoring system created by the teacher, whether it is a letter grade or numerical rank. It will also provide general feedback, like telling a student whether an answer was on topic or not.

Dr. Agarwal said he believed that the software was nearing the capability of human grading.

"This is machine learning and there is a long way to go, but it's good enough and the upside is huge," he said. "We found that the quality of the grading is similar to the variation you find from instructor to instructor."
I do not like the "good enough" notion, nor do I like grading. I like feedback and guidance for students, at least from a computer system.

Still, AI is going to improve. I wonder how much it will improve in the next five years. The potential is there for software to mimic human readers to the point that it will be impossible to tell a software-analyzed essay from a human-analyzed one.
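
For readers curious what "humans grade a sample, then the software grades the rest" might look like in miniature, here is a rough Python sketch. It is not the EdX software, and I have no knowledge of which techniques EdX actually uses. The class name ToyEssayScorer, the bag-of-words representation, and the nearest-neighbor scoring are all my own assumptions, chosen only because they fit in a few lines. The idea: store the human-graded essays, then score a new essay as a similarity-weighted average of the grades given to the most similar stored essays.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def vectorize(text):
    """Represent an essay as a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyEssayScorer:
    """Hypothetical nearest-neighbor scorer. An illustration, NOT the EdX algorithm."""

    def __init__(self, k=3):
        self.k = k
        self.examples = []  # list of (vector, human_score) pairs

    def fit(self, essays, human_scores):
        """Store the human-graded essays as training examples."""
        self.examples = [(vectorize(e), s) for e, s in zip(essays, human_scores)]
        return self

    def predict(self, essay):
        """Return the similarity-weighted mean score of the k most similar graded essays."""
        if not self.examples:
            raise ValueError("fit() must be called with graded essays first")
        vec = vectorize(essay)
        ranked = sorted(
            ((cosine(vec, ex_vec), score) for ex_vec, score in self.examples),
            key=lambda pair: pair[0],
            reverse=True,
        )[: self.k]
        total_weight = sum(sim for sim, _ in ranked)
        if total_weight == 0:
            # No words in common with any graded essay: fall back to the mean human score.
            return sum(s for _, s in self.examples) / len(self.examples)
        return sum(sim * score for sim, score in ranked) / total_weight


# Made-up sample data: a teacher's grades on a 1-to-4 scale.
graded_essays = [
    ("Plants use sunlight, water, and carbon dioxide to make sugar and oxygen.", 4),
    ("Photosynthesis turns light energy into chemical energy stored in sugar.", 4),
    ("Plants are green and they grow in the ground.", 2),
    ("I do not know much about this topic.", 1),
]
scorer = ToyEssayScorer(k=2).fit(
    [text for text, _ in graded_essays],
    [score for _, score in graded_essays],
)
print(scorer.predict("Plants make sugar from sunlight and carbon dioxide."))
```

Even this toy shows why I want software to offer suggestions and not final grades. It rewards essays that share vocabulary with highly graded samples, so a student could score well by echoing the right words without making a coherent argument. A real system would need far more than four samples and far richer features than word counts.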
Last year the Hewlett Foundation, a grant-making organization set up by one of the Hewlett-Packard founders and his wife, sponsored two $100,000 prizes aimed at improving software that grades essays and short answers. More than 150 teams entered each category. A winner of one of the Hewlett contests, Vik Paruchuri, was hired by EdX to help design its assessment software.

"One of our focuses is to help kids learn how to think critically," said Victor Vuchic, a program officer at the Hewlett Foundation. "It's probably impossible to do that with multiple-choice tests. The challenge is that this requires human graders, and so they cost a lot more and they take a lot more time."

Mark D. Shermis, a professor at the University of Akron in Ohio, supervised the Hewlett Foundation's contest on automated essay scoring and wrote a paper about the experiment. In his view, the technology — though imperfect — has a place in educational settings.

With increasingly large class sizes, it is impossible for most teachers to give students meaningful feedback on writing assignments, he said. Plus, he noted, critics of the technology have tended to come from the nation's best universities, where the level of pedagogy is much better than at most schools.

"Often they come from very prestigious institutions where, in fact, they do a much better job of providing feedback than a machine ever could," Dr. Shermis said. "There seems to be a lack of appreciation of what is actually going on in the real world."
The final two paragraphs should worry everyone. The best teachers I know are at community colleges and state universities. This false dichotomy is disheartening.

However (you know that was coming), I know that mediocrity is a problem in higher education. Mediocre schools will struggle to compete in the future. Many of them will be endangered. Online education will not save them — it will probably help kill them. And I fear these are the schools at which administrators will rush to adopt automation, further diluting their institutions' roles in higher education.
