
New Test for Computers - Grading Essays at College Level

Can computer software evaluate student papers? The debate has raged for about three years in higher education, and lately it has taken on a new urgency. This article from 2012 raises the question.
According to a new study by the University of Akron, computer grading software is just as effective in grading essays on standardized tests as live human scoring. After testing 16,000 middle school and high school test essays graded by both humans and computers, the study found virtually identical levels of accuracy, with the software in some cases proving to be more reliable than human grading. While the results are a blow to technology naysayers, the software is still controversial among certain education advocates who claim the software is not a cure-all for grading student essays. (April 23, 2012)
Now, a recent New York Times article explores the role software will have in grading papers in large online courses (and large lecture courses). The simple reality is that no teacher can grade 1,000 or more papers. A team of graders is also expensive, and any "cost benefits" of online courses are lost. (However, many of us, myself included, don't see online education as a money-saving tool. Instead, we see it as a way to teach non-traditional students.)

The NYT article begins with a hypothetical:
New Test for Computers: Grading Essays at College Level
Imagine taking a college exam, and, instead of handing in a blue book and getting a grade from a professor a few weeks later, clicking the "send" button when you are done and receiving a grade back instantly, your essay scored by a software program.

And then, instead of being done with that exam, imagine that the system would immediately let you rewrite the test to try to improve your grade.

EdX, the nonprofit enterprise founded by Harvard and the Massachusetts Institute of Technology to offer courses on the Internet, has just introduced such a system and will make its automated software available free on the Web to any institution that wants to use it. The software uses artificial intelligence to grade student essays and short written answers, freeing professors for other tasks.
We might not call it "grading" or "evaluating," yet our software already helps us with instant feedback. I use spellcheck and grammar check. I use automatic formatting tools for academic citations. Many of us in education do use plagiarism detection software to screen student papers, though those applications are far from perfect.

Instant feedback helps me, as a writer. I'm sure it will help many students. But, I'm not yet ready to rely on a computer to do more than offer suggestions or to highlight potential problems. Often, my word processor's spellcheck is wrong.

I certainly endorse the feedback model:
Anant Agarwal, an electrical engineer who is president of EdX, predicted that the instant-grading software would be a useful pedagogical tool, enabling students to take tests and write essays over and over and improve the quality of their answers. He said the technology would offer distinct advantages over the traditional classroom system, where students often wait days or weeks for grades.

"There is a huge value in learning with instant feedback," Mr. Agarwal said. "Students are telling us they learn much better with instant feedback."
The software used to evaluate writing is an example of artificial intelligence. The software builds a database based on samples. Over time, as human readers double check the feedback, the system "learns" and "refines" its criteria.
The EdX assessment tool requires human teachers, or graders, to first grade 100 essays or essay questions. The system then uses a variety of machine-learning techniques to train itself to be able to grade any number of essays or answers automatically and almost instantaneously.

The software will assign a grade depending on the scoring system created by the teacher, whether it is a letter grade or numerical rank. It will also provide general feedback, like telling a student whether an answer was on topic or not.

Dr. Agarwal said he believed that the software was nearing the capability of human grading.

"This is machine learning and there is a long way to go, but it's good enough and the upside is huge," he said. "We found that the quality of the grading is similar to the variation you find from instructor to instructor."
I do not like the "good enough" notion, nor do I like grading. I like feedback and guidance for students, at least from a computer system.
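
For readers curious what "training on graded samples" can look like, here is a toy sketch in Python. It is not the EdX software, whose internals the article does not describe. It assumes a deliberately simple technique (word counts compared by cosine similarity, with a grade borrowed from the most similar human-graded essays), and the class name ToyEssayScorer, the sample essays, the grades, and the on-topic threshold are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyEssayScorer:
    """Hypothetical nearest-neighbor scorer; an illustration only."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []  # list of (word counts, human grade)

    def fit(self, essays, grades):
        """Store the human-graded essays as word-count vectors."""
        self.samples = [(Counter(tokenize(e)), g) for e, g in zip(essays, grades)]
        return self

    def _neighbors(self, essay):
        """Return (similarity, grade) pairs, most similar first."""
        if not self.samples:
            raise ValueError("call fit() with graded essays first")
        vec = Counter(tokenize(essay))
        pairs = [(cosine(vec, v), g) for v, g in self.samples]
        return sorted(pairs, key=lambda p: p[0], reverse=True)

    def predict(self, essay):
        """Similarity-weighted average grade of the k closest samples."""
        top = self._neighbors(essay)[: self.k]
        total = sum(sim for sim, _ in top)
        if total == 0:
            # Nothing in common with any sample: fall back to a plain average.
            return sum(g for _, g in top) / len(top)
        return sum(sim * g for sim, g in top) / total

    def on_topic(self, essay, threshold=0.2):
        """Crude feedback: does the essay resemble any graded sample?"""
        return self._neighbors(essay)[0][0] >= threshold


# Made-up training data: a teacher's grades on a 1-5 scale.
essays = [
    "Photosynthesis converts light energy into chemical energy stored in glucose.",
    "Plants use sunlight, water and carbon dioxide to make glucose and oxygen.",
    "Plants are green and grow in the ground.",
    "I like plants because they are pretty.",
]
grades = [5, 4, 2, 1]

scorer = ToyEssayScorer(k=2).fit(essays, grades)
new_essay = "Sunlight lets plants make glucose from water and carbon dioxide."
print(round(scorer.predict(new_essay), 2), scorer.on_topic(new_essay))
```

Even this toy shows why I prefer the word "feedback" to "grading." The scorer rewards essays that share vocabulary with the highly graded samples, and it knows nothing about whether an argument is sound. A real system would use far richer features and many more graded samples, but it still learns from whatever the human graders rewarded.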

Still, AI is going to improve. I wonder how much it will improve in the next five years. The potential is there for software to mimic human readers to the point it will be impossible to tell the software-analyzed essay from the human-analyzed work.
Last year the Hewlett Foundation, a grant-making organization set up by one of the Hewlett-Packard founders and his wife, sponsored two $100,000 prizes aimed at improving software that grades essays and short answers. More than 150 teams entered each category. A winner of one of the Hewlett contests, Vik Paruchuri, was hired by EdX to help design its assessment software.

"One of our focuses is to help kids learn how to think critically," said Victor Vuchic, a program officer at the Hewlett Foundation. "It's probably impossible to do that with multiple-choice tests. The challenge is that this requires human graders, and so they cost a lot more and they take a lot more time."

Mark D. Shermis, a professor at the University of Akron in Ohio, supervised the Hewlett Foundation's contest on automated essay scoring and wrote a paper about the experiment. In his view, the technology — though imperfect — has a place in educational settings.

With increasingly large class sizes, it is impossible for most teachers to give students meaningful feedback on writing assignments, he said. Plus, he noted, critics of the technology have tended to come from the nation's best universities, where the level of pedagogy is much better than at most schools.

"Often they come from very prestigious institutions where, in fact, they do a much better job of providing feedback than a machine ever could," Dr. Shermis said. "There seems to be a lack of appreciation of what is actually going on in the real world."
The final two paragraphs should worry everyone. The best teachers I know are at community colleges and state universities. This false dichotomy is disheartening.

However (you know that was coming), I know that mediocrity is a problem in higher education. Mediocre schools will struggle to compete in the future. Many of them will be endangered. Online education will not save them — it will probably help kill them. And I fear these are the schools at which administrators will rush to adopt automation, further diluting their institutions' roles in higher education.

