Can computer software evaluate student papers? The debate has raged for about three years in higher education, and lately it has taken on a new urgency. This article from 2012 raises the question:
According to a new study [http://dl.dropbox.com/u/44416236/NCME%202012%20Paper3_29_12.pdf] by the University of Akron, computer grading software is just as effective in grading essays on standardized tests as live human scoring. After testing 16,000 middle school and high school test essays graded by both humans and computers, the study found virtually identical levels of accuracy, with the software in some cases proving to be more reliable than human grading. While the results are a blow to technology naysayers, the software is still controversial among certain education advocates who claim the software is not a cure-all for grading student essays.
— http://www.aaeteachers.org/index.php/blog/714-study-robo-readers-more-accurate-in-scoring-essays, April 23, 2012
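The claim of "virtually identical levels of accuracy" is a statistical one: such studies compare machine scores against human scores using agreement measures, commonly quadratically weighted kappa in essay-scoring research. As a rough illustration only, with invented scores and assuming scikit-learn is available, the comparison might look like this:

```python
# Illustration only: comparing hypothetical human and machine scores
# with quadratically weighted kappa, a common agreement statistic in
# automated essay scoring research. The scores below are invented.
from sklearn.metrics import cohen_kappa_score

human_scores = [3, 4, 2, 5, 3, 4, 1, 4]    # rubric scores from human raters
machine_scores = [3, 4, 3, 5, 3, 4, 2, 4]  # scores from the grading engine

# 1.0 is perfect agreement; 0.0 is agreement no better than chance.
kappa = cohen_kappa_score(human_scores, machine_scores, weights="quadratic")
print(f"Quadratic weighted kappa: {kappa:.2f}")
```

"As effective as live human scoring" means, roughly, that numbers like this for the software were comparable to the agreement between two independent human raters.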
Now, a recent New York Times article explores the role software will play in grading papers in large online courses (and large lecture courses). The simple reality is that no teacher can grade 1,000 or more papers. A team of graders is also expensive, and any "cost benefits" of online courses are lost. (However, many of us — myself included — don't see online education as a money-saving tool. Instead, we see it as a way to teach non-traditional students.)

New Test for Computers: Grading Essays at College Level
http://www.nytimes.com/2013/04/05/science/new-test-for-computers-grading-essays-at-college-level.html

We might not call it "grading" or "evaluating," yet our software already helps us with instant feedback. I use spellcheck and grammar check. I use automatic formatting tools for academic citations. Many of us in education do use plagiarism detection software to screen student papers, though those applications are far from perfect.

The NYT article begins with a hypothetical:
Imagine taking a college exam, and, instead of handing in a blue book and getting a grade from a professor a few weeks later, clicking the "send" button when you are done and receiving a grade back instantly, your essay scored by a software program.
And then, instead of being done with that exam, imagine that the system would immediately let you rewrite the test to try to improve your grade.
EdX, the nonprofit enterprise founded by Harvard and the Massachusetts Institute of Technology to offer courses on the Internet, has just introduced such a system and will make its automated software available free on the Web to any institution that wants to use it. The software uses artificial intelligence to grade student essays and short written answers, freeing professors for other tasks.
Instant feedback helps me as a writer, and I'm sure it will help many students. But I'm not yet ready to rely on a computer to do more than offer suggestions or highlight potential problems. Often, my word processor's spellcheck is wrong.
I certainly endorse the feedback model:
Anant Agarwal, an electrical engineer who is president of EdX, predicted that the instant-grading software would be a useful pedagogical tool, enabling students to take tests and write essays over and over and improve the quality of their answers. He said the technology would offer distinct advantages over the traditional classroom system, where students often wait days or weeks for grades.

"There is a huge value in learning with instant feedback," Mr. Agarwal said. "Students are telling us they learn much better with instant feedback."

The software used to evaluate writing is an example of artificial intelligence. It builds a database from sample essays, and over time, as human readers double-check its feedback, the system "learns" and "refines" its criteria.
The EdX assessment tool requires human teachers, or graders, to first grade 100 essays or essay questions. The system then uses a variety of machine-learning techniques to train itself to be able to grade any number of essays or answers automatically and almost instantaneously.
The software will assign a grade depending on the scoring system created by the teacher, whether it is a letter grade or numerical rank. It will also provide general feedback, like telling a student whether an answer was on topic or not.
Dr. Agarwal said he believed that the software was nearing the capability of human grading.
"This is machine learning and there is a long way to go, but it's good enough and the upside is huge," he said. "We found that the quality of the grading is similar to the variation you find from instructor to instructor."
Still, AI is going to improve. I wonder how much it will improve in the next five years. The potential is there for software to mimic human readers to the point it will be impossible to tell the software-analyzed essay from the human-analyzed work.
Last year the Hewlett Foundation, a grant-making organization set up by one of the Hewlett-Packard founders and his wife, sponsored two $100,000 prizes aimed at improving software that grades essays and short answers. More than 150 teams entered each category. A winner of one of the Hewlett contests, Vik Paruchuri, was hired by EdX to help design its assessment software.
"One of our focuses is to help kids learn how to think critically," said Victor Vuchic, a program officer at the Hewlett Foundation. "It's probably impossible to do that with multiple-choice tests. The challenge is that this requires human graders, and so they cost a lot more and they take a lot more time."
Mark D. Shermis, a professor at the University of Akron in Ohio, supervised the Hewlett Foundation's contest on automated essay scoring and wrote a paper about the experiment. In his view, the technology — though imperfect — has a place in educational settings.
With increasingly large class sizes, it is impossible for most teachers to give students meaningful feedback on writing assignments, he said. Plus, he noted, critics of the technology have tended to come from the nation's best universities, where the level of pedagogy is much better than at most schools.
"Often they come from very prestigious institutions where, in fact, they do a much better job of providing feedback than a machine ever could," Dr. Shermis said. "There seems to be a lack of appreciation of what is actually going on in the real world."
However (you know that was coming), I know that mediocrity is a problem in higher education. Mediocre schools will struggle to compete in the future. Many of them will be endangered. Online education will not save them — it will probably help kill them. And I fear these are the schools at which administrators will rush to adopt automation, further diluting their institutions' roles in higher education.