Monday, November 24, 2008



The semester is over (so FAST!!) and now the exams are here. Not that I have that many exams, merely two finals this week and that is it. Today was the first and it was quite sh*** already. I am not sure whether it was really that difficult, but it didn't go all too smoothly.

Which got me thinking about what my current CAP would be if the exam does not turn out to be an A. For those who don't know what CAP is, it is basically a weighted average score of a student's grades. And because our graduate school expects its scholars to be the top of the crop, we must maintain a CAP of at least 3.8, which is about B+ and above. For that reason, one might be tempted to sign up for a few easy modules (anything in management will do, I guess) to increase the CAP score. And in fact, students sometimes take up modules that are supposed to be easy to increase their scores. But what happens if a lot of people decide to make the same move and end up in the same class? As the scores are moderated, the standard of the class should automatically rise and it should get harder to score an A.
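
To make the "weighted average" concrete, here is a minimal sketch of how such a score could be computed. The grade-point table and the credit values are my own assumptions for illustration (a 5-point scale where B+ is 4.0), not an official definition, and the transcript is made up.

```python
# Assumed grade-point mapping (hypothetical 5-point scale, for illustration only).
GRADE_POINTS = {"A": 5.0, "A-": 4.5, "B+": 4.0, "B": 3.5, "B-": 3.0}


def cap(modules):
    """Credit-weighted average of grade points.

    modules: list of (grade, credits) tuples.
    """
    total_credits = sum(credits for _, credits in modules)
    if total_credits == 0:
        raise ValueError("need at least one module with credits")
    weighted = sum(GRADE_POINTS[grade] * credits for grade, credits in modules)
    return weighted / total_credits


# Made-up transcript: three earlier modules of 4 credits each.
so_far = [("A", 4), ("A-", 4), ("B+", 4)]

# How much does today's exam matter? Compare an A with a B.
cap_if_a = cap(so_far + [("A", 4)])
cap_if_b = cap(so_far + [("B", 4)])
print(f"with an A: {cap_if_a:.3f}, with a B: {cap_if_b:.3f}")
```

With only a few modules on the record, a single grade moves the average quite a bit, which is exactly why one exam can push it below the threshold.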

But there is not enough time now for a detailed study of students' behavior: the next exam is on Wednesday. Hopefully that one will be more successful and I can increase my CAP to > 3.8 again ;)


Wednesday, November 12, 2008


Can my computer pass my exams?

I am taking a Chinese module this semester and, after a long while, I am actively continuing my long struggle to master this amazingly beautiful and also amazingly difficult language. My last post, the one in Chinese, was homework for that course.

So today there was the final exam for this semester. The exam consisted of the following parts that each contributed 20% of the total score:

1. Make sentences: given a Chinese word make a sentence with this word
2. Reading comprehension: read a short story and answer the questions
3. Spot the error: find the error in a sentence and correct it
4. Word jumble: re-arrange the words to form a correct sentence
5. Fill the blanks: Complete a sentence fragment using a given word

After finishing the exam, I started wondering whether I could actually build a computer program to solve the exam that I had just completed. And actually, I believe that most of the questions should be relatively easy for a computer to solve.

1. For making sentences, you could simply use a concordancer to collect a list of sentences that use the given word from a corpus or from the web. This seems a bit like cheating, as it does not involve any 'creative' aspect, nor does it require a deeper semantic understanding of the word that is asked for. On the other hand, it seems to be okay for a student to memorize example sentences that he or she read before the test and put them in wherever they seem appropriate.

2. Reading comprehension is probably the most difficult task of the five, because it requires two extremely difficult NLP tasks: natural language understanding and generation.
There is a whole bunch of work on these tasks, but they are definitely hard. I really wonder if any NLP language understanding system could have done a decent job on the paper today.

3. Spot the error is a task that is already implemented in some commercial applications like Microsoft Word. So far most systems can only recognize a limited number of error types, for example determiner choice, but my guess is that they could probably get 50%-60% of the questions in the exam correct.

4. Finding the correct order of words is a well-studied problem in machine translation (MT). Most MT systems translate the words and then try to solve the word re-arrangement problem in a subsequent step. One important model for this step is an n-gram language model, which basically computes the probability of a sentence by decomposing it into smaller sequences of length n, where n is typically 2 or 3. I believe that a simple 3-gram model trained on a large enough Chinese corpus could have done a pretty good job on my exam today.

5. This task is somewhere between the first two. I am not sure how well this would work, but with enough data it should probably be possible to solve it by using a concordancer and a language model as well. Maybe it won't give the most meaningful sentences, but it is probably good enough to come up with a well-formed sentence.
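
To give an idea of what the n-gram approach to the word jumble (task 4) could look like, here is a minimal sketch. It uses a 2-gram model with add-one smoothing instead of a 3-gram model to keep it short, the tiny pre-segmented corpus is made up for illustration, and all function names are my own. A real attempt would need a large corpus and a proper word segmenter.

```python
import itertools
import math
from collections import Counter

# Made-up, pre-segmented toy corpus (an assumption; far too small for real use).
CORPUS = [
    ["我", "喜欢", "学习", "中文"],
    ["我", "喜欢", "喝", "茶"],
    ["他", "喜欢", "学习", "中文"],
    ["我", "今天", "学习", "中文"],
]

BOS, EOS = "<s>", "</s>"  # sentence boundary markers


def train_bigram(corpus):
    """Count contexts and bigrams over the corpus."""
    contexts, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = [BOS] + sentence + [EOS]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    # Words that can be predicted: every corpus word plus the end marker.
    vocab_size = len({w for s in corpus for w in s} | {EOS})
    return contexts, bigrams, vocab_size


def log_prob(sentence, model):
    """Log-probability of a word sequence under the add-one smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = [BOS] + list(sentence) + [EOS]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def unjumble(words, model):
    """Try every ordering and keep the most probable one (fine for short sentences)."""
    return max(itertools.permutations(words), key=lambda p: log_prob(p, model))


model = train_bigram(CORPUS)
best = unjumble(["中文", "我", "学习", "喜欢"], model)
print(" ".join(best))  # prints: 我 喜欢 学习 中文
```

Trying all permutations grows factorially with sentence length, so this only works for the short sentences you find in an exam like mine; MT systems use smarter search for the same scoring idea.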

This all seems to be more of a toy application, but it is an accepted standard to test the language competence of primary school students and second language learners with test questions like the ones above. If a computer could solve them just as well, it would show a certain language competence of the machine.

