Posts Tagged ‘NLP’

NLP as an impossible task

Monday, April 14th, 2008

Good NLP is very difficult because ambiguity is inherent to nearly every real-world situation. By its nature, it is in effect an NP-complete problem, and trying to solve it by brute force is pointless.

“He knew what he had to do. It was, of course, an impossible task. But he was used to them. Dragging a rat all the way from the wood to the hole had been an impossible task. But it wasn’t impossible to drag it a little way, so you did that, and then you had a rest, and then you dragged it a little way again… The way to deal with an impossible task was to chop it down into a number of merely very difficult tasks, and break each one of them into a group of horribly hard tasks, and each one of them into tricky jobs, and each one of them…” — quoting Hamish Cunningham’s quote from Terry Pratchett, Truckers, p. 119.

Here comes the strategy of Commentag. True NLP is impossible. Right. So what? No big deal! Let’s just move on and make at least a few things work. Then some better POS tagging. Then some more powerful semanteme extraction. And so on, until we finally obtain results as good as our big brothers’.

So we are focusing on concrete, local, and specific information extraction tasks: handling nicknames, recognizing a few specific subjects, defining a minimal set of microgeneric semes or taxemes, etc.

Let’s travel a bit lighter this week. Do you feel that fresh air?


NLP report, first edition

Friday, March 21st, 2008

Here are the highlights of last week’s NLP activity:

- several attempts to improve the performance of the search for a specific tag’s semantic neighbours: still in progress

- testing the execution on a local server: done

- adding new fields to the database for learning: added tables for tag groups, tag frequencies, and tag-tag frequencies, plus a first design for an implementation of incremental learning using collocation and co-occurrence measures on every tag

- testing the use of OpenCyc as an ontology (www.opencyc.org): can be added to the neighbour words intension level
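
To make the tag and tag-tag frequency idea a bit more concrete, here is a minimal sketch of how incremental co-occurrence counting could feed a semantic-neighbour search. It is only an illustration under stated assumptions, not the Commentag implementation: the class and method names are hypothetical, in-memory counters stand in for the database tables, and pointwise mutual information (PMI) is an assumed choice of co-occurrence measure, since the actual measure is not specified here.

```python
import math
from collections import Counter
from itertools import combinations


class CooccurrenceModel:
    """Hypothetical stand-in for the tag and tag-tag frequency tables."""

    def __init__(self):
        self.tag_freq = Counter()   # tag -> number of posts containing it
        self.pair_freq = Counter()  # (tag_a, tag_b), sorted -> posts containing both
        self.n_posts = 0

    def update(self, tags):
        """Fold one tagged post into the counts (the incremental step)."""
        unique = sorted(set(tags))
        self.n_posts += 1
        self.tag_freq.update(unique)
        self.pair_freq.update(combinations(unique, 2))

    def pmi(self, a, b):
        """Pointwise mutual information; -inf if the tags never co-occur."""
        joint = self.pair_freq[tuple(sorted((a, b)))]
        if joint == 0:
            return float("-inf")
        return math.log2(joint * self.n_posts / (self.tag_freq[a] * self.tag_freq[b]))

    def neighbours(self, tag, k=3):
        """Up to k tags that co-occur with `tag`, highest PMI first."""
        scored = [(self.pmi(tag, other), other)
                  for other in self.tag_freq if other != tag]
        scored = [(s, o) for s, o in scored if s != float("-inf")]
        scored.sort(key=lambda so: (-so[0], so[1]))  # ties broken alphabetically
        return [o for _, o in scored[:k]]


# Made-up example data: each list is the tag set of one post.
model = CooccurrenceModel()
for post_tags in [["nlp", "tagging", "ontology"],
                  ["nlp", "tagging"],
                  ["ontology", "opencyc"],
                  ["nlp", "parsing"]]:
    model.update(post_tags)

print(model.neighbours("tagging"))
```

Because each new post only increments a handful of counters, nothing has to be recomputed from scratch when tags arrive; the cost is paid at query time, when the neighbours of one tag are scored.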

That’s all for now… See you next week!
