Chapter 5. Natural Language Tools
Sean Burke, author of Perl and LWP and a professional linguist, once described artificial intelligence as the study of programming situations where you either don't know what you want or don't know how to get it.
Natural-language processing, or NLP, is the application of AI techniques to answer questions about text written in a human language: what does it mean, what other documents is it like, and so on. As Perl is often described as a text-processing language, it shouldn't be much of a surprise to find that there are a great many modules and techniques available in Perl for carrying out NLP-related tasks.
But as we've seen so far in this book, the real strength of Perl lies not in the ease with which we can program particular techniques, but in the fact that so many of the techniques we need (techniques to break texts into sentences and words, to correctly strip the endings off inflected words, to put the right endings back on again, and so on) have already been implemented and placed on CPAN. So in this chapter we're going to take a tour of the natural language section of CPAN, and see how we can use its modules to slice and dice any natural-language text we need to deal with.