Week 2: NLP’s Commercial Uses And Potential Drawbacks
March 17, 2023
What is NLP? NLP is short for Natural Language Processing, a branch of AI that gives computers the ability to understand text. Natural language is human language, such as English, Spanish, or French. Natural Language Processing refers to the statistical and machine learning models that computers use to recognize and interpret that language.
At its core, NLP combines computational linguistics with deep learning models. Together, they enable computers to process language data in the form of text or speech. Some examples of NLP include spoken commands to Siri or Alexa, text translation via Google Translate, and ChatGPT. Natural Language Processing is an amazing application of machine learning that has immense potential to change the course of artificial intelligence and the world as we know it.
Of course, Natural Language Processing is not entirely accurate. It introduces many new challenges and has plenty of room for improvement. I believe that it is important to stay aware of the biases and risks of applying any technology to the real world. Therefore, in Week 2 of my senior project, I researched some of the commercial applications of NLP, as well as the potential dangers of using NLP in society. I found two research papers on these topics, and they can be found here:
Commercial applications of natural language processing: https://dl.acm.org/doi/pdf/10.1145/219717.219778
The Social Impact of Natural Language Processing: https://aclanthology.org/P16-2096.pdf
The first paper details the commercial applications of NLP. To learn more about what has been done in the field of Natural Language Processing, I read this research by Kenneth W. Church of AT&T Bell Labs and Lisa F. Rau of the Systems Research and Applications Corporation. In this paper, the authors sought to demonstrate the potential profitability of Natural Language Processing. I liked this paper because the authors kept in mind the dangers and “hype” of NLP, which could be misconstrued by the public. The paper details multiple large-scale systems on the market, such as Systran, one of the oldest machine translation (MT) systems, which provides fully automated translation of text in real time. Another example was Meteo, which translated 20 million words annually, saving the Canadian government millions of dollars every year. The paper also mentions smaller, more common NLP systems such as spelling and grammar checkers. It describes these as increasingly standard and lucrative features in the word processing market, and as increasingly internationalized. The paper concludes with some possible areas where NLP could be used, such as information management, which has become increasingly important because of the vast quantities of electronic text.
The second paper is an in-depth analysis of the societal impacts of NLP. It raises many topics involving bias, perspective, and overgeneralization. This research stood out to me in particular because it not only highlighted the potential problems of NLP, but also proposed solutions to mitigate its drawbacks. Some examples included downsampling, dummy labels, and error weighting, as well as careful research design (for exposure problems). Lastly, it featured a discussion of the ethical considerations involved in collecting data, designing experimental setups, and choosing potential applications of NLP.
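To make two of those mitigation ideas more concrete for myself, here is a minimal Python sketch of downsampling and error weighting. It is my own illustration, not code from the paper: the function names (downsample, weighted_error), the toy corpus, and the cost values are all assumptions I made up for this example.

```python
import random
from collections import defaultdict


def downsample(examples, group_of, seed=0):
    """Randomly trim every group down to the size of the smallest group."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    by_group = defaultdict(list)
    for ex in examples:
        by_group[group_of(ex)].append(ex)
    target = min(len(items) for items in by_group.values())
    balanced = []
    for items in by_group.values():
        balanced.extend(rng.sample(items, target))
    return balanced


def weighted_error(y_true, y_pred, cost):
    """Average misclassification cost over all predictions.

    cost maps (true_label, predicted_label) to a penalty; any mistake
    that is not listed counts as 1.0. Assumes y_true is not empty.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        if t != p:
            total += cost.get((t, p), 1.0)
    return total / len(y_true)


# Hypothetical toy corpus: (text, group) pairs, with group_a overrepresented.
corpus = [("text from a", "group_a")] * 8 + [("text from b", "group_b")] * 2
balanced = downsample(corpus, group_of=lambda ex: ex[1])

# Hypothetical costs: predicting "x" when the truth is "y" is 3 times worse.
error = weighted_error(["x", "y", "x", "y"], ["x", "x", "y", "y"], {("y", "x"): 3.0})
```

In this toy example, downsampling keeps two examples from each group so that neither dominates the training data, and error weighting lets the more harmful kind of mistake count more heavily when a model is evaluated.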