Linguistic Resources for Natural Language Processing

By Max Silberztein

Release Date: 2024-03-13
Genre: Computers

Description

Empirical methods (data-driven, neural network-based, probabilistic, and statistical) seem to be the modern trend. Recently, OpenAI’s ChatGPT, Google’s Bard and Microsoft’s Sydney chatbots have been garnering a lot of attention for their detailed answers across many knowledge domains. As a consequence, most AI researchers are no longer interested in trying to understand what common intelligence is or how intelligent agents construct scenarios to solve various problems. Instead, they now develop systems that extract solutions from massive databases used as cheat sheets. In the same manner, Natural Language Processing (NLP) software that relies on training corpora and empirical methods is trendy: most researchers in NLP today use large training corpora, always to the detriment of the development of formalized dictionaries and grammars.
Without questioning the intrinsic value of many software applications based on empirical methods, this volume aims to rehabilitate the linguistic approach to NLP. In the introduction, the editor uncovers several limitations and flaws of using training corpora to develop NLP applications, even the simplest ones, such as automatic taggers.
The first part of the volume shows how carefully handcrafted linguistic resources can be used to enhance current NLP software applications. The second part presents two representative cases where data-driven approaches cannot be implemented simply because there is not enough data available for low-resource languages. The third part addresses the problem of how to treat multiword units in NLP software, which is arguably the weakest point of NLP applications today but has a simple and elegant linguistic solution.
It is the editor's belief that readers interested in Natural Language Processing will appreciate the importance of this volume, both for its questioning of the training corpus-based approaches and for the intrinsic value of the linguistic formalization and the underlying methodology presented.

More by Max Silberztein

Automatic Processing of Natural-Language Electronic Texts with NooJ

Linda Barone, Mario Monteleone & Max Silberztein

Formaliser les langues avec l’ordinateur : de INTEX à Nooj

Svetla Koeva, Denis Maurel & Max Silberztein

Formalizing Natural Languages with NooJ and Its Natural Language Processing Applications

Samir Mbarki, Mohammed Mourchid & Max Silberztein

Formalizing Natural Languages: Applications to Natural Language Processing and Digital Humanities

Magali Bigey, Annabel Richeton, Max Silberztein & Izabella Thomas

Automatic Processing of Natural-Language Electronic Texts with NooJ

Tatsiana Okrut, Yuras Hetsevich, Max Silberztein & Hanna Stanislavenka

Natural Language Processing and Information Systems

Max Silberztein, Faten Atigui, Elena Kornyshova, Elisabeth Métais & Farid Meziane

Formalising Natural Languages: Applications to Natural Language Processing and Digital Humanities

Božo Bekavac, Kristina Kocijan, Max Silberztein & Krešimir Šojat

Formalizing Natural Languages: Applications to Natural Language Processing and Digital Humanities

Mariana González, Silvia Susana Reyes, Andrea Rodrigo & Max Silberztein

Formalizing Natural Languages with NooJ 2018 and Its Natural Language Processing Applications

Ignazio Mauro Mirto, Mario Monteleone & Max Silberztein

Formalizing Natural Languages with NooJ 2019 and Its Natural Language Processing Applications

Héla Fehri, Slim Mesfar & Max Silberztein

Formalizing Natural Languages

Max Silberztein

Linguistic Resources for Natural Language Processing

Max Silberztein