Eneko Agirre
[introductory/intermediate] Natural Language Processing in the Large Language Model Era
Summary
Deep learning models have been successfully applied to natural language processing, and are now radically changing how we interact with machines, as seen in machine translation, search engines, Siri, Alexa and GPT, to name a few. These models are able to infer continuous representations for words and sentences, and to generalize to new tasks with much less training data.
The course will introduce the main deep learning models used in NLP, including transformers and pre-trained language models such as GPT-4, T5 and BERT, and their use in fine-tuning and prompting, as well as instruction learning and human feedback. Attendees will be able to understand the models and play with implementations in Keras.
Syllabus
- Introduction to NLP and DL
- Multilayer Perceptron and Transformer
- Language models, fine-tuning, prompting and human feedback
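As a taste of the hands-on part, a Transformer encoder block of the kind covered in the syllabus can be sketched in Keras. This is a minimal illustrative sketch, not course material; the layer sizes (embed_dim=32, num_heads=2, ff_dim=64) are arbitrary choices for the example.

```python
# Minimal Transformer encoder block in Keras: self-attention followed by a
# position-wise feed-forward network, each with a residual connection and
# layer normalization. Hyperparameters are illustrative only.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def transformer_encoder_block(seq_len=10, embed_dim=32, num_heads=2, ff_dim=64):
    inputs = keras.Input(shape=(seq_len, embed_dim))
    # Multi-head self-attention: every position attends to every position.
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)(inputs, inputs)
    x = layers.LayerNormalization(epsilon=1e-6)(inputs + attn)  # residual + norm
    # Position-wise feed-forward network applied identically at each position.
    ff = layers.Dense(ff_dim, activation="relu")(x)
    ff = layers.Dense(embed_dim)(ff)
    outputs = layers.LayerNormalization(epsilon=1e-6)(x + ff)   # residual + norm
    return keras.Model(inputs, outputs)

model = transformer_encoder_block()
batch = np.random.rand(4, 10, 32).astype("float32")  # (batch, seq_len, embed_dim)
out = model(batch)
print(out.shape)  # shape is preserved: (4, 10, 32)
```

Stacking several such blocks, plus token and position embeddings at the bottom, yields the encoder architecture behind models like BERT and T5.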
References
Deep Learning. Ian Goodfellow, Yoshua Bengio and Aaron Courville. MIT Press, 2016.
Natural Language Processing with PyTorch. Delip Rao and Brian McMahan. O'Reilly, 2019.
Deep Learning with Python. François Chollet. Manning, 2017.
Liu et al. (2021). Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. arXiv:2107.13586.
Zhao et al. (2023). A Survey of Large Language Models. arXiv:2303.18223.
Hugging Face Course: https://huggingface.co/course
Pre-requisites
Addressed to professionals, researchers and students who want to understand and apply deep learning techniques to text. The practical part requires basic programming experience.
Short bio
Eneko Agirre is the director of HiTZ, the Basque Center for Language Technology (hitz.eus), and full professor in the Computer Science department of the University of the Basque Country (UPV/EHU). He has published over 150 international peer-reviewed journal articles and conference papers in NLP. He has served as secretary and president of the ACL SIGLEX, and as a member of the editorial boards of Computational Linguistics, Transactions of the ACL and the Journal of Artificial Intelligence Research. He is co-founder of the Joint Conference on Lexical and Computational Semantics (*SEM), now in its 12th edition. He is a regular reviewer for top international journals, and a regular area chair and program committee member for top international conferences. He has coordinated several national and European projects. He has received three Google Research Awards (2016, 2018 and 2019), and five best paper awards and nominations. Several PhD dissertations under his supervision have received the SEPLN, Spanish Computer Science and EurAI 2021 best PhD awards. He received the Spanish Computer Science Research Award in 2021, and is one of the 80 fellows of the Association for Computational Linguistics.