DeepLearn 2024
11th International School on Deep Learning
(and the Future of Artificial Intelligence)
Porto - Maia, Portugal · July 15-19, 2024
Registration
Downloads
  • Call DeepLearn 2024
  • Poster DeepLearn 2024
  • Lecture Materials
  • Home
  • Lecturers
  • Schedule
  • Sponsoring
  • News
  • Info
    • Travel
    • Accommodation
    • UMaia / UP students and staff
    • Visa
    • Code of conduct
    • Testimonials

Yulan He

King’s College London

[introductory/intermediate] Machine Reading Comprehension with Large Language Models

Summary

Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of tasks, including content generation, code writing, and human-like conversation, and they have transformed approaches to Machine Reading Comprehension (MRC). In MRC, an AI system is trained to read and comprehend text and to generate answers to the questions posed. MRC tasks vary in complexity, ranging from simple fact-based extractive Question Answering (QA), where answers can be extracted directly from the text, to more complex questions that require situational awareness, reasoning, and inference. MRC finds applications in many real-world scenarios, such as helping readers understand lengthy narratives, powering customer support chatbots, and enhancing educational assessments.
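As a toy illustration of the extractive end of this spectrum (a deliberately simple baseline, not how LLMs answer questions), fact-based QA can be sketched as selecting the context sentence that best overlaps with the question; the function and word lists below are illustrative only:

```python
import re

def extractive_answer(context: str, question: str) -> str:
    """Return the context sentence sharing the most content words with the question.

    Real extractive QA models predict an answer *span*; sentence selection
    just shows the core idea of locating the answer directly in the text.
    """
    stop = {"what", "who", "where", "when", "is", "was", "the", "a", "of", "in", "did"}
    q_words = set(re.findall(r"\w+", question.lower())) - stop
    sentences = re.split(r"(?<=[.!?])\s+", context)

    def overlap(sentence: str) -> int:
        return len(q_words & set(re.findall(r"\w+", sentence.lower())))

    return max(sentences, key=overlap)

context = ("Marie Curie was born in Warsaw. "
           "She received the Nobel Prize in Physics in 1903.")
print(extractive_answer(context, "Where was Marie Curie born?"))
# "Marie Curie was born in Warsaw."
```

Questions at the other end of the spectrum (e.g. "Why did she move to Paris?") have no such directly matchable sentence, which is where reasoning and inference come in.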

This tutorial offers a comprehensive exploration of MRC with LLMs. It begins with the fundamentals of MRC, including the evolution of LLMs, their core architecture, and some prominent examples. It then covers techniques for adapting LLMs: parameter-efficient fine-tuning, prompt engineering, and in-context learning. The focus then shifts to MRC use cases such as narrative understanding, long-range question answering, automated student answer scoring, and claim veracity assessment. Students will also be introduced to a validation framework tailored for evaluating LLMs’ performance on MRC tasks. Finally, the tutorial explores the explainability of LLMs and future trends in the field. Overall, it will guide students through the evolving landscape of MRC with LLMs, preparing them to address real-world language understanding challenges.
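One of the parameter-efficient fine-tuning techniques covered, LoRA (Hu et al., 2021, listed in the references), can be sketched in a few lines: the pretrained weight matrix W is frozen, and only a low-rank update BA (scaled by alpha/r) is trained on top of it. The NumPy sketch below follows the shapes and zero-initialisation of B from the paper, but is an illustration, not a production implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2                          # hidden size d, low rank r << d
W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable, small random init
B = np.zeros((d, r))                 # trainable, zero init => no change at start
alpha = 16                           # LoRA scaling hyperparameter

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Frozen path plus low-rank trainable update, scaled by alpha / r.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(1, d))
# With B initialised to zero, the adapted model reproduces the frozen model exactly.
assert np.allclose(lora_forward(x), x @ W.T)
```

The efficiency gain is that only A and B (2·d·r parameters) are trained instead of the d·d entries of W, and the update BA can be merged into W after training at no inference cost.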

Syllabus

  • Fundamentals of Machine Reading Comprehension (MRC): brief history and evolution of Large Language Models (LLMs); architecture of LLMs; examples of LLMs.
  • Learning of LLMs: parameter-efficient fine-tuning; prompt engineering; in-context learning.
  • Case studies of MRC: narrative understanding; long-range QA; automated student answer scoring; claim veracity assessment.
  • Evaluation: validation framework for LLMs.
  • Explainability: understanding the emergent capabilities of LLMs; uncertainty interpretation of LLMs.
  • Future trends: limitations and challenges in the field; emerging trends in MRC.
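The in-context learning item above can be made concrete with a small sketch: a few-shot MRC prompt is assembled from a handful of worked (context, question, answer) demonstrations followed by the query to be answered. The template below is one common illustrative format; real systems vary the wording and how demonstrations are selected:

```python
def build_mrc_prompt(demos, context, question):
    """Assemble a few-shot MRC prompt from (context, question, answer) demos."""
    parts = []
    for c, q, a in demos:
        parts.append(f"Context: {c}\nQuestion: {q}\nAnswer: {a}")
    # The final block leaves the answer blank for the model to complete.
    parts.append(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(parts)

demos = [("Paris is the capital of France.",
          "What is the capital of France?",
          "Paris")]
prompt = build_mrc_prompt(demos,
                          "The Douro river flows through Porto.",
                          "Which river flows through Porto?")
print(prompt)
```

No model weights are updated here; the demonstrations steer the LLM's completion at inference time, which is what distinguishes in-context learning from fine-tuning.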

References

Vaswani, A., Shazeer, N.M., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., & Polosukhin, I. (2017). Attention is All you Need. Neural Information Processing Systems.

Devlin, J., Chang, M., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. North American Chapter of the Association for Computational Linguistics.

Raffel, C., Shazeer, N.M., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P.J. (2020). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res., 21, 140:1-140:67.

Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., et al. (2020). Language Models are Few-Shot Learners. ArXiv, abs/2005.14165.

Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., & Neubig, G. (2021). Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Computing Surveys, 55, 1 – 35.

Hu, J.E., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., & Chen, W. (2021). LoRA: Low-Rank Adaptation of Large Language Models. ArXiv, abs/2106.09685.

Wei, J., Wang, X., Schuurmans, D., Bosma, M., Chi, E.H., Xia, F., Le, Q., & Zhou, D. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. ArXiv, abs/2201.11903.

Kadavath, S., Conerly, T., Askell, A., Henighan, T.J., Drain, D., Perez, E., Schiefer, N., et al. (2022). Language Models (Mostly) Know What They Know. ArXiv, abs/2207.05221.

Liang, P., Bommasani, R., Lee, T., Tsipras, D., Soylu, D., Yasunaga, M., Zhang, Y., Narayanan, D., et al. (2023). Holistic Evaluation of Language Models. Annals of the New York Academy of Sciences, 1525, 140 – 146.

Guo, Z., Jin, R., Liu, C., Huang, Y., Shi, D., Supryadi, Yu, L., Liu, Y., Li, J., Xiong, B., & Xiong, D. (2023). Evaluating Large Language Models: A Comprehensive Survey. ArXiv, abs/2310.19736.

Pre-requisites

Participants should have prior knowledge of machine learning and deep learning.

Short bio

Yulan He is a Professor in the Department of Informatics at King’s College London, where she holds a prestigious five-year UKRI Turing AI Fellowship. Her research interests lie in the integration of machine learning and natural language processing for text analytics. She has published over 200 papers on topics including natural language understanding, model interpretability, rumour veracity assessment, question answering, sentiment analysis, topic and event extraction, and biomedical text mining. She has received several prizes and awards, including a SWSA Ten-Year Award, a CIKM 2020 Test-of-Time Award, and an AI 2000 Most Influential Scholar Honourable Mention by AMiner. She served as General Chair for AACL-IJCNLP 2022 and Program Co-Chair for EMNLP 2020, and serves as an Action Editor for Transactions of the ACL and an Associate Editor for the Royal Society Open Science journal.

Other Courses

  • Jiawei Han
  • Katia Sycara
  • Luca Benini
  • Gustau Camps-Valls
  • Nitesh Chawla
  • Daniel Cremers
  • Peng Cui
  • Sergei V. Gleyzer
  • Frank Hutter
  • George Karypis
  • Hermann Ney
  • Massimiliano Pontil
  • Elisa Ricci
  • Wojciech Samek
  • Xinghua Mindy Shi
  • Michalis Vazirgiannis
  • James Zou


CO-ORGANIZERS

University of Maia

Institute for Research Development, Training and Advice – IRDTA, Brussels/London
