DeepLearn 2023 Spring
9th International School
on Deep Learning
Bari, Italy · April 03-07, 2023
Xiaowei Xu

University of Arkansas Little Rock

[intermediate/advanced] From Transformer to ChatGPT and beyond: How Large Language Models Revolutionize AI?

Summary

In recent years, large language models built on the transformer architecture have led to a paradigm shift in the field of artificial intelligence. These models have transformed natural language processing tasks, including machine translation, question answering, and language generation. One such model, ChatGPT, is trained on vast amounts of text data to generate human-like responses to textual prompts.

This tutorial lecture will provide a comprehensive overview of the journey from the transformer architecture to ChatGPT and beyond. We will explore the training methodologies and architectures used to build these models and how they have paved the way for future research in AI. Two major paradigms for adapting large language models to new tasks, fine-tuning and in-context learning, will also be covered. Additionally, we will demonstrate how to use pre-trained large language models for causal inference and other cutting-edge machine learning tasks.
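
As a rough illustration of how in-context learning differs from fine-tuning, the sketch below assembles a few-shot prompt: the task is specified entirely through an instruction and a handful of worked examples placed in the model's input, and no model weights are updated. The function name `build_few_shot_prompt`, the `Input:`/`Output:` layout, and the sentiment examples are assumptions made for illustration only; they are not taken from the lecture materials, and the resulting string would still have to be sent to a language model of your choice.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    demonstrations: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a few-shot prompt: instruction, worked examples, then the new input.

    The layout is a hypothetical convention chosen for this sketch.
    """
    lines = [instruction, ""]
    for text, label in demonstrations:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The final block is left open so that the model completes the answer.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Illustrative demonstrations (made up for this sketch).
demos = [
    ("The lecture was clear and well paced.", "positive"),
    ("The slides were impossible to read.", "negative"),
]
prompt = build_few_shot_prompt(
    "Classify the sentiment of each input as positive or negative.",
    demos,
    "I learned a lot from the tutorial.",
)
print(prompt)
```

Fine-tuning, by contrast, would use labeled pairs like these as training data to update the model's parameters, rather than placing them in the prompt at inference time.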

Furthermore, we will discuss the ethical implications of these models and their impact on society. By the end of this tutorial, participants will gain a comprehensive understanding of the evolution of large language models and their significance in shaping the future of AI. Whether you are a researcher or a practitioner in the field, this tutorial will provide valuable insights into the latest developments in natural language processing and the potential applications of large language models.

Syllabus

  • Introduction
  • Generative models
  • Language model
  • Transformer model
  • Scaling law of large language models
  • BERT: Bidirectional Encoder Representations from Transformers
  • Pre-training and fine-tuning
  • GPT: Generative Pre-trained Transformer
  • In-context learning
  • ChatGPT
  • Language model powered causal inference: the art of discovery of cause-and-effect relationships from text
  • Conclusion and future directions
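
For readers who want a concrete picture of the "Transformer model" item before the lecture, the following is a minimal sketch of scaled dot-product attention, softmax(QKᵀ/√d_k)V, as defined in Vaswani et al. (2017). It is written in pure Python for readability, handles a single head with no masking or learned projections, and the function and variable names are illustrative assumptions rather than code from the course.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q: Matrix, K: Matrix, V: Matrix) -> Matrix:
    """Compute softmax(Q K^T / sqrt(d_k)) V for a single attention head."""
    d_k = len(K[0])
    output = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K
        ]
        weights = softmax(scores)
        # Each output row is a weighted average of the value vectors.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return output


# Toy example: two queries attending over two key/value pairs.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
attended = scaled_dot_product_attention(Q, K, V)
print(attended)
```

Each output row is a convex combination of the rows of `V`, weighted toward the values whose keys best match the query.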

References

Vaswani, Ashish, et al. “Attention is all you need.” Advances in neural information processing systems 30 (2017).

Radford, Alec, et al. “Improving language understanding by generative pre-training.” (2018).

Radford, Alec, et al. “Language models are unsupervised multitask learners.” OpenAI blog 1.8 (2019): 9.

Kaplan, Jared, et al. “Scaling laws for neural language models.” arXiv preprint arXiv:2001.08361 (2020).

Brown, Tom, et al. “Language models are few-shot learners.” Advances in neural information processing systems 33 (2020): 1877-1901.

Wei, Jason, et al. “Chain of Thought Prompting Elicits Reasoning in Large Language Models.” arXiv preprint arXiv:2201.11903 (2022).

Wang, Yizhong, et al. “Self-Instruct: Aligning Language Model with Self Generated Instructions.” arXiv preprint arXiv:2212.10560 (2022).

Taori, Rohan, et al. “Alpaca: A Strong, Replicable Instruction-Following Model.” Stanford Center for Research on Foundation Models blog (2023).

Cheng, Daixuan, et al. “UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation.” arXiv preprint arXiv:2303.08518 (2023).

Shinn, Noah, Beck Labash, and Ashwin Gopinath. “Reflexion: an autonomous agent with dynamic memory and self-reflection.” arXiv preprint arXiv:2303.11366 (2023).

Wang, Xingqiao, et al. “InferBERT: a transformer-based causal inference framework for enhancing pharmacovigilance.” Frontiers in Artificial Intelligence 4 (2021): 659622.

Lambert, Nathan, et al. “Illustrating Reinforcement Learning from Human Feedback (RLHF).” Hugging Face Blog (2022).

Ouyang, Long, et al. “Training language models to follow instructions with human feedback.” arXiv preprint arXiv:2203.02155 (2022).

Wang, Xingqiao, et al. “DeepCausality: A general AI-powered causal inference framework for free text: A case study of LiverTox.” Frontiers in Artificial Intelligence 5 (2022).

Qin, Chengwei, et al. “Is ChatGPT a General-Purpose Natural Language Processing Task Solver?” arXiv preprint arXiv:2302.06476 (2023).

OpenAI. “How should AI systems behave, and who should decide?”

Pre-requisites

Mathematics and machine learning at the level of an undergraduate degree in computer science: basic multivariate calculus, probability theory, linear algebra, probabilistic graphical models, and neural networks.

Short bio

Xiaowei Xu, a professor of Information Science at the University of Arkansas at Little Rock (UALR), received his Ph.D. in Computer Science from the University of Munich in 1998. Before his appointment at UALR, he was a senior research scientist at Siemens in Munich, Germany. His research spans data mining, machine learning, and artificial intelligence. Dr. Xu is a recipient of the 2014 ACM SIGKDD Test of Time Award for his contribution to the density-based clustering algorithm DBSCAN, one of the most commonly used clustering algorithms.

CO-ORGANIZERS

Department of Computer Science
University of Bari “Aldo Moro”

Institute for Research Development, Training and Advice – IRDTA, Brussels/London
