Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/itrummer/lm4db

Material for my VLDB'22 and BTW'23 tutorials on application for language models in data management
https://github.com/itrummer/lm4db

bert-model databases gpt-3 language-model ml nlp

Last synced: about 1 month ago
JSON representation

Material for my VLDB'22 and BTW'23 tutorials on application for language models in data management

Awesome Lists containing this project

README

        

The introduction of Transformer-based language models has led to astonishing advances in the domain of natural language processing over the past years. Not only do such models dominate in a variety of standard benchmarks. The latest generation of language models can be specialized to novel, formerly unseen tasks with little to virtually no training data.

In this tutorial, I discuss the two key ideas enabling ultra-large language models: a new neural network architecture, the Transformer, and an unsupervised training process, based on the idea of transfer learning. After discussing the theoretical concepts behind language models, I demonstrate GPT-3 and other models and provide pointers on how to get access to this technology. Finally, I discuss novel use cases in data management that are enabled by language models, covering recent research and open problems.

**Slides of the VLDB'22 tutorial (90 minutes) are [here](lm4dbtrummer.pdf).**

**Slides of the BTW'23 tutorial (180 minutes) are [here](https://drive.google.com/file/d/1U-2j8oi5au3nuYwPIlhnno7c6UNDfifl/view?usp=sharing).**

Please use the following citation to refer to this tutorial:
```
@article{Trummer2022e,
author = {Trummer, Immanuel},
doi = {10.14778/3554821.3554896},
journal = {PVLDB},
number = {12},
pages = {3770 -- 3773},
title = {From BERT to GPT-3 Codex: Harnessing the Potential of Very Large Language Models for Data Management},
volume = {15},
year = {2022}
}
```