# Introduction to Natural Language Processing 2020 (Notebooks)


## Table of contents
* [General info](#general-info)
* [Prerequisites](#prerequisites)
* [Technologies and tools introduced in the course](#technologies-and-tools-introduced-in-the-course)
* [Notebooks](#notebooks)
* [License](#license)
* [Contact](#contact)



### General info

This repository contains the notebooks that we ([Shohreh Haddadan](https://github.com/shohrehhd) and [Ekaterina Kamlovskaya](https://github.com/katyamatya)) created and used to teach the practical part of the **Introduction to Natural Language Processing** course (winter semester 2020, University of Luxembourg, Prof. Christoph Schommer).

### Prerequisites 

Basic knowledge of the Python programming language.

### Technologies and tools introduced in the course
  - Python Packages and Libraries
    - NLTK (https://www.nltk.org/)
    - spaCy (https://spacy.io/)
    - Stanza (https://stanfordnlp.github.io/stanza/)
    - VADER (https://github.com/cjhutto/vaderSentiment)
    - scikit-learn (https://scikit-learn.org/)
    - gensim (https://radimrehurek.com/gensim/)
  - WordNet (https://wordnet.princeton.edu/)
  - Machine learning
  - Neural networks
  - Word embedding modelling (word2vec, GloVe)
  - Topic Modelling
    - Mallet (http://mallet.cs.umass.edu/)

### Notebooks

| Filename | Topics |
| ------ | ------ |
| [Practical_01_Preprocessing.ipynb][01] | Tokenising, lowercasing, stemming, frequency distribution, n-grams, concordance analysis, POS tagging, NER |
| [Practical_02_Chatbots_with_Regex.ipynb][02] | Turing test, edit distance, regular expressions, creating a chatbot |
| [Practical_03_Language_Models.ipynb][03] | Language models, Markov chain, Markov assumption, n-gram and neural network models |
| [Practical_04_Word_Embeddings.ipynb][04] | Word vector representation, word embeddings (word2vec, GloVe, etc.), model visualisation |
| [Practical_05_POS_Tagging.ipynb][05] | Part-of-speech tagging (NLTK, spaCy, Stanza) |
| [Practical_06_Parsing.ipynb][06] | Syntax (sentence structure) analysis, constituency and dependency parsing |
| [Practical_07_Information_Extraction.ipynb][07] | Named Entity Recognition, relation and event extraction |
| [Practical_08_Word_sense_disambiguation.ipynb][08] | Lexical ambiguity, word senses, WordNet, word sense disambiguation |
|[Practical_09_The_Role_Of_Machine_Learning_In_NLP.ipynb][09]| Supervised and unsupervised machine learning and its NLP applications |
|[Practical_10_Sentiment_analysis.ipynb][10]| Sentiment analysis (VADER, sklearn, Random Forest Classifier, neural networks)|
|[Practical_11_Unsupervised_Learning_and_Topic_modeling.ipynb][11]| Unsupervised machine learning and topic modelling |
|[Practical_12_NLP_Ethics.ipynb][13]| NLP ethics (big data dangers, gender and racial bias, green NLP, privacy, crowdsourcing, text generation (GPT-2, GPT-3), chatbots) |
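
To give a flavour of the topics in the first practical (tokenising, lowercasing, frequency distribution, n-grams), here is a minimal sketch using only the Python standard library. It is not taken from the notebooks, which work with NLTK and the other libraries listed above; the helper names `tokenize` and `ngrams` and the sample sentence are assumptions made up for illustration, and the regular expression is a deliberately crude tokeniser.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (crude regex tokeniser)."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())


def ngrams(tokens, n):
    """Return all n-grams over a token sequence as a list of tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample text, used only to illustrate the steps.
text = "The cat sat on the mat. The cat slept."

tokens = tokenize(text)              # tokenising + lowercasing
freq = Counter(tokens)               # frequency distribution over tokens
bigrams = Counter(ngrams(tokens, 2)) # frequency distribution over bigrams

print(freq.most_common(2))
print(bigrams.most_common(1))
```

A dedicated tokeniser such as the ones in NLTK or spaCy handles punctuation, contractions and Unicode far more carefully than this regular expression does.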



### License

[![Creative Commons License](https://i.creativecommons.org/l/by-nc/4.0/88x31.png)](http://creativecommons.org/licenses/by-nc/4.0/)  
This work is licensed under a [Creative Commons Attribution-NonCommercial 4.0 International License](http://creativecommons.org/licenses/by-nc/4.0/).

### Contact

Please feel free to drop us a line if you have any questions or comments! We will appreciate any feedback.

[![https://twitter.com/kamlovskaya](https://img.shields.io/twitter/url/https/twitter.com/kamlovskaya.svg?style=social&label=Follow%20%40kamlovskaya)](https://twitter.com/kamlovskaya)

[![https://twitter.com/shohrehhd](https://img.shields.io/twitter/url/https/twitter.com/shohrehhd.svg?style=social&label=Follow%20%40shohrehhd)](https://twitter.com/shohrehhd)





   [01]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_01_Preprocessing.ipynb>
   [02]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_02_Chatbots_with_Regex.ipynb>
   [03]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_03_Language_Models.ipynb>
   [04]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_04_Word_Embeddings.ipynb>
   [05]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_05_POS_Tagging.ipynb>
   [06]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_06_Parsing.ipynb>
   [07]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_07_Information_Extraction.ipynb>
   [08]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_08_Word_sense_disambiguation.ipynb>
   [09]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_09_The_Role_Of_Machine_Learning_In_NLP.ipynb>
   [10]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_10_Sentiment_analysis.ipynb>
   [11]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_11_Unsupervised_Learning_and_Topic_modeling.ipynb>
   [13]: <https://gitlab.uni.lu/mine/introduction-to-nlp-practicals-2020/-/blob/master/Practical_12_NLP_Ethics.ipynb>