Assignment M1A2 Hands-on deep learning for NLP
This assignment should be done in groups of 1-4 students, with a preference for larger groups. We encourage cooperation: you will get much more out of the assignment if you work together, since you can cover more ground. That said, you may solve it individually if you absolutely prefer to.
In this assignment you will investigate the use of neural networks for natural language processing (NLP). You will work on the problem of sentiment analysis, using the dataset prepared in Learning Word Vectors for Sentiment Analysis by Maas et al. The data is mined from the IMDB movie database, and the task is to determine whether a review is positive or negative based on the review text.
Read these movie reviews and determine how you would classify them:
- "Uhhh ... so, did they even have writers for this? Maybe I'm picky, but I like a little dialog with my movies. And, as far as slasher films go, just a sliver of character development will suffice."
- "Liked Stanley & Iris very much. Acting was very good. Story had a unique and interesting arrangement. The absence of violence and sex was refreshing. Characters were very convincing and felt like you could understand their feelings. Very enjoyable movie."
- "Everything that made the original so much fun seems to absent here. This is simply a "run of the mill demons on the loose wrecking havoc" slasher, but without the passion that graced the original"
- OK, so the musical pieces were poorly written and generally poorly sung (though Walken and Marner, particularly Walken, sounded pretty good). And so they shattered the fourth wall at the end by having the king and his nobles sing about the "battle" with the ogre, and praise the efforts of Puss in Boots when they by rights shouldn't have even known about it.Who cares? It's Christopher Freakin' Walken, doing a movie based on a fairy tale, and he sings and dances. His acting style fits the role very well as the devious, mischievous Puss who seems to get his master into deeper and deeper trouble but in fact has a plan he's thought about seven or eight moves in advance. And if you've ever seen Walken in any of his villainous roles, you *know* the ogre bit the dust HARD at the end when Walken got him into his trap. A fun film, and a must-see for anyone who enjoys the unique style of Christopher Walken.
According to the IMDB movie review database, these are classified as negative, positive, negative and positive, respectively. Some of them you will probably find easy to classify, as they are filled with positively or negatively charged words; others, however, require a deeper understanding of the English language.
The task is to use deep learning to build such a classifier. The IMDB movie database is nowadays considered "too small" to reach the full potential of deep networks (there are 50,000 reviews in the dataset). The advantage is that training can be performed even on a laptop CPU (though you would appreciate the possibility to run on a GPU...).
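Before a review reaches the network, the raw text is turned into a fixed-length sequence of integer word ids. The following is a plain-Python sketch (a hypothetical helper, not the notebooks' actual code) of the two preprocessing knobs the notebooks expose: n_unique_words caps the vocabulary at the most frequent words, and max_review_length pads or truncates every review to the same length.

```python
def encode_review(tokens, word_index, n_unique_words, max_review_length,
                  pad_id=0, oov_id=1):
    """Map words to integer ids, replace rare words with an
    out-of-vocabulary id, then pad/truncate to a fixed length."""
    ids = [word_index.get(w, oov_id) for w in tokens]
    # cap the vocabulary: any id beyond n_unique_words becomes OOV
    ids = [i if i < n_unique_words else oov_id for i in ids]
    ids = ids[:max_review_length]                      # truncate
    ids += [pad_id] * (max_review_length - len(ids))   # pad
    return ids

# Toy vocabulary, ids ordered by word frequency.
word_index = {"the": 2, "movie": 3, "was": 4, "great": 5, "terrible": 6}
print(encode_review("the movie was great".split(), word_index,
                    n_unique_words=5, max_review_length=6))
# -> [2, 3, 4, 1, 0, 0]  ("great" falls outside the capped vocabulary)
```

Shrinking n_unique_words throws away rare words; shrinking max_review_length throws away the tail of long reviews. Part 1 asks you to study how both affect accuracy.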
A word about the training...
In this assignment you will train neural networks. This will take some time. It is recommended that you do this work in parallel with some other task that you can switch to while you wait for the training to finish. Also, make sure to distribute the work within the group. We recommend that you go through the tasks together first and ensure that you all understand the basic concepts (the first questions for Part 1 and 2). Then you can split the evaluation work among yourselves.
Start Jupyter
The following assumes that you have set up the system as in Computer Tools for Learning and Knowledge and that you have installed the notebook code for the assignment.
source ~/tensorflowenv/bin/activate
cd ~/tensorflowenv/wasp_as2_m1a2
jupyter notebook
This should open up your browser and show you a list of notebooks. To open a notebook simply click on it.
Part 1
In this task you will investigate the use of convolutional layers in the network to process text data. Run through the notebook 1_baseline_convnets.ipynb and make sure that you understand what is going on. In particular ensure that you can discuss and reason around
- The role of training and validation data
- Overfitting and underfitting, and how you can see this during training
- The different activation functions (ReLU and sigmoid)
- The different layers and the number of parameters for each (make sure you understand the model summary table)
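The parameter counts in the model summary table can be checked by hand. Below is a sketch for a hypothetical baseline of the kind used in the notebook (Embedding → Conv1D → global max pooling → Dense); the layer sizes are illustrative assumptions, not the notebook's exact values.

```python
def embedding_params(n_unique_words, embed_dim):
    # one embed_dim-dimensional vector per vocabulary entry, no bias
    return n_unique_words * embed_dim

def conv1d_params(kernel_size, in_channels, n_filters):
    # each filter spans kernel_size positions x in_channels, plus one bias
    return (kernel_size * in_channels + 1) * n_filters

def dense_params(n_in, n_out):
    # full weight matrix plus one bias per output unit
    return (n_in + 1) * n_out

# Example: 5000-word vocabulary, 64-dim embeddings,
# 256 filters of width 3, one sigmoid output unit.
print(embedding_params(5000, 64))   # 320000
print(conv1d_params(3, 64, 256))    # (3*64 + 1) * 256 = 49408
print(dense_params(256, 1))         # 257
```

Note that the embedding layer typically dominates the parameter count, which is one reason n_unique_words matters so much for model size.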
Perform experiments to investigate the performance of the network when varying
- how much training data you provide the system with
- max_review_length and n_unique_words
- the amount of dropout
- batch size
- different architectures
- how smaller and larger networks, respectively, handle less data
- other things that you find interesting
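To reason about the "amount of dropout" experiments, it helps to see what the mechanism does. This is a plain-Python sketch of inverted dropout (the variant used at training time by Keras' Dropout layer): each activation is zeroed with probability p and the survivors are scaled up so the expected value is unchanged.

```python
import random

def inverted_dropout(activations, p, rng):
    """Training-time (inverted) dropout: zero each activation with
    probability p and scale survivors by 1/(1-p), so the expected
    value is preserved and nothing special is needed at test time."""
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0
            for a in activations]

rng = random.Random(0)
acts = [1.0, 2.0, 3.0, 4.0]
dropped = inverted_dropout(acts, p=0.5, rng=rng)
# each surviving value is doubled (1 / (1 - 0.5)); the rest are zero
```

Because different units are dropped on each batch, the network cannot rely on any single unit, which is why increasing p tends to reduce overfitting (up to a point, after which it starts to underfit).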
What is the best performance (validation accuracy) you can get?
Part 2
In this task you will investigate how to use more advanced models that take into account the temporal/spatial correlation between data. In particular you should investigate RNNs and LSTMs.
You can start from the notebook 2_rnn_lstm.ipynb. Ensure that you understand the basic idea behind
- RNN
- LSTM
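The core idea behind an RNN fits in a few lines. Below is a deliberately minimal scalar sketch (single-unit, hand-picked weights, purely illustrative) of the recurrence h_t = tanh(w_x * x_t + w_h * h_{t-1} + b); an LSTM keeps the same recurrent structure but adds gates that control what is written to, kept in, and read from the state.

```python
import math

def rnn_scan(xs, w_x, w_h, b):
    """Minimal scalar RNN: the hidden state h is updated from each
    input x_t and its own previous value -- this recurrence is what
    lets the network carry information along the sequence."""
    h = 0.0
    for x in xs:
        h = math.tanh(w_x * x + w_h * h + b)
    return h

# The same inputs in a different order give a different final state,
# unlike the order-insensitive pooling in the ConvNet baseline.
h1 = rnn_scan([1.0, -1.0, 0.5], w_x=0.8, w_h=0.5, b=0.0)
h2 = rnn_scan([0.5, -1.0, 1.0], w_x=0.8, w_h=0.5, b=0.0)
```

This order sensitivity is exactly what makes recurrent models attractive for text, where "not good, but great" and "great, but not good" should be classified differently.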
Perform experiments to investigate
- How does the performance of the RNN/LSTM compare to the simpler ConvNet used in Part 1? If better, why? If worse, why?
- Does it pay off to stack RNNs?
What is the best performance (validation accuracy) you can get?
Part 3 (OPTIONAL)
Pick another dataset and modify the code so that you can solve the corresponding task.
Your handin
You should hand in a zip-file in Canvas before the deadline. Please name this file with the surnames of all team members, e.g. Jensfelt-Maggio-Bernhardsson.zip. All team members need to hand in, even if it is the same file (this makes it possible for us to follow up on who has not handed in). All team members should understand all the results.
The handin should contain a short presentation (at most 10 slides) and the Jupyter notebooks that you created or modified, as needed to reproduce your findings. The presentation should give a concise description of your findings from Part 1 and Part 2.