FINA 5240: FinTech Analytics Assignment 5

Halis Sak

October 17, 2019

Question. In the previous homework assignment we used the tf-idf method to predict labels from the content of the whitepapers of ICOs (Initial Coin Offerings). Now we want to use feed-forward neural networks for the same classification problem. Please use Python to do the following tasks.

a) We need a tokenizer to split the content of the documents into words. Please use the nltk package (follow the steps at https://pythonspot.com/tokenizing-words-and-sentences-with-nltk/ to download all the required packages). After completing the installation of the nltk package, create a new Pandas dataframe with columns [“tok_content”, “label”]. Tokenize the content of the whitepapers in “ICOData.csv” using the “word_tokenize” function of the nltk package and store the tokens in the “tok_content” column of the new dataframe. The “label” column should hold the labels of the documents in “ICOData.csv”.
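A minimal sketch of this step, assuming the text column in “ICOData.csv” is named `content` and the label column `label` (both names are guesses and should be adjusted to the actual file). The helper name `build_tokenized_frame` is hypothetical, and the tokenizer is passed in as an argument so that `word_tokenize` from nltk can be supplied once its data has been downloaded.

```python
import pandas as pd


def build_tokenized_frame(raw, tokenize, content_col="content", label_col="label"):
    """Build a DataFrame with columns ["tok_content", "label"].

    raw         -- DataFrame read from the CSV file
    tokenize    -- callable mapping a string to a list of tokens
    content_col -- assumed name of the text column (hypothetical)
    label_col   -- assumed name of the label column (hypothetical)
    """
    # Cast to str first; note that a missing cell then becomes the string "nan".
    tokens = raw[content_col].astype(str).map(tokenize)
    return pd.DataFrame({"tok_content": tokens, "label": raw[label_col]})


# Intended use with the assignment's data (not executed here):
#   from nltk.tokenize import word_tokenize
#   raw = pd.read_csv("ICOData.csv")
#   df = build_tokenized_frame(raw, word_tokenize)
```

Passing the tokenizer as a parameter keeps the helper easy to check on a small hand-made dataframe before running it on the full file.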

b) We can use the gensim package to create vectors for the tokens of our whitepapers. (We used this package for news articles in Mandarin in Lectures 3 and 4.) First, train a word2vec model on the tokens of our whitepapers using the gensim package. Then, find the words most similar to “Bitcoin”.
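“Most similar” here means highest cosine similarity between word vectors, which is how gensim's `most_similar` ranks candidates. The sketch below reproduces that ranking in plain Python on a few made-up three-dimensional vectors (the numbers are illustrative only, not trained output). The trailing comment shows the corresponding gensim calls, with parameter names as in gensim 4 (`vector_size` was called `size` in earlier releases); the hyperparameter values there are placeholders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(vectors, word, topn=3):
    """Rank every other word in `vectors` by cosine similarity to `word`."""
    query = vectors[word]
    scored = [(w, cosine(query, v)) for w, v in vectors.items() if w != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Made-up vectors for illustration; real ones come from the trained model.
toy_vectors = {
    "Bitcoin": [0.9, 0.1, 0.0],
    "Ethereum": [0.8, 0.2, 0.1],
    "token": [0.5, 0.5, 0.2],
    "lawyer": [0.0, 0.1, 0.9],
}
ranking = most_similar(toy_vectors, "Bitcoin")

# With gensim (not executed here), assuming df["tok_content"] holds token lists:
#   from gensim.models import Word2Vec
#   model = Word2Vec(sentences=df["tok_content"].tolist(),
#                    vector_size=100, window=5, min_count=1)
#   model.wv.most_similar("Bitcoin", topn=10)
```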

c) Construct a mapping from the tokens in our dictionary to integers, as we did in the lecture notes. Then split the new dataframe into two groups, training and testing (“df_train” and “df_test”). The first 130 rows of the data should be in “df_train” and the rest in “df_test”.
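One possible way to build the mapping and the split is sketched below. The lecture notes are not reproduced here, so the numbering scheme (each new token gets the next unused integer, starting at 0) is an assumption, and the helper names `build_token_index` and `split_by_position` are hypothetical. The three-row dataframe is a made-up stand-in for the real data.

```python
import pandas as pd


def build_token_index(token_lists):
    """Give each distinct token the next unused integer, in order of first appearance."""
    token_to_ix = {}
    for tokens in token_lists:
        for token in tokens:
            if token not in token_to_ix:
                token_to_ix[token] = len(token_to_ix)
    return token_to_ix


def split_by_position(df, n_train=130):
    """First n_train rows go to training, the remaining rows to testing."""
    return df.iloc[:n_train], df.iloc[n_train:]


# Tiny stand-in for the tokenized dataframe (made-up rows, for illustration only).
df = pd.DataFrame({
    "tok_content": [["Bitcoin", "is", "digital"], ["token", "is", "sold"], ["Bitcoin", "token"]],
    "label": [1, 0, 1],
})
token_to_ix = build_token_index(df["tok_content"])
df_train, df_test = split_by_position(df, n_train=2)  # the assignment uses 130
```

The mapping is built from all rows before splitting, matching the order of the steps in the task, so every token in “df_test” already has an integer.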

d) We want to fit a two-layer feed-forward neural network to our data, as we did in Lecture 3. Please set the parameters of the model. The maximum number of tokens, “mlen”, can be set to 3000. You are welcome to experiment with the hyperparameters of the model.
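To make the role of the parameters concrete, here is a forward pass through one possible two-layer feed-forward network, written with NumPy. The Lecture 3 model is not shown in this handout, so the architecture (averaged token embeddings, a tanh hidden layer, a softmax output) is an assumption, and every value except “mlen” = 3000 is a placeholder. The names `params_cfg`, `init_weights` and `forward` are hypothetical.

```python
import numpy as np

# Only mlen = 3000 comes from the assignment; the other values are placeholders.
params_cfg = {"mlen": 3000, "embed_dim": 50, "hidden_dim": 32, "num_labels": 2}


def init_weights(vocab_size, cfg, seed=0):
    """Randomly initialise an embedding table and two dense layers."""
    rng = np.random.default_rng(seed)
    return {
        "E": rng.normal(0.0, 0.1, (vocab_size, cfg["embed_dim"])),
        "W1": rng.normal(0.0, 0.1, (cfg["embed_dim"], cfg["hidden_dim"])),
        "b1": np.zeros(cfg["hidden_dim"]),
        "W2": rng.normal(0.0, 0.1, (cfg["hidden_dim"], cfg["num_labels"])),
        "b2": np.zeros(cfg["num_labels"]),
    }


def forward(weights, token_ids, cfg):
    """Return label probabilities for one non-empty document of token integers."""
    ids = np.asarray(token_ids[: cfg["mlen"]])                  # keep at most mlen tokens
    doc_vec = weights["E"][ids].mean(axis=0)                    # average the token embeddings
    hidden = np.tanh(doc_vec @ weights["W1"] + weights["b1"])   # first layer
    logits = hidden @ weights["W2"] + weights["b2"]             # second layer
    exp = np.exp(logits - logits.max())                         # numerically stable softmax
    return exp / exp.sum()


weights = init_weights(vocab_size=10, cfg=params_cfg)
probs = forward(weights, [0, 3, 3, 7], params_cfg)
```

This covers prediction only; training would additionally need a loss function and gradient updates, which a deep learning framework normally provides.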

e) Finally, we want to train the model and compute the classification accuracy. The Python code that I wrote for Lecture 3 can be used mostly unchanged; however, the data and target lines of the code need to be changed.
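Classification accuracy is the fraction of documents whose predicted label matches the true label. A small sketch of that computation follows; the function name `classification_accuracy` is hypothetical and the label lists are made up. In the assignment the predictions would come from the trained model applied to “df_test” and the true labels from its “label” column.

```python
def classification_accuracy(predicted, actual):
    """Fraction of documents whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels for illustration: 3 of the 4 predictions match.
accuracy = classification_accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```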