My A2Z final idea builds on Unreadable News, a previous project I made in collaboration with Ivy. It is an installation that makes people aware of media manipulation in daily life.
But when we showed the piece to people outside the project, some of them wondered why we picked those 4 pieces of news, because the content of the newspaper never changes. And sometimes I feel it's kind of too serious. So I'm thinking of a way to convey our idea differently. Maybe machine learning would be a good approach.
It would be nicer if the content of the newspaper, both text and images, could change dynamically based on the date or other keywords.
I chose to train an LSTM model to generate news.
1. I need a dataset. First I tried the New York Times API, but it doesn't let me get the whole content of every article. Then I tried a news API, which was nice because it provides the content, but it has limitations: for example, you can only get 20 articles if you don't pay for it. So I ended up googling for news datasets and got the raw data of BBS news.
I manually combined its 5 categories and got about 5 MB of data, but it showed an error.
Then I cut the dataset down from 5 MB to 2 MB, and it worked.
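The combining and trimming step above could also be scripted instead of done by hand. This is only a minimal sketch, assuming the raw data sits in one sub-folder per category with one `.txt` file per article. The folder layout, the `build_corpus` name and the 2 MB default are my assumptions, not something the dataset guarantees.

```python
import random
from pathlib import Path


def build_corpus(root: Path, out_file: Path, max_bytes: int = 2_000_000, seed: int = 0) -> int:
    """Join every .txt article under root/<category>/ into one file, capped at max_bytes."""
    articles = sorted(root.glob("*/*.txt"))
    # Shuffle so the size cap does not cut off whole categories (assumed to be desirable).
    random.Random(seed).shuffle(articles)

    written = 0
    with out_file.open("w", encoding="utf-8", newline="\n") as out:
        for article in articles:
            text = article.read_text(encoding="utf-8", errors="ignore").strip() + "\n\n"
            size = len(text.encode("utf-8"))
            if written + size > max_bytes:
                break  # stop at an article boundary instead of cutting mid-sentence
            out.write(text)
            written += size
    return written
```

Calling something like `build_corpus(Path("news"), Path("input.txt"))` would write a training file of at most 2 MB. The shuffle is there because reading folders in order and stopping at the cap would leave out the last categories entirely.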
2. I used Paperspace to train the model. I tried several settings, and the training times were extremely different.
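One rough way to see why the settings matter so much: many character-level training scripts cut the text into chunks of `batch_size * seq_length` characters and take one step per chunk. Assuming the trainer works that way, the number of steps can be estimated as below. The settings in the loop are made-up examples, not the ones I actually used, and the time per step also changes with the model size, so this only covers part of the difference.

```python
def training_steps(num_chars: int, batch_size: int, seq_length: int, epochs: int) -> int:
    """Estimate optimizer steps for a char-level trainer (assumed batching scheme)."""
    steps_per_epoch = num_chars // (batch_size * seq_length)
    return steps_per_epoch * epochs


# Hypothetical settings on a roughly 2 MB (about 2,000,000 character) text file.
for batch_size, seq_length in [(32, 32), (64, 64), (128, 128)]:
    steps = training_steps(2_000_000, batch_size, seq_length, epochs=50)
    print(f"batch_size={batch_size} seq_length={seq_length} -> {steps} steps")
```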
Then I fed my training result back into the LSTM model and got my generator.
The BBS dataset is kind of old, and I feel the result is not that interesting, so maybe I should find another dataset.
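To give an idea of what a generator like this does at each step: a character-level model outputs a probability for every possible next character, and the generator usually samples one of them, often with a "temperature" setting that controls how adventurous the text is. The sketch below shows only that sampling step in plain Python. The probabilities are invented and `sample_with_temperature` is a hypothetical helper, not part of the model I trained.

```python
import math
import random
from typing import Mapping


def sample_with_temperature(probs: Mapping[str, float], temperature: float, rng: random.Random) -> str:
    """Pick the next character from model probabilities, rescaled by temperature."""
    chars = [c for c, p in probs.items() if p > 0]
    # Dividing log-probabilities by the temperature sharpens (<1) or flattens (>1) the choice.
    logits = [math.log(probs[c]) / temperature for c in chars]
    peak = max(logits)
    weights = [math.exp(value - peak) for value in logits]
    return rng.choices(chars, weights=weights, k=1)[0]


# Invented probabilities for the character that follows some seed text.
next_char_probs = {"a": 0.7, "b": 0.2, "c": 0.1}
rng = random.Random(42)
print("".join(sample_with_temperature(next_char_probs, 0.5, rng) for _ in range(20)))
```

A low temperature makes the output stick to the most likely characters, which tends to look more like the training news but also more repetitive. A higher one gives stranger text.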
My next steps are:
Find a better dataset;
Think about how people interact with it, and how to get data from people instead of having them type something into the program.