Our stock sentiment analysis system has been working like a charm for a little over three days now. There was some confusion around the Yahoo News API that caused a temporary outage, but the data has moved around enough that we can start applying some data science to what we have collected and see whether we can make educated guesses about what the stock price will do in the upcoming weeks.
The Ticker News Sentiment Analysis System (TNSAS, we'll call it) has gone through quite a transformation lately. Not only have we taken advantage of our containerization to scale the tone analysis service independently of the content scraper service, but we have also added Redis-based caching to cut the number of calls sent to IBM Watson to an absolute minimum, while keeping a log of all scraped documents and their related analyses.
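The caching pattern is simple: hash the scraped document, look the hash up before calling Watson, and only pay for analysis on a miss. A minimal sketch of that idea is below; the function and service names are illustrative (not the actual TNSAS API), and a plain dict stands in for the Redis client, since the real service would issue the same get/set calls through redis-py.

```python
import hashlib
import json

# An in-memory dict stands in for a Redis client in this sketch; in the real
# service the same get/set pattern would go through redis-py (r.get / r.set).
cache = {}

def analyze_tone(document_text, tone_service):
    """Return tone analysis for a document, calling Watson only on a cache miss.

    `tone_service` is any callable that takes text and returns a result dict;
    the name is a placeholder for the IBM Watson call.
    """
    key = "tone:" + hashlib.sha256(document_text.encode("utf-8")).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)          # cache hit: no billable Watson call
    result = tone_service(document_text)
    cache[key] = json.dumps(result)     # also serves as the analysis log
    return result
```

Keying on a hash of the document text, rather than the URL, means a re-scraped but unchanged article still hits the cache.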
We report this data via the console and copy and paste it into a spreadsheet for tracking over time. This is not ideal: it requires significant manual data collection over an extended period of time to get data at a resolution fine enough to enable something approaching a real-time system. We need a solution that can stand up on its own, collect data consistently, and then report it per ticker.
Just a quick note: with the latest caching and multithreading improvements we have added a whole slew of tickers (not blue chips, but less tumultuous ETFs and penny stocks) to the tracking spreadsheet.
Developing microservices for pleasure and profit works when it doesn’t cost anything to do so. After our last month of developing a tone analysis system using IBM Watson on BlueMix, I tallied up quite a few queries and as a result got a bill from BlueMix for a whole 9 cents.
Problem – Tone Analysis is Too Slow
Currently we fetch a list of the latest 20 news stories for a stock ticker and then iterate through the list sequentially, scraping the content and submitting each scrape to the IBM Watson tone analysis service. As a result, processing a list of URLs takes O(n) time, and the network wait for each scrape and each Watson call adds directly to the total.
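Since each step in the loop is network-bound, the obvious fix is to run the scrape-and-analyze pipeline for each URL on a worker thread so the waits overlap instead of summing. A minimal sketch, with placeholder `scrape` and `analyze` functions standing in for the scraper and Watson services:

```python
from concurrent.futures import ThreadPoolExecutor

def scrape(url):
    # Placeholder for the content scraper service call.
    return "article text for " + url

def analyze(text):
    # Placeholder for the IBM Watson tone analysis call.
    return {"length": len(text)}

def process_urls(urls, max_workers=8):
    """Scrape and analyze each URL on a worker thread so per-URL network
    waits overlap rather than adding up as they do in the sequential loop."""
    def pipeline(url):
        return url, analyze(scrape(url))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(pipeline, urls))
```

With 20 stories and 8 workers, wall-clock time drops from the sum of 20 round trips to roughly the 3 slowest batches, though the total work (and the Watson bill) is unchanged.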