Our stock sentiment analysis system has been working like a charm for a little over 3 days now. Some confusion over the Yahoo News API caused a temporary outage, but the data has moved around enough that we can now start doing some data science on what we have collected and see whether we can make some educated guesses about what the stock price will do in the upcoming weeks.
Problem – Tone Analysis is Too Slow
Currently we fetch a list of the latest 20 news stories for a stock ticker, then iterate through the list sequentially, scraping each story's content and submitting that scrape to the IBM Watson tone analysis service. As a result, the time to process a list of URLs is O(n): each story waits for the one before it to be scraped and analyzed, so the total time grows linearly with the number of URLs.
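To make the bottleneck concrete, here is a minimal sketch of that sequential loop. It is not the actual code behind the tool: `process_sequentially`, `fake_scrape`, and `fake_analyze_tone` are hypothetical names, and the two fake functions only sleep for a few milliseconds to stand in for the real scrape and the real tone analysis request.

```python
import time
from typing import Callable, Dict, List


def process_sequentially(
    urls: List[str],
    scrape: Callable[[str], str],
    analyze_tone: Callable[[str], Dict[str, float]],
) -> List[Dict[str, float]]:
    """Scrape and analyze each URL one at a time, in list order."""
    results = []
    for url in urls:
        text = scrape(url)                  # blocks until the page is fetched
        results.append(analyze_tone(text))  # blocks until the service replies
    return results


# Hypothetical stand-ins for the real scraper and tone service (assumptions,
# not real APIs). Each one sleeps briefly to simulate network latency.
def fake_scrape(url: str) -> str:
    time.sleep(0.01)
    return f"article text from {url}"


def fake_analyze_tone(text: str) -> Dict[str, float]:
    time.sleep(0.01)
    return {"joy": 0.5, "length": float(len(text))}


# 20 placeholder story URLs, matching the "latest 20 news stories" above.
urls = [f"https://example.com/story/{i}" for i in range(20)]

start = time.perf_counter()
tones = process_sequentially(urls, fake_scrape, fake_analyze_tone)
elapsed = time.perf_counter() - start

print(f"{len(tones)} stories processed in {elapsed:.2f}s")
```

Because every call blocks, the elapsed time is roughly the number of stories multiplied by the per-story latency. With real network calls taking seconds rather than milliseconds, the same loop adds up quickly.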
I figured I would go off and run the tool on a few prominent tickers, make a prediction for a time period, and revisit later to see how those predictions fared.
Keep in mind, this is rather unscientific. I’ll be updating throughout the day as the analyses finish. This is largely an exercise to illustrate how long the process takes without the architectural enhancements we make in upcoming steps.