Technology is becoming an ever more influential factor in the lives of ordinary people around the globe, and the internet has expanded in such a way that living without it has, in some countries, even become impossible. Currently, artificial intelligence and quantum computing are on the verge of a breakthrough and could potentially become as influential in society as the internet is in our daily lives today. Correspondingly, in the world of finance, the rise of the internet and the technological developments that followed are greatly impacting financial markets. For instance, transactions have become electronic and the time it takes to execute a trade has decreased to milliseconds, and even nanoseconds. In addition, a new custom-built chip that is able to execute trades within 740 nanoseconds is being launched by Fixnetix, a UK-based company. According to Johnson et al. (2012), this technological race is likely to be pushed further until the physical limit set by the speed of light is reached. Amongst these technological developments in the financial markets, automated trading might be the most current and prominent revolution. An algorithm can be defined as a precise plan of steps that uses computations to transform input values into an output value (Leshik & Cralle, 2011). Supply and demand on the stock markets are increasingly in the hands of such computational algorithms, which decide fully autonomously to buy or sell a stock on behalf of their “owner”. As presented in Figure 1 by Glantz & Kissel (2013, p. 258), the percentage of market volume that can be attributed to algorithmic trading has risen greatly in the past twenty years, with asset managers, high frequency traders and hedge funds accounting for most of the volume (Glantz & Kissel, 2013). Our proxy for algorithmic trading based on CRSP data supports these findings and also shows a clear rise in algorithmic trading activity, as can be observed in Figure 2.

Figure 1. Algorithmic trading as a percentage of market volume. Reprinted from: Multi-asset risk modeling: techniques for a global economy in an electronic and algorithmic trading era, by M. Glantz, & R. Kissel, 2013, p. 258, Copyright by Academic Press.

Figure 2. Proxy for Algorithmic Trading based on CRSP data

Nevertheless, algorithmic trading is still a new topic: even though its foundation can be traced back to 1949, it has only become widespread in the last two decades (Leshik & Cralle, 2011). To give an example, a search for algorithmic trading on Google Scholar (date: 18/7/2017) returns only 500 results that contain “algorithmic trading” in their title, most of which are working papers, and only 20 of these were written before 2005. Put into context, these 500 papers and books amount to only about 0.75% of the 67,000 articles that hold “financial crisis” in their title.

For this reason, many of the sources used are books and working papers, as information on algorithmic trading is still limited.

However, according to Kaya (2016), high frequency trading already accounted for 49 percent of all volume in U.S. equity markets in 2014, and one must keep in mind that high frequency trading is merely a subgroup of algorithmic trading. The connection between algorithmic trading and its effects on the human aspects of trading is barely touched upon in the existing financial literature. It is likely that algorithmic trading, in combination with improved artificial intelligence and quantum computing, will completely change the financial markets as they are known to us now. Its relevance is undeniable, and yet so little is known about how the automation revolution impacts financial markets. Quantum computing and artificial intelligence still lie in the future; nevertheless, human traders are already being replaced by computers on a large scale, and the effects of this substitution should be measurable using quantitative data. Measuring the effects of algorithmic trading is likely to give insights into how financial markets will behave in the future. The rise of algorithmic trading implies that direct human influence within the financial markets has declined. It can therefore be reasoned that trading algorithms differ in trading behavior from human investors, in the sense that trading algorithms are assumed never to deviate from their set of predefined rules unless those rules allow for it. In other words, a trading algorithm will always behave within its programmed boundaries while accounting for all the information that is delivered to it. Human traders, on the other hand, are more likely to act on their intuition and on what is happening in their environment, with a tendency to value certain information above other information.

These influences can be identified as behavioral biases, which are recurring patterns in human behavior that make that behavior easier to predict (Heiner, 1983).

Humans are rational, but only boundedly so, and are often attracted to the majority opinion (Kahneman, 2003). In the world of finance, this pull of social gravity towards the majority opinion, together with bounded rationality, amplifies inefficiencies in the stock market as investors consistently overprice popular stocks and underprice less favored equities (Dreman & Lufkin, 2000). Furthermore, Kim and Kim (2014) state that investor sentiment is affected by historical share price performance, which further strengthens the market inefficiencies. Considering that the stock market is already inefficient to a certain extent, it is likely that investor sentiment is often biased because of unrepresentative share prices, which in turn could lead to more inaccurate forecasts. Additionally, Chaboud, Chiquoine, Hjalmarsson & Vega (2014) find evidence that “algorithmic trading contributes to a more efficient price discovery process via the elimination of triangular arbitrage opportunities”. All in all, it can be assumed that the market is becoming more efficient with the increased influence of algorithms. Furthermore, according to the efficient market hypothesis developed by Fama (1995), this development should reinforce the random walk of stock prices and consequently their unpredictability. Research on price dispersion related to algorithmic trading has not been performed previously; the most closely related literature is the work on transaction cost dispersion by Engle, Russell & Ferstenberg (2007), in which only Morgan Stanley data is used instead of complete stock market data. Furthermore, the link between algorithmic trading and market predictability has no precedent either. This study therefore explores new terrain in the field of algorithmic trading, using the fundamental relationships between algorithmic trading, market quality and information previously researched by Hendershott, Jones & Menkveld (2011) and Lyle & Naughton (2015).

For this reason, the main theme of this study is to evaluate how increased algorithmic trading has affected analysts’ capabilities to predict future market movements. Removing emotional entities from the market is expected to improve the efficiency of the market and hence decrease market predictability. Moreover, a sub question is used to develop an empirical foundation for answering the main question; it reads: Does algorithmic trading lead to less price dispersion within the stock market? Chaboud et al. (2014) show that automated trading strategies are less diverse than the strategies used by human investors and that humans are responsible for a larger part of the variance in returns than their algorithmic counterparts. As algorithms are more similar to each other than human traders are, it can be suspected that the size of the range of returns, also known as dispersion, has decreased with increased algorithmic trading. Moreover, when looking at our data graphically, it can be observed that return dispersion shows a clear downtrend over time, except for some extreme values during the financial crisis in 2008/2009; see Figure 3. Additionally, regressing dispersion against time confirms the downward slope, resulting in a negative, statistically significant coefficient on time with a p-value of 0.001. Considering that algorithmic trading increased over time, this could imply a relation with dispersion.

Figure 3. Dispersion against time

The current study investigates the effects of algorithmic trading in more detail, by systematically performing fixed effects panel data regressions. This might enable us to see how increased algorithmic trading has affected return dispersion and market predictability.

The regression findings lead to the conclusion that dispersion is indeed reduced through increased algorithmic trading. Furthermore, it is found that more algorithmic trading led to smaller prediction errors and hence improved market predictability.

In the next chapter, the theoretical framework that was used to establish this research will be discussed, built on the following research questions:

Does increased algorithmic trading within the market affect analysts’ capabilities to predict future market movements?

Sub question:

Does algorithmic trading lead to less price dispersion in the stock market?

Theoretical Background

Current State of Literature

To determine the influence of algorithmic trading on dispersion and market predictability, first of all the origins of trading algorithms and the use of automated trading systems must be investigated. Additionally, to find how fewer human traders impact market predictability and dispersion, financial behavioral biases and market predictability should be examined as well.

Algorithmic Trading and Automated Trading Systems (ATS)

Leshik & Cralle (2011) explain that algorithms used for trading can be traced back to 1949, when Alfred Winslow Jones used an algorithm to balance the long and short positions of a hedge fund. An algorithm can be defined as a precise plan of steps that uses computations to transform input values into an output value. Fundamental to computer software and computations, algorithms have become a mainstream aid to the daily trader. It was not until the 1980s that algorithmic or black box trading became hugely profitable, due to the invention of Pair Trading. Decreased costs, improved control mechanisms with self-documenting trade records and speed of execution are some of the advantages that algorithmic trading offers to increase the likelihood that a trade turns out successful. In order to understand how exactly financial markets are affected by algorithmic trading, it is necessary to get to the very basis of how a trading algorithm works. For that reason, an example algorithm for a coke vending machine is introduced. The algorithm can be constructed as simply as:

if sum of COINS INSERTED > $1 then DROP CAN and RETURN(sum of COINS INSERTED – $1)

if sum of COINS INSERTED = $1 then DROP CAN

if sum of COINS INSERTED < $1 then SHOW MESSAGE(Insufficient Amount)

if ABORTED then RETURN(COINS INSERTED)

In this example the amount of coins inserted is the main input; its total value instructs the vending machine to drop the coke can and to return any change if necessary. The algorithm simply follows the set of rules to transform input into output and never deviates from these rules during the process. Similarly to the example algorithm, trading algorithms are merely a set of predefined rules that convert input into output. Trading algorithms are therefore implemented within Automated Trading Systems, which facilitate data collection to obtain input values and transform output values into an actual action. Automated Trading Systems, also known as ATS, are a combination of hardware and software that, by using trading algorithms, manage orders and positions within a stock portfolio on the basis of real-time data feeds and historical data stored in a database. The data input is usually a combination of factors such as the share price, volume, number of trades and technical indicators, and even news events can serve as an input value for the more advanced learning algorithms (van Vliet, 2007). It follows that the Automated Trading System autonomously creates orders based on its input values and places these on the exchange, all within milliseconds and in competition with human investors (van Vliet, 2007). Hence it can be argued that an ATS is to a trading algorithm what a physical coke vending machine is to a coke vending algorithm.
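The vending machine rules can be sketched in Python as follows. This is only an illustration of a fixed rule set that maps input to output; the function name and the returned labels are hypothetical, and it assumes, as the description of the example states, that an overpayment both drops the can and returns the change.

```python
from decimal import Decimal

PRICE = Decimal("1.00")  # price of one can, taken from the example


def vending_machine(coins, aborted=False):
    """Apply the fixed rules to the inserted coins.

    Returns a tuple (action, amount_returned). The labels are illustrative.
    """
    total = sum(coins, Decimal("0"))
    if aborted:
        # The customer cancelled: give back everything that was inserted.
        return ("RETURN COINS", total)
    if total < PRICE:
        return ("SHOW MESSAGE: Insufficient Amount", Decimal("0"))
    # total >= PRICE: drop the can and return any change.
    return ("DROP CAN", total - PRICE)
```

For example, `vending_machine([Decimal("1.00"), Decimal("0.50")])` drops a can and returns 0.50 in change. The same inputs always produce the same output, which is the property the text attributes to trading algorithms.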

To construct an ATS one has to be familiar with computer science, quantitative finance, trading strategy and quality management. As “data is the lifeblood of electronic markets”, the basis of an ATS lies in the underlying data, which can be managed using Microsoft Visual C++ or .NET applications. Technological superiority through ATS can offer an enormous advantage over competitors, but still does not imply profitability (van Vliet, 2007). Leshik and Cralle (2011) consider the most popular and widely used algorithms to be: Volume Weighted Average Price (VWAP), Time Weighted Average Price (TWAP), Percentage of Volume (POV), Search for Liquidity (Black Lance), Stay Parallel with the Market (The PEG), Large Order Hiding (Iceberg), Pair Trading Strategy, Leshik-Cralle, Recursive, Serial, Parallel and Iterative. Izumi, Toriumi & Matsui (2009), in contrast, evaluated a distinct set of automated trading strategies. Izumi et al. compare the risk and return of all strategies within their sample set and conclude that the strategies provide better information than conventional methods. Moreover, their research showed that the impact of automated trading strategies on markets does not depend merely on their code; the way the strategies are combined and influence each other can impact the market even more. The common factor amongst almost all popular trading algorithms seems to lie in technical analysis, as they largely rely on technical indicators such as the moving average and the relative strength index to create the buy or sell decision. Technical analysis pertains to predicting future stock prices by studying past stock price performance and several other trading statistics, such as trading volume and number of trades (Brock, Lakonishok & LeBaron, 1992).

Technical analysis is often considered non-scientific due to its non-fundamental nature; nonetheless, a survey study by Menkhoff (2010) shows that the vast majority of fund managers rely on technical analysis. Additionally, Bessembinder & Chan (1997) demonstrate that even rather simple technical analysis holds statistically significant forecasting power within financial markets. Technical analysis is more related to psychology than to fundamentals, and the more widely inductive technical analysis is used, the more it reinforces its own predictive power, almost like a self-fulfilling prophecy.

Figure 4 displays the risk and return outcomes of the automated trading strategy agents tested by Izumi et al. (2009, p. 3474), partly to illustrate some available strategies other than the ones mentioned by Leshik & Cralle (2011). The results were obtained by backtesting on several stock markets. For these trading strategies to work, several parameters for the input variables can be used; it is essential that the parameters take on values that reflect the price level, the fundamental information on the firm and the economic conditions, and adaptive agents are preferable. The parameters and code as used by Izumi et al. (2009) can be found in Appendix B. Moreover, it can be derived from the parameters that actual trading algorithms are very similar to the coke vending machine example algorithm illustrated above. For most of these algorithms, technical indicators based on price or volume information, such as moving averages or upper and lower bands, are used as input values.
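To make the idea of an indicator-based rule concrete, the following minimal Python sketch derives a buy or sell decision from two simple moving averages. It is not one of the algorithms listed by Leshik and Cralle (2011) or tested by Izumi et al. (2009); the function names and the window lengths are assumptions chosen purely for illustration.

```python
def simple_moving_average(prices, window):
    """Mean of the last `window` prices, or None if too few prices are available."""
    if window <= 0 or len(prices) < window:
        return None
    return sum(prices[-window:]) / window


def moving_average_signal(prices, short_window=3, long_window=5):
    """Compare a short and a long moving average and return BUY, SELL or HOLD.

    The window lengths are illustrative assumptions, not recommended values.
    """
    short_ma = simple_moving_average(prices, short_window)
    long_ma = simple_moving_average(prices, long_window)
    if short_ma is None or long_ma is None:
        return "HOLD"  # not enough history to evaluate the rule
    if short_ma > long_ma:
        return "BUY"   # recent prices are above the longer-run average
    if short_ma < long_ma:
        return "SELL"  # recent prices are below the longer-run average
    return "HOLD"
```

Like the vending machine example, the rule only looks at its inputs (here past prices) and maps them to one of a fixed set of outputs.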

Figure 4. Standard deviations versus Returns of ATS. Reprinted from “Evaluation of automated-trading strategies using an artificial market.” By K. Izumi, F. Toriumi & H. Matsui, 2009, 72(16), 3474.

ATS are not limited to price and volume information or technical indicators as input values. The algorithms can be integrated with machine learning to automatically read news feeds and turn these into input values for the algorithm. According to Nuij et al. (2014), automating the incorporation of news feeds into stock trading strategies can boost the returns of individual technical indicators compared to those without the incorporation of news messages. By extracting an event from a news feed text and pairing it with an impact based on historic stock price deviations for that specific event, this news variable can be used in addition to existing technical indicators.

Subsequently, the rules that are created from news events can be replaced within the trading algorithm by mutated, improved versions of these rules that have led to higher returns. Such automatic reprogramming on the basis of previous return outcomes is one example of how machine learning can be implemented in ATS.

Predictability & Biases in Behavioral Finance

Algorithmic trading is connected to behavioral finance in the sense that algorithms are often programmed to trade on investor biases that exist because of individual or group behavior. The technical indicators incorporated in trading algorithms function through behavioral finance. Therefore, it could even be argued that technical economic indicators are actually socio-economic indicators. Behavioral finance often contradicts the efficient market theory by suggesting that stock prices are to a certain extent predictable, because psychological and social factors cause inefficiencies on the stock market (Shiller, 2003).

There is a polarity in human behavior that is reflected in how stocks oscillate between up and down trends, similar to the state of mind and mood of a human or a group of humans. All forms of emotion seem to exert forces on the stock market in one way or another. To name an example, even reaching new physical heights in the form of a tall building reverberates on the stock market by leaving a peak in the graph followed by a fall. The Dubai stock market rose significantly after the completion of the Burj Khalifa, the world’s tallest building (Mitroi, 2014). Moreover, there are recurring patterns for some financial anomalies, such as the day-of-the-week effect, which are not yet understood. Evidence seems to suggest that these anomalies happen because of mass psychology (Shiller, 2003).

Vasiliou, Eriotis & Papathanasiou (2008) mention that moving averages stress where a trend is headed and flatten out fluctuations caused by the noise of irrational investors also known as noise traders. Additionally, Vasiliou et al. find that the utility of the technical trading rules used in their research improved over time.

Market Efficiency and Predictability

Litzenberger, Castura & Gorelick (2012) state that market quality has improved in the past decades. A clear cause for this trend is increased competition through more automation and high frequency trading in the market, which leads to smaller bid-ask spreads and improved liquidity. This improved liquidity causes the orders in limit order books to be executed at a faster pace. Moreover, when relating market quality to algorithmic trading, Lyle, Naughton and Weller (2015) discovered that algorithmic trading strategies which provide liquidity, such as market making strategies, increase market quality, whereas liquidity taking, non-market maker algorithmic trading activity harms market quality. Bouchaud, Farmer & Lillo (2008) conclude that market prices remain close to perfectly unpredictable in the short run, for two reasons. Firstly, outstanding liquidity is always small, meaning that prices do not immediately mirror all information available to the market. Secondly, on electronic markets it is not possible to distinguish informed from uninformed trades, so all trades have the same impact. It follows that all informative aspects of a trade should be internal to the market, meaning that trades, order flow and cancellations carry information.

Beja and Goldman (1980) rightfully state that a market constructed by humans cannot possibly be so mechanically perfect and efficient that all information would be integrated directly in the prices before it can be observed. This implies that price anomalies will always be present, leaving room for predictability. Moreover, Pesaran (2003) reinforces predictability by stating that “A large number of studies in the finance literature have confirmed that stock returns can be predicted to some degree by means of interest rates, dividend yields and a variety of macroeconomic variables exhibiting clear business cycle variations.” According to Pesaran, market efficiency should be distinguished from predictability.

Methodology

Data Collection & Processing

Most of the data and queries used for this research have been obtained through Wharton Research Data Services (WRDS), the database and query tool of the Wharton School of the University of Pennsylvania. In this research, three different datasets from the WRDS database are used, named: CRSP – Daily Stock, IBES – Price Target and Federal Reserve Bank – Interest Rates. These sub-datasets are elaborated on in the following sections and are eventually merged before the hypotheses are tested. Further details on the datasets can be obtained from Table A1, where all query extraction specifications are denoted.

The chosen data period from 1999 to 2017 is a trade-off between covering a period as extensive as possible and keeping the data editable within Stata using the limited computing power that the research has at its disposal. Moreover, since IBES data is only available from 1999 onwards, this automatically marks the start of the period. Furthermore, it can be argued using Glantz & Kissel’s (2013, p. 258) Figure 1 that algorithmic trading before 1999 amounted to such a small percentage of market volume that it is not of critical value in answering the research question.

Additionally, only NASDAQ and NYSE equity price data is used, as the U.S.-based stock exchanges were the first to establish facilities to support the development of algorithmic trading. Consequently, high frequency trading gained volume share in the U.S. more rapidly than in Europe, as shown in Figure 5 (Kaya, 2016, p. 2). Given these arguments and considering the limited computing power, U.S. data follows as the more suitable choice for studying algorithmic trading.

Figure 5. % Share of High Frequency Trading in total equity trading per year. Reprinted from “High-frequency trading: reaching the limits.” By O. Kaya, 2016, 2. Copyright by Deutsche Bank Research.

CRSP – Daily Stock

First of all, the daily prices and trading data, such as the daily number of trades and daily volume, are extracted from the CRSP U.S. Stock database within WRDS. This CRSP query functions as the master dataset within the Stata environment and contains end-of-day prices for equity securities on the NYSE and NASDAQ exchanges. CRSP also contains quote data, holding period returns, shares outstanding and trading volume information. Initially the entire database is extracted for the period from 1999 to 2017, containing over 34 million observations. To start, only common stock observations are retained, to improve the post-merger data compatibility with the IBES Price Target dataset. For common stock the share code variable equals either 10 or 11, hence only these share codes are kept within the sample. Moreover, tickers with multiple share classes are dropped, as those are not properly comparable to the IBES identifiers, which will be elaborated on later.

Additionally, a .TXT file consisting of the remaining company ticker identifiers is derived from the dataset within Stata in order to simplify the extraction of successive queries within WRDS: only information on those predetermined companies is retrieved from WRDS, which reduces the file size. Within the daily stock price query the actual price, bid, ask and shares outstanding are adjusted using the so-called adjustment factors in order to make these variables comparable over the entire 1999-2017 period. These adjustment factors are constructed by CRSP and adjust for corporate actions such as stock splits, dividends and rights offerings. Additionally, an effective spread variable is created, similarly to Hendershott et al. (2011), as the difference between the midpoint of the closing bid and ask and the actual transaction price of that day. A volatility variable is calculated as the difference between the daily high and the daily low.

IBES – Price Target

IBES, also known as the Institutional Brokers’ Estimate System, is a Thomson Reuters database that holds historical analyst estimates for more than twenty forecast measures, such as earnings per share, revenue, price targets, buy-hold-sell recommendations and gross profits, for over 60,000 companies. After the price target estimates, including their horizon and analyst name, were extracted from WRDS for the same 1999-2017 period as used before, it was found that the IBES data could not be merged directly with the CRSP data. IBES contains two ticker variables, and only the official ticker variable, “oftic”, is compatible with the ticker variable in CRSP; it should not be confused with the variable “ticker” in the IBES dataset. Hence, “oftic” is renamed to its CRSP name: ticker.

Additionally, it must be mentioned that the so-called “announcement date” in IBES is used as the leading date. Finally, price target estimates are matched with their respective future actual price by lagging each forecast by its horizon, meaning that an estimate with a horizon of 6 months is lagged by 6 months so that it lines up with the actual price on its target date.
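The matching of a price target with the later actual price can be sketched in Python as below. This is an illustration only, not the Stata procedure used in the study: the function names and the dictionary of prices are hypothetical, and the handling of target dates that fall on non-trading days is an assumption (the sketch simply returns no price for such dates).

```python
import calendar
from datetime import date


def add_months(start, months):
    """Shift a date forward by a number of months, clipping to the month end."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def match_forecast(announcement_date, horizon_months, prices_by_date):
    """Return the target date of a forecast and the actual price on that date.

    `prices_by_date` is a hypothetical mapping from date to adjusted price;
    the actual price is None when no price is recorded on the target date.
    """
    target_date = add_months(announcement_date, horizon_months)
    return target_date, prices_by_date.get(target_date)
```

A price target announced on 15 January 2016 with a 6 month horizon would thus be compared with the adjusted price on 15 July 2016.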

Federal Reserve Bank – Interest Rates

The WRDS RATES database used in this research is based upon the Federal Reserve Board’s H.15 release, which contains selected interest rates for U.S. Treasuries and private money market and capital market instruments. Daily rates are reported per business day and in annual terms. To include interest rates as a control variable in the regressions, the rates of U.S. Treasury bills with a maturity of 3 months are extracted from the WRDS RATES database for the period 1995 to 2017. The rates are merged with the master dataset using date as the common variable.

Data Analysis Methodology

To shed light on the automation process that entails the shift from human traders to automated trading systems, analyst predictions and their accuracy will be elaborated on in relation to algorithmic trading. First, however, the focus is on how algorithmic trading is measured and how dispersion has changed through algorithmic trading. Moreover, all independent variables that are used in the regressions are standardized to facilitate economic interpretation. Standardization is performed by subtracting the corresponding time series’ mean from the variable and dividing this deviation by the time series’ standard deviation.

By standardizing all independent variables in this fashion, each standardized regression coefficient represents the change in the dependent variable associated with a one standard deviation change in the independent variable. Hence, independent variable X is standardized such that:

$$ X'_{tj} = \frac{X_{tj} - \mu(X)}{\sigma(X)} \tag{1} $$
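As a small illustration of this standardization, the following Python sketch converts a series into standard deviation units. The function name is hypothetical, and the use of the sample standard deviation is an assumption, as the text does not state which variant was applied.

```python
import statistics


def standardize(values):
    """Subtract the series mean and divide by the series standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(value - mean) / sd for value in values]
```

After this transformation a value of 1 means that the observation lies one standard deviation above the mean of its series, which is what allows the regression coefficients to be read in standard deviation units.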

Algorithmic Trading Measure

As a preparatory step, a proxy has been developed to measure the development of algorithmic trading over time within the available CRSP data. To quantify algorithmic trading in a variable, Hendershott, Jones & Menkveld (2011) and Boehmer, Fong & Wu (2015) use the daily number of electronic messages from the TAQ database per $100 of trading volume as a proxy. It is the most established measure within academic research; however, the TAQ database is not at this research’s disposal and hence an inferior but comparable proxy is created. The inferiority lies in the fact that electronic message traffic information is not available in CRSP. However, as volume data is available, the best alternative is a proxy that replaces the number of electronic messages with a comparable variable. Our data shows that volume did not increase over time, while the number of trades did, in a way comparable to the electronic messages used in the proxy of Hendershott et al. (2011), making the number of trades a simplified but functional replacement within our proxy for algorithmic trading. Moreover, algorithmic trading is associated with improved liquidity and an increased number of trades with smaller volume per trade (Hendershott et al., 2011).

Hence the new proxy for algorithmic trading is calculated as the daily number of trades executed for ticker j per dollar trading volume of that day derived from the CRSP database.

$$ \text{Algorithmic Trading}_{tj} = \frac{\text{number of trades}_{tj}}{\text{volume}_{tj}} \tag{2} $$

Although it is a much noisier proxy, it gives a very similar representation of the development of algorithmic trading over time to the one established by Glantz & Kissel (2013), as can be seen in Figures 1 and 2.
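The proxy itself is a simple ratio, as the following Python sketch shows. The function name is hypothetical, and treating days without volume as undefined is an assumption, since the text does not describe how such days are handled.

```python
def algorithmic_trading_proxy(number_of_trades, volume):
    """Daily number of trades divided by daily trading volume for one ticker."""
    if number_of_trades is None or volume is None or volume <= 0:
        return None  # proxy is undefined without trading volume (assumption)
    return number_of_trades / volume
```

For instance, 200 trades on a day with a volume of 100,000 give a proxy value of 0.002; the same volume split over 400 smaller trades doubles the proxy, in line with the idea that algorithmic trading produces more and smaller trades.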

Effects of Algorithmic Trading on Dispersion

It is assumed that algorithms have more similarities than their human counterparts, and for this reason dispersion is expected to decrease with more algorithmic trading. As flash crashes are known to happen with algorithmic trading (Johnson et al., 2012), extreme short-term dispersion might have increased instead. However, considering that this study is only able to use daily data, flash crashes are not expected to influence the results. Hence, the hypotheses are formulated as:

H0: Dispersion does not change with increased algorithmic trading

H1: Dispersion changes with increased algorithmic trading

Idiosyncratic or stock-specific volatility is used to measure dispersion. Idiosyncratic risk can be calculated in numerous ways; the various measures, however, all give comparable results (Malkiel & Xu, 2003). Moreover, according to Bello (2008), there are no significant differences in outcome between the Capital Asset Pricing Model, the Fama French Three Factor Model and the Carhart Model. Hence, in this study the CAPM is used to calculate idiosyncratic volatility, as this suits the dataset best. The CAPM formula used is as follows:

$$ R_{tj} - Rf_t = \alpha_j + \beta_j (Rm_t - Rf_t) + \varepsilon_{tj} \tag{3} $$

Where: $R_{tj}$ is the return of stock j, $Rf_t$ is the risk-free rate, $Rm_t$ is the return of the market portfolio and $\varepsilon_{tj}$ is the error term of returns (i.e. idiosyncratic or company specific risk). First of all, two new variables are created to simplify the alpha and beta estimation process within Stata, namely: $ERS_{tj} = R_{tj} - Rf_t$ and $ERM_t = Rm_t - Rf_t$. These are then applied in a simple OLS regression to estimate alpha and beta per ticker over the entire period. Almost 9500 regressions similar to (4) below are performed using a loop function in Stata, after which the results are saved in the variables α and β.

$$ Y_{ERS} = \alpha_j + \beta_j \cdot ERM_t \tag{4} $$

Once alpha and beta are estimated, $\varepsilon_{tj}$ is calculated as:

$$ \varepsilon_{tj} = ERS_{tj} - \alpha_j - \beta_j \cdot ERM_t \tag{5} $$

It follows that idiosyncratic volatility, and thus dispersion, is the monthly standard deviation of the error term, as displayed below:

$$ \text{Idiosyncratic Volatility}_{t(m)j} = \sigma_{t(m)}(\varepsilon_{tj}) \tag{6} $$
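For a single ticker, the CAPM regression and the monthly standard deviation of its residuals can be sketched in Python as follows. This is an illustration and not the Stata loop used in the study; the function names and the month labels are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics
from collections import defaultdict


def capm_alpha_beta(ers, erm):
    """OLS of stock excess returns (ERS) on market excess returns (ERM)."""
    mean_x = statistics.fmean(erm)
    mean_y = statistics.fmean(ers)
    sxx = sum((x - mean_x) ** 2 for x in erm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(erm, ers))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def idiosyncratic_volatility(months, ers, erm):
    """Monthly standard deviation of the CAPM residuals of one ticker.

    `months` holds one label per daily observation, e.g. "2010-01".
    """
    alpha, beta = capm_alpha_beta(ers, erm)
    residuals_by_month = defaultdict(list)
    for month, y, x in zip(months, ers, erm):
        residuals_by_month[month].append(y - alpha - beta * x)
    # A standard deviation needs at least two observations per month.
    return {
        month: statistics.stdev(residuals)
        for month, residuals in residuals_by_month.items()
        if len(residuals) > 1
    }
```

Alpha and beta are estimated once over the whole sample of the ticker, and only the residuals are grouped by month, mirroring the two-step procedure described in the text.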

Finally, idiosyncratic volatility, hereafter called dispersion, is regressed on the algorithmic trading measure, in line with the hypotheses, to analyze whether return dispersion has changed through an increase in algorithmic trading. The model is also estimated while controlling for firm fixed effects and year fixed effects, as it is clear from Figure 3 that dispersion varies considerably between years, in particular in years of financial crisis.

Fixed effects are used instead of random effects because the Hausman test for random effects versus fixed effects is significant at the 0.1% level for regression (7), meaning that the unique errors $\varepsilon_{tj}$ are correlated with the regressors; hence fixed effects panel data regressions are used to analyze dispersion. In regressions (8) and (9), firm fixed effects and year fixed effects are added respectively, to see if and how firm and year specific effects influence our model. Comparing the results of regressions (7) and (8) shows the effect of firm specific effects, whereas the comparison of (8) and (9) displays the influence of year fixed effects.

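$$Y^{\text{Idiosyncratic Volatility}}_{tj} = \beta_0 + \beta_1 \cdot \text{Algorithmic Trading}_{tj} + \varepsilon_{tj} \qquad (7)$$

$$Y^{\text{Idiosyncratic Volatility}}_{tj} = \beta_0 + \beta_1 \cdot \text{Algorithmic Trading}_{tj} + a_j + \varepsilon_{tj} \qquad (8)$$

$$Y^{\text{Idiosyncratic Volatility}}_{tj} = \beta_0 + \beta_1 \cdot \text{Algorithmic Trading}_{tj} + a_j + \gamma_t + \varepsilon_{tj} \qquad (9)$$

where $a_j$ denotes firm fixed effects and $\gamma_t$ denotes year fixed effects.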
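To make the difference between regressions (7), (8) and (9) concrete, the following sketch computes the slope on algorithmic trading three ways: pooled, after demeaning by firm, and after demeaning by firm and year. It is a simplified illustration rather than the Stata estimation used here: it returns point estimates only (no standard errors), and the sequential demeaning shortcut is exact only for a balanced panel, which the toy data is assumed to be. All names and numbers are hypothetical.

```python
from collections import defaultdict


def slope(x, y):
    """OLS slope of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def demean(keys, values):
    """Subtract from each value the mean of its group (group given by keys)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for k, v in zip(keys, values):
        totals[k] += v
        counts[k] += 1
    return [v - totals[k] / counts[k] for k, v in zip(keys, values)]


def fe_slopes(panel):
    """panel: list of (firm, year, algo_trading, dispersion); assumed balanced."""
    firms = [p[0] for p in panel]
    years = [p[1] for p in panel]
    x = [p[2] for p in panel]
    y = [p[3] for p in panel]
    pooled = slope(x, y)                          # analogue of regression (7)
    xf, yf = demean(firms, x), demean(firms, y)
    firm_fe = slope(xf, yf)                       # analogue of regression (8)
    # In a balanced panel, demeaning the firm-demeaned data by year
    # yields the two-way within transformation.
    xt, yt = demean(years, xf), demean(years, yf)
    two_way = slope(xt, yt)                       # analogue of regression (9)
    return pooled, firm_fe, two_way


# Toy balanced panel: dispersion = firm effect + year effect - 0.4 * algo_trading
firm_effect = {"A": 0.5, "B": 1.5, "C": -1.0}
year_effect = {2008: 2.0, 2009: 0.3, 2010: 0.0}
algo = {
    ("A", 2008): 0.10, ("A", 2009): 0.20, ("A", 2010): 0.40,
    ("B", 2008): 0.15, ("B", 2009): 0.35, ("B", 2010): 0.30,
    ("C", 2008): 0.05, ("C", 2009): 0.10, ("C", 2010): 0.50,
}
panel = [
    (f, t, x, firm_effect[f] + year_effect[t] - 0.4 * x)
    for (f, t), x in algo.items()
]
pooled, firm_fe, two_way = fe_slopes(panel)
```

In this constructed example only the two-way estimate recovers the slope built into the data, because the pooled and firm-only versions still absorb part of the large 2008 year effect. This is the same kind of comparison that regressions (7) to (9) are meant to allow.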

Effects of Algorithmic Trading on Analyst Forecast Accuracy

To analyze the prediction accuracy of the remaining human analysts in the market, historical Thomson Reuters analyst estimations from the IBES dataset are used to compute the prediction error of each forecast. The prediction error of an estimation by analyst i for stock j is the difference between the estimation value at time t and the adjusted price on date t, divided by the adjusted price on that date. The prediction error is then squared to emphasize the analysts whose forecasts were furthest off, whether below or above. As the squared prediction error is never negative, it focuses on the size of the deviation alone; the direction of the deviation is not of concern.

$$\text{Prediction Error}_{t,i,j} = \left( \frac{\text{estimation value}_{t,i,j} - \text{adjusted price}_{t,j}}{\text{adjusted price}_{t,j}} \right)^2$$
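A small sketch of this formula is given below. The function name and the example prices are hypothetical; the calculation itself follows the definition above.

```python
def squared_prediction_error(estimation_value, adjusted_price):
    """Squared relative deviation of an analyst estimation from the adjusted price."""
    if adjusted_price == 0:
        raise ValueError("adjusted price must be non-zero")
    return ((estimation_value - adjusted_price) / adjusted_price) ** 2


# An estimation 10% above and one 10% below the adjusted price give the same error
too_high = squared_prediction_error(110.0, 100.0)
too_low = squared_prediction_error(90.0, 100.0)
```

Both calls return roughly 0.01, which illustrates that squaring removes the direction of the deviation while weighting large misses more heavily than small ones.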

Subsequently, the analyst prediction error variable is tested using regression analysis in Stata to see whether analysts’ predictions have become statistically more accurate since the development of automation in stock markets. The dataset is an unbalanced three-dimensional panel in which stock ticker, date and analyst name represent the dimensions: for every ticker there are different numbers of analyst estimations on varying dates. The “missing” data arises because analysts specialize in specific stocks and because estimations are issued on irregular dates; there is, however, no actual missing data.

The ticker and analyst variables are combined into a new variable called tic_alys, in which each group represents the forecasts by analyst i for ticker j. This procedure removes the need to drop the third dimension in order to run a multi-dimensional fixed effects panel data regression in Stata. These dimensions are only combined for regressions (14) and (16), where firm and analyst fixed effects are included jointly. To answer the research question, the following hypotheses are developed:

H0: Analysts’ prediction error is not influenced by increased algorithmic trading

H1: Analysts’ prediction error is influenced by increased algorithmic trading
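Before turning to the regressions, the combined tic_alys identifier introduced above can be illustrated with a short sketch. It assigns one integer id to each distinct (ticker, analyst) pair; the record layout, the function name and the sample analysts are assumptions for illustration and do not reproduce the Stata commands used in this study.

```python
def add_tic_alys(forecasts):
    """forecasts: list of dicts with 'ticker' and 'analyst' keys (hypothetical layout).

    Returns new records with a 'tic_alys' id shared by all forecasts
    of the same analyst for the same ticker.
    """
    ids = {}
    out = []
    for record in forecasts:
        key = (record["ticker"], record["analyst"])
        group_id = ids.setdefault(key, len(ids) + 1)  # new pair gets the next id
        out.append({**record, "tic_alys": group_id})
    return out


# Toy forecasts, invented for illustration only
forecasts = add_tic_alys([
    {"ticker": "AAA", "analyst": "Smith", "date": "2010-01-04"},
    {"ticker": "AAA", "analyst": "Jones", "date": "2010-01-04"},
    {"ticker": "AAA", "analyst": "Smith", "date": "2010-03-15"},
    {"ticker": "BBB", "analyst": "Smith", "date": "2010-02-01"},
])
```

With this identifier the panel has two dimensions again (tic_alys and date), so a standard fixed effects routine can treat each ticker-analyst pair as one unit.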

These hypotheses lead to the regressions below, for which analyst prediction error is expected to have increased in the period in which automation has taken place. It seems unlikely that analysts can predict the direction of future stock prices, as they would have to be able to execute transactions faster than the algorithms.

However, it is hard to form a definite hypothesis, as algorithmic trading probably also leads to less dispersion, which could facilitate analyst predictions. For this reason, the hypothesis is two-sided. Time t is in date format and measured per day. Testing analyst prediction error against algorithmic trading is the most direct way of examining the effects that algorithmic trading has on analyst forecast accuracy. As many other factors potentially affect forecast accuracy, sufficient control variables are added and fixed or random effects are controlled for. To determine whether the regressions should control for fixed or random effects, the Hausman test is used again. Testing for random versus fixed effects again gives a significant outcome at the 99.99% confidence level, and hence the null hypothesis of the Hausman test is rejected, meaning that fixed effects need to be applied in the panel data regressions.
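For reference, the Hausman statistic in its standard textbook form is shown below. The symbols $\hat{b}_{FE}$, $\hat{b}_{RE}$ and $k$ are notation introduced here for illustration and do not appear elsewhere in this paper.

```latex
H = (\hat{b}_{FE} - \hat{b}_{RE})'
    \left[ \widehat{\mathrm{Var}}(\hat{b}_{FE}) - \widehat{\mathrm{Var}}(\hat{b}_{RE}) \right]^{-1}
    (\hat{b}_{FE} - \hat{b}_{RE})
    \;\sim\; \chi^2_k
```

Here $\hat{b}_{FE}$ and $\hat{b}_{RE}$ are the fixed effects and random effects coefficient estimates and $k$ is the number of coefficients compared. Under the null hypothesis that the unit-specific effects are uncorrelated with the regressors, both estimators are consistent and $H$ is small; a large $H$ rejects the null and favors fixed effects.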

It follows that six different panel data regressions are tested in Stata to determine how prediction error is influenced. The first regression model is a plain panel regression that merely tests the effect of algorithmic trading on analyst prediction error, whereas the remaining five are fixed effects panel data regressions that each control for a certain fixed effect. Regression (11) is the plain panel data regression; firm fixed effects are then added in (12) to see how firm-specific effects change the regression output compared to the plain model. Thirdly, year fixed effects are added in regression (13), using year dummies to control for a time trend, and comparing regression (13) with (12) should give insight into the effect of time on the dependent variable. Next, analyst fixed effects are controlled for in regression (14), and again, by adding only this factor to the model, it should become clear whether and how the model is influenced by analyst-specific properties. Comparing the outcomes of the four regressions should make clear whether, how and which fixed effects affect prediction error. The first four regressions amount to: