Regressions
- 06:28
Understand how regression analysis can be used to explain the relationship between variables
Transcript
Regressions. Regressions are among the most commonly used predictive models in statistics, and we use them to describe and understand the relationship between two variables. We often call the two variables an independent variable and a dependent variable, and we use historical data to measure their relationship.
Now, the independent variable can be thought of as an input or a cause. The most common example in finance is the broad market, with the S&P 500 as a typical measure, but you may also see other variables like GDP and CapEx. The dependent variable is the one we're trying to predict using the historical relationship data that we have. The most common example is the price of a security, but regressions can also be used to predict earnings or sales data.
To give you one more example, imagine that a company wants to understand how past marketing expenditures have related to sales in order to make future decisions on how much to spend on advertising and marketing. The dependent variable in this case is sales, and the independent variable is advertising spending. We want to know how much advertising spending relates to sales.
Now here's a typical regression graph. In this case we graph Apple stock (AAPL) versus the S&P 500, with Apple stock being the dependent variable and the S&P 500 being the independent variable. So we're trying to determine how much of the broad market movement describes Apple's stock movement. The red line we see here is known as the best-fit line in the scatter plot. The line is calculated by minimizing the total squared vertical distance between itself and the data points. You can also see that the line can be described as a formula, which makes it very easy to use in financial models.
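To make the best-fit line concrete, here is a minimal Python sketch of how such a line can be computed, assuming an ordinary least-squares fit. The return figures and the function names `fit_line` and `sum_squared_errors` are hypothetical and only for illustration; they are not the data behind the Apple chart.

```python
# Hypothetical periodic returns in percent; illustrative numbers, not real market data.
market_returns = [1.0, -0.5, 2.0, 0.3, -1.2, 0.8]   # independent variable (e.g. an index)
stock_returns = [1.4, -0.2, 2.1, 1.0, -2.0, 0.9]    # dependent variable (e.g. a stock)


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Co-movement of x and y, and the spread of x, both measured around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(x, y, slope, intercept):
    """Total squared vertical distance between the line and the data points."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


slope, intercept = fit_line(market_returns, stock_returns)
best_sse = sum_squared_errors(market_returns, stock_returns, slope, intercept)

print(f"best-fit line: y = {intercept:.3f} + {slope:.3f} * x")
print(f"sum of squared errors: {best_sse:.3f}")
```

Nudging the slope or intercept away from the fitted values and recomputing `sum_squared_errors` gives a larger total, which is what "best fit" means here.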
Now let's dig into that formula a little bit. Again, we commonly use regressions in finance to describe the returns of a specific security and compare them to the returns of a factor or the broad market. Here's the formula that we just saw for Apple stock versus the S&P 500, and here are its components. The Y in the formula is the return on Apple stock, the return that we're trying to predict. To predict that dependent variable, we look at the returns of an independent variable, in this case the S&P 500. The slope of the line, known as beta, is the level of movement in the return of a given security for each unit of movement in the market in general. So in this case, how much will Apple stock move for every one percent change in the S&P 500? The last term is the intercept of the line, and it's known as alpha. The alpha term in the regression represents the security's propensity to move independently of the market.
Now, in addition to a graphical output of a regression, Excel and other statistical software will also give you a summary output of the regression analysis like we're seeing here. A couple of key data points to highlight. First of all, R squared, also known as the coefficient of determination, measures how good a fit our regression is, that is, how well the independent variable explains the dependent variable. A perfect predictive model would have a value of one. Here we have 0.19, which is not very high, and the lower the number, the less predictive power the regression has. So this would lead us to believe that there are other factors affecting Apple stock outside of the broad market movement, which seems logical. Next we have the beta value, which is shown as the S&P 500 coefficient, and the alpha value, which is the intercept.
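The on-screen formula is not captured in the transcript, so the following is only a sketch of the standard single-factor form that matches the description above. The subscript t (the observation period), the error term epsilon, and the hats marking fitted values are notation assumed here, not taken from the video.

```latex
% Single-factor regression: stock return explained by the market return
R_{\text{AAPL},t} = \alpha + \beta \, R_{\text{S\&P},t} + \varepsilon_t

% R squared: share of the variation in the stock's return explained by the fitted line
R^2 = 1 - \frac{\sum_t \left( R_{\text{AAPL},t} - \hat{\alpha} - \hat{\beta} \, R_{\text{S\&P},t} \right)^2}
               {\sum_t \left( R_{\text{AAPL},t} - \bar{R}_{\text{AAPL}} \right)^2}
```

Read this way, an R squared of 0.19 says that roughly 19 percent of the variation in the stock's returns is accounted for by the market term, with the rest left in the error term.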
And lastly, we have the standard error value, which is similar to R squared in the sense that it tells us how good a predictive model our current regression is. The larger the sample set, the lower the standard error should be, which would indicate a more robust predictive model.
Now, some other key points about regressions. Usually more than one variable influences a dependent variable, and we just saw that in the Apple output, where the R squared was fairly low, indicating that there are other variables influencing the stock. So to get a better understanding of what's influencing a dependent variable like a stock price, we can run what we call a multiple regression. And here's your typical formula for a multiple regression. In fact, many factor-based funds or strategies use multiple variables and not just beta, or the market impact. Some examples are size, momentum, and valuation, which are also incorporated into the analysis.
Also, regression models assume a linear relationship, so a straight-line relationship similar to what we saw previously. Security returns are usually modeled assuming a straight-line relationship, but that's not always the case, and it could affect your predictive model.
Another limitation of regression models is known as parameter instability. In regressions, we're using historical data to determine the relationship between two variables. Those relationships could change over time, and if they do, the predictive ability of your regression model is greatly affected. One example: many currently argue that retail companies are going through what we call a secular shift, so the relationship of their stocks to the broader market seen in historical data is no longer relevant now that we're seeing a larger shift away from brick-and-mortar retail to online retail.
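As a rough illustration of the multiple regression mentioned above, the sketch below fits a stock's returns against two factors instead of one, assuming ordinary least squares solved through the normal equations. The factor data, the "size" factor values, and the helper names (`solve_linear_system`, `fit_multiple`, `r_squared`) are all hypothetical, and the code assumes the factors are not perfectly collinear.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple(rows, y):
    """Least-squares fit; returns [alpha, beta_1, ..., beta_k] for k factors."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, y, coefs):
    """Share of the variation in y explained by the fitted model."""
    preds = [coefs[0] + sum(b * x for b, x in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations: (market return, size-factor return) per period, in percent.
factor_rows = [(1.0, 0.2), (-0.5, 0.4), (2.0, -0.3), (0.3, 0.1), (-1.2, -0.6), (0.8, 0.5)]
stock_returns = [1.4, -0.2, 2.1, 1.0, -2.0, 0.9]

market_only = [(m,) for m, _ in factor_rows]
coefs_market = fit_multiple(market_only, stock_returns)
coefs_multi = fit_multiple(factor_rows, stock_returns)

r2_market = r_squared(market_only, stock_returns, coefs_market)
r2_multi = r_squared(factor_rows, stock_returns, coefs_multi)

print("market only  [alpha, beta]:", coefs_market, "R squared:", r2_market)
print("two factors  [alpha, beta_market, beta_size]:", coefs_multi, "R squared:", r2_multi)
```

On the same sample, adding a factor can never lower R squared, so a higher in-sample value alone does not prove the extra factor matters. Refitting on different historical windows and comparing the coefficients is one simple way to look for the parameter instability described above.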