Constructing and Testing Alternative Versions of the Fama–French and Carhart Models in the UK
This paper constructs and tests alternative versions of the Fama-French and Carhart models for the UK market with the purpose of providing guidance for researchers interested in asset pricing and event studies. We conduct a comprehensive analysis of such models, forming risk factors using approaches advanced in the recent literature including value weighted factor components and various decompositions of the risk factors. We also test whether such factor models can at least explain the returns of large firms. We find that versions of the four-factor model using decomposed and value-weighted factor components are able to explain the cross-section of returns in large firms or in portfolios without extreme momentum exposures. However, we do not find that risk factors are consistently and reliably priced.
Keywords: asset pricing, multi-factor models, CAPM, Fama-French model, performance evaluation, event studies
The test portfolios and factors underlying this paper can be downloaded from:
Fama and French (2011) show that regional versions of asset pricing models provide “passable descriptions” of local average returns for portfolios formed on size and value sorts. In general, and specifically for Europe, such models provide better descriptions of returns than global models. Their results provide evidence that asset pricing is not integrated across regions. Whilst Fama and French (2011) are silent on the reasons for this, explanations may include differing exposures to macro-economic factors in smaller or more open economies, differing degrees of internationalisation in companies between countries, and (historically at least) differing accounting treatments affecting the measurement of book values, used to sort stocks on book-to-market ratios. If regional asset pricing models perform better than global models, then by extension we might expect country level models to out-perform regional level models. Griffin (2002) notes that country specific three-factor models explain the average stock returns better than either world models or international versions of the model and suggests that “cost-of-capital calculations, performance measurement and risk analysis using Fama and French-style models are best done on a within country basis”. Yet to date, there is little evidence to suggest that at a national level the Fama-French three-factor (FF) model adequately describes the cross-section of stock returns in the UK (Michou, Mouselli and Stark 2012, [hereafter MMS]; Fletcher and Kihanda, 2005; Fletcher, 2010).
From a practical point of view, firm managers require guidance on project-specific costs of capital for discounting purposes, and also need information on the cost of equity for financing decisions. In the context of UK utility pricing and competition policy, regulators need some model of “fair” rates of return. In addition, researchers interested in event studies, portfolio performance evaluation and market based accounting research are interested in models that adequately describe “normal” returns. Recent examples of such UK investigations that use either a three or four factor model include Agarwal, Taffler and Brown (2011), Dissanaike and Lim (2010), Gregory, Guermat and Al-Shawraweh (2010), Dedman et al (2009) and Gregory and
Whittaker (2007). The absence of evidence that there exists a reliable and robust model for the UK therefore leaves researchers and managers in a difficult position.
Given the above, we extend the search for an improved model that adequately describes the cross-section of returns in the UK in the following ways. We construct and test models using alternative specifications of the factors examined by MMS together with a momentum factor. The momentum factor we construct is the UK
equivalent of the UMD factor for the US1. Noting the Cremers, Petajisto and
Zitzewitz (2010, hereafter CPZ) critique, we construct the FF factors by value-weighting (rather than equally weighting) the individual component portfolios. We construct models using decomposed factors, along the lines of Zhang (2008), Fama and French (2011) and CPZ. We examine the APT factors identified in Clare, Priestley and Thomas (1997). Finally, we construct and test these alternative models from the sample of the largest 350 firms by market capitalisation, in an attempt to see if we can find a model that works at least for larger and more liquid firms.
We test these alternative factor models against portfolios formed by intersecting sorts on size and book-to-market (BTM), as in Fama and French (2011), and on portfolios formed using sequential sorts on size, BTM and momentum. However, both Lo and MacKinlay (1990) and Lewellen, Nagel and Shanken (2010) warn against relying on tests of a model on portfolios whose characteristics have been used to form the factors in the first place. Lewellen et al. (2010, p.182) suggest, inter alia, tests based on portfolios formed on either industries or volatility. MMS follow this advice by testing on industry portfolios, showing that only the HML factor appears to be priced when tested against this more demanding set of portfolios. In this paper, we follow the Lewellen et al. (2010) suggestion of testing on volatility. We do this partly to extend the range of test portfolios used in the UK, given that MMS test against industry portfolios, and partly to avoid difficulties caused by certain industry changes in the UK.2 In addition, recent work by Brooks, Li and Miffre (2011) raises the intriguing possibility that idiosyncratic risk may be priced in the US, which makes testing against portfolios formed on the basis of past volatility interesting.
1 Available on Ken French’s data library.
2 In particular, privatisations of utilities and the rail industry during our observation period have led to the emergence of significant new sectors. These changes are essentially the result of political choices and so differ from structural changes brought about by technological innovation.
We conduct tests of our models in two stages. In the first stage we use the F-test of Gibbons, Ross and Shanken (1989, hereafter GRS). In common with Fama and French (2011), in our first stage tests we find that UK models perform reasonably well when describing returns on test portfolios formed using size and book-to-market, but perform very poorly when tested on portfolios formed on the basis of momentum. This is probably not surprising, given the recent results in MMS and Fletcher (2010). However, we find that two versions of the four-factor model (the Simple 4F model and a CPZ version of the model) do a reasonable job of describing the cross-section of returns from test portfolios formed on the basis of volatility.
In the second stage, we go further than Fama and French (2011) in that we run Fama-MacBeth (1973) type tests to examine whether factors are priced. Consistent with the findings of MMS and Fletcher (2010), we find that the factors are not consistently and reliably priced.
One explanation for this poor performance is that there are limits to arbitrage, especially in smaller stocks. These might come about because of liquidity constraints and limits to stock availability in smaller firms, or because short selling constraints might limit the ability of investors to short over-priced “loser” stocks or over-priced “glamour” stocks (Ali and Trombley, 2006; Ali, Huang and Trombley, 2003). Yet as Thomas (2006) points out, it is not difficult to short-sell most large capitalisation stocks. Given that we would expect such limits to arbitrage to be considerably less in larger stocks, we repeat all of our tests on a sub-sample of the 350 largest UK firms, forming both factors and test portfolios from this restricted universe of large stocks. Consistent with this expectation, tests on the large firms sample show that all our models provide reasonable explanations of the cross-section of returns even when portfolios are formed on the basis of momentum. However, the priced factors vary with the test portfolios employed. Based on our findings, our pragmatic advice for fellow researchers using UK data is that, in event study applications either a four factor model, or a decomposed value-weighted four factor model, as proposed by CPZ, might be appropriate, unless the event being studied is likely to feature a large number of smaller stocks. If, however, the objective is to establish a meaningful
measure of the expected cost of equity then it is difficult to recommend any one model over the others, given that the factors are not reliably priced.
We classify the various models that we test into basic models, value-weighted factor components models and decomposed factor models. A detailed description of the construction of the factors used in these models is in a separate section below.
Our first model is the Fama-French (1993) three factor model, which is:
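$$R_{it} - R_{ft} = \alpha_i + \beta_i (R_{mt} - R_{ft}) + s_i \mathrm{SMB}_t + h_i \mathrm{HML}_t + \varepsilon_{it} \qquad (1)$$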
where Ri is the return on an asset/portfolio i, the first term in parentheses is the usual CAPM market risk premium, Rm is the return on a broad market index, Rf is the risk-free rate of return, and SMB and HML are respectively size and “value” factors constructed from six portfolios formed on two size and three book-to-market (BTM) groups.
The second model we investigate is a four-factor model similar to the Carhart (1997)
model, which in addition to using the three factors of Fama-French (1993) also uses a
“winner minus loser” factor to capture the momentum effect. The model is:
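$$R_{it} - R_{ft} = \alpha_i + \beta_i (R_{mt} - R_{ft}) + s_i \mathrm{SMB}_t + h_i \mathrm{HML}_t + w_i \mathrm{UMD}_t + \varepsilon_{it} \qquad (2)$$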
Where UMD is a momentum factor and the other terms are as in (1) above.
CPZ argue that the FF method of equally weighting the six constituent portfolios (from which the SMB and HML factors are formed) gives a disproportionate weight to small value stocks. So we construct factors using a CPZ-style market capitalisation weighting of the SMB, HML and UMD component portfolios, which we label SMB_CPZ, HML_CPZ and UMD_CPZ.
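$$R_{it} - R_{ft} = \alpha_i + \beta_i (R_{mt} - R_{ft}) + s_i \mathrm{SMB\_CPZ}_t + h_i \mathrm{HML\_CPZ}_t + \varepsilon_{it} \qquad (3)$$
$$R_{it} - R_{ft} = \alpha_i + \beta_i (R_{mt} - R_{ft}) + s_i \mathrm{SMB\_CPZ}_t + h_i \mathrm{HML\_CPZ}_t + w_i \mathrm{UMD\_CPZ}_t + \varepsilon_{it} \qquad (4)$$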
Decomposed factor models
Zhang (2008), Fama and French (2011) and CPZ argue that a decomposition of the FF factors may be helpful. The intuition is that value effects may differ between large and small firms.
In our fifth model, we decompose the value factor into large-firm and small-firm components, as in Fama and French (2011). This is referred to as the FF decomposition.
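$$R_{it} - R_{ft} = \alpha_i + \beta_i (R_{mt} - R_{ft}) + s_i \mathrm{SMB}_t + h^{S}_i \mathrm{HML\_S}_t + h^{B}_i \mathrm{HML\_B}_t + w_i \mathrm{UMD}_t + \varepsilon_{it} \qquad (5)$$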
Where HML_S and HML_B denote the value premium in small firms and large firms respectively.
In our sixth model, we further decompose the HML factor into large and small firms (BHML_CPZ and SHML_CPZ), and also decompose the SMB factor into a mid-cap minus large-cap factor (MMB_CPZ) and a small-cap minus mid-cap factor (SMM_CPZ) in the spirit of CPZ. This is referred to as the CPZ decomposition.
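$$R_{it} - R_{ft} = \alpha_i + \beta_i (R_{mt} - R_{ft}) + m_i \mathrm{MMB\_CPZ}_t + s_i \mathrm{SMM\_CPZ}_t + h^{B}_i \mathrm{BHML\_CPZ}_t + h^{S}_i \mathrm{SHML\_CPZ}_t + w_i \mathrm{UMD\_CPZ}_t + \varepsilon_{it} \qquad (6)$$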
Note that when testing (6) on the largest 350 firms only, SMM_CPZ as a factor is not calculated.
Our data come from various sources and cover the period from October 1980 to December 2010. The monthly stock returns and market capitalisations are from the London Business School Share Price Database (LSPD). The book values are primarily from Datastream, with missing values filled in with data from: Thomson One Banker; tailored Hemscott data (from the Gregory, Tharyan and Tonks study of directors’ trading) obtained by subscription; and hand-collected data on bankrupt firms from Christidis and Gregory (2010). By combining several data sources we are able to infill any missing data in Datastream.
In the construction of the factors and test portfolios, we only include Main Market stocks and exclude financials, foreign companies and AIM stocks following Nagel (2001) and Dimson, Nagel and Quigley (2003, hereafter DNQ). We also exclude companies with negative or missing book values. The number of UK listed companies in our sample with valid BTM and market capitalisations is 896 in 1980, with the number peaking at 1,323 companies in 1997. This number then falls away progressively to 1,100 in 2000, ending up with 513 valid companies by the time
financials have been excluded in 2010, plus 36 companies with negative BTM ratios.3
We now turn to the construction of the portfolios and factors.
Break points for portfolio construction
Our central problem in forming the factors and portfolios is to find a UK equivalent for the NYSE break points used to form the portfolios and factors in Ken French’s data library. In the particular context of this paper, the London Stock Exchange exhibits a large “tail” of small and illiquid stocks, which are almost certainly not part of the tradable universe of the major institutional investors that make up a large part of the UK market. Use of inappropriate breakpoints will result in factors and test portfolios heavily weighted by illiquid smaller stocks and lead to incorrect inferences in asset pricing tests, event studies or performance evaluation studies. One way of dealing with this is by altering the break points. The alternative is to employ value weighting in factor construction. CPZ is an example of the latter approach, motivated by concerns about performance evaluation, whereas MMS is an example of the former. As break points and weighting schemes can be viewed as complementary approaches to the problem of the over-representation of small and illiquid stocks, in this paper we look at the impact of both changing the break points and employing the CPZ-style value-weighting scheme.
Fama and French (2011) clearly recognise the importance of using the appropriate break points in forming their regional portfolios, and the issue has received a good deal of attention in the previous UK research discussed below. GHM and DNQ deal with this by using the median of the largest (by market capitalisation) 350 firms and the 70th percentile of firms respectively in forming the size breakpoints for market value, in both cases excluding financial stocks. Gregory et al. (2001) base their BTM breakpoints on the 30th and 70th percentiles of the largest 350 firms, whereas DNQ use the 40th and 60th percentiles. However, more typically, other UK studies (Al-Horani et al., 2003; Fletcher, 2001; Fletcher and Forbes, 2002; Hussain et al., 2002; Liu et al., 1999 and Miles and Timmerman, 1996) use the median of all firms. For the reasons outlined in the introduction, we believe it is important to consider the likely investable universe for large investors, and in this paper we use the largest 350 firms as in Gregory et al. (2001, 2003) and Gregory and Michou (2009, hereafter GM).4
3 To cross check this reduction in the number of firms, we compare our data with the market statistics on the London Stock Exchange website, and find that from December 1998 (the earliest month for which data are available on the LSE website) to December 2010, the number of UK listed firms on the Main Market has reduced from 2,087 to 1,004, a decline of nearly 52%.
In the models (1)-(6) above, Rm–Rf is the market factor (market risk premium). Rm is the total return on the FT All Share Index, and Rf (the risk-free rate) is the monthly return on 3-month Treasury Bills.
In addition to a market factor, the Simple FF model (1) above uses an SMB (size) and an HML (value) factor, which are constructed from six portfolios formed on size (market capitalisation) and BTM. Our portfolios are formed at the beginning of October in year t. Following Agarwal and Taffler (2008), who note that 22% of UK firms have March year ends, with 37% of firms having December year ends, we match March year t book value with end of September year t market capitalisation to get the appropriate size and BTM to form the portfolios.
In detail, to form the portfolios, we independently sort our sample firms on market capitalisation and BTM. Sorting on market capitalisation first, we form two size groups, “S”-small and “B”-big, using the median market capitalisation of the largest 350 companies (our proxy for the Fama-French NYSE break point) in year t as the size break point. Then, sorting on the BTM, we form the three BTM groups, “H”-high, “M”-medium and “L”-low, using the 30th and 70th percentiles of BTM of the largest 350 firms as break points for the BTM. Using these size and BTM portfolios, we form the following six intersecting portfolios: SH, SM, SL, BH, BM and BL, where “SH” is the small size high BTM portfolio, “SL” is the small size low BTM portfolio, “BL” is the big size low BTM portfolio and so on.
4 We also construct and test our models using the alternative Dimson et al. (2003) 70th percentile breakpoints, the Al-Horani et al. 50th percentile breakpoints together with the Fletcher (2001) and Fletcher and Kihanda (2005) factor construction methods. An excellent and detailed review of the methods used in UK portfolio construction can be found in MMS. Given that our evidence on these alternative factor specifications is similar to that in MMS, we do not report these tests in the paper, although full test results are available from the authors on request.
These portfolios are then used to form the SMB and HML factors. The SMB factor is (SL + SM + SH)/3 – (BL + BM + BH)/3 and the HML factor is (SH + BH)/2 – (SL + BL)/2. Note that in this model, all the components from which SMB and HML are formed receive equal weighting.
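As a minimal sketch of the two formulas above, the following Python function computes SMB and HML for a single month from the six component portfolio returns. The function name and the example returns are hypothetical and purely illustrative; they are not taken from our data.

```python
def smb_hml(r):
    """Return (SMB, HML) for one month.

    `r` maps the six portfolio labels SL, SM, SH, BL, BM, BH
    to that month's portfolio returns.
    """
    # Average of the three small portfolios minus average of the three big ones.
    smb = (r["SL"] + r["SM"] + r["SH"]) / 3 - (r["BL"] + r["BM"] + r["BH"]) / 3
    # Average of the two high-BTM portfolios minus average of the two low-BTM ones.
    hml = (r["SH"] + r["BH"]) / 2 - (r["SL"] + r["BL"]) / 2
    return smb, hml


# Hypothetical monthly returns in percent, for illustration only.
example = {"SL": 0.8, "SM": 1.0, "SH": 1.5, "BL": 0.6, "BM": 0.9, "BH": 1.2}
smb, hml = smb_hml(example)
```

Note that the medium-BTM portfolios (SM and BM) enter SMB but not HML, which is why the two factors are built from the same six portfolios yet use different subsets.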
The Simple 4F, model (2) above, uses a UMD (momentum) factor, which we construct using the methodology described on Ken French’s website as follows. Using size and prior (2-12) returns5 we first create six portfolios, namely SU; SM; SD; BU; BM and BD where SU is a small size and high momentum portfolio, SM is the small size and medium momentum portfolio, SD is the small size and low momentum portfolio, BU is the big size and high momentum portfolio and so on. These portfolios, which are formed monthly, are therefore intersections of two portfolios formed on size and three portfolios formed on prior (2-12) return. The monthly size breakpoint (our proxy for the Fama-French NYSE break point) is the market capitalisation of the median firm in the largest 350 companies. The monthly prior (2-12) return breakpoints are the 30th and 70th percentiles of prior (2-12) performance of the largest 350 companies each month. The UMD factor is then calculated as 0.5 (SU + BU) – 0.5 (SD + BD), where U denotes the high momentum portfolio and D the low momentum portfolio. As in the case of the SMB and HML factors, the components used to form the UMD factor are equally weighted.
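The sorting variable and the factor formula can be sketched as follows. We assume here that the prior (2-12) return is the compounded return over months t-12 to t-2, skipping the most recent month; that reading, the function names and the example figures are illustrative assumptions rather than a statement of the exact procedure used to build our factors.

```python
def prior_2_12_return(monthly_returns):
    """Compounded return over months t-12 to t-2, skipping month t-1.

    `monthly_returns` holds the 12 simple returns for months
    t-12, ..., t-1 in chronological order, as decimals (0.02 = 2%).
    """
    if len(monthly_returns) != 12:
        raise ValueError("expected 12 monthly returns")
    growth = 1.0
    for r in monthly_returns[:-1]:  # drop the most recent month (t-1)
        growth *= 1.0 + r
    return growth - 1.0


def umd(r):
    """UMD = 0.5 (SU + BU) - 0.5 (SD + BD) for one month of portfolio returns."""
    return 0.5 * (r["SU"] + r["BU"]) - 0.5 * (r["SD"] + r["BD"])


# Hypothetical component returns in percent, for illustration only.
factor = umd({"SU": 2.0, "BU": 1.0, "SD": 0.5, "BD": 0.5})
```

The medium momentum portfolios do not enter the factor, mirroring the treatment of the medium BTM portfolios in HML.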
Factors for the value-weighted components and decomposed factor models
The SMB_CPZ, HML_CPZ and UMD_CPZ factors employed in CPZ_FF and CPZ_4F, model (3) and model (4) above, are calculated by replacing the equal weighting of the components of the SMB, HML and UMD factors (used in (1) and (2) above) with a value weighting based on the market capitalisation of the SH, SM, SL, BH, BM, BL, SU, BU, SD and BD components.
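One way to read this weighting scheme is sketched below: within each leg of a factor, component portfolios are weighted by their market capitalisation instead of equally. The helper names, and the assumption that weights are proportional to each component's total market capitalisation at formation, are ours for illustration and should not be taken as the exact CPZ procedure.

```python
def vw_mean(returns, caps, labels):
    """Market-capitalisation-weighted average return of the named components."""
    total_cap = sum(caps[k] for k in labels)
    return sum(caps[k] * returns[k] for k in labels) / total_cap


def long_short(returns, caps, long_leg, short_leg):
    """Value-weighted long leg minus value-weighted short leg."""
    return vw_mean(returns, caps, long_leg) - vw_mean(returns, caps, short_leg)


def cpz_style_factors(returns, caps):
    """SMB_CPZ and HML_CPZ analogues built from the six size/BTM components."""
    smb_cpz = long_short(returns, caps, ["SL", "SM", "SH"], ["BL", "BM", "BH"])
    hml_cpz = long_short(returns, caps, ["SH", "BH"], ["SL", "BL"])
    return smb_cpz, hml_cpz


# Hypothetical returns (percent) and market capitalisations, for illustration only.
rets = {"SL": 0.0, "SM": 1.0, "SH": 2.0, "BL": 0.0, "BM": 1.0, "BH": 1.0}
caps = {"SL": 1.0, "SM": 1.0, "SH": 1.0, "BL": 1.0, "BM": 1.0, "BH": 3.0}
smb_cpz, hml_cpz = cpz_style_factors(rets, caps)
```

With equal capitalisations the functions reproduce the equally weighted SMB and HML; with unequal capitalisations, as in the example, the large BH portfolio dominates the long leg of the value factor. A UMD_CPZ analogue follows from `long_short` with legs `["SU", "BU"]` and `["SD", "BD"]`.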
5 We also form an alternative UMD_car factor, by following the approach in Carhart (1997), where the portfolios are constructed from past year returns without interacting with size.
The decomposition of HML used in the FF_4F_decomposed model (5) uses HML_S, which is constructed as (SH-SL), and HML_B, which is constructed as (BH-BL). In order to separate the SMB factor into mid-cap (MMB_CPZ) and small-cap (SMM_CPZ) elements for the CPZ_4F decomposed model (6), the value-weighted return on the upper quartile firms in the largest 350 firms is used as a proxy for the returns on the big firms, and the value-weighted return on the remaining firms in the largest 350 is used as a proxy for the mid-cap return. Small firm returns are then the value-weighted return on all other firms in the sample.
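The decomposed factors can be illustrated with a short sketch. It takes the component portfolio returns and the three value-weighted size-segment returns as given; the function name, the dictionary keys `big`, `mid` and `small`, and the example figures are hypothetical.

```python
def decomposed_factors(port, segment):
    """Decomposed value and size factors for one month.

    `port` holds returns on the SH, SL, BH and BL portfolios.
    `segment` holds value-weighted returns on three size segments:
    'big' (top quartile of the largest 350), 'mid' (rest of the
    largest 350) and 'small' (all other firms).
    """
    return {
        "HML_S": port["SH"] - port["SL"],               # value premium in small firms
        "HML_B": port["BH"] - port["BL"],               # value premium in big firms
        "MMB_CPZ": segment["mid"] - segment["big"],     # mid-cap minus large-cap
        "SMM_CPZ": segment["small"] - segment["mid"],   # small-cap minus mid-cap
    }


# Hypothetical monthly returns in percent, for illustration only.
f = decomposed_factors(
    {"SH": 1.5, "SL": 0.8, "BH": 1.2, "BL": 0.6},
    {"big": 0.7, "mid": 0.9, "small": 1.0},
)
```

By construction MMB_CPZ and SMM_CPZ sum to the small-minus-big spread between the small and big segments, so the pair splits a single size spread into two steps.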
A diagrammatic representation of the factor construction methods is shown in Figures 1 and 2. Figure 1 shows the construction of the SMB, HML, UMD, HML_S, HML_B, SMB_CPZ, UMD_CPZ, BHML_CPZ and SHML_CPZ factors, and Figure 2 shows the construction of the MMB_CPZ and SMM_CPZ factors.
As with the portfolios used to form the factors, the test portfolios are formed at the beginning of October of each year t. In detail, we construct the following value-weighted portfolios for use in our tests of asset pricing models:6
1. 25 (5×5) intersecting size and BTM portfolios: We use the whole sample of firms to form these portfolios. The five size portfolios are formed from quartiles of the largest 350 firms plus one portfolio formed from the rest of the sample. For the BTM portfolios we use the BTM quintiles of the largest 350 firms as break points for the BTM to create 5 BTM groups.
2. 27 (3×3×3) sequentially sorted size, BTM and momentum portfolios: The three size portfolios are formed as two portfolios formed from only the largest 350 firms, using the median market capitalisation of the largest 350 firms as the break point, plus one portfolio from the rest of the sample. Then within each size group we create tertiles of BTM to create the three BTM groups. Finally, within each of these nine portfolios we create tertiles of prior 12-month returns to form three momentum groups.
3. 25 portfolios ranked on standard deviation of prior 12-month returns.
4. For our large firm only tests, we form the 25 intersecting size and BTM portfolios using five size and five BTM groups using the largest 350 firms, limit the sequentially sorted size, value and momentum portfolios to a 2 x 2 x 3 sequential sort and finally we limit the volatility portfolios to twelve groups.7
6 We actually employed a wider range of test portfolios but in the interests of brevity, we do not detail all of the portfolios we used here. The whole range of test portfolios based on size, book to market, momentum and varying combinations of these are available on our website.
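The size dimension of the 5×5 portfolios described in item 1 can be sketched as below: firms are ranked by market capitalisation, the largest 350 are split into four bins of (nearly) equal firm count, and all remaining firms form a fifth group. Assigning by rank instead of by explicit market capitalisation breakpoints, the function name and the numeric group labels are illustrative assumptions.

```python
def size_groups(market_caps, n_large=350, n_bins=4):
    """Map each firm to a size group.

    Groups 1..n_bins split the largest `n_large` firms by rank
    (1 = largest firms); group n_bins + 1 holds all remaining firms.
    `market_caps` maps firm identifiers to market capitalisation.
    """
    ranked = sorted(market_caps, key=market_caps.get, reverse=True)
    n_top = min(n_large, len(ranked))  # guard against samples below n_large
    groups = {}
    for rank, firm in enumerate(ranked):
        if rank < n_top:
            groups[firm] = rank * n_bins // n_top + 1
        else:
            groups[firm] = n_bins + 1
    return groups


# Toy universe of 10 firms with the "large" set shrunk to 8 for illustration.
toy_caps = {f"firm{i}": float(100 - i) for i in range(10)}
toy_groups = size_groups(toy_caps, n_large=8, n_bins=4)
```

In the toy example the eight largest firms fall two to a bin and the two smallest firms land in group 5, which is the pattern that keeps the tail of small stocks from determining the breakpoints.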
We emphasise that our choice of partitioning the size portfolios on the basis of the largest 350 stocks is designed to capture the investable universe for UK institutional investors. Our conversations with practising fund managers and analysts suggest that large international investors may view the opportunity set of UK firms as comprising the FTSE100 set of firms at best. To take account of these investment criteria we define “large” firms as those with a market capitalisation larger than the median firm of the largest 350 firms by market capitalisation. “Small” becomes any firm that is
not in the group of the largest 350 firms.8
Factor and Portfolio summary statistics
In Table 1, we report the summary statistics for our factors. We note that none of the size factors, nor any of the decomposed elements of the size factors, are significantly different from zero. No matter how they are defined, the HML factors are significantly different from zero at the 10% level or less, but breaking down HML into small and large elements, as in the FF_4F_decomposed model, raises the standard deviation of the elements so that neither element is reliably different from zero at the 10% level in two-tailed tests. However, when using the CPZ decomposition, SHML_CPZ is significantly different from zero, although BHML_CPZ fails to be. In the Simple FF and Simple 4F models, UMD has the highest mean of any of the factors (0.77% per month), but also exhibits the greatest negative skewness and the largest kurtosis. Switching to the factors used in the CPZ_FF and CPZ_4F models causes an increase in the mean, median and the standard deviation of the SMB and HML factors, with a marked decrease in kurtosis for the latter. For UMD, the mean and median are reduced, whilst the standard deviation is increased. For the decompositions of the HML factor, conclusions on whether the effect is larger or smaller in large or small stocks depend upon the method of decomposition.
7 We also tested our results using fifteen portfolios, with very similar results.
8 However, note that we also form 25 “Alternative 350 group” (three portfolios from largest 350 plus 2 portfolios from the rest and quintiles based on BTM), 25 “DNQ group” using DNQ cut-points, simple decile and quintile portfolios for both size and BTM, for those who believe that alternative definitions of size and book to market are more appropriate. Inferences on factors and test portfolios formed on these groupings do not change.
The correlations in Table 2 reveal that despite the difference in weightings between FF [models (1) and (2)] and CPZ [models (3) and (4)] factors, the correlations are strongly positive: 0.92 in the case of SMB, 0.88 in the case of HML, and 0.97 in the case of UMD. Decomposing the factors reveals that the large and small firm components of HML are significantly positively correlated, with a correlation of 0.43 under the FF decomposition and 0.33 under the CPZ decomposition. The correlation between the decomposed elements using these alternative factor constructions is strong: 0.98 for the large firm element of HML, and 0.62 for the small firm element. The CPZ decomposition of the size effect reveals that MMB_CPZ and SMM_CPZ have a correlation of only 0.05. One striking feature of the correlation table is the negative correlation between HML and momentum9. This is -0.5 in the case of the FF factors, and -0.4 in the case of the CPZ factors.10
In Tables 3-5, we report the mean, standard deviation, skewness, maximum, minimum, median and kurtosis of the returns for our value-weighted test portfolios11. Table 3 reports results for 25 intersecting Size and BTM portfolios formed as described above. The tendency within size categories is for returns to increase as BTM ratio increases, although the effect is not completely monotonic in all of the size categories. The general pattern appears to be for skewness to be more negative and kurtosis to be greater in the “glamour” category than the “value” category within any size group, with the exceptions being kurtosis in the second smallest (S2) and medium
9 Clifford (1997) notes a similar effect in the US.
10 This led us to investigate several alternatives in our subsequent tests, which we do not report for space reasons. First, we examined a “pure” Carhart (1997) factor, constructed without intersecting with size effects. Second, we examined whether such a factor performed better in association with factors formed using the Al-Horani et al. (2003), Fletcher (2001), Fletcher and Kihanda (2005), and DNQ (2003) approaches to factor construction. Third, we investigated constructing the factor by interacting momentum and value (instead of size) portfolios. As none of these alternatives changed our reported results in any way, we do not report them here, but results are available from the authors on request.
11 Note that equally weighted versions are also available for download from our website.
Our next set of portfolios, reported in Table 4, are the value-weighted 27 portfolios sequentially sorted on size, BTM and momentum. In the table, the first letter denotes size (Small, S; Medium, M; Large, L), the second the BTM category (Low or “Glamour”, G; Medium, M; High, or “value”, V), and the third momentum (Low, L; Medium, M; High, H). Compared to (unreported) sorts based upon size and momentum, and to the summary factors reported in Table 1, the return patterns here are intriguing, as they suggest a much lower momentum effect when BTM is also controlled for. Indeed, within the “small value” set of firms, momentum effects are actually reversed. However, what is striking is that sequentially sorting, as opposed to forming intersecting portfolios, seems to substantially dampen down any momentum
effect. Sequential sorting (within any size category12) has the effect of ensuring each
sub-group has equal numbers of firms within it, whereas intersecting portfolios can have quite different numbers of firms within each portfolio. In practice, it emerges that different numbers of firms within sub-categories is only an issue within the smallest market capitalisation quintile, where there is a concentration of firms in the low momentum category. We note that 39% of all the smallest quintile stocks fall into this “low momentum” group.13
Finally, we report the characteristics of the 25 portfolios formed on the basis of prior
12-month standard deviations in Table 5. These portfolios are interesting in several respects. First, past volatility seems to predict future volatility. As we progress from the low standard deviation (SD1) to high standard deviation (SD25) portfolios, standard deviations of the portfolio returns tend to increase. Whilst the effect is not monotonic, the SD25 portfolio has a standard deviation of over twice that of the SD1 portfolio. However, returns do not obviously increase with standard deviation – indeed the lowest mean return portfolio is SD25. Of course, this is not inconsistent with conventional portfolio theory provided that higher risk portfolios have an offsetting effect from lower correlations with other assets. There are no obvious patterns that emerge in either skewness or kurtosis across these portfolios.
Tests of factor models
12 Recall that by design we form the size portfolios so that the largest two size groupings by market capitalisation have fewer firms than the smallest size groups.
13 Results for size and momentum portfolios are available on our website.
We now turn to the central theme of this paper, asset pricing tests of our models. These testing procedures are described in detail in Cochrane (2001, Ch.12). Essentially, our test is in two stages. In the first-stage test, we regress the excess returns of the individual test portfolios on the factors in models (1) to (6) and test whether the alphas are jointly zero using the Gibbons, Ross and Shanken (1989) or GRS test. More formally, we run time-series regressions as follows:
$$R_{it} - R_{ft} = \alpha_i + \beta_i' F_t + \varepsilon_{it}$$
where $R_{it}$ is the return on test portfolio $i$ in month $t$, $R_{ft}$ is the risk-free rate in month $t$, and $F_t$ is the vector of factors corresponding to the model that is being tested. The regression for each test portfolio $i$ yields an intercept $\alpha_i$. The GRS test is then used to test whether these intercepts are jointly indistinguishable from zero.
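For readers who wish to replicate the first-stage test, the following is a minimal sketch of the GRS statistic using NumPy. It uses the textbook finite-sample form of the statistic with maximum-likelihood covariance estimates; the function name and argument layout are our own illustrative choices, and the sketch is not the code used to produce our tables.

```python
import numpy as np


def grs_statistic(excess_returns, factors):
    """GRS F-statistic for the null that all intercepts are zero.

    excess_returns: (T, N) array of test-portfolio returns minus the risk-free rate.
    factors: (T, K) array of factor returns.
    Returns (statistic, df1, df2); under the null with normal errors the
    statistic follows an F distribution with (df1, df2) = (N, T - N - K).
    """
    R = np.asarray(excess_returns, dtype=float)
    F = np.asarray(factors, dtype=float)
    T, N = R.shape
    K = F.shape[1]

    X = np.column_stack([np.ones(T), F])            # intercept plus factors
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)    # (K + 1, N) OLS coefficients
    alpha = coef[0]                                 # one intercept per portfolio
    resid = R - X @ coef
    sigma = resid.T @ resid / T                     # ML residual covariance (N, N)

    mu = F.mean(axis=0)                             # factor means
    Fc = F - mu
    omega = Fc.T @ Fc / T                           # ML factor covariance (K, K)

    quad_alpha = alpha @ np.linalg.solve(sigma, alpha)
    quad_mu = mu @ np.linalg.solve(omega, mu)
    stat = (T - N - K) / N * quad_alpha / (1.0 + quad_mu)
    return stat, N, T - N - K
```

A p-value can then be obtained from the upper tail of the F distribution with the returned degrees of freedom, for example with `scipy.stats.f.sf(stat, df1, df2)` where SciPy is available.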
In the second stage we test whether the factors are reliably priced using the Fama-MacBeth (1973) two-pass regression, with either an assumption of constant parameter estimates or rolling 60-month estimates of the parameters, which allow for time variation. To adjust for the errors-in-variables problem we also compute Shanken (1992) corrected t-statistics. More formally, the two-pass Fama-MacBeth test first estimates a vector of factor loadings by regressing the time series of excess returns on each test portfolio on the vector of risk factors, which depends on the particular model being tested. The test then proceeds by running the following cross-sectional regression for each month in the second pass:
$$R_{it} - R_{ft} = \gamma_{0t} + \gamma_t' \hat{\beta}_i + \eta_{it}$$
where $R_i$ is the return of test portfolio $i$, $R_f$ denotes the risk-free return, $\gamma_0$ is the constant, $\gamma$ is the vector of cross-sectional regression coefficients and $\hat{\beta}$ is the vector of estimated factor loadings from the first-pass regression. From the second-pass cross-sectional regressions we obtain time series of $\gamma_0$ and $\gamma$. The average premium is calculated as the mean of the time series of $\gamma$s. A cross-sectional $R^2$ tests for goodness of fit and a $\chi^2$ test is used to check whether the pricing errors are jointly zero. The first-pass regressions are run either as rolling regressions or as a single regression over the entire time series.
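The constant-parameter version of the two-pass procedure can be sketched as follows, again with NumPy. The sketch covers only the single full-sample first pass; the rolling 60-month variant and the Shanken (1992) correction are not implemented, and the function name and return values are illustrative assumptions.

```python
import numpy as np


def fama_macbeth(excess_returns, factors):
    """Two-pass Fama-MacBeth estimates with full-sample factor loadings.

    excess_returns: (T, N) array of test-portfolio excess returns.
    factors: (T, K) array of factor returns.
    Returns (premia, se, gammas): the time-series means of the monthly
    cross-sectional coefficients (constant first), their uncorrected
    standard errors, and the (T, K + 1) array of monthly coefficients.
    """
    R = np.asarray(excess_returns, dtype=float)
    F = np.asarray(factors, dtype=float)
    T, N = R.shape

    # First pass: time-series regression of each portfolio on the factors.
    X = np.column_stack([np.ones(T), F])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)
    betas = coef[1:].T                               # (N, K) estimated loadings

    # Second pass: one cross-sectional regression per month on the loadings.
    Z = np.column_stack([np.ones(N), betas])         # (N, K + 1)
    gammas, *_ = np.linalg.lstsq(Z, R.T, rcond=None) # (K + 1, T), one column per month
    gammas = gammas.T                                # (T, K + 1)

    premia = gammas.mean(axis=0)
    se = gammas.std(axis=0, ddof=1) / np.sqrt(T)     # no errors-in-variables correction
    return premia, se, gammas
```

Uncorrected t-statistics are `premia / se`. Because the loadings are themselves estimated, these standard errors understate the sampling variation, which is the motivation for the Shanken adjustment mentioned above.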
Full sample results – first stage tests
Tables 6-8 report the results from the first stage tests on the 3 sets of test portfolios described above. To save space, we do not report the coefficients on the factors for
each model14. Each table has six pairs of columns, one pair for each of our six models. The first column of each pair reports the α (the intercept) and the second column reports its associated t-statistic.
In Table 6, we report the results when our models are tested using the 25 size and BTM portfolios. The Simple FF model passes the GRS test, and only two of the 25 intercept terms are significant at the 5% level, with both of these failures in the small firm value end categories. Whilst the Simple 4F model passes the GRS test, there are now three significant intercepts, two of them in the portfolios that exhibited the same result in the Simple FF model. The additional portfolio that fails the intercept test is another “value” portfolio, this time M3H. The average adjusted R-squared is almost imperceptibly different between the two models, at 0.783 and 0.784 for the Simple FF and Simple 4F models respectively. Despite the much longer data period and the focus on a single country, these results are broadly in line with the local model results for Europe reported in Tables 3 and 4 of Fama and French (2011).
For the value-weighted factor components models, we observe that both models pass the GRS test and that the mean adjusted R-squared is slightly lower than that of the Simple FF and Simple 4F models. For the CPZ_FF model, we detect no significant alphas at the 5% level, although three are significant at the 10% level. Although the improvement is marginal, it does seem that there is some advantage in following the CPZ proposal on value weighting components, at least in terms of the significance of the intercept terms. The CPZ_4F model shows three intercepts being significant at the 5% level, with one being significant at the 10% level.
In the last four columns of Table 6 we report the effect of disaggregating the factor components. Doing so seems to increase the mean R-squared compared to the aggregated models, whilst leaving the GRS tests unaffected. The FF decomposition, though, produces four significant alphas, and these are concentrated in the smallest stocks. By contrast, a particularly striking feature of the CPZ decomposition is that it seems able to price the problematic small stock portfolios. The only significant intercept at the 5% level is M3H, and at the 10% level B4H, both of which are positive.
14 The individual factor loadings are reported in full on our website.
Table 7 tests these factors on the sequentially-sorted size, BTM and momentum portfolios. Surprisingly, given these portfolios bear a relationship to the way factors are formed, all six of our models fail the basic GRS test. The Simple FF has five significant alphas at the 5% level, with four of these occurring in small size groupings. Adding UMD improves matters marginally, with three significant alphas occurring, but the GRS F-test is still a highly significant 1.75.
The central group of columns shows that changing the factor component weightings does little to improve the performance of either model. The CPZ_4F model produces four significant alphas at the 5% level, all of them amongst smaller firms, whilst the CPZ_FF model produces a similar result overall, but the failures are not concentrated amongst smaller stocks.
The FF decomposition (reported in the final four columns of the table) does nothing to rescue the models, with five significant alphas in the model. However, the CPZ decomposition exhibits only two significant alphas at the 5% level, although a further five are significant at the 10% level. The CPZ decomposition also has the lowest GRS test score and the highest mean adjusted R-squared. Nonetheless, the disappointing ability of any of these models to price portfolios which ultimately reflect, at least to some degree, the characteristics used to form the factors is not promising. These results are in line with those of Fama and French (2011), who also find that their European local models are unable to price portfolios sorted on size and momentum, and conclude that a four factor model is likely to be problematic in applications involving portfolios with momentum tilts.
Table 8 examines the ability of each model to explain the cross-section of returns in portfolios sorted on the basis of prior volatility. In the Simple FF model (Panel A), we see that there are two significant alphas at the 5% level, but that the model fails the GRS test at the 10% level. However, the Simple 4F model produces only one significant alpha at the 5% level and passes the GRS test.
In the central columns of Table 8, we see the effect of changing to the CPZ weightings. For the CPZ_FF model, the GRS test fails at the 10% level, and the number of significant alphas is two. The CPZ_4F model passes this test, though with three significant alphas. As in the Simple FF and Simple 4F tests, the less risky portfolios have positive alphas. Here, the most risky (SD25) has a negative alpha, significant at the 10% level.
In the final four columns of Table 8, we report the results using decomposed factors. Note that we cannot reject the null hypothesis for either model. Both decompositions show the pattern of positive alphas amongst the less risky portfolios. In conclusion, on the first stage tests, the various specifications of the 4F model all pass the GRS test when tested, as suggested by Lewellen et al. (2010), on volatility-ranked portfolios.15
Full sample results – second stage tests
We now turn to the second-stage regression tests, and in Tables 9-11 we show the results from the Fama-MacBeth (1973) estimation process using both the assumption of constant parameter estimates (the “Single” regression columns) and rolling 60-monthly estimated coefficients (the “Rolling” regression columns) using our alternative groups of test portfolios. We show results for both three and four factor models, and the estimates are expressed in terms of percent per month. The t-statistics (“t-sh” in the Tables) are shown after applying the Shanken (1992) corrections for errors-in-variables. In each table, Panel A shows the results from the Simple FF and Simple 4F models in the top rows, whilst the bottom rows show the results using value weighted components models. Panel B shows results from the decomposed factor models. As we estimate these regressions using excess returns, the intercept should be zero and the coefficients on the factors should represent the market price of the risk factor.
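The distinction between the “Single” and “Rolling” columns can be made concrete with a sketch of the rolling variant. It assumes that the loadings used in month t are estimated over the preceding 60 months only (months t − 60 to t − 1); the exact window alignment used for the tables is not stated in the text, so this alignment is an assumption. The function name `rolling_fama_macbeth` is hypothetical, and no Shanken (1992) correction is applied.

```python
import numpy as np

def rolling_fama_macbeth(excess_returns, factors, window=60):
    """Fama-MacBeth second pass using rolling first-pass loadings.

    excess_returns: (T, N) array of test-portfolio excess returns.
    factors:        (T, K) array of factor returns.
    Returns mean premia (intercept first), uncorrected t-statistics
    and the monthly cross-sectional coefficients.
    """
    T, N = excess_returns.shape
    gammas = []
    for t in range(window, T):
        # Assumption: loadings for month t use months t-window .. t-1 only.
        X = np.column_stack([np.ones(window), factors[t - window:t]])
        coefs, *_ = np.linalg.lstsq(X, excess_returns[t - window:t], rcond=None)

        # Cross-sectional regression of month-t returns on those loadings.
        Z = np.column_stack([np.ones(N), coefs[1:].T])
        g, *_ = np.linalg.lstsq(Z, excess_returns[t], rcond=None)
        gammas.append(g)

    gammas = np.array(gammas)                 # (T - window, K + 1)
    premia = gammas.mean(axis=0)
    std_err = gammas.std(axis=0, ddof=1) / np.sqrt(len(gammas))
    return premia, premia / std_err, gammas
```

Note that the rolling version discards the first `window` months from the second pass, so its premia are averaged over a shorter sample than the single-regression estimates.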
Table 9, Panel A, reports results using the 25 size and BTM portfolios and shows that for the Simple FF model, whether estimated on a fixed or rolling basis, we cannot reject the null hypothesis that pricing errors are jointly zero. However, when estimated on a rolling basis the intercept term (_cons) is significantly positive. For both bases, only HML is priced, and at a level which is not inconsistent with the factor mean in Table 1. However, Rm–Rf is not significant. The Simple 4F model represents an improvement in terms of both rolling and single regressions satisfying the chi-squared test and the zero-intercept requirement. Note, though, that the implied price of HML shows a marked increase. The cross-sectional R-squared is also slightly higher. Using CPZ weightings does not change any of the inferences, and except where rolling regressions are used in the context of the CPZ_FF model, the zero intercept requirement is satisfied. The implied factor price on HML_CPZ is greater than that on HML, and in all cases the price is higher than the mean value reported in Table 1.
15 This is perhaps surprising, given the results from testing on the sequentially sorted portfolios, and so following Fama and French (2011) we tested our factors on 5×5 portfolios sorted by intersecting size and momentum. The (unreported) tests show that we can reject the null hypothesis of alphas not being jointly significantly different from zero for all our models. As in that paper, it seems that the real difficulty for our models is pricing momentum effects, particularly in small stocks.
The results for the decomposed factor model are reported in Table 9 Panel B. For the FF decomposition, we see that the chi-squared test and zero-intercept requirements are both met. Both HML_S and HML_B elements appear to be significantly priced in the single regression model, although the implied price of the former is a good deal higher than implied by the Table 1 mean. Using rolling regressions results in lower estimates and HML_S being not significantly priced. Again, there is no hint that either market risk or SMB is a priced factor.
For the CPZ decomposition, inferences from the single regression model are similar to those from the FF decomposition. Both BHML_CPZ and SHML_CPZ are priced. However, in the rolling regression test whilst these two remain significantly priced, the UMD_CPZ factor is also significantly priced, and all three factors are priced at a level that is consistent with their sample period means. The consistent result from all of these models is that some form of value premium (HML) is priced, market risk and size are never priced, and that whether or not momentum is priced is model specific and dependent on rolling, rather than fixed, regressions being estimated.
The results of Fama-MacBeth tests on the sequentially sorted size, BTM and momentum portfolios, reported in Table 10, are disappointing. First, for all our models, no matter whether they are run on a single or rolling estimation basis, we can reject the null hypothesis that the pricing errors are jointly zero. Turning to the individual models, in Panel A for the Simple FF model, the intercept is significantly positive for both single and rolling estimates, although in the case of the former HML is significantly priced. For the Simple 4F model, although the intercept is zero and HML appears to be priced, the chi-squared test strongly rejects the null of no significant pricing errors. The CPZ weighted factors fail to rescue either model, in that besides the rejection in the chi-squared test all of the intercept terms also are significantly positive, at the 10% level at least.
The models using decomposed factors in Panel B of Table 10 are a modest improvement, with components being priced in a fashion consistent with pricing in the Table 9 tests, but the chi-squared test is significant (at the 10% level in the case of the CPZ model). Whilst for all models we can reject the null of pricing errors being jointly zero, the one factor that appears to be priced is some decomposed element of HML.
In Table 11, we report the results of the Fama-MacBeth test on the 25 standard deviation portfolios. In Panel A, the chi-squared tests show that we cannot reject the null hypothesis that pricing errors are jointly zero for all the models. However, for the Simple FF model, none of the factors are significantly priced, irrespective of whether a single regression or rolling regressions are employed. We also note that the constant is significant and positive. For the Simple 4F model, conclusions vary according to whether a single regression or rolling regression is employed. For the former, nothing is priced, but for the latter, the constant is significant and HML is significantly priced at the 10% level.
Using CPZ weightings, the constant is always significant and positive. In the rolling regression version of the CPZ_FF model, the market factor is negatively priced. In both the single and rolling versions of the CPZ_4F model, none of the factors are priced. Turning to the decomposed factor results in Table 11, Panel B, we cannot reject the null hypothesis of no significant pricing errors for any of our models, but unfortunately for the Fama-French (2011) decomposition, nothing is priced except for the constant term in the rolling regressions. With the CPZ decomposition run on a single regression basis, UMD_CPZ is priced, although at a level that is roughly twice its sample period mean. However, when we switch to rolling regressions, the sign on UMD_CPZ changes, although the coefficient is insignificant, and BHML_CPZ now appears to be priced. However, the level of pricing implied is some five times its sample mean.
In conclusion on these second-stage pricing tests, if we follow the Lewellen et al. (2010) recommendations of looking at GRS and chi-squared tests, examining whether constant terms are significant, and checking whether the implied prices of factors seem plausible, we are forced to be sceptical on whether these models are informative on which risk factors are priced in the UK.
One interesting feature of the tests is that when the models are tested on the portfolios used to form the factors, the single regression tests yield slightly higher cross-sectional R-squared than the rolling regressions. This is consistent either with a mean reversion effect in the factor loadings in these portfolios, or with the rolling regressions simply being noisier estimates of the true factor loadings. However, we do not observe such an effect when testing models on the volatility-ranked portfolios, when there is little to choose between the single and rolling regressions. Indeed, if anything the rolling regression approach provides weak evidence that HML (or a component of it in the case of the decomposed CPZ model) may be priced in the CPZ and decomposed models, whereas the single regression approach suggests otherwise. Given the weak explanatory power of these models, it is unwise to make too much of this, but it may be that factor loadings are more likely to be time varying when test portfolios are formed on characteristics that are not used in factor construction. Although we do not formally test this conjecture here, we note that this is entirely consistent with the evidence on industry factor loadings reported in Fama and French (1997) and Gregory and Michou (2009).
Given our scepticism on the adequacy of these asset pricing models, we run two further groups of tests. First, we undertake robustness checks to ensure our results above are not driven by omitted variables or the period over which factor loadings are estimated. Second, observing that our models have particular difficulty in pricing smaller stocks, we examine whether we can find a model that works at least for larger and more liquid firms.
Our first robustness checks extend our models by including two variants of the Clare, Priestley and Thomas (1997) APT model. We do this because if such APT factors are priced in a manner not fully captured by size, BTM and momentum-based factors, then the above results might be explained by an omitted variables problem. First, we run the Clare et al. (1997) base model with all their variables excluding retail bank lending.16 Second, we include their variables as an extension to the FF and Carhart models. They do not appear to add anything to the basic FF and Carhart models, and none of these variables are priced in the Fama-MacBeth regressions, and so we do not report the results here.
Kothari, Shanken and Sloan (1995) show that conclusions drawn on tests of the CAPM are sensitive to the period over which betas are estimated. To test whether such an effect is important in the UK, we follow Fletcher (2010) and run tests using quarterly data. The principal effect on our results is that the spread of observed betas appears to increase in tests using the 25 standard deviation portfolios. However, our observations on the pricing of risk factors in the second stage regression tests do not change. Whilst results from the robustness checks above are not reported for space reasons, they are available from the authors on request.
Fama and French (2011) note that smaller stocks are particularly challenging to price. As we observe above, whilst there may be good reasons why arbitrage activity is restricted in smaller stocks, those reasons do not apply to the universe of larger and more liquid stocks. As a proxy for this tradable universe, we next limit our factor formation and test portfolios to the largest 350 firms (excluding financials) by market capitalisation.17,18 Factor means are close to zero for SMB, 0.32% per month for HML, and 0.63% per month for UMD. Our test portfolios are 25 (5×5) size and BTM sorts of the top 350 firms, 12 (2×2×3) size, BTM and momentum portfolios sorted sequentially and 12 portfolios sorted on prior volatility.
16 We exclude bank lending for several reasons. First, the data are not currently available as a monthly series for our whole sample period. Second, Clare et al. (1997) use the first difference of the natural logarithm of bank lending and as we find the series has negative values, using their definition on our observed data series is not possible here. We also note that this data series is extremely volatile on a monthly basis.
17 Note that this is a proxy for the FTSE 350 index, which was unavailable at the start of our study period.
We do not report the detailed intercept coefficients and t-statistics for each set of portfolios as we do for the full sample, but instead report just the GRS F-test statistic, the associated p-value, and the average adjusted R-squared across all the test portfolios. These results are striking and are reported in Table 12. Using each of our 6 models, and our three portfolio formation methods, we only reject the null hypothesis of alphas being jointly zero in one case, which is for the CPZ_FF model tested on the standard deviation portfolios. The FF models do well when tested on the size and BTM portfolios, and the 4F models do better when tested on the size, BTM and momentum portfolios, which is perhaps not surprising given that as Fama and French (2011) observe, these models are playing “home games”. Note also that the decomposed factor models seem to do a little better than the aggregated models.
Tables 13-15 report the full Fama-MacBeth tests. Turning to the tests based on size and BTM sorted portfolios first, we see that the Table 13, Panel A results suggest that the basic FF model has an insignificant chi-squared test for both single and rolling regressions, with a constant term not significantly different from zero. The HML factor seems to be priced at plausible levels in both specifications, and although Rm–Rf has a positive coefficient, no other factors are significantly priced. Moving to the basic Carhart model does not change these basic conclusions, and neither does the adoption of the CPZ weightings of the factor components make much difference.
In Table 13, Panel B, for the decomposed models, we cannot reject the null hypothesis of no jointly significant pricing errors for either model no matter how the coefficient estimates are formed. In the FF_4F model, only HML_B is priced, suggesting that the value premium is more important in the largest sub-set of firms. However, when the CPZ_4F model is estimated on a single regression basis, both BHML_CPZ and SHML_CPZ appear to be priced. These conclusions change when the model is estimated on a rolling basis, when the market risk premium, Rm–Rf, and BHML_CPZ are priced. Taken as a whole, these results suggest that HML is consistently priced, that the large firm element of this value premium is consistently priced, but that conclusions on the pricing of other factors are sensitive both to the model employed and to whether or not rolling estimates are made.
18 We are grateful to the editor, Peter Pope, for suggesting these large firm only tests.
We next examine the performance of these models when tested against size, BTM and momentum portfolios. Table 14, Panel A reveals that both the basic and CPZ versions of the FF models fail the chi-squared test when estimated using rolling regressions. Furthermore, none of the factors in either version of the model are priced. When we switch to the basic Carhart model, estimated on a single regression basis, both HML and UMD appear to be priced, the intercept term is zero, and we cannot reject the null hypothesis of no significant pricing errors. However, the implied prices of the factors are some way in excess of the sample means. We also note that the market factor is just significant at the 10% level, although the factor price implied again seems high. When we estimate the model on a rolling basis, we can reject the null hypothesis and no factors are priced. For the CPZ_4F model, whilst we are not able to reject the null hypothesis for either single or rolling regression estimates and the intercept is not significantly different from zero in either case, the conclusions on which factor is priced differ according to how the regression is estimated. For the single regression basis, HML_CPZ is priced, whilst for the rolling regression basis it is UMD_CPZ that is priced.
The decomposed factor models in Table 14 Panel B all pass the chi-squared test for the joint significance of pricing errors, and in all cases the intercept term is insignificant. When we estimate the FF_4F model on a single regression basis, it appears that Rm–Rf, HML_S, HML_B and UMD factors are all priced. Whilst the HML components and momentum are priced at plausible levels, the implied price of the market factor, at 1.6% per month, seems to be three times higher than might reasonably be expected. When we switch to estimating the model on a rolling basis, only HML_S is priced. The alternative CPZ_4F, estimated on a single regression basis, again shows that Rm–Rf and momentum are priced, along with SHML. Once again, though, the implied price of the market risk factor is implausible. When estimated on a rolling regression basis, only SHML_CPZ and UMD_CPZ are priced.
When we employ test portfolios formed on the basis of prior 12-month standard deviation, from the tests in Panel A, it is clear that we can reject the FF model no matter how the factors are formed. Despite the chi-squared tests being insignificant, factors are never priced at levels even close to being significant. A similar conclusion is reached when estimating the basic Carhart model on a rolling basis. When the models are estimated using a single regression, UMD and UMD_CPZ are both priced, but at implausibly high levels. Finally, we turn to the decomposed models in Table 15, Panel B. Briefly summarised, disaggregation adds little to the Carhart models described earlier. In both cases, momentum is priced only when single regression estimates are made. Whilst the implied prices are still high, they are somewhat dampened down compared to the estimates from Panel A.