The Art and Science of Forecasting the Real Price of Oil

1 Introduction

The adverse macroeconomic implications of oil price fluctuations have been known since the 1970s, making the real price of crude oil a key variable in both macroeconomic forecasting and structural analysis (e.g., Hamilton 1983, 2009; Kilian 2009; Ravazzolo and Rothman 2013; Baumeister and Hamilton 2019). For instance, central banks, private sector forecasters and international organizations view the price of oil as one of the key variables in generating macroeconomic projections and in assessing macroeconomic risks. Oil price forecasts are also crucial for how some sectors of the economy operate their business, for example, airlines, utilities, and automobile manufacturers. Common to all, however, is the notion that the price of oil is difficult to forecast. Hamilton (2009) documents that the statistical regularities of changes in the real price of oil have historically tended to be (a) permanent, (b) hard to predict, and (c) governed by very different regimes over time. He further argues that the price of oil seems to follow a random walk without drift. It is therefore common among professional forecasters to use either the current spot price or the price of oil futures contracts as the forecast of the price of oil. More recently, researchers have explored numerous alternative models and methods in order to forecast the most likely future realization of the oil price (e.g., Alquist and Kilian 2010; Alquist, Kilian, and Vigfusson 2013; Baumeister and Kilian 2012, 2015; Manescu and Robays 2016; Bernard et al. 2018; Pak 2018; Garratt, Vahey, and Zhang 2019; Baumeister, Korobilis, and Lee 2020). These articles show that while the random walk is difficult to beat in out-of-sample forecasting exercises, careful attention to the economic fundamentals that are driving energy markets can generate improvements in point forecast accuracy.

In this article, we provide a novel and numerically efficient approach to quantifying the forecast uncertainty of the real price of crude oil using a combination of probabilistic individual model forecasts. The proposed Forecast Density Combination (FDC) model is based on a probabilistic and econometric interpretation of the Bayesian Predictive Synthesis (BPS) model due to McAlinn and West (2019) and McAlinn et al. (2020). BPS is a coherent Bayesian framework for evaluation, calibration, and data-informed combination of multiple forecast densities that is based on the earlier literature on Bayesian "agent/expert opinion analysis" (West 1984; Genest and Schervish 1985; West 1988, 1992; West and Crosse 1992). The proposed combination method extends earlier approaches that have been applied to oil price forecasting models by allowing for three key features. First, the method features time-varying combination weights, and explicitly factors into the model combination the inherent uncertainty surrounding the estimation of the combination weights. Second, it allows for modelling and estimation of time-varying forecast biases and facets of miscalibration of individual forecast densities and time-varying inter-dependencies among models. Third, it provides diagnostic analysis of model set incompleteness and learning from previous forecast mistakes, which serves to improve model specification.

We make use of the proposed combination model and provide an extensive set of empirical results about time-varying out-of-sample forecast performance, forecast uncertainty and risk for the real price of oil. Following Garratt, Vahey, and Zhang's (2019) recent replication of Baumeister and Kilian (2015), we use real-time monthly data, where the full sample covers the period 1973:01–2017:12. Our starting point is to document substantial changes over time in the mean and volatility of the real price of oil, a pattern that also transfers into the shape of the data densities. We then proceed by showing six main results, which extend the current literature on forecasting the price of oil. These conclusions are found to be robust across different oil price series, BPS specifications and model combination sets.

First, our combination approach systematically outperforms all benchmarks we compare it to, both in terms of point and density forecasts. The competing benchmark models range from the six state-of-the-art individual forecasting models used in Baumeister and Kilian (2015), including the commonly used naive no-change model, to alternative combination approaches such as equal weights for the individual models and Bayesian Model Averaging (BMA). While the gains from the model combination relative to the alternative models are limited at the one-month horizon, substantial gains in relative forecast accuracy are obtained at all other horizons. At the six-month horizon, the improvements relative to the no-change model in terms of mean squared prediction error (MSPE) and logarithmic score (LS) exceed 10% for MSPE and 12% for LS and are credibly different from zero. For longer horizons, the gains are substantially larger, in the range of 30% and 40% for MSPE and 50% and 110% for LS, at the 12- and 24-month horizons, respectively.

Second, the favorable forecast performance of the proposed approach is not specific to certain time periods but holds throughout the evaluation period. Large time variation is found in the relative performance of the various individual models and alternative model combinations. For instance, in line with Baumeister and Kilian (2012), it is seen that the VAR model performs well up until 2010, but then does relatively poorly over the subsequent period, thereby corroborating recent results in Baumeister, Korobilis, and Lee (2020).

Third, the approach allows for time-varying individual model weights that sequentially adapt according to the recent relative forecast performance of each model within the model combination set. At all forecast horizons, we document considerable time variation in the weights attached to each model, also reflecting the large time variation in the individual models' forecasting performance. One key feature is that the weights are not restricted to be a convex combination in the unit interval (like most of the combination methods that are currently used within the econometrics literature) but are instead specified as a general linear combination and are thereby permitted to evolve along the real line. This has two advantages. First, allowing for both positive and negative weights means that it is possible to hedge against any potential forecast risk in recent periods. Second, a declining weight on one model does not necessarily imply an increasing weight on another model. Instead, model weights are permitted to change in accordance with the forecasts of the individual models. Such features are clearly desirable to a broad range of practitioners. For instance, just as a financial portfolio manager hedges against risk by assigning a negative weight to an asset, the proposed approach is able to automatically assign a negative weight to mitigate the impact of forecast risk, such as forecast bias. Such natural behavior is not possible under combination models in which the model weights are restricted to be convex combinations, for example, equal weights or BMA.

Fourth, BPS has a built-in time-varying intercept that is absent from simpler combination methods such as BMA. It is well known within the econometrics literature on forecast combinations that BMA assumes that the true model is included in the model set. By allowing for an intercept component that can adapt during episodes of low frequency signals from a set of forecasting models, our combination approach is able to better mitigate the effects of model set misspecification, that is, model set incompleteness. We show that this has been especially important since 2010, as the intercept term for all forecast horizons gradually starts to increase before abruptly dropping during the oil price collapse of 2014.

Fifth, the combination approach has built-in diagnostic information measures about forecast inaccuracy and/or model set incompleteness, which are also absent from simpler combination methods such as BMA. This is measured by the estimated time-varying model combination residual, which provides clear signals of model incompleteness during three crisis periods. This type of diagnostic information gives important signals for model specification and for improvements to the model set.

Finally, a basic analysis of profit-loss and hedging against price risk is presented in order to highlight the model's potential for policy analysis. As a measure of the optimal hedge ratio, the Minimum Variance Hedge (MVH) ratio is used, which fluctuates between 0.1 and 0.4 at most horizons. Nevertheless, notable spikes occur around the turn of the century as well as during the two oil price collapses of 2009 and 2014.

Our article is related to the recent resurgence of interest in the combination of density forecasts in macroeconomics, econometrics, and statistics. Prominent new developments range from combining predictive densities using weighted linear combinations of prediction models, evaluated using various scoring rules (e.g., Hall and Mitchell 2007; Amisano and Giacomini 2007; Jore, Mitchell, and Vahey 2010; Hoogerheide et al. 2010; Kascha and Ravazzolo 2010; Geweke and Amisano 2011, 2012; Gneiting and Ranjan 2013; Aastveit et al. 2014), to more complex combination approaches that allow for time-varying weights with possibly both learning and model set incompleteness (e.g., Terui and Van Dijk 2002; Hoogerheide et al. 2010; Koop and Korobilis 2012; Billio et al. 2013; Casarin et al. 2015; Pettenuzzo and Ravazzolo 2016; Del Negro, Hasegawa, and Schorfheide 2016; Aastveit, Ravazzolo, and Van Dijk 2018; McAlinn and West 2019; McAlinn et al. 2020; Takanashi and McAlinn 2020; Casarin et al. 2020). See also Aastveit et al. (2019) for a recent survey of these developments. Despite these research activities on several macroeconomic and financial variables, there are currently, to the best of our knowledge, no studies on how to quantify forecast uncertainty associated with the dynamic behavior of the real price of crude oil.

The content of this article is structured as follows. In Section 2 we present our Forecast Density Combination model using Bayesian inference. In Section 3 a summary of the models used is given. Forecasting results and the risk analysis are presented in Sections 4 and 5, respectively. Section 6 concludes and suggests directions for future research.

2 Forecast Density Combination Approach

A basic probabilistic approach to combine forecast information from different sources proceeds as follows. Let $y_t$ be the economic variable of interest; let $\tilde{y}_t = (\tilde{y}_{1,t}, \ldots, \tilde{y}_{n,t})$ be forecasts for this variable from $i = 1, \ldots, n$ models. In a simulation context $\tilde{y}_{i,t}$ is a draw from the forecast distribution with conditional density $p(\tilde{y}_{i,t} \mid I_{i,t-1}, M_i)$ given data set $I_{i,t-1}$ and model $M_i$. Let $v_t = (v_{0,t}, \ldots, v_{n,t})$ be latent continuous random variable parameters where $v_{1,t}, \ldots, v_{n,t}$ will be used to combine the model forecasts and the role of $v_{0,t}$ is discussed below. The decomposition of the joint density of $y_t, v_t, \tilde{y}_t$ for the case of continuous random variables is given as:

$$ p(y_t \mid I_{t-1}, M) = \int \int p(y_t \mid v_t, \tilde{y}_t) \, p(v_t \mid \tilde{y}_t) \, p(\tilde{y}_t \mid I_{t-1}, M) \, dv_t \, d\tilde{y}_t, \tag{1} $$

where $I_{t-1}$ is the joint data set of all models and $M$ the union of all models. We label $p(y_t \mid v_t, \tilde{y}_t)$ as the combination density; $p(v_t \mid \tilde{y}_t)$ as the variable parameter density and $p(\tilde{y}_t \mid I_{t-1}, M)$ as the joint forecast density of the different models. Note that the integrals are of dimension $n + 1$ and $n$.

A central step is to give specific content to the different densities. For the case of BPS it follows that:

$$ p(y_t \mid v_t, \tilde{y}_t) = N\Big(y_t \,\Big|\, v_{0,t} + \sum_{i=1}^{n} v_{i,t} \tilde{y}_{i,t}, \; \sigma_t^2\Big), \tag{2} $$

$$ p(v_t \mid v_{t-1}, \Sigma_t) = N(v_t \mid v_{t-1}, \Sigma_t), \tag{3} $$

$$ p(\tilde{y}_t \mid I_{t-1}, M) = \prod_{i=1}^{n} p(\tilde{y}_{i,t} \mid I_{i,t-1}, M_i). \tag{4} $$

We emphasize that the combination density is a multivariate normal one with a time-varying constant $v_{0,t}$ in the conditional mean. This specification adds flexibility to the model combination and allows for forecast adjustments to shocks and regime changes in the data series, while $\sigma_t^2$ allows for time-varying volatility. The parameter $\Sigma_t = \sigma_t^2 W_t$, where $W_t$ is a diagonal matrix with elements $w_{it}$ given below.

A first feature of this approach, compared to standard combination models like BMA, is an analysis of the dynamic behavior of the error $\varepsilon_t$ implied by the combination density. It is given as

$$ \varepsilon_t = y_t - \Big(v_{0,t} + \sum_{i=1}^{n} v_{i,t} \tilde{y}_{i,t}\Big). \tag{5} $$

The forecast error of the $i$th model is usually defined as $y_t - \tilde{y}_{i,t}$ and arises from, for instance, sudden shocks in the series and model misspecification. Given the conditional mean of the combination density in (2), it is seen that the BPS error in (5) can be viewed as a weighted combination of the forecast errors from each of the individual models.

We also investigate the dynamic behavior of the error $\varepsilon_{i,t}$ using only model $M_i$, using:

$$ \varepsilon_{i,t} = y_t - (v_{0,t} + v_{i,t} \tilde{y}_{i,t}). \tag{6} $$

A second feature of the approach is the possibility to learn about the contribution of the different individual model forecasts in the combination. Learning is specified as a random walk process of the continuous latent variable parameters $v_t = (v_{0,t}, \ldots, v_{n,t})$, see Equation (3). We note that the weights may become negative, which in some cases helps in dynamic averaging. That is, the proposed approach is able to automatically assign a negative weight which may mitigate the impact of forecast risk, such as forecast bias. Such natural behavior is not possible under combination models in which the model weights are restricted to be convex combinations, for example, BMA or equal weighting methods.

Given the specified probability model, there exists a system of equations that has been labeled a latent dynamic factor model by McAlinn and West (2019) and McAlinn et al. (2020). However, we interpret this system as a multivariate regression model with generated regressors $\tilde{y}_t$ and latent time-varying parameters $v_{i,t}$. By construction, this equation system can be represented in the form of a generalized linear State Space Model where the explanatory variables $\tilde{y}_{i,t}$ are not given data but generated draws from the forecast distributions of the $n$ models:

$$ y_t = v_{0,t} + \sum_{i=1}^{n} v_{i,t} \tilde{y}_{i,t} + \varepsilon_t, \quad \varepsilon_t \sim NID(0, \sigma_t^2), \tag{7} $$

$$ v_{i,t} = v_{i,t-1} + \varepsilon_{v,t}, \quad \varepsilon_{v,t} \sim NID(0, \sigma_{v,t}^2 = \sigma_t^2 w_t), \quad i = 0, \ldots, n. \tag{8} $$

The time-varying volatility parameters $\sigma_t^2$ and $\sigma_{v,t}^2$ play important roles in this model as smoothness parameters: $\sigma_t^2 = \delta \sigma_{t-1}^2 / \gamma_t$ is a beta–gamma volatility model in which $\delta \in (0, 1]$ is a volatility discount factor and $\gamma_t \sim \mathrm{Beta}(\delta h_{t-1}/2, (1-\delta) h_{t-1}/2)$ is an independent Beta innovation such that $h_t = \delta h_{t-1} + 1$ and $E[\gamma_t \mid h_{t-1}] = \delta$ at all dates $t = 1, \ldots, T$. The weight $w_t = \frac{1-\beta}{\beta} w_{t-1}$ is a component discount term in which $\beta \in (0, 1]$ is a state discount factor. For details, see West and Harrison (2006), sec. 6.3.2 and 10.8.
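To make the role of the volatility discount factor concrete, the following minimal Python sketch simulates the beta–gamma recursion forward in time. It assumes the recursion reads $\sigma_t^2 = \delta \sigma_{t-1}^2 / \gamma_t$ with $\gamma_t \sim \mathrm{Beta}(\delta h_{t-1}/2, (1-\delta) h_{t-1}/2)$; the function name, the default starting values and the use of NumPy are illustrative choices, not part of the article's implementation.

```python
import numpy as np

def simulate_beta_gamma_volatility(T, delta=0.9, sigma2_0=0.002, h_0=10.0, seed=0):
    """Simulate sigma_t^2 = delta * sigma_{t-1}^2 / gamma_t for t = 1..T.

    Illustrative sketch only: names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    sigma2 = np.empty(T + 1)   # sigma2[0] is the starting value
    h = np.empty(T + 1)        # degrees-of-freedom recursion h_t = delta*h_{t-1} + 1
    gamma = np.ones(T)         # Beta innovations (identically 1 when delta == 1)
    sigma2[0], h[0] = sigma2_0, h_0
    for t in range(1, T + 1):
        if delta < 1.0:
            # Beta(delta*h/2, (1-delta)*h/2) has mean delta
            gamma[t - 1] = rng.beta(delta * h[t - 1] / 2.0,
                                    (1.0 - delta) * h[t - 1] / 2.0)
        sigma2[t] = delta * sigma2[t - 1] / gamma[t - 1]
        h[t] = delta * h[t - 1] + 1.0
    return sigma2, h, gamma

sigma2, h, gamma = simulate_beta_gamma_volatility(T=500, delta=0.9)
print(h[-1], gamma.mean())
```

With $\delta = 0.9$ the recursion $h_t = \delta h_{t-1} + 1$ has fixed point $1/(1-\delta) = 10$, so starting at $h_0 = 10$ keeps the Beta innovation distribution unchanged over time; smaller values of $\delta$ produce more volatile paths for $\sigma_t^2$.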

In Figure 1 we show in a roadmap the connections between the components of the model. We distinguish between two figure shapes: rectangles, which contain data and forecasts from different models and their combination; and circles, which contain latent time-varying regression parameters and the unobserved random parameters from the stochastic volatility process which have to be filtered/integrated out.

Fig. 1 FDC model in generalized linear state space form. Given data, rectangles indicate model forecasts and combined forecasts. Circles refer to latent time-varying regression parameters and the random parameters from the stochastic volatility process where filtering/integration is used.

Bayesian estimation procedure using MCMC

The analytic solution of the integrals specified in the probability model (1)–(4) is often not known. We make use of simulation methods in order to deal with this problem. We also make use of Bayesian inference, specifying prior information on the parameters of the stochastic volatility processes and the time-varying equation parameters, which can be interpreted as unobserved states. Apart from the fundamental choice for Bayesian inference, there exists a practical reason in our case. Using simulation-based Bayesian inference, the generated forecast draws from the different models are computationally directly carried forward to the estimation of the combination density. Thus, the uncertainty in the forecasts of the different models carries directly into the uncertainty of the combination forecasts. In contrast, frequentist methods like method of moments or maximum likelihood proceed in a two-step fashion by substituting the point forecasts of the different models in the combination equation, and as such the second stage results suffer from the generated regressor problem; see, for example, Pagan (1984).

The specification of the model discussed so far leads to the formulation of the likelihood of a generalized linear State Space model. Our Bayesian inferential procedure requires choosing values for the discount factors $\delta \in (0, 1]$, $\beta \in (0, 1]$ and priors for the initial values of the time-varying parameters $(v_{i,t}, \sigma_t^2)$ at $t = 0$. The role of the discount factor $\beta \in (0, 1]$ is to operate on the parameter evolution via $w_t = \frac{1-\beta}{\beta} w_{t-1}$. Setting $\beta = 1$ implies a constant coefficients model, that is, $w_t = 0$, while $\beta \in (0, 1)$ is consistent with time-varying coefficients. The parameter $\delta \in (0, 1]$ operates on the volatility evolution via $\sigma_t^2 = \delta \sigma_{t-1}^2 / \gamma_t$ in which $\gamma_t$ are Beta distributed innovations. Relevant choices of the discount factors $\beta$ and $\delta$ are, of course, always context dependent. In our application, we opted for the value 0.9. As a simple robustness check, we computed LS and RMSFE values for the average of the triple (0.85, 0.9, 0.95). The results were consistent with the single value chosen.

Following McAlinn and West (2019), the prior on the latent time-varying parameters is conditional normal, $v_{i,0} \mid \sigma_0^2 \sim N(m_0, C_0 \sigma_0^2 / s_0)$, where the hyperparameter $m_0$ controls the mean, $C_0$ controls the variance, and $h_0$ and $s_0$ jointly control the mean and variance of the measurement volatility; $s_0$ indirectly affects the variance of the parameters. The initial prior on the measurement variance is marginal inverted gamma, $\sigma_0^2 \sim IG(h_0/2, h_0 s_0/2)$. Specific choices for the hyperparameters are $m_0 = (0, 1/n, \ldots, 1/n)$, $C_0 = 10^{-4} I_p$, $s_0 = 0.002$, and $h_0 = 10$.

Algorithmic outline using Kalman Filter and MCMC

We emphasize that the proposed method makes use of a three-step Monte Carlo procedure instead of the usual two-step method. The extra step is due to the generation of random draws from the forecast distributions of the different individual models.

  • Forecast from n models. Generate draws from the forecast distributions of the $n$ different models, which gives $\tilde{y}_{i,t}$, $i = 1, \ldots, n$.

  • Latent variable parameters. Use the Kalman update with initial value $v_{i,0}$, $i = 1, \ldots, n$, and generate variable parameters $v_{i,t}$, $i = 1, \ldots, n$, from the random walk process.

  • SV parameters. Given draws $\tilde{y}_{i,t}$, $i = 1, \ldots, n$, and $v_{i,t}$, $i = 0, \ldots, n$, generate a draw of the SV parameters from the inverted Gamma distribution.

Forecasting proceeds as follows: Given generated $v_{i,t}$, $i = 0, \ldots, n$, generated SV values and generated $\tilde{y}_{i,t}$, $i = 1, \ldots, n$, use (7) to generate a one-step forecast value $y_{t+1}$. Repeating this process gives a synthetic sample of future values and a forecast density at time $t + 1$.
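The Kalman update in the second step can be illustrated with a short Python sketch. This is not the authors' code: it is a standard discount-factor dynamic linear model filter in the style of West and Harrison, applied to a single set of forecast draws, and it covers only the forward filtering pass (the backward sampling of the states and the outer MCMC loop over forecast draws are omitted). The function name, argument names and the diffuse default for `C0` are assumptions made for illustration.

```python
import numpy as np

def discount_dlm_filter(y, forecasts, beta=0.9, delta=0.9,
                        m0=None, C0=None, s0=0.002, h0=10.0):
    """Forward filter for y_t = v_0t + sum_i v_it * ytilde_it + eps_t.

    Illustrative sketch (assumed names/defaults), conditional on one draw
    of the model forecasts. Returns filtered coefficient means, volatility
    point estimates, and one-step-ahead forecast means.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(forecasts, dtype=float)          # shape (T, n)
    T, n = X.shape
    p = n + 1                                       # intercept + n weights
    m = np.r_[0.0, np.full(n, 1.0 / n)] if m0 is None else np.asarray(m0, float)
    C = np.eye(p) if C0 is None else np.asarray(C0, float)
    s, h = s0, h0
    means = np.empty((T, p))
    scales = np.empty(T)
    one_step = np.empty(T)
    for t in range(T):
        F = np.r_[1.0, X[t]]                        # regressors: constant and forecasts
        a = m                                       # random walk: prior mean = last posterior mean
        R = C / beta                                # state discounting inflates the variance
        f = F @ a                                   # one-step forecast mean
        q = F @ R @ F + s                           # one-step forecast scale
        e = y[t] - f                                # forecast error
        A = R @ F / q                               # adaptive (Kalman gain) vector
        h_new = delta * h + 1.0                     # discounted degrees of freedom
        r = (delta * h + e * e / q) / h_new         # volatility update factor
        m = a + A * e
        C = r * (R - np.outer(A, A) * q)
        s, h = r * s, h_new
        means[t], scales[t], one_step[t] = m, s, f
    return means, scales, one_step

# Usage on synthetic data with known (constant) combination weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = 0.2 + 0.7 * X[:, 0] + 0.3 * X[:, 1] + 0.05 * rng.normal(size=300)
means, scales, f = discount_dlm_filter(y, X, beta=0.95, delta=0.95)
print(means[-1])
```

Because the weights are unrestricted regression coefficients, nothing in the update forces them to be positive or to sum to one, which is the property emphasized in the text. Setting `beta=1.0` switches off state discounting and gives a constant-coefficient regression.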

3 Individual Models and Alternative Combinations

Let $S_t$ denote the spot price of crude oil at date $t$. Forecasts are obtained using a general stochastic volatility model with Student's t-distributed errors given as

$$ S_{t+h|t} - \hat{S}_{t+h|t} = \epsilon_{t+h|t}, \quad \epsilon_{t+h|t} \sim T(\mu, e^{h_{t+h|t}}, \nu), \tag{9} $$

$$ h_{t+h|t} = \mu + \phi (h_{t+h-1|t} - \mu) + \zeta_{t+h|t}, \quad \zeta_{t+h|t} \sim NID(0, \omega^2), \tag{10} $$

in which $|\phi| < 1$ and $\hat{S}_{t+h|t}$ denotes a point forecast of the real oil price, which is set equal to the conditional mean of the posterior predictive density. The model is estimated using the Metropolis-within-Gibbs sampling algorithm described in Chan and Hsiao (2014), using 10,000 draws from the posterior distribution after discarding the first 5000 draws as a burn-in.

Individual models

Following Baumeister and Kilian (2015), the point forecasts of the real price of oil, $\hat{S}_{t+h|t}$, are obtained using six state-of-the-art oil price forecasting models. The names and acronyms are listed in Table 1. Next, we summarize the specifications.

Table 1 List of individual forecasting models and various forecast density combination approaches and their acronyms.

The set of models starts with a no-change forecast, with acronym NC:

$$ \hat{S}_{t+h|t} = S_t. \tag{11} $$

The second model includes the changes in the price index of non-oil industrial raw materials and is denoted by CRB:

$$ \hat{S}_{t+h|t} = S_{t|t} \big(1 + \pi_t^{h,rm} - E_t[\pi_{t+h}^{(h)}]\big), \tag{12} $$

in which $\pi_t^{h,rm}$ denotes the percent change of an index of the spot price of industrial raw materials (other than oil) over the preceding $h$ months and is obtained from the Commodity Research Bureau (CRB), and $E_t[\pi_{t+h}^{(h)}]$ denotes the expected rate of inflation over the next $h$ periods, which is proxied by recursively constructed averages of past U.S. consumer price inflation data. This model is based on the intuition that there are broad-based predictable shifts in the demand for globally traded commodities.

The third model includes West Texas Intermediate (WTI) oil futures prices and is denoted by Futures:

$$ \hat{S}_{t+h|t} = S_{t|t} \big(1 + f_t^{WTI,h} - s_t^{WTI} - E_t[\pi_{t+h}^{(h)}]\big), \tag{13} $$

in which $f_t^{WTI,h}$ is the log of the current WTI oil futures price for maturity $h$ and $s_t^{WTI}$ is the log of the WTI spot price. This model reflects the idea that many practitioners and policy institutions rely on the price of oil futures contracts in generating forecasts of the oil price.
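As a worked illustration of Equations (11)–(13), the following Python sketch computes the three point forecasts from their inputs. It assumes the minus signs in front of the expected-inflation term and the log spot price, as in the usual real-price adjustment; all function and argument names are hypothetical and the numbers in the usage example are made up.

```python
import math

def no_change_forecast(real_spot):
    """NC, Equation (11): the forecast equals the current real price."""
    return real_spot

def crb_forecast(real_spot, raw_material_pct_change, expected_inflation):
    """CRB, Equation (12): adjust by the h-month raw-material price change
    net of expected inflation (both expressed as decimal fractions)."""
    return real_spot * (1.0 + raw_material_pct_change - expected_inflation)

def futures_forecast(real_spot, wti_futures_price, wti_spot_price, expected_inflation):
    """Futures, Equation (13): adjust by the log futures-spot spread
    net of expected inflation."""
    spread = math.log(wti_futures_price) - math.log(wti_spot_price)
    return real_spot * (1.0 + spread - expected_inflation)

# Made-up inputs: real price 50, futures at 66 vs. spot at 60, 1% expected inflation.
print(no_change_forecast(50.0))
print(crb_forecast(50.0, 0.04, 0.01))
print(futures_forecast(50.0, 66.0, 60.0, 0.01))
```

When the futures price equals the spot price and expected inflation is zero, the Futures forecast collapses to the no-change forecast, which makes the no-change model a natural benchmark.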

The fourth model includes the spread between spot prices of gasoline and crude oil and is denoted by Gasoline:

$$ \hat{S}_{t+h|t} = S_{t|t} \exp\big(\hat{\beta} [s_t^{gas} - s_t^{WTI}] - E_t[\pi_{t+h}^{(h)}]\big), \tag{14} $$

in which $s_t^{gas}$ is the log of the nominal U.S. spot price of gasoline. This model reflects the belief of many market practitioners that a rising spread between the price of gasoline and the price of oil signals upward pressure on the price of oil.

The fifth model is a time-varying parameter model of the gasoline and heating oil spreads and is denoted by TVSpread:

$$ \hat{S}_{t+h|t} = S_{t|t} \exp\big(\hat{\beta}_{1,t} [s_t^{gas} - s_t^{WTI}] + \hat{\beta}_{2,t} [s_t^{heat} - s_t^{WTI}] - E_t[\pi_{t+h}^{(h)}]\big), \tag{15} $$

in which $s_t^{heat}$ is the log of the nominal U.S. spot price of heating oil, which is obtained from the EIA, and the time-varying parameters evolve according to a random walk with independent Gaussian white-noise errors.

The sixth model is an oil market Vector Autoregressive Model and is denoted by VAR:

$$ y_t = b + \sum_{i=1}^{p} B_i y_{t-i} + e_t, \tag{16} $$

where $y_t$ is a $4 \times 1$ vector of variables including: the percent change in global crude oil production; the global real economic activity index of Kilian (2009); the log of the real price of oil, from which $\hat{S}_{t+h|t} = \exp(\hat{y}_{3,t+h|t})$; and global above-ground crude oil inventories. This model can be viewed as the reduced-form representation of the global oil market structural VAR model developed by Kilian and Murphy (2014). Recently, Hamilton (2021) argued that an alternative measure, derived from world industrial production, is a better indicator of global real economic activity. Comparing various measures of global real economic activity, Baumeister, Korobilis, and Lee (2020) find that models based on world industrial production or a common factor extracted from a panel of real commodity prices provide the best forecasts of the real price of oil. However, due to the lack of available real-time data vintages, we refrain from using these alternative measures of global economic conditions.

Combination models

We use five of the individual models listed in Table 1 to compose model combinations: CRB, Futures, Gasoline, TVSpread and VAR. The NC model is then used as a benchmark model to compare the relative forecast performance of the various models.

In addition to the BPS model discussed in Section 2, we also consider Bayesian model averaging (BMA). BMA is a popular ensemble learning method that has been widely used within the econometrics literature (see, e.g., Aastveit et al. 2019 and references therein). When using BMA the individual forecast densities, $p(\tilde{y}_{t,h} \mid M_i, I_t)$, from model $M_i$ are pooled into a combined posterior/forecast density, $p(\tilde{y}_{t,h} \mid I_t)$, given as

$$ p(\tilde{y}_{t,h} \mid I_t) = \sum_{i=1}^{N} w_{i,t,h} \, p(\tilde{y}_{t,h} \mid M_i, I_t), \tag{17} $$

where the weights, $w_{i,t,h}$, are specified in one of two ways. In the first instance, following, among others, Amisano and Giacomini (2007), Hall and Mitchell (2007), and Jore, Mitchell, and Vahey (2010), we use recursive weights based on the logarithmic score which take the form

$$ w_{i,t,h} = \frac{\exp\big(\sum_{t=T_0}^{T-1-h} \ln p(\tilde{y}_{t,h} \mid M_i, I_t)\big)}{\sum_{i=1}^{N} \exp\big(\sum_{t=T_0}^{T-1-h} \ln p(\tilde{y}_{t,h} \mid M_i, I_t)\big)}, \tag{18} $$

in which $T_0$ denotes the start date of the forecast evaluation period and $T$ denotes the end date of the period. In addition to this, we consider a version of BMA which uses a two-year rolling window when updating the weights.
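The log-score weights in Equation (18) amount to a softmax over each model's cumulative log predictive score. The following Python sketch shows this computation, together with an optional rolling window and the linear pool of Equation (17). The function names and the `window` argument are illustrative assumptions, and subtracting the maximum before exponentiating is a numerical-stability device that leaves the weights unchanged.

```python
import numpy as np

def log_score_weights(log_scores, window=None):
    """Weights proportional to exp(cumulative log score), as in Equation (18).

    log_scores: array of shape (T, N) holding ln p(y_t | M_i, I_t) evaluated
    at the realized values. If `window` is given, only the last `window`
    rows are used (rolling-window variant).
    """
    ls = np.asarray(log_scores, dtype=float)
    if window is not None:
        ls = ls[-window:]
    cum = ls.sum(axis=0)
    w = np.exp(cum - cum.max())      # subtract max to avoid underflow
    return w / w.sum()

def pooled_density(model_densities, weights):
    """Linear opinion pool of Equation (17) at a single evaluation point."""
    return float(np.dot(weights, model_densities))

# Two models over two periods; model 0 has the higher log scores.
scores = np.array([[-1.0, -2.0],
                   [-1.0, -2.0]])
w = log_score_weights(scores)
print(w, pooled_density([0.4, 0.1], w))
```

By construction these weights lie in the unit interval and sum to one, which is exactly the convexity restriction that the BPS weights in Section 2 relax. For monthly data, `window=24` would correspond to a two-year rolling window.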

Finally, we also consider equally weighted forecasts (equal) where we set the weight attached to each model to $w_{i,t,h} = 1/N$ in Equation (17). In fact, such a simple combination of forecasts is commonly used (see, for instance, Timmermann 2006; Stock and Watson 2006; Clark and McCracken 2010; and Baumeister and Kilian 2015) and is often found to outperform more sophisticated adaptive forecast combination methods.

4 Forecasting Results

In this section we present results from a real-time, out-of-sample forecast study, in which we generate both point and density forecasts of the real price of oil in the global market for crude oil. Following Garratt, Vahey, and Zhang's (2019) recent replication of Baumeister and Kilian (2015), we use real-time monthly data, where the real price of oil in the global market is approximated by deflating the U.S. refiners' acquisition cost for crude oil imports (IRAC) by the seasonally adjusted U.S. consumer price index for all urban consumers (CPI). The dataset also includes monthly real-time vintages of variables used for estimating the various individual models, such as world oil production, oil inventories and the global real economic activity index. The full data cover the period 1973:01–2017:12. The individual models discussed in Section 3 are initially estimated on data from 1973:01–1991:12, and forecasts are then made over the remaining data for the period 1992:01–2017:12 using real-time data vintages. When constructing the combinations, BPS requires an initial training data period which we set to 50 months. Since the first 24 months' worth of forecasts account for differences in the forecast horizons, all forecasting models are evaluated on the same period of 1998:03–2017:12. Our objective is to forecast the final release of the real oil price data.

4.1 Typical Data Patterns of the Real Price of Oil

We begin the analysis by examining the real price of oil in various transformations as shown in Figure 2. The shaded regions highlight various episodes of historical significance for the global market for crude oil: the 1979 oil crisis, the start of the Iran–Iraq War in 1980, the collapse of OPEC in 1985, the 1990/1991 Persian Gulf War, the Asian Financial Crisis of 1997/1998, the oil price surge of mid-2003 to 2008, the collapse of the oil price during the Great Recession and the oil price decline of mid-2014 to early 2015. The various transformations collectively highlight three typical data features in the period 1973–2018. First, the log-level series shows substantial changes in its mean, which suggests that a time-varying autoregressive mean process may be beneficial. Second, the returns and squared returns series show volatility clustering, suggesting that stochastic volatility is an important data feature to model. Third, using subperiod analysis, a changing mean and volatility pattern in the log-level series indicates that a time-invariant autoregressive mean model with SV may provide reasonably accurate forecasts over the initial data period, 1974–2002, as well as the periods 2010–2014 and 2015–2018. This confirms that stable patterns are present within sub-periods but that periodic shocks at the mean level have occurred as indicated above.

Further evidence of time-varying elements of the oil price distribution can be seen in Figure 3, which shows data distributions over the full data period and notable sub-periods: the forecast evaluation period (1998–2018), a period of turmoil (1973–1987), a period of tranquility (1988–1997), the oil price surge (1998–2007) and the most recent decade (2008–2018). In each case, the horizontal axis represents the level of the real price of oil in USD, as also shown in the top left panel in Figure 2, pooled into nine bins. The vertical axis represents the pooled count of observations over the corresponding (sub-)periods. Substantial time variation in the shape of the data distributions is seen. It is noteworthy that asymmetry, fat tails and bimodality are important features of the data. This suggests that models which allow for nonlinearities in both mean and variance, as well as fat tails, such as our proposed BPS framework, may provide forecast improvements over simpler linear models, such as the suite of individual forecasting models and the commonly used equal weight and BMA combination schemes discussed in Section 3. In the next section we discuss the accuracy of the estimated densities compared to the different data distributions.

Fig. 2 Real IRAC price of oil at monthly frequency over the period: 1973:01–2017:12.

Fig. 3 Distributions of the real IRAC price of oil in levels at monthly frequency over the period: 1973:01–2017:12, the forecast evaluation period: 1998:03–2017:12, and various sub-sample periods. The horizontal axis represents the real price of oil in USD and the vertical axis represents the pooled counts of observations in the corresponding bins.

4.2 Forecast Accuracy of Individual Models and Combinations

Forecast results across the evaluation period are provided in Table 2. The upper panel provides density forecast results evaluated by the log score relative to a no-change model benchmark. The lower panel provides point forecast results evaluated by the RMSFE relative to a no-change model benchmark. For interpretation purposes, log score values that are greater than zero indicate that the model outperforms the benchmark and vice versa. In contrast, RMSFE values that are less than one indicate that the model outperforms the benchmark and vice versa. To determine whether the forecast improvements or deteriorations are credibly different from zero, we report results from the Diebold-Mariano test with both 95% and 99% credible intervals. To determine the best model with a given level of credibility, we also report tail probabilities (p-values) for the model credible set (MCS), a Bayesian interpretation of the model confidence set of Hansen, Lunde, and Nason (2011), in Table 3.
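The two relative accuracy measures described above can be illustrated with a minimal Python sketch. It assumes normal predictive densities summarized by a mean and standard deviation, and it averages (rather than sums) the log scores; both choices, along with all function names and numbers, are illustrative assumptions and not the computation used to produce Table 2.

```python
import math


def rmsfe(forecasts, actuals):
    """Root mean squared forecast error of a sequence of point forecasts."""
    squared_errors = [(f - y) ** 2 for f, y in zip(forecasts, actuals)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmsfe(model_forecasts, benchmark_forecasts, actuals):
    """RMSFE ratio; values below one favor the model over the benchmark."""
    return rmsfe(model_forecasts, actuals) / rmsfe(benchmark_forecasts, actuals)


def normal_log_density(y, mean, sd):
    """Log of a normal predictive density evaluated at the realized value y."""
    return -0.5 * math.log(2.0 * math.pi * sd ** 2) - (y - mean) ** 2 / (2.0 * sd ** 2)


def relative_log_score(model_densities, benchmark_densities, actuals):
    """Average log score of the model minus that of the benchmark.

    Each density is a (mean, sd) pair (an assumption made for this sketch);
    values above zero favor the model.
    """
    def average_score(densities):
        scores = [normal_log_density(y, m, s) for y, (m, s) in zip(actuals, densities)]
        return sum(scores) / len(scores)

    return average_score(model_densities) - average_score(benchmark_densities)


# Hypothetical realized prices and forecasts (not data from the article).
actuals = [50.0, 52.0, 55.0]
no_change = [49.0, 50.0, 52.0]   # benchmark: last observed price
model = [50.0, 51.0, 54.0]       # some competing model

print(relative_rmsfe(model, no_change, actuals))
print(relative_log_score([(m, 2.0) for m in model],
                         [(b, 2.0) for b in no_change],
                         actuals))
```

In this toy example the ratio is below one and the log score difference is above zero, so both measures favor the model over the no-change benchmark.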

Table 2 Density (Log Score) and point (RMSFE) forecast results relative to a no-change benchmark.

Table 3 Model credible set (MCS) tail probabilities (p-values) for density (Log Score) and point (RMSFE) forecasts.

First focusing on density forecasts, we find that the proposed BPS approach provides the best forecasts when forecasting the real price of oil beyond the immediate one-month-ahead forecast horizon, and that these improvements are credibly different from zero. The no-change model is difficult to beat when producing one-month-ahead density forecasts; however, the performance difference between BPS and the no-change model is not credibly different from zero at this horizon. Interestingly, we find that the Futures model improves upon the no-change model at each forecast horizon, while the other individual models generally fail to outperform the benchmark; exceptions include the CRB model at the one-step-ahead horizon and the VAR at horizon 24. Due to aggregated forecast uncertainty, the equal weight and BMA combination methods do substantially worse than the no-change model at all horizons. By allowing for a more flexible weighting procedure, the two-year rolling window BMA2 outperforms both the equal weighted and expanding window BMA approaches, and also improves upon the no-change benchmark at longer horizons.

Shifting focus to point forecasts, we find that the proposed BPS model provides substantial improvements over the no-change benchmark at all forecast horizons, and that these improvements are credibly different from zero beyond the one-month horizon. Consistent with Baumeister and Kilian (2012), we find that the commodity price-based model improves upon the no-change model at the one-month horizon but does worse at longer horizons. Also in line with Baumeister and Kilian (2015), we find that the equal weights combination model outperforms the no-change benchmark at all forecast horizons, and that the importance of futures-based information improves with the forecast horizon. In contrast to the density forecast results, all combination methods are found to improve upon the point forecast accuracy of the benchmark; however, it is clear that specifying time-varying weights as in BPS and BMA2 produces the largest gains. In line with the density forecast results, the point forecast accuracy of BMA is found to be similar to that of the equal weights model, a result that is commonly referred to as the "forecast combination puzzle" within the broader literature on model combinations of competing point forecasts. That being said, we find that by specifying more dynamic weighting schemes in BMA2 and learning weights in BPS, we are able to generate greater forecast accuracy at all horizons. Moreover, the proposed BPS approach provides substantial improvements beyond all individual and combination models. The size of these improvements is also increasing with the forecast horizon, in which oil prices are generally assumed to exhibit near random walk behavior. This suggests that the proposed BPS model may be especially useful for practitioners who hedge against oil price risk. We further explore this in Section 5.

The usefulness of BPS for forecasting the real oil price is further supported by results in Table 3, which show that BPS is generally within the set of superior models, with one exception being the one-step-ahead density forecasts. The observation that combinations with equal and BMA weights are excluded from the set of superior models at all horizons for the density forecasts, while BMA2 and BPS are included, is especially noteworthy. This emphasizes the importance of allowing for time-varying weights in combination models.

As a next step, we determine whether differences in forecast accuracy between models and model combinations hold throughout the time series periods for the different forecast horizons. To this end, we show the time patterns of cumulative Log Scores and RMSFEs relative to a no-change model benchmark in Figures 4 and 5, respectively. For interpretation purposes, values in Figure 4 that are greater than zero indicate that the model outperforms the benchmark and vice versa, while values in Figure 5 that are less than one indicate that the model outperforms the benchmark and vice versa.

Fig. 4 Time patterns of cumulative log scores relative to a no-change model benchmark over the forecast evaluation period: 1998:03–2017:12. Forecast horizons: 1, 6, 12, and 24 months ahead.

Fig. 5 Time patterns of RMSFEs relative to a no-change model benchmark over the forecast evaluation period 1998:03–2017:12. Forecast horizons: 1, 6, 12, and 24 months ahead.

The results in Figure 4 reveal considerable time variation in the relative performance of both individual and combination model specifications. In line with the full data period results, the proposed BPS model is competitive at the one-step horizon and outperforms the no-change benchmark, all individual models, and all combination models at longer forecast horizons. In contrast, both the equal weight and BMA models get progressively worse relative to the no-change benchmark, while the individual model specifications tend to cluster around similar values. Finally, while the rolling window BMA model (BMA2) does quite poorly when forecasting one-month-ahead, its forecast accuracy is competitive with the best individual model forecasts when forecasting six-months-ahead, and is second only to the BPS model at the longer 12 and 24 month horizons.

Turning our attention to the point forecast results, reported in Figure 5, we again find considerable time variation in the relative performance of each model specification. In line with the density forecast results, we find that the BPS model is competitive at the one-month horizon and provides substantial improvements at longer horizons. The fact that most of the models' RMSFEs cluster around one at the one-month horizon suggests that the oil price exhibits near random walk behavior at this horizon. A notable exception occurs around the oil price collapse in 2009, during which the combination models provide notable improvements over the no-change benchmark; however, these gains gradually dissipate over the next few years. In contrast to the relatively similar forecast performance at the one-month horizon, the longer horizons exhibit much more dispersion. For instance, at each horizon, the futures price model produces generally superior forecasts relative to the no-change benchmark with the exception of the early to mid 2000s. This is in line with existing results that the real price of oil between mid 2003–2008 was driven by unexpectedly high growth mainly in emerging Asia (Aastveit, Bjørnland, and Thorsrud 2015). We also find that the VAR model performs well up until 2010, but then does relatively poorly over the subsequent period, thereby corroborating recent results in Baumeister, Korobilis, and Lee (2020).

In Table A1 in the supplementary materials, we report results for absolute forecast accuracy by testing if the density forecasts are correctly calibrated over the entire forecast evaluation period, using the test in Knüppel (2015). We also report in Table A2 in the supplementary materials results from a two-sample Kolmogorov–Smirnov test between the empirical cumulative distribution function of the data and the cumulative distribution function of the BPS forecasts at each forecast horizon. The results from these tests suggest that the BPS forecasts are well calibrated and provide a good approximation of the data distribution at each of the forecast horizons.
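As a rough illustration of the two-sample Kolmogorov–Smirnov comparison mentioned above, the following Python sketch computes only the test statistic, that is, the largest vertical distance between two empirical distribution functions. Treating the forecast distribution as a sample of draws is an assumption of this sketch; the function name and data are hypothetical, and no p-value is computed.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # Both empirical CDFs are step functions that only jump at observed
    # values, so it is enough to compare them at every observed point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical realized values and forecast draws (not data from the article).
realized = [1.0, 2.0, 3.0, 4.0]
forecast_draws = [3.0, 4.0, 5.0, 6.0]
print(ks_two_sample_statistic(realized, forecast_draws))
```

A statistic near zero indicates that the two empirical distributions are close, while a value near one indicates that they barely overlap.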

4.3 Learning about Time-Varying Combination Weights and Model Diagnostics

An important feature of BPS is that it allows for time-varying individual model weights that adapt according to the recent relative forecast performance of each model within the model combination set. The means of the densities of the individual model weights are shown in Figure 6. The general observation across all forecast horizons is that while considerable time variation exists, there are similarities among models. First, focusing on the one-step-ahead results, we find that the mean weights for both the CRB and futures price models tend to follow a similar trajectory, while the Futures, Spread and TV spread models, respectively, follow a similar path that is distinct from the former trajectory. It is particularly notable that the mean weights for the two series in the former group abruptly increase during the two oil price collapses of 2009 and 2014 respectively, but gradually decline in the subsequent years surrounding these events. In contrast, the mean weights for the three series in the latter group sharply declined during the oil price collapse of 2009, but then gradually increase, with all mean weights sharing roughly similar time patterns by the end of the period. Interestingly, the same weighting clusters are not observed at longer horizons. At the 12-month-ahead horizon, we find that each of the mean weights tends to follow a similar trajectory. One notable exception is the abrupt increase in the mean weight of the Futures model following the 2014 oil price collapse.

Fig. 6 Time-varying posterior predictive mean of the individual model weights ($v_{i,t}$) in the BPS model, sequentially computed at each point in time over the forecast evaluation period 1998:03–2017:12.

It is also worth noting that the BPS mean weights are not restricted to be a convex combination in the unit interval, as they are in most of the combination methods that are currently used within the econometrics literature, but are instead specified as a general linear combination and are thereby permitted to evolve along the real line. While this may have a disadvantage relative to the natural interpretation of mean weights within convex combinations as representing a probability distribution over different possible models, specifying a linear combination offers two practical advantages. First, allowing for both positive and negative mean weights means that it is possible to hedge against any potential forecast risk in recent periods. Second, a declining mean weight on one model does not necessarily imply an increasing mean weight on another model. Instead, mean weights of models are permitted to change in accordance with the forecasts of the individual models. Such features are clearly desirable to a broad range of practitioners. For instance, just as a financial portfolio manager hedges against risk by assigning a negative mean weight to an asset, the BPS approach is able to automatically assign a negative mean weight to mitigate the impact of forecast risk, such as forecast bias. Such natural behavior is not possible under combination models in which the mean weights are restricted to be convex combinations, for example, BMA or equal weighting methods.

Another important feature of BPS is that it has a built-in time-varying intercept that is absent from simpler combination methods such as BMA. It is well known within the econometrics literature on forecast combinations that BMA assumes that the true model is included in the model set. However, the model set could be misspecified due to incompleteness. By allowing for an intercept component that can adapt during episodes of low frequency signals from a set of forecasting models, BPS is able to better mitigate the effects of this problem. For instance, from the previous section, we know that there exists a model within the combination set, for example, the VAR model, that provided superior one-step-ahead forecasts of the real price of oil up to and during the oil price drop of the Great Recession relative to the no-change benchmark. After 2010, however, none of the models forecasted the oil price collapse of 2014. This suggests that there exists a degree of model set incompleteness since 2010, and we expect that this is reflected in the estimated BPS intercept for the one-step-ahead forecast. This is exactly what we find in Figure 7, which shows the mean intercept terms at each forecast horizon. The one-step-ahead mean intercept is around zero up until 2010, when it starts to gradually increase before abruptly dropping and becoming negative during the oil price collapse of 2014. Moreover, as shown previously in Figure 5, this feature allows BPS to improve upon the no-change benchmark despite the relatively weak signals stemming from the models during this period. Similar scenarios can be observed in the remaining estimates, which each exhibit a hump shaped response for the respective estimated mean intercepts over the period 2008–2014.
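The role of unrestricted weights and of the intercept can be illustrated with a deliberately simplified Python sketch. It works with fixed point forecasts only and ignores time variation, stochastic volatility, and estimation uncertainty, so it is not the BPS model itself; the function name, the two hypothetical models, and their biases are assumptions made for illustration.

```python
def combine(forecasts, weights, intercept=0.0):
    """Linear combination of point forecasts: intercept + sum of w_i * f_i."""
    return intercept + sum(w * f for w, f in zip(weights, forecasts))


# Hypothetical setup: both models overshoot the true price, by 2 and by 4.
truth = 60.0
forecasts = [truth + 2.0, truth + 4.0]

# Any convex combination inherits a bias between 2 and 4.
convex = combine(forecasts, [0.5, 0.5])

# A negative weight on the second model cancels the bias exactly.
linear = combine(forecasts, [2.0, -1.0])

# Alternatively, an intercept can absorb a bias that the model set shares.
shifted = combine(forecasts, [1.0, 0.0], intercept=-2.0)

print(convex, linear, shifted)
```

Here the equal-weight combination stays biased at 63, while both the combination with a negative weight and the combination with an intercept recover the true value of 60.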

Fig. 7 Time-varying posterior predictive mean of the intercept coefficient weight ($v_{0,t}$) in the BPS model, sequentially computed at each point in time over the forecast evaluation period 1998:03–2017:12. The red dotted lines show the 95% credible bands.

The final important feature of BPS is that it has built-in diagnostic information measures about forecast inaccuracy and/or model set incompleteness, which are also absent from simpler combination methods such as BMA. We first present this diagnostic measure, $\sigma_t^2$, for the model set in Figure 8. It shows clearly that this measure increases during three crisis periods. In Figure A2 in the supplementary materials, we also provide estimates of this diagnostic measure for each individual model, $\sigma_{i,t}^2$, within the BPS framework. It is seen from the longer term forecasts that none of the individual models is capable of accurately forecasting crises; in particular, the VAR does poorly in crisis periods estimated over short as well as long horizons. This type of diagnostic information gives important signals about specifying model and model set improvements. We leave this as a topic for further research.

Fig. 8 Time-varying variance, measured as the posterior predictive mean of the measurement variance, in the BPS model ($\sigma_t^2$), sequentially computed at each point in time over the forecast evaluation period 1998:03–2017:12. The red dotted lines show the 95% credible bands.

4.4 Robustness Checks

In this section we consider various robustness checks to our main forecasting exercise.

4.4.1 Alternative Oil Price Series

We have focused on forecasting the IRAC price of crude oil, which is commonly viewed as a proxy for the global price of oil. Two alternative series that are often cited in the press are the Brent and West Texas Intermediate (WTI) prices of crude oil. We therefore repeated the main forecasting exercise using both of these series. Results in Tables A3 and A4 in the supplementary materials show that while some quantitative differences emerge, our qualitative conclusion that BPS provides the best forecast results at all but the one-step-ahead horizon remains robust to the choice of oil price series. The associated MCS and PITs tests (Tables A5–A7 in the supplementary materials) are also broadly consistent with those from the IRAC.

4.4.2 Alternative BPS Specification

The BPS coefficients are able to simultaneously change over time and learn from previous performance. It is therefore natural to explore the practical significance of the random walk state equation in the main BPS specification. To this end, we redo the main forecasting exercise using an alternative specification in which we maintain the same stochastic volatility structure as in the BPS model; however, the random walk component is shut off and the combination weights and intercept are instead estimated with standard linear regression techniques. Given the recursive nature of any forecasting exercise, this enables the combination weights to update over the forecast evaluation period; however, there will be substantially less flexibility in the learning process. This alternative specification can be estimated with straightforward modifications of the equation system in Equations (7)–(8) and textbook algorithms in West and Harrison (2006), sec. 6.3.2 and 10.8.

In Table A8 in the supplementary materials we show the density and point forecast results using the alternative specification in which the combination weights and intercept are estimated with linear regression techniques. The main insight is that the alternative specification provides comparable results to the main BPS specification at the shorter horizons; however, the main specification, in which the combination weights and intercept follow random walk learning, provides superior results at longer horizons. This result highlights the importance of allowing for both flexible combination weights and a flexible intercept at longer forecast horizons.

4.4.3 Alternative Models

In our main analysis we have expanded on the empirical results in Baumeister and Kilian (2012, 2015) by investigating whether a combination forecast using BPS can outperform their six individual models and conventional combination methods. Extensive analysis in Alquist, Kilian, and Vigfusson (2013) also suggests that these models tend to produce better point forecasts than simple univariate time series models such as AR and ARMA models. That being said, forecasters may not necessarily use these six models in practice, and may opt for simpler regressions with alternative predictors such as exchange rates or interest rates. To conserve space we here present a concise discussion of the results and defer all tables with results to the Appendix, supplementary materials.

With this in mind, we estimate an additional seven predictive regressions for which the choice of variables is motivated by the tests of Granger causality in Table 8.1 of Alquist, Kilian, and Vigfusson (2013), and which include measures of exchange rates, interest rates, money and inflation. Each of these specifications takes the same form as in Equation (12), where we replace the CRB commodity price index with the alternative predictor. The main insight is that none of the additional models improves on the forecast accuracy of the no-change model.

We have further investigated how adding some of these predictive models to the combination set affects the results from the various combination methods considered in our article. The results, detailed in the supplementary materials, show that including these regression models in the combination set has very little effect on the BPS forecasting performance. In contrast, the alternative combination methods generally yield worse results. This is particularly the case for density forecasts for equal weights and BMA combinations.

Finally, to investigate the relative importance of time-varying combination weights and time-varying model parameters, we have estimated each of the predictive regressions with time-varying parameters via a random walk state equation. Overall, the results for the individual models improve upon the constant parameter regression models, particularly for density forecasts. This suggests that using time-varying parameter models is somewhat useful when forecasting the price of oil. To further explore this insight, we compute forecast combinations where we include TVP regression models in the combination set. We find that the forecasting accuracy of combination models with individual TVP regression models is very similar to that of combination models with constant coefficient individual models. This indicates that it is more important to account for time-varying combination weights than individual time-varying parameters when forecasting the real price of oil.

5 Risk Analysis

In this section we analyse the risk and return properties of investing in the global market for crude oil using the BPS modelling approach as an investment tool. The means of the profit and loss distribution, including 95% credibility regions, associated with the forecasted spot prices from BPS are shown in Figure 9. For interpretation purposes, we highlight that positive mean values indicate a profit and negative mean values indicate a loss. There are no periods of profits or losses that are credibly different from zero, nor is there any observable serial correlation pattern across the entire data period, thus corroborating the widely held view that, in the short run, oil prices exhibit random walk behavior. That being said, there are periods in which BPS does well, and not so well. For instance, the means signify that notable profits could have been made during the two oil price collapses of 2008–2009 and 2014–2015; however, they then tend to gradually revert to zero and the credible set contains negative values.

Fig. 9 Means of the profit and loss distribution (profit positive and loss negative) associated with the forecasts from the BPS model, sequentially computed at each point in time over the forecast evaluation period 1998:03–2017:12. The red dotted lines show the 95% credible bands.

To quantify the risk of loss associated with the profit and loss distributions, we next compute the value-at-risk (VaR). The VaR is widely used by regulators and practitioners in the financial industry to measure the quantity of assets needed to cover possible losses. The implied 1% and 5% VaR for the BPS model profit and loss distribution over the forecast evaluation period are shown in Figure A3 in the supplementary materials. The vertical axis is in percent. For interpretation purposes, a one-month 1% VaR of -2, for instance, means that there is a 1% chance of a loss of at least 2% during the one-month period.
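To make the VaR interpretation concrete, the following Python sketch reads the VaR off a set of simulated profit and loss outcomes as an empirical quantile. The nearest-rank quantile rule, the function name, and the simulated values are assumptions made for illustration; how the article derives the VaR from the BPS predictive distribution is not reproduced here.

```python
import math


def value_at_risk(pnl_draws, alpha):
    """Empirical alpha-quantile of profit and loss draws (nearest-rank rule).

    Draws are in percent, with losses negative, so a 1% VaR of -2 means that
    roughly 1% of the draws show a loss of 2% or more.
    """
    ordered = sorted(pnl_draws)
    # Small tolerance guards against floating point noise in alpha * n.
    rank = math.ceil(alpha * len(ordered) - 1e-9)
    return ordered[max(rank - 1, 0)]


# Hypothetical profit and loss outcomes in percent: -50, -49, ..., 49.
draws = list(range(-50, 50))
print(value_at_risk(draws, 0.01))  # worst 1% of outcomes
print(value_at_risk(draws, 0.05))  # worst 5% of outcomes
```

With 100 equally likely outcomes, the 1% VaR is the single worst draw and the 5% VaR is the fifth worst draw.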

Faced with such risks when operating in global oil markets, firms and portfolio managers naturally face the decision whether or not to hedge against unanticipated fluctuations in the price of oil. For instance, a petroleum company may wish to hedge its purchase price of crude oil by purchasing a futures contract. A widely used strategy for computing the optimal number of contracts needed to hedge a position is the ratio of the product of the optimal hedge ratio and the units of the position being hedged, to the size of a futures contract. The most common optimal hedge ratio is the Minimum Variance Hedge (MVH) ratio, which aims to minimize the variance of the position's value. It is calculated as the product of (i) the correlation coefficient between the changes in the spot and futures prices, $\rho_{S,F}$, and (ii) the ratio of the standard deviation of the changes in the spot price, $\sigma_S$, to the standard deviation of the changes in the futures price, $\sigma_F$.
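The hedging calculation described above can be sketched in a few lines of Python. The sketch computes the MVH ratio as $\rho_{S,F} \, \sigma_S / \sigma_F$ from two series of price changes and then the implied number of futures contracts. The function names, the price changes, the position size, and the contract size are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from statistics import mean, stdev


def correlation(x, y):
    """Sample correlation coefficient between two equally long series."""
    mean_x, mean_y = mean(x), mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(spread_x * spread_y)


def mvh_ratio(spot_changes, futures_changes):
    """Minimum variance hedge ratio: rho_{S,F} * sigma_S / sigma_F."""
    rho = correlation(spot_changes, futures_changes)
    return rho * stdev(spot_changes) / stdev(futures_changes)


def optimal_contracts(hedge_ratio, position_units, contract_size):
    """Hedge ratio times units hedged, divided by the size of one contract."""
    return hedge_ratio * position_units / contract_size


# Hypothetical monthly price changes: spot moves half as much as futures.
futures_changes = [1.0, -2.0, 3.0, -1.0]
spot_changes = [0.5, -1.0, 1.5, -0.5]

ratio = mvh_ratio(spot_changes, futures_changes)
print(ratio)
# Hypothetical position of 100,000 barrels and contracts of 1,000 barrels.
print(optimal_contracts(ratio, 100_000, 1_000))
```

Because the spot changes in this example are exactly half the futures changes, the correlation is one and the hedge ratio is 0.5, which implies hedging the position with about 50 contracts.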

Means of the MVH ratios, including 95% credibility regions, are shown for the forecasted spot price from the BPS model in Figure 10. Since the Brent price of crude oil is often used as a global price benchmark, we have used Brent futures data from 1992:1–2017:12 as provided by Garratt, Vahey, and Zhang (2019). The results show that the means of the optimal hedge ratio differ based on the forecast horizon. For instance, the mean of the MVH ratio tends to fluctuate between 0.2 and 0.4 at the one-step-ahead horizon, compared to 0–0.1 at the six-step-ahead horizon. That being said, at each horizon we find that notable spikes occur around the turn of the century as well as the two oil price collapses of 2009 and 2014.

Fig. 10 The minimum variance hedge ratio (MVH) from the BPS model, sequentially computed at each point in time over the forecast evaluation period 1998:03–2017:12. The red dotted lines show the 95% credible bands.

6 Conclusion

Given the typical data pattern of the real price of oil over a substantial time period and some well-known models that describe these data, discussed in Sections 3 and 4, we have successfully specified a basic probabilistic model structure and corresponding state space equation system based on the Bayesian Predictive Synthesis approach. Compared to more standard approaches like BMA, our BPS approach contains important extensions concerning diagnostic analysis of model set incompleteness and time-varying learning weights in the combination. This approach also leads to the use of numerically efficient Markov chain methods in order to evaluate the Forecast Density Combination.

Applying a Bayesian procedure to estimate this model, we have obtained an extensive set of empirical results about time-varying forecast uncertainty and risk for the real price of oil over the period 1974–2018. This yielded substantial gains in forecast accuracy from point and, in particular, density forecasts using model combinations compared to individual models. These forecast gains are confirmed by exploring the estimated forecast time patterns, especially in the long term like 12–24 months, which is relevant information when forecasts are used for policy decisions. Dynamic patterns of the estimated individual model weights showed the relative contribution of individual models in the forecast combination. In addition, time patterns of diagnostic information about model incompleteness were obtained which give information on possible improvements in model specifications. We ended our analysis by showing results of time-varying risk in the oil price forecasts and presented a basic analysis of profit-loss and hedging against price risk.

The research presented can be extended in several directions and to many classes of economic datasets that are of interest to forecasters and policy makers. Exchange rate forecasting and risk analysis using sets of countries is an obvious example. Using micro-data to strengthen the information contained in macroeconomic forecasts and using large sets of finance data for dynamic portfolio analysis are further research topics with potentially interesting policy implications.

We end with a remark on the possible connections between typical data patterns of economic variables of interest, the complexity of an FDC model and the class of Monte Carlo simulation algorithms which has to be used for the numerical evaluation of the densities involved. The literature in this field is extensive and still expanding; for a recent survey we refer to Aastveit et al. (2019). Which FDC approach is the most useful to apply depends on data patterns and model specification. This is an interesting topic for future research but beyond the scope of the present article.

Source: https://www.tandfonline.com/doi/full/10.1080/07350015.2022.2039159