
bigvar's People

Contributors

jonlachmann, wbnicholson, yixuan


bigvar's Issues

Invoking BigVAR with a pandas DataFrame for data does not work; to_numpy() provides a workaround

It is not unusual to pass a pandas DataFrame where an array is requested.
(I am not sure whether it is the library's responsibility to handle this.)

With xyz as a DataFrame, I successfully invoked:

mod=BigVAR(xyz, p=lag_max, struct="Basic", gran=[150,10], T1=T1, T2=T2, VARX={})

But then rolling_validate(mod) fails with an error such as:

--> 190         trainY = np.array(Y[p:Y.shape[0], :], copy=True)
    191         print(f'trainY shape is {trainY.shape}')
    192         trainY = np.array(trainY[0:T2, :], copy=True)

d:\Anaconda3\envs\py39\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   3022             if self.columns.nlevels > 1:
   3023                 return self._getitem_multilevel(key)
-> 3024             indexer = self.columns.get_loc(key)
   3025             if is_integer(indexer):
   3026                 indexer = [indexer]

d:\Anaconda3\envs\py39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3078             casted_key = self._maybe_cast_indexer(key)
   3079             try:
-> 3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:
   3082                 raise KeyError(key) from err

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '(slice(2, 365, None), slice(None, None, None))' is an invalid key

Workaround: be sure to convert the DataFrame to a NumPy array before calling BigVAR.
This code does not encounter the same error when subsetting Y:

mod=BigVAR(xyz.to_numpy(), p=lag_max, struct="Basic", gran=[150,10], T1=T1, T2=T2, VARX={}) # {'k':var_count, 's':1})

Here, the DataFrame is converted with to_numpy(), so the reduction to an array happens before BigVAR sees the structure.

Over-penalization after March 2022 update

Hi Dr. Nicholson, thanks for the new upgrade.

Regarding my code below:

Model1<-constructModel(as.matrix(z),1,"Basic",gran=c(20,10),cv="Rolling")
ENET<-cv.BigVAR(Model1)

Before the March 2022 update, the same code worked well for both the Elastic Net and Lasso methods, and both yielded beta matrices of appropriate sparsity. After the update, the same code yields different results: while the ENET estimate still works fine, the Lasso estimate tends to over-penalize, setting 99.999% of the beta coefficients to 0.

This over-penalization problem also seems to affect the two methods newly added in this upgrade, MCP and SCAD. Both of them likewise produce extremely sparse beta matrices.

I suspect some code in cv.BigVAR may have changed in the recent upgrade, particularly for methods such as Lasso. I emailed you earlier with my data attached, in case you want to check it yourself.

Thanks a lot for your work!

cv.BigVAR assertion code incorrectly written for single lambda parameter

My code reads:

Model1 <- constructModel(mdata, p=2, "OwnOther",ownlambdas=TRUE, gran=c(1), intercept=FALSE)
Model1Results = cv.BigVAR(Model1)

An error results:

Error in if (object@Granularity[2] == 1) { : missing value where TRUE/FALSE needed

The cause of this error has been determined as follows:

BigVARObjectClass.R, around line 592, attempts to check whether gran has length 1, but does so incorrectly:

        if(object@Granularity[2]==1){
            stop("only one penalty parameter; run BigVAR.est instead of cv.BigVAR")
        }

When I invoke with a single parameter for gran, this code is intended to tell me that I have only one penalty parameter, but instead the condition check itself fails. Based on other code in the repo, this line should be corrected to:

if (length(object@Granularity)==1) { ... }

This checks the length rather than dereferencing the second element of a vector that may have length 1.
When making this fix, consider whether ==1 or <=1 is more appropriate.

This is a minor issue now that it is understood and documented here. The apparent workaround is to call BigVAR.est rather than cv.BigVAR when one lambda is supplied. However, as I write below, there are reasons I was trying to use cv.BigVAR, and these are workarounds already.


I could fix this and push, but I have run into some other hiccups on this code path, so I will use this ticket to also explain what I am doing and the issues I encounter.

a. I want to show that BigVAR and VAR are equivalent at lambda=0 before starting to vary lambda. This is not directly possible, since lambda=0 is explicitly disallowed (not sure why), and I can approximate with lambda = 0.00000001. FYI, I think it would be nice to support lambda=0. Is there a reason why it cannot be directly supported?

b. But when I use BigVAR.est (to test lambda=0.0000001), I encounter problems calculating the FEVD with frequencyConnectedness::genFEVD. The latter is designed to work with BigVAR, but it explicitly checks the class returned from the fit function (cv or est). cv.BigVAR returns a class which checks out with frequencyConnectedness, but BigVAR.est returns a (fairly limited) list type, and frequencyConnectedness refuses to report the FEVD since the class-type assertion fails.

My sense is that this may be a BigVAR issue, because BigVAR.est does not build out a full structure similar to cv.BigVAR, so I cannot then use frequencyConnectedness. But I defer to the author on whether this is a BigVAR bug, a frequencyConnectedness bug, or a case where the user should build their own structure to simulate what cv.BigVAR would normally produce.
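
For point (a), a minimal sketch of the approximation, assuming the ownlambdas interface described above (the tiny lambda value is arbitrary, and the $B element of the returned list is an assumption based on usage elsewhere in these issues):

library(BigVAR)
data(Y)

# Approximate the unpenalized VAR by supplying a single, tiny penalty via
# ownlambdas, since lambda = 0 itself is rejected by constructModel.
spec <- constructModel(Y, p=2, "Basic", ownlambdas=TRUE, gran=c(1e-8), intercept=FALSE)
fit <- BigVAR.est(spec)   # list output; fit$B holds the coefficient array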

generateIRF Sigma

Hi, first I'd like to say thanks for this package. I found in issue #5 the function you designed, generateIRF. Nevertheless, I have doubts about how to extract the required arguments of the function from a cv.BigVAR() object.

Any clarification on the meaning of the arguments, and how to extract them once the model is fitted, would be a great help!

generateIRF <- function(Phi,Sigma,n,k,p,Y0){}

Thank you so much for developing such an amazing package!
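
For reference, a hedged sketch of how one might recover these arguments from a fitted object. The betaPred slot appears elsewhere in these issues, but its column ordering (intercept first, then k columns per lag) is an assumption to verify, and the Sigma computation is a manual reconstruction rather than a package API:

res <- cv.BigVAR(mod1)               # mod1 from constructModel(...)
B   <- res@betaPred                  # assumed k x (1 + k*p)
nu  <- B[, 1]                        # intercept
Phi <- B[, -1, drop=FALSE]           # k x (k*p) lag matrix for generateIRF
k <- nrow(Phi); p <- ncol(Phi)/k

# One-step residuals, to estimate Sigma (the innovation covariance)
Z <- embed(Y, p + 1)                 # each row: [Y_t, Y_{t-1}, ..., Y_{t-p}]
U <- Z[, 1:k] - (rep(1, nrow(Z)) %o% nu + Z[, -(1:k)] %*% t(Phi))
Sigma <- crossprod(U)/nrow(U)

Y0 <- tail(Y, p)                     # initial values for the IRF recursion
# irf <- generateIRF(Phi, Sigma, n=10, k=k, p=p, Y0=Y0)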

BigVAR.fit not respecting VARX$contemp = FALSE

When I use BigVAR.fit to fit a model, I may have a VARX list of the form list(k=2, s=1, s1=0, contemp=FALSE). BigVAR.fit will then run the following code:

if (!is.null(VARX$contemp)) {
  contemp <- TRUE
  s1 <- 1
} else {
  contemp <- FALSE
  s1 <- 0
}

This then erroneously sets contemp to TRUE and the accompanying parameter s1 to 1. I would expect it to respect my choice of no contemporaneous interaction of the exogenous variables when I set contemp to FALSE.
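
A minimal sketch of the fix I would expect, checking the value rather than mere presence (isTRUE() also guards against a NULL entry):

if (isTRUE(VARX$contemp)) {
  contemp <- TRUE   # only when contemp is explicitly TRUE
  s1 <- 1
} else {
  contemp <- FALSE  # covers both a missing entry and contemp=FALSE
  s1 <- 0
}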

A related question: why does EFX not support contemp=TRUE?

The variance of predictions and a refit function

Is it possible to get the variance of predictions from the VAR-lasso?
Can I get the refitting function used in the simulation study in the paper "BigVAR: Tools for Modeling Sparse High-Dimensional Multivariate Time Series"?

python README: variable k not declared

The k=3 declaration occurs after the first use of k.

I propose shifting it up in the README.

I also wonder whether the sample code should be included in the repo.
The file does not follow PEP 8, so I could tweak the code to follow Python standards when making a pull request.

predict() fails but the source code in BigVARObjectClass.R works

Hi @wbnicholson,
Thank you for developing this valuable package! I would really appreciate some help with the BigVAR::predict function.

Package version: the GitHub version (1.0.7) rather than the CRAN version (1.0.6)

Data: I have a matrix of 108 quarterly time series (call it train_data). Each series has 24 points. I also created a subset of 15 time series for testing purposes (call it sub_train_data). In addition, I added 3 seasonal dummies.

Model Setup:
varx_list=list()
varx_list$k = ncol(train_data) - 3 # number of equations (variables) = 111 - 3 = 108
varx_list$s = 4
lasso.var = constructModel(Y=train_data, p=4, struct='Basic', gran=c(50, 10), VARX=varx_list, RVAR=F, h=1, cv='LOO', MN=F, verbose=F, IC=T)
results=cv.BigVAR(lasso.var)
BigVAR::predict(results, n.ahead=1)

Issue: the predict function worked fine on sub_train_data, but threw an error for train_data: "Error in if (contemp) { : argument is of length zero". (As expected, the model with the larger dataset contains more zero coefficients, but I am not sure how that affects the predict function.)

Investigation: I focused on n.ahead=1 because I could not trace the VARXCons function to test n.ahead>1. I found that if I create a UDF directly from the source code (line 2094 of BigVARObjectClass.R), this UDF can make predictions.

Questions:

  1. The source code works, so why does it fail when I use BigVAR::predict()?
  2. BigVAR::predict() works on the small dataset, so why does it fail on the larger one?

colnames betaPred

Since the order of the columns in betaPred is not the same as in the coefficient matrices of other classes, it would simplify the work if they had column names.

Example of a named coefficient matrix from the varest class (screenshot omitted).
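
A hedged sketch of assigning names in the varest style. The column layout assumed here (intercept first, then k columns per lag, in lag order) is exactly the kind of assumption this issue flags, so verify it against your fitted object first:

B <- results@betaPred
series <- colnames(Y)
colnames(B) <- c("const", paste0(rep(series, times=p), ".l", rep(1:p, each=length(series))))
rownames(B) <- series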

The script runs without stopping.

For some data (I think because the magnitudes of the values are huge), the cv.BigVAR function runs without stopping. How can I resolve this problem?
Thanks for the help!

window size and dual not working?

Hi Dr. Nicholson,

I'd really appreciate your help with this issue. I'm trying to set dual to TRUE in order to conduct dual cross-validation, but it's not working. I'm trying to use it on the example data (Y), but the constructModel function does not accept this argument. Where, or in which function, should I input the dual argument?

mod1 <- constructModel(Y, p = 3, struct="BasicEN", gran = c(50,10), h = 10, cv="Rolling", verbose =FALSE, dual = TRUE)
Error in constructModel(Y, p = 3, struct = "BasicEN", gran = c(50, 10), : unused argument (dual = TRUE)

Also, whenever I try to set window.size to something different from zero:

mod1 <- constructModel(Y, p = 3, struct="BasicEN", gran = c(50,10), h = 10, cv="Rolling", verbose =FALSE, window.size = 10)
results = cv.BigVAR(mod1) # The main function of the BigVAR package; performs cross-validation

I get the following error when trying to run cv.BigVAR:
Error in apply(Z, 1, mean) : dim(X) must have a positive length

Dim reduction on big datasets

Great package.

Is the package suitable for very big datasets? I am talking about datasets of dimension 1,000,000 x 300.

I have just tried this code:

mod1<-constructModel(data_sample,p=4,"Basic",gran=c(150,10),RVAR=FALSE,h=1,cv="Rolling",MN=FALSE,verbose=FALSE,IC=TRUE)
results=cv.BigVAR(mod1)

and it is pretty slow with just a 1000 x 100 X matrix (circa 10 minutes).

My goal is to do dimension reduction, but I am not sure if the package is appropriate for this.

Restrictions on coefficients

Is it possible to include restrictions on coefficients (such as setting certain coefficients to zero) in VARXFit?
Also, is the approach of Foschi and Kontoghiorghes (2013), "Estimation of VAR Models: Computational Aspects" (page 12, section 4, "VAR Models with Zero Coefficient Constraints", where the VAR model is written as a SURE model), implemented in BigVAR?

When Y has class "mts"

The line

if(class(Y)!="matrix"){Y=matrix(Y,ncol=1)}

assumes that class(Y) has length 1. However, commonly we have

class(Y)
[1] "mts" "ts" "matrix"

and applying "as.matrix", as suggested on page 10 of https://arxiv.org/pdf/1702.07094.pdf, doesn't change the class to plain "matrix", since such an object already inherits from "matrix":

class(as.matrix(Y))
[1] "mts" "ts" "matrix"

So the above "if" statement produces a warning when called, and worse, it converts the multivariate ts into a (nrow(Y) * ncol(Y))-by-1 matrix.
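
A minimal sketch of a more robust check (one possible fix, not necessarily the one the package adopted). is.matrix() tests the dim attribute and is TRUE for "mts" objects, so the damaging conversion is skipped:

# Robust when class(Y) has length > 1, e.g. c("mts", "ts", "matrix")
if (!is.matrix(Y)) { Y <- matrix(Y, ncol=1) }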

Default T1 causes error

I get an error on row 896 of BigVARObjectClass.R:

Error in ZFull$Z[, 1:(v - h)] : only 0's may be mixed with negative subscripts

with v = 7 and h = 12.

v comes from the loop

for (v in (T1-h+1):T2) {   # 18 - 12 + 1 = 7

T1 is decreased by max(p, s) earlier in the code, on row 579 of BigVARObjectClass.R:

T1 <- T1 - max(p,s)   # 30 - 12 = 18

I use the default T1 when constructing the model:

T1 = floor(nrow(Y)/3) = 30

  • Is the default T1 too small when constructing a model, since nrow(Y)/3 < 2h + p + 1? (One possible guard is sketched below.)
  • Or is there a wrong character (a typo) on row 579?
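
A hedged sketch of one possible guard, based on the inequality above (illustrative only; the exact bound depends on how the rolling loop indexes):

# Ensure the validation start leaves room for max(p, s) lags plus two horizons
T1 <- max(floor(nrow(Y)/3), max(p, s) + 2*h + 1)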

Out-of-sample evaluation

I am sorry to ask this probably naive question, but I need to be sure, and I do not code in R often.

  • When using cv.BigVAR(), are the right-hand-side endogenous variables automatically lagged?
  • Are results@preds in the example below out-of-sample predictions, i.e., is the first prediction obtained from an estimation up to T_2, for T_2 + 1, with the next predictions rolled over by one period (in my case keeping the optimal lambda fixed)?
  • Do I need to lag the exogenous variables myself, so that the timing corresponds to equation (1) in the vignette?

Here is my code, which should give a separate OOS R2 for each of the endogenous variables in the system (benchmarked with respect to the conditional mean forecast):

VARX=list(k=5,s=1)
mod1<-constructModel(y_full,
                    VARX=VARX,
                    p=1,
                    "BasicEN",
                    gran=c(100,10),
                    h=1,
                    cv="Rolling",
                    verbose=FALSE,
                    IC=TRUE,
                    model.controls=list(intercept=TRUE, alpha=0.5))

results=cv.BigVAR(mod1)

model.pred <- results@preds
mean.pred  <- results@MeanPreds
# test dep var
y.test     <- y_full[floor(2/3*nrow(y_full)+1):nrow(y_full),]
# MSFE
msfe.model <- colMeans((y.test - model.pred)**2)
msfe.mean  <- colMeans((y.test - mean.pred)**2)
# OOS R2
1- msfe.model/msfe.mean

Segmentation faults in ICX and ZmatF when using short data (Fix in PR)

When running BigVAR on short data, the C++ functions ICX and ZmatF can experience segmentation faults which crash the whole R session.

The code below demonstrates and explains the behaviour

# Create some data
x <- matrix(rnorm(9*13), 9, 13)
y <- matrix(rnorm(9), 9, 1)

# Cause segmentation faults
BigVAR:::ICX(y, x, 1.0, 12, 1, 13.0, "AIC", 1) # segfault in ICX caused by nrow(Y) == p (i in the loop)
# Segfault occurs in
#
# MatrixXd Z(k*p,M);
# Z.setZero();
# Z.col(0)=Y1.reverse();
#
# M will be 0 and Z thus has no columns.

BigVAR:::ICX(y, x, 1.0, 12, 1, 13.0, "AIC", 3) # segfault in ZmatF caused by nrow(Y) == max(i, j) + h - 1
# Segfault occurs in
#
# MatrixXd Y2=Y.bottomRows(T-c);
#
# T-c will be 0, and thus no rows will be selected, which does not work.

A fix is available in PR: #32

Predict only returns last n.ahead prediction

Hi @wbnicholson,
I've noticed that the predict function only returns the last point forecast for all endogenous variables in a multistep forecast.

I have tried to locate the problem and I think it's in the predictMS / predictMSX functions, where the pred variable isn't stored before the next n.ahead iteration runs. One solution could be to return Y, which stores all predictions, and filter out the rows with historic data:

predictMS:  return(rbind(Y, pred)[-(1:p), ])
predictMSX: return(rbind(Y, pred)[-(1:max(p,s)), ])

And then skip the matrix restriction (ncol=1) in the predict class method:
fcst <- predictMS(matrix(fcst,nrow=1),Y[(nrow(Y)-p+1):nrow(Y),],n.ahead-1,betaPred,p,MN)

fcst <- predictMSX(matrix(fcst,nrow=1),as.matrix(Y[(nrow(Y)-C+1):nrow(Y),1:(k)]),n.ahead-1,betaPred,p,newxreg,matrix(Y[(nrow(Y)-C+1):nrow(Y),(ncol(Y)-m+1):ncol(Y)],ncol=m),m,s,1,MN,contemp)

I have verified the prediction result by translating the BigVAR class to a varest class and running its prediction function.

Example output without this fix (predict.BigVAR, n.ahead=6, 7 endogenous variables):

[,1] [,2] [,3] [,4] [,5] [,6] [,7]
-0.07483348 0.2078078 0.5577497 0.4923881 0.7485516 0.2824906 0.1842608

Example output with the fix (the last row matches the current output):

Y1 Y2 Y3 Y4 Y5 Y6 Y7
0.19694157 0.1400551 0.2231896 0.5610747 0.4529248 0.8701000 0.3086403
0.03157659 0.2037264 0.3376861 0.4832005 0.4502285 0.6097922 0.2669111
0.03696794 0.1572429 0.4032445 0.5270550 0.5196305 0.5684885 0.2580613
-0.42382382 0.1871817 0.4830292 0.5387928 0.7028809 0.4456416 0.2973962
0.12573047 0.2354062 0.5292030 0.6077442 0.6309911 0.3632288 0.2060916
-0.07483348 0.2078078 0.5577497 0.4923881 0.7485516 0.2824906 0.1842608

PS: I have not looked into how the changes in predictMS affect the cross-validation in cv.BigVAR.

Predict - Confidence intervals

It would help me a lot if the predict function also returned the confidence intervals. Is this something that is planned?

python README sample: not repeatable

The rolling_validate(mod) line fails for various sets of data.
I found this with my own data, but ultimately also with the README sample code.

The sample code does not presently seed the random generator, so MultVARSim generates different arrays on each run and the output varies. Setting the seed explicitly shows that rolling_validate(mod) fails for many seed values.

I wrapped the code in a loop where I set the random seed each iteration (I had started lower but then adjusted the seed range to have one success at seed=5 before a failure at seed=6):

import numpy as np
from BigVAR.BigVARSupportFunctions import MultVARSim, CreateCoefMat
from BigVAR.BigVARClass import BigVAR,rolling_validate

k=3;p=4

# example coefficient matrix
B1=np.array([[.4,-.02,.01],[-.02,.3,.02],[.01,.04,0.3]])
B2=np.array([[.2,0,0],[0,.3,0],[0,0,0.13]])
B=np.concatenate((B1,B2),axis=1)
B=np.concatenate((B,np.zeros((k,2*k))),axis=1)
#print(B)
A=CreateCoefMat(B,p,k)

for sd in np.arange(5,20):
    print(f'Random seed {sd}')
    np.random.seed(sd)
    Y=MultVARSim(A,p,k,0.01*np.identity(3),T=500)
    VARX={}

    # construct BigVAR object:
    # Arguments:
    # Y T x k multivariate time series
    # p: lag order
    # penalty structure (only Basic and BasicEN supported)
    # granularity (depth of grid and number of gridpoints)
    # T1: Start of rolling validation
    # T2: End of rolling validation
    # alpha: elastic net alpha candidate
    # VARX: VARX specifications as dict with keys k (number of endogenous series), s (lag order of exogenous series)

    mod=BigVAR(Y,p,"Basic",[50,10],50,80,alpha=0.4,VARX=VARX)

    res=rolling_validate(mod)

    # coefficient matrix
    res.B

    # out of sample MSFE
    res.oos_msfe

    #optimal lambda
    res.opt_lambda
    print('-------------')

The failure when seed=6 is:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-9-f464d9f4ab81> in <module>
     34     mod=BigVAR(Y,p,"Basic",[50,10],50,80,alpha=0.4,VARX=VARX)
     35 
---> 36     res=rolling_validate(mod)
     37 
     38     # coefficient matrix

e:\repo\bigvar\python\BigVAR\BigVARClass.py in rolling_validate(self)
    208     oos_aic = eval_ar(Y, T2, Z1.shape[1], 'aic', p, loss)
    209 
--> 210     oos_bic = eval_ar(Y, T2, Z1.shape[1], 'bic', p, loss)
    211 
    212     oos_mean = eval_mean(Y, T2, Z1.shape[1], loss, p)

e:\repo\bigvar\python\BigVAR\BigVARSupportFunctions.py in eval_ar(Y, T1, T2, ic, p, loss)
    142         mod = var_mod.fit(maxlags=p, ic=ic)
    143         lag_order = mod.k_ar
--> 144         yhat = mod.forecast(trainY[-lag_order:], 1)
    145         MSFE_temp = calc_loss(Y[u+p, :]-yhat, loss)
    146         MSFE.append(MSFE_temp)

d:\Anaconda3\envs\py39\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py in forecast(self, y, steps, exog_future)
   1083         else:
   1084             exog_future = np.column_stack(exogs)
-> 1085         return forecast(y, self.coefs, trend_coefs, steps, exog_future)
   1086 
   1087     # TODO: use `mse` module-level function?

d:\Anaconda3\envs\py39\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py in forecast(y, coefs, trend_coefs, steps, exog)
    227     """
    228     p = len(coefs)
--> 229     k = len(coefs[0])
    230     # initial value
    231     forcs = np.zeros((steps, k))

IndexError: index 0 is out of bounds for axis 0 with size 0

If I restart the kernel and start with seed=6 it fails with this error on the first iteration, so it does not seem to be due to state.

Large k, small T scenario

Hey Will,

Thanks for the great R package. I came across a case where my data is panel data with many cross-sections and a relatively short time period (k in the thousands and T in the 30s). A typical VAR won't work due to the k > T issue. Since LASSO can handle high dimensions, I hope VAR-LASSO is a possible solution.

So I gave BigVAR a shot. It seems to work fine (I used the basic LASSO) and provides estimates, but it comes with a warning about k being bigger than T (from the function constructModel). I'm wondering whether this scenario has been justified before; I did not find evidence in your paper, so I'm hoping to gain some insight from you.

Thanks.

Rolling n-period ahead predictions

Hi, and thanks for this great package. It really helps!

I want to get the n-period-ahead forecast in a rolling sense. This means that I need to find the best lambda, make n-period-ahead predictions, increase my validation sample (recursive or rolling, it does not matter) by one time period, and repeat the step above.

What would be the most efficient way to get the n-ahead predictions in this scenario? Right now I am thinking of using cv.BigVAR to get the optimal lambda, re-fitting with BigVAR.fit(), and then predicting with predict(n.ahead=n); a sketch of the manual recursion follows below.

Put differently, I am wondering if my result can be achieved in a one-shot estimation?
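
For what it's worth, a minimal, hedged sketch of iterating one-step forecasts to horizon n from a BigVAR.fit coefficient array; the column layout (intercept first, then k columns per lag) and the lam/n values are assumptions for illustration:

B   <- BigVAR.fit(Y, p=p, "Basic", lambda=lam)[, , 1]
nu  <- B[, 1]
Phi <- B[, -1, drop=FALSE]           # [Phi_1, ..., Phi_p], each k x k
Yext <- Y
for (i in 1:n) {
  # stack the most recent p observations, newest first, to match [Phi_1..Phi_p]
  lags <- as.vector(t(Yext[nrow(Yext):(nrow(Yext) - p + 1), , drop=FALSE]))
  yhat <- nu + Phi %*% lags
  Yext <- rbind(Yext, t(yhat))
}
tail(Yext, n)                        # the n-step-ahead path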

Does the coef() method return the best estimated beta?

First, thank you for the package! :)

In the Vignette, it is said that:
A formatted dataframe of the coefficient matrix of the final iteration of forecast evaluation can be obtained via the coef method.

So it is NOT the best estimated beta under best_lambda?
To get the best estimated beta under best_lambda, do I have to run another BigVAR.fit(), as in the code below?

library(BigVAR)
data(Y)
mod1<-constructModel(Y,p=4,"Basic",gran=c(150,10),h=1,cv="Rolling",verbose=FALSE,IC=TRUE,model.controls=list(intercept=TRUE))
results=cv.BigVAR(mod1)
best_lambda = results@OptimalLambda
coef1 = coef(results) # NOT the best

model2=BigVAR.fit(Y,p=4,"Basic",lambda=best_lambda,intercept=TRUE)
coef2 = model2[,,1] # best estimated beta under the best_lambda

generateIRF

Hi, first I'd like to say thanks for this package. I was reading the BigVAR paper on arXiv and noticed the generateIRF function mentioned on page 18. I don't see this function in this repo, though. Are there plans to incorporate it, or is it imported from a different package?

Thanks!

BigVAR.fit failing with "Forecast Horizon too long; increase T1 or decrease h".

Hi!

When using the BigVAR.fit function, explicitly without any cross-validation, I get the error "Forecast Horizon too long; increase T1 or decrease h" when I have short data. This appears to be wrong, as the values of T1 and T2 should not matter when I do not want to run any cross-validation.

Upon inspecting the code, I see that constructModel is called to create a variable named temp.bv that is never used. Is the intention here to perform some general checks on the model before allowing it to be fit? Or is it just something that can be removed?

See the code here:

temp.bv <- constructModel(Y, p, struct = struct, gran = c(lambda), ownlambdas = TRUE,
                          VARX = VARX, cv = "LOO",
                          model.controls = list(MN=MN, RVAR=RVAR, C=C, intercept=intercept, gamma=gamma))

I would be happy to provide a PR removing that line, or fixing the issue in another way; I mainly need some advice on how to proceed so that the intentions of the code are preserved.

Lag-weighted Lasso in BigVAR.fit

Hi, and thanks for this wonderful package; it certainly helps with my research.

My topic involves dimension reduction in VAR models, so I found this package recently and have been exploring it. While experimenting with the different structures built into BigVAR.fit(), I found that it returns a single coefficient matrix for structures like 'Basic' or 'SCAD', but when I set the structure to 'Tapered' it returns ten matrices, and I am not sure which one is the coefficient matrix. Attached is a section copied from my console.

> b <- BigVAR.fit(epu,p=1,'Tapered',1e-2)
> dim(b)
[1] 24 25 10
> b <- BigVAR.fit(epu,p=1,'BasicEN',1e-2)
> dim(b)
[1] 24 25  1
> b <- BigVAR.fit(epu,p=1,'SCAD',1e-2)
> dim(b)
[1] 24 25  1

I read the tutorial but didn't find the answer. Does anybody know why 'Tapered' gives such a result? Thanks in advance.

Predict function VARX

I have a penalized VARX model, obtained by applying the cv.BigVAR() function, and I want to perform a 10-step forecast, but the predict function returns the following error: "Error in 2:nrow(Z) : argument of length 0".

I have 30 endogenous variables and 15 exogenous variables, so I am specifying:

predict(Model_results, n.ahead = 10, newxreg = exogenous_variables[1:10, 1:15])

There is no problem when I run the forecast with n.ahead = 1.

Any suggestion of what could be the problem?

Thank you so much!

Lambda grid parameters question

Hi Dr. Nicholson,
I'm trying to understand the vector of parameters for the gran argument in the constructModel function.
Your BigVAR guide states, "The first option controls the depth of the lambda grid (a good default option is 50). The second option controls the number of grid values (a good default is 10)."

  1. What is meant by "depth of the lambda grid"?
  2. It is also unclear how to choose the second option in gran.

I would really appreciate your help, or guidance toward resources that can help me resolve this question.
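
For intuition, a hedged illustration of how such a grid is commonly constructed (not necessarily BigVAR's exact internals): gran[1] acts as the ratio between the largest and smallest lambda (the "depth"), and gran[2] is the number of log-spaced grid points:

gran <- c(50, 10)
lambda_max <- 1                      # in practice derived from the data
grid <- exp(seq(log(lambda_max), log(lambda_max/gran[1]), length.out=gran[2]))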

In-sample estimation without cross-validation

Is there any function that could help me fit a model without cross-validation, choosing the lambda value that minimizes in-sample MSFE on the whole dataset?

I tried to fit a BigVAR model on the whole dataset without cross-validation by running BigVAR.est. It returns 10 lambdas and coefficient matrices. I see in the manual that "This method allows the user to construct their own penalty parameter selection procedure." So I would then need to predict and calculate accuracy myself to choose the best lambda.
Is there any easier method? I've tried to modify T1 and T2 for cv.BigVAR without success.
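
In case it helps, a minimal sketch of one way to pick lambda by in-sample MSE from BigVAR.est output. The $B array appears elsewhere in these issues, but its column layout (intercept first, then k columns per lag) is an assumption to verify:

library(BigVAR)
data(Y)
p <- 4
spec <- constructModel(Y, p=p, "Basic", gran=c(50, 10))
est  <- BigVAR.est(spec)             # est$B: k x (k*p + 1) x n_lambda

Z <- embed(Y, p + 1)                 # each row: [Y_t, Y_{t-1}, ..., Y_{t-p}]
k <- ncol(Y)
target <- Z[, 1:k]
lags   <- Z[, -(1:k)]

mse <- apply(est$B, 3, function(B) {
  fitted <- rep(1, nrow(lags)) %o% B[, 1] + lags %*% t(B[, -1])
  mean((target - fitted)^2)
})
B_best <- est$B[, , which.min(mse)]  # coefficients minimizing in-sample MSE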

Why fit a model without cross-validation?
There is a COVID effect on the last 12 observations in my data, so I added an exogenous blip dummy to filter this effect out. However, since the dummy sits at the end of the time series, its coefficient after running cv.BigVAR is 0, because the dummy falls entirely within the forecast evaluation section.

last model fit by cv.BigVAR?

First, thank you for the package! :)

Why do the functions cv.BigVAR and BigVAR.est give different estimates of beta for the same lambda? Or am I missing something about the training sample?

library(BigVAR)

data(Y)
y_st <- apply(Y, 2, FUN = scale)

model_spec <- constructModel(y_st, p = 6, 
                             struct = "OwnOther", 
                             ownlambdas = TRUE,
                             gran = 10, 
                             verbose = TRUE, 
                             VARX = list())

model_cv <- cv.BigVAR(model_spec)
model_est <- BigVAR.est(model_spec)

b11_est <- model_est$B[1, 1, 1]
b11_cv <- model_cv@betaPred[1, 1]

b11_est
b11_cv

Predict with VARX returns error in VARXCons

Hi,

First of all, thank you for this great package.

Second, I have a penalized VARX model and I want to perform a forecast 10 periods ahead. The model consists of 17 endogenous variables and 34 exogenous variables. I'm specifying:

predict(BV,10,newxreg=test[ , (num.endogenous):ncol(test)])

And I'm getting the following error:

Error in VARXCons(as.matrix(Y), X, ncol(Y), p, m, s, oos = TRUE) : 
  Not compatible with requested type: [type=list; target=double]

It seems there's a variable-type error inside the C++ function VARXCons. Any suggestion as to what the problem could be?
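
A hedged guess at a workaround: the Rcpp message ([type=list; target=double]) suggests newxreg arrived as a data.frame, which is a list at the C++ level, so coercing it to a numeric matrix first may help:

newx <- as.matrix(test[, num.endogenous:ncol(test)])
storage.mode(newx) <- "double"       # ensure a numeric (double) matrix
predict(BV, 10, newxreg=newx)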

Thanks!

MCP gets stuck in an infinite loop in some cases - Fix in PR

When running MCP models, I sometimes end up with BigVAR hanging and using 100% CPU. After some investigation and debugging, I found that the function mcp_loop gets stuck because it cannot exit the inner loop. I have created an example which reproduces this:

library(BigVAR)

data <- c(-1.48044, 1.38428, -1.11709, -1.05228, 1.23826, -1.16733, -1.17302, -1.02108, -0.573326, -1.42612, 1.22905, -1.41378, 0.631645, 2.18095, -1.16015, -0.0359897, -1.88949, 1.02950, -1.36236, 1.07887, -1.18923, 0.0302420, 0.465342, -0.823213, -0.0494255, -0.963969, 0.800805, -1.31066, 1.00684, -1.25239, -1.17256, 0.444063, -0.888952, -0.119965, 0.393385, -0.769630, -1.24420, 1.00684, -1.36941, -0.110486, 0.576233, -0.782189, 1.05065, 0.384588, -0.0168027, -1.16844, 1.00684, -1.22769, 0.130075, 0.478583, -0.337238, 1.67375, 0.100864, -0.0168027, -1.11272, 1.00684, -1.22489, 0.110229, 0.386608, -0.606174, 0.981795, -0.741247, 0.0155875, -1.04067, 1.00684, -0.971007, 0.490917, 0.680503, -0.689486, -0.103170, 0.239964, 0.310044, -0.969660, 1.00684, -1.15860, 0.691785, 0.576233, -0.480017, 0.521608, -0.618353, 0.768415, -0.915338, 1.00684, -1.07082, -0.270459, 0.312838, -0.481030, 0.0227934, -0.473929, -0.148327, -0.853851, 1.00684, -0.862410, 0.591351, 0.331281, -0.307814, 0.194104, 0.347571, 0.244282, -0.789942, 1.00684, -1.18606, 0.992487, 0.176176, -0.0630332, 0.491377, -1.85833, 0.800805, -0.735438, 1.00684, -0.535723, -0.0900386, 0.00877643, -0.110999, -0.328225, -0.397809, -0.0825647, -0.686150, 0.921618, -0.667125, -0.511021, -0.278262, -0.191999, -0.479381, -1.41559, -0.540936, -0.624656, 0.646707, -0.598799, -0.491174, -0.291266, -0.443399, -0.141799, -0.376824, -0.311259, -0.572744, 0.535418, 0.0477189, -0.711889, -0.304507, -0.424499, -1.25867, -0.674388, -0.639088, -0.517906, 0.344424, -0.748785, -0.611455, -0.392935, -0.441267, -0.976516, -0.517850, -0.769630, -0.453095, 0.260521, -0.131507, -0.772030, -0.301906, -0.893573, -1.27379, 0.143908, -0.933545, -0.367973, 0.170103, -0.537704, -0.230767, -0.338554, -0.794473, -0.234172, -1.45275, -0.409412, -0.279967, 0.170103, -0.543600, -0.150179, -0.257219, -0.770764, -0.415559, -0.179374, -0.246479, -0.175063, 0.170103, 0.0309459, 1.17291, -0.345647, -0.710805, -0.308071, 0.989165, 0.931348)
datamat <- matrix(data, 21, 9, byrow=T)

model <- BigVAR::constructModel(Y=datamat, p=4, h=4, struct="MCP", gran=c(50, 10), VARX=list(), T1=13, T2=17)
fit <- BigVAR::cv.BigVAR(model)

I do not really know why this happens only some of the time; it seems to be sensitive to the data used, which is why I have included the data in the example above.

I created a solution to this problem by adding a counter to the inner loop which allows it to run for a maximum of 100 iterations. It is available as a PR here: #30
