Version 1.7 of the BayesReg package has now been released. There are several improvements in this version:
- Sampling speed has been improved for large design matrices
- Sampling speed has been improved when using block sampling with Gaussian data
- The sampling efficiency for the horseshoe+ has been significantly improved by using a modified sampler
As before, precompiled MEX files for Windows, Linux, and macOS, which increase sampling speed, can be obtained here. To use them, simply download the file and unzip its contents into the “bayesreg” folder.
Version 1.6 of the BayesReg package has now been released. There are two main changes to this version:
- The code now calculates and reports the Widely Applicable Information Criterion (WAIC) score in place of the DIC score.
- Options for sampling the coefficients in blocks have been added. This allows the code to be applied to very large predictor matrices (p > 50,000) even on a PC.
Note: precompiled MEX files for Windows, Linux, and macOS, which increase sampling speed, can be obtained here. To use them, simply download the file and unzip its contents into the “bayesreg” folder.
Version 1.5 of the BayesReg package has now been released. There are two main changes to this version:
- We have written a new version of the logistic regression sampling code in C++ that makes use of multiple cores. This can result in significant speed-ups when sampling.
- An efficient MATLAB implementation of logistic regression sampling has been added to the code, so that it now runs even without MEX files (though it will not be as fast).
Precompiled MEX files for Windows, Linux, and macOS can be obtained here. To use them, simply download the file and unzip its contents into the “bayesreg” folder.
Version 1.4 of the BayesReg package has been released. This release adds a major new feature: users can now assign predictors to logical groupings, which may overlap, so a predictor can belong to multiple groups. Groupings can be used to exploit a priori knowledge about how predictors relate to each other (for example, grouping genetic data into genes, and collections of genes such as pathways). The changes are:
- Added option ‘groups’ which allows grouping of variables into potentially overlapping groups
- Grouping works with HS, HS+ and lasso priors
- Fixed a regression bug with g priors and logistic models
- Updated examples to demonstrate grouping
You can obtain the latest version of the BayesReg software from here.
Version 1.3 of the BayesReg package has been released. This has some substantial improvements in user-friendliness. Specifically, the changes to the new version are:
- Tidied up the summary display
- Added support for MATLAB tables
- Added support for categorical predictors
- Added a prediction function that also provides prediction statistics
- Updated and improved the example scripts
- Fixed a bug in computation of R2
You can obtain the latest version of the BayesReg software from here.
I have recently uploaded some new MATLAB code that implements lasso-based estimation of linear models in which the residuals follow a Student-t distribution, fitted using the expectation-maximisation (EM) algorithm. By varying the degrees-of-freedom parameter of the Student-t likelihood, the model can be made more resistant to outlying observations.
The software has the following features:
- Automatic generation of complete lasso regularisation paths for a given degrees-of-freedom value.
- Selection of the lasso regularisation parameter and degrees-of-freedom using either cross-validation or information criteria.
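To give a flavour of how the EM algorithm handles heavy-tailed residuals, here is a minimal Python sketch (illustrative only, not the toolbox's MATLAB code; the function names and the simple coordinate-descent M-step are my own). The Student-t density is a scale mixture of normals, so the E-step assigns each observation a weight w_i = (ν + 1)/(ν + r_i²/σ²) that automatically down-weights outliers, and the M-step solves a weighted lasso:

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by coordinate-descent lasso."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def t_lasso_em(X, y, lam, dof, n_iter=50):
    """Sketch of EM for lasso regression with Student-t residuals.

    E-step: each observation receives weight w_i = (dof+1)/(dof + r_i^2/sigma2),
    the conditional expectation of its mixing precision; outliers get small w_i.
    M-step: coordinate-descent on the weighted lasso objective
    (1/2) sum_i w_i r_i^2 + lam * sum_j |beta_j|, followed by a scale update.
    """
    n, p = X.shape
    beta = np.zeros(p)
    sigma2 = np.var(y)
    for _ in range(n_iter):
        r = y - X @ beta
        w = (dof + 1.0) / (dof + r**2 / sigma2)   # E-step weights
        for j in range(p):                        # M-step: weighted lasso
            r_j = r + X[:, j] * beta[j]           # partial residual excluding j
            num = np.sum(w * X[:, j] * r_j)
            den = np.sum(w * X[:, j]**2)
            beta[j] = soft_threshold(num, lam) / den
            r = r_j - X[:, j] * beta[j]
        sigma2 = np.sum(w * r**2) / n             # weighted scale update
    return beta
```

Because the weights shrink towards zero for large residuals, a handful of gross outliers barely perturbs the fitted coefficients, which is exactly the robustness property the degrees-of-freedom parameter controls.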
The code is straightforward to run, efficient, and comes with several examples that recreate the analyses from the paper below. To cite this toolbox, please use the following reference:
- “Robust Lasso Regression with Student-t Residuals”, D. F. Schmidt and E. Makalic, Lecture Notes in Artificial Intelligence, to appear, 2016
The code can be obtained from MathWorks File Exchange. If you find this code useful, I would be greatly obliged if you could leave a comment or rating on the File Exchange page.
In conjunction with Enes Makalic, I have recently finished writing MATLAB and R code implementing efficient, high-dimensional Bayesian regression with continuous shrinkage priors. The package is very flexible, fast, and highly numerically stable, particularly in the case of the horseshoe/horseshoe+, whose heavy-tailed prior distributions cause problems for most other implementations. It supports the following data models:
- Gaussian (“L2 errors”)
- Laplace (“L1 errors”)
- Student-t (very heavy tails)
- Logistic regression (binary data)
It also supports a range of state-of-the-art continuous shrinkage priors to handle different underlying regression model structures:
- Ridge regression (“L2” shrinkage/regularisation)
- LASSO regression (“L1” shrinkage/regularisation)
- Horseshoe regression (global-local shrinkage for sparse models)
- Horseshoe+ regression (global-local shrinkage for ultra-sparse models)
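The numerical difficulty with the horseshoe-type priors comes from sampling the heavy-tailed half-Cauchy shrinkage scales. A convenient and numerically stable way to do this is via inverse-gamma auxiliary variables; here is a minimal Python sketch of one such Gibbs update (an illustration under an assumed common parameterisation β_j | λ_j, τ, σ ~ N(0, λ_j²τ²σ²), not the package's actual sampler code):

```python
import numpy as np

def update_horseshoe_locals(beta, tau2, sigma2, nu, rng):
    """One Gibbs update for the horseshoe local shrinkage scales.

    Assumes beta_j | lambda_j, tau, sigma ~ N(0, lambda_j^2 tau^2 sigma^2)
    with lambda_j ~ C+(0, 1). Writing the half-Cauchy prior as the mixture
    lambda_j^2 | nu_j ~ IG(1/2, 1/nu_j), nu_j ~ IG(1/2, 1), both full
    conditionals become inverse-gamma with shape 1, i.e. reciprocals of
    exponential random variables, so no rejection steps are needed.
    """
    # lambda_j^2 | rest ~ IG(1, 1/nu_j + beta_j^2 / (2 tau^2 sigma^2))
    rate = 1.0 / nu + beta**2 / (2.0 * tau2 * sigma2)
    lam2 = 1.0 / rng.exponential(scale=1.0 / rate)
    # nu_j | lambda_j^2 ~ IG(1, 1 + 1/lambda_j^2)
    nu_new = 1.0 / rng.exponential(scale=1.0 / (1.0 + 1.0 / lam2))
    return lam2, nu_new
```

Large |β_j| inflates the conditional scale of λ_j², so big signals escape shrinkage, while coefficients near zero are shrunk aggressively; this local-global behaviour is what makes the horseshoe well suited to sparse models.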
The MATLAB code for Version 1.2 of the package can be downloaded here, and the R code can be obtained from CRAN under the package name “bayesreg”. The R package can also be installed from within R using the command install.packages("bayesreg"). If you use the package and wish to cite it in your work, please use the reference below.
- “High-Dimensional Bayesian Regularised Regression with the BayesReg Package”, E. Makalic and D. F. Schmidt, arXiv:1611.06649 [stat.CO], 2016
I have just returned from the 29th Australasian Joint Conference on Artificial Intelligence, held in Hobart, Tasmania, Australia, from the 5th to the 9th of December. This conference is usually an interesting, open, and friendly environment in which to discuss topics in applied machine learning, and this year was no different. There was quite a focus on “deep learning”, as would be expected given the current hype surrounding this neural network revival, but a number of other interesting topics were also covered in the technical sessions.
I presented, or was involved with the presentation of, three papers: “Approximating Message Lengths of Hierarchical Bayesian Models Using Posterior Sampling”, “Bayesian Robust Regression with the Horseshoe+ Estimator” and “Bayesian Grouped Horseshoe Regression with Application to Additive Models”, all quite “horseshoe”-centric, given my current interest in global-local shrinkage models.
If you are an Australian, or even international, researcher with interests in applied machine learning and artificial intelligence, I recommend giving this conference a visit sometime. Next year is particularly attractive, as it coincides with IJCAI and is being held in Melbourne.
I have uploaded a MATLAB implementation of the Bayesian LASSO sampling hierarchy for inference of autoregressive models from an observed time series. The idea behind the approach is to place Laplace prior distributions over the partial autocorrelations of an AR(k) model, which leads to a relatively simple Gibbs sampling scheme and guarantees stationarity. Both empirical Bayes and fully Bayesian estimation of the shrinkage hyperparameter are available.
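The stationarity guarantee comes from the parameterisation itself: any vector of partial autocorrelations with entries in (-1, 1) maps, via the standard Durbin-Levinson-type recursion, to the coefficients of a stationary AR(k) model, so priors placed on the partial autocorrelations can never produce a non-stationary draw. A small Python sketch of that mapping (illustrative only; the toolbox itself is MATLAB):

```python
import numpy as np

def pacf_to_ar(pacf):
    """Map partial autocorrelations in (-1, 1) to AR coefficients.

    Uses the Durbin-Levinson recursion: at step m the new coefficient vector
    is phi_{m,j} = phi_{m-1,j} - r_m * phi_{m-1,m-j}, with phi_{m,m} = r_m.
    Every pacf vector with entries in (-1, 1) yields a stationary AR model,
    which is why priors on the partial autocorrelations guarantee stationarity.
    """
    pacf = np.asarray(pacf, dtype=float)
    phi = np.zeros(0)
    for r in pacf:
        # subtract r times the reversed previous coefficients, then append r
        phi = np.append(phi - r * phi[::-1], r)
    return phi
```

Sampling on the unconstrained (-1, 1) scale and transforming is much simpler than trying to sample AR coefficients directly subject to the stationarity constraint.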
Once the ZIP file has been downloaded and extracted, all three folders/subfolders should be added to the MATLAB path. The script “RunRealDataTest” demonstrates how to use the software. [code]
- “Estimation of Stationary Autoregressive Models with the Bayesian LASSO”, D. F. Schmidt and E. Makalic, Journal of Time Series Analysis, Vol. 34, No. 5, pp. 517–531, 2013
For my first post of real content, I’ve decided to comment on a document I have just finished writing, which discusses Jorma Rissanen’s MDL (minimum description length) linear regression criterion presented in “MDL Denoising” [J. Rissanen, IEEE Transactions on Information Theory, Vol. 46, No. 7, 2000]. I’ve wanted to understand the mathematics behind this paper for quite some time, and last week I finally decided to sit down and work through it.
The result (available here) is a detailed, step-by-step derivation of Rissanen’s criterion, which I personally think is significantly easier to follow than the terser derivation presented in the original paper, and I hope someone will find it useful 🙂 Over the next couple of weeks, time permitting, I plan a similar exercise for several more of J. Rissanen’s papers, in particular “Fisher Information and Stochastic Complexity” and “Strong Optimality of the Normalized ML Models as Universal Codes and Information in Data”.