The Ordinary Least Squares (OLS) estimator is β̂ = (XtX)−1(XtY) for a design matrix X and an outcome vector Y (Hansen 2022).
The following code shows how to compute the OLS estimator using
Armadillo and sending data from R to C++ and viceversa using
cpp11
and cpp11armadillo
(Sanderson and Curtin 2016):
#include <cpp11.hpp>
#include <cpp11armadillo.hpp>
using namespace arma;
using namespace cpp11;
[[cpp11::register]] doubles_matrix<> ols_mat_(const doubles_matrix<>& x) {
Mat<double> Y = as_Mat(x); // convert from R to C++
Mat<double> Yinv = inv(Y); // Y^(-1)
return as_doubles_matrix(Yinv); // convert from C++ to R
}
The previous code includes the cpp11
and
cpp11armadillo
libraries (cpp11armadillo calls Armadillo)
to allow interfacing C++ with R. It also loads the corresponding
namespaces in order to simplify the notation (i.e., using
Mat
instead of arma::Mat
), and the function
as_Mat()
and as_doubles_mat()
are provided by
cpp11armadillo
to pass a matrix
object from R
to C++ and that Armadillo can read and then pass it back to R.
The use of const
and &
are specific to
the C++ language and allow to pass data from R to C++ without copying
the data, and therefore saving time and memory.
cpp11armadillo
provides flexibility and in the case of
the resulting vector of OLS coefficients, it can be returned as a matrix
or a vector. The following code shows how to create three functions to
compute the OLS estimator and return the result as a matrix or a vector
avoiding repeated code:
Mat<double> ols_(const doubles_matrix<>& y, const doubles_matrix<>& x) {
Mat<double> Y = as_Mat(y); // Col<double> Y = as_Col(y); also works
Mat<double> X = as_Mat(x);
Mat<double> XtX = X.t() * X; // X'X
Mat<double> XtX_inv = inv(XtX); // (X'X)^(-1)
Mat<double> beta = XtX_inv * X.t() * Y; // (X'X)^(-1)(X'Y)
return beta;
}
[[cpp11::register]] doubles_matrix<> ols_mat_(const doubles_matrix<>& y,
const doubles_matrix<>& x) {
Mat<double> beta = ols_(y, x);
return as_doubles_matrix(beta);
}
[[cpp11::register]] doubles ols_dbl_(const doubles_matrix<>& y,
const doubles_matrix<>& x) {
Mat<double> beta = ols_(y, x);
return as_doubles(beta);
}
In the previous code, the ols_mat_()
function receives
inputs from R and calls ols_()
to do the computation on C++
side, and ols_dbl_()
does the same but it returns a vector
instead of a matrix.
The package repository includes the directory
cpp11armadillotest
, which contains an R package that uses
Armadillo, and that provides additional examples for eigenvalues,
Cholesky and QR decomposition, and linear models.