
CRAN submission

Daniel Kapla 2021-03-08 13:31:13 +01:00
parent c554ae6e9c
commit a419a8cdc7
12 changed files with 70 additions and 111 deletions

.gitignore

@ -1,9 +1,10 @@
simulations/results/* simulations/results/*
literature/* literature/*
doc/*
CVE/src/*.o CVarE/src/*.o
CVE/src/*.so CVarE/src/*.so
CVE/src/*.dll CVarE/src/*.dll
*.tgz *.tgz
*.tar.xz *.tar.xz


@ -1,12 +1,28 @@
Package: CVE Package: CVarE
Type: Package Type: Package
Title: Conditional Variance Estimator for Sufficient Dimension Reduction Title: Conditional Variance Estimator for Sufficient Dimension Reduction
Version: 0.3 Version: 1.0
Date: 2021-03-04 Date: 2021-03-05
Author: Daniel Kapla <daniel@kapla.at>, Lukas Fertl <lukas.fertl@chello.at>
Maintainer: Daniel Kapla <daniel@kapla.at> Maintainer: Daniel Kapla <daniel@kapla.at>
Description: Implementation of the Conditional Variance Estimation (CVE) method. Author: Daniel Kapla [aut, cph, cre],
Lukas Fertl [aut, cph],
Efstathia Bura [ctb]
Description: Implementation of the Conditional Variance Estimation (CVE)
Fertl and Bura (2021) <arXiv:2102.08782> and the Ensemble Conditional Variance
Estimation (ECVE) Fertl and Bura (2021) <arXiv:2102.13435> method.
CVE and ECVE are sufficient dimension reduction (SDR) methods
in regressions with continuous response and predictors. CVE applies to general
additive error regression models while ECVE generalizes to non-additive error
regression models. They operate under the assumption that the predictors can
be replaced by a lower dimensional projection without loss of information.
It is a semiparametric forward regression model based exhaustive sufficient
dimension reduction estimation method that is shown to be consistent under mild
assumptions.
License: GPL-3 License: GPL-3
Contact: <daniel@kapla.at>
URL: https://git.art-ist.cc/daniel/CVE
Encoding: UTF-8 Encoding: UTF-8
NeedsCompilation: yes
Imports: stats,mda Imports: stats,mda
RoxygenNote: 7.0.2 RoxygenNote: 7.0.2


@ -17,4 +17,4 @@ importFrom(mda,mars)
importFrom(stats,model.frame) importFrom(stats,model.frame)
importFrom(stats,rbinom) importFrom(stats,rbinom)
importFrom(stats,rnorm) importFrom(stats,rnorm)
useDynLib(CVE, .registration = TRUE) useDynLib(CVarE, .registration = TRUE)


@ -46,7 +46,7 @@
#' arXiv:2102.13435 #' arXiv:2102.13435
#' #'
#' @docType package #' @docType package
#' @useDynLib CVE, .registration = TRUE #' @useDynLib CVarE, .registration = TRUE
"_PACKAGE" "_PACKAGE"
#' Conditional Variance Estimator (CVE). #' Conditional Variance Estimator (CVE).
@ -495,7 +495,7 @@ cve.call <- function(X, Y,
h <- estimate.bandwidth(X, k, nObs) h <- estimate.bandwidth(X, k, nObs)
} }
dr.k <- .Call('cve_export', PACKAGE = 'CVE', dr.k <- .Call('cve_export', PACKAGE = 'CVarE',
X, Fy, k, h, X, Fy, k, h,
method_nr, method_nr,
V.init, V.init,


@ -1,55 +0,0 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/CVE.R
\docType{package}
\name{CVE-package}
\alias{CVE}
\alias{CVE-package}
\title{Conditional Variance Estimator (CVE) Package.}
\description{
Conditional Variance Estimation (CVE) is a novel sufficient dimension
reduction (SDR) method for regressions satisfying \eqn{E(Y|X) = E(Y|B'X)},
where \eqn{B'X} is a lower dimensional projection of the predictors and
\eqn{Y} is a univariate response. CVE,
similarly to its main competitor, the mean average variance estimation
(MAVE), is not based on inverse regression, and does not require the
restrictive linearity and constant variance conditions of moment based SDR
methods. CVE is data-driven and applies to additive error regressions with
continuous predictors and link function. Let \eqn{X} be a real
\eqn{p}-dimensional covariate vector. We assume that the dependence of
\eqn{Y} and \eqn{X} is modelled by
}
\details{
\deqn{Y = g(B'X) + \epsilon}
where \eqn{X} is independent of \eqn{\epsilon} with positive definite
variance-covariance matrix \eqn{Var(X) = \Sigma_X}. \eqn{\epsilon} is a mean
zero random variable with finite \eqn{Var(\epsilon) = E(\epsilon^2)}, \eqn{g}
is an unknown, continuous non-constant function,
and \eqn{B = (b_1, ..., b_k)} is
a real \eqn{p \times k}{p x k} matrix of rank \eqn{k \leq p}{k <= p}.
Without loss of generality \eqn{B} is assumed to be orthonormal.
Further, the extended Ensemble Conditional Variance Estimation (ECVE) is
implemented which is a SDR method in regressions with continuous response and
predictors. ECVE applies to general non-additive error regression models.
\deqn{Y = g(B'X, \epsilon)}
It operates under the assumption that the predictors can be replaced by a
lower dimensional projection without loss of information.It is a
semiparametric forward regression model based exhaustive sufficient dimension
reduction estimation method that is shown to be consistent under mild
assumptions.
}
\references{
[1] Fertl, L. and Bura, E. (2021), Conditional Variance
Estimation for Sufficient Dimension Reduction.
arXiv:2102.08782
[2] Fertl, L. and Bura, E. (2021), Ensemble Conditional Variance
Estimation for Sufficient Dimension Reduction.
arXiv:2102.13435
}
\author{
Daniel Kapla, Lukas Fertl, Bura Efstathia
}


@ -20,12 +20,13 @@ the environment from which \code{cve} is called.}
\item{method}{This character string specifies the method of fitting. The \item{method}{This character string specifies the method of fitting. The
options are options are
\itemize{ \itemize{
\item "mean" method to estimate the mean subspace, see [1]. \item \code{"mean"} method to estimate the mean subspace, see [1].
\item "central" ensemble method to estimate the central subspace, see [2]. \item \code{"central"} ensemble method to estimate the central subspace,
\item "weighted.mean" variation of `"mean"` method with adaptive weighting see [2].
of slices, see [1]. \item \code{"weighted.mean"} variation of \code{"mean"} method with
\item "weighted.central" variation of `"central"` method with adaptive adaptive weighting of slices, see [1].
weighting of slices, see [2]. \item \code{"weighted.central"} variation of \code{"central"} method with
adaptive weighting of slices, see [2].
}} }}
\item{max.dim}{upper bounds for \code{k}, (ignored if \code{k} is supplied).} \item{max.dim}{upper bounds for \code{k}, (ignored if \code{k} is supplied).}


@ -34,12 +34,13 @@ cve.call(
\item{method}{This character string specifies the method of fitting. The \item{method}{This character string specifies the method of fitting. The
options are options are
\itemize{ \itemize{
\item "mean" method to estimate the mean subspace, see [1]. \item \code{"mean"} method to estimate the mean subspace, see [1].
\item "central" ensemble method to estimate the central subspace, see [2]. \item \code{"central"} ensemble method to estimate the central subspace,
\item "weighted.mean" variation of `"mean"` method with adaptive weighting see [2].
of slices, see [1]. \item \code{"weighted.mean"} variation of \code{"mean"} method with
\item "weighted.central" variation of `"central"` method with adaptive adaptive weighting of slices, see [1].
weighting of slices, see [2]. \item \code{"weighted.central"} variation of \code{"central"} method with
adaptive weighting of slices, see [2].
}} }}
\item{func_list}{a list of functions applied to \code{Y} used by ECVE \item{func_list}{a list of functions applied to \code{Y} used by ECVE


@ -1,7 +1,7 @@
#include "cve.h" #include "cve.h"
/** /**
* Calles a R function passed to the algoritm and supplied intermidiate * Calls a R function passed to the algorithm and supplied intermediate
* optimization values for logging the optimization progress. * optimization values for logging the optimization progress.
* The supplied parameters to the logger functions are as follows: * The supplied parameters to the logger functions are as follows:
* - attempt: Attempts counter. * - attempt: Attempts counter.
@ -19,7 +19,7 @@
* @param V Pointer memory area of size `nrowV * ncolV` storing `V`. * @param V Pointer memory area of size `nrowV * ncolV` storing `V`.
* @param G Pointer memory area of size `nrowG * ncolG` storing `G`. * @param G Pointer memory area of size `nrowG * ncolG` storing `G`.
* @param loss Current loss L(V). * @param loss Current loss L(V).
* @param err Errof for break condition (0.0 befor first iteration). * @param err Error for break condition (0.0 before first iteration).
* @param tau Current step-size. * @param tau Current step-size.
*/ */
void callLogger(SEXP logger, SEXP env, void callLogger(SEXP logger, SEXP env,
@ -63,6 +63,6 @@ void callLogger(SEXP logger, SEXP env,
/* Evaluate the logger function call expression. */ /* Evaluate the logger function call expression. */
eval(loggerCall, env); eval(loggerCall, env);
/* Unprotect created R objects. */ /* Unlock created R objects. */
UNPROTECT(11); UNPROTECT(11);
} }


@ -11,7 +11,6 @@ void cve(const mat *X, const mat *Fy, const double h,
mat *V, mat *L, mat *V, mat *L,
SEXP logger, SEXP loggerEnv) { SEXP logger, SEXP loggerEnv) {
// TODO: param and dim. validation.
int n = X->nrow, p = X->ncol, q = V->ncol; int n = X->nrow, p = X->ncol, q = V->ncol;
int attempt = 0, iter; int attempt = 0, iter;
double loss, loss_last, loss_best, err, tau; double loss, loss_last, loss_best, err, tau;
@ -20,16 +19,14 @@ void cve(const mat *X, const mat *Fy, const double h,
double sumK; double sumK;
double c = agility / (double)n; double c = agility / (double)n;
// TODO: check parameters! dim, ...
/* Create further intermediate or internal variables. */ /* Create further intermediate or internal variables. */
mat *lvecD_e = (void*)0; mat *lvecD_e = (void*)0;
mat *Fy_sq = (void*)0; mat *Fy_sq = (void*)0;
mat *XV = (void*)0; mat *XV = (void*)0;
mat *lvecD = (void*)0; // TODO: combine. aka in-place lvecToSym mat *lvecD = (void*)0;
mat *D = (void*)0; // TODO: combine. aka in-place lvecToSym mat *D = (void*)0;
mat *lvecK = (void*)0; // TODO: combine. aka in-place lvecToSym mat *lvecK = (void*)0;
mat *K = (void*)0; // TODO: combine. aka in-place lvecToSym mat *K = (void*)0;
mat *colSumsK = (void*)0; mat *colSumsK = (void*)0;
mat *rowSumsL = (void*)0; mat *rowSumsL = (void*)0;
mat *W = (void*)0; mat *W = (void*)0;
@ -44,7 +41,7 @@ void cve(const mat *X, const mat *Fy, const double h,
mat *V_best = (void*)0; mat *V_best = (void*)0;
mat *L_best = (void*)0; mat *L_best = (void*)0;
/* Allocate appropiate amount of working memory. */ /* Allocate appropriate amount of working memory. */
int workLen = 2 * (p + 1) * p; int workLen = 2 * (p + 1) * p;
if (workLen < n) { if (workLen < n) {
workLen = n; workLen = n;
@ -60,7 +57,7 @@ void cve(const mat *X, const mat *Fy, const double h,
/* Check if start value for `V` was supplied. */ /* Check if start value for `V` was supplied. */
if (attempts > 0) { if (attempts > 0) {
/* Sample start value from stiefel manifold. */ /* Sample start value from Stiefel manifold. */
V = rStiefel(p, q, V, workMem); V = rStiefel(p, q, V, workMem);
} }
@ -77,7 +74,7 @@ void cve(const mat *X, const mat *Fy, const double h,
colSumsK = colSums(K, colSumsK); colSumsK = colSums(K, colSumsK);
/* Normalize K columns to obtain weight matrix W */ /* Normalize K columns to obtain weight matrix W */
W = colApply(K, '/', colSumsK, W); W = colApply(K, '/', colSumsK, W);
/* first and second order weighted responces */ /* first and second order weighted responses */
y1 = matrixprod(1.0, W, Fy, 0.0, y1); y1 = matrixprod(1.0, W, Fy, 0.0, y1);
y2 = matrixprod(1.0, W, Fy_sq, 0.0, y2); y2 = matrixprod(1.0, W, Fy_sq, 0.0, y2);
/* Compute losses */ /* Compute losses */
@ -114,8 +111,8 @@ void cve(const mat *X, const mat *Fy, const double h,
A = skew(tau, G, V, 0.0, A); A = skew(tau, G, V, 0.0, A);
for (iter = 0; iter < maxIter; ++iter) { for (iter = 0; iter < maxIter; ++iter) {
/* Before Starting next iteration check if the Uer has requested an /* Before next iteration, check if the User has requested an
* interupt (aka. ^C, or "Stop" button). * interrupt (aka. ^C, or "Stop" button).
* If interrupted the algorithm will be exited here and everything * If interrupted the algorithm will be exited here and everything
* will be discharted! */ * will be discharted! */
R_CheckUserInterrupt(); R_CheckUserInterrupt();
@ -136,7 +133,7 @@ void cve(const mat *X, const mat *Fy, const double h,
colSumsK = colSums(K, colSumsK); colSumsK = colSums(K, colSumsK);
/* Normalize K columns to obtain weight matrix W */ /* Normalize K columns to obtain weight matrix W */
W = colApply(K, '/', colSumsK, W); W = colApply(K, '/', colSumsK, W);
/* first and second order weighted responces */ /* first and second order weighted responses */
y1 = matrixprod(1.0, W, Fy, 0.0, y1); y1 = matrixprod(1.0, W, Fy, 0.0, y1);
y2 = matrixprod(1.0, W, Fy_sq, 0.0, y2); y2 = matrixprod(1.0, W, Fy_sq, 0.0, y2);
/* Compute losses */ /* Compute losses */
@ -184,7 +181,7 @@ void cve(const mat *X, const mat *Fy, const double h,
if (method == weighted) { if (method == weighted) {
/* Calculate the scaling matrix S */ /* Calculate the scaling matrix S */
S = laplace(adjacence(L, Fy, y1, D, K, gauss, S), workMem); S = laplace(adjacence(L, Fy, y1, D, K, gauss, S), workMem);
c = agility / sumK; // n removed previousely c = agility / sumK; // n removed previously
} else { /* simple */ } else { /* simple */
/* Calculate the scaling matrix S */ /* Calculate the scaling matrix S */
S = laplace(adjacence(L, Fy, y1, D, W, gauss, S), workMem); S = laplace(adjacence(L, Fy, y1, D, W, gauss, S), workMem);


@ -65,9 +65,9 @@ mat* applyKernel(const mat* A, const double h, kernel kernel, mat* B) {
* v . . . . . . * v . . . . . .
* s[n-1] s[2n-1] . . . s[n-1] . . . s[nn-1] * s[n-1] s[2n-1] . . . s[n-1] . . . s[nn-1]
* *
* @param L per sample loss vector of (lenght `n`). * @param L per sample loss vector of (length `n`).
* @param Y responces (lenght `n`). * @param Y responses (length `n`).
* @param y1 weighted responces (lenght `n`). * @param y1 weighted responses (length `n`).
* @param D distance matrix (dim. `n x n`). * @param D distance matrix (dim. `n x n`).
* @param W weight matrix (dim. `n x n`). * @param W weight matrix (dim. `n x n`).
* @param kernel the kernel to be used. * @param kernel the kernel to be used.
@ -117,8 +117,6 @@ mat* adjacence(const mat *mat_L, const mat *mat_Fy, const mat *mat_y1,
double *Y, *L, *y1; double *Y, *L, *y1;
double *D, *W, *S; double *D, *W, *S;
// TODO: Check dims.
if (!mat_S) { if (!mat_S) {
mat_S = zero(n, n); mat_S = zero(n, n);
} else { } else {


@ -7,7 +7,7 @@
* *
* @details Reuses the memory area of the SEXP object, therefore manipulation * @details Reuses the memory area of the SEXP object, therefore manipulation
* of the returned matrix works in place of the SEXP object. In addition, * of the returned matrix works in place of the SEXP object. In addition,
* a reference to the original SEXP is stored and will be retriefed from * a reference to the original SEXP is stored and will be retrieved from
* `asSEXP()` if the matrix was created through this function. * `asSEXP()` if the matrix was created through this function.
*/ */
static mat* asMat(SEXP S) { static mat* asMat(SEXP S) {


@ -13,26 +13,26 @@ Open `R` and then the following:
```R ```R
# addapt to download file. # addapt to download file.
install.packages("path/to/cve_0.2.<end>", repos = NULL) install.packages("path/to/cve_0.2.<end>", repos = NULL)
library(CVE) # Test installation. library(CVarE) # Test installation.
``` ```
Please consult the man-pages `?install.package` and `?library` for further information. Please consult the man-pages `?install.package` and `?library` for further information.
## Installing Source ## Installing Source
Cloning the `CVE` repository and using `R`'s build and install routines from a terminal. Cloning the `CVarE` repository and using `R`'s build and install routines from a terminal.
```bash ```bash
git clone https://git.art-ist.cc/daniel/CVE.git # Clone repository git clone https://git.art-ist.cc/daniel/CVE.git # Clone repository
cd CVE # Go into the repository cd CVarE # Go into the repository
R CMD build CVE # Build package tarbal R CMD build CVarE # Build package tarbal
R CMD INSTALL CVE_0.2.tar.gz # Install package R CMD INSTALL CVarE_1.0.tar.gz # Install package
``` ```
### Alternative Installing Source from within R ### Alternative Installing Source from within R
Using the `devtools` the package can also be directly installed from within `R` Using the `devtools` the package can also be directly installed from within `R`
```R ```R
library(devtools) # Load the dectools library(devtools) # Load the dectools
setwd('<path_to_repo>/CVE') # Go into package root setwd('<path_to_repo>/CVarE') # Go into package root
(path <- build(vignettes = FALSE)) # Create source package (path <- build(vignettes = FALSE)) # Create source package
install.packages(path, repos = NULL, type = "source") # Install source package install.packages(path, repos = NULL, type = "source") # Install source package
library(CVE) # Test library(CVarE) # Test
``` ```
### Windows / macOS ### Windows / macOS
@ -40,4 +40,4 @@ Installing from source (for any package which contains compiled code, in our cas
See _R Installation and Administration_ from [r-project manuals](https://cran.r-project.org/manuals.html). See _R Installation and Administration_ from [r-project manuals](https://cran.r-project.org/manuals.html).
# Repository Structure # Repository Structure
The repository is structured in two directories, the `CVE/` directory which is the `R` package root and `simulations/` where all simulation scripts can be found (and `README.md` which is this). The repository is structured in two directories, the `CVarE/` directory which is the `R` package root and `simulations/` where all simulation scripts can be found (and `README.md` which is this).