CRAN submission

2021-03-08 13:31:13 +01:00 · a419a8cdc7
commit a419a8cdc7
parent c554ae6e9c
12 changed files with 70 additions and 111 deletions
--- a/.gitignore
+++ b/.gitignore
@ -1,9 +1,10 @@
 simulations/results/*
 literature/*
+doc/*

-CVE/src/*.o
-CVE/src/*.so
-CVE/src/*.dll
+CVarE/src/*.o
+CVarE/src/*.so
+CVarE/src/*.dll

 *.tgz
 *.tar.xz
--- a/CVarE/DESCRIPTION
+++ b/CVarE/DESCRIPTION
@ -1,12 +1,28 @@
-Package: CVE
+Package: CVarE
 Type: Package
 Title: Conditional Variance Estimator for Sufficient Dimension Reduction
-Version: 0.3
-Date: 2021-03-04
-Author: Daniel Kapla <daniel@kapla.at>, Lukas Fertl <lukas.fertl@chello.at>
+Version: 1.0
+Date: 2021-03-05
 Maintainer: Daniel Kapla <daniel@kapla.at>
-Description: Implementation of the Conditional Variance Estimation (CVE) method.
+Author: Daniel Kapla [aut, cph, cre],
+  Lukas Fertl [aut, cph],
+  Efstathia Bura [ctb]
+Description: Implementation of the Conditional Variance Estimation (CVE)
+  Fertl and Bura (2021) <arXiv:2102.08782> and the Ensemble Conditional Variance
+  Estimation (ECVE) Fertl and Bura (2021) <arXiv:2102.13435> method.
+
+  CVE and ECVE are sufficient dimension reduction (SDR) methods
+  in regressions with continuous response and predictors. CVE applies to general
+  additive error regression models while ECVE generalizes to non-additive error
+  regression models. They operate under the assumption that the predictors can
+  be replaced by a lower dimensional projection without loss of information.
+  Both are semiparametric, forward-regression-based, exhaustive sufficient
+  dimension reduction estimation methods that are shown to be consistent under
+  mild assumptions.
 License: GPL-3
+Contact: <daniel@kapla.at>
+URL: https://git.art-ist.cc/daniel/CVE
 Encoding: UTF-8
+NeedsCompilation: yes
 Imports: stats,mda
 RoxygenNote: 7.0.2
--- a/CVarE/NAMESPACE
+++ b/CVarE/NAMESPACE
@ -17,4 +17,4 @@ importFrom(mda,mars)
 importFrom(stats,model.frame)
 importFrom(stats,rbinom)
 importFrom(stats,rnorm)
-useDynLib(CVE, .registration = TRUE)
+useDynLib(CVarE, .registration = TRUE)
--- a/CVarE/R/CVE.R
+++ b/CVarE/R/CVE.R
@ -46,7 +46,7 @@
 #'          arXiv:2102.13435
 #'
 #' @docType package
-#' @useDynLib CVE, .registration = TRUE
+#' @useDynLib CVarE, .registration = TRUE
 "_PACKAGE"

 #' Conditional Variance Estimator (CVE).
@ -495,7 +495,7 @@ cve.call <- function(X, Y,
            h <- estimate.bandwidth(X, k, nObs)
        }

-        dr.k <- .Call('cve_export', PACKAGE = 'CVE',
+        dr.k <- .Call('cve_export', PACKAGE = 'CVarE',
                      X, Fy, k, h,
                      method_nr,
                      V.init,
--- a/CVarE/man/CVE-package.Rd
+++ b/CVarE/man/CVE-package.Rd
@ -1,55 +0,0 @@
-% Generated by roxygen2: do not edit by hand
-% Please edit documentation in R/CVE.R
-\docType{package}
-\name{CVE-package}
-\alias{CVE}
-\alias{CVE-package}
-\title{Conditional Variance Estimator (CVE) Package.}
-\description{
-Conditional Variance Estimation (CVE) is a novel sufficient dimension
-reduction (SDR) method for regressions satisfying \eqn{E(Y|X) = E(Y|B'X)},
-where \eqn{B'X} is a lower dimensional projection of the predictors and
-\eqn{Y} is a univariate response. CVE,
-similarly to its main competitor, the mean average variance estimation
-(MAVE), is not based on inverse regression, and does not require the
-restrictive linearity and constant variance conditions of moment based SDR
-methods. CVE is data-driven and applies to additive error regressions with
-continuous predictors and link function. Let \eqn{X} be a real
-\eqn{p}-dimensional covariate vector. We assume that the dependence of
-\eqn{Y} and \eqn{X} is modelled by
-}
-\details{
-\deqn{Y = g(B'X) + \epsilon}
-
-where \eqn{X} is independent of \eqn{\epsilon} with positive definite
-variance-covariance matrix \eqn{Var(X) = \Sigma_X}. \eqn{\epsilon} is a mean
-zero random variable with finite \eqn{Var(\epsilon) = E(\epsilon^2)}, \eqn{g}
-is an unknown, continuous non-constant function,
-and \eqn{B = (b_1, ..., b_k)} is
-a real \eqn{p \times k}{p x k} matrix of rank \eqn{k \leq p}{k <= p}. 
-Without loss of generality \eqn{B} is assumed to be orthonormal.
-
-Further, the extended Ensemble Conditional Variance Estimation (ECVE) is
-implemented which is a SDR method in regressions with continuous response and
-predictors. ECVE applies to general non-additive error regression models.
-
-\deqn{Y = g(B'X, \epsilon)}
-
-It operates under the assumption that the predictors can be replaced by a
-lower dimensional projection without loss of information.It is a
-semiparametric forward regression model based exhaustive sufficient dimension
-reduction estimation method that is shown to be consistent under mild
-assumptions.
-}
-\references{
-[1] Fertl, L. and Bura, E. (2021), Conditional Variance
-         Estimation for Sufficient Dimension Reduction.
-         arXiv:2102.08782
-
-   [2] Fertl, L. and Bura, E. (2021), Ensemble Conditional Variance
-         Estimation for Sufficient Dimension Reduction.
-         arXiv:2102.13435
-}
-\author{
-Daniel Kapla, Lukas Fertl, Bura Efstathia
-}
--- a/CVarE/man/cve.Rd
+++ b/CVarE/man/cve.Rd
@ -20,12 +20,13 @@ the environment from which \code{cve} is called.}
 \item{method}{This character string specifies the method of fitting. The
 options are
 \itemize{
-   \item "mean" method to estimate the mean subspace, see [1].
-   \item "central" ensemble method to estimate the central subspace, see [2].
-   \item "weighted.mean" variation of `"mean"` method with adaptive weighting
-     of slices, see [1].
-   \item "weighted.central" variation of `"central"` method with adaptive
-     weighting of slices, see [2].
+   \item \code{"mean"} method to estimate the mean subspace, see [1].
+   \item \code{"central"} ensemble method to estimate the central subspace,
+     see [2].
+   \item \code{"weighted.mean"} variation of \code{"mean"} method with
+     adaptive weighting of slices, see [1].
+   \item \code{"weighted.central"} variation of \code{"central"} method with
+     adaptive weighting of slices, see [2].
 }}

 \item{max.dim}{upper bounds for \code{k}, (ignored if \code{k} is supplied).}
@ -98,7 +99,7 @@ estimate the central subspace. This corresponds to the generalization

 \deqn{F(Y|X) = F(Y|B'X)}

-or, equivalently, 
+or, equivalently,

 \deqn{Y = g(B'X, \epsilon)}

--- a/CVarE/man/cve.call.Rd
+++ b/CVarE/man/cve.call.Rd
@ -34,12 +34,13 @@ cve.call(
 \item{method}{This character string specifies the method of fitting. The
 options are
 \itemize{
-   \item "mean" method to estimate the mean subspace, see [1].
-   \item "central" ensemble method to estimate the central subspace, see [2].
-   \item "weighted.mean" variation of `"mean"` method with adaptive weighting
-     of slices, see [1].
-   \item "weighted.central" variation of `"central"` method with adaptive
-     weighting of slices, see [2].
+   \item \code{"mean"} method to estimate the mean subspace, see [1].
+   \item \code{"central"} ensemble method to estimate the central subspace,
+     see [2].
+   \item \code{"weighted.mean"} variation of \code{"mean"} method with
+     adaptive weighting of slices, see [1].
+   \item \code{"weighted.central"} variation of \code{"central"} method with
+     adaptive weighting of slices, see [2].
 }}

 \item{func_list}{a list of functions applied to \code{Y} used by ECVE
@ -160,7 +161,7 @@ estimate the central subspace. This corresponds to the generalization

 \deqn{F(Y|X) = F(Y|B'X)}

-or, equivalently, 
+or, equivalently,

 \deqn{Y = g(B'X, \epsilon)}

--- a/CVarE/src/callLogger.c
+++ b/CVarE/src/callLogger.c
@ -1,7 +1,7 @@
 #include "cve.h"

 /**
- * Calles a R function passed to the algoritm and supplied intermidiate
+ * Calls an R function passed to the algorithm and supplies intermediate
 * optimization values for logging the optimization progress.
 * The supplied parameters to the logger functions are as follows:
 * - attempt: Attempts counter.
@ -19,7 +19,7 @@
 * @param V Pointer memory area of size `nrowV * ncolV` storing `V`.
 * @param G Pointer memory area of size `nrowG * ncolG` storing `G`.
 * @param loss Current loss L(V).
- * @param err Errof for break condition (0.0 befor first iteration).
+ * @param err Error for break condition (0.0 before first iteration).
 * @param tau Current step-size.
 */
 void callLogger(SEXP logger, SEXP env,
@ -63,6 +63,6 @@ void callLogger(SEXP logger, SEXP env,
    /* Evaluate the logger function call expression. */
    eval(loggerCall, env);

-    /* Unprotect created R objects. */
+    /* Unlock created R objects. */
    UNPROTECT(11);
 }
--- a/CVarE/src/cve.c
+++ b/CVarE/src/cve.c
@ -11,7 +11,6 @@ void cve(const mat *X, const mat *Fy, const double h,
         mat *V, mat *L,
         SEXP logger, SEXP loggerEnv) {

-    // TODO: param and dim. validation.
    int n = X->nrow, p = X->ncol, q = V->ncol;
    int attempt = 0, iter;
    double loss, loss_last, loss_best, err, tau;
@ -20,16 +19,14 @@ void cve(const mat *X, const mat *Fy, const double h,
    double sumK;
    double c = agility / (double)n;

-    // TODO: check parameters! dim, ...
-
    /* Create further intermediate or internal variables. */
    mat *lvecD_e  = (void*)0;
    mat *Fy_sq    = (void*)0;
    mat *XV       = (void*)0;
-    mat *lvecD    = (void*)0; // TODO: combine. aka in-place lvecToSym
-    mat *D        = (void*)0; // TODO: combine. aka in-place lvecToSym
-    mat *lvecK    = (void*)0; // TODO: combine. aka in-place lvecToSym
-    mat *K        = (void*)0; // TODO: combine. aka in-place lvecToSym
+    mat *lvecD    = (void*)0;
+    mat *D        = (void*)0;
+    mat *lvecK    = (void*)0;
+    mat *K        = (void*)0;
    mat *colSumsK = (void*)0;
    mat *rowSumsL = (void*)0;
    mat *W        = (void*)0;
@ -44,7 +41,7 @@ void cve(const mat *X, const mat *Fy, const double h,
    mat *V_best   = (void*)0;
    mat *L_best   = (void*)0;

-    /* Allocate appropiate amount of working memory. */
+    /* Allocate appropriate amount of working memory. */
    int workLen = 2 * (p + 1) * p;
    if (workLen < n) {
        workLen = n;
@ -60,7 +57,7 @@ void cve(const mat *X, const mat *Fy, const double h,

        /* Check if start value for `V` was supplied. */
        if (attempts > 0) {
-            /* Sample start value from stiefel manifold. */
+            /* Sample start value from Stiefel manifold. */
            V = rStiefel(p, q, V, workMem);
        }

@ -77,7 +74,7 @@ void cve(const mat *X, const mat *Fy, const double h,
        colSumsK = colSums(K, colSumsK);
        /* Normalize K columns to obtain weight matrix W */
        W = colApply(K, '/', colSumsK, W);
-        /* first and second order weighted responces */
+        /* first and second order weighted responses */
        y1 = matrixprod(1.0, W, Fy,    0.0, y1);
        y2 = matrixprod(1.0, W, Fy_sq, 0.0, y2);
        /* Compute losses */
@ -114,8 +111,8 @@ void cve(const mat *X, const mat *Fy, const double h,
        A = skew(tau, G, V, 0.0, A);

        for (iter = 0; iter < maxIter; ++iter) {
-            /* Before Starting next iteration check if the Uer has requested an
-             * interupt (aka. ^C, or "Stop" button).
+            /* Before the next iteration, check if the user has requested an
+             * interrupt (e.g. ^C or the "Stop" button).
             * If interrupted the algorithm will be exited here and everything
             * will be discarded! */
            R_CheckUserInterrupt();
@ -136,7 +133,7 @@ void cve(const mat *X, const mat *Fy, const double h,
            colSumsK = colSums(K, colSumsK);
            /* Normalize K columns to obtain weight matrix W */
            W = colApply(K, '/', colSumsK, W);
-            /* first and second order weighted responces */
+            /* first and second order weighted responses */
            y1 = matrixprod(1.0, W, Fy,    0.0, y1);
            y2 = matrixprod(1.0, W, Fy_sq, 0.0, y2);
            /* Compute losses */
@ -184,7 +181,7 @@ void cve(const mat *X, const mat *Fy, const double h,
            if (method == weighted) {
                /* Calculate the scaling matrix S */
                S = laplace(adjacence(L, Fy, y1, D, K, gauss, S), workMem);
-                c = agility / sumK; // n removed previousely
+                c = agility / sumK; // n removed previously
            } else { /* simple */
                /* Calculate the scaling matrix S */
                S = laplace(adjacence(L, Fy, y1, D, W, gauss, S), workMem);
--- a/CVarE/src/cve_subroutines.c
+++ b/CVarE/src/cve_subroutines.c
@ -65,9 +65,9 @@ mat* applyKernel(const mat* A, const double h, kernel kernel, mat* B) {
 *               v    .       .       .    .        .      .
 *                 s[n-1] s[2n-1] . . .  s[n-1] . . .   s[nn-1]
 *
- * @param L per sample loss vector of (lenght `n`).
- * @param Y responces (lenght `n`).
- * @param y1 weighted responces (lenght `n`).
+ * @param L per-sample loss vector (length `n`).
+ * @param Y responses (length `n`).
+ * @param y1 weighted responses (length `n`).
 * @param D distance matrix (dim. `n x n`).
 * @param W weight matrix (dim. `n x n`).
 * @param kernel the kernel to be used.
@ -117,8 +117,6 @@ mat* adjacence(const mat *mat_L, const mat *mat_Fy, const mat *mat_y1,
    double *Y, *L, *y1;
    double *D, *W, *S;

-    // TODO: Check dims.
-
    if (!mat_S) {
        mat_S = zero(n, n);
    } else {
--- a/CVarE/src/export.c
+++ b/CVarE/src/export.c
@ -7,7 +7,7 @@
 *
 * @details Reuses the memory area of the SEXP object, therefore manipulation
 *      of the returned matrix works in place of the SEXP object. In addition,
- *      a reference to the original SEXP is stored and will be retriefed from
+ *      a reference to the original SEXP is stored and will be retrieved from
 *      `asSEXP()` if the matrix was created through this function.
 */
 static mat* asMat(SEXP S) {
--- a/README.md
+++ b/README.md
@ -13,26 +13,26 @@ Open `R` and then the following:
 ```R
 # Adapt to the downloaded file.
 install.packages("path/to/cve_0.2.<end>", repos = NULL)
-library(CVE) # Test installation.
+library(CVarE) # Test installation.
 ```
 Please consult the man pages `?install.packages` and `?library` for further information.

 ## Installing Source
-Cloning the `CVE` repository and using `R`'s build and install routines from a terminal.
+Clone the `CVarE` repository and use `R`'s build and install routines from a terminal.
 ```bash
 git clone https://git.art-ist.cc/daniel/CVE.git  # Clone repository
-cd CVE                                           # Go into the repository
-R CMD build CVE                                  # Build package tarbal
-R CMD INSTALL CVE_0.2.tar.gz                     # Install package
+cd CVarE                                         # Go into the repository
+R CMD build CVarE                                # Build package tarball
+R CMD INSTALL CVarE_1.0.tar.gz                   # Install package
 ```
 ### Alternative: Installing Source from within R
 Using `devtools`, the package can also be installed directly from within `R`.
 ```R
 library(devtools)                                       # Load devtools
-setwd('<path_to_repo>/CVE')                             # Go into package root
+setwd('<path_to_repo>/CVarE')                           # Go into package root
 (path <- build(vignettes = FALSE))                      # Create source package
 install.packages(path, repos = NULL, type = "source")   # Install source package
-library(CVE)                                            # Test
+library(CVarE)                                          # Test
 ```

 ### Windows / macOS
@ -40,4 +40,4 @@ Installing from source (for any package which contains compiled code, in our cas
 See _R Installation and Administration_ from [r-project manuals](https://cran.r-project.org/manuals.html).

 # Repository Structure
-The repository is structured in two directories, the `CVE/` directory which is the `R` package root and `simulations/` where all simulation scripts can be found (and `README.md` which is this).
+The repository is structured in two directories: `CVarE/`, which is the `R` package root, and `simulations/`, where all simulation scripts can be found (plus this `README.md`).