- add: simulations,

- wip: README
2019-12-17 12:07:33 +01:00 · 2019-12-17 12:07:33 +01:00 · c437c60e84
commit c437c60e84
parent d5cd6075a9
9 changed files with 827 additions and 445 deletions
--- a/README.md
+++ b/README.md
@ -1,451 +1,34 @@
-# TODOs
+# Installation Instructions
-Doc:
+The package depends on the `mda` package (see: [mda @ cran.r-project](https://cran.r-project.org/web/packages/mda/index.html). Therefore the first step is to install the `mda` package
 - [x] Stiefel (instead of Stiefl)
 - [x] Return value description (`@returs`)
 - [x] DESCRIPTION
    - [x] Maintainer
    - [x] Author
    - [x] Volume
    - [x] Description (from Paper) and Ref.
 - [x] Ref paper in doc
 - [ ] Data set descriptions and augmentations.
 - [x] Demonstration of the `Logger` function usage (Demo file or so, ...)
 - [ ] Update Paper (to new version / version consistent with current code!)
 - [ ] Reference Paper in DESCRIPTION file (in Description or specific tag)
 - [ ] Split `cve` and `cve.call` docs.
 - [ ] "Copy" form `dr` package (specifically `dr.directions` -> description)
 - [ ] Document `C` code.
 Methods to be implemented:
 - [x] simple
 - [x] weighted
 - [x] momentum
 - [x] weighted with momentum
 Performance:
 - [x] Pure C implementation.
 - [NOT Feasible] Stochastic Version
 - [NOT Feasible] Gradient Approximations (using Algebraic Software for alternative Loss function formulations and gradient optimizations)
 - [NOT Sufficient] Alternative Kernels for reducing samples
 - [x] (To Be further investigated) "Kronecker" optimization
 - [ ] Implement "Kronecker" optimization
 Features (functions):
 - [x] Initial `V.init` parameter (only ONE try, ignore number of `attempts` parameter)
 - [x] `basis.cve` list of estimated `B`s (with `k` supplied, only `B`)
 - [x] `directions.cve` Projected `X` given `k`
 - [x] `predict.cve` using `mars` for predicting responses given new data.
 - [x] `predict.dim.cve` Cross-validation or `aov` (in stats package) or "elbow" estimation
 - [x] `plot.elbow`
 - [x] `summary`
 - [ ] Consider `cor.test` for dimension selection
 - [x] Check for user interrupt (`R_CheckUserInterrupt`)
 Changes:
 - [x] New `estimate.bandwidth` implementation.
    (h = 2 * (tr(\Sigma) / p) * (6/5 * n^(-1 / (4 + k)))^2,
    \Sigma = 1/n * (X-mean)'(X-mean))
 Errors:
 - [x] `CVE_C` compare to `CVE_legacy`.
 - [x] fix: `predict.dim` not found.
 - [ ] fix/check: error computation for break condition (in `cve.c`)
 # Development
 ## Build and install.
 To build the package the `devtools` package is used. This also provides `roxygen2` which is used for documentation and automatic creation of the `NAMESPACE` file.
 ```R
-setwd("./CVE_R")  # Set path to the package root.
+install.packages("mda")
 library(devtools) # Load required `devtools` package.
 document()        # Create `.Rd` files and write `NAMESPACE`.
 ```
-Next the package needs to be build, therefore (if pure `R` package, aka. `C/C++`, `Fortran`, ... code) just do the following.
+A release will be available in a few days.
 - Linux: Not available jet (one or two days, state 17.12.2019)
 - Windows: Not available jet (one or two days, state 17.12.2019)
 - MacOS: Not available jet (unknown)
 Open `R` and then the following:
 ```R
 # addapt to download file.
 install.packages("path/to/cve_0.2.<end>", repos = NULL)
 library(CVE) # Test installation.
 ```
 Please consult the man-pages `?install.package` and `?library` for further information.
 ## Installing Source
 Cloning the `CVE` repository and using `R`'s build and install routines from a terminal.
 ```bash
-R CMD build CVE_C; R CMD INSTALL CVE_0.2.tar.gz
+git clone https://git.art-ist.cc/daniel/CVE.git  # Clone repository
-```
+cd CVE                                           # Go into the repository
-Then we are ready for using the package.
+R CMD build CVE                                  # Build package tarbal
-As well as building the `NAMESPACE` and `*.Rd` files using `devtools` (`roxygen2`) the following resembles an entire build pipeline including checks.
+R CMD INSTALL CVE_0.2.tar.gz                     # Install package
 ```bash
 R -q -e 'library(devtools); setwd("CVE_C"); pkgbuild::compile_dll(); document(); pkgbuild::clean_dll()'
 R CMD build CVE_C; R CMD check CVE_0.2.tar.gz;
 R CMD INSTALL CVE_0.2.tar.gz
 ```
 ## Build and install from within `R`.
 An alternative approach is the following.
 ```R
 ## Installing CVE (C implementation)
 (setwd('~/Projects/CVE/CVE_C'))
 # equiv to Rcpp::compileAttributes().
 library(devtools)
 pkgbuild::compile_dll() # required for packages with C/C++ code
 document()              # See bug: https://github.com/stan-dev/rstantools/issues/52
 pkgbuild::clean_dll()
 (path <- build(vignettes = FALSE))
 install.packages(path, repos = NULL, type = "source")
 library(CVE)
 ```
 **Note: I only recommend this approach during development.**
 # Package Structure
 ## Demos
 A demo is an `.R` file that lives in `demo/`. Demos are like examples but tend to
 be longer. Instead of focusing on a single function, they show how to weave
 together multiple functions to solve a problem.
 You list and access demos with `demo()`:
   * Show all available demos: `demo()`.
   * Show all demos in a package: `demo(package = "CVE")`.
   * Run a specific demo: `demo("runtime_test", package = "CVE")`.
   * Find a demo: `system.file("demo", "runtime_test.R", package = "CVE")`.
 Each demo must be listed in `demo/00Index` in the following form:
 `demo-name Demo description`.
 The demo name is the name of the file without the extension,
 e.g. `demo/runtime_test.R` becomes `runtime_test`.
 By default the demo ask for human input for each plot: "Hit to see next plot".
 This behavior can be overridden by adding `devAskNewPage(ask = FALSE)` to
 the demo file. You can add pauses by adding:
 `readline("press any key to continue")`.
 **Note**: Demos are not automatically tested by `R CMD check`. This means that they
 can easily break without your knowledge.
 # General Notes for Source Code analysis
 ## Search in multiple files.
 Using the Linux `grep` program with the parameters `-rnw` and specifying a include files filter like the following example.
 ```bash
 grep --include=*\.{c,h,R} -rnw '.' -e "sweep"
 ```
 searches in all `C` source and header files as well as `R` source files for the term _sweep_.
 ## Recursive directory compare with colored structure (more or less).
 ```bash
 diff -r CVE_R/ CVE_C/ | grep -E "^([<>]|[^<>].*)"
 ```
-## Parsing `bash` script parameters.
+### Windows / macOS
-```bash
+Installing from source (for any package which contains compiled code, in our case `C`) on Windows and MacOS requires additional tools.
-usage="$0 [-v|--verbose] [-n|--dry-run] [(-s|--stack-size) <size>] [-h|--help] [-- [p1, [p2, ...]]]"
+See _R Installation and Administration_ from [r-project manuals](https://cran.r-project.org/manuals.html).
 verbose=false
 help=false
 dry_run=false
 stack_size=0
-while [ $# -gt 0 ]; do
+# Repository Structure
-    case "$1" in
+The repository is structured in two directories, the `CVE/` directory which is the `R` package root and `simulations/` where all simulation scripts can be found (and `README.md` which is this).
        -v | --verbose )    verbose=true;      shift ;;
        -n | --dry-run )    dry_run=true;      shift ;;
        -s | --stack-size ) stack_size="$2";   shift; shift ;;
        -h | --help )       echo $usage;       exit ;; # On help print usage and exit.
        -- ) shift; break ;;            # Break param "parsing".
         * ) echo $usage >&2; exit 1 ;; # Print usage and exit with failure.
    esac
 done
 echo verbose=$verbose
 echo dry_run=$dry_run
 echo stack_size=$stack_size
 ```
 # Analysis
 ## Logging (a `cve` run).
 To log `loss`, `error` (estimated) the true error (error of current estimated `B` against the true `B`) or even the step size one can use the `logger` parameter. A `logger` is a function that gets the current `environment` of the CVE optimization methods (__do not alter this environment, only read from it__). This can be used to create logs like in the following example.
 ```R
 library(CVE)
 # Setup histories.
 (epochs <- 50)
 (attempts <- 10)
 loss.history       <- matrix(NA, epochs + 1, attempts)
 error.history      <- matrix(NA, epochs + 1, attempts)
 tau.history        <- matrix(NA, epochs + 1, attempts)
 true.error.history <- matrix(NA, epochs + 1, attempts)
 # Create a dataset
 ds <- dataset("M1")
 X <- ds$X
 Y <- ds$Y
 B <- ds$B # the true `B`
 (k <- ncol(ds$B))
 # True projection matrix.
 P <- B %*% solve(t(B) %*% B) %*% t(B)
 # Define the logger for the `cve()` method.
 logger <- function(env) {
    # Note the `<<-` assignement!
    loss.history[env$epoch + 1, env$attempt] <<- env$loss
    error.history[env$epoch + 1, env$attempt] <<- env$error
    tau.history[env$epoch + 1, env$attempt] <<- env$tau
    # Compute true error by comparing to the true `B`
    B.est <- null(env$V) # Function provided by CVE
    P.est <- B.est %*% solve(t(B.est) %*% B.est) %*% t(B.est)
    true.error <- norm(P - P.est, 'F') / sqrt(2 * k)
    true.error.history[env$epoch + 1, env$attempt] <<- true.error
 }
 # Performa SDR
 dr <- cve(Y ~ X, k = k, logger = logger, epochs = epochs, attempts = attempts)
 # Plot history's
 par(mfrow = c(2, 2))
 matplot(loss.history,       type = 'l', log = 'y', xlab = 'iter',
        main = 'loss', ylab = expression(L(V[iter])))
 matplot(error.history,      type = 'l', log = 'y', xlab = 'iter',
        main = 'error', ylab = 'error')
 matplot(tau.history,        type = 'l', log = 'y', xlab = 'iter',
        main = 'tau', ylab = 'tau')
 matplot(true.error.history, type = 'l', log = 'y', xlab = 'iter',
        main = 'true error', ylab = 'true error')
 ```
 ## Reading log files.
 The run-time tests (upcoming further tests) are creating log files saved in `tmp/`. These log files are `CSV` files (actually `TSV`) with a header storing the test results. Depending on the test the files may contain different data. As an example we use the run-time test logs which store in each line the `dataset`, the used `method` as well as the `error` (actual error of estimated `B` against real `B`) and the `time`. For reading and analyzing the data see the following example.
 ```R
 # Load log as `data.frame`
 log <- read.csv('tmp/test0.log', sep = '\t')
 # Create a error boxplot grouped by dataset.
 boxplot(error ~ dataset, log)
 # Overview
 for (ds.name in paste0('M', seq(5))) {
    ds <- subset(log, dataset == ds.name, select = c('method', 'dataset', 'time', 'error'))
    print(summary(ds))
 }
 ```
 ## Environments and variable lookup.
 In the following a view simple examples of how `R` searches for variables.
 In addition we manipulate function closures to alter the search path in variable lookup and outer scope variable manipulation.
 ```R
 droids <- "These aren't the droids you're looking for."
 search <- function() {
    print(droids)
 }
 trooper.seeks <- function() {
    droids <- c("R2-D2", "C-3PO")
    search()
 }
 jedi.seeks <- function() {
    droids <- c("R2-D2", "C-3PO")
    environment(search) <- environment()
    search()
 }
 trooper.seeks()
 # [1] "These aren't the droids you're looking for."
 jedi.seeks()
 # [1] "R2-D2", "C-3PO"
 ```
 The next example illustrates how to write (without local copies) to variables outside the functions local environment.
 ```R
 counting <- function() {
    count <<- count + 1 # Note the `<<-` assignment.
 }
 (function() {
    environment(counting) <- environment()
    count <- 0
    for (i in 1:10) {
        counting()
    }
    return(count)
 })()
 (function () {
    closure <- new.env()
    environment(counting) <- closure
    assign("count", 0, envir = closure)
    for (i in 1:10) {
        counting()
    }
    return(closure$count)
 })()
 ```
 Another example for the usage of `do.call` where the evaluation of parameters is illustrated (example taken (and altered) from `?do.call`).
 ```R
 ## examples of where objects will be found.
 A <- "A.Global"
 f <- function(x) print(paste("f.new", x))
 env <- new.env()
 assign("A", "A.new", envir = env)
 assign("f", f, envir = env)
 f <- function(x) print(paste("f.Global", x))
 f(A)                                          # f.Global A.Global
 do.call("f", list(A))                         # f.Global A.Global
 do.call("f", list(A), envir = env)            # f.new A.Global
 do.call(f,   list(A), envir = env)            # f.Global A.Global
 do.call("f", list(quote(A)), envir = env)     # f.new A.new
 do.call(f,   list(quote(A)), envir = env)     # f.Global A.new
 do.call("f", list(as.name("A")), envir = env) # f.new A.new
 do.call("f", list(as.name("A")), envir = env) # f.new A.new
 ```
 # Performance benchmarks
 In this section alternative implementations of simple algorithms are compared for there performance.
 ### Computing the trace of a matrix multiplication.
 ```R
 library(microbenchmark)
 A <- matrix(runif(120), 12, 10)
 # Check correctnes and benckmark performance.
 stopifnot(
    all.equal(
        sum(diag(t(A) %*% A)), sum(diag(crossprod(A, A)))
    ),
    all.equal(
        sum(diag(t(A) %*% A)), sum(A * A)
    )
 )
 microbenchmark(
    MM = sum(diag(t(A) %*% A)),
    cross = sum(diag(crossprod(A, A))),
    elem = sum(A * A)
 )
 # Unit: nanoseconds
 #   expr  min     lq    mean median     uq   max neval
 #     MM 4232 4570.0 5138.81   4737 4956.0 40308   100
 #  cross 2523 2774.5 2974.93   2946 3114.5  5078   100
 #   elem  582  762.5  973.02    834  964.0 12945   100
 ```
 ```R
 n <- 200
 M <- matrix(runif(n^2), n, n)
 dnorm2 <- function(x) exp(-0.5 * x^2) / sqrt(2 * pi)
 stopifnot(
    all.equal(dnorm(M), dnorm2(M))
 )
 microbenchmark(
    dnorm = dnorm(M),
    dnorm2 = dnorm2(M),
    exp = exp(-0.5 * M^2) # without scaling -> irrelevant for usage
 )
 # Unit: microseconds
 #   expr     min      lq     mean   median       uq      max neval
 #  dnorm 841.503 843.811 920.7828 855.7505 912.4720 2405.587   100
 # dnorm2 543.510 580.319 629.5321 597.8540 607.3795 2603.763   100
 #    exp 502.083 535.943 577.2884 548.3745 561.3280 2113.220   100
 ```
 ### Using `crosspord()`
 ```R
 p <- 12
 q <- 10
 V <- matrix(runif(p * q), p, q)
 stopifnot(
    all.equal(V %*% t(V), tcrossprod(V)),
    all.equal(V %*% t(V), tcrossprod(V, V))
 )
 microbenchmark(
    V %*% t(V),
    tcrossprod(V),
    tcrossprod(V, V)
 )
 # Unit: microseconds
 #              expr   min     lq    mean median     uq    max neval
 #        V %*% t(V) 2.293 2.6335 2.94673 2.7375 2.9060 19.592   100
 #     tcrossprod(V) 1.148 1.2475 1.86173 1.3440 1.4650 30.688   100
 #  tcrossprod(V, V) 1.003 1.1575 1.28451 1.2400 1.3685  2.742   100
 ```
 ### Recycling vs. Sweep
 ```R
 (n <- 200)
 (p <- 12)
 (q <- 10)
 X_diff <- matrix(runif(n * (n - 1) / 2 * p), n * (n - 1) / 2, p)
 V <- matrix(rnorm(p * q), p, q)
 vecS <- runif(n * (n - 1) / 2)
 stopifnot(
    all.equal((X_diff %*% V) * rep(vecS, q),
              sweep(X_diff %*% V, 1, vecS, `*`)),
    all.equal((X_diff %*% V) * rep(vecS, q),
              (X_diff %*% V) * vecS)
 )
 microbenchmark(
    rep = (X_diff %*% V) * rep(vecS, q),
    sweep = sweep(X_diff %*% V, 1, vecS, `*`, check.margin = FALSE),
    recycle = (X_diff %*% V) * vecS
 )
 # Unit: microseconds
 #     expr      min        lq     mean    median       uq      max neval
 #      rep  851.723  988.3655 1575.639 1203.6385 1440.578 18999.23   100
 #    sweep 1313.177 1522.4010 2355.269 1879.2605 2065.399 18783.24   100
 #  recycle  719.001  786.1265 1157.285  881.8825 1163.202 19091.79   100
 ```
 ### Scaled `crossprod` with matrix multiplication order.
 ```R
 (n <- 200)
 (p <- 12)
 (q <- 10)
 X_diff <- matrix(runif(n * (n - 1) / 2 * p), n * (n - 1) / 2, p)
 V <- matrix(rnorm(p * q), p, q)
 vecS <- runif(n * (n - 1) / 2)
 ref <- crossprod(X_diff, X_diff * vecS) %*% V
 stopifnot(
    all.equal(ref, crossprod(X_diff, (X_diff %*% V) * vecS)),
    all.equal(ref, crossprod(X_diff, (X_diff %*% V) * vecS))
 )
 microbenchmark(
    inner = crossprod(X_diff, X_diff * vecS) %*% V,
    outer = crossprod(X_diff, (X_diff %*% V) * vecS)
 )
 # Unit: microseconds
 #   expr      min       lq     mean    median       uq       max neval
 #  inner  789.065  867.939 1683.812  987.9375 1290.055 16800.265   100
 #  outer 1141.479 1216.929 1404.702 1317.7315 1582.800  2531.766   100
 ```
 ### Fast dist matrix computation (aka. row sum of squares).
 ```R
 library(microbenchmark)
 library(CVE)
 (n <- 200)
 (N <- n * (n - 1) / 2)
 (p <- 12)
 M <- matrix(runif(N * p), N, p)
 stopifnot(
    all.equal(rowSums(M^2), rowSums.c(M^2)),
    all.equal(rowSums(M^2), rowSquareSums.c(M))
 )
 microbenchmark(
    sums = rowSums(M^2),
    sums.c = rowSums.c(M^2),
    sqSums.c = rowSquareSums.c(M)
 )
 # Unit: microseconds
 #      expr     min       lq      mean    median       uq      max neval
 #      sums 666.311 1051.036 1612.3100 1139.0065 1547.657 13940.97   100
 #    sums.c 342.647  672.453 1009.9109  740.6255 1224.715 13765.90   100
 #  sqSums.c 115.325  142.128  175.6242  153.4645  169.678   759.87   100
 ```
 ## Using `Rprof()` for performance.
 The standard method for profiling where an algorithm is spending its time is with `Rprof()`.
 ```R
 path <- '../tmp/R.prof' # path to profiling file
 Rprof(path)
 cve.res <- cve.call(X, Y, k = k)
 Rprof(NULL)
 (prof <- summaryRprof(path)) # Summarise results
 ```
 **Note: consider to run `gc()` before measuring**, aka cleaning up by explicitly calling the garbage collector.
--- a/simulations/hitters.R
+++ b/simulations/hitters.R
@ -0,0 +1,80 @@
 library(ISLR) # for Hitters dataset
 library(MAVE)
 library(CVE)
 # Set global parameters.
 seed <- 21
 max.dim <- 5L
 attempts <- 25L
 max.iter <- 100L
 # momentum <- 0.0
 # Prepair data for analysis.
 cols <- c("Salary", "AtBat", "Hits", "HmRun", "Runs", "RBI",
          "Walks", "Years", "CAtBat","CHits", "CHmRun","CRuns",
          "CRBI", "CWalks", "PutOuts", "Assists", "Errors")
 outliers <- c(92,  # Gary Pettis
              120, # John Moses
              173, # Milt Thompson
              189, # Rick Burleson
              220, # Scott Fletcher
              230, # Tom Foley
              241) # Terry Puhl
 # Subselect as a matrix without outliers as well as reordered
 # and filtered columns.
 ds <- na.omit(Hitters[, cols])[-outliers, ]
 ds$Salary <- log(ds$Salary)
 ds <- scale(ds, center = TRUE, scale = TRUE)
 # Split into data and responce.
 X <- as.matrix(ds[, colnames(ds) != "Salary"])
 Y <- as.matrix(ds[, "Salary"])
 pdf("results/hitters.pdf", width = 15, height = 5)
 layout(matrix(c(1, 2), nrow = 1))
 cat("Seed: ", seed, "\n\n")
 ################################################################################
 ###                                 meanMAVE                                 ###
 ################################################################################
 set.seed(seed)
 dr <- mave(Y ~ X, method = "meanMAVE", max.dim = max.dim)
 dim <- which.min(mave.dim(dr)$cv)
 B <- coef(dr, 2L)
 # train linear model on reduced data
 data <- as.data.frame(cbind(Y, X %*% B))
 colnames(data) <- c("Y", "dir1", "dir2")
 model <- lm(Y ~ dir1 + I(dir1^2) + dir2, data = data)
 # Print and log.
 cat("### meanMAVE\n\nest.dim: ", dim, '\n\n')
 summary(model)
 plot(x = data[, "dir1"], y = data[, "Y"],
     main = "meanMave", xlab = "dir1", ylab = "Y")
 plot(x = data[, "dir2"], y = data[, "Y"],
     main = "meanMave", xlab = "dir2", ylab = "Y")
 ################################################################################
 ###                                   CVE                                    ###
 ################################################################################
 set.seed(seed)
 dr <- cve(Y ~ X, max.dim = max.dim, max.iter = max.iter, attempts = attempts)
 dim <- predict_dim(dr, method = "cv")$k
 B <- coef(dr, 2L)
 # Determine 1st direction (cause CVE does not care for column order).
 if (which.max(abs(t(coef(dr, 1L)) %*% B)) == 2) {
     B <- B[, c(2, 1)] # switch directions
 }
 # train linear model on reduced data
 data <- as.data.frame(cbind(Y, X %*% B))
 colnames(data) <- c("Y", "dir1", "dir2")
 model <- lm(Y ~ dir1 + I(dir1^2) + dir2, data = data)
 # Print and log.
 cat("### CVE\n\nest.dim: ", dim, '\n\n')
 summary(model)
 plot(x = data[, "dir1"], y = data[, "Y"],
     main = "CVE", xlab = "dir1", ylab = "Y")
 plot(x = data[, "dir2"], y = data[, "Y"],
     main = "CVE", xlab = "dir2", ylab = "Y")
 dev.off()
--- a/simulations/hitters_logging.R
+++ b/simulations/hitters_logging.R
@ -0,0 +1,108 @@
 library(ISLR) # for Hitters dataset
 library(MAVE)
 library(CVE)
 # Set global parameters.
 seed <- 21
 max.dim <- 5L
 attempts <- 25L
 max.iter <- 100L
 momentum <- 0.0
 name <- "Hitters"
 set.seed(seed)
 # Prepair data for analysis.
 cols <- c("Salary", "AtBat", "Hits", "HmRun", "Runs", "RBI",
          "Walks", "Years", "CAtBat","CHits", "CHmRun","CRuns",
          "CRBI", "CWalks", "PutOuts", "Assists", "Errors")
 outliers <- c(92,  # Gary Pettis
              120, # John Moses
              173, # Milt Thompson
              189, # Rick Burleson
              220, # Scott Fletcher
              230, # Tom Foley
              241) # Terry Puhl
 # Subselect as a matrix without outliers as well as reordered
 # and filtered columns.
 ds <- na.omit(Hitters[, cols])[-outliers, ]
 ds$Salary <- log(ds$Salary)
 ds <- scale(ds, center = TRUE, scale = TRUE)
 # Split into data and responce.
 X <- as.matrix(ds[, colnames(ds) != "Salary"])
 Y <- as.matrix(ds[, "Salary"])
 path <- file.path(getwd(), 'results', 'hitters_logger.pdf')
 pdf(path, width = 8.27, height = 11.7) # width, height unit is inces -> A4
 layout(matrix(c(1, 1,
                2, 3,
                4, 5), nrow = 3, byrow = TRUE))
 # Setup histories.
 loss.history       <- matrix(NA, max.iter + 1, attempts)
 error.history      <- matrix(NA, max.iter + 1, attempts)
 tau.history        <- matrix(NA, max.iter + 1, attempts)
 grad.norm.history  <- matrix(NA, max.iter + 1, attempts)
 # Define logger for `cve()` method.
 logger <- function(attempt, iter, data) {
    # Note the `<<-` assignement!
    loss.history[iter + 1, attempt] <<- data$loss
    error.history[iter + 1, attempt] <<- data$err
    tau.history[iter + 1, attempt] <<- data$tau
    grad.norm.history[iter + 1, attempt] <<- norm(data$G, 'F')
 }
 dr <- cve(Y ~ X, k = 2L, max.iter = max.iter, attempts = attempts,
        logger = logger)
 B <- coef(dr, 2L)
 loss <- dr$res[['2']]$loss
 textplot <- function(...) {
    text <- unlist(list(...))
    if (length(text) > 20) {
        text <- c(text[1:17],
                  '   ...... (skipped, text too long) ......',
                  text[c(-1, 0) + length(text)])
    }
    plot(NA, xlim = c(0, 1), ylim = c(0, 1),
         bty = 'n', xaxt = 'n', yaxt = 'n', xlab = '', ylab = '')
    for (i in seq_along(text)) {
        text(0, 1 - (i / 20),
             text[[i]], pos = 4)
    }
 }
 # Write metadata.
 textplot(
    paste0("Seed value: ",     seed),
            "",
    paste0("Dataset Name: ",   name),
    paste0("dim(X) = (", nrow(X), ", ", ncol(X), ")"),
    paste0("dim(B) = (", nrow(B), ", ", ncol(B), ")"),
            "",
    paste0("CVE method: ",     dr$method),
    paste0("Max Iterations: ", max.iter),
    paste0("Attempts: ",       attempts),
    paste0("Momentum: ",       momentum),
            "CVE call:",
            paste0("  > ", format(dr$call)),
            "",
    paste0("True Error: ", NA),
    paste0("loss: ",       round(loss, 3))
 )
 # Plot history's
 matplot(loss.history,       type = 'l', log = 'y', xlab = 'i (iteration)',
    main = paste('loss', name),
    ylab = expression(L(V[i])))
 matplot(grad.norm.history, type = 'l', log = 'y', xlab = 'i (iteration)',
    main = paste('gradient norm', name),
    ylab = expression(group('|', paste(nabla, L(V)), '|')[F]))
 matplot(error.history,      type = 'l', log = 'y', xlab = 'i (iteration)',
    main = paste('error', name),
    ylab = expression(group('|', V[i-1]*V[i-1]^T - V[i]*V[i]^T, '|')[F]))
 matplot(tau.history,        type = 'l', log = 'y', xlab = 'i (iteration)',
    main = paste('learning rate', name),
    ylab = expression(tau[i]))
--- a/simulations/method_compair.R
+++ b/simulations/method_compair.R
@ -0,0 +1,61 @@
 # Compairson of differend dimension reduction (dr) methods over all datasets
 # with default parametrs.
 #
 # Results are written to a CSV file with a dataset column and its simulation nr.
 #
 # Note:
 #   All methods are called via interface functions defined in
 #   'simulation_interfaces.R'. In addition this file provides a few convenience
 #   functions, namely: 'progress_logger' and 'subspace_dist'.
 #
 # Dependencies (list of packages loaded):
 #   - MASS
 #   - dr
 #   - MAVE
 #   - CVE
 source('simulation_interfaces.R') # depends on MASS, dr, MAVE and CVE
 # Number of simulations
 NR.SIM <- 100L
 DS <- 1L:7L # datasets run from 1 to 7.
 # Methods for comparison, see 'simulation_interfaces.R' for their definitions.
 methods <- c('CVE', 'wCVE', 'rCVE',  # CVE variants
             'meanOPG', 'rOPG',      # OPG variants
             'meanMAVE', 'rmave',    # MAVE variants
             'phdy', 'sir', 'save')  # Methods from dr package
 # Build result data matrix (repeat each dataset simulation times).
 result <- matrix(NA, nrow = length(DS) * NR.SIM,
                     ncol = length(methods) + 1) # +1 for dataset column.
 result[, 1L] <- rep(paste0("M", DS), NR.SIM)
 colnames(result) <- c("dataset", methods)
 # Create a progress logger for length many reports.
 logger <- progress_logger(length(DS) * NR.SIM * length(methods))
 for (sim in seq_len(nrow(result))) {
    # Create new dataset.
    with(dataset(result[sim, 1L]), { X <<- X; Y <<- Y; B <<- B })
    # And each reference method.
    for (method in methods) {
        # Report simulation progress to user.
        logger(method)
        # Try/Catch to avoid simulation stop on error, this should not
        # accure for CVE but may be the case for others like OPG.
        tryCatch({
            # Call dimension reduction method.
            dr <- eval(call(method, X = X, Y = Y, k = ncol(B)))
            # Compute distance of estimated `B` to true `B`.
            result[sim, method] <- subspace_dist(B, coef(dr, ncol(B)))
        }, error = function(e) {}) # No-Operation error handler!
    }
    # explicit call to the garbage collector.
    gc()
 }
 path <- paste0(getwd(),
    format(Sys.time(), '/results/method_compair_%Y-%m-%dT%H%M%S'))
 # Write entire simulation results into a single file.
 write.csv(result, file = paste0(path, ".csv"), row.names = FALSE)
--- a/simulations/method_compair_M2.R
+++ b/simulations/method_compair_M2.R
@ -0,0 +1,73 @@
 # Specialized comparison for dataset M2 with others than the default parameters.
 #
 # Results are written to a CSV file with a dataset column and its simulation nr.
 #
 # Note:
 #   All methods are called via interface functions defined in
 #   'simulation_interfaces.R'. In addition this file provides a few convenience
 #   functions, namely: 'progress_logger' and 'subspace_dist'.
 #
 # Dependencies (list of packages loaded):
 #   - MASS
 #   - dr
 #   - MAVE
 #   - CVE
 source('simulation_interfaces.R') # depends on MASS, dr, MAVE and CVE
 # Number of simulations
 NR.SIM <- 100L
 # Dataset param grid
 grid <- expand.grid(
    pmix = c(0.3, 0.4, 0.5),
    lambda = c(0, 0.5, 1, 1.5)
 )
 # Methods for comparison, see 'simulation_interfaces.R' for their definitions.
 methods <- c('CVE', 'wCVE', 'rCVE', 'meanOPG', 'meanMAVE')
 # Build result data matrix (repeat each dataset simulation times).
 result <- matrix(NA, nrow = nrow(grid) * NR.SIM,
                     ncol = ncol(grid) + length(methods))
 colnames(result) <- c(colnames(grid), methods)
 # Create a progress logger for length many reports.
 logger <- progress_logger(nrow(grid) * NR.SIM * length(methods))
 count <- 0
 for (sim in seq_len(NR.SIM)) {
    for (i in seq_len(nrow(grid))) {
        # Create new dataset.
        pmix <- grid[i, "pmix"]
        lambda <- grid[i, "lambda"]
        with(dataset("M2", pmix = pmix, lambda = lambda), {
            X <<- X; Y <<- Y; B <<- B
        })
        count <- count + 1
        result[count, "pmix"] <- pmix
        result[count, "lambda"] <- lambda
        # And each reference method.
        for (method in methods) {
            # Report simulation progress to user.
            logger(method)
            # Try/Catch to avoid simulation stop on error, this should not
            # accure for CVE but may be the case for others like OPG.
            tryCatch({
                # Call dimension reduction method.
                dr <- eval(call(method, X = X, Y = Y, k = ncol(B)))
                # Compute distance of estimated `B` to true `B`.
                result[count, method] <- subspace_dist(B, coef(dr, ncol(B)))
            }, error = function(e) {}) # No-Operation error handler!
        }
    }
    # explicit call to the garbage collector.
    gc()
 }
 path <- paste0(getwd(),
    format(Sys.time(), '/results/method_compair_M2_%Y-%m-%dT%H%M%S'))
 # Write entire simulation results into a single file.
 write.csv(result, file = paste0(path, ".csv"), row.names = FALSE)
--- a/simulations/plots.R
+++ b/simulations/plots.R
@ -0,0 +1,69 @@
 ### Method Compair plots.
 for (file in list.files('results', pattern = 'method_compair_[0-9T-]+\\.csv')) {
    # Read simulation result data set.
    ds <- read.csv(file.path('results', file), stringsAsFactors = FALSE)
    # Get dataset groups
    datasets <- unlist(unique(ds["dataset"]))
    # All seperate dataset plots into one file.
    pdf(file.path('results', sub('.csv', '.pdf', file, fixed = TRUE)))
    for (ds.name in datasets) {
        boxplot(ds[ds["dataset"] == ds.name, colnames(ds) != "dataset"],
                main = ds.name, ylab = 'Error', las = 2)
    }
    dev.off()
    # Plot each dataset seperately.
    for (ds.name in datasets) {
        pdf(file.path('results', sub('.csv',
                                     paste0('_', ds.name, '.pdf'),
                      file, fixed = TRUE)))
        boxplot(ds[ds["dataset"] == ds.name, colnames(ds) != "dataset"],
                main = ds.name, ylab = 'Error', las = 2)
        dev.off()
    }
 }
 ### Method Compair plots for parameterized M2.
 for (file in list.files('results', 'method_compair_M2_[0-9T-]+\\.csv')) {
    # Read simulation results.
    ds <- read.csv(file.path('results', file), stringsAsFactors = FALSE)
    # Extract M2 dataset parameters.
    pmix <- unique(ds$pmix)
    lambda <- unique(ds$lambda)
    # setup pdf file and plot layout.
    pdf(file.path('results', sub('.csv', '.pdf', file, fixed = TRUE)),
        width = 15, height = 10)
    layout(matrix(seq(length(pmix) * length(lambda)),
                  ncol = length(lambda), byrow = TRUE))
    for (p in pmix) {
        for (l in lambda) {
            boxplot(ds[ds$pmix == p & ds$lambda == l,
                       !(colnames(ds) %in% c("pmix", "lambda"))],
                    main = substitute(expression(paste(
                        lambda, ' = ', l, ', ', p[mix], ' = ', q
                    )), list(l = l, q = p)))
        }
    }
    dev.off()
 }
 ### Analyse predict dimension simulation.
 for (file in list.files('results', 'predict_dim_compair_[0-9T-]+\\.csv')) {
    # Read simulation result data set.
    ds <- read.csv(file.path('results', file), stringsAsFactors = FALSE)
    # Over all datasets (including M2 with default parameters)
    true.dim <- ds$true.dim
    dataset <- ds$dataset
    df <- ds[!(colnames(ds) %in% c("true.dim", "dataset"))]
    accuracy <- data.frame(
        all = colMeans(df == true.dim),
        M1 = colMeans(df[dataset == "M1", ] == true.dim[dataset == "M1"]),
        M2 = colMeans(df[dataset == "M2", ] == true.dim[dataset == "M2"]),
        M3 = colMeans(df[dataset == "M3", ] == true.dim[dataset == "M3"]),
        M4 = colMeans(df[dataset == "M4", ] == true.dim[dataset == "M4"]),
        M5 = colMeans(df[dataset == "M5", ] == true.dim[dataset == "M5"]),
        M6 = colMeans(df[dataset == "M6", ] == true.dim[dataset == "M6"]),
        M7 = colMeans(df[dataset == "M7", ] == true.dim[dataset == "M7"])
    )
    cat("Accuracy of simulation:", file, '\n')
    print(accuracy)    
 }
--- a/simulations/predict_dim_compair.R
+++ b/simulations/predict_dim_compair.R
@ -0,0 +1,54 @@
 # Comparison of dimension prediction methods including maves cross-validation
 # with meanMAVE as basis for reference.
 #
 # Results are written to a CSV file with a dataset column and its simulation nr.
 #
 # Note:
 #   All methods are called via interface functions defined in
 #   'simulation_interfaces.R'. In addition this file provides a few convenience
 #   functions.
 source('simulation_interfaces.R')
 # Number of simulations
 NR.SIM <- 100L
 NR.DS <- 7L
 max.dim <- 5L
 # List of dimension prediction methods.
 methods <- c("cv", "elbow", "wilcoxon")
 # Build result data matrix (repeat each dataset simulation times).
 result <- matrix(NA, nrow = NR.DS * NR.SIM,
                     ncol = length(methods) + 2) # +2 for dataset and meanMAVE.
 result[, 1L] <- rep(paste0("M", seq(NR.DS)), NR.SIM)
 colnames(result) <- c("dataset", "true.dim", "mave.cv", methods)
 # Create a progress logger for length many reports.
 logger <- progress_logger(NR.DS * NR.SIM)
 for (sim in seq_len(nrow(result))) {
    # Create new dataset.
    with(dataset(result[sim, 1L]), { X <<- X; Y <<- Y; B <<- B })
    # Write true dr dimension.
    result[sim, "true.dim"] <- ncol(B)
    # Report simulation progress to user.
    logger('')
    # Call CVE and let each dimension prediction method estimate.
    dr <- cve.call(X = X, Y = Y, max.dim = max.dim)
    for (method in methods) {
        result[sim, method] <- predict_dim(dr, method = method)$k
    }
    # The same using meanMAVE and MAVE's 'mave.dim'.
    dr <- meanMAVE(X = X, Y = Y, max.dim = max.dim)
    result[sim, "mave.cv"] <- which.min(mave.dim(dr)$cv)
    # explicit call to the garbage collector.
    gc()
 }
 path <- paste0(getwd(),
    format(Sys.time(), '/results/predict_dim_compair_%Y-%m-%dT%H%M%S'))
 # Write entire simulation results into a single file.
 write.csv(result, file = paste0(path, ".csv"), row.names = FALSE)
--- a/simulations/runtime_test.R
+++ b/simulations/runtime_test.R
@ -0,0 +1,128 @@
 # Usage:
 # ~$ Rscript runtime_test.R
 textplot <- function(...) {
    text <- unlist(list(...))
    if (length(text) > 20) {
        text <- c(text[1:17],
                  '   ...... (skipped, text too long) ......',
                  text[c(-1, 0) + length(text)])
    }
    plot(NA, xlim = c(0, 1), ylim = c(0, 1),
         bty = 'n', xaxt = 'n', yaxt = 'n', xlab = '', ylab = '')
    for (i in seq_along(text)) {
        text(0, 1 - (i / 20),
             text[[i]], pos = 4)
    }
 }
 # library(CVEpureR) # load CVE's pure R implementation
 library(CVE) # load CVE
 #' Writes log information to console. (to not get bored^^)
 tell.user <- function(name, start, i, length) {
    cat("\rRunning Test (", name, "):",
        i, "/", length,
        " - elapsed:", format(Sys.time() - start), "\033[K")
 }
 #' Computes "distance" of spanned subspaces.
 #' @param B1 Semi-orthonormal basis matrix
 #' @param B2 Semi-orthonormal basis matrix
 #' @return Frobenius norm of subspace projection matrix diff.
 subspace.dist <- function(B1, B2){
    P1 <- tcrossprod(B1, B1)
    P2 <- tcrossprod(B2, B2)
    return(norm(P1 - P2, type = 'F'))
 }
 # Set random seed
 set.seed(437)
 # Number of simulations
 SIM.NR <- 50L
 # maximal number of iterations in curvilinear search algorithm
 MAXIT <- 50L
 # number of arbitrary starting values for curvilinear optimization
 ATTEMPTS <- 10L
 # set names of datasets
 ds.names <- paste0("M", seq(7))
 # Set used CVE method
 methods <- c("simple", "weighted") # c("legacy", "simple", "linesearch", "sgd")
 # Setup error and time tracking variables
 error <- matrix(NA, SIM.NR, length(methods) * length(ds.names))
 time <- matrix(NA, SIM.NR, ncol(error))
 colnames(error) <- kronecker(paste0(ds.names, '-'), methods, paste0)
 colnames(time) <- colnames(error)
 # Create new log file and write CSV (actualy TSV) header.
 # (do not overwrite existing logs)
 log.nr <- length(list.files('tmp/', pattern = '.*\\.log'))
 file <- file.path('tmp', paste0('test', log.nr, '.log'))
 cat('dataset\tmethod\terror\ttime\n', sep = '', file = file)
 # Open a new pdf device for plotting into (do not overwrite existing ones)
 path <- paste0('test', log.nr, '.pdf')
 pdf(file.path('tmp', path))
 cat('Plotting to file:', path, '\n')
 # only for telling user (to stdout)
 count <- 0
 start <- Sys.time()
 # Start simulation loop.
 for (sim in 1:SIM.NR) {
    # Repeat for each dataset.
    for (name in ds.names) {
        tell.user(name, start, (count <- count + 1), SIM.NR * length(ds.names))
        # Create a new dataset
        ds <- dataset(name)
        # Prepare X, Y and combine to data matrix
        Y <- ds$Y
        X <- ds$X
        data <- cbind(Y, X)
        # get dimensions
        k <- ncol(ds$B)
        for (method in methods) {
            dr.time <- system.time(
                dr <- cve.call(X, Y,
                    method = method,
                    k = k,
                    attempts = ATTEMPTS
                )
            )
            dr$B <- coef(dr, k)
            key <- paste0(name, '-', method)
            error[sim, key] <- subspace.dist(dr$B, ds$B) / sqrt(2 * k)
            time[sim, key] <- dr.time["elapsed"]
            # Log results to file (mostly for long running simulations)
            cat(paste0(
                    c(name, method, error[sim, key], time[sim, key]),
                    collapse = '\t'
                ), '\n',
                sep = '', file = file, append = TRUE
            )
        }
    }
 }
 cat("\n\n## Time [sec] Summary:\n")
 print(summary(time))
 cat("\n## Error Summary:\n")
 print(summary(error))
 boxplot(error,
    main = paste0("Error (Nr of simulations ", SIM.NR, ")"),
    ylab = "Error",
    las = 2
 )
 boxplot(time,
    main = paste0("Time (Nr of simulations ", SIM.NR, ")"),
    ylab = "Time [sec]",
    las = 2
 )
--- a/simulations/simulation_interfaces.R
+++ b/simulations/simulation_interfaces.R
@ -0,0 +1,226 @@
 library(MASS) # required by 'dr'
 library(dr)   # for `dr`
 library(MAVE) # for `mave`, `coef.mave`.
 library(CVE)
 ################################################################################
 ###                       simulation utility functions                       ###
 ################################################################################
 #' Calc distance between two subspaces.
 #'
 #'     || P(A) - P(B) ||_F / sqrt(2 * ncol(A))
 #'
 #' such that `P(A) = A (A^T A)^-1 A^T` the projection onto
 #' the subspace spanned by `A`.
 #'
 #' @param A matrix which rows span a subspace.
 #' @param B matrix of the same dim. as `A`-
 #'
 #' @returns numeric metric of spanning subspaces distance.
 subspace_dist <- function(A, B) {
    P <- A %*% solve(t(A) %*% A, t(A))
    Q <- B %*% solve(t(B) %*% B, t(B))
    norm(P - Q, 'F') / sqrt(2 * ncol(A))
 }
 #' Creats a progress logger.
 #'
 #' @param len number of calls to the creted rpogress logger till finished.
 #'
 #' @returns progress logger function:
 #'     \code{logger(info)}
 #' which is to be called \code{len} times with a string \code{info}.
 progress_logger <- function(len) {
    start <- Sys.time()
    count <- 0
    return(function(info) {
        cat("\rRunning (", info, "):",
            (count <<- count + 1), "/", len,
            " - elapsed:", format(Sys.time() - start), "\033[K")
        if (count == len) { cat("\n") }
    })
 }
 ################################################################################
 ###                                cve wrapper                               ###
 ################################################################################
 CVE <- function(...) {
    .cve <- match.call(expand.dots = TRUE)
    .cve[[1L]] <- quote(cve.call)
    .cve$method <- "simple"
    eval(.cve)
 }
 wCVE <- function(...) {
    .cve <- match.call(expand.dots = TRUE)
    .cve[[1L]] <- quote(cve.call)
    .cve$method <- "weighted"
    eval(.cve)
 }
 rCVE <- function(...) {
    .cve <- match.call(expand.dots = TRUE)
    .cve[[1L]] <- quote(cve.call)
    .cve$method <- "simple"
    dr <- eval(.cve)
    for (name in names(dr$res)) {
        .cve$method <- "weighted"
        .cve$k <- dr$res[[name]]$k
        .cve$V.init <- dr$res[[name]]$V
        dr$res[[name]] <- eval(.cve)$res[[name]]
    }
    dr$call <- match.call()
    dr$call[[1L]] <- quote(cve)
    dr$call$method <- "refined"
    return(dr)
 }
 wls <- function(x,y,w){ 
    x <- as.matrix(x); y <- as.vector(y)
  n=dim(x)[1];
  p=dim(x)[2]-1 
  out=c(solve(t(x*w)%*%x/n)%*%apply(x*y*w,2,mean))
  return(list(a=out[1],b=out[2:(p+1)]))
 }
 #' Compute kernel matrix.
 #'
 #' @param X matrix of dim. `n x p` (`n` samples of dimension `p`)
 #' @param h numeric bandwidth.
 #'
 #' @returns `n x n` matrix with kernel of X_i to X_j interaction
 #'      in its `i, j` component.
 kern <- function(X, h) {
    X <- as.matrix(X)
    n <- nrow(X)
    k2 <- X %*% t(X)                # (X_i^T X_j)_ij
    k3 <- matrix(diag(k2), n, n)    # diag(k2) in rows
    k1 <- t(k3)                     # diag(k2) in cols
    K <- k1 - 2 * k2 + k3           # (|X_i|^2 - 2 * X_i^T X_j - |X_j|^2)_ij
    return(exp((-0.5 / h^2) * K))
 }
 coef.opg <- function(object, ...) {
    return(object$B)
 }
 OPG <- function(X, Y, k, c0 = 2.34) {
    # Cast parameters.
    X <- as.matrix(X); Y <- as.vector(Y)
    # Get Dimensions
    n <- nrow(X); p <- ncol(X)
    p0 = max(p, 3)
    rn <- n^(-1 / (2 * (p0 + 6)))
    h <- c0 * n^(-1 / (p0 + 6))
    sig <- diag(var(X))
    X <- scale(X)
    kmat <- kern(X, h)
    bmat <- numeric()
    for (i in 1:n) {
        wi <- kmat[, i]
        xi <- cbind(1, t(t(X) - X[i, ]))
        bmat <- cbind(bmat, wls(xi, Y, wi)$b)
    }
    beta <- eigen(bmat %*% t(bmat))$vectors[, 1:k]
    B <- diag(sig^(-1 / 2)) %*% beta
    return(structure(list(B = B), class = "opg"))
 }
 rOPG <- function(X, Y, k, nr.iter = 25L, c0 = 2.34) {
    # Cast parameters.
    X <- as.matrix(X); Y <- as.vector(Y)
    # Get Dimensions
    n <- nrow(X); p <- ncol(X)
    sig <- diag(var(X))
    X <- scale(X)
    p0 = max(p, 3)
    rn <- n^(-1 / (2 * (p0 + 6)))
    h <- c0 * n^(-1 / (p0 + 6)) 
    beta <- diag(p)
    for(. in seq_len(nr.iter)) {
        kmat <- kern(X %*% beta, h)
        bmat <- numeric()
        for (i in 1:n) {
            wi <- kmat[, i]
            xi <- cbind(1, t(t(X) - X[i, ])) 
            bmat <- cbind(bmat, wls(xi, Y, wi)$b)
        }
        beta <- eigen(bmat %*% t(bmat))$vectors[, 1:k, drop = F]
        h <- max(rn * h, c0 * n^((-1/(k + 4)))) 
    }
    B <- diag(sig^(-1/2)) %*% beta 
    return(structure(list(B = B), class = "opg"))
 }
 rmave <- function(X, Y, k, nr.iter = 25L, c0 = 2.34) {
    X <- as.matrix(X); Y <- as.vector(Y)
    # Get dimensions.
    n <- nrow(X); p <- ncol(X)
    X <- scale(X)
    sig <- attr(X, "scaled:scale")^2 # diag(var(X))
    p0 <- max(p,3)
    h <- c0 * n^(-1 / (p0 + 6))
    rn <- n^(-1 / (2 * (p0 + 6)))
    beta <- OPG(X, Y, k)$B
    for (. in seq_len(nr.iter)) {
        kermat <- kern(X %*% beta, h)
        mkermat <- colMeans(kermat)
        a <- numeric()
        b <- numeric()
        for (i in 1:n) {
            wi <- kermat[ ,i] / mkermat[i]
            ui <- cbind(1, t(t(X) - X[i, ]) %*% beta)
            out <- wls(ui, Y, wi)
            a <- c(a, out$a)
            b <- cbind(b, out$b)
        }
        out <- 0
        out1 <- 0
        for (i in 1:n) {
            xi <- kronecker(t(t(X) - X[i, ]), t(b[, i]))
            yi <- Y - a[i]
            wi <- kermat[, i] / mkermat[i]
            out <- out + colMeans(xi * yi * wi)
            out1 <- out1 + t(xi * wi) %*% xi / n
        }
        beta <- t(matrix(solve(out1, out), k, p))
        h <- max(rn * h, c0 * n^(-1 / (k + 4)))
    }
    B <- diag(sig^-0.5) %*% beta
    return(structure(list(B = B), class = "opg"))
 }
 meanOPG <- function(X, Y, max.dim = 10L, ...) {
    mave(Y ~ X, method = "meanOPG", max.dim = max.dim)
 }
 meanMAVE <- function(X, Y, max.dim = 10L, ...) {
    mave(Y ~ X, method = "meanMAVE", max.dim = max.dim)
 }
 coef.phdy <- coef.sir <- coef.save <- function(object, k, ...) {
    matrix(object$evectors[, 1:k], ncol = k)
 }
 phdy <- function(X, Y, k, ...) {
    dr(Y ~ X, method = 'phdy', numdir = k)
 }
 sir <- function(X, Y, k, ...) {
    dr(Y ~ X, method = 'sir', numdir = k)
 }
 save <- function(X, Y, k, ...) {
    dr(Y ~ X, method = 'save', numdir = k)
 }