
- wip: reviewing and fixing doc.

Daniel Kapla 2019-12-20 09:40:46 +01:00
parent b71898a5bc
commit 5d3a0ca18d
17 changed files with 196 additions and 166 deletions

View File

@ -2,7 +2,7 @@ Package: CVE
Type: Package
Title: Conditional Variance Estimator for Sufficient Dimension Reduction
Version: 0.2
Date: 2019-11-13
Date: 2019-12-20
Author: Daniel Kapla <daniel@kapla.at>, Lukas Fertl <lukas.fertl@chello.at>
Maintainer: Daniel Kapla <daniel@kapla.at>
Description: Implementation of the Conditional Variance Estimation (CVE) method.

View File

@ -2,16 +2,13 @@
#'
#' Conditional Variance Estimation (CVE) is a novel sufficient dimension
#' reduction (SDR) method for regressions satisfying \eqn{E(Y|X) = E(Y|B'X)},
#' where \eqn{B'X} is a lower dimensional projection of the predictors. CVE,
#' where \eqn{B'X} is a lower dimensional projection of the predictors and
#' \eqn{Y} is a univariate response. CVE,
#' similarly to its main competitor, the mean average variance estimation
#' (MAVE), is not based on inverse regression, and does not require the
#' restrictive linearity and constant variance conditions of moment based SDR
#' methods. CVE is data-driven and applies to additive error regressions with
#' continuous predictors and link function. The effectiveness and accuracy of
#' CVE compared to MAVE and other SDR techniques is demonstrated in simulation
#' studies. CVE is shown to outperform MAVE in some model set-ups, while it
#' remains largely on par under most others.
#' Let \eqn{Y} be real denotes a univariate response and \eqn{X} a real
#' continuous predictors and link function. Let \eqn{X} be a real
#' \eqn{p}-dimensional covariate vector. We assume that the dependence of
#' \eqn{Y} and \eqn{X} is modelled by
#' \deqn{Y = g(B'X) + \epsilon}
@ -20,11 +17,11 @@
#' zero random variable with finite \eqn{Var(\epsilon) = E(\epsilon^2)}, \eqn{g}
#' is an unknown, continuous non-constant function,
#' and \eqn{B = (b_1, ..., b_k)} is
#' a real \eqn{p \times k}{p x k} of rank \eqn{k \leq p}{k <= p}.
#' a real \eqn{p \times k}{p x k} matrix of rank \eqn{k \leq p}{k <= p}.
#' Without loss of generality \eqn{B} is assumed to be orthonormal.
#'
#' @author Daniel Kapla, Lukas Fertl, Bura Efstathia
#' @references Fertl Lukas, Bura Efstathia. (2019), Conditional Variance
#' @references Fertl, L. and Bura, E. (2019), Conditional Variance
#' Estimation for Sufficient Dimension Reduction. Working Paper.
#'
#' @docType package
@ -33,7 +30,10 @@
#' Conditional Variance Estimator (CVE).
#'
#' @inherit CVE-package description
#' This is the main function in the \code{CVE} package. It creates objects of
#' class \code{"cve"} to estimate the mean subspace. Helper functions that
#' require a \code{"cve"} object can then be applied to the output from this
#' function.
#'
#' @param formula an object of class \code{"formula"} which is a symbolic
#' description of the model to be fitted like \eqn{Y\sim X}{Y ~ X} where
@ -46,13 +46,41 @@
#' @param method This character string specifies the method of fitting. The
#' options are
#' \itemize{
#' \item "simple" implementation as described in the paper.
#' \item "simple" implementation,
#' \item "weighted" variation with adaptive weighting of slices.
#' }
#' see paper.
#' see Fertl, L. and Bura, E. (2019).
#' @param max.dim upper bounds for \code{k}, (ignored if \code{k} is supplied).
#' @param ... optional parameters passed on to \code{cve.call}.
#'
#'
#' Conditional Variance Estimation (CVE) is a sufficient dimension reduction
#' (SDR) method for regressions studying \eqn{E(Y|X)}, the conditional
#' expectation of a response \eqn{Y} given a set of predictors \eqn{X}. This
#' function provides methods for estimating the dimension and the subspace
#' spanned by the columns of a \eqn{p\times k}{p x k} matrix \eqn{B} of minimal
#' rank \eqn{k} such that
#' \deqn{%
#' E(Y|X) = E(Y|B'X) %
#' }
#' or, equivalently,
#' \deqn{%
#' Y = g(B'X) + \epsilon %
#' }
#' where \eqn{X} is independent of \eqn{\epsilon} with positive definite
#' variance-covariance matrix \eqn{Var(X) = \Sigma_X}. \eqn{\epsilon} is a mean
#' zero random variable with finite \eqn{Var(\epsilon) = E(\epsilon^2)}, \eqn{g}
#' is an unknown, continuous non-constant function, and \eqn{B = (b_1,..., b_k)}
#' is a real \eqn{p \times k}{p x k} matrix of rank \eqn{k \leq p}{k <= p}.
#'
#' Both the dimension \eqn{k} and the subspace \eqn{span(B)} are unknown. The
#' CVE method makes very few assumptions.
#'
#' A kernel matrix \eqn{\hat{B}}{Bhat} is estimated such that the column space
#' of \eqn{\hat{B}}{Bhat} should be close to the mean subspace \eqn{span(B)}.
#' The primary output from this method is a set of orthonormal vectors,
#' \eqn{\hat{B}}{Bhat}, whose span estimates \eqn{span(B)}.
#'
#' @return an S3 object of class \code{cve} with components:
#' \describe{
#' \item{X}{design matrix of predictor vector used for calculating
@ -130,7 +158,7 @@
#'
#' @seealso For a detailed description of \code{formula} see
#' \code{\link{formula}}.
#' @references Fertl Lukas, Bura Efstathia. (2019), Conditional Variance
#' @references Fertl, L. and Bura, E. (2019), Conditional Variance
#' Estimation for Sufficient Dimension Reduction. Working Paper.
#'
#' @importFrom stats model.frame
@ -159,8 +187,8 @@ cve <- function(formula, data, method = "simple", max.dim = 10L, ...) {
#' @inherit cve title
#' @inherit cve description
#'
#' @param X Design matrix with dimension \eqn{n\times p}{n x p}.
#' @param Y numeric array of length \eqn{n} of Responses.
#' @param X Design predictor matrix.
#' @param Y \eqn{n}-dimensional vector of responses.
#' @param h bandwidth or function to estimate bandwidth, defaults to internally
#' estimated bandwidth.
#' @param nObs parameter for choosing bandwidth \code{h} using
@ -193,7 +221,7 @@ cve <- function(formula, data, method = "simple", max.dim = 10L, ...) {
#' @param V.init Semi-orthogonal matrix of dimensions `(ncol(X), ncol(X) - k)`
#' used as starting value in the optimization. (If supplied,
#' \code{attempts} is set to 0 and \code{k} to match dimension).
#' @param logger a logger function (only for advanced user, slows down the
#' @param logger a logger function (only for advanced users, slows down the
#' computation).
#'
#' @inherit cve return
@ -209,11 +237,11 @@ cve <- function(formula, data, method = "simple", max.dim = 10L, ...) {
#' # Y = f(B'X) + err
#' # with f(x1) = x1 and err ~ N(0, 0.25^2)
#' Y <- X %*% B + 0.25 * rnorm(100)
#'
#'
#' # calculate cve with method 'simple' for k = 1
#' set.seed(21)
#' cve.obj.simple1 <- cve(Y ~ X, k = 1)
#'
#'
#' # same as
#' set.seed(21)
#' cve.obj.simple2 <- cve.call(X, Y, k = 1)

View File

@ -1,14 +1,14 @@
#' Gets estimated SDR basis.
#' Extracts estimated SDR basis.
#'
#' Returns the SDR basis matrix for dimension \code{k}, i.e. returns the
#' cve-estimate with dimension \eqn{p\times k}{p x k}.
#' cve-estimate of \eqn{B} with dimension \eqn{p\times k}{p x k}.
#'
#' @param object instance of \code{cve} as output from \code{\link{cve}} or
#' \code{\link{cve.call}}.
#' @param object an object of class \code{"cve"}, usually, a result of a call to
#' \code{\link{cve}} or \code{\link{cve.call}}.
#' @param k the SDR dimension.
#' @param ... ignored.
#' @param ... ignored (no additional arguments).
#'
#' @return dir the matrix of CS or CMS of given dimension
#' @return The matrix \eqn{B} of dimensions \eqn{p\times k}{p x k}.
#'
#' @examples
#' # set dimensions for simulation model
@ -19,7 +19,7 @@
#' b1 <- rep(1 / sqrt(p), p)
#' b2 <- (-1)^seq(1, p) / sqrt(p)
#' B <- cbind(b1, b2)
#'
#'
#' set.seed(21)
#' # create predictor data x ~ N(0, I_p)
#' x <- matrix(rnorm(n * p), n, p)
@ -31,7 +31,7 @@
#' cve.obj <- cve(y ~ x, max.dim = 5)
#' # get cve-estimate for B with dimensions (p, k = 2)
#' B2 <- coef(cve.obj, k = 2)
#'
#'
#' # Projection matrix on span(B)
#' # equivalent to `B %*% t(B)` since B is semi-orthonormal
#' PB <- B %*% solve(t(B) %*% B) %*% t(B)
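
The comment in the example above says the general projection formula reduces to `B %*% t(B)` when `B` has orthonormal columns. A minimal base-R check of that claim, reusing the `b1` and `b2` definitions from the example; the value `p = 8` is an assumption, since the hunk does not show the value of `p` used there:

```r
# Assumed dimension; the hunk does not show the value of p used in the example.
p <- 8
b1 <- rep(1 / sqrt(p), p)
b2 <- (-1)^seq(1, p) / sqrt(p)
B <- cbind(b1, b2)

# Gram matrix of the columns; the identity if the columns are orthonormal.
gram <- crossprod(B)

# General projection onto span(B) and the shortcut for orthonormal columns.
PB.general <- B %*% solve(t(B) %*% B) %*% t(B)
PB.short <- B %*% t(B)

# Largest entrywise difference between the two projection matrices.
max.diff <- max(abs(PB.general - PB.short))
```

For a CVE estimate obtained via `coef(cve.obj, k = 2)` the same shortcut should apply, provided the estimate is returned with orthonormal columns as the documentation states.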

View File

@ -1,15 +1,17 @@
#' @export
directions <- function(dr, k) {
directions <- function(object, k, ...) {
UseMethod("directions")
}
#' Computes projected training data \code{X} for given dimension `k`.
#'
#' Projects the dimensional design matrix \eqn{X} on the columnspace of the
#' cve-estimate for given dimension \eqn{k}.
#' Returns \eqn{B'X}, that is, the design matrix \eqn{X} projected onto the
#' column space of the cve-estimate for the given dimension \eqn{k}.
#'
#' @param dr Instance of \code{'cve'} as returned by \code{\link{cve}}.
#' @param object an object of class \code{"cve"}, usually, a result of a call to
#' \code{\link{cve}} or \code{\link{cve.call}}.
#' @param k SDR dimension to use for projection.
#' @param ... ignored (no additional arguments).
#'
#' @return the \eqn{n\times k}{n x k} dimensional matrix \eqn{X B} where \eqn{B}
#' is the cve-estimate for dimension \eqn{k}.
@ -32,12 +34,14 @@ directions <- function(dr, k) {
#' # plot y against projected data
#' plot(x.proj, y)
#'
#' @seealso \code{\link{cve}}
#'
#' @method directions cve
#' @aliases directions directions.cve
#' @export
directions.cve <- function(dr, k) {
if (!(k %in% names(dr$res))) {
directions.cve <- function(object, k, ...) {
if (!(k %in% names(object$res))) {
stop("SDR directions for requested dimension `k` not computed.")
}
return(dr$X %*% dr$res[[as.character(k)]]$B)
return(object$X %*% object$res[[as.character(k)]]$B)
}
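
The signature change puts the generic and the method on the same argument list `(object, k, ...)`. A self-contained sketch of how the dispatch and the dimension check behave, using a hand-built stand-in for a `"cve"` object. The list layout (`X`, `res[["k"]]$B`) follows the code above; the object `mock` and its data are hypothetical and only serve the illustration:

```r
directions <- function(object, k, ...) {
    UseMethod("directions")
}
directions.cve <- function(object, k, ...) {
    if (!(k %in% names(object$res))) {
        stop("SDR directions for requested dimension `k` not computed.")
    }
    return(object$X %*% object$res[[as.character(k)]]$B)
}

# Hypothetical stand-in for a fitted object: only k = 1 is "computed".
set.seed(1)
X <- matrix(rnorm(20 * 3), 20, 3)
mock <- structure(
    list(X = X, res = list("1" = list(B = matrix(c(1, 0, 0), 3, 1)))),
    class = "cve"
)

# Projection for the available dimension: a 20 x 1 matrix.
x.proj <- directions(mock, k = 1)

# Requesting a dimension that was never computed raises the error above.
err <- tryCatch(directions(mock, k = 2),
                error = function(e) conditionMessage(e))
```

The numeric `k` is matched against the character names of `res` because `%in%` coerces both sides to a common type before comparing.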

View File

@ -7,14 +7,14 @@
#' h = (2 * tr(\Sigma) / p) * (1.2 * n^(-1 / (4 + k)))^2}
#' Alternative version 2 is used for dimension prediction which is given by
#' \deqn{%
#' h = (2 * tr(\Sigma) / p) * \chi_k^{-1}(\frac{nObs - 1}{n - 1})}{%
#' h = \frac{2 tr(\Sigma)}{p} \chi_k^{-1}(\frac{nObs - 1}{n - 1})}{%
#' h = (2 * tr(\Sigma) / p) * \chi_k^-1((nObs - 1) / (n - 1))}
#' with \eqn{n} the sample size, \eqn{p} its dimension and the
#' covariance-matrix \eqn{\Sigma}, which is \code{(n-1)/n} times the sample
#' covariance estimate.
#' with \eqn{n} the sample size, \eqn{p} the dimension of \eqn{X} and
#' \eqn{\Sigma} equal to \eqn{(n - 1) / n} times the sample covariance matrix
#' of \eqn{X}.
#'
#' @param X a \eqn{n\times p}{n x p} matrix with samples in its rows.
#' @param k Dimension of lower dimensional projection.
#' @param X the \eqn{n\times p}{n x p} matrix of predictor values.
#' @param k the SDR dimension.
#' @param nObs number of points in a slice, only for version 2.
#' @param version either \code{1} or \code{2}.
#'
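
The two bandwidth formulas documented above can be written out directly in base R. This is a sketch of the formulas as stated in the documentation, not the package's `estimate.bandwidth`; the name `bandwidth.sketch` is hypothetical, and reading \eqn{\chi_k^{-1}} as the quantile function of a chi-squared distribution with `k` degrees of freedom is an assumption:

```r
# Sketch of the two documented formulas; not the package's estimate.bandwidth.
bandwidth.sketch <- function(X, k, nObs, version = 1L) {
    n <- nrow(X)
    p <- ncol(X)
    # Sigma is (n - 1) / n times the sample covariance matrix of X.
    tr.Sigma <- (n - 1) / n * sum(diag(cov(X)))
    if (version == 1L) {
        (2 * tr.Sigma / p) * (1.2 * n^(-1 / (4 + k)))^2
    } else {
        # Assumption: chi_k^{-1} is the chi-squared quantile function.
        (2 * tr.Sigma / p) * qchisq((nObs - 1) / (n - 1), k)
    }
}

set.seed(21)
X <- matrix(rnorm(100 * 5), 100, 5)
h1 <- bandwidth.sketch(X, k = 1, version = 1L)
h2 <- bandwidth.sketch(X, k = 1, nObs = sqrt(nrow(X)), version = 2L)
```

The value `nObs = sqrt(nrow(X))` mirrors the default shown in the `cve.call` usage; `nObs` is only evaluated for version 2.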

View File

@ -1,13 +1,16 @@
#' Loss distribution elbow plot.
#' Elbow plot of the loss function.
#'
#' Boxplots of the output \code{L} from \code{\link{cve}} over \code{k} from
#' \code{min.dim} to \code{max.dim}. For given \code{k}, \code{L} corresponds
#' to \eqn{L_n(V, X_i)} where \eqn{V \in S(p, p - k)}{V} is the minimizer of
#' \eqn{L_n(V)}, for further details see the paper.
#' to \eqn{L_n(V, X_i)} where \eqn{V} is the element of the Stiefel manifold
#' that minimizes
#' \eqn{L_n(V)}, for further details see Fertl, L. and Bura, E. (2019).
#'
#' @param x Object of class \code{"cve"} (result of [\code{\link{cve}}]).
#' @param x an object of class \code{"cve"}, usually, a result of a call to
#' \code{\link{cve}} or \code{\link{cve.call}}.
#' @param ... Pass through parameters to [\code{\link{plot}}] and
#' [\code{\link{lines}}]
#'
#' @examples
#' # create B for simulation
#' B <- cbind(rep(1, 6), (-1)^seq(6)) / sqrt(6)
@ -34,7 +37,7 @@
#' # elbow plot
#' plot(cve.obj.simple)
#'
#' @references Fertl Lukas, Bura Efstathia. (2019), Conditional Variance
#' @references Fertl, L. and Bura, E. (2019), Conditional Variance
#' Estimation for Sufficient Dimension Reduction. Working Paper.
#'
#' @seealso see \code{\link{par}} for graphical parameters to pass through

View File

@ -1,15 +1,15 @@
#' Predict method for CVE Fits.
#'
#' Predict response using projected data where the forward model \eqn{g(B'X)}
#' is estimated using \code{\link{mars}}.
#' Predict response using projected data \eqn{B'C} by fitting
#' \eqn{g(B'C) + \epsilon} using \code{\link{mars}}.
#'
#' @param object instance of class \code{cve} (result of \code{cve},
#' \code{cve.call}).
#' @param newdata Matrix of the new data to be predicted.
#' @param dim dimension of SDR space to be used for data projecition.
#' @param object an object of class \code{"cve"}, usually, a result of a call to
#' \code{\link{cve}} or \code{\link{cve.call}}.
#' @param newdata Matrix of new predictor values, \eqn{C}.
#' @param k dimension of SDR space to be used for data projection.
#' @param ... further arguments passed to \code{\link{mars}}.
#'
#' @return predicted response of data \code{newdata}.
#' @return predicted response at \code{newdata}.
#'
#' @examples
#' # create B for simulation
@ -44,11 +44,11 @@
#' @importFrom mda mars
#' @method predict cve
#' @export
predict.cve <- function(object, newdata, dim, ...) {
predict.cve <- function(object, newdata, k, ...) {
if (missing(newdata)) {
stop("No data supplied.")
}
if (missing(dim)) {
if (missing(k)) {
stop("No dimension supplied.")
}
@ -56,7 +56,7 @@ predict.cve <- function(object, newdata, dim, ...) {
newdata <- matrix(newdata, nrow = 1L)
}
B <- object$res[[as.character(dim)]]$B
B <- object$res[[as.character(k)]]$B
model <- mda::mars(object$X %*% B, object$Y)
predict(model, newdata %*% B)
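
The prediction step is: project the training predictors onto the estimated basis, fit a forward model on the projected data, then apply it to the projected new data. A runnable sketch of that flow follows. It swaps base R `lm` in for `mda::mars` so that it needs no extra packages, so it is not the package's implementation. The names `predict.sketch` and `mock` are hypothetical, and the `!is.matrix(newdata)` guard is an assumption about the condition elided between the two hunks:

```r
# Stand-in for predict.cve: `lm` replaces `mda::mars` (a substitution made
# for this sketch only).
predict.sketch <- function(object, newdata, k) {
    if (!is.matrix(newdata)) {            # assumed guard, elided in the diff
        newdata <- matrix(newdata, nrow = 1L)
    }
    B <- object$res[[as.character(k)]]$B
    Z <- object$X %*% B                   # projected training data
    beta <- coef(lm(as.vector(object$Y) ~ Z))   # intercept and k slopes
    drop(cbind(1, newdata %*% B) %*% beta)
}

# Hypothetical fitted object with a noise-free linear response in X[, 1].
set.seed(1)
X <- matrix(rnorm(20 * 3), 20, 3)
mock <- list(X = X, Y = 1 + 2 * X[, 1],
             res = list("1" = list(B = matrix(c(1, 0, 0), 3, 1))))

pred.one <- predict.sketch(mock, c(1, 0, 0), k = 1)
pred.two <- predict.sketch(mock, rbind(c(1, 0, 0), c(0, 5, 5)), k = 1)
```

Because the mock response is an exact linear function of the first predictor, the stand-in fit recovers intercept 1 and slope 2 up to floating-point error.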

View File

@ -126,19 +126,18 @@ predict_dim_wilcoxon <- function(object, p.value = 0.05) {
#'
#' This function estimates the dimension of the mean dimension reduction space,
#' i.e. number of columns of \eqn{B} matrix. The default method \code{'CV'}
#' performs cross-validation using \code{mars}. Given
#' performs leave-one-out (l.o.o.) cross-validation using \code{mars}. Given
#' \code{k = min.dim, ..., max.dim} a cross-validation via \code{mars} is
#' performed on the dataset \eqn{(Y i, B_k' X_i)_{i = 1, ..., n}} where
#' \eqn{B_k} is the \eqn{p \times k}{p x k} dimensional CVE estimate given
#' \eqn{k}. The estimated SDR dimension is the \eqn{k} where the
#' cross-validation mean squared error is the lowest. The method \code{'elbow'}
#' estimates the dimension via \eqn{k = argmin_k L_n(V_{p k})} where
#' \eqn{V_{p k}} is the CVE estimate of the orthogonal columnspace of
#' \eqn{B_k}. Method \code{'wilcoxon'} is similar to \code{'elbow'} but finds
#' the minimum using the wilcoxon-test.
#' performed on the dataset \eqn{(Y_i, B_k' X_i)_{i = 1, ..., n}} where
#' \eqn{B_k} is the \eqn{p \times k}{p x k} dimensional CVE estimate. The
#' estimated SDR dimension is the \eqn{k} where the
#' cross-validation mean squared error is minimal. The method \code{'elbow'}
#' estimates the dimension via \eqn{k = argmin_k L_n(V_{p - k})} where
#' \eqn{V_{p - k}} is a basis of the space orthogonal to the column space of
#' the CVE estimate of \eqn{B_k}. Method \code{'wilcoxon'} is similar to
#' \code{'elbow'} but finds the minimum using the Wilcoxon test.
#'
#' @param object instance of class \code{cve} (result of \code{\link{cve}},
#' \code{\link{cve.call}}).
#' @param object an object of class \code{"cve"}, usually, a result of a call to
#' \code{\link{cve}} or \code{\link{cve.call}}.
#' @param method This parameter specifies which method will be used in dimension
#' estimation. It provides three methods \code{'CV'} (default), \code{'elbow'},
#' and \code{'wilcoxon'} to estimate the dimension of the SDR.
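
The `'CV'` rule described above picks the `k` with the smallest leave-one-out mean squared error on the projected data. A sketch of that selection rule under loud assumptions: a quadratic least-squares fit stands in for `mars`, and the candidate bases are the first `k` coordinate vectors rather than CVE estimates. The names `loo.mse`, `mse` and `k.hat` are hypothetical:

```r
# Leave-one-out mean squared error of a forward fit on projected data Z.
# Quadratic least squares replaces `mars` (a substitution for this sketch).
loo.mse <- function(Z, y) {
    D <- cbind(1, Z, Z^2)                 # intercept, linear, squared terms
    errs <- vapply(seq_along(y), function(i) {
        beta <- qr.solve(D[-i, , drop = FALSE], y[-i])   # least squares
        y[i] - sum(D[i, ] * beta)         # error at the held-out point
    }, numeric(1))
    mean(errs^2)
}

set.seed(21)
n <- 100
X <- matrix(rnorm(n * 4), n, 4)
# Response of the same form as the coef example: quadratic in one direction,
# linear in a second one.
y <- X[, 1]^2 + 2 * X[, 2] + 0.25 * rnorm(n)

# Hand-picked candidate bases: the first k coordinate vectors (not CVE fits).
I4 <- diag(4)
mse <- vapply(1:3, function(k) {
    loo.mse(X %*% I4[, 1:k, drop = FALSE], y)
}, numeric(1))
k.hat <- which.min(mse)
```

With one direction the fit cannot see the `2 * X[, 2]` term, so its error is much larger than with two directions; whether two or three directions give the smaller error depends on the noise draw.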

View File

@ -8,16 +8,13 @@
\description{
Conditional Variance Estimation (CVE) is a novel sufficient dimension
reduction (SDR) method for regressions satisfying \eqn{E(Y|X) = E(Y|B'X)},
where \eqn{B'X} is a lower dimensional projection of the predictors. CVE,
where \eqn{B'X} is a lower dimensional projection of the predictors and
\eqn{Y} is a univariate response. CVE,
similarly to its main competitor, the mean average variance estimation
(MAVE), is not based on inverse regression, and does not require the
restrictive linearity and constant variance conditions of moment based SDR
methods. CVE is data-driven and applies to additive error regressions with
continuous predictors and link function. The effectiveness and accuracy of
CVE compared to MAVE and other SDR techniques is demonstrated in simulation
studies. CVE is shown to outperform MAVE in some model set-ups, while it
remains largely on par under most others.
Let \eqn{Y} be real denotes a univariate response and \eqn{X} a real
continuous predictors and link function. Let \eqn{X} be a real
\eqn{p}-dimensional covariate vector. We assume that the dependence of
\eqn{Y} and \eqn{X} is modelled by
\deqn{Y = g(B'X) + \epsilon}
@ -26,11 +23,11 @@ variance-covariance matrix \eqn{Var(X) = \Sigma_X}. \eqn{\epsilon} is a mean
zero random variable with finite \eqn{Var(\epsilon) = E(\epsilon^2)}, \eqn{g}
is an unknown, continuous non-constant function,
and \eqn{B = (b_1, ..., b_k)} is
a real \eqn{p \times k}{p x k} of rank \eqn{k \leq p}{k <= p}.
a real \eqn{p \times k}{p x k} matrix of rank \eqn{k \leq p}{k <= p}.
Without loss of generality \eqn{B} is assumed to be orthonormal.
}
\references{
Fertl Lukas, Bura Efstathia. (2019), Conditional Variance
Fertl, L. and Bura, E. (2019), Conditional Variance
Estimation for Sufficient Dimension Reduction. Working Paper.
}
\author{

View File

@ -2,24 +2,24 @@
% Please edit documentation in R/coef.R
\name{coef.cve}
\alias{coef.cve}
\title{Gets estimated SDR basis.}
\title{Extracts estimated SDR basis.}
\usage{
\method{coef}{cve}(object, k, ...)
}
\arguments{
\item{object}{instance of \code{cve} as output from \code{\link{cve}} or
\code{\link{cve.call}}.}
\item{object}{an object of class \code{"cve"}, usually, a result of a call to
\code{\link{cve}} or \code{\link{cve.call}}.}
\item{k}{the SDR dimension.}
\item{...}{ignored.}
\item{...}{ignored (no additional arguments).}
}
\value{
dir the matrix of CS or CMS of given dimension
The matrix \eqn{B} of dimensions \eqn{p\times k}{p x k}.
}
\description{
Returns the SDR basis matrix for dimension \code{k}, i.e. returns the
cve-estimate with dimension \eqn{p\times k}{p x k}.
cve-estimate of \eqn{B} with dimension \eqn{p\times k}{p x k}.
}
\examples{
# set dimensions for simulation model
@ -30,7 +30,7 @@ n <- 200 # samplesize
b1 <- rep(1 / sqrt(p), p)
b2 <- (-1)^seq(1, p) / sqrt(p)
B <- cbind(b1, b2)
set.seed(21)
# create predictor data x ~ N(0, I_p)
x <- matrix(rnorm(n * p), n, p)
@ -42,7 +42,7 @@ y <- (x \%*\% b1)^2 + 2 * (x \%*\% b2) + 0.25 * rnorm(100)
cve.obj <- cve(y ~ x, max.dim = 5)
# get cve-estimate for B with dimensions (p, k = 2)
B2 <- coef(cve.obj, k = 2)
# Projection matrix on span(B)
# equivalent to `B \%*\% t(B)` since B is semi-orthonormal
PB <- B \%*\% solve(t(B) \%*\% B) \%*\% t(B)

View File

@ -20,14 +20,42 @@ the environment from which \code{cve} is called.}
\item{method}{This character string specifies the method of fitting. The
options are
\itemize{
\item "simple" implementation as described in the paper.
\item "simple" implementation,
\item "weighted" variation with adaptive weighting of slices.
}
see paper.}
see Fertl, L. and Bura, E. (2019).}
\item{max.dim}{upper bounds for \code{k}, (ignored if \code{k} is supplied).}
\item{...}{optional parameters passed on to \code{cve.call}.}
\item{...}{optional parameters passed on to \code{cve.call}.
Conditional Variance Estimation (CVE) is a sufficient dimension reduction
(SDR) method for regressions studying \eqn{E(Y|X)}, the conditional
expectation of a response \eqn{Y} given a set of predictors \eqn{X}. This
function provides methods for estimating the dimension and the subspace
spanned by the columns of a \eqn{p\times k}{p x k} matrix \eqn{B} of minimal
rank \eqn{k} such that
\deqn{%
E(Y|X) = E(Y|B'X) %
}
or, equivalently,
\deqn{%
Y = g(B'X) + \epsilon %
}
where \eqn{X} is independent of \eqn{\epsilon} with positive definite
variance-covariance matrix \eqn{Var(X) = \Sigma_X}. \eqn{\epsilon} is a mean
zero random variable with finite \eqn{Var(\epsilon) = E(\epsilon^2)}, \eqn{g}
is an unknown, continuous non-constant function, and \eqn{B = (b_1,..., b_k)}
is a real \eqn{p \times k}{p x k} matrix of rank \eqn{k \leq p}{k <= p}.
Both the dimension \eqn{k} and the subspace \eqn{span(B)} are unknown. The
CVE method makes very few assumptions.
A kernel matrix \eqn{\hat{B}}{Bhat} is estimated such that the column space
of \eqn{\hat{B}}{Bhat} should be close to the mean subspace \eqn{span(B)}.
The primary output from this method is a set of orthonormal vectors,
\eqn{\hat{B}}{Bhat}, whose span estimates \eqn{span(B)}.}
}
\value{
an S3 object of class \code{cve} with components:
@ -56,28 +84,10 @@ an S3 object of class \code{cve} with components:
}
}
\description{
Conditional Variance Estimation (CVE) is a novel sufficient dimension
reduction (SDR) method for regressions satisfying \eqn{E(Y|X) = E(Y|B'X)},
where \eqn{B'X} is a lower dimensional projection of the predictors. CVE,
similarly to its main competitor, the mean average variance estimation
(MAVE), is not based on inverse regression, and does not require the
restrictive linearity and constant variance conditions of moment based SDR
methods. CVE is data-driven and applies to additive error regressions with
continuous predictors and link function. The effectiveness and accuracy of
CVE compared to MAVE and other SDR techniques is demonstrated in simulation
studies. CVE is shown to outperform MAVE in some model set-ups, while it
remains largely on par under most others.
Let \eqn{Y} be real denotes a univariate response and \eqn{X} a real
\eqn{p}-dimensional covariate vector. We assume that the dependence of
\eqn{Y} and \eqn{X} is modelled by
\deqn{Y = g(B'X) + \epsilon}
where \eqn{X} is independent of \eqn{\epsilon} with positive definite
variance-covariance matrix \eqn{Var(X) = \Sigma_X}. \eqn{\epsilon} is a mean
zero random variable with finite \eqn{Var(\epsilon) = E(\epsilon^2)}, \eqn{g}
is an unknown, continuous non-constant function,
and \eqn{B = (b_1, ..., b_k)} is
a real \eqn{p \times k}{p x k} of rank \eqn{k \leq p}{k <= p}.
Without loss of generality \eqn{B} is assumed to be orthonormal.
This is the main function in the \code{CVE} package. It creates objects of
class \code{"cve"} to estimate the mean subspace. Helper functions that
require a \code{"cve"} object can then be applied to the output from this
function.
}
\examples{
# set dimensions for simulation model
@ -131,7 +141,7 @@ norm(PB - PB.w, type = 'F')
}
\references{
Fertl Lukas, Bura Efstathia. (2019), Conditional Variance
Fertl, L. and Bura, E. (2019), Conditional Variance
Estimation for Sufficient Dimension Reduction. Working Paper.
}
\seealso{

View File

@ -10,9 +10,9 @@ cve.call(X, Y, method = "simple", nObs = sqrt(nrow(X)), h = NULL,
max.iter = 50L, attempts = 10L, logger = NULL)
}
\arguments{
\item{X}{Design matrix with dimension \eqn{n\times p}{n x p}.}
\item{X}{Design predictor matrix.}
\item{Y}{numeric array of length \eqn{n} of Responses.}
\item{Y}{\eqn{n}-dimensional vector of responses.}
\item{method}{specifies the CVE method variation as one of
\itemize{
@ -60,7 +60,7 @@ used as starting value in the optimization. (If supplied,
out \code{attempts} times with starting values drawn from the invariant
measure on the Stiefel manifold (see \code{\link{rStiefel}}).}
\item{logger}{a logger function (only for advanced user, slows down the
\item{logger}{a logger function (only for advanced users, slows down the
computation).}
}
\value{
@ -90,28 +90,10 @@ an S3 object of class \code{cve} with components:
}
}
\description{
Conditional Variance Estimation (CVE) is a novel sufficient dimension
reduction (SDR) method for regressions satisfying \eqn{E(Y|X) = E(Y|B'X)},
where \eqn{B'X} is a lower dimensional projection of the predictors. CVE,
similarly to its main competitor, the mean average variance estimation
(MAVE), is not based on inverse regression, and does not require the
restrictive linearity and constant variance conditions of moment based SDR
methods. CVE is data-driven and applies to additive error regressions with
continuous predictors and link function. The effectiveness and accuracy of
CVE compared to MAVE and other SDR techniques is demonstrated in simulation
studies. CVE is shown to outperform MAVE in some model set-ups, while it
remains largely on par under most others.
Let \eqn{Y} be real denotes a univariate response and \eqn{X} a real
\eqn{p}-dimensional covariate vector. We assume that the dependence of
\eqn{Y} and \eqn{X} is modelled by
\deqn{Y = g(B'X) + \epsilon}
where \eqn{X} is independent of \eqn{\epsilon} with positive definite
variance-covariance matrix \eqn{Var(X) = \Sigma_X}. \eqn{\epsilon} is a mean
zero random variable with finite \eqn{Var(\epsilon) = E(\epsilon^2)}, \eqn{g}
is an unknown, continuous non-constant function,
and \eqn{B = (b_1, ..., b_k)} is
a real \eqn{p \times k}{p x k} of rank \eqn{k \leq p}{k <= p}.
Without loss of generality \eqn{B} is assumed to be orthonormal.
This is the main function in the \code{CVE} package. It creates objects of
class \code{"cve"} to estimate the mean subspace. Helper functions that
require a \code{"cve"} object can then be applied to the output from this
function.
}
\examples{
# create B for simulation (k = 1)

View File

@ -5,20 +5,23 @@
\alias{directions}
\title{Computes projected training data \code{X} for given dimension `k`.}
\usage{
\method{directions}{cve}(dr, k)
\method{directions}{cve}(object, k, ...)
}
\arguments{
\item{dr}{Instance of \code{'cve'} as returned by \code{\link{cve}}.}
\item{object}{an object of class \code{"cve"}, usually, a result of a call to
\code{\link{cve}} or \code{\link{cve.call}}.}
\item{k}{SDR dimension to use for projection.}
\item{...}{ignored (no additional arguments).}
}
\value{
the \eqn{n\times k}{n x k} dimensional matrix \eqn{X B} where \eqn{B}
is the cve-estimate for dimension \eqn{k}.
}
\description{
Projects the dimensional design matrix \eqn{X} on the columnspace of the
cve-estimate for given dimension \eqn{k}.
Returns \eqn{B'X}, that is, the design matrix \eqn{X} projected onto the
column space of the cve-estimate for the given dimension \eqn{k}.
}
\examples{
# create B for simulation (k = 1)
@ -39,3 +42,6 @@ x.proj <- directions(cve.obj.simple, k = 1)
plot(x.proj, y)
}
\seealso{
\code{\link{cve}}
}

View File

@ -7,9 +7,9 @@
estimate.bandwidth(X, k, nObs, version = 1L)
}
\arguments{
\item{X}{a \eqn{n\times p}{n x p} matrix with samples in its rows.}
\item{X}{the \eqn{n\times p}{n x p} matrix of predictor values.}
\item{k}{Dimension of lower dimensional projection.}
\item{k}{the SDR dimension.}
\item{nObs}{number of points in a slice, only for version 2.}
@ -26,11 +26,11 @@ defaults to using the following formula (version 1)
h = (2 * tr(\Sigma) / p) * (1.2 * n^(-1 / (4 + k)))^2}
Alternative version 2 is used for dimension prediction which is given by
\deqn{%
h = (2 * tr(\Sigma) / p) * \chi_k^{-1}(\frac{nObs - 1}{n - 1})}{%
h = \frac{2 tr(\Sigma)}{p} \chi_k^{-1}(\frac{nObs - 1}{n - 1})}{%
h = (2 * tr(\Sigma) / p) * \chi_k^-1((nObs - 1) / (n - 1))}
with \eqn{n} the sample size, \eqn{p} its dimension and the
covariance-matrix \eqn{\Sigma}, which is \code{(n-1)/n} times the sample
covariance estimate.
with \eqn{n} the sample size, \eqn{p} the dimension of \eqn{X} and
\eqn{\Sigma} equal to \eqn{(n - 1) / n} times the sample covariance matrix
of \eqn{X}.
}
\examples{
# set dimensions for simulation model

View File

@ -2,12 +2,13 @@
% Please edit documentation in R/plot.R
\name{plot.cve}
\alias{plot.cve}
\title{Loss distribution elbow plot.}
\title{Elbow plot of the loss function.}
\usage{
\method{plot}{cve}(x, ...)
}
\arguments{
\item{x}{Object of class \code{"cve"} (result of [\code{\link{cve}}]).}
\item{x}{an object of class \code{"cve"}, usually, a result of a call to
\code{\link{cve}} or \code{\link{cve.call}}.}
\item{...}{Pass through parameters to [\code{\link{plot}}] and
[\code{\link{lines}}]}
@ -15,8 +16,9 @@
\description{
Boxplots of the output \code{L} from \code{\link{cve}} over \code{k} from
\code{min.dim} to \code{max.dim}. For given \code{k}, \code{L} corresponds
to \eqn{L_n(V, X_i)} where \eqn{V \in S(p, p - k)}{V} is the minimizer of
\eqn{L_n(V)}, for further details see the paper.
to \eqn{L_n(V, X_i)} where \eqn{V} is the element of the Stiefel manifold
that minimizes
\eqn{L_n(V)}, for further details see Fertl, L. and Bura, E. (2019).
}
\examples{
# create B for simulation
@ -46,7 +48,7 @@ plot(cve.obj.simple)
}
\references{
Fertl Lukas, Bura Efstathia. (2019), Conditional Variance
Fertl, L. and Bura, E. (2019), Conditional Variance
Estimation for Sufficient Dimension Reduction. Working Paper.
}
\seealso{

View File

@ -4,24 +4,24 @@
\alias{predict.cve}
\title{Predict method for CVE Fits.}
\usage{
\method{predict}{cve}(object, newdata, dim, ...)
\method{predict}{cve}(object, newdata, k, ...)
}
\arguments{
\item{object}{instance of class \code{cve} (result of \code{cve},
\code{cve.call}).}
\item{object}{an object of class \code{"cve"}, usually, a result of a call to
\code{\link{cve}} or \code{\link{cve.call}}.}
\item{newdata}{Matrix of the new data to be predicted.}
\item{newdata}{Matrix of new predictor values, \eqn{C}.}
\item{dim}{dimension of SDR space to be used for data projecition.}
\item{k}{dimension of SDR space to be used for data projection.}
\item{...}{further arguments passed to \code{\link{mars}}.}
}
\value{
predicted response of data \code{newdata}.
predicted response at \code{newdata}.
}
\description{
Predict response using projected data where the forward model \eqn{g(B'X)}
is estimated using \code{\link{mars}}.
Predict response using projected data \eqn{B'C} by fitting
\eqn{g(B'C) + \epsilon} using \code{\link{mars}}.
}
\examples{
# create B for simulation

View File

@ -7,8 +7,8 @@
predict_dim(object, ..., method = "CV")
}
\arguments{
\item{object}{instance of class \code{cve} (result of \code{\link{cve}},
\code{\link{cve.call}}).}
\item{object}{an object of class \code{"cve"}, usually, a result of a call to
\code{\link{cve}} or \code{\link{cve.call}}.}
\item{...}{ignored.}
@ -26,16 +26,15 @@ list with
\description{
This function estimates the dimension of the mean dimension reduction space,
i.e. number of columns of \eqn{B} matrix. The default method \code{'CV'}
performs cross-validation using \code{mars}. Given
performs leave-one-out (l.o.o.) cross-validation using \code{mars}. Given
\code{k = min.dim, ..., max.dim} a cross-validation via \code{mars} is
performed on the dataset \eqn{(Y i, B_k' X_i)_{i = 1, ..., n}} where
\eqn{B_k} is the \eqn{p \times k}{p x k} dimensional CVE estimate given
\eqn{k}. The estimated SDR dimension is the \eqn{k} where the
cross-validation mean squared error is the lowest. The method \code{'elbow'}
estimates the dimension via \eqn{k = argmin_k L_n(V_{p k})} where
\eqn{V_{p k}} is the CVE estimate of the orthogonal columnspace of
\eqn{B_k}. Method \code{'wilcoxon'} is similar to \code{'elbow'} but finds
the minimum using the wilcoxon-test.
performed on the dataset \eqn{(Y_i, B_k' X_i)_{i = 1, ..., n}} where
\eqn{B_k} is the \eqn{p \times k}{p x k} dimensional CVE estimate. The
estimated SDR dimension is the \eqn{k} where the
cross-validation mean squared error is minimal. The method \code{'elbow'}
estimates the dimension via \eqn{k = argmin_k L_n(V_{p - k})} where
\eqn{V_{p - k}} is a basis of the space orthogonal to the column space of
the CVE estimate of \eqn{B_k}. Method \code{'wilcoxon'} is similar to
\code{'elbow'} but finds the minimum using the Wilcoxon test.
}
\examples{
# create B for simulation