Compare commits: 7a68948d26 ... b1f25b89da (3 commits: b1f25b89da, 5b3912f21b, 8fb2fd057d)

(file diff suppressed because it is too large)
@@ -0,0 +1,505 @@
\documentclass[aos]{imsart}

%% Packages
\RequirePackage{amsthm,amsmath,amsfonts,amssymb}
\RequirePackage[numbers,sort&compress]{natbib}
%\RequirePackage[authoryear]{natbib}%% uncomment this for author-year citations
\RequirePackage[colorlinks,citecolor=blue,urlcolor=blue]{hyperref}
\RequirePackage{graphicx}

\startlocaldefs
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                                          %%
%% Uncomment next line to change            %%
%% the type of equation numbering           %%
%%                                          %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\numberwithin{equation}{section}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                                          %%
%% For Axiom, Claim, Corollary, Hypothesis, %%
%% Lemma, Theorem, Proposition              %%
%% use \theoremstyle{plain}                 %%
%%                                          %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\theoremstyle{plain}
\newtheorem{axiom}{Axiom}
\newtheorem{claim}[axiom]{Claim}
\newtheorem{theorem}{Theorem}[section]
\newtheorem{lemma}[theorem]{Lemma}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                                          %%
%% For Assumption, Definition, Example,     %%
%% Notation, Property, Remark, Fact         %%
%% use \theoremstyle{remark}                %%
%%                                          %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\theoremstyle{remark}
\newtheorem{definition}[theorem]{Definition}
\newtheorem*{example}{Example}
\newtheorem*{fact}{Fact}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Please put your definitions here:        %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\endlocaldefs

\begin{document}

\begin{frontmatter}
\title{A sample article title}
%\title{A sample article title with some additional note\thanksref{T1}}
\runtitle{A sample running head title}
%\thankstext{T1}{A sample additional note to the title.}

\begin{aug}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Only one address is permitted per author. %%
%% Only division, organization and e-mail is %%
%% included in the address.                  %%
%% Additional information can be included in %%
%% the Acknowledgments section if necessary. %%
%% ORCID can be inserted by command:         %%
%% \orcid{0000-0000-0000-0000}               %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\author[A]{\fnms{First}~\snm{Author}\ead[label=e1]{first@somewhere.com}},
\author[B]{\fnms{Second}~\snm{Author}\ead[label=e2]{second@somewhere.com}\orcid{0000-0000-0000-0000}}
\and
\author[B]{\fnms{Third}~\snm{Author}\ead[label=e3]{third@somewhere.com}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Addresses                                %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\address[A]{Department,
University or Company Name\printead[presep={,\ }]{e1}}

\address[B]{Department,
University or Company Name\printead[presep={,\ }]{e2,e3}}
\end{aug}

\begin{abstract}
The abstract should summarize the contents of the paper.
It should be clear, descriptive, self-explanatory and no longer
than 200 words. It should also be suitable for publication in
abstracting services. Formulas should be used as sparingly as
possible within the abstract. The abstract should not make
reference to results, bibliography or formulas in the body
of the paper---it should be self-contained.

This is a sample input file. Comparing it with the output it
generates can show you how to produce a simple document of
your own.
\end{abstract}

\begin{keyword}[class=MSC]
\kwd[Primary ]{00X00}
\kwd{00X00}
\kwd[; secondary ]{00X00}
\end{keyword}

\begin{keyword}
\kwd{First keyword}
\kwd{second keyword}
\end{keyword}

\end{frontmatter}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Please use \tableofcontents for articles %%
%% with 50 pages and more                   %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\tableofcontents

\section{Introduction}

This template helps you to create a properly formatted \LaTeXe\ manuscript.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% `\ ' is used here because TeX ignores    %%
%% spaces after text commands.              %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Prepare your paper in the same style as used in this sample .pdf file.
Try to avoid excessive use of italics and bold face.
Please do not use any \LaTeXe\ or \TeX\ commands that affect the layout
or formatting of your document (e.g., commands like \verb|\textheight|,
\verb|\textwidth|, etc.).

\section{Section headings}
Here are some sub-sections:
\subsection{A sub-section}
Regular text.
\subsubsection{A sub-sub-section}
Regular text.

\section{Text}
\subsection{Lists}

The following is an example of an \emph{itemized} list,
two levels deep.
\begin{itemize}
\item
This is the first item of an itemized list. Each item
in the list is marked with a ``tick.'' The document
style determines what kind of tick mark is used.
\item
This is the second item of the list. It contains another
list nested inside it.
\begin{itemize}
\item This is the first item of an itemized list that
is nested within the itemized list.
\item This is the second item of the inner list. \LaTeX\
allows you to nest lists deeper than you really should.
\end{itemize}
This is the rest of the second item of the outer list.
\item
This is the third item of the list.
\end{itemize}

The following is an example of an \emph{enumerated} list of one level.

\begin{longlist}
\item This is the first item of an enumerated list.
\item This is the second item of an enumerated list.
\end{longlist}

The following is an example of an \emph{enumerated} list, two levels deep.
\begin{longlist}
\item[1.]
This is the first item of an enumerated list. Each item
in the list is marked with a ``tick.'' The document
style determines what kind of tick mark is used.
\item[2.]
This is the second item of the list. It contains another
list nested inside of it.
\begin{longlist}
\item
This is the first item of an enumerated list that
is nested within.
\item
This is the second item of the inner list. \LaTeX\
allows you to nest lists deeper than you really should.
\end{longlist}
This is the rest of the second item of the outer list.
\item[3.]
This is the third item of the list.
\end{longlist}

\subsection{Punctuation}
Dashes come in three sizes: an intra-word dash or hyphen, as in ``$U$-statistics'' or ``the time-homogeneous model'';
a medium dash (also called an ``en-dash'') for number ranges or between two equal entities, as in ``1--2'' or ``Cauchy--Schwarz inequality'';
and a punctuation dash (also called an ``em-dash'') in place of a comma, semicolon,
colon or parentheses---like this.

Generating an ellipsis \ldots\ with the right spacing
around the periods requires a special command.

\subsection{Citation}
Only include in the reference list entries for which there are text citations,
and make sure all citations are included in the reference list.
Simple cite: \cite{r1}.
Multiple bibliography items cite: \cite{r1,r2,r3,r4}.
Citing bibliography with object: \cite{r1}, Theorem 1.

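The \verb|\cite| examples above use natbib's basic command; natbib (loaded in the preamble with the \texttt{numbers} option) also provides textual and parenthetical variants. A brief sketch using the reference keys defined in this sample:

```latex
% With [numbers]:    \citet{r1} -> "Billingsley [1]",      \citep{r1} -> "[1]"
% With [authoryear]: \citet{r1} -> "Billingsley (1999)",   \citep{r1} -> "(Billingsley, 1999)"
\citet{r1}              % textual citation
\citep{r1}              % parenthetical citation
\citep[Theorem 1]{r1}   % citation with a postnote
\citep{r1,r2,r3,r4}     % multiple keys; sorted and merged by sort&compress
```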
\section{Fonts}
Please use text fonts in text mode, e.g.:
\begin{itemize}
\item[]\textrm{Roman}
\item[]\textit{Italic}
\item[]\textbf{Bold}
\item[]\textsc{Small Caps}
\item[]\textsf{Sans serif}
\item[]\texttt{Typewriter}
\end{itemize}
Please use mathematical fonts in mathematical mode, e.g.:
\begin{itemize}
\item[] $\mathrm{ABCabc123}$
\item[] $\mathit{ABCabc123}$
\item[] $\mathbf{ABCabc123}$
\item[] $\boldsymbol{ABCabc123\alpha\beta\gamma}$
\item[] $\mathcal{ABC}$
\item[] $\mathbb{ABC}$
\item[] $\mathsf{ABCabc123}$
\item[] $\mathtt{ABCabc123}$
\item[] $\mathfrak{ABCabc123}$
\end{itemize}
Note that \verb|\mathcal| and \verb|\mathbb| are capital-letters-only font typefaces.

\section{Notes}
Footnotes\footnote{This is an example of a footnote.}
pose no problem.\footnote{Note that the footnote number comes after the punctuation.}

\section{Quotations}

Quoted text is displayed by indenting it from the left margin. There are short quotations
\begin{quote}
This is a short quotation. It consists of a
single paragraph of text. There is no paragraph
indentation.
\end{quote}
and longer ones.
\begin{quotation}
This is a longer quotation. It consists of two paragraphs
of text. The beginning of each paragraph is indicated
by an extra indentation.

This is the second paragraph of the quotation. It is just
as dull as the first paragraph.
\end{quotation}

\section{Environments}

\subsection{Examples for \emph{\texttt{plain}}-style environments}
\begin{axiom}\label{ax1}
This is the body of Axiom \ref{ax1}.
\end{axiom}

\begin{proof}
This is the body of the proof of the axiom above.
\end{proof}

\begin{claim}\label{cl1}
This is the body of Claim \ref{cl1}. Claim \ref{cl1} is numbered after
Axiom \ref{ax1} because we used \verb|[axiom]| in \verb|\newtheorem|.
\end{claim}

\begin{theorem}\label{th1}
This is the body of Theorem \ref{th1}. Theorem \ref{th1} numbering is
dependent on section because we used \verb|[section]| after \verb|\newtheorem|.
\end{theorem}

\begin{theorem}[Title of the theorem]\label{th2}
This is the body of Theorem \ref{th2}. Theorem \ref{th2} has an additional title.
\end{theorem}

\begin{lemma}\label{le1}
This is the body of Lemma \ref{le1}. Lemma \ref{le1} is numbered after
Theorem \ref{th2} because we used \verb|[theorem]| in \verb|\newtheorem|.
\end{lemma}

\begin{proof}[Proof of Lemma \ref{le1}]
This is the body of the proof of Lemma \ref{le1}.
\end{proof}

\subsection{Examples for \emph{\texttt{remark}}-style environments}
\begin{definition}\label{de1}
This is the body of Definition \ref{de1}. Definition \ref{de1} is numbered after
Lemma \ref{le1} because we used \verb|[theorem]| in \verb|\newtheorem|.
\end{definition}

\begin{example}
This is the body of the example. The example is unnumbered because we used \verb|\newtheorem*|
instead of \verb|\newtheorem|.
\end{example}

\begin{fact}
This is the body of the fact. The fact is unnumbered because we used \verb|\newtheorem*|
instead of \verb|\newtheorem|.
\end{fact}

\section{Tables and figures}
Cross-references to labeled tables: As you can see in Table~\ref{sphericcase}
and also in Table~\ref{parset}.

\begin{table*}
\caption{The spherical case ($I_1=0$, $I_2=0$)}
\label{sphericcase}
\begin{tabular}{@{}lrrrrc@{}}
\hline
Equil. \\
points & \multicolumn{1}{c}{$x$}
& \multicolumn{1}{c}{$y$} & \multicolumn{1}{c}{$z$}
& \multicolumn{1}{c}{$C$} & S \\
\hline
$L_1$ & $-$2.485252241 & 0.000000000 & 0.017100631 & 8.230711648 & U \\
$L_2$ & 0.000000000 & 0.000000000 & 3.068883732 & 0.000000000 & S \\
$L_3$ & 0.009869059 & 0.000000000 & 4.756386544 & $-$0.000057922 & U \\
$L_4$ & 0.210589855 & 0.000000000 & $-$0.007021459 & 9.440510897 & U \\
$L_5$ & 0.455926604 & 0.000000000 & $-$0.212446624 & 7.586126667 & U \\
$L_6$ & 0.667031314 & 0.000000000 & 0.529879957 & 3.497660052 & U \\
$L_7$ & 2.164386674 & 0.000000000 & $-$0.169308438 & 6.866562449 & U \\
$L_8$ & 0.560414471 & 0.421735658 & $-$0.093667445 & 9.241525367 & U \\
$L_9$ & 0.560414471 & $-$0.421735658 & $-$0.093667445 & 9.241525367 & U \\
$L_{10}$ & 1.472523232 & 1.393484549 & $-$0.083801333 & 6.733436505 & U \\
$L_{11}$ & 1.472523232 & $-$1.393484549 & $-$0.083801333 & 6.733436505 & U \\
\hline
\end{tabular}
\end{table*}

\begin{table}
\caption{Sample posterior estimates for each model}
\label{parset}
%
\begin{tabular}{@{}lcrcrrr@{}}
\hline
&& & &\multicolumn{3}{c}{Quantile} \\
\cline{5-7}
Model &Parameter &
\multicolumn{1}{c}{Mean} &
Std. dev.&
\multicolumn{1}{c}{2.5\%} &
\multicolumn{1}{c}{50\%}&
\multicolumn{1}{c@{}}{97.5\%} \\
\hline
{Model 0} & $\beta_0$ & $-$12.29 & 2.29 & $-$18.04 & $-$11.99 & $-$8.56 \\
& $\beta_1$ & 0.10 & 0.07 & $-$0.05 & 0.10 & 0.26 \\
& $\beta_2$ & 0.01 & 0.09 & $-$0.22 & 0.02 & 0.16 \\[6pt]
{Model 1} & $\beta_0$ & $-$4.58 & 3.04 & $-$11.00 & $-$4.44 & 1.06 \\
& $\beta_1$ & 0.79 & 0.21 & 0.38 & 0.78 & 1.20 \\
& $\beta_2$ & $-$0.28 & 0.10 & $-$0.48 & $-$0.28 & $-$0.07 \\[6pt]
{Model 2} & $\beta_0$ & $-$11.85 & 2.24 & $-$17.34 & $-$11.60 & $-$7.85 \\
& $\beta_1$ & 0.73 & 0.21 & 0.32 & 0.73 & 1.16 \\
& $\beta_2$ & $-$0.60 & 0.14 & $-$0.88 & $-$0.60 & $-$0.34 \\
& $\beta_3$ & 0.22 & 0.17 & $-$0.10 & 0.22 & 0.55 \\
\hline
\end{tabular}
%
\end{table}

\begin{figure}
\includegraphics{figure1}
\caption{Pathway of the penicillin G biosynthesis.}
\label{penG}
\end{figure}

Sample of a cross-reference to a figure:
Figure~\ref{penG} shows that it is not easy to get something on paper.

\section{Equations and the like}

Two equations:
\begin{equation}
C_{s} = K_{M} \frac{\mu/\mu_{x}}{1-\mu/\mu_{x}} \label{ccs}
\end{equation}
and
\begin{equation}
G = \frac{P_{\mathrm{opt}} - P_{\mathrm{ref}}}{P_{\mathrm{ref}}} 100(\%).
\end{equation}

Equation arrays:
\begin{eqnarray}
\frac{dS}{dt} & = & - \sigma X + s_{F} F,\\
\frac{dX}{dt} & = & \mu X,\\
\frac{dP}{dt} & = & \pi X - k_{h} P,\\
\frac{dV}{dt} & = & F.
\end{eqnarray}
One long equation:
\begin{eqnarray}
\mu_{\text{normal}} & = & \mu_{x} \frac{C_{s}}{K_{x}C_{x}+C_{s}} \nonumber\\
& = & \mu_{\text{normal}} - Y_{x/s}\bigl(1-H(C_{s})\bigr)(m_{s}+\pi /Y_{p/s})\\
& = & \mu_{\text{normal}}/Y_{x/s}+ H(C_{s}) (m_{s}+ \pi /Y_{p/s}).\nonumber
\end{eqnarray}
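The arrays above use \texttt{eqnarray}, inherited from plain \LaTeX. Since \texttt{amsmath} is already loaded in the preamble, the same system could equally be written with \texttt{align}, which gives more consistent spacing around the relation signs; this is a sketch of an alternative, not a template requirement:

```latex
\begin{align}
\frac{dS}{dt} &= -\sigma X + s_{F} F, \\
\frac{dX}{dt} &= \mu X, \\
\frac{dP}{dt} &= \pi X - k_{h} P, \\
\frac{dV}{dt} &= F.
\end{align}
```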
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Example with single Appendix:            %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{appendix}
\section*{Title}\label{appn} %% if no title is needed, leave empty \section*{}.
Appendices should be provided in the \verb|{appendix}| environment,
before the Acknowledgments.

If there is only one appendix,
then please refer to it in text as \ldots\ in the \hyperref[appn]{Appendix}.
\end{appendix}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Example with multiple Appendixes:        %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{appendix}
\section{Title of the first appendix}\label{appA}
If there is more than one appendix, then please refer to them
as \ldots\ in Appendix \ref{appA}, Appendix \ref{appB}, etc.

\section{Title of the second appendix}\label{appB}
\subsection{First subsection of Appendix \protect\ref{appB}}

Use the standard \LaTeX\ commands for headings in \verb|{appendix}|.
Headings and other objects will be numbered automatically.
\begin{equation}
\mathcal{P}=(j_{k,1},j_{k,2},\dots,j_{k,m(k)}). \label{path}
\end{equation}

Sample of a cross-reference to the formula (\ref{path}) in Appendix \ref{appB}.
\end{appendix}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Support information, if any,             %%
%% should be provided in the                %%
%% Acknowledgements section.                %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{acks}[Acknowledgments]
The authors would like to thank the anonymous referees, an Associate
Editor and the Editor for their constructive comments that improved the
quality of this paper.
\end{acks}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Funding information, if any,             %%
%% should be provided in the                %%
%% funding section.                         %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{funding}
The first author was supported by NSF Grant DMS-??-??????.

The second author was supported in part by NIH Grant ???????????.
\end{funding}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Supplementary Material, including data   %%
%% sets and code, should be provided in     %%
%% {supplement} environment with title      %%
%% and short description. It cannot be      %%
%% available exclusively as external link.  %%
%% All Supplementary Material must be       %%
%% available to the reader on Project       %%
%% Euclid with the published article.       %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{supplement}
\stitle{Title of Supplement A}
\sdescription{Short description of Supplement A.}
\end{supplement}
\begin{supplement}
\stitle{Title of Supplement B}
\sdescription{Short description of Supplement B.}
\end{supplement}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% The Bibliography                                        %%
%%                                                         %%
%% imsart-???.bst will be used to                          %%
%% create a .BBL file for submission.                      %%
%%                                                         %%
%% Note that the displayed Bibliography will not           %%
%% necessarily be rendered by Latex exactly as specified   %%
%% in the online Instructions for Authors.                 %%
%%                                                         %%
%% MR numbers will be added by VTeX.                       %%
%%                                                         %%
%% Use \cite{...} to cite references in text.              %%
%%                                                         %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% if your bibliography is in bibtex format, uncomment commands:
%\bibliographystyle{imsart-number} % Style BST file (imsart-number.bst or imsart-nameyear.bst)
%\bibliography{bibliography} % Bibliography file (usually '*.bib')

%% or include bibliography directly:
\begin{thebibliography}{4}
%%
\bibitem{r1}
\textsc{Billingsley, P.} (1999). \textit{Convergence of
Probability Measures}, 2nd ed.
Wiley, New York.

\bibitem{r2}
\textsc{Bourbaki, N.} (1966). \textit{General Topology} \textbf{1}.
Addison--Wesley, Reading, MA.

\bibitem{r3}
\textsc{Ethier, S. N.} and \textsc{Kurtz, T. G.} (1985).
\textit{Markov Processes: Characterization and Convergence}.
Wiley, New York.

\bibitem{r4}
\textsc{Prokhorov, Yu.} (1956).
Convergence of random processes and limit theorems in probability
theory. \textit{Theory Probab. Appl.}
\textbf{1} 157--214.
\end{thebibliography}

\end{document}
@@ -0,0 +1,197 @@
%% Template for the submission to:
%% The Annals of Statistics [AOS]
%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% In this template, the places where you   %%
%% need to fill in your information are     %%
%% indicated by '???'.                      %%
%%                                          %%
%% Please do not use \input{...} to include %%
%% other tex files. Submit your LaTeX       %%
%% manuscript as one .tex document.         %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\documentclass[aos]{imsart}

%% Packages
\RequirePackage{amsthm,amsmath,amsfonts,amssymb}
\RequirePackage[numbers]{natbib}
%\RequirePackage[authoryear]{natbib}%% uncomment this for author-year citations
%\RequirePackage[colorlinks,citecolor=blue,urlcolor=blue]{hyperref}%% uncomment this for coloring bibliography citations and linked URLs
%\RequirePackage{graphicx}%% uncomment this for including figures

\startlocaldefs
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                                          %%
%% Uncomment next line to change            %%
%% the type of equation numbering           %%
%%                                          %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\numberwithin{equation}{section}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                                          %%
%% For Axiom, Claim, Corollary, Hypothesis, %%
%% Lemma, Theorem, Proposition              %%
%% use \theoremstyle{plain}                 %%
%%                                          %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\theoremstyle{plain}
%\newtheorem{???}{???}
%\newtheorem*{???}{???}
%\newtheorem{???}{???}[???]
%\newtheorem{???}[???]{???}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                                          %%
%% For Assumption, Definition, Example,     %%
%% Notation, Property, Remark, Fact         %%
%% use \theoremstyle{remark}                %%
%%                                          %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\theoremstyle{remark}
%\newtheorem{???}{???}
%\newtheorem*{???}{???}
%\newtheorem{???}{???}[???]
%\newtheorem{???}[???]{???}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Please put your definitions here:        %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\endlocaldefs

\begin{document}

\begin{frontmatter}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                                          %%
%% Enter the title of your article here     %%
%%                                          %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\title{???}
%\title{A sample article title with some additional note\thanksref{T1}}
\runtitle{???}
%\thankstext{T1}{A sample additional note to the title.}

\begin{aug}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Only one address is permitted per author. %%
%% Only division, organization and e-mail is %%
%% included in the address.                  %%
%% Additional information can be included in %%
%% the Acknowledgments section if necessary. %%
%% ORCID can be inserted by command:         %%
%% \orcid{0000-0000-0000-0000}               %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\author[A]{\fnms{???}~\snm{???}\ead[label=e1]{???@???}},
\author[B]{\fnms{???}~\snm{???}\ead[label=e2]{???@???}}
\and
\author[B]{\fnms{???}~\snm{???}\ead[label=e3]{???@???}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Addresses                                %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\address[A]{???\printead[presep={,\ }]{e1}}

\address[B]{???\printead[presep={,\ }]{e2,e3}}
\end{aug}

\begin{abstract}
???.
\end{abstract}

\begin{keyword}[class=MSC]
\kwd[Primary ]{???}
\kwd{???}
\kwd[; secondary ]{???}
\end{keyword}

\begin{keyword}
\kwd{???}
\kwd{???}
\end{keyword}

\end{frontmatter}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Please use \tableofcontents for articles %%
%% with 50 pages and more                   %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\tableofcontents

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%% Main text entry area:


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Single Appendix:                         %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\begin{appendix}
%\section*{???}%% if no title is needed, leave empty \section*{}.
%\end{appendix}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Multiple Appendixes:                     %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\begin{appendix}
%\section{???}
%
%\section{???}
%
%\end{appendix}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Support information, if any,             %%
%% should be provided in the                %%
%% Acknowledgements section.                %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\begin{acks}[Acknowledgments]
% The authors would like to thank ...
%\end{acks}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Funding information, if any,             %%
%% should be provided in the                %%
%% funding section.                         %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\begin{funding}
% The first author was supported by ...
%
% The second author was supported in part by ...
%\end{funding}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Supplementary Material, including data   %%
%% sets and code, should be provided in     %%
%% {supplement} environment with title      %%
%% and short description. It cannot be      %%
%% available exclusively as external link.  %%
%% All Supplementary Material must be       %%
%% available to the reader on Project       %%
%% Euclid with the published article.       %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\begin{supplement}
%\stitle{???}
%\sdescription{???.}
%\end{supplement}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% The Bibliography                                        %%
%%                                                         %%
%% imsart-???.bst will be used to                          %%
%% create a .BBL file for submission.                      %%
%%                                                         %%
%% Note that the displayed Bibliography will not           %%
%% necessarily be rendered by Latex exactly as specified   %%
%% in the online Instructions for Authors.                 %%
%%                                                         %%
%% MR numbers will be added by VTeX.                       %%
%%                                                         %%
%% Use \cite{...} to cite references in text.              %%
%%                                                         %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% if your bibliography is in bibtex format, uncomment commands:
%\bibliographystyle{imsart-number} % Style BST file (imsart-number.bst or imsart-nameyear.bst)
%\bibliography{bibliography} % Bibliography file (usually '*.bib')

%% or include bibliography directly:
% \begin{thebibliography}{}
% \bibitem{b1}
% \end{thebibliography}

\end{document}
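The commented BibTeX block near the end of the template can be activated as sketched below. The entry key \texttt{billingsley1999} and the file name \texttt{bibliography.bib} are illustrative placeholders, not names mandated by the template:

```latex
% bibliography.bib (hypothetical entry):
% @book{billingsley1999,
%   author    = {Billingsley, Patrick},
%   title     = {Convergence of Probability Measures},
%   edition   = {2},
%   publisher = {Wiley},
%   address   = {New York},
%   year      = {1999}
% }
%
% In the manuscript, cite with \cite{billingsley1999} and, before
% \end{document}, uncomment:
\bibliographystyle{imsart-number} % or imsart-nameyear for author-year style
\bibliography{bibliography}       % reads bibliography.bib
```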
(three file diffs suppressed because they are too large)
@ -0,0 +1,153 @@
|
|||
%% This is file `imsart.cls'
|
||||
%%
|
||||
%% LaTeX 2e class file for the processing of LaTeX2e files
|
||||
%% of the following IMS/BS journals:
|
||||
%%
|
||||
%% The Annals of Probability
|
||||
%% The Annals of Applied Probability
|
||||
%% The Annals of Statistics
|
||||
%% The Annals of Applied Statistics
|
||||
%% Statistical Science
|
||||
%% Probability Surveys
|
||||
%% Statistics Surveys
|
||||
%% Electronic Journal of Statistics
|
||||
%% Bernoulli
|
||||
%% Annales de l'Institut Henri Poincar\'e - Probabilit\'es et Statistiques
|
||||
%% Brazilian Journal of Probability and Statistics
|
||||
%% Bayesian Analysis
|
||||
%%
|
||||
%% Institute of Mathematical Statistics, U.S.A.
|
||||
%% Bernoulli Society
|
||||
%% Institut Henry Poincare
|
||||
%% Brazilian Statistical Association
|
||||
%% International Society for Bayesian Analysis
|
||||
%%
|
||||
%% Macros written by Vytas Statulevicius, VTeX, Lithuania
|
||||
%% Maintained by Deimantas Galcius, VTeX, Lithuania
|
||||
%% for Institute of Mathematical Statistics, U.S.A.
|
||||
%% Please submit bugs or your comments to latex-support@vtex.lt
|
||||
%%
|
||||
%% The original distribution is located at:
|
||||
%% http://www.e-publications.org/ims/support
|
||||
%%
|
||||
%% This class file loads standard "article.cls" with appropriate
|
||||
%% settings and then style file "imsart.sty" with additional macros
|
||||
%%
|
||||
%% You are free to use this style file as you see fit, provided
|
||||
%% that you do not make changes to the file.
|
||||
%% If you DO make changes, you are required to rename this file.
|
||||
%%
|
||||
%% It may be distributed under the terms of the LaTeX Project Public
|
||||
%% License, as described in lppl.txt in the base LaTeX distribution.
|
||||
%% Either version 1.0 or, at your option, any later version.
|
||||
%%
|
||||
%% \CharacterTable
|
||||
%% {Upper-case \A\B\C\D\E\F\G\H\I\J\K\L\M\N\O\P\Q\R\S\T\U\V\W\X\Y\Z
|
||||
%% Lower-case \a\b\c\d\e\f\g\h\i\j\k\l\m\n\o\p\q\r\s\t\u\v\w\x\y\z
|
||||
%% Digits \0\1\2\3\4\5\6\7\8\9
|
||||
%% Exclamation \! Double quote \" Hash (number) \#
|
||||
%% Dollar \$ Percent \% Ampersand \&
|
||||
%% Acute accent \' Left paren \( Right paren \)
|
||||
%% Asterisk \* Plus \+ Comma \,
|
||||
%% Minus \- Point \. Solidus \/
|
||||
%% Colon \: Semicolon \; Less than \<
|
||||
%% Equals \= Greater than \> Question mark \?
|
||||
%% Commercial at \@ Left bracket \[ Backslash \\
|
||||
%% Right bracket \] Circumflex \^ Underscore \_
|
||||
%% Grave accent \` Left brace \{ Vertical bar \|
|
||||
%% Right brace \} Tilde \~}
|
||||
%%
|
||||
%%
|
||||
%% Bug fixes and changes: at end of file
|
||||
%
|
||||
% TeX programming: Vytas Statulevicius, VTeX, Lithuania.
|
||||
% TeX programming: Deimantas Galcius, VTeX, Lithuania.
|
||||
% TeX programming: Edgaras Sakuras, VTeX, Lithuania.
|
||||
% Requires Latex2e, ver.2000.06
|
||||
%
|
||||
\NeedsTeXFormat{LaTeX2e}
\ProvidesClass{imsart}[2022/04/21 driver class for package imsart.sty]
%
% set various options for different journals:
\DeclareOption{ps}{%
    \PassOptionsToClass{10pt,oneside}{article}%
}
%
\DeclareOption{ss}{%
    \PassOptionsToClass{10pt,oneside}{article}%
}
%
\DeclareOption{ejs}{%
    \PassOptionsToClass{10pt,oneside}{article}%
}
%
\DeclareOption{aap}{%
    \PassOptionsToClass{11pt,oneside,leqno}{article}%
    \PassOptionsToPackage{leqno}{amsmath}%
}
%
\DeclareOption{aop}{%
    \PassOptionsToClass{11pt,oneside,leqno}{article}%
    \PassOptionsToPackage{leqno}{amsmath}%
}
%
\DeclareOption{aos}{%
    \PassOptionsToClass{11pt,oneside,leqno}{article}%
    \PassOptionsToPackage{leqno}{amsmath}%
}
%
\DeclareOption{aoas}{%
    \PassOptionsToClass{11pt,oneside,leqno}{article}%
    \PassOptionsToPackage{leqno}{amsmath}%
}
%
\DeclareOption{sts}{%
    \PassOptionsToClass{11pt,oneside,leqno}{article}%
    \PassOptionsToPackage{leqno}{amsmath}%
}
%
\DeclareOption{bj}{%
    \PassOptionsToClass{10pt,oneside}{article}%
    \g@addto@macro\@classoptionslist{,nosetpagesize}%
    \PassOptionsToPackage{sort&compress}{natbib}%
}
%
\DeclareOption{aihp}{%
    \PassOptionsToClass{10pt,oneside,leqno}{article}%
    \PassOptionsToPackage{leqno}{amsmath}%
    \PassOptionsToPackage{numbers,sort&compress}{natbib}%
}
%
\DeclareOption{bjps}{%
    \PassOptionsToClass{11pt,twoside}{article}%
}
%
\DeclareOption{ba}{%
    \PassOptionsToClass{10pt,oneside}{article}%
}
%
\PassOptionsToPackage{cmex10}{amsmath}
%
\DeclareOption*{\PassOptionsToClass{\CurrentOption}{article}}
\ProcessOptions*
%
\LoadClass{article}
%
\IfFileExists{imsart.sty}%
{}%
{%
\ClassError{imsart}%
{The complementary style file "imsart.sty" is required}%
{%
The complementary style file "imsart.sty" is required\MessageBreak
You need to install file "imsart.sty" in your system\MessageBreak
File could be downloaded from http://www.e-publications.org/ims/support
}%
}
%
% Load additional macros and modifications for "article.cls"
\RequirePackage{imsart}
%
\endinput
%%
%% End of file `imsart.cls'.
LaTeX/main.bib (2106 lines): file diff suppressed because it is too large. Load Diff.

LaTeX/paper.tex (275 lines):
@@ -13,7 +13,7 @@
\usepackage{environ} % for dynamic TikZ picture scaling
\usepackage{algorithm, algpseudocode} % Pseudo Codes / Algorithms
\usepackage[
    style = apa, % citation style
    style = authoryear, % citation style
    isbn = false, % show isbn?
    maxbibnames = 50, % maximal number of names in bibliography
    maxcitenames = 2, % maximal number of names in text before et al.

@@ -28,8 +28,8 @@
    sortcites = false,
    natbib = false,
    dashed = true,
    url = false,
    doi = false,
    url = true,
    doi = true,
    bibencoding = utf8
]{biblatex}

@@ -87,7 +87,7 @@
\newtheorem{remark}{Remark}

\crefalias{section}{appendix}
% \crefalias{section}{appendix}

\crefname{condition}{Condition}{Conditions}
\Crefname{condition}{Condition}{Conditions}

@@ -272,9 +272,9 @@
\maketitle

\begin{abstract}
We consider regression or classification problems where the independent variable is matrix- or tensor-valued. We derive a multi-linear sufficient reduction for the regression or classification problem modeling the conditional distribution of the predictors given the response as a member of the quadratic exponential family. Using manifold theory, we prove the consistency and asymptotic normality of the sufficient reduction. We develop estimation procedures of
sufficient reductions for both continuous and binary tensor-valued predictors. For continuous predictors, the algorithm is highly computationally efficient and is also applicable to situations where the dimension of
the reduction exceeds the sample size. We demonstrate the superior performance of our approach in simulations and real-world data examples for both continuous and binary tensor-valued predictors. The \textit{Chess data} analysis results agree with a human player's understanding of the game and confirm the relevance of our approach.
We consider supervised learning (regression/classification) problems where the independent variable is tensor valued. We derive a multi-linear sufficient reduction for the regression or classification problem modeling the conditional distribution of the predictors given the response as a member of the quadratic exponential family. Using manifold theory, we prove the consistency and asymptotic normality of the sufficient reduction. We develop estimation procedures of
sufficient reductions for both continuous and binary tensor valued predictors. For continuous predictors, the algorithm is highly computationally efficient and is also applicable to situations where the dimension of
the reduction exceeds the sample size. We demonstrate the superior performance of our approach in simulations and real-world data examples for both continuous and binary tensor valued predictors. The \textit{Chess data} analysis results agree with a human player's understanding of the game and confirm the relevance of our approach.
\end{abstract}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

@@ -283,50 +283,35 @@

Tensors are a mathematical tool to represent data of complex structure in statistics. \textit{Tensors} are considered a generalization of matrices to higher dimensions: a tensor is a multi-dimensional array of numbers. For example, a second-order tensor can be represented as a matrix, while a third-order tensor can be represented as a cube of matrices.

Complex data are collected at different times and/or under several conditions often involving a large number of multi-indexed variables represented as tensor-valued data \parencite{KoldaBader2009}. They occur in large-scale longitudinal studies \parencite[e.g.][]{Hoff2015}, in agricultural experiments and chemometrics and spectroscopy \parencite[e.g.][]{LeurgansRoss1992,Burdick1995}, in signal and video processing where sensors produce multi-indexed data, e.g. over spatial, frequency, and temporal dimensions \parencite[e.g.][]{DeLathauwerCastaing2007,KofidisRegalia2005}, and in telecommunications \parencite[e.g.][]{DeAlmeidaEtAl2007}. Other examples of multiway data include 3D images of the brain, where the modes are the 3 spatial dimensions, and spatio-temporal weather imaging data, a set of image sequences represented as 2 spatial modes and 1 temporal mode.
Complex data are collected at different times and/or under several conditions often involving a large number of multi-indexed variables represented as tensor valued data \parencite{KoldaBader2009}. They occur in large-scale longitudinal studies \parencite[e.g.][]{Hoff2015}, in agricultural experiments and chemometrics and spectroscopy \parencite[e.g.][]{LeurgansRoss1992,Burdick1995}, in signal and video processing where sensors produce multi-indexed data, e.g. over spatial, frequency, and temporal dimensions \parencite[e.g.][]{DeLathauwerCastaing2007,KofidisRegalia2005}, and in telecommunications \parencite[e.g.][]{DeAlmeidaEtAl2007}. Other examples of multiway data include 3D images of the brain, where the modes are the 3 spatial dimensions, and spatio-temporal weather imaging data, a set of image sequences represented as 2 spatial modes and 1 temporal mode.

% \begin{itemize}
% \item Review \cite{ZhouLiZhu2013} and see how you compare with them. They focus on the forward regression model with a scalar response but they claim that "Exploiting the array structure in imaging data, the new method substantially reduces the dimensionality of imaging data, which leads to efficient estimation and prediction."
% \item Read \cite{ZhouEtAl2023} to figure out the distribution they use for the tensor-valued predictors and briefly describe what they do.
% \item Read \cite{RabusseauKadri2016} to figure out what they do. They seem to draw both the response and the predictors from tensor-normal with iid N(0,1) entries: "In order to leverage the tensor structure of the output data, we formulate the problem as the minimization of a least squares criterion subject to a multilinear rank constraint on the regression tensor. The rank constraint enforces the model to capture low-rank structure in the outputs and to explain dependencies between inputs and outputs in a low-dimensional multilinear subspace."
% \end{itemize}
Tensor regression models have been proposed to leverage the structure inherent in tensor-valued data. For instance, \textcite{HaoEtAl2021,ZhouLiZhu2013} focus on tensor covariates, while \textcite{RabusseauKadri2016,LiZhang2017,ZhouLiZhu2013} focus on tensor responses, and \textcite{Hoff2015,Lock2018} consider tensor on tensor regression. \textcite{HaoEtAl2021} modeled a scalar response as a flexible nonparametric function of tensor covariates. \textcite{ZhouLiZhu2013} assume the scalar response has a distribution in the exponential family given the tensor-valued predictors with the link modeled as a multilinear function of the predictors. \textcite{RabusseauKadri2016} model the tensor-valued response as a linear model with tensor-valued regression coefficients subject to a multilinear rank constraint. \textcite{LiZhang2017} approach the problem with a similar linear model but instead of a low-rank constraint the error term is assumed to have a separable Kronecker product structure while using a generalization of the envelope model \parencite{CookLiChiaromonte2010}. \textcite{ZhouEtAl2023} focus on partially observed tensor response given vector-valued predictors with mode-wise sparsity constraints in the regression coefficients. \textcite{Hoff2015} extends an existing bilinear regression model to a tensor on tensor of conformable modes and dimensions regression model based on a Tucker product. \textcite{Lock2018} uses a tensor contraction to build a penalized least squares model for a tensor with arbitrary number of modes and dimensions.

% - RabusseauKadri2016 Y | x for tensor Y with vector x (HOLRR; Higher Order Low-Rank Regression)
% - LiZhang2017 Y | x for tensor Y with vector x (envelope model)
% - ZhouEtAl2023 Y | x for tensor Y with vector x (sparse and partially observed)
Our approach considers the general regression problem of fitting a response of general form (univariate, multivariate, tensor-valued) on a tensor-valued predictor. We operate in the context of sufficient dimension reduction \parencite[e.g.][]{Cook1998,Li2018} based on inverse regression, which leads us to regressing the tensor valued predictor on the response. In our setting, this necessitates transforming the response to tensor-valued functions, regardless of whether it is itself tensor valued. Because of the setting, our method shares commonalities with the tensor regression models referred to above, yet the modeling and methodology are novel. Specifically, our tensor-to-tensor regression model is a generalized multi-linear model similar to the generalized linear model of \parencite{ZhouLiZhu2013}. To bypass the explosion of the number of parameters to estimate, we assume the inverse regression error covariance has Kronecker product structure as do \textcite{LiZhang2017}. Our maximum likelihood-based estimation does not require any penalty terms in contrast to the least squares and/or sparse approaches \parencite{ZhouLiZhu2013}. In the case of a tensor (multilinear) normal, given the tensor valued function of the response, our model exhibits similarities to the multilinear modeling of \textcite{Hoff2015}, but we use a generalized multilinear model and estimate the parameters with maximum likelihood instead of least squares. Moreover, a common issue in multilinear tensor regression models is the unidentifiability of the parameters, which we address in a completely different manner. For example, \textcite{LiZhang2017} developed theory that is based on orthogonal projection matrices to uniquely identify a subspace, while our approach is more general as it uses manifold theory.

% - ZhouLiZhu2013 y\in R (GLM) for y | Z, X for tensor X
% - HaoEtAl2021 y\in R for y | X for tensor X (sparse, element wise B-splines)

% Tensor regression models have been proposed to exploit the special structure of tensor covariates, e.g. \cite{HaoEtAl2021,ZhouLiZhu2013}, or tensor responses \cite{RabusseauKadri2016,LiZhang2017,ZhouEtAl2023} \cite{HaoEtAl2021} modeled a scalar response as a flexible nonparametric function of tensor covariates. \cite{ZhouLiZhu2013} assume the scalar response has a distribution in the exponential family given the tensor-valued predictors and model the link function as a multilinear function of the predictors. \cite{LiZhang2017} model the tensor-valued response as tensor normal. Rather than using $L_1$ type penalty functions to induce sparsity, they employ the envelope method (Cook, Li, and Chiaromonte Citation2010) to estimate the unknown regression coefficient. Moreover, the envelope method essentially identifies and uses the material information jointly. They develop an estimation algorithm and study the asymptotic properties of the estimator. the scalar response as These models try to utilize the sparse and low-rank structures in the tensors -- either in the regression coefficient tensor or the response tensor -- to boost performance on the regression task by reducing the number of free parameters.

Tensor regression models have been proposed to leverage the structure inherent in tensor valued data. For instance, \textcite{HaoEtAl2021,ZhouLiZhu2013} focus on tensor covariates, while \textcite{RabusseauKadri2016,LiZhang2017,ZhouLiZhu2013} focus on tensor responses, and \textcite{Hoff2015,Lock2018} consider tensor on tensor regression. \textcite{HaoEtAl2021} modeled a scalar response as a flexible nonparametric function of tensor covariates. \textcite{ZhouLiZhu2013} assume the scalar response has a distribution in the exponential family given the tensor-valued predictors with the link modeled as a multilinear function of the predictors. \textcite{RabusseauKadri2016} model the tensor-valued response as a linear model with tensor valued regression coefficients subject to a multilinear rank constraint. \textcite{LiZhang2017} approach the problem with a similar linear model but instead of a low rank constraint the error term is assumed to have a separable Kronecker product structure while using a generalization of the envelope model \parencite{CookLiChiaromonte2010}. \textcite{ZhouEtAl2023} focus on partially observed tensor response given vector-valued predictors with mode-wise sparsity constraints in the regression coefficients. \textcite{Hoff2015} extends an existing bilinear regression model to a tensor on tensor of conformable modes and dimensions regression model based on a Tucker product. \textcite{Lock2018} uses a tensor contraction to build a penalized least squares model for a tensor with arbitrary number of modes and dimensions.

Our approach considers the general regression problem of fitting a response of general form (univariate, multivariate, tensor-valued) on a tensor-valued predictor. We operate in the context of sufficient dimension reduction \parencite[e.g.][]{Cook1998,Li2018} based on inverse regression, which leads us to regressing the tensor-valued predictor on the response. In our setting, this necessitates transforming the response to tensor-valued functions, regardless of whether it is itself tensor-valued. Because of the setting, our method shares commonalities with the tensor regression models referred to above, yet the modeling and methodology are novel.
Specifically, our tensor-to-tensor regression model is a generalized multi-linear model similar to the generalized linear model of \cite{ZhouLiZhu2013}. % but with tensor valued response by applying (a known) tensor valued function to the response in an inverse regression setting, reversing the role of response and predictors.
To bypass the explosion of the number of parameters to estimate, we assume the inverse regression error covariance has Kronecker product structure as do \textcite{LiZhang2017}. Our maximum likelihood-based estimation does not require any penalty terms in contrast to the least squares and/or sparse approaches \cite{????}. In the case of a tensor (multilinear) normal, given the tensor-valued function of the response, our model exhibits similarities to the multilinear modeling of \textcite{Hoff2015}, but we use a generalized multilinear model and estimate the parameters with maximum likelihood instead of least squares. Moreover, a common issue in multilinear tensor regression models is the unidentifiability of the parameters, which we address in a completely different manner. For example, \cite{LiZhang2017} develop theory that is based on orthogonal projection matrices to uniquely identify a subspace, while our approach is more general as it uses manifold theory.
In this paper, we present a model-based \emph{Sufficient Dimension Reduction} (SDR) method for tensor-valued data with distribution in the quadratic exponential family assuming a separable Kronecker product structure of the first and second moment. The quadratic exponential family contains the multi-linear normal and the multi-linear Ising distributions, for continuous and binary tensor-valued random variables, respectively. By generalizing the parameter space to embedded manifolds we %obtain consistency and asymptotic normality results while
allow great modeling flexibility in the sufficient dimension reduction.


In this paper we present a model-based \emph{Sufficient Dimension Reduction} (SDR) method for tensor-valued data with distribution in the quadratic exponential family assuming a separable Kronecker product structure of the first and second moment. By generalizing the parameter space to embedded manifolds we obtain consistency and asymptotic normality results while allowing great modeling flexibility in the linear sufficient dimension reduction.
Multi-linear normal models have been used in various applications, including medical imaging \parencite{BasserPajevic2007,DrydenEtAl2009}, spatio-temporal data analysis \parencite{GreenewaldHero2014}, regression analysis for longitudinal relational data \parencite{Hoff2015}. One of the most important uses of the multi-linear normal (MLN) distribution, and hence tensor analysis, is perhaps in magnetic resonance imaging (MRI) \parencite{OhlsonEtAl2013}. A recent survey \parencite{WangEtAl2022} and references therein contain more information and potential applications of multilinear tensor normal models.

The quadratic exponential family contains the tensor normal and the tensor Ising distributions, for continuous and binary tensor-valued random variables, respectively.
The Ising\footnote{Also known as the \emph{Lenz-Ising} model as the physical assumptions of the model were developed by both Lenz and Ising \parencite{Niss2005} where Ising gave a closed form solution for the 1D lattice \parencite{Ising1925}.} model \parencite{Lenz1920,Ising1925,Niss2005} is a mathematical model originating in statistical physics to study ferromagnetism in a thermodynamic setting. It describes magnetic dipoles (atomic ``spins'' with values $\pm 1$) under an external magnetic field (first moments) while allowing two-way interactions (second moments) between direct neighbors on a lattice, a discrete grid. The Ising problem, as known in statistical physics, is to compute observables such as the magnetizations and correlations under the Boltzmann distribution\footnote{The Boltzmann distribution is a probability distribution over the states of a physical system in thermal equilibrium (constant temperature) that assigns higher probabilities to states with lower energy.} while the interaction structure and the magnetic fields are given. The ``reverse'' problem, where the couplings and fields are unknown and to be determined from observations of the spins, as in statistical inference, is known as the \emph{inverse Ising problem} \parencite{NguyenEtAl2017}. From this point of view, the Ising model is a member of a discrete quadratic exponential family \parencite{CoxWermuth1994,JohnsonEtAl1997} for multivariate binary outcomes where the interaction structure (non-zero correlations) is determined by the lattice. Generally, neither the values of couplings nor the interaction structure are known.

Multilinear tensor normal models have been used in various applications, including medical imaging \parencite{BasserPajevic2007,DrydenEtAl2009}, spatio-temporal data analysis \parencite{GreenewaldHero2014}, regression analysis for longitudinal relational data \parencite{Hoff2015}. One of the most important uses of the multilinear normal (MLN) distribution, and hence tensor analysis, is perhaps in magnetic resonance imaging (MRI) \parencite{OhlsonEtAl2013}. A recent survey \parencite{WangEtAl2022} and references therein contain more information and potential applications of multilinear tensor normal models.
In consequence, the Ising model is mostly used to model multivariate binary data in statistics. The states are $\{0, 1\}$ instead of $\pm 1$, with a full interaction structure. It is related to a multitude of other models, among which the most prominent are: \emph{Graphical Models} and \emph{Markov Random Fields} to describe conditional dependence \parencite{Lauritzen1996,WainwrightJordan2008,LauritzenRichardson2002}, \emph{Potts models} \parencite{Besag1974,ChakrabortyEtAl2022} which generalize the Ising model to multiple states, the \emph{multivariate Bernoulli distribution} \parencite{Whittaker1990,JohnsonEtAl1997,DaiDingWahba2013} that also accommodates higher-order interactions (three-way and higher), and \emph{(restricted) Boltzmann machines} \parencite{Smolensky1986,Hinton2002,FischerIgel2012} that introduce additional hidden variables for learning binary distributions. Most of these models can be used both in supervised and unsupervised settings. Applications of the Ising model (and variations thereof) range from modeling neural firing patterns \parencite{SchneidmanEtAl2006} and gene expression data analysis \parencite{LezonEtAl2006} to modeling financial markets \parencite{Bury2013}. See also \textcite{NguyenEtAl2017}.

The Ising model for multivariate binary outcomes belongs to the class of discrete exponential families. Its defining feature is that the sufficient statistic involves a quadratic term to capture correlations arising from pairwise interactions.
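As a toy numerical illustration of this quadratic sufficient statistic (an editor's sketch, not the paper's model or estimator; `ising_pmf` and all values are placeholders), the pmf of a small Ising-type quadratic exponential family over $\{0,1\}^d$ can be normalized by brute-force enumeration:

```python
import itertools

import numpy as np


# Sketch: p(x) proportional to exp(h'x + x'Jx) for binary x in {0,1}^d,
# with fields h (first moments) and symmetric pairwise couplings J
# (second moments). Enumerating all 2^d states is only feasible for small d.
def ising_pmf(h, J):
    d = len(h)
    states = np.array(list(itertools.product([0, 1], repeat=d)), dtype=float)
    log_w = states @ h + np.einsum("si,ij,sj->s", states, J, states)
    w = np.exp(log_w)
    return states, w / w.sum()  # normalize over all 2^d states


h = np.array([0.2, -0.1, 0.0])
J = np.zeros((3, 3))
J[0, 1] = J[1, 0] = 0.5  # positive coupling: X_1 and X_2 tend to co-occur
states, p = ising_pmf(h, J)
cov01 = p @ (states[:, 0] * states[:, 1]) - (p @ states[:, 0]) * (p @ states[:, 1])
```

With a positive coupling the model assigns extra weight to states where both variables are 1, so the covariance `cov01` comes out positive, matching the role of the quadratic term described above.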
The tensor Ising model is a higher-order Ising model for tensor-valued binary outcomes.
%From \cite{MukherjeeEtAl2020}
Higher-order Ising models arise naturally in the study of multi-atom interactions in lattice gas models, such as the square-lattice eight-vertex model, the Ashkin-Teller model, and Suzuki's pseudo-3D anisotropic model (cf. [6, 33, 36, 37, 49, 55, 56, 61, 62] and the references therein). More recently, higher-order spin systems have also been proposed for modeling peer-group effects in social networks [22]. \textcite{MukherjeeEtAl2020} proposed a maximum pseudo-likelihood estimation algorithm for a one-parameter tensor-Ising model. \efi{Daniel: comment on what these guys do and contrast with your setting} In our approach, the parameter is not constrained to be a scalar.
We derive maximum likelihood estimates for all first and second order interactions and propose a gradient-based optimization algorithm.
The $r$-tensor Ising model in statistical physics is a generalization of the Ising model to $r$-order interactions. \textcite{MukherjeeEtAl2022} study the one-parameter discrete exponential family for modeling dependent binary data where the interaction structure is given. In \textcite{LiuEtAl2023} the tensor structure itself is to be inferred. These models are fundamentally different from our approach where we rely on properties of the quadratic exponential family which models up to second-order interactions. Another important difference is that we adopt the multi-linear formulation as it is inherently linked to the observable structure of multi-way data as opposed to describing the model coefficients with an $r$-order tensor structure.

Our results in the framework of the quadratic exponential family for tensor-valued variables, i.e., consistency and asymptotic normality, apply to both tensor normal and tensor Ising models.
Our main contributions are (a) formulating the dimension reduction problem via the quadratic exponential family, which allows us to derive the sufficient dimension reduction in closed form, (b) defining the parameter space as an embedded manifold, which provides great flexibility in modeling, (c) deriving the maximum likelihood estimator of the sufficient reduction subject to multi-linear constraints and overcoming parameter non-identifiability, (d) developing estimation algorithms which in the case of multi-linear normal predictors is fast and efficient, and (e) establishing the consistency and asymptotic normality of the estimators.

Even though our motivation is rooted in the SDR perspective, our proposal concerns inference on any regression model with a tensor valued response and predictors of any type. Thus, our approach can be used as a stand-alone model for such data regardless of whether one is interested in deriving sufficient reductions and/or reducing the dimension of the data. Our results in the framework of the quadratic exponential family for tensor valued variables, i.e., consistency and asymptotic normality, apply to both multi-linear normal \parencite{KolloVonRosen2005,Hoff2011,OhlsonEtAl2013} and multi-linear Ising models, as defined in \cref{sec:ising_estimation}.

The structure of this paper is as follows. We introduce our notation in \cref{sec:notation}. \Cref{sec:problem-formulation} details the problem we consider and in \cref{sec:gmlm-model} we introduce our model. Continuing in \cref{sec:ml-estimation} we provide the basis for a general maximum likelihood estimation procedure and derive specialized methods for multi-linear normal as well as the multi-linear Ising distributions. \Cref{sec:manifolds} gives a short introduction into manifolds and provides the basis for applying the consistency and asymptotic normality results from \cref{sec:asymtotics}. Simulations for continuous and binary predictors are the subject of \cref{sec:simulations}. Finally, in \cref{sec:data-analysis} we apply our model to EEG data and perform a proof of concept data analysis where a chess board is interpreted as a collection of binary $8\times 8$ matrices.

The structure of this paper is as follows. In \cref{sec:notation} we introduce our notation. \Cref{sec:problem-formulation} describes the exact problem and in \cref{sec:gmlm-model} we introduce our model. Continuing in \cref{sec:ml-estimation} we provide the basis for a general maximum likelihood estimation procedure and derive specialized methods for tensor normal as well as the tensor Ising distributions. \Cref{sec:manifolds} gives a short introduction into manifolds and provides the basis for applying the consistency and asymptotic normality results from \cref{sec:asymtotics}. Simulations for continuous and binary predictors are the subject of \cref{sec:simulations}. Finally, in \cref{sec:data-analysis} we apply our model to EEG data and perform a proof of concept data analysis example where a chess board is interpreted as a collection of binary $8\times 8$ matrices.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Notation}\label{sec:notation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Let $\ten{A}\in\mathbb{R}^{q_1\times \ldots\times q_r}$ denote an order\footnote{Also referred to as rank, therefore the variable name $r$, but this term is \emph{not} used as it leads to confusion with the concept of rank of a matrix.} $r$ tensor, where $r\in\mathbb{N}$ is the number of modes or axes (dimensions) of $\ten{A}$ and $\ten{A}_{i_1,...,i_r} \in \mathbb{R}$ is its $(i_1, \ldots, i_r)$th entry. For example, a $p \times q$ matrix $\mat{B}$ has two modes, the rows and columns. For matrices $\mat{B}_k\in\mathbb{R}^{p_k\times q_k}$, $k\in[r] = \{1, 2, \ldots, r\}$, the \emph{multi-linear multiplication}, or \emph{Tucker operator} \parencite{Kolda2006}, is defined element wise as
Let $\ten{A}\in\mathbb{R}^{q_1\times \ldots\times q_r}$ denote an order\footnote{Also referred to as rank, therefore the variable name $r$, but this term is \emph{not} used as it leads to confusion with the concept of rank of a matrix.} $r$ tensor, where $r\in\mathbb{N}$ is the number of modes or axes (dimensions) of $\ten{A}$ and $\ten{A}_{i_1,...,i_r} \in \mathbb{R}$ is its $(i_1, \ldots, i_r)$th entry. For example, a $p \times q$ matrix $\mat{B}$ has two modes, rows and columns. For matrices $\mat{B}_k\in\mathbb{R}^{p_k\times q_k}$, $k\in[r] = \{1, 2, \ldots, r\}$, the \emph{multi-linear multiplication}, or \emph{Tucker operator} \parencite{Kolda2006}, is defined element-wise as
\begin{displaymath}
(\ten{A}\times\{\mat{B}_1, \ldots, \mat{B}_r\})_{j_1, \ldots, j_r} = \sum_{i_1, \ldots, i_r = 1}^{q_1, \ldots, q_r} \ten{A}_{i_1, \ldots, i_r}(\mat{B}_{1})_{j_1, i_1} \cdots (\mat{B}_{r})_{j_r, i_r}
\end{displaymath}
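For implementers, the element-wise definition above can be sketched in NumPy (editor's illustration only; `tucker` is a placeholder name, not code from the paper). It applies one mode product at a time and checks the result against the definition written as a single einsum contraction:

```python
import numpy as np


# Sketch: the Tucker operator A x {B1, ..., Br} applied as a chain of
# mode-k products, here for an order-3 tensor A with q_k-column factors B_k.
def tucker(A, Bs):
    for k, B in enumerate(Bs):
        # mode-k product: contract the columns of B with mode k of A,
        # then move the new axis back into position k
        A = np.moveaxis(np.tensordot(B, A, axes=(1, k)), 0, k)
    return A


rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4))                               # q1 x q2 x q3
Bs = [rng.standard_normal(s) for s in [(5, 2), (6, 3), (7, 4)]]  # p_k x q_k
T = tucker(A, Bs)                                                # p1 x p2 x p3

# element-wise definition, written as a single contraction
T_ref = np.einsum("ijk,ai,bj,ck->abc", A, *Bs)
```

Both routes produce the same $5 \times 6 \times 7$ tensor, matching the displayed sum over $i_1, \ldots, i_r$.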
@@ -334,25 +319,24 @@ which results in an order $r$ tensor of dimension $p_1\times ...\times p_k$. The
\begin{displaymath}
\ten{A}\times_k\mat{B}_k = \ten{A}\times\{\mat{I}_{q_1}, \ldots, \mat{I}_{q_{k-1}}, \mat{B}_{k}, \mat{I}_{q_{k+1}}, \ldots, \mat{I}_{q_r}\}.
\end{displaymath}
The notation $\ten{A}\mlm_{k\in S}\mat{B}_k$ is short hand for the iterative application of the mode product for all indices in $S\subseteq[r]$. For example $\ten{A}\mlm_{k\in\{2, 5\}}\mat{B}_k = \ten{A}\times_2\mat{B}_2\times_5\mat{B}_5$. By only allowing $S$ to be a set, this notation is unambiguous because the mode product commutes for different modes; i.e., $\ten{A}\times_j\mat{B}_j\times_k\mat{B}_k = \ten{A}\times_k\mat{B}_k\times_j\mat{B}_j$ for $j\neq k$. For example, let $\mat{A}, \mat{B}_1, \mat{B}_2$ be matrices (of matching dimensions). The bilinear mode-product and multi-linear multiplication relate to the well known matrix-matrix multiplications as
The notation $\ten{A}\mlm_{k\in S}\mat{B}_k$ is shorthand for the iterative application of the mode product for all indices in $S\subseteq[r]$. For example $\ten{A}\mlm_{k\in\{2, 5\}}\mat{B}_k = \ten{A}\times_2\mat{B}_2\times_5\mat{B}_5$. By only allowing $S$ to be a set, this notation is unambiguous because the mode product commutes for different modes; i.e., $\ten{A}\times_j\mat{B}_j\times_k\mat{B}_k = \ten{A}\times_k\mat{B}_k\times_j\mat{B}_j$ for $j\neq k$. For example, let $\mat{A}, \mat{B}_1, \mat{B}_2$ be matrices (of matching dimensions). The bilinear mode-product and multi-linear multiplication relate to the well-known matrix-matrix multiplications as
\begin{displaymath}
\mat{A}\times_1\mat{B}_1 = \mat{B}_1\mat{A}, \qquad
\mat{A}\times_2\mat{B}_2 = \mat{A}\t{\mat{B}_2}, \qquad
\mat{A}\mlm_{k = 1}^2\mat{B}_k = \mat{A}\mlm_{k \in \{1, 2\}}\mat{B}_k = \mat{B}_1\mat{A}\t{\mat{B}_2}.
\end{displaymath}
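The three matrix identities above can be checked numerically (again an illustrative editor's sketch with placeholder names, not the paper's code):

```python
import numpy as np


# Sketch: for a matrix A, the mode products reduce to ordinary matrix
# multiplication: A x_1 B1 = B1 A and A x_2 B2 = A B2'.
def mode_product(A, B, k):
    # contract B's columns with axis k of A (zero-based axis for mode k+1)
    return np.moveaxis(np.tensordot(B, A, axes=(1, k)), 0, k)


rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B1 = rng.standard_normal((5, 3))
B2 = rng.standard_normal((6, 4))

m1 = mode_product(A, B1, 0)     # A x_1 B1
m2 = mode_product(A, B2, 1)     # A x_2 B2
both = mode_product(m1, B2, 1)  # A x_1 B1 x_2 B2
```

Here `m1` equals `B1 @ A`, `m2` equals `A @ B2.T`, and `both` equals `B1 @ A @ B2.T`, mirroring the displayed relations.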

%Matrices and tensors can be \emph{vectorized} by the \emph{vectorization} operator $\vec$.
The operator $\vec$ maps an array to a vector. Specifically, $\vec(\mat{B})$ stands for the $pq \times 1$ vector of the $p \times q$ matrix $\mat{B}$ resulting from stacking the columns of $\mat{B}$ one after the other. For a tensor $\ten{A}$ of order $r$ and dimensions $q_1, \ldots, q_r$, $\vec(\ten{A})$ is the $q_1 q_2 \ldots q_r \times 1$ vector with the elements of $\ten{A}$ stacked one after the other in the order $r$ then $r-1$, and so on. For example, if $\ten{A}$ is a 3-dimensional array, $\vec(\ten{A})=\t{(\t{\vec(\ten{A}_{:,:,1})},\t{\vec(\ten{A}_{:,:,2})},\ldots,\t{\vec(\ten{A}_{:,:,q_3})})}$. We use the notation $\ten{A}\equiv \ten{B}$ for objects $\ten{A}, \ten{B}$ of any shape if and only if $\vec(\ten{A}) = \vec(\ten{B})$.
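In NumPy terms (an editor's aside, not from the manuscript), this column-stacking order is exactly column-major ("Fortran") raveling, and the slice-by-slice description above falls out of it:

```python
import numpy as np


# Sketch: vec stacks columns (first index fastest), i.e. order="F" raveling.
def vec(A):
    return np.ravel(A, order="F")


B = np.array([[1, 2],
              [3, 4]])     # columns (1, 3) and (2, 4)
v = vec(B)                 # -> [1, 3, 2, 4]

# for an order-3 tensor, vec stacks vec(A[:, :, 0]), vec(A[:, :, 1]), ...
A3 = np.arange(24).reshape(2, 3, 4)
stacked = np.concatenate([vec(A3[:, :, i]) for i in range(4)])
```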
The \emph{inner product} between two tensors of the same order and dimensions is
\begin{displaymath}
\langle\ten{A}, \ten{B}\rangle = \sum_{i_1, \ldots, i_r} \ten{A}_{i_1, \ldots, i_r}\ten{B}_{i_1, \ldots, i_r}.
\end{displaymath}
This leads to the definition of the \emph{Frobenius norm} for tensors, $\|\ten{A}\|_F = \sqrt{\langle\ten{A}, \ten{A}\rangle}$, which is the straightforward extension of the Frobenius norm for matrices and vectors. The \emph{outer product} between two tensors $\ten{A}$ of dimensions $q_1, \ldots, q_r$ and $\ten{B}$ of dimensions $p_1, \ldots, p_l$ is a tensor $\ten{A}\circ\ten{B}$ of order $r + l$ and dimensions $q_1, \ldots, q_r, p_1, \ldots, p_l$, such that
\begin{displaymath}
\ten{A}\circ\ten{B} \equiv (\vec\ten{A})\t{(\vec{\ten{B}})}.
\end{displaymath}
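The equivalence $\ten{A}\circ\ten{B} \equiv (\vec\ten{A})\t{(\vec{\ten{B}})}$ can be checked numerically; the sketch below (helper name \texttt{vec} is ours, column-major convention assumed) compares the vectorizations of both sides:

```python
import numpy as np

def vec(T):
    """Column-major vectorization."""
    return np.asarray(T).reshape(-1, order="F")

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))   # order r = 2
B = rng.standard_normal((4,))     # order l = 1

# outer product: (A o B)[i, j, k] = A[i, j] * B[k], dimensions 2 x 3 x 4
outer = np.multiply.outer(A, B)
assert outer.shape == (2, 3, 4)

# A o B and (vec A)(vec B)^T have the same vectorization
assert np.allclose(vec(outer), vec(np.outer(vec(A), vec(B))))
```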
Let $\ten{K} : \mathbb{R}^{q_1, \ldots, q_{2 r}}\to\mathbb{R}^{q_1 q_{r + 1}, \ldots, q_r q_{2 r}}$ be defined element-wise with indices $1\leq i_j + 1\leq q_j q_{r + j}$ for $j = 1, \ldots, r$ as
\begin{align*}
\ten{K}(\ten{A})_{i_1 + 1, \ldots, i_r + 1} &= \ten{A}_{\lfloor i_1 / q_{r + 1}\rfloor + 1, \ldots, \lfloor i_r / q_{2 r} \rfloor + 1, (i_1\operatorname{mod}q_{r + 1}) + 1, \ldots, (i_r\operatorname{mod}q_{2 r}) + 1}.
\end{align*}
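For $r = 2$ the operator $\ten{K}$ reduces to an index rearrangement of an order-4 tensor into a matrix, and one can check numerically that it maps outer products of matrices to Kronecker products. A sketch (our helper name \texttt{K2}; zero-based indices, so the floor/mod formula appears without the $+1$ shifts):

```python
import numpy as np

def K2(A):
    """The operator K for r = 2: an order-4 tensor of dimensions
    q1 x q2 x q3 x q4 becomes a (q1 q3) x (q2 q4) matrix with
    K(A)[i1, i2] = A[i1 // q3, i2 // q4, i1 % q3, i2 % q4]."""
    q1, q2, q3, q4 = A.shape
    return A.transpose(0, 2, 1, 3).reshape(q1 * q3, q2 * q4)

rng = np.random.default_rng(2)
B = rng.standard_normal((2, 3))
C = rng.standard_normal((4, 5))

# K maps the outer product B o C to the Kronecker product B (x) C
assert np.allclose(K2(np.multiply.outer(B, C)), np.kron(B, C))
```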

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Problem Formulation}\label{sec:problem-formulation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Our goal is to infer the cumulative distribution function (cdf) $F$ of $Y\mid \ten{X}$, where $\ten{X}$ is assumed to admit $r$-tensor structure of dimension $p_1\times \ldots \times p_r$ with continuous or discrete entries, and the response $Y$ is unconstrained. The predictor $\ten{X}$ is a complex object; to simplify the problem we assume there exists a tensor-valued function of lower dimension $\ten{R}:\ten{X}\mapsto \ten{R}(\ten{X})$ such that
\begin{displaymath}
F(Y\mid \ten{X}) = F(Y\mid \ten{R}(\ten{X})).
\end{displaymath}
\end{equation}
According to \eqref{eq:inverse-regression-sdr}, a \textit{sufficient statistic} $\ten{R}(\ten{X})$ for $Y$ in the inverse regression $\ten{X}\mid Y$, where $Y$ is considered as a parameter indexing the model, is also a \textit{sufficient reduction} for $\ten{X}$ in the forward regression $Y\mid\ten{X}$. %The equivalent inverse regression in \eqref{eq:inverse-regression-sdr} provides exhaustive characterization of $\ten{R}(\ten{X})$.

The factorization theorem is the usual tool to identify sufficient statistics, and it requires a distributional model. In this paper, we assume the distribution of $\ten{X}\mid Y$ belongs to the \emph{quadratic exponential family} in order to (a) simplify modeling and (b) keep estimation feasible. Specifically, we assume that $\ten{X}\mid Y$ follows a full rank quadratic exponential family distribution with density
\begin{align}
f_{\mat{\eta}_y}(\ten{X}\mid Y = y)
&= h(\ten{X})\exp(\t{\mat{\eta}_y}\mat{t}(\ten{X}) - b(\mat{\eta}_y)) \nonumber \\
&= h(\ten{X})\exp(\langle \mat{t}_1(\ten{X}), \mat{\eta}_{1y} \rangle + \langle \mat{t}_2(\ten{X}), \mat{\eta}_{2y} \rangle - b(\mat{\eta}_{y})) \label{eq:quad-density}
\end{align}
where $\mat{t}_1(\ten{X})=\vec \ten{X}$ and $\mat{t}_2(\ten{X})$ is linear in $\ten{X}\circ\ten{X}$. The dependence of $\ten{X}$ on $Y$ is fully captured in the natural parameter $\mat{\eta}_y$. The function $h$ is non-negative real-valued and $b$ is assumed to be at least twice continuously differentiable and strictly convex. An important feature of the \emph{quadratic exponential family} is that the distribution of its members is fully characterized by their first two moments. Distributions within the quadratic exponential family include the \emph{multi-linear normal} (\cref{sec:tensor-normal-estimation}) and the \emph{multi-linear Ising model} (\cref{sec:ising_estimation}, a generalization of the (inverse) Ising model, which is a multivariate Bernoulli distribution with up to second-order interactions), as well as mixtures of these two.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The Generalized Multi-Linear Model}\label{sec:gmlm-model}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

In model \eqref{eq:quad-density}, the dependence between $\ten{X}$ and $Y$ is absorbed in $\mat{\eta}_y$, and $\mat{t}(\ten{X})$ is the minimal sufficient statistic for the \textit{pseudo}-parameter\footnote{$\mat{\eta}_y$ is a function of the response $Y$, so it is not a parameter in the formal statistical sense. We view it as a parameter in order to leverage \eqref{eq:inverse-regression-sdr} and derive the sufficient reduction from the inverse regression.} $\mat{\eta}_y = (\mat{\eta}_{1y}, \mat{\eta}_{2y})$ with
\begin{align}\label{eq:t-stat}
\mat{t}(\ten{X}) &= (\mat{t}_1(\ten{X}),\mat{t}_2(\ten{X}))=(\vec{\ten{X}}, \mat{T}_2\vech((\vec\ten{X})\t{(\vec\ten{X})})),
\end{align}
where $\mat{D}_p$ is the \emph{duplication matrix}.
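The relation between $\vech$ and the duplication matrix, $\mat{D}_p\vech(\mat{S}) = \vec(\mat{S})$ for symmetric $\mat{S}$, underlies the statistic $\mat{t}_2$. A numerical sketch (helper names \texttt{vech} and \texttt{duplication\_matrix} are ours; $\mat{T}_2$ taken as the identity):

```python
import numpy as np

def vech(S):
    """Half-vectorization: stack the columns of the lower triangular part."""
    p = S.shape[0]
    return np.concatenate([S[j:, j] for j in range(p)])

def duplication_matrix(p):
    """D_p satisfying D_p vech(S) = vec(S) for every symmetric p x p S."""
    D = np.zeros((p * p, p * (p + 1) // 2))
    k = 0
    for j in range(p):
        for i in range(j, p):
            D[i + j * p, k] = 1.0   # position of S[i, j] in vec(S)
            D[j + i * p, k] = 1.0   # position of S[j, i] in vec(S)
            k += 1
    return D

x = np.arange(1.0, 5.0)   # stand-in for vec(X), p = 4
S = np.outer(x, x)        # (vec X)(vec X)^T is symmetric
t2 = vech(S)              # t_2(X) when T_2 is the identity
assert np.allclose(duplication_matrix(4) @ t2, S.reshape(-1, order="F"))
```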
\begin{equation}\label{eq:quadratic-exp-fam}
f_{\eta_y}(\ten{X}\mid Y = y) = h(\ten{X})\exp\left(\langle \vec \ten{X}, \mat{\eta}_{1y} \rangle + \langle \vec(\ten{X}\circ\ten{X}), \t{(\mat{T}_2\pinv{\mat{D}_p})}\mat{\eta}_{2y} \rangle - b(\mat{\eta}_y)\right).
\end{equation}

The exponential family in \eqref{eq:quadratic-exp-fam} is easily generalizable to any order. This, though, would result in the number of parameters becoming prohibitive to estimate, which is also the reason why we opted for the second-order exponential family in our formulation.

By the equivalence in \eqref{eq:inverse-regression-sdr}, in order to find the sufficient reduction $\ten{R}(\ten{X})$ we need to infer $\mat{\eta}_{1y}$ and $\mat{\eta}_{2y}$. This is reminiscent of generalized linear modeling, which we extend to a multi-linear formulation next.

Suppose $\ten{F}_y$ is a known mapping of $y$ with zero expectation $\E_Y\ten{F}_Y = 0$. We assume the dependence of $\ten{X}$ and $Y$ is reflected only in the first parameter and let
\begin{align}
\mat{\eta}_{1y} &= \vec{\overline{\ten{\eta}}} + \mat{B}\vec\ten{F}_y, \label{eq:eta1-manifold} \\
\mat{\eta}_{2} &= \t{(\pinv{(\mat{T}_2\pinv{\mat{D}_p})})}\vec(c\,\mat{\Omega}), \label{eq:eta2-manifold}
\end{align}
where $\overline{\ten{\eta}}\in\mathbb{R}^{p_1\times\ldots\times p_r}$, $\mat{\Omega} \in \mathbb{R}^{p \times p}$ is positive definite with $p = \prod_{j = 1}^{r} p_j$, and $c\in\mathbb{R}$ is a known constant determined by the distribution to ease modeling. That is, we assume that only $\mat{\eta}_{1y}$ depends on $Y$ through $\mat{B}$. The second parameter $\mat{\eta}_2$ captures the second-order interaction structure of $\ten{X}$, which we assume not to depend on the response $Y$. To relate individual modes of $\ten{X}$ to the response, allowing flexibility in modeling, we assume $\ten{F}_y$ takes values in $\mathbb{R}^{q_1\times \ldots\times q_r}$; that is, $\ten{F}_y$ is a tensor-valued independent variable. This, in turn, leads to imposing a corresponding tensor structure on the regression parameter $\mat{B}$. Thus, \eqref{eq:eta1-manifold} becomes
\begin{align}
\mat{\eta}_{1y} &=
\vec\biggl(\overline{\ten{\eta}} + \ten{F}_y\mlm_{j = 1}^{r}\mat{\beta}_j\biggr), \label{eq:eta1}
|
\begin{align*}
f_{\theta}(\mat{x}\mid Y = y)
&= h(\mat{x})\exp(\langle\mat{x}, \mat{\eta}_{1y}(\mat{\theta})\rangle + \langle\vec(\mat{x}\circ\mat{x}), \mat{\eta}_2(\mat{\theta})\rangle - b(\mat{\eta}_y(\mat{\theta}))) \\
% &= h(\mat{x})\exp(\t{\mat{\eta}_{1y}(\theta)}\mat{x} + \t{\vec(\mat{x}\circ\mat{x})}\mat{\eta}_2(\mat{\theta}) - b(\mat{\eta}_y(\mat{\theta}))) \\
&= h(\mat{x})\exp(\t{(\overline{\mat{\eta}} + \mat{\beta}\mat{f}_y)}\mat{x} + c\,\t{\mat{x}}\mat{\Omega}\,\mat{x} - b(\mat{\eta}_y(\mat{\theta}))),
\end{align*}
using the relation of $\mat{\theta}$ to the natural parameters given by $\mat{\eta}_{1y}(\mat{\theta}) = \overline{\mat{\eta}} + \mat{\beta}\mat{f}_y$ and $\mat{\eta}_2(\mat{\theta}) = c\,\mat{\Omega}$.
% where the number of unknown parameters is $p + \dim(\StiefelNonCompact{p}{q}) + \dim(\SymPosDefMat{p}) = p\frac{p + 2 q + 3}{2}$.
\end{example}

\begin{example}[Matrix-valued $\mat{X}$ ($r = 2$)]
The maximum likelihood estimate of $\mat{\theta}_0$ is the solution to the optimization problem
\end{equation}
with $\hat{\mat{\theta}}_n = (\vec\widehat{\overline{\ten{\eta}}}, \vec\widehat{\mat{B}}, \vech\widehat{\mat{\Omega}})$, where $\widehat{\mat{B}} = \bigkron_{k = r}^{1}\widehat{\mat{\beta}}_k$ and $\widehat{\mat{\Omega}} = \bigkron_{k = r}^{1}\widehat{\mat{\Omega}}_k$.

A straightforward and general method for parameter estimation is \emph{gradient descent}. To apply gradient-based optimization, we compute the gradients of $l_n$ in \cref{thm:grad}.
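The Kronecker structure $\mat{B} = \bigkron_{k = r}^{1}\mat{\beta}_k$ pairs with vectorization through the identity $\vec(\mat{\beta}_1\mat{F}\t{\mat{\beta}_2}) = (\mat{\beta}_2\otimes\mat{\beta}_1)\vec\mat{F}$ for $r = 2$; a quick numerical check (not from the paper, column-major $\vec$ assumed):

```python
import numpy as np

rng = np.random.default_rng(4)
F  = rng.standard_normal((3, 4))   # F_y with q1 = 3, q2 = 4
b1 = rng.standard_normal((5, 3))   # beta_1
b2 = rng.standard_normal((6, 4))   # beta_2

# vec(F x_1 b1 x_2 b2) = vec(b1 F b2^T)
lhs = (b1 @ F @ b2.T).reshape(-1, order="F")
# (beta_2 kron beta_1) vec F, matching B = bigkron_{k=r}^{1} beta_k
rhs = np.kron(b2, b1) @ F.reshape(-1, order="F")
assert np.allclose(lhs, rhs)
```

This is why the multi-linear predictor in \eqref{eq:eta1} is a structured special case of an unconstrained coefficient matrix acting on $\vec\ten{F}_y$.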
\begin{theorem}\label{thm:grad}
For $n$ i.i.d.\ observations $(\ten{X}_i, y_i)$, $i = 1, \ldots, n$, the log-likelihood is of the form in \eqref{eq:log-likelihood} with $\mat{\theta}$ being the collection of all GMLM parameters $\overline{\ten{\eta}}$ and $\mat{\beta}_k$, $\mat{\Omega}_k$ for $k = 1, \ldots, r$, where ${\mat{B}} = \bigkron_{k = r}^{1}{\mat{\beta}}_k$ and ${\mat{\Omega}} = \bigkron_{k = r}^{1}{\mat{\Omega}}_k$. Let $\ten{G}_2(\mat{\eta}_y)$ be a tensor of dimensions $p_1, \ldots, p_r$ such that
If $\mat{T}_2$ is the identity matrix $\mat{I}_{p(p + 1) / 2}$, then $\ten{G}_2(\mat{\eta}_y) = \ten{g}_2(\mat{\eta}_y)$.
\end{theorem}

Although the general case of any GMLM model can be fitted via gradient descent using \cref{thm:grad}, this may be very inefficient. In \cref{thm:grad}, $\mat{T}_2$ can be used to introduce flexible second moment structures. For example, it allows modeling effects differently for predictor components, as described in \cref{sec:ising_estimation} after Eqn.~\eqref{eq:ising-cond-prob}. In the remainder, we focus on $\mat{T}_2$'s that are identity matrices. This simplifies the estimation algorithm and speeds up the numerical calculation in the case of multi-linear normal predictors. An iterative cyclic updating scheme is derived in \cref{sec:tensor-normal-estimation}, which has much faster convergence, is stable, and does not require hyperparameters, as will be discussed later. On the other hand, the Ising model does not allow such a scheme. There we need to use a gradient-based method, which is the subject of \cref{sec:ising_estimation}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Multi-Linear Normal}\label{sec:tensor-normal-estimation}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The \emph{multi-linear normal} is the extension of the matrix normal to tensor-valued random variables and a member of the quadratic exponential family \eqref{eq:quadratic-exp-fam} under \eqref{eq:eta2}. \textcite{Dawid1981,Arnold1981} introduced the term matrix normal and, in particular, \textcite{Arnold1981} provided several theoretical results, such as its density, moments, and conditional distributions of its components. The matrix normal distribution is a bilinear normal distribution; a distribution of a two-way array, each component representing a vector of observations \parencite{OhlsonEtAl2013}. \textcite{KolloVonRosen2005,Hoff2011,OhlsonEtAl2013} presented the extension of the bilinear to the multilinear normal distribution, using a parallel extension of bilinear matrices to multi-linear tensors \parencite{Comon2009}.

The defining feature of the matrix normal distribution, and its multi-linear extension, is the Kronecker product structure of its covariance. This formulation, where the covariates are multivariate normal with multiway covariance structure modeled as a Kronecker product of matrices of much lower dimension, aims to overcome the significant modeling and computational challenges arising from the high computational complexity of manipulating tensor representations \parencite[see, e.g.,][]{HillarLim2013,WangEtAl2022}.

Multi-linear normal %Kronecker-separable covariance
models have been used in various applications, including
medical imaging \parencite{BasserPajevic2007,DrydenEtAl2009