Sampling

Sampling routines for optimal weighted least squares.

This module provides routines for sampling from the optimal sampling distribution. Given a polynomial space \(V:= \text{span} \{P_\lambda: \lambda \in \Lambda\}\) defined by a multi-index set \(\Lambda \subset \mathbb{N}_0^d\), there exists an optimal sampling distribution \(\nu = \nu(V)\) that minimizes the number of samples required for the weighted least squares projection onto \(V\) to be quasi-optimal. This module also contains routines for arcsine sampling and for determining the sample size that ensures quasi-optimality of the weighted least squares estimator.

Functions:

Name	Description
`arcsine`	Arcsine density function.
`sample_arcsine`	Samples from the arcsine distribution.
`sample_squared_legendre`	Samples from the distribution with squared Legendre density.
`optimal_sample_size`	Optimal sample size for quasi-optimality.
`sample_optimal_distribution`	Samples from the optimal distribution.

multichaos.sampling.arcsine

arcsine(x: np.ndarray) -> np.ndarray

Arcsine density function.

This evaluates the arcsine density function

\[ \begin{equation} p_d(x) = \prod_{i=1}^d \frac{1}{\pi \sqrt{1- x_i^2}}, \quad x \in (-1, 1)^d. \end{equation} \]

Parameters:

Name	Type	Description	Default
`x`	`np.ndarray`	Input array.	required

Returns:

Type	Description
`np.ndarray`	Arcsine density function evaluated at `x`.
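
To make the formula concrete, here is a minimal NumPy sketch of the product density. The name `arcsine_density_sketch` is hypothetical, and taking the product over the last axis of `x` is an assumption; the library's own axis convention is not stated here.

```python
import numpy as np


def arcsine_density_sketch(x):
    """Product arcsine density on (-1, 1)^d.

    Assumption: the d coordinates of each point lie along the last axis of x.
    """
    x = np.asarray(x, dtype=float)
    # One-dimensional arcsine density, evaluated coordinate-wise.
    factors = 1.0 / (np.pi * np.sqrt(1.0 - x**2))
    # Multiply the d one-dimensional factors of each point.
    return np.prod(factors, axis=-1)


# Two points in (-1, 1)^2.
points = np.array([[0.0, 0.0], [0.5, -0.5]])
values = arcsine_density_sketch(points)
```

At the origin each factor equals \(1/\pi\), so the first value is \(1/\pi^2\).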

multichaos.sampling.sample_arcsine

sample_arcsine(size: tuple[int, ...]) -> np.ndarray

Samples from the arcsine distribution.

This is based on the fact that if \(U \sim \mathcal{U}(-\pi, \pi)\), then \(X = \sin(U) \sim p_1\) follows the arcsine distribution.

Parameters:

Name	Type	Description	Default
`size`	`tuple`	Size of the sample.	required

Returns:

Type	Description
`np.ndarray`	Sample from the arcsine distribution with shape `size`.
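
The transformation described above can be sketched in a few lines of NumPy. The function name `sample_arcsine_sketch` and the `rng` argument are hypothetical additions for reproducibility and are not part of the documented signature.

```python
import numpy as np


def sample_arcsine_sketch(size, rng=None):
    """Draw arcsine-distributed samples on (-1, 1) via X = sin(U)."""
    rng = np.random.default_rng(rng)
    # U is uniform on (-pi, pi); sin(U) then follows the arcsine distribution.
    u = rng.uniform(-np.pi, np.pi, size=size)
    return np.sin(u)


samples = sample_arcsine_sketch((10_000, 2), rng=0)
```

A quick sanity check is that the empirical mean is close to 0 and the empirical variance is close to 1/2, the variance of the arcsine distribution on \((-1, 1)\).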

multichaos.sampling.sample_squared_legendre

sample_squared_legendre(n: int, size: int) -> np.ndarray

Samples from the distribution with squared Legendre density.

This function samples from the distribution with squared Legendre density \(P_n^2\), where \(P_n\) is the Legendre polynomial of degree \(n\). This is done using rejection sampling with arcsine proposals, since

\[ \begin{equation} |P_n(x)| \leq 2 p_1(x)^{1/2}, \quad x \in (-1, 1), \,\, \forall n \in \mathbb{N}_0. \end{equation} \]

Parameters:

Name	Type	Description	Default
`n`	`int`	Degree of Legendre polynomial.	required
`size`	`int`	Number of samples.	required

Returns:

Type	Description
`np.ndarray`	Sample from the squared Legendre density with shape `size`.
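
The rejection step can be illustrated as follows. This is a sketch under stated assumptions, not the library's implementation: it assumes \(P_n\) is normalized with respect to the uniform probability measure on \((-1, 1)\), i.e. \(P_n = \sqrt{2n+1}\) times the classical Legendre polynomial, so that the bound above reads \(P_n^2 \leq 4 p_1\). The name `sample_squared_legendre_sketch` and the `rng` argument are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def sample_squared_legendre_sketch(n, size, rng=None):
    """Rejection sampling from the squared Legendre density with arcsine proposals."""
    rng = np.random.default_rng(rng)
    # Coefficient vector selecting the classical Legendre polynomial of degree n.
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    accepted = np.empty(0)
    while accepted.size < size:
        batch = max(2 * size, 16)
        # Arcsine proposals via sin(U), U uniform on (-pi, pi).
        x = np.sin(rng.uniform(-np.pi, np.pi, size=batch))
        # Squared polynomial, normalized w.r.t. the uniform probability measure.
        target = (2 * n + 1) * legendre.legval(x, coeffs) ** 2
        # Accept with probability target / (4 * p_1(x)), written without division.
        u = rng.uniform(size=batch)
        keep = 4.0 * u <= target * np.pi * np.sqrt(1.0 - x**2)
        accepted = np.concatenate([accepted, x[keep]])
    return accepted[:size]


draws = sample_squared_legendre_sketch(n=1, size=20_000, rng=0)
```

Under these assumptions the envelope constant relative to the Lebesgue density \(P_n^2 / 2\) is 2, so roughly half of the proposals are accepted regardless of \(n\).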

multichaos.sampling.optimal_sample_size

optimal_sample_size(dim: float, risk: float = 0) -> int

Optimal sample size for quasi-optimality.

Determines the optimal number of samples \(N\) required to ensure quasi-optimality of a weighted least squares estimator onto a polynomial space \(V\). For \(p \in (0, 1)\) and \(\theta = \frac{1}{2}(1 - \log 2)\), the condition

\[ \begin{equation} N \geq \frac{m}{\theta} \log \left(\frac{2m}{1 - p}\right), \end{equation} \]

where \(m = \text{dim}(V)\), ensures that

\[ \begin{equation} \mathbb{P}(\Vert \mathbf{G} - \mathbf{I}\Vert \leq 1/2) \geq p, \end{equation} \]

where \(\mathbf{G} \in \mathbb{R}^{m \times m}\) denotes the Gramian matrix of the weighted least squares problem and \(\mathbf{I} \in \mathbb{R}^{m \times m}\) the identity matrix.

Parameters:

Name	Type	Description	Default
`dim`	`float`	Dimension of the polynomial space used for projection.	required
`risk`	`float`	Probability \(p\) for quasi-optimality to hold.	`0`

Returns:

Type	Description
`int`	Optimal sample size.
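
As a worked instance of the bound, the following sketch returns the smallest integer \(N\) satisfying the condition above. The name `optimal_sample_size_sketch`, the argument name `p`, and the rounding up to the next integer are assumptions made for illustration; the library function may round or parametrize differently.

```python
import math


def optimal_sample_size_sketch(dim, p=0.0):
    """Smallest integer N with N >= (m / theta) * log(2 m / (1 - p))."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    theta = 0.5 * (1.0 - math.log(2.0))
    bound = dim / theta * math.log(2.0 * dim / (1.0 - p))
    return math.ceil(bound)


# The required sample size grows like m log m in the dimension m of the space.
for m in (10, 100, 1000):
    print(m, optimal_sample_size_sketch(m, p=0.9))
```

Because \(p\) enters only through the logarithm, asking for a higher probability increases \(N\) slowly compared with increasing \(m\).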

multichaos.sampling.sample_optimal_distribution

sample_optimal_distribution(index_set: np.ndarray, size: int) -> np.ndarray

Samples from the optimal distribution.

Given a polynomial subspace \(\text{span} \{P_\lambda: \lambda \in \Lambda\} \subset L^2([-1, 1]^d)\) defined by an index set \(\Lambda \subset \mathbb{N}_0^d\), the density of the optimal distribution is defined by

\[ \begin{equation} \frac{1}{|\Lambda|} \sum_{\lambda \in \Lambda} P_\lambda^2. \end{equation} \]

Parameters:

Name	Type	Description	Default
`index_set`	`np.ndarray`	Index set defining the polynomial subspace of shape `(n_basis, dim)`.	required
`size`	`int`	Number of samples.	required

Returns:

Type	Description
`np.ndarray`	Samples with shape `(size, dim)`.
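
Since the density above is an equal-weight mixture of product densities, one common way to sample from it is to pick a multi-index \(\lambda\) uniformly from \(\Lambda\) and then draw each coordinate from the univariate squared Legendre density of degree \(\lambda_i\). The sketch below follows that idea; whether the library proceeds this way is an assumption. The names `sample_optimal_sketch` and `_squared_legendre_sketch` are hypothetical, and \(P_\lambda\) is assumed to be a tensor product of Legendre polynomials normalized with respect to the uniform probability measure on \([-1, 1]^d\).

```python
import numpy as np
from numpy.polynomial import legendre


def _squared_legendre_sketch(n, size, rng):
    """Univariate rejection sampler with arcsine proposals (illustrative helper)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    accepted = np.empty(0)
    while accepted.size < size:
        batch = max(2 * size, 16)
        x = np.sin(rng.uniform(-np.pi, np.pi, size=batch))
        target = (2 * n + 1) * legendre.legval(x, coeffs) ** 2
        u = rng.uniform(size=batch)
        keep = 4.0 * u <= target * np.pi * np.sqrt(1.0 - x**2)
        accepted = np.concatenate([accepted, x[keep]])
    return accepted[:size]


def sample_optimal_sketch(index_set, size, rng=None):
    """Mixture sampling from (1 / |Lambda|) * sum of squared basis polynomials."""
    rng = np.random.default_rng(rng)
    index_set = np.asarray(index_set, dtype=int)
    n_basis, dim = index_set.shape
    # Step 1: choose one multi-index per sample, uniformly at random.
    chosen = index_set[rng.integers(n_basis, size=size)]
    samples = np.empty((size, dim))
    # Step 2: fill all entries sharing a degree with independent univariate draws.
    for degree in np.unique(chosen):
        mask = chosen == degree
        samples[mask] = _squared_legendre_sketch(int(degree), int(mask.sum()), rng)
    return samples


# Tensor-product index set of degree at most 1 in two dimensions.
index_set = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
points = sample_optimal_sketch(index_set, size=5_000, rng=0)
```

Grouping the entries by degree keeps the number of calls to the univariate sampler equal to the number of distinct degrees in the index set rather than to `size * dim`.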