scores.utils.TensorPCA

Tensor-based PCA.

Usage

scores.utils.TensorPCA()

Supports standard (linear) PCA and an optional Random Fourier Feature (RFF) mapping prior to PCA.

Operation Modes

linear: standard linear PCA.
rff: apply a Random Fourier Feature mapping before PCA.
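
To make the rff mode concrete, here is a minimal sketch of a Random Fourier Feature map in plain PyTorch. It assumes the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2). The helper name rff_map and this parameterisation of gamma are illustrative assumptions, not TensorPCA's confirmed internals.

```python
import math
import torch

def rff_map(X, gamma, M, generator=None):
    """Hypothetical helper: map (N, D) inputs to (N, M) random Fourier features.

    Assumes the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2), whose
    spectral density is Gaussian with variance 2 * gamma per dimension.
    """
    D = X.shape[1]
    # Random projection directions and phase offsets.
    W = torch.randn(D, M, dtype=X.dtype, generator=generator) * math.sqrt(2.0 * gamma)
    b = torch.rand(M, dtype=X.dtype, generator=generator) * 2.0 * math.pi
    return math.sqrt(2.0 / M) * torch.cos(X @ W + b)

g = torch.Generator().manual_seed(0)
X = torch.randn(5, 3, dtype=torch.float64, generator=g)
Z = rff_map(X, gamma=0.5, M=2000, generator=g)

# Inner products of the features approximate the kernel matrix.
K_approx = Z @ Z.T
K_exact = torch.exp(-0.5 * torch.cdist(X, X) ** 2)
```

Under this assumption, linear PCA on Z approximates kernel PCA on X, which is the reason for applying the mapping before the decomposition.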

Mode selection follows the constructor arguments: the RFF branch is inferred only when both gamma and M are provided, unless mode is set explicitly. Supplying only one of these values does not enable the RFF mapping by itself.
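
The selection rule can be sketched as a small pure-Python function. The name infer_mode is hypothetical and the function only mirrors the rule as described here; how TensorPCA validates inconsistent arguments (for example mode="rff" without gamma) is not covered by this page and is left out.

```python
def infer_mode(gamma=None, M=None, mode=None):
    """Hypothetical helper mirroring the documented selection rule."""
    if mode is not None:
        return mode      # an explicit mode always wins
    if gamma is not None and M is not None:
        return "rff"     # both RFF arguments supplied
    return "linear"      # neither, or only one, supplied

print(infer_mode())                                 # linear
print(infer_mode(gamma=0.5))                        # linear
print(infer_mode(gamma=0.5, M=256))                 # rff
print(infer_mode(gamma=0.5, M=256, mode="linear"))  # linear
```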

Saving / Loading

Persist the module with the standard PyTorch state-dict API. The module registers persistent buffers for PCA and RFF state:

torch.save(instance.state_dict(), "tpca.pt")

tpca2 = TensorPCA(n_components=..., gamma=..., M=..., mode=...)
sd = torch.load("tpca.pt")
tpca2.load_state_dict(sd)

The custom _load_from_state_dict accepts placeholder or differently shaped tensors and sets or registers buffers as needed, which avoids size-mismatch errors when loading into a fresh instance.
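
As an illustration of the general pattern, the sketch below uses a toy module whose buffer shape is only known after fitting. TinyStats and its single mean buffer are hypothetical; this is one common way to write such an override, not a copy of TensorPCA's implementation.

```python
import io
import torch
from torch import nn

class TinyStats(nn.Module):
    """Hypothetical stand-in for a module whose buffer shape is set at fit time."""

    def __init__(self):
        super().__init__()
        # Placeholder; persistent buffers are included in state_dict().
        self.register_buffer("mean", torch.zeros(0, dtype=torch.float64))

    def _load_from_state_dict(self, state_dict, prefix, *args, **kwargs):
        key = prefix + "mean"
        if key in state_dict and state_dict[key].shape != self.mean.shape:
            # Resize the placeholder so the default loader sees matching shapes.
            self.mean = torch.empty_like(state_dict[key])
        super()._load_from_state_dict(state_dict, prefix, *args, **kwargs)

fitted = TinyStats()
fitted.mean = torch.arange(4, dtype=torch.float64)  # pretend this came from fitting

# Round-trip the state dict through an in-memory file.
buf = io.BytesIO()
torch.save(fitted.state_dict(), buf)
buf.seek(0)

fresh = TinyStats()                      # still holds the empty placeholder
fresh.load_state_dict(torch.load(buf))   # no size-mismatch error
```

Without the override, load_state_dict would report a size mismatch between the empty placeholder and the saved tensor.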

Notes

PCA internals are stored in float64 for numerical fidelity. During preprocessing, inputs are cast to float64 to match the stored mean.

See https://arxiv.org/pdf/2505.15284 for motivation behind RFF-PCA.

Examples

import torch
from seapig.scores.utils import TensorPCA
pca = TensorPCA(n_components=0.90)
X = torch.randn(100, 32)
pca.fit(X)
Z = pca.transform(X)        # projected to lower dimension
X_rec, err = pca.reconstruct(X)  # reconstruction and per-sample L2 error
print(err)

tensor([2.0058, 1.9291, 1.8213, 1.7371, 2.4389, 1.7318, 1.8956, 1.6139, 2.7999,
        1.3375, 1.8060, 1.5045, 1.0151, 1.5380, 1.7393, 2.3805, 1.9327, 1.2392,
        1.8813, 1.4295, 1.4910, 1.8490, 1.7813, 2.3088, 2.0619, 1.5390, 1.9259,
        1.9640, 1.4325, 1.7071, 1.9314, 2.0367, 1.1526, 2.0713, 0.5289, 1.2012,
        2.3030, 1.5801, 1.4331, 0.9826, 1.5877, 2.2835, 1.0568, 1.1616, 2.0296,
        2.1218, 2.2270, 1.6374, 1.6830, 2.2425, 1.7872, 2.3373, 1.6275, 1.1028,
        1.5290, 1.5754, 1.7815, 1.5694, 2.1446, 1.7456, 1.5061, 1.5564, 2.0463,
        1.7947, 1.5016, 1.9780, 1.7616, 1.9326, 2.1084, 0.9584, 1.8802, 2.0631,
        1.8875, 1.5057, 1.6701, 2.1497, 1.4630, 0.8322, 1.4700, 1.5116, 1.8499,
        1.5569, 1.9381, 1.4179, 1.2711, 1.7404, 1.1604, 1.6745, 2.1154, 2.3305,
        2.0704, 0.8755, 1.9547, 1.9037, 1.9707, 1.8827, 0.9674, 1.8882, 1.4635,
        1.7248], dtype=torch.float64)

Methods

Name	Description
__init__()	Initialise TensorPCA.
finalize()	Finalize partial fit: compute covariance SVD and set PCA params.
fit()	Fit PCA on the input data X.
fit_transform()	Fit PCA on X and return the projected components.
inverse_transform()	Reconstruct samples from principal component scores.
partial_fit()	Process a single batch for incremental PCA.
reconstruct()	Reconstruct an input and return the L2 reconstruction error.
reset_partial()	Reset internal accumulators used for partial fitting.
transform()	Project input samples onto the retained principal components.

__init__()

Initialise TensorPCA.

Usage

__init__(n_components=0.9, gamma=None, M=None, mode=None)

Parameters

n_components: int or float = 0.90: If an int, the exact number of principal components to retain (must be > 0). If a float in (0, 1], the minimum cumulative explained variance to retain. Defaults to 0.90 (90% variance).
gamma: float or None = None: Bandwidth parameter for the RFF kernel. If provided together with M, RFF mode is enabled automatically.
M: int or None = None: Number of RFF random features (must be > input dimensionality D). If provided together with gamma, RFF mode is enabled automatically.
mode: (linear, rff) or None = None: Explicit mode override. When None (the default), the mode is inferred from gamma and M.
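
The two interpretations of n_components can be sketched in pure Python. The helper select_num_components is hypothetical, and clipping an integer request to the number of available components is an assumption rather than documented behaviour.

```python
from itertools import accumulate

def select_num_components(explained_variance, n_components):
    """Hypothetical helper: turn n_components into a component count q.

    explained_variance must be sorted in descending order.
    """
    if isinstance(n_components, int):
        if n_components <= 0:
            raise ValueError("an integer n_components must be > 0")
        # Assumption: never return more components than are available.
        return min(n_components, len(explained_variance))
    if not 0.0 < n_components <= 1.0:
        raise ValueError("a float n_components must lie in (0, 1]")
    total = sum(explained_variance)
    # Smallest q whose cumulative explained-variance ratio reaches the target.
    for q, cum in enumerate(accumulate(explained_variance), start=1):
        if cum / total >= n_components:
            return q
    return len(explained_variance)

variances = [4.0, 2.0, 1.0, 1.0]
print(select_num_components(variances, 2))     # 2
print(select_num_components(variances, 0.75))  # 2
print(select_num_components(variances, 0.90))  # 4
```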

finalize()

Finalize partial fit: compute covariance SVD and set PCA params.

Usage

finalize()

This method computes the overall mean and centred covariance from accumulated sums and performs SVD to extract principal components.
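
One way to write this computation is shown below. The symbols are introduced here for illustration, and the unbiased normalisation by N - 1 is an assumption, since this page does not state which normalisation the module uses.

```latex
% N: total number of samples seen; s: accumulated sum; S: accumulated outer products
s = \sum_{i=1}^{N} x_i, \qquad S = \sum_{i=1}^{N} x_i x_i^{\top}

\mu = \frac{s}{N}, \qquad
C = \frac{1}{N-1}\left( S - N \mu \mu^{\top} \right)

% C is symmetric positive semi-definite, so its SVD coincides with its eigendecomposition
C = U \Lambda U^{\top}
```

The columns of U are the principal axes and the diagonal of \Lambda holds the explained variances in descending order.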

fit()

Fit PCA on the input data X.

Usage

fit(X, Y=None)

Convenience method that runs a single-batch partial_fit followed by finalize. For large datasets or streaming data, use the incremental partial_fit / finalize interface instead.

Parameters

X: torch.Tensor: Input data of shape (N, D).
Y: None = None: Ignored. Present for API compatibility.

fit_transform()

Fit PCA on X and return the projected components.

Usage

fit_transform(X, Y=None)

Parameters

X: torch.Tensor: Input data of shape (N, D).
Y: None = None: Ignored. Present for API compatibility.

Returns

torch.Tensor: Projected data of shape (N, q) where q is the number of retained components.

inverse_transform()

Reconstruct samples from principal component scores.

Usage

inverse_transform(Z)

Parameters

Z: torch.Tensor: Component scores of shape (N, q).

Returns

torch.Tensor: Reconstructed samples in the preprocessed space, of shape (N, D) in linear mode or (N, M) in RFF mode.

partial_fit()

Process a single batch for incremental PCA.

Usage

partial_fit(X)

This accumulates sufficient statistics (sum of samples and sum of outer products) which are later finalised in finalize() to produce the PCA decomposition.
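
The accumulation can be sketched in plain PyTorch as follows. The accumulator names (n, s, ss) and the N - 1 normalisation are assumptions made for illustration; the sketch only shows that batch-wise sufficient statistics recover the same covariance as a single pass over all the data.

```python
import torch

torch.manual_seed(0)
X = torch.randn(200, 5, dtype=torch.float64)

# Hypothetical accumulators: sample count, sum of samples, sum of outer products.
n = 0
s = torch.zeros(5, dtype=torch.float64)
ss = torch.zeros(5, 5, dtype=torch.float64)

# "partial_fit" step: one update per batch (batches of 64, 64, 64 and 8 rows).
for batch in X.split(64):
    n += batch.shape[0]
    s += batch.sum(dim=0)
    ss += batch.T @ batch

# "finalize" step: mean, centred covariance, then SVD.
mean = s / n
cov = (ss - n * torch.outer(mean, mean)) / (n - 1)
U, S, _ = torch.linalg.svd(cov)  # columns of U are principal axes
```

Only the D-by-D accumulators are kept between batches, so memory use does not grow with the number of samples.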

reconstruct()

Reconstruct an input and return the L2 reconstruction error.

Usage

reconstruct(X)
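
For linear mode, the relationship between projection, reconstruction and per-sample L2 error can be sketched in plain PyTorch. The variable names are illustrative, and whether TensorPCA measures the error in the input space or in the preprocessed space is not stated on this page, so treat this as a conceptual sketch only.

```python
import torch

torch.manual_seed(0)
X = torch.randn(50, 6, dtype=torch.float64)
mean = X.mean(dim=0)

# Principal axes from the SVD of the centred data (rows of Vh).
_, _, Vh = torch.linalg.svd(X - mean, full_matrices=False)
V = Vh.T[:, :3]                      # keep q = 3 components

Z = (X - mean) @ V                   # projection, shape (N, q)
X_rec = Z @ V.T + mean               # reconstruction, shape (N, D)
err = torch.linalg.vector_norm(X - X_rec, dim=1)  # per-sample L2 error, shape (N,)
```

With all D components retained the error drops to numerical zero, so larger values of err indicate samples that are poorly described by the retained subspace.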

reset_partial()

Reset internal accumulators used for partial fitting.

Usage

reset_partial()

transform()

Project input samples onto the retained principal components.

Usage

transform(X)

Parameters

X: torch.Tensor: Input data of shape (N, D).

Returns

torch.Tensor: Projected data of shape (N, q) where q is the number of retained components.

See Also

__init__()