scores.KNNScore
Abstract base class for KNN distance-based confidence scores.
Usage
scores.KNNScore()

Computes distance-based confidence scores where low scores indicate samples similar to the training distribution (likely inliers) and high scores indicate samples deviating from the training distribution (likely outliers).
Parameters
k: int = 1
Number of nearest neighbors used to compute the distance score.

stat: ("max", "mean", "median", "min") = "max"
Statistic applied to aggregate distances across the k neighbors.

pca: TensorPCA or None = None
Optional PCA for dimensionality reduction prior to scoring.

save_index: bool or Path = False
If True, the HNSW index is saved to a default file. If a Path is provided (must end in .bin), the index is saved there.
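The score these parameters control can be illustrated with a small self-contained sketch. This is plain brute-force NumPy, not the library's HNSW-backed implementation; the function name `knn_score` is hypothetical and only mirrors the `k`/`stat` parameters documented above.

```python
import numpy as np

def knn_score(train, queries, k=1, stat="max"):
    """Illustrative kNN distance score: low = inlier-like, high = outlier-like.

    A brute-force sketch of the idea, not seapig's HNSW-based implementation.
    """
    # Pairwise Euclidean distances between each query and each training sample.
    dists = np.linalg.norm(queries[:, None, :] - train[None, :, :], axis=-1)
    # Distances to the k nearest training neighbors for each query.
    knn = np.sort(dists, axis=1)[:, :k]
    # Aggregate across the k neighbors with the chosen statistic.
    agg = {"max": np.max, "mean": np.mean, "median": np.median, "min": np.min}[stat]
    return agg(knn, axis=1)

train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
queries = np.array([[0.1, 0.0],   # near the training cloud -> low score
                    [5.0, 5.0]])  # far from it -> high score
scores = knn_score(train, queries, k=2, stat="max")
```

Concrete subclasses (see "See Also" below) differ only in the distance metric used in place of the Euclidean norm here.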
Attributes
| Name | Description |
|---|---|
| k | Number of nearest neighbors used to compute the distance score. |
Methods
| Name | Description |
|---|---|
| fit() | Train a confidence score based on sample embeddings. |
fit()
Train a confidence score based on sample embeddings.
Usage
fit(
X=None,
Y=None,
model=None,
loaders=None,
outdir=None,
prefix=None,
q=False
)

This method supports two usage modes:
- Precomputed embeddings: Supply training embeddings via X and optional calibration embeddings via Y.
- On-the-fly extraction: Supply a model with an .embed() method and a dictionary of DataLoaders to extract embeddings automatically.

You must use either embeddings (X/Y) OR model + loaders, but not both.
# Mode 1: Precomputed embeddings
from seapig.scores import EuclideanScore
my_score = EuclideanScore(k=2)
my_score.fit(X=train_embs, Y=val_embs)
# Mode 2: On-the-fly extraction
my_score = EuclideanScore(k=2)
my_score.fit(model=model, loaders={"train": train_loader, "val": val_loader})

Parameters
X: torch.Tensor | None = None
A torch.Tensor with training sample embeddings. Required when not using model and loaders.

Y: torch.Tensor | None = None
A torch.Tensor with calibration sample embeddings. Optional.

model: torch.nn.Module | None = None
A torch.nn.Module with an .embed() method. Required when not using X.

loaders: dict[str, DataLoader[torch.Tensor | dict[str, torch.Tensor]]] | None = None
A dict with DataLoader objects. Required keys: ["train"]. Optional key: ["val"]. Required when using model.

outdir: Path | None = None
A pathlib.Path pointing to a directory for saving/loading embeddings. Only used with model and loaders.

prefix: str | None = None
A str used as filename prefix for saved embeddings. Only used with model and loaders.

q: bool | float = False
A float or bool indicating whether outliers should be filtered from the training distribution before fitting. Defaults to False.
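The intent of the q parameter can be sketched with a self-contained NumPy example. The semantics assumed here (drop training samples whose nearest-neighbor distance exceeds the q-th quantile of those distances) are an illustration of quantile-based pre-fit filtering, not necessarily seapig's exact behavior; the helper `filter_outliers` is hypothetical.

```python
import numpy as np

def filter_outliers(train, q=0.95):
    """Hypothetical pre-fit filtering: drop training samples whose
    nearest-neighbor distance exceeds the q-th quantile of those distances.
    An illustration of the idea, not seapig's exact implementation."""
    # Pairwise Euclidean distances within the training set.
    dists = np.linalg.norm(train[:, None, :] - train[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)      # exclude each sample's self-distance
    nn = dists.min(axis=1)               # nearest-neighbor distance per sample
    keep = nn <= np.quantile(nn, q)      # drop the most isolated samples
    return train[keep]

# Nine tightly clustered points plus one obvious outlier.
train = np.vstack([np.zeros((9, 2)), [[100.0, 100.0]]])
filtered = filter_outliers(train, q=0.9)  # the isolated point is removed
```

Filtering like this keeps a few mislabeled or corrupted training samples from inflating the "inlier" region that the fitted score considers normal.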
See Also
- seapig.scores.knn.EuclideanScore: Concrete score using Euclidean distance.
- seapig.scores.knn.CosineScore: Concrete score using cosine distance.
- seapig.scores.knn.MahalanobisScore: Concrete score using Mahalanobis distance.