Starting point
Fit ICA on new models.
The most direct extension is to run ICA Lens on newer, larger, multilingual,
or domain-specific LLMs. This tests whether the same non-Gaussian structure
appears across model families and gives researchers an inspectable basis
before model-specific SAE dictionaries exist.
Start with the Qwen3.6-27B one-layer tutorial
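As a rough illustration of the fitting step, the sketch below runs scikit-learn's FastICA on a synthetic activation matrix. The matrix `acts`, its shape, the Laplace sources, and the row-normalization preprocessing are assumptions made for illustration; the tutorial's actual pipeline may differ.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Stand-in for a (tokens x hidden_dim) activation matrix from one layer.
# Real activations would come from a model forward pass; here heavy-tailed
# Laplace sources are mixed so there is non-Gaussian structure to recover.
n_tokens, d = 5000, 16
sources = rng.laplace(size=(n_tokens, d))
mixing = rng.normal(size=(d, d))
acts = sources @ mixing.T

# Assumed preprocessing: scale every row to unit L2 norm.
acts = acts / np.linalg.norm(acts, axis=1, keepdims=True)

ica = FastICA(n_components=d, whiten="unit-variance", max_iter=1000, random_state=0)
scores = ica.fit_transform(acts)   # signed component scores, shape (n_tokens, d)
reading_map = ica.components_      # unmixing matrix, shape (d, d)

# Excess kurtosis per component; positive values indicate heavier-than-Gaussian tails.
centered = scores - scores.mean(axis=0)
excess_kurtosis = (centered**4).mean(axis=0) / (centered**2).mean(axis=0) ** 2 - 3.0
```

Ranking components by `excess_kurtosis` is one simple way to see which directions are most clearly non-Gaussian on a new model.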
Research direction
Go beyond the d-component limit.
Standard compact ICA returns at most d components for a d-dimensional
activation space. One way around this hard limit is to fit ICA on different
datasets or distributions, then compare the resulting bases in the same
activation space. Different corpora may expose different non-Gaussian
directions, giving a route toward a richer component inventory without
relying on a single overcomplete fit. Another route is to test overcomplete
ICA directly, alongside alternative estimators such as adaptive or
deflationary FastICA, Infomax, JADE, and heavy-tail-aware objectives.
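One way to compare bases fitted on different corpora is to match components by absolute cosine similarity and count how many directions in the second basis have no close partner in the first. The sketch below does this on toy data; the function name `match_components`, the 0.95 threshold, and the synthetic bases are all illustrative assumptions, not part of the release.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_components(basis_a, basis_b):
    """Match rows of two (k, d) bases by absolute cosine similarity."""
    a = basis_a / np.linalg.norm(basis_a, axis=1, keepdims=True)
    b = basis_b / np.linalg.norm(basis_b, axis=1, keepdims=True)
    sim = np.abs(a @ b.T)                     # the sign of an ICA component is arbitrary
    rows, cols = linear_sum_assignment(-sim)  # one-to-one matching, maximal total similarity
    return rows, cols, sim[rows, cols]

rng = np.random.default_rng(0)
d = 8

# Toy basis A: orthonormal rows standing in for components fit on corpus A.
q, _ = np.linalg.qr(rng.normal(size=(d, d)))
basis_a = q.T

# Toy basis B: five directions shared with A (permuted and sign-flipped),
# plus three directions that are not aligned with any single A component.
shared = -basis_a[[3, 0, 6, 1, 4]]
leftover = basis_a[[2, 5, 7]]
novel = (leftover + np.roll(leftover, 1, axis=0)) / np.sqrt(2)
basis_b = np.vstack([shared, novel])

rows, cols, sims = match_components(basis_a, basis_b)
threshold = 0.95                              # arbitrary cutoff for "same direction"
n_shared = int((sims >= threshold).sum())
n_candidate_new = len(sims) - n_shared
```

Components that fall below the threshold are only candidates for new directions; they would still need inspection, since a low match can also reflect estimation noise.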
Research direction
Automatic annotation for ICA components.
Once an ICA model is fit, the explorer already exposes top examples,
opposite-side examples, signed scores, ERF, trace plots, and prompt tests.
A natural next step is to use this evidence to propose component labels
automatically, then hand the candidate labels to humans for verification.
Research direction
Develop a theory of ERF beyond a useful heuristic.
The current Effective Receptive Field diagnostic asks how much left context
is sufficient to recover a component response. Future work can make this
more principled: test robustness, compare recovery criteria, connect ERF
to annotation difficulty, and relate component scope to model computation.
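To make "compare recovery criteria" concrete, here is a minimal sketch of one possible criterion: the smallest number of trailing tokens whose response lands within a relative tolerance of the full-context response. The function `effective_receptive_field`, the tolerance, and the toy response function are hypothetical; in practice `response_fn` would run the model on the truncated context and read off one component score.

```python
import numpy as np

def effective_receptive_field(response_fn, tokens, rel_tol=0.1):
    """Smallest k such that the response on the last k tokens is within
    rel_tol of the full-context response (one possible recovery criterion)."""
    full = response_fn(tokens)
    for k in range(1, len(tokens) + 1):
        truncated = response_fn(tokens[-k:])
        if abs(truncated - full) <= rel_tol * abs(full):
            return k
    return len(tokens)

def toy_response(tokens):
    # Hypothetical stand-in for a component score at the final position:
    # each step further back in the context contributes half as much.
    values = np.asarray(tokens, dtype=float)
    weights = 0.5 ** np.arange(len(values))[::-1]  # last token has weight 1
    return float(values @ weights)

tokens = [1.0] * 10
erf = effective_receptive_field(toy_response, tokens, rel_tol=0.1)
```

Swapping the relative-error test for a different criterion, such as a rank-based or sign-based one, only changes the condition inside the loop, which makes the criteria easy to compare side by side.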
Research direction
Scale FastICA to huge open models.
The current pipeline loads activation matrices into memory, which becomes
limiting for 27B-scale models and beyond. A scalable ICA Lens would need
better memory management: distributed fitting, streaming or blockwise
whitening, activation offloading, and algorithms that can move between
CPU, disk, and GPU without treating the full dataset as one in-memory
matrix.
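Blockwise whitening, one of the ingredients listed above, can be sketched by accumulating first and second moments one block at a time so the full activation matrix never has to sit in memory. The class name `StreamingWhitener` and the toy blocks are assumptions for illustration, not the current pipeline.

```python
import numpy as np

class StreamingWhitener:
    """Accumulate first and second moments block by block, then whiten."""

    def __init__(self, dim):
        self.n = 0
        self.total = np.zeros(dim)
        self.outer = np.zeros((dim, dim))

    def update(self, block):
        block = np.asarray(block, dtype=np.float64)
        self.n += block.shape[0]
        self.total += block.sum(axis=0)
        self.outer += block.T @ block

    def finalize(self, eps=1e-12):
        mean = self.total / self.n
        cov = self.outer / self.n - np.outer(mean, mean)
        eigvals, eigvecs = np.linalg.eigh(cov)
        # Row j of `whitener` is eigenvector j scaled by 1 / sqrt(eigenvalue j).
        whitener = (eigvecs / np.sqrt(eigvals + eps)).T
        return mean, whitener

rng = np.random.default_rng(0)
d = 6
mixing = rng.normal(size=(d, d))
blocks = [rng.normal(size=(1000, d)) @ mixing.T + 3.0 for _ in range(5)]

state = StreamingWhitener(d)
for block in blocks:   # in practice each block would be read from disk in turn
    state.update(block)
mean, whitener = state.finalize()

# Check on the toy data: whitened activations have identity covariance.
whitened = (np.vstack(blocks) - mean) @ whitener.T
cov_whitened = np.cov(whitened, rowvar=False, bias=True)
```

Subtracting `np.outer(mean, mean)` from the accumulated second moment can lose precision when the mean is large relative to the variance, so a production version would likely want a numerically safer update. The ICA rotation itself would still need a streaming or minibatch solver on top of this step.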
Research direction
Fit ICA across activation sites, not only layers.
ICA Lens currently focuses on embeddings and residual-stream states. A
broader analysis could include attention outputs, MLP outputs, residual
updates, shared bases across layers, or multiple sites at once. This would
help study how directions emerge, transform, persist, or disappear through
the forward pass.
Research direction
Test whether ICA components can become practical steering handles.
Results on SAEBench's Targeted Probe Perturbation (TPP) metric suggest that
zeroing a small number of ICA coordinates can selectively move
probe-relevant behavior. A practical steering workflow would need to edit
signed scores, reconstruct through the writing map (the pseudoinverse of
the reading map R), and patch the result back into the residual stream. It
would also need to check whether pseudoinverse quality and conditioning are
good enough for stable edits outside benchmark probes, and whether those
edits hold up against strong task-specific baselines.
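The edit-and-reconstruct steps described above can be sketched in NumPy under simple assumptions: a square reading map `R`, a stored mean, and scores defined as `R @ (x - mean)`. All values below are random stand-ins, and patching the edited vector back into a real model's residual stream is left out.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 12

mean = rng.normal(size=d)
R = rng.normal(size=(d, d))   # hypothetical reading map: scores = R @ (x - mean)
W = np.linalg.pinv(R)         # writing map: pseudoinverse of R

x = rng.normal(size=d)        # one residual-stream vector (toy)
scores = R @ (x - mean)

# Sanity checks before editing: round-trip error and conditioning of R.
roundtrip_error = np.linalg.norm(mean + W @ scores - x)
condition_number = np.linalg.cond(R)

# Edit: zero a single coordinate, then reconstruct through the writing map.
target = 3
edited = scores.copy()
edited[target] = 0.0
x_edited = mean + W @ edited

# The resulting change is a multiple of one writing direction (column of W).
delta = x_edited - x
expected_delta = -scores[target] * W[:, target]
```

A large `condition_number` or a non-negligible `roundtrip_error` would be a warning sign that edits to one coordinate leak into others after reconstruction.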
Research direction
Develop ICA-SAE hybrid methods.
ICA and SAEs expose related but non-redundant directions in activation
space. A hybrid method could combine ICA's compact non-Gaussian basis with
SAE-style overcomplete sparse reconstruction, using the strengths of both
to improve component discovery, labeling, and intervention.
Research direction
Apply ICA Lens to vision-language models.
ICA can also be fit on image-text datasets rather than text-only corpora.
For VLMs, including image-token activations in the decomposition could help
reveal what visual token positions represent in deeper layers and how visual evidence mixes
with text through the model.
Research direction
Study activation geometry after normalization.
Row normalization makes ICA fitting more stable and supports the idea that
directions carry important structure even when raw activation norms are
set aside. This opens a deeper question: how much of LLM representation
geometry becomes simpler on the normalized sphere, and what does that
reveal about features, contexts, and interventions?
More ideas
This list is meant to grow.
Other directions include auditing SAE labels through ICA-SAE overlap,
overcomplete ICA variants, task-specific decompositions, and better
interfaces for comparing components across models. The release is intended
as infrastructure for these follow-up projects, not only as a static paper
artifact.