Install all required dependencies
%pip install git+https://github.com/BonnerLab/ccn-tutorial.git
Revealing the latent structure of fMRI responses to natural scenes
August 26, 2023
March 10, 2024
Here’s a link to this notebook on Google Colab.
The natural scenes dataset (NSD) is the largest fMRI dataset on human vision, with 7T fMRI responses (1.8 mm isotropic voxels) obtained from 8 adult participants (Allen et al. 2021). The experiment involved a continuous recognition task while participants observed natural scene images from the Microsoft Common Objects in Context (COCO) database (Lin et al. 2014).
Let’s load the dataset. It contains neural responses to 700 images, measured at ~15,000 voxels that were reliably modulated by the visual stimuli during the NSD experiment.
from collections import Counter
import warnings
import numpy as np
import pandas as pd
import xarray as xr
from torchtext.datasets import IMDB
from torchtext.data.utils import get_tokenizer
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA, NMF
from IPython.display import display, HTML
from matplotlib.figure import Figure
from mpl_toolkits.axes_grid1 import ImageGrid
from matplotlib import pyplot as plt
from matplotlib_inline.backend_inline import set_matplotlib_formats
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
import seaborn as sns
from utilities.brain import (
    load_dataset,
    average_data_across_repetitions,
    load_stimuli,
    plot_brain_map,
)
%matplotlib inline
sns.set_theme(
    context="notebook",
    style="white",
    palette="deep",
    rc={"legend.edgecolor": "None"},
)
set_matplotlib_formats("svg")
pd.set_option("display.max_rows", 5)
pd.set_option("display.max_columns", 10)
pd.set_option("display.precision", 3)
pd.set_option("display.show_dimensions", False)
xr.set_options(display_max_rows=3, display_expand_data=False)
warnings.filterwarnings("ignore")
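The loading cell isn’t reproduced here; below is a minimal sketch built from the helpers imported from utilities.brain, assuming load_dataset returns the single-trial betas and average_data_across_repetitions collapses the repeated presentations of each stimulus (consistent with the output that follows):

# Sketch of the loading step (exact helper signatures are assumptions):
# fetch the betas, then average the repeated presentations of each stimulus
data = average_data_across_repetitions(load_dataset())
data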
<xarray.DataArray 'fMRI betas' (presentation: 700, neuroid: 15724)> Size: 44MB
0.4915 0.2473 0.08592 0.05828 -0.1315 ... -0.2126 -0.6315 -0.5751 -0.5354
Coordinates: (3/4)
    x            (neuroid) uint8 16kB 12 12 12 12 12 12 12 ... 72 72 72 72 72 72
    y            (neuroid) uint8 16kB 21 22 22 22 22 22 23 ... 29 29 30 30 30 31
    ...          ...
    stimulus_id  (presentation) object 6kB 'image02950' ... 'image72948'
Dimensions without coordinates: presentation, neuroid
Attributes: (3/8)
    resolution:      1pt8mm
    preprocessing:   fithrf_GLMdenoise_RR
    ...              ...
    postprocessing:  averaged across first two repetitions
We utilized the NSD single-trial betas, preprocessed in 1.8 mm volumetric space and denoised using the GLMdenoise technique (version 3; “betas_fithrf_GLMdenoise_RR”). The betas were converted to Z-scores within each scanning session and averaged across repetitions for each stimulus.
Here are some examples of stimuli seen by the participants.
def view_stimuli(stimuli: xr.DataArray, *, n: int = 10) -> None:
    fig = plt.figure(figsize=(12, 4))
    image_grid = ImageGrid(
        fig=fig,
        rect=(1, 1, 1),
        nrows_ncols=(1, n),
        share_all=True,
    )
    for i_image in range(n):
        image_grid[i_image].imshow(stimuli[i_image])
        image_grid[i_image].axis("off")
    fig.show()
stimuli = load_stimuli()
view_stimuli(stimuli)
Now we can apply PCA to the neural responses and plot the eigenspectrum of their covariance!
def view_eigenspectrum(pca: PCA, *, log: bool = False) -> None:
    eigenvalues = pd.DataFrame(pca.explained_variance_, columns=["eigenvalue"]).assign(
        rank=1 + np.arange(pca.n_components_)
    )
    fig, ax = plt.subplots(figsize=(6, 6))
    sns.lineplot(
        ax=ax,
        data=eigenvalues.loc[eigenvalues["rank"] < pca.n_components_],
        x="rank",
        y="eigenvalue",
    )
    sns.despine(ax=ax)
    if log:
        ax.set_xscale("log")
        ax.set_yscale("log")
        ax.set_ylim(bottom=1)
    fig.show()
pca = PCA()
pca.fit(data)
view_eigenspectrum(pca)
There are some simple ways to visualize and interpret the principal components.
def view_images_as_scatterplot(
    x: np.ndarray, y: np.ndarray, *, stimuli: xr.DataArray
) -> None:
    fig, ax = plt.subplots(figsize=(10, 10))
    for i_stimulus in range(len(stimuli)):
        image_box = OffsetImage(stimuli[i_stimulus].values, zoom=0.3)
        image_box.image.axes = ax
        ab = AnnotationBbox(
            image_box,
            xy=(x[i_stimulus], y[i_stimulus]),
            xycoords="data",
            frameon=False,
            pad=0,
        )
        ax.add_artist(ab)
    ax.set_xlim([x.min(), x.max()])
    ax.set_ylim([y.min(), y.max()])
    ax.axis("off")
    fig.show()
def view_images_at_poles(
    x: np.ndarray,
    *,
    stimuli: xr.DataArray,
    n_images_per_pole: int = 5,
    label: str | None = None,
) -> None:
    indices = np.argsort(x, axis=0)
    fig = plt.figure(figsize=(12, 4))
    image_grid = ImageGrid(
        fig=fig,
        rect=(1, 1, 1),
        nrows_ncols=(1, 2 * n_images_per_pole + 1),
        share_all=True,
    )
    for i_image in range(n_images_per_pole):
        image_grid[i_image].imshow(stimuli[indices[i_image]])
        image_grid[i_image].axis("off")
        image_grid[-i_image - 1].imshow(stimuli[indices[-i_image - 1]])
        image_grid[-i_image - 1].axis("off")
    for ax in image_grid:
        ax.axis("off")
    if label is not None:
        ax = image_grid[n_images_per_pole]
        ax.text(
            0.5,
            0.5,
            label,
            horizontalalignment="center",
            verticalalignment="center",
            transform=ax.transAxes,
        )
    fig.show()
The first method is to plot the stimuli on a scatter plot, designating their X and Y coordinates to be their scores along two principal components of interest. This allows us to observe potential clustering of the stimuli.
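As a sketch (assuming the first two principal components are the ones of interest):

# Project the neural responses onto the principal components...
scores = pca.transform(data)

# ...and place each stimulus at its scores along the first two PCs
view_images_as_scatterplot(scores[:, 0], scores[:, 1], stimuli=stimuli)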
Alternatively, we can focus on the stimuli with the highest or lowest scores along a given principal component. This provides simple clues of what this PC might be sensitive to, which could be visual features ranging from low to high complexity.
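For instance, to view the stimuli at the poles of the first principal component:

# Show the stimuli with the lowest and highest scores along PC 1
view_images_at_poles(scores[:, 0], stimuli=stimuli, label="PC 1")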
Interpreting PCs can be challenging, especially when we rely solely on visual inspection. This difficulty arises in part because many natural features are non-negative, whereas principal components weight the data with a mixture of positive and negative coefficients. As a result, methods like nonnegative matrix factorization (NMF), which constrain their factors to be non-negative, often offer more interpretable dimensions than PCA.
Similarly, we can inspect the stimuli with highest or closest-to-zero values along each dimension.
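Here’s a minimal sketch: since NMF requires non-negative inputs, we first rescale each voxel’s responses to [0, 1] with the MinMaxScaler imported above; the number of components is an arbitrary choice for illustration.

# NMF needs non-negative inputs, so rescale each voxel's responses to [0, 1]
data_nonnegative = MinMaxScaler().fit_transform(data)

# Factorize the responses into non-negative dimensions (10 is arbitrary here)
nmf = NMF(n_components=10)
nmf_scores = nmf.fit_transform(data_nonnegative)

# Inspect the stimuli at the extremes of the first NMF dimension
view_images_at_poles(nmf_scores[:, 0], stimuli=stimuli, label="NMF dim 1")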
Nonetheless, PCA has unique benefits that shouldn’t be overlooked. For instance, PCA offers a closed-form solution and deterministic outcomes, and its mathematical properties are well characterized. Moreover, because PCA is essentially a rotation of the data, it preserves all the original information in the dataset.
On this plot, it looks as though the first few PCs have substantial variance while the rest are negligible, suggesting a low-dimensional structure.
However, when dealing with high-dimensional data that span several orders of magnitude, it’s more insightful to visualize them on a logarithmic scale, which makes many key statistical trends more apparent. Let’s try visualizing the spectrum on a logarithmic scale for both axes:
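Since view_eigenspectrum already takes a log flag, this is just a second call:

view_eigenspectrum(pca, log=True)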
On a log-log scale, the spectrum shows a trend that looks remarkably linear, suggesting that the eigenspectrum might obey a power-law distribution:
\begin{align*}
\log{\lambda_\text{rank}} &\approx \alpha \log{\left( \text{rank} \right)} + c \\
\lambda_\text{rank} &\propto \left( \text{rank} \right)^\alpha
\end{align*}
There appears to be no obvious cut-off point in this power law, suggesting that there might be information across all ranks. The number of effective dimensions here is likely much higher than what we would have expected from simply viewing the eigenspectrum on a linear scale.
A power law is a relationship of the form $f(x) \propto x^{\alpha}$, where $\alpha$ is termed the index of the power law, or the power-law exponent. It implies a scale-free structure, since rescaling the input only rescales the output: $f(kx) \propto (kx)^\alpha = k^\alpha x^\alpha \propto f(x)$.
Power laws are ubiquitous in nature, arising in all sorts of systems.
Nevertheless, a power-law relationship will not be observed when the data have a characteristic scale, which breaks this self-similarity.
Zipf’s law suggests that the usage frequencies of English words follow a power-law distribution. Let’s compute the distribution of word frequencies in a small corpus – a collection of IMDb movie reviews.
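The counting cell isn’t reproduced here; below is a minimal sketch using the torchtext IMDB corpus and the basic-English tokenizer imported above (the choice of the training split is an assumption):

# Tokenize the IMDb training reviews and tally how often each word occurs
tokenizer = get_tokenizer("basic_english")
dataset = IMDB(split="train")

counter = Counter()
for _, line in dataset:
    counter.update(tokenizer(line))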
def view_word_frequency_distribution(counter: Counter, *, log: bool = False) -> Figure:
    fig, ax = plt.subplots()
    sns.lineplot(
        ax=ax,
        data=pd.DataFrame(
            {
                "rank": 1 + np.arange(len(counter)),
                "word frequency": sorted(counter.values())[::-1],
            }
        ),
        x="rank",
        y="word frequency",
    )
    if log:
        ax.set_xscale("log")
        ax.set_yscale("log")
    sns.despine(ax=ax, offset=20)
    plt.close(fig)
    return fig
view_word_frequency_distribution(counter, log=True)
If only the high-frequency words are meaningful to human language, then we should be able to reconstruct a movie review using only those words:
# Grab the first review from the corpus
for _, line in dataset:
    tokens = tokenizer(line)
    break
def postprocess(text: str) -> str:
    """Restore capitalization and spacing around punctuation in detokenized text."""
    new_text = ""
    for i_char, char in enumerate(text):
        if i_char != 0:
            prev_char = text[i_char - 1]
        else:
            prev_char = None
        if i_char > 1:
            prev2_char = text[i_char - 2]
        else:
            prev2_char = None
        if i_char != len(text) - 1:
            next_char = text[i_char + 1]
        else:
            next_char = None
        if char == "i" and prev_char == " " and next_char == " ":
            new_text += "I"
        elif char == " " and (
            next_char == "."
            or next_char == ","
            or next_char == "'"
            or prev_char == "'"
            or prev_char == "("
            or next_char == ")"
            or next_char == "!"
        ):
            continue
        elif prev_char == " " and (prev2_char == "." or prev2_char == "!"):
            new_text += char.upper()
        else:
            new_text += char
    return new_text
print("An IMDb review")
display(HTML(f"<blockquote>{postprocess(' '.join(tokens))}</blockquote>"))
An IMDb review
i love sci-fi and am willing to put up with a lot. Sci-fi movies/tv are usually underfunded, under-appreciated and misunderstood. I tried to like this, I really did, but it is to good tv sci-fi as babylon 5 is to star trek (the original). Silly prosthetics, cheap cardboard sets, stilted dialogues, cg that doesn't match the background, and painfully one-dimensional characters cannot be overcome with a'sci-fi'setting. (I'm sure there are those of you out there who think babylon 5 is good sci-fi tv. It's not. It's clichéd and uninspiring.) while us viewers might like emotion and character development, sci-fi is a genre that does not take itself seriously (cf. Star trek). It may treat important issues, yet not as a serious philosophy. It's really difficult to care about the characters here as they are not simply foolish, just missing a spark of life. Their actions and reactions are wooden and predictable, often painful to watch. The makers of earth know it's rubbish as they have to always say gene roddenberry's earth... Otherwise people would not continue watching. Roddenberry's ashes must be turning in their orbit as this dull, cheap, poorly edited (watching it without advert breaks really brings this home) trudging trabant of a show lumbers into space. Spoiler. So, kill off a main character. And then bring him back as another actor. Jeeez! Dallas all over again.
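The regeneration cell isn’t reproduced here; below is a minimal sketch that drops every token not among the corpus’s 200 most frequent words (using most_common(200), an assumption based on the caption that follows):

# Keep only tokens that rank among the 200 most frequent words in the corpus
top_words = {word for word, _ in counter.most_common(200)}
filtered_tokens = [token for token in tokens if token in top_words]

print("Review regenerated using only the top 200 words")
display(HTML(f"<blockquote>{postprocess(' '.join(filtered_tokens))}</blockquote>"))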
Review regenerated using only the top 200 words
i love and to up with a lot. Are, and. I to like this, I really did, but it is to good as is to (the).,,, that doesn't the, and characters be with a''. (I'm there are those of you out there who think is good. It's not. It's and.) while us like and character, is a that does not take (.). It, not as a. It's really to about the characters here as they are not, just a of life. Their and are and, to watch. The of know it's as they have to say's... People would not watching.'s be in their as this,, (watching it really this) of a show into.. So, off a character. And then him back as another.! All over again.
This poor reconstruction demonstrates that the high-rank, low-frequency words also carry meaningful information – in fact, most of it. Analogously, if we try to reconstruct neural data with just a few high-variance principal components – which is exactly what typical dimensionality reduction methods do – we will likely lose valuable information about the presented stimulus.