NAIST-IS-DD1061022
Doctoral Dissertation
Neural Decoding of Visual Dream Contents
Tomoyasu Horikawa
December 13, 2013
Department of Bioinformatics and Genomics
Graduate School of Information Science
Nara Institute of Science and Technology
A Doctoral Dissertation
submitted to Graduate School of Information Science,
Nara Institute of Science and Technology
in partial fulfillment of the requirements for the degree of
Doctor of Science
Tomoyasu Horikawa
Thesis Committee:
Professor Kazushi Ikeda (Supervisor)
Professor Yuji Matsumoto (Co-supervisor)
Associate Professor Tomohiro Shibata (Co-supervisor)
Professor Mitsuo Kawato (Co-supervisor)
Professor Yukiyasu Kamitani (Co-supervisor)
Neural Decoding of Visual Dream Contents∗
Tomoyasu Horikawa
Abstract
Dreaming is a subjective experience during sleep often accompanied by vivid visual
contents. Previous research has attempted to link physiological states with dreaming
but has not demonstrated how specific visual dream contents are represented in brain
activity. The recent advent of machine learning-based analysis has allowed for the
decoding of stimulus- and task-induced brain activity patterns to reveal visual contents. Here, we extend this approach to decode spontaneous brain activity associated
with dreaming with the assistance of lexical and image databases. We measured the
brain activity of sleeping human subjects using fMRI while monitoring sleep stages by
EEG. Subjects were awakened when a specific EEG pattern was observed during the
sleep-onset (hypnagogic) period. They gave a verbal report on the visual experiences
just before awakening and then returned to sleep. The words describing visual contents
were extracted and grouped into 16-26 categories defined in the English lexical database
WordNet, for systematically labeling dream contents. Decoders were trained on fMRI
responses to natural images depicting each category, and then tested on sleep data.
Pairwise and multilabel decoding revealed that accurate classification, detection, and
identification regarding dream contents could be achieved with the higher visual cortex,
with semantic preferences of individual areas mirroring known stimulus representation.
Our results demonstrate that specific dream contents are represented in activity patterns of visual cortical areas, which are shared by stimulus perception. Our method
uncovers contents represented by brain activity not induced by stimulus or task, which
could provide insights into the functions of dreaming and spontaneous neural events.
Keywords:
Neural decoding, fMRI, dream, multivariate pattern analysis
∗
Doctoral Dissertation, Department of Bioinformatics and Genomics, Graduate School of Information Science, Nara Institute of Science and Technology, NAIST-IS-DD1061022, December
13, 2013.
Contents

1. Introduction

2. Methods
   2.1 Subjects
   2.2 Prior instructions to subjects
   2.3 Sleep adaptation
   2.4 Sleep experiment
   2.5 MRI acquisition
   2.6 PSG recordings
   2.7 Offline EEG artifact removal and sleep-stage scoring
   2.8 Visual dream content labeling
   2.9 Visual stimulus experiment
   2.10 Localizer experiments
   2.11 MRI data preprocessing
   2.12 Region of interest (ROI) selection
   2.13 Decoding analysis
   2.14 Synset pair selection by within-dataset cross-validation

3. Results
   3.1 Behavioral results of sleep experiments
   3.2 Dream contents decoding
       3.2.1 Pairwise classification analysis
       3.2.2 Multilabel decoding analysis

4. Discussion

References

Acknowledgements

Appendix
   A. Supplementary results
List of Figures

1  REM and sleep-onset sleeping
2  Experimental overview
3  Base synsets selection
4  Visual content vectors
5  Visual stimulus experiment design
6  Functionally defined regions of interest on the flattened cortex
7  Inflated view of the anatomically defined regions of interest
8  Schematic overview of the pairwise classification analysis
9  Schematic overview of the multilabel decoding analysis
10 Time course of theta power
11 Awakening statistics
12 Time course of sleep state proportion
13 Distributions of pairwise decoding accuracy for stimulus-to-dream decoding
14 Within dataset cross-validation decoding
15 Representational similarity analysis
16 Decoding with averaged vs. multivoxel activity
17 Mean accuracies for the pairs within and across meta-categories
18 Mean accuracies for the samples from each sleep state
19 Pairwise decoding accuracies across visual cortical areas
20 Time course of pairwise decoding accuracy
21 Stimulus-to-stimulus decoding accuracy on whole cortical areas for the three subjects
22 Time course of stimulus-to-dream decoding accuracy on whole cortical areas for Subject 1
23 Time course of stimulus-to-dream decoding accuracy on whole cortical areas for Subject 2
24 Time course of stimulus-to-dream decoding accuracy on whole cortical areas for Subject 3
25 ROC analysis for the three subjects
26 AUC averaged within meta-categories for different visual areas
27 Synset score time course
28 Identification analysis
29 Distribution of pairwise decoding accuracies
30 Stimulus-to-stimulus pairwise decoding
31 Dream-to-dream pairwise decoding
32 Decoding with averaged vs. multivoxel activity for individual subjects
33 Mean accuracies for the samples from each sleep state for individual subjects
34 Pairwise decoding accuracies across visual cortical areas
35 Time course of pairwise decoding accuracy
36 Examples for the time courses of synset scores
37 Time courses of averaged synset scores for each subject
38 Identification performance for individual subjects
List of Tables

1  Examples of Verbal Reports
2  List of Base Synsets for Subject 1
3  List of Base Synsets for Subject 2
4  List of Base Synsets for Subject 3
1. Introduction
Dreaming is a subjective experience during sleep often accompanied by vivid visual contents. Due to its fundamentally subjective nature, the objective study
of dreaming has been challenging. However, since the discovery of rapid eye movement (REM) during sleep, scientific knowledge on the relationship between
dreaming and physiological measures including brain activity has accumulated.
Although dreaming has often been associated with the REM sleep stage, recent
studies have shown that dreaming can be experienced during non-REM periods
(Hobson and Stickgold 1994; Solms, 1997; Takeuchi et al., 2001; Baylor and Cavallero, 2001; Hori et al., 1994; Cavallero et al., 1992; Foulkes and Vogel, 1965;
Foulkes, 1962; Nir and Tononi, 2010), and much research has been conducted to
link various aspects of dreaming with physiological and behavioral measures during sleep. Those studies have reported relations between dreaming and specific
patterns of polysomnography (PSG; electroencephalography (EEG), electrooculogram (EOG), and electromyography (EMG)) (Aserinsky and Kleitman, 1953;
Hori et al., 1998; German and Nielsen, 2001; Palagini et al., 2004), a specific type
of behavior observed during sleep (Gugger and Wagner, 2007), and changes of
brain activity in several regions, including activations in multisensory areas (Hong
et al., 2009), visual cortical areas (Maquet et al., 1996; Maquet, 2000; Miyauchi
et al., 2009) and hippocampus (Wilson and McNaughton, 1994). However, none
has demonstrated how specific visual dream contents are represented in brain activity.
The advent of machine learning-based analysis allows for the decoding of
stimulus- and task-induced brain activity patterns to reveal visual contents (Haxby
et al., 2001; Cox and Savoy, 2003; Kamitani and Tong 2005, 2006; Polyn et al.,
2005; Miyawaki et al., 2008; Stokes et al., 2009; Reddy et al., 2010; Harrison and
Tong, 2010; Albers et al., 2013). Those studies have demonstrated that not only
the visual contents explicitly presented to subjects (Haxby et al., 2001; Cox and
Savoy, 2003; Kamitani and Tong 2005, 2006; Miyawaki et al., 2008), but also the
subjective visual contents, such as attended visual features (Kamitani and Tong,
2005, 2006), visually imagined shapes (Stokes et al., 2009) and imagined object
categories (Reddy et al., 2010), and orientations maintained in working memory
(Harrison and Tong, 2010; Albers et al., 2013), can be read out from brain activity
patterns using decoders trained on stimulus-induced brain activity patterns. The
results of these studies suggest that similar experiences may be represented by
similar brain activity patterns. If this is true, because dreaming often contains
vivid visual contents, we can expect that decoders trained on stimulus-induced
brain activity can predict dream contents given brain activity patterns during
sleep. However, dreaming is a phenomenon which is spontaneously generated by
the brain during sleep, and it is not yet clear whether the neural representational
similarity observed between perception and some kind of imagery during wakefulness (Kamitani and Tong 2005, 2006; Polyn et al., 2005; Stokes et al., 2009;
Reddy et al., 2010; Harrison and Tong, 2010; Albers et al., 2013) can generalize
to the similarity between perception and visual phenomena during sleep. Here,
we extend the decoding approach to the decoding of spontaneous brain activity
during sleep, and examine whether we can read out visual dream contents from
brain activity patterns in human visual cortex measured by functional magnetic resonance imaging (fMRI) during sleep. This approach provides the most direct demonstration of a link between dreaming and brain activity observed during sleep.
There were two major challenges to this approach. First, although several
previous studies have shown that dream contents can be affected by waking
experience or conditioning (Stickgold et al., 2000; Wamsley et al., 2010), it is
generally difficult to experimentally control dream contents. We instead let the
subject sleep and dream without pre-conditioning and freely describe the contents
after awakening. Reports were analyzed with the assistance of a lexical database,
WordNet (Fellbaum, 1998), which was used to create systematic descriptions
of dream contents. The second challenge was the difficulty in collecting a large
amount of dream data. Although dreaming has often been associated with the
REM sleep stage (Aserinsky and Kleitman, 1953; Dement and Kleitman, 1957;
Dement and Wolpert 1958; Hobson, 2009), since it takes at least 1 hour to enter
the first REM stage, REM dreams are not suitable for collecting sufficient data
for quantitative evaluation with decoding analysis. Recent studies have demonstrated that dreaming is dissociable from REM sleep and can be experienced during non-REM periods (Hobson and Stickgold 1994; Solms, 1997; Takeuchi et al.,
2001; Baylor and Cavallero, 2001; Hori et al., 1994; Cavallero et al., 1992; Foulkes
and Vogel, 1965; Foulkes, 1962; Nir and Tononi, 2010), and reports at awakenings
in sleep-onset and REM periods share general features such as frequency, length,
and contents while differing in several aspects including the affective component
(Foulkes and Vogel, 1965; Vogel et al., 1972; Oudiette, 2012). Here, we focused
on visual dreaming experienced during the sleep-onset (hypnagogic) period (sleep
stage 1 or 2) (Hori et al., 1994; Stickgold et al., 2000; Foulkes and Vogel, 1965),
making it easy to collect many observations by repeating awakenings and recording subjects’ verbal reports of visual experience (Fig. 1).
(Figure 1 graphic: hypnogram with sleep stage (Wake, 1, 2, 3, 4, REM/NREM) on the vertical axis and time in hours (0-8) on the horizontal axis.)
Figure 1. REM and sleep-onset sleeping. Sleep state measurements reveal about
90-minute cycles of REM and non-REM (NREM) sleep (the red/blue lines indicate periods of REM/NREM sleep). Since the discovery of REM sleep (Aserinsky and Kleitman, 1953), dreaming has often been associated with the REM sleep stage. However, reports of dreaming are also common from NREM sleep, including sleep stages 1 and 2. While there are differences in several
aspects of reports obtained at awakenings in sleep-onset and REM periods, they
share general features such as frequency, length, and contents.
In this thesis, we present a neural decoding approach in which machine learning models predict the contents of visual dreaming during the sleep-onset period
given measured brain activity, by discovering links between human fMRI patterns
and verbal reports with the assistance of lexical and image databases (Fig. 2).
We hypothesized that contents of visual dreaming are represented at least partly
by visual cortical activity patterns shared by stimulus representation. Thus we
trained decoders on brain activity patterns induced by viewing natural images
collected from web image databases, and tested on brain activity during sleeping.
The results showed that the decoding models trained on stimulus-induced brain
activity in higher visual cortical areas achieved accurate classification, detection,
and identification of contents. Our findings demonstrate that specific visual experience during sleep is represented by brain activity patterns shared by stimulus
perception, providing a means to uncover subjective contents of dreaming using
objective neural measurement.
(Figure 2 graphic: schematic of the procedure around one awakening, showing the sleep stages (Wake, 1, 2), the report period, and the fMRI volumes over time; the fMRI activity pattern before awakening is fed to a machine learning decoder, assisted by lexical and image databases, to predict the reported contents. Example report: "Yes, well, I saw a person. Yes. What it was... It was something like a scene that I hid a key in a place between a chair and a bed and someone took it.")
Figure 2. Experimental overview. fMRI data were acquired from sleeping subjects
simultaneously with PSG. Subjects were awakened during sleep stage 1 or 2 (red
dashed line) and verbally reported their visual experience during sleep. fMRI
data immediately before awakening (average of three volumes [= 9 s]) were used
as the input for main decoding analyses (sliding time windows were used for time
course analyses). Words describing visual objects or scenes (red letters) were
extracted. The visual contents were predicted using machine learning decoders
trained on fMRI responses to natural images.
2. Methods
The study protocol was approved by the Ethics Committee of ATR.
2.1 Subjects
Potential subjects answered questionnaires regarding their sleep-wake habits.
Usual sleep and wake times, regularity of the sleep habits, habits of taking a
nap, sleep complaints, and regularity of lifestyle (e.g., mealtime), their physical
and psychiatric health, and sleeping conditions were checked. Anyone who had
physical or psychiatric diseases, was currently receiving medical treatment or was
suspected of having a sleep disorder was excluded. People who habitually drank alcoholic beverages before sleep or smoked were also excluded. Finally,
three healthy subjects (Japanese-speaking male adults, aged 27-39 years) with
normal visual acuity participated in the experiments. All subjects gave written
informed consent for their participation in the experiment.
2.2 Prior instructions to subjects
From three days prior to each experiment, subjects were instructed to maintain
their sleep-wake habits, i.e., daily wake/sleep time and sleep duration. They were
also instructed to refrain from excessive alcohol consumption, unusual physical
exercise, and taking naps, from the day before each experiment. Their sleep-wake habits were monitored by a sleep log. No subject was chronically sleep
deprived, and all slept over 6 hours on average the night before experiments.
2.3 Sleep adaptation
Subjects underwent two adaptation sleep experiments before the main fMRI sleep
experiments to get used to sleeping in the experimental setting (Agnew et al.,
1966; Tamaki et al., 2005). The adaptation experiments were conducted using
the same procedures as the fMRI sleep experiments except that real fMRI scans
were not performed. The experimental environment was simulated using a mock
scanner consisting of the shell of a real scanner without the magnet. Echo-planar
imaging (EPI) acoustic noise was also simulated and given to the subject via
speakers.
2.4 Sleep experiment
Sleep (nap) experiments were carried out from 1:00 pm until 5:30 pm, and were
scheduled to include the mid-afternoon dip (Monk et al., 1996). Subjects were
instructed to sleep if they could, but not to try to sleep if they felt they could
not. This instruction was given to reduce psychological pressure toward sleeping
because efforts to sleep may themselves cause difficulty in falling asleep. fMRI
scans were conducted simultaneously with PSG recordings (electroencephalogram
[EEG], electrooculogram [EOG], electromyogram [EMG], and electrocardiogram
[ECG]). We performed multiple awakenings (see below for details) to collect verbal
reports on visual experience during sleep. The multiple-awakening procedure was
repeated while fMRI was performed continuously (interrupted upon the subject's request for a break; duration across all 55 runs in Subjects 1-3, 88.99 ± 26.09 min [mean ± SD]). The experiment was repeated over 10, 7, and 7 days in Subject
1-3, respectively, until at least 200 awakenings with a visual report were obtained
from each subject. Offline sleep stage scoring confirmed that in >90% of the
awakenings followed by visual reports, the last 15-s epoch before awakening was
classified as sleep stage 1 or 2. If the last 15-s epoch was classified as the waking
stage, we excluded the data from decoding analysis. As a result, 235, 198, and
186 awakenings were selected for Subject 1-3, respectively, constituting sleep data
samples for further analysis.
Multiple-awakening procedure
Once the fMRI scan began, the subject was allowed to fall asleep. The experimenter monitored noise-reduced EEG recordings in real time while performing
EEG staging on every 6-s epoch. The experimenter awakened subjects by calling them by name via a microphone/headphone when a single epoch showing alpha-wave suppression and theta-wave (ripple) occurrence was detected; these are known EEG signatures of NREM sleep stage 1 associated with frequent visual reports upon awakening (Hori et al., 1994). The subject was asked to verbally describe
what they saw before awakening along with other mental content and then to
go to sleep again. If the EEG signatures were detected before the elapsed time
from the previous awakening reached 2 min, then the experimenter waited until
it reached 2-3 min. If the subject was already in NREM sleep stage 2 when 2
min had passed, and it was unlikely they would go back to NREM sleep stage
1, the subject was awakened. When the subject repeatedly entered NREM sleep
stage 2 within 2 min, the subject was awakened with short intervals (less than 2
min) or was asked to remain awake with eyes closed for one or two minutes after
the awakening to increase their arousal level. This multiple-awakening procedure
was repeated during the fMRI session. The subject was also asked to respond
by button press when they detected a beep sound (250 Hz pure tone, 500 ms
duration, 12-18-s inter-stimulus intervals). This auditory task was conducted for
potential use in monitoring the sleep stage. However, in the present study, we did
not use the data because we failed to record responses in some of the experiments
owing to computer trouble. Our preliminary analysis using successfully recorded
data (Subject 1) showed that the detection rates in each of the wake/sleep stages
were similar to those of previous work (Ogilvie and Wilkinson, 1989). Even when sleep samples were limited to those in which a sound was played during the last 15-s epoch before awakening but was not detected by the subject, the decoding results were similar. Subjects were informed that they could quit the experiment
anytime they wished to, and that they could refuse to report mental contents in
cases where there were privacy concerns.
Acquisition of verbal reports
On each awakening, the subject was asked if they had seen anything just
before awakening, and then to freely describe it along with other mental contents.
If a description of the contents was unclear, the subject was asked to report the
contents in more detail. Most reports started immediately upon awakening and
lasted for 34 ± 19 s (mean ± SD; three subjects pooled). After the free verbal
report, the subject was asked to answer specific questions such as rating the
vividness of the image and the subjective timing of the experience (from when
and until when relative to awakening), but the reports obtained by these explicit
questions were not used in the analyses in the current study. Free reports that
contained at least one visual element were classified as visual reports. If no visual
content was present, reports were classified as others including thought (active
thinking), forgot, non-visual report, and no report. The classification was first
conducted in real time by the experimenter, and was later confirmed by other
investigators. Examples of verbal reports are shown in Table 1. The subject's voice during the procedure was recorded by an optical microphone.
2.5 MRI acquisition
fMRI data were collected using a 3.0-Tesla scanner located at the ATR Brain Activity Imaging Center. An interleaved T2*-weighted gradient-EPI scan was performed to acquire functional images covering the whole brain (sleep experiments,
visual stimulus experiments, and higher visual area localizer experiments: TR,
3,000 ms; TE, 30 ms; flip angle, 80 deg; FOV, 192 × 192 mm; voxel size, 3 ×
3 × 3 mm; slice gap, 0 mm; number of slices, 50) or the entire occipital lobe
(retinotopy experiments; TR, 2,000 ms; TE, 30 ms; flip angle, 80 deg; FOV,
192 × 192 mm; voxel size, 3 × 3 × 3 mm; slice gap, 0 mm; number of slices,
30). T2-weighted turbo spin echo images were scanned to acquire high-resolution
anatomical images of the same slices used for the EPI (sleep experiments, visual
stimulus experiments, and higher visual area localizer experiments: TR, 7,020 ms;
TE, 69 ms; flip angle, 160 deg; FOV, 192 × 192 mm; voxel size, 0.75 × 0.75 × 3.0
mm; retinotopy experiments: TR, 6,000 ms; TE, 57 ms; flip angle, 160 deg; FOV,
192 × 192 mm; voxel size, 0.75 × 0.75 × 3.0 mm). T1-weighted magnetization-prepared rapid acquisition gradient-echo (MP-RAGE) fine-structural images of
the whole head were also acquired (TR, 2,250 ms; TE, 3.06 ms; TI, 900 ms; flip
angle, 9 deg; FOV, 256 × 256 mm; voxel size, 1.0 × 1.0 × 1.0 mm).
2.6 PSG recordings
PSG was performed simultaneously with fMRI. PSG consisted of EEG, EOG,
EMG, and ECG recordings. EEGs were recorded at 31 scalp sites in all
experiments except for one (Fp1, Fp2, F7, F3, Fz, F4, F8, FC5, FC1, FC2,
FC6, T7, C3, Cz, C4, T8, TP9, CP5, CP1, CP2, CP6, TP10, P7, P3, Pz, P4,
P8, POz, O1, Oz, O2) according to 10% electrode positions (Sharbrough et al.,
1991). For one experiment, EEG was recorded at 25 scalp sites (Fp1, Fp2, F7,
F3, Fz, F4, F8, T7, C3, Cz, C4, T8, P7, P3, Pz, P4, P8, PO7, PO3, POz, PO4,
PO8, O1, Oz, O2). EOGs were recorded bipolarly from four electrodes placed at
the outer canthi of both eyes (horizontal EOG) and above and below the right
eye (vertical EOG). EMGs were recorded bipolarly from the mentum. ECGs
were recorded from the lower shoulder blade. EEG and ECG recordings were
referenced to FCz. All EEG electrodes in the cap had 5 kΩ resistors, while the other electrodes had 15 kΩ resistors. In consideration of EEG data quality,
impedance of EEG electrodes was kept below 15 kΩ and that of other electrodes
was kept below 25 kΩ. All data were recorded by an MRI-compatible amplifier
at a sampling rate of 5,000 Hz using BrainVision Recorder. The artifacts derived
from T2*-weighted gradient-EPI scan and ballistocardiogram (Bonmassar et al.,
2002) were reduced in real time using RecView so that the experimenter was able
to monitor the EEG patterns online. Because EEG recordings were referenced
to FCz, EEGs recorded in the occipital area were better than those recorded in
the central area at detecting subtle changes in EEG waves. Thus, O1 was used
for online monitoring of EEG data. When the O1 channel was contaminated by
artifacts, O2 was used instead.
2.7 Offline EEG artifact removal and sleep-stage scoring
Artifacts were removed offline from EEG recordings after each experiment (the
FMRIB plug-in for EEGLAB, The University of Oxford) for further analyses.
EEG data were then down-sampled to 500 Hz, re-referenced to TP9 or TP10,
depending on the derivation, and low-pass filtered with a 216-Hz cut-off frequency
using a two-way least-squares FIR filter. EEG recordings were scored into sleep
stages according to the standard criteria (Rechtschaffen and Kales, 1968) for every
15 s.
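
For illustration, the offline post-processing chain described above can be sketched as follows. This is a minimal NumPy/SciPy sketch, assuming the gradient and ballistocardiogram artifacts have already been removed; the actual analysis used the FMRIB plug-in for EEGLAB, and the filter design shown here is only a stand-in for the two-way least-squares FIR filter.

import numpy as np
from scipy.signal import decimate, firwin, filtfilt

def postprocess_eeg(eeg, fs=5000, fs_target=500, cutoff_hz=216, ref_channel=0):
    """Illustrative offline EEG post-processing (not the original EEGLAB pipeline).

    eeg         : array (n_channels, n_samples) of artifact-corrected recordings.
    ref_channel : index of the re-reference electrode (e.g., TP9 or TP10).
    """
    # Down-sample from 5,000 Hz to 500 Hz with anti-alias filtering.
    eeg_ds = decimate(eeg, fs // fs_target, axis=1, zero_phase=True)

    # Re-reference every channel to the chosen electrode.
    eeg_ref = eeg_ds - eeg_ds[ref_channel:ref_channel + 1, :]

    # Zero-phase (two-way) low-pass FIR filter with a 216-Hz cut-off.
    fir = firwin(numtaps=101, cutoff=cutoff_hz, fs=fs_target)
    return filtfilt(fir, [1.0], eeg_ref, axis=1)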
2.8 Visual dream content labeling
The subjects’ report at each awakening, given verbally in Japanese, was transcribed into text. The reports that contained at least one visual object or scene
were classified as visual report, and those without visual content were classified as
others. Three labelers extracted words (nouns) that described visual objects or
scenes from each visual report text and translated them into English (verified by
a bilingual speaker). These words were mapped to WordNet, an English lexical
database in which words with a similar meaning are grouped as synsets (an abbreviation of "synonym-set") in a hierarchical structure (Fellbaum, 1998). Synset
assignment was cross-checked by all three labelers. For each extracted word, hypernym synsets (superordinate categories defined on the WordNet tree) of the
assigned synset were also identified in the WordNet hierarchy. To determine representative synsets that describe visual contents, we selected the synsets (synsets
assigned to each word and their hypernyms) that were found in 10 or more visual
reports without co-occurrence with at least one other synset. We then removed
the synsets that were the hypernyms of others. These procedures produced base
synsets, which were frequently reported while being semantically exclusive and
specific (Fig. 3). The visual contents at each awakening were coded by a visual
content vector, in which the presence/absence of each synset was denoted by 1/0
(Fig. 4).
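
To illustrate this labeling scheme, the sketch below uses NLTK's WordNet interface to expand hand-assigned synsets with their hypernyms, select base synsets by report frequency, and build binary visual content vectors. It is only a simplified sketch: the word-to-synset assignment is assumed to have been done manually as described above, the co-occurrence criterion is omitted, and the synsets and threshold in the toy example at the end are made up.

from collections import Counter
from nltk.corpus import wordnet as wn   # requires nltk.download('wordnet') beforehand

def with_hypernyms(synsets):
    """A set of synsets expanded with all of their hypernyms on the WordNet tree."""
    expanded = set(synsets)
    for s in synsets:
        expanded.update(s.closure(lambda x: x.hypernyms()))
    return expanded

def select_base_synsets(report_synsets, min_reports=10):
    """Keep synsets found in at least `min_reports` reports, then drop synsets
    that are hypernyms of other selected synsets (keeping the specific ones)."""
    counts = Counter()
    for synsets in report_synsets:          # one list of hand-assigned synsets per report
        for s in with_hypernyms(synsets):
            counts[s] += 1
    frequent = {s for s, n in counts.items() if n >= min_reports}

    def is_hypernym_of_other(s):
        return any(s in o.closure(lambda x: x.hypernyms())
                   for o in frequent if o != s)

    return sorted((s for s in frequent if not is_hypernym_of_other(s)),
                  key=lambda s: s.name())

def content_vector(synsets, base_synsets):
    """Binary vector marking which base synsets are present in one report
    (directly, or as a hypernym of a reported word's synset)."""
    expanded = with_hypernyms(synsets)
    return [1 if b in expanded else 0 for b in base_synsets]

# Toy usage with hand-assigned synsets for two hypothetical reports.
reports = [[wn.synset('street.n.01'), wn.synset('man.n.01')],
           [wn.synset('house.n.01')]]
base = select_base_synsets(reports, min_reports=1)
vectors = [content_vector(r, base) for r in reports]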
(Figure 3 graphic: an example report, "Um, what I saw now was like, a place with a street and some houses around it...", with the extracted words mapped onto part of the WordNet tree; the nodes shown include Artefact, Structure, Way, Street (n = 21), Building (n = 18), House, and Hotel (n = 7), with the base synsets marked.)
Figure 3. Base synsets selection. Words describing visual objects or scenes (red)
were mapped onto synsets of the WordNet tree. Synsets were grouped into base
synsets (blue frames) located higher in the tree.
(Figure 4 graphic: visual content matrices for Subjects 1, 2, and 3, with each subject's base synsets on the vertical axis and the awakening index on the horizontal axis; the base synsets are listed in Tables 2-4.)
Figure 4. Visual content vectors. Visual reports are represented by visual content
vectors, in which the presence/absence of the base synsets in the report at each
awakening is indicated by white/black. The visual content vectors are shown for
all subjects and awakenings with visual reports (excluding those with contamination of the wake stage detected by offline sleep staging). Each column denotes
the presence/absence of the base synsets in the sleep sample. Note that there are
several samples in which no synset is present. This is because the reported words
in these samples were rare and not included in the base synsets.
2.9 Visual stimulus experiment
We selected stimulus images for decoder construction using ImageNet (http://www.imagenet.org/; 2011 fall release) (Deng et al., 2009), an image database in which web images are grouped according to WordNet. Two-hundred and forty images were collected for each base synset (each image corresponding to exclusively one synset).
If images for a synset were not available in ImageNet, we collected images using
Google Images (http://www.google.com/imghp). In the visual stimulus experiment, the selected images were resized so that the width and height of images
were within 16 deg (the original aspect ratio was preserved) and were presented
at the center of the screen on gray background (Fig. 5). Subjects were allowed
to freely view the images without fixation. We measured stimulus-induced fMRI
activity for each base synset by presenting these images (using the same fMRI
setting as the sleep experiment). In a 9-s stimulus block, six images were randomly sampled from the 192 (Subject 1) or 240-image (Subject 2 and Subject
3) set for one base synset without replacement, and each image was presented
for 0.75 s with intervening blanks of 0.75 s (approximately 4 hours for a whole visual stimulus
experiment). We presented multiple images for each base synset to attenuate influences from irrelevant features (e.g., age or view of a depicted person for ”male”
synset). Thus each of the images for each base synset was presented only once
during the experiment. The stimulus block (9 s) was followed by a 6-s rest period,
and was repeated for all base synsets in each run. Extra 33-s and 6-s rest periods
were added at the beginning and the end of each run, respectively. We repeated
the run with different images to obtain 32, 40, and 40 blocks per base synset for
Subject 1-3, respectively.
2.10 Localizer experiments
Retinotopy
The retinotopy mapping session followed the conventional procedure (Engel et
al., 1994; Sereno et al., 1995) using a rotating wedge and an expanding ring of
a flickering checkerboard. The data were used to delineate the borders between each visual cortical area, and to identify the retinotopic map on the flattened cortical surfaces of individual subjects.

(Figure 5 graphic: example stimulus blocks for the "male" and "building" synsets, each 9 s long and separated by a 6-s rest period, arranged along a time axis.)

Figure 5. Visual stimulus experiment design. Subjects freely viewed six different exemplars of each base synset on each 9-s stimulus block.
Localizers for higher visual areas
We performed functional localizer experiments to identify the lateral occipital
complex (LOC), fusiform face area (FFA), and parahippocampal place area (PPA)
for each individual subject (Epstein and Kanwisher, 1998; Kanwisher et al., 1997;
Kourtzi and Kanwisher, 2000). The localizer experiment consisted of 4-8 runs and
each run contained 16 stimulus blocks. In this experiment, intact or scrambled
images (12 × 12 deg) of face, object, house and scene categories were presented
at the center of the screen. Each of eight stimulus types (four categories × two
conditions) was presented twice per run. Each stimulus block consisted of a
15-s intact or scrambled stimulus presentation. The intact and scrambled stimulus blocks were presented successively (the order of the intact and scrambled
stimulus block was random), followed by a 15-s rest period of uniform gray background. Extra 33-s and 6-s rest periods were added at the beginning and end
of each run, respectively. In each stimulus block, twenty different images of
the same type were presented with a duration of 0.3 s followed by intervening
blanks of 0.4-s duration. Images for each category were collected from the following resources: face images from THE CENTER FOR VITAL LONGEVITY
(http://vitallongevity.utdallas.edu) (Minear and Park, 2004); object images from
The Object Databank (http://stims.cnbc.cmu.edu/ImageDatabases/TarrLab/Objects/;
Stimulus images courtesy of Michael J. Tarr, Center for the Neural Basis of Cognition and Department of Psychology Carnegie Mellon University, http://www.tarrlab.org/);
house and scene images from SUN database (http://groups.csail.mit.edu/vision/SUN/)
(Xiao et al., 2010).
2.11 MRI data preprocessing
The first 9-s scans for experiments with TR = 3 s (sleep, visual stimulus, and
higher visual area localizer experiments) and 8-s scans for experiments with TR
= 2 s (retinotopy experiments) of each run were discarded to avoid instability
of the MRI scanner. The acquired fMRI data underwent three-dimensional motion correction by SPM5 (http://www.fil.ion.ucl.ac.uk/spm). The data were then
coregistered to the within-session high-resolution anatomical image of the same
slices used for EPI and subsequently to the whole-head high-resolution anatomical image. The coregistered data were then reinterpolated by 3 × 3 × 3 mm
voxels. For the sleep data, after a linear trend was removed within each run,
voxel amplitude around awakening was normalized relative to the mean amplitude during the period 60-90 s prior to each awakening. The average proportions
of wake, stage 1, and stage 2 during this period were 32.7%, 44.0%, and 23.3%,
respectively (three subjects pooled). This period was used as the baseline because it tended to show relatively stable BOLD signals over time. We assumed
that the occurrence of sleep-onset dreaming would be rare with the relatively low
theta amplitudes (Hori et al., 1994). However, it cannot be ruled out that visual
dreaming may have been experienced during this period, and that using this period as the baseline would make it difficult to detect visual contents relevant to dreaming in this period. The voxel values averaged across the
three volumes (9 s) immediately before awakening served as a data sample for
decoding analysis (the time window was shifted for time course analysis). For
the visual stimulus data, after within-run linear trend removal, voxel amplitudes
were normalized relative to the mean amplitude of the pre-rest period of each
run and then averaged within each 9-s stimulus block (three volumes) shifted by
3 s (one volume) to compensate for hemodynamic delays. Voxels used for the
decoding of a synset pair in each ROI were selected by t statistics comparing
the mean responses to the images of paired synsets (highest absolute t values;
400 voxels for individual areas, and 1,000 voxels for LVC and HVC). The voxel
values in each data sample were z-transformed for removing potential mean-level
differences between the sleep and visual stimulus experiments.
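
The construction of a single sleep data sample described above can be summarized as a short NumPy sketch: baseline normalization to the 60-90 s pre-awakening window, averaging of the three volumes (9 s) immediately before awakening, and a per-sample z-transform. The array layout, the TR constant, and the percent-signal-change convention are illustrative assumptions; the text above only states that amplitudes were normalized relative to the baseline mean.

import numpy as np

TR = 3.0  # s per volume in the sleep and visual stimulus experiments

def sleep_sample(run_data, awakening_volume, n_avg=3, baseline_window=(60.0, 90.0)):
    """Build one sleep data sample from detrended voxel time courses.

    run_data         : array (n_volumes, n_voxels), linear trend removed per run.
    awakening_volume : index of the first volume acquired after awakening.
    """
    # Baseline: mean amplitude during the 60-90 s prior to awakening.
    start = awakening_volume - int(round(baseline_window[1] / TR))
    stop = awakening_volume - int(round(baseline_window[0] / TR))
    baseline = run_data[start:stop].mean(axis=0)

    # Normalize relative to the baseline (expressed here as percent signal change).
    normalized = 100.0 * (run_data - baseline) / baseline

    # Average the three volumes (9 s) immediately before awakening.
    sample = normalized[awakening_volume - n_avg:awakening_volume].mean(axis=0)

    # z-transform across voxels to remove mean-level differences
    # between the sleep and visual stimulus experiments.
    return (sample - sample.mean()) / sample.std()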
2.12 Region of interest (ROI) selection
Functionally localized areas
V1, V2, and V3 were delineated in the standard retinotopy experiment (Engel et al., 1994; Sereno et al., 1995), and the lateral occipital complex (LOC),
the fusiform face area (FFA), and the parahippocampal place area (PPA) were
identified using conventional functional localizers (Epstein and Kanwisher, 1998;
Kanwisher et al., 1997; Kourtzi and Kanwisher, 2000). The retinotopy experiment data were transformed to the Talairach coordinates and the visual cortical
borders were delineated on the flattened cortical surfaces using BrainVoyager QX
(http://www.brainvoyager.com). The voxel coordinates around the gray-white
matter boundary in V1-V3 were identified and transformed back into the original
coordinates of the EPI images. The voxels from V1, V2, and V3 were combined
as the lower visual cortex (LVC; 2,000 voxels in each subject). The localizer experiment data for higher visual areas were analyzed using SPM5. The voxels that
showed significantly higher activation in response to objects, faces, or scenes than
scrambled images for each (t test, uncorrected P < 0.05 or 0.01) were identified,
and were used as ROIs for LOC, FFA, and PPA, respectively. We set relatively
low thresholds for identifying ROIs so that a larger number of voxels would be
included to preserve broadly distributed patterns. A continuous region covering
LOC, FFA, and PPA was manually delineated on the flattened cortical surfaces,
and the region was defined as the higher visual cortex (HVC; 2,000 voxels in each
subject). Voxels overlapping with LVC were excluded from HVC. After voxels of
extremely low signal amplitudes were removed, approximately 2,000 voxels remained in LVC (2054, 2172, and 1935 voxels for Subject 1-3, respectively) and in
HVC (1956, 1788, and 2235 voxels for Subject 1-3, respectively). For the analysis
of individual subareas, the following numbers of voxels were identified for V1, V2,
V3, LOC, FFA, and PPA, respectively: 885, 901, 728, 523, 537, and 353 voxels
for Subject 1; 779, 949, 897, 329, 382, and 334 voxels for Subject 2; 710, 859,
765, 800, 432, and 316 voxels for Subject 3. LOC, FFA, and PPA identified by
the localizer experiments may overlap: the voxels in the overlapping region were
included in both ROIs (Fig. 6).
(Figure 6 graphic: flattened left and right cortical maps for Subjects 1, 2, and 3, showing V1, V2, V3, LOC, FFA, PPA, and the manually delineated HVC border, with dorsal/ventral orientation indicated.)
Figure 6. Functionally defined regions of interest on the flattened cortex. The
individual areas of each subject are shown on the flattened cortex. A contiguous
region covering LOC, FFA, and PPA was manually delineated on the flattened
cortical surface, and the region was defined as the "higher visual cortex" (red line). The voxels overlapping with the lower visual cortical areas (V1-V3) were excluded from the ROI for the higher visual cortex. For individual ROIs, voxels near the area border were included in both areas.
Anatomically delineated areas
The T1-image of each subject was analyzed using FreeSurfer (http://surfer.nmr.mgh.harvard.edu),
and regions covering the whole cortical surface were anatomically identified on each subject's cortical surface (Fig. 7). Two types of parcellations provided by FreeSurfer were used to define a total of 108 cortical regions on one hemisphere (Desikan et al., 2006; Destrieux et al., 2010).
(Figure 7 graphic: inflated cortical surfaces shown in lateral and medial views, with anterior/posterior and superior/inferior orientation labels, for the two parcellations in panels A and B.)
Figure 7. Inflated view of the anatomically defined regions of interest. The
whole cortical surface was automatically delineated according to two types of
parcellations provided by FreeSurfer. (A) The 34 anatomically delineated ROIs
defined by a parcellation from Desikan et al. (2006). (B) The 74 anatomically
delineated ROIs defined by a parcellation from Destrieux et al. (2010). Here, the
ROIs were mapped on the left hemisphere of Subject 1.
2.13 Decoding analysis
For all pairs of the base synsets, a binary decoder consisting of linear Support
Vector Machine (Vapnik, 1998) (SVM; implemented by LIBSVM (Chang and Lin,
2011)) was trained on the visual stimulus data of each ROI. The fMRI signals
of the selected voxels and the synset labels were given to the decoder as training
data. SVM provided a linear discriminant function for classification between
synset k and l given input voxel values x = [x_1, ..., x_D] (D, number of voxels),

f_{kl}(x) = \sum_{d=1}^{D} w_d x_d + w_0,

where w_d is the weight parameter for voxel d, and w_0 is the bias. The performance
was evaluated by the correct classification rate for all sleep samples selected for
each pair.
In the pairwise decoding analysis, the stimulus-trained decoder was tested
on the sleep data that contained exclusively one of the paired synsets (Fig. 8).
Prediction was made on the basis of whether f_{kl}(x) was positive (k) or negative (l), given a sleep fMRI data sample. In the multilabel decoding, the discriminant functions comparing a base synset k and each of the other synsets (l ≠ k) were averaged after normalization by the norm of the weight vector w_{kl} to yield the linear detector function (Kamitani and Tong, 2005), which indicates how likely synset k is to be present,

f_k(x) = \frac{1}{N - 1} \sum_{l \neq k} \frac{f_{kl}(x)}{\| w_{kl} \|},

where N is the number of base synsets. Given a sleep fMRI data sample, multilabel decoding produced a vector consisting of the output scores of the detector functions for all base synsets [f_1(x), f_2(x), ..., f_N(x)] (Fig. 9).
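
The two equations above can be translated into a short sketch as follows. It uses scikit-learn's LinearSVC as a stand-in for LIBSVM, omits the voxel selection by t-statistics, and all function names are illustrative rather than the original implementation.

import itertools
import numpy as np
from sklearn.svm import LinearSVC

def train_pairwise_decoders(X, y, C=1.0):
    """Train one linear SVM per pair of base synsets.

    X : (n_samples, n_voxels) stimulus-induced responses; y : synset label per sample.
    Returns {(k, l): (w, b)} with f_kl(x) = w.x + b, positive for k and negative for l.
    """
    decoders = {}
    for k, l in itertools.combinations(np.unique(y), 2):
        mask = (y == k) | (y == l)
        clf = LinearSVC(C=C).fit(X[mask], (y[mask] == k).astype(int))
        decoders[(k, l)] = (clf.coef_.ravel(), float(clf.intercept_[0]))
    return decoders

def discriminant(decoders, x, k, l):
    """Evaluate f_kl(x) and ||w_kl||, flipping the sign if only (l, k) is stored."""
    if (k, l) in decoders:
        w, b = decoders[(k, l)]
        return (w @ x + b), np.linalg.norm(w)
    w, b = decoders[(l, k)]
    return -(w @ x + b), np.linalg.norm(w)

def pairwise_predict(decoders, x, k, l):
    """Classify one sleep sample x as synset k or l."""
    value, _ = discriminant(decoders, x, k, l)
    return k if value > 0 else l

def detector_scores(decoders, x, synsets):
    """Multilabel decoding: f_k(x) = mean over l != k of f_kl(x) / ||w_kl||."""
    scores = {}
    for k in synsets:
        vals = []
        for l in synsets:
            if l == k:
                continue
            value, norm = discriminant(decoders, x, k, l)
            vals.append(value / norm)
        scores[k] = float(np.mean(vals))
    return scores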
(Figure 8 graphic: example table marking the exclusive presence of the "male" and "car" synsets across awakenings, the corresponding sleep fMRI activity patterns, and a pairwise decoder classifying each pattern as "male" or "car".)
Figure 8. Schematic overview of the pairwise classification analysis. A binary
classifier for pairs of base synsets was constructed. A decoder was trained with
fMRI responses to stimulus images of two base synsets, and sleep samples labeled
with either of the two synsets exclusively were tested.
(Figure 9 graphic: a sleep fMRI activity pattern fed to the multilabel decoder, which outputs continuous detector scores for the base synsets, for example "male", "food", "car", and "street".)
Figure 9. Schematic overview of the multilabel decoding analysis. The synset
detectors for each base synset were constructed from a combination of the pairwise
classifiers. Given an arbitrary sleep data sample, each detector outputs a continuous score
indicating how likely the synset is to be present in each report.
2.14 Synset pair selection by within-dataset cross-validation
To select synset pairs with content-specific patterns in both the stimulus-induced
and sleep datasets, we performed cross-validation decoding analysis for each pair
in each dataset. For the stimulus-induced dataset, samples from one run were
left for testing, and the rest were used for decoder training (repeated until all
runs were tested; leave-one-run-out cross validation) (Kamitani and Tong, 2005).
For the sleep dataset, one sample was left for testing, and the rest were used
for decoder training (leave-one-sample-out cross-validation). Note that since the
frequency in the sleep reports is generally different between paired synsets, their
available samples are usually unbalanced. To avoid possible biases in decoder
training caused by the imbalance, we trained multiple decoders by randomly
resampling a subset of the training data for the synset with more samples to
match the synset with fewer samples (repeated 11 times). The discriminant
functions calculated for all the resampled training datasets were averaged after
normalization by the norm of the weight vector to yield the discriminant function
(decoder) to be used for testing in each cross-validation step. We selected the
synset pairs that showed high cross-validation performance in both datasets
(one-tailed binomial test, uncorrected P < 0.05).
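
A minimal sketch of the balancing and selection steps just described: the larger class is randomly subsampled to the size of the smaller one, a decoder is trained on each of the 11 balanced sets, the normalized discriminant functions are averaged, and a pair is kept only if its cross-validated accuracy exceeds chance in a one-tailed binomial test. Again, LinearSVC stands in for LIBSVM and the helper names are illustrative.

import numpy as np
from scipy.stats import binomtest
from sklearn.svm import LinearSVC

def balanced_decoder(X, y, n_resample=11, C=1.0, seed=0):
    """Average normalized discriminant functions over balanced resamples.

    y is binary (0/1); the class with more samples is randomly subsampled
    to the size of the smaller class on each of the n_resample repetitions.
    """
    rng = np.random.default_rng(seed)
    idx0, idx1 = np.where(y == 0)[0], np.where(y == 1)[0]
    small, large = sorted((idx0, idx1), key=len)
    w_avg, b_avg = 0.0, 0.0
    for _ in range(n_resample):
        sub = rng.choice(large, size=len(small), replace=False)
        train = np.concatenate([small, sub])
        clf = LinearSVC(C=C).fit(X[train], y[train])
        w, b = clf.coef_.ravel(), float(clf.intercept_[0])
        norm = np.linalg.norm(w)
        w_avg = w_avg + w / norm / n_resample
        b_avg = b_avg + b / norm / n_resample
    return w_avg, b_avg

def passes_binomial_test(n_correct, n_total, p_threshold=0.05):
    """One-tailed binomial test of cross-validated accuracy against chance (50%)."""
    return binomtest(n_correct, n_total, p=0.5, alternative='greater').pvalue < p_threshold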
3. Results
3.1 Behavioral results of sleep experiments
Verbal report collections by multiple awakening procedure
Three subjects participated in the fMRI sleep experiments (Fig. 2), in which
they were woken when an EEG signature was detected (Hori et al., 1994) (Fig.
10). The subjects were asked to give a verbal report freely describing their visual
experience before awakening (Table 1; duration, 34 ± 19 s [mean ± SD]). We
repeated this procedure to attain at least 200 awakenings with a visual report
for each subject. On average, we awakened subjects every 342.0 s, and visual
contents were reported in over 75% of the awakenings (Fig. 11), indicating the
validity of our method for collecting verbal reports. Offline sleep stage scoring (Fig.
12) further selected awakenings to exclude contamination from the wake stage
in the period immediately before awakening (235, 198, and 186 awakenings for
Subject 1-3 used for decoding analyses).
(Figure 10 graphic: normalized theta power (a.u.) plotted against time to awakening (120 to 0 s) for Subjects 1, 2, and 3, with the 60-90 s baseline window marked.)
Figure 10. Time course of theta power. The time course of theta power (4-7
Hz) during 2 min before awakening is shown for each subject (error bar, 95%
CI; averaged across awakenings). For each awakening, we shifted the 9-s window
(equivalent to three fMRI volumes) by 3 s (equivalent to one fMRI volume) to
calculate the theta power from preprocessed EEG signals. The plotted points
indicate the center of the 9-s window (slightly displaced to avoid overlaps between
subjects). The power was normalized relative to the mean power during the time
window of 60-90 s prior to each awakening (gray area). The result of this offline
analysis is consistent with the awakening procedure in which theta ripples were
detected in online monitoring to determine the awakening timing.
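
The offline theta-power time course in Fig. 10 can be reproduced schematically as follows: band power in the 4-7 Hz range is computed over 9-s windows shifted by 3 s and normalized to the mean power in the 60-90 s pre-awakening window. Only the window, step, band, and baseline parameters come from the caption; the spectral estimator and the remaining details are illustrative.

import numpy as np
from scipy.signal import welch

def theta_power_timecourse(eeg, fs=500.0, win_s=9.0, step_s=3.0, band=(4.0, 7.0)):
    """Theta band power in sliding 9-s windows shifted by 3 s.

    eeg : 1-D preprocessed signal from one occipital channel, ending at awakening.
    Returns window centers (s relative to awakening, negative = before) and
    theta power normalized to the 60-90 s pre-awakening baseline.
    """
    win, step = int(win_s * fs), int(step_s * fs)
    centers, power = [], []
    for start in range(0, len(eeg) - win + 1, step):
        segment = eeg[start:start + win]
        freqs, psd = welch(segment, fs=fs, nperseg=min(len(segment), int(2 * fs)))
        in_band = (freqs >= band[0]) & (freqs <= band[1])
        power.append(np.trapz(psd[in_band], freqs[in_band]))
        centers.append((start + win / 2 - len(eeg)) / fs)
    centers, power = np.asarray(centers), np.asarray(power)

    # Normalize to the mean power 60-90 s prior to awakening (gray area in Fig. 10).
    baseline = power[(centers >= -90.0) & (centers <= -60.0)].mean()
    return centers, power / baseline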
Table 1. Examples of Verbal Reports

Subject 1, awakening #13
Report: Well, what was that? Two male persons, well, what was that? I cannot remember very well, but there were e-mail texts. There were also characters. E-mail address? Yes, there were a lot of e-mail addresses. And two male persons existed.
Words: male person; text; character
Base synsets: male (male person); writing (written material, piece of writing); character (grapheme, graphic symbol)

Subject 1, awakening #133
Report: Well, somewhere, in a place like a studio to make a TV program or something, well, a male person ran with short steps, or run, from the left side to the right side. Then, he tumbled. He stumbled over something, and stood up while laughing, and said something. He said something to persons on the left side. So, well, a person ran, and there were a lot of unknown people. I did not saw female persons. There were a group of a lot of people, either male or female. The place was like a studio. Though there are a lot of variety for studios, the studio was a huge room. Since it was a huge room, it was indoor maybe. I saw such a scene in the huge room.
Words: studio; room; male; person; people
Base synsets: workplace (work); room; male (male person); group (grouping)

Subject 2, awakening #54
Report: Yes, I had a dream. Something long. First at some shop. Ah, a bakery shop. I was choosing some merchandise. I took a roll in which a leaf of perilla was put. Then, I went out, and on the street, I saw a person who were doing something like taking a photograph.
Words: shop; bakery; merchandise; roll; leaf; perilla; street; person
Base synsets: point; mercantile establishment (retail store, sales outlet, outlet); commodity (trade good, good); food (solid food); street

Subject 2, awakening #130
Report: Yes, ah, yes, a female person, well, existed. The person served some foods, umm, like a flight attendant. Then, well, before that scene, I saw a scene in which I ate or saw yogurt, or I saw yogurt or a scene in which yogurt was served. What appeared was the female person and an unknown thing like a refrigerator. Maybe indoor, with colors.
Words: food; flight attendant; yogurt; refrigerator; person; female person
Base synsets: food (solid food); female (female person); commodity (trade good, good)

Subject 3, awakening #114
Report: Well, from the sky, from the sky, well, what was it? I saw something like a bronze statue, a big bronze statue. The bronze statue existed on a small hill. Below the hill, there were houses, streets, and trees in an ordinary way.
Words: sky; statue; hill; house; street; tree
Base synsets: geological formation (formation); house; way; vascular plant (tracheophyte)

Subject 3, awakening #186
Report: Well, in the night, somewhere, well, in a restaurant in office building covered with windowpanes, on the big table, someone important looked a menu and chose dishes. There were both male and female persons. Then, there was a night scenery from the window.
Words: restaurant; office building; windowpane; table; someone; menu; male; female; scenery; window
Base synsets: table; communication; male (male person); female (female person)

Note. Originally, the reports and words were verbally reported in Japanese. They were transcribed into text and were translated into English (verified by a bilingual speaker). Note that not all reported visual words were assigned with a base synset (e.g., person in report #133 of Subject 1, and perilla in report #54 of Subject 2). This is because 1) the word and its hypernyms did not appear in ten or more reports, or 2) the synset assigned to the word is a hypernym of other synsets (see Methods "2.8 Visual dream content labeling"). On average, 47.7% of reported visual words were assigned with base synsets (49.5%, 50.0%, and 42.0% for Subject 1-3, respectively).
(Figure 11 graphic: stacked bar chart of awakenings per subject. Subject 1: 249 with visual content, 58 without, 307 total awakenings over 10 experiments; Subject 2: 220 with, 61 without, 281 total over 7 experiments; Subject 3: 203 with, 63 without, 266 total over 7 experiments.)
Figure 11. Awakening statistics. The numbers of awakenings with/without visual
contents are shown for each subject (numbers of experiments in parentheses).
28
100
With visual content
Subject 1
No visual content
50
Proportion (%)
0
100
Subject 2
50
0
100
Subject 3
50
0
120
60
0 120
60
Time to awakening (s)
Wake
Stage1
0
Stage2
Figure 12. Time course of sleep state proportion. The proportion of wake/sleep
states (Wake, Stage 1, and Stage 2) is shown for all the awakenings with and
without visual content in each subject. Offline sleep stage scoring was conducted
for every 15-s epoch using simultaneously recorded EEG data. The last 15-s
epoch before awakening was classified as Sleep Stage 1 or 2 in over 90% of the
awakenings with visual contents, but in fewer awakenings with no visual content.
This result suggests that most visual reports are indeed associated with dreaming
during sleep, not with imagery during wakefulness. The samples in which the
last 15-s epoch before awakening was classified as Wake were not used for further
analyses.
Reported visual dream contents
From the collected reports, words describing visual objects or scenes were manually extracted and mapped to WordNet, a lexical database in which semantically
similar words are grouped as synsets in a hierarchical structure (Fig. 3; Fellbaum,
1998; Huth et al., 2012). Using a semantic hierarchy, we grouped extracted visual
words into base synsets that appeared in at least 10 reports from each subject (26,
18, and 16 synsets for Subject 1-3; Tables 2-4). The fMRI data obtained before
each awakening were labeled with a visual content vector, each element of which
indicated the presence/absence of a base synset in the subsequent report (Fig.
4). We also collected images depicting each base synset from ImageNet (Deng
et al., 2009), an image database in which web images are grouped according to
WordNet, or Google Images, for decoder training.
Table 2. List of Base Synsets for Subject 1
Base synset
ID
Definition
Reported word
Count
Meta-category
male (male person)
09624168
a person who belongs to the sex that
cannot have babies
gentleman, boy, middle-aged man,
old man, young man, male, dandy
127
Human
character (grapheme, graphic
symbol)
06818970
a written symbol that is used to
represent speech
character, letter
35
Others
room
04105893
an area within a building enclosed by
walls and floor and ceiling
booth, conference room, room,
toilet
20
Scene
workplace (work)
04602044
a place where work is done
laboratory, recording studio,
studio, workplace
17
Scene
external body part
05225090
any body part visible externally
lip, hand, face
17
Others
natural object
00019128
an object occurring naturally; not made
by man
leaf, branch, figure, beard,
mustache, orange, coconut, moon,
sun
13
Others
building (edifice)
02913152
a structure that has a roof and walls
and stands more or less permanently
in one place
bathhouse, building, house,
restaurant, schoolhouse, school
12
Scene
clothing (article of clothing,
vesture, wear, wearable,
habiliment)
03051540
a covering designed to be worn on a
person's body
clothes, baseball cap, clothing, coat,
costume, tuxedo, silk hat, hat, T-shirt,
kimono, muffler, polo shirt, suit,
uniform
13
Object
chair
03001627
a seat for one person, with a support
for the back
chair, folding chair, wheelchair
12
Object
picture (image, icon, ikon)
03931044
a visual representation (of an object or
scene or person or abstraction)
produced on a surface
graphic, picture, image, portrait
12
Others
shape (form)
00027807
the spatial arrangement of something
as distinct from its substance
circle, square, quadrangle, box, node,
point, dot, tree, tree diagram, hole
11
Others
vertebrate (craniate)
01471682
animals having a bony or cartilaginous
skeleton with a segmented spinal
column and a large brain enclosed in a
skull or cranium
bird, ostrich, raptor, hawk, falcon,
eagle, frog, snake, dog, leopard,
horse, sheep, monkey, fish, skipjack
tuna
11
Others
implement
03563967
instrumentation (a piece of equipment
or tool) used to effect an end
trigger, hammer, ice pick, pan, pen,
pencil, plunger, pole, pot, return key,
stick, wok
10
Object
way
04564698
any artifact consisting of a road or path
affording passage from one place to
another
hallway, hall, passageway, pedestrian
crossing, stairway, street
11
Scene
window
04588739
(computer science) a rectangular part
of a computer screen that contains a
display different from the rest of the screen
window
11
Object
girl (miss, missy, young lady,
young woman, fille)
10129825
a young woman
girl, young woman
11
Human
material (stuff)
14580897
the tangible substance that goes into the
makeup of a physical object
water, paper, sand, wood, sheet,
leaf, page
11
Others
cognition (knowledge, noesis)
00023271
the psychological result of perception and
learning and reasoning
symbol, profile, monster, character
10
Others
group (grouping)
00031264
any number of entities (members)
considered as a unit
string, people, pair, pop group, band,
calendar, line, forest
10
Others
table
04379243
a piece of furniture having a smooth flat
top that is usually supported by one or
more vertical legs
desk, stand, table
10
Object
code (computer code)
06355894
(computer science) the symbolic
arrangement of data or instructions in a
computer program or the set of such
instructions
computer code, code
10
Others
writing (written material, piece
of writing)
06362953
the work of a writer; anything expressed in
letters of the alphabet (especially when
considered from the point of view of style
and effect)
draft, text, document, written
document, clipping, line
10
Others
line
06799897
a mark that is long relative to its width
line
10
Others
illustration
06999233
artwork that helps make something clear
or attractive
illustration, figure, Figure
10
Others
geographical area (geographic
area, geographical region,
geographic region)
08574314
a demarcated area of the Earth
tennis court, campus, playing field,
ground, field, lawn, park, parking area,
square, public square, town
10
Scene
performer (performing artist)
10415638
an entertainer who performs a dramatic
or musical work for an audience
idol, singer, actor, actress, clown,
comedian
10
Human
Note. In this study, a representative instance provided by WordNet for each synset was used as the name of the base synset. Here, other instances were described
in parentheses.
Table 3. List of Base Synsets for Subject 2
Base synset | ID | Definition | Reported word | Count | Meta-category
character (grapheme, graphic symbol) | 06818970 | a written symbol that is used to represent speech | character | 34 | Others
male (male person) | 09624168 | a person who belongs to the sex that cannot have babies | boy, male, male person | 27 | Human
street | 04334599 | a thoroughfare (usually including sidewalks) that is lined with buildings | street | 21 | Scene
car (auto, automobile, machine, motorcar) | 02958343 | a motor vehicle with four wheels; usually propelled by an internal combustion engine | car, patrol car, police cruiser, used-car | 17 | Object
food (solid food) | 07555863 | any solid substance (as opposed to liquid) that is used as a source of nourishment | food, chocolate bar, apple pie, cake, cookie, bread, roll, noodle, tomato, cherry tomato, yogurt, yoghurt | 19 | Object
building (edifice) | 02913152 | a structure that has a roof and walls and stands more or less permanently in one place | apartment house, apartment building, building, coffee shop, house, library, school | 18 | Scene
representation | 04076846 | a creation that is a visual or tangible rendering of someone or something | map, model, photograph, photo, picture, snowman | 14 | Others
furniture (piece of furniture, article of furniture) | 03405725 | furnishings that make a room or other area ready for occupancy | bed, chair, counter, desk, furniture, hospital bed, sofa, couch, table | 13 | Object
female (female person) | 09619168 | a person who belongs to the sex that can have babies | girl, wife, female, female person | 13 | Human
book (volume) | 02870092 | physical objects consisting of a number of pages bound together | book, notebook | 12 | Object
point | 08620061 | the precise location of something; a spatially limited location | bakery, corner, crossing, intersection, crossroad, laboratory, level crossing, studio, bus stop, port, Kobe | 12 | Scene
commodity (trade good, good) | 03076708 | articles of commerce | hat, iron, jacket, T-shirt, Kimono, kimono, merchandise, refrigerator, shirt, stove | 11 | Object
computer screen (computer display) | 03085602 | a screen used to display the output of a computer to the user | computer screen, computer display | 10 | Object
electronic equipment | 03278248 | equipment that involves the controlled conduction of electrons (especially in a gas or vacuum or semiconductor) | amplifier, mobile phone, cell phone, cellular phone, printer, television, TV | 11 | Object
mercantile establishment (retail store, sales outlet, outlet) | 03748162 | a place of business for retailing goods | bakery, bookstore, booth, convenience store, department store, shopping center, shopping mall, shop, stall, supermarket | 11 | Scene
region | 08630985 | a large indefinite location on the surface of the Earth | garden, downtown, park, parking area, scenery, town, Kobe | 10 | Scene
covering | 03122748 | an artifact that covers something else (usually to protect or shelter or conceal it) | accessory, accessories, clothes, covering, flying carpet, hat, jacket, T-shirt, Kimono, kimono, shirt, slipper | 10 | Object
dwelling (home, domicile, abode, habitation, dwelling house) | 03259505 | housing that someone is living in | home, house | 10 | Scene
Table 4. List of Base Synsets for Subject 3
Base synset | ID | Definition | Reported word | Count | Meta-category
male (male person) | 09624168 | a person who belongs to the sex that cannot have babies | old man, male | 28 | Human
way | 04564698 | any artifact consisting of a road or path affording passage from one place to another | entrance, hallway, footpath, penny arcade, pipe, sewer, staircase, stairway, street, tunnel | 19 | Scene
room | 04105893 | an area within a building enclosed by walls and floor and ceiling | lobby, hospital room, kitchen, trunk, operating room, room | 18 | Scene
tract (piece of land, piece of ground, parcel of land, parcel) | 08673395 | an extended area of land | garden, field, ground, athletic field, green, lawn, grassland, rice paddy, paddy field, park, parking area, savannah, savanna | 18 | Scene
female (female person) | 09619168 | a person who belongs to the sex that can have babies | female | 14 | Human
communication | 00033020 | something that is communicated by or to or between people or groups | subtitle, text, menu, theatre ticket, poster, character, traffic light, traffic signal, graph, mark | 13 | Others
vertebrate (craniate) | 01471682 | animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium | bird, hawk, eagle, frog, whale, dog, cheetah, horse, water buffalo, sheep, giraffe, fish | 12 | Others
display (video display) | 03211117 | an electronic device that represents information in visual form | computer screen, computer display, display, screen, monitor, window | 11 | Object
surface | 04362025 | the outer boundary of an artifact or a material layer constituting or resembling such a boundary | ceiling, floor, platform, screen, stage | 11 | Others
material (stuff) | 14580897 | the tangible substance that goes into the makeup of a physical object | gravel, cardboard, clay, earth, water, log, paper, playing card | 12 | Others
car (auto, automobile, machine, motorcar) | 02958343 | a motor vehicle with four wheels; usually propelled by an internal combustion engine | car, jeep, sport car, sports car, sport car | 11 | Object
house | 03544360 | a dwelling that serves as living quarters for one or more families | house | 12 | Scene
external body part | 05225090 | any body part visible externally | head, neck, chest, breast, leg, foot, hand, finger, face, human face, face | 11 | Others
geological formation (formation) | 09287968 | (geology) the geological features of the earth | bank, beach, gorge, hill, mountain, slope | 11 | Scene
table | 04379243 | a piece of furniture having a smooth flat top that is usually supported by one or more vertical legs | counter, desk, operating table, table | 10 | Object
vascular plant (tracheophyte) | 13083586 | green plant having a vascular system: ferns, gymnosperms, angiosperms | flower, rice, tree | 10 | Scene
3.2 Dream contents decoding
We constructed decoders by training linear support vector machines (SVM) (Vapnik, 1998) on fMRI data measured while each subject viewed web images for each
base synset. Multivoxel patterns in the higher visual cortex (HVC; the ventral
region covering the lateral occipital complex [LOC], fusiform face area [FFA],
and parahippocampal place area [PPA]; 1,000 voxels), the lower visual cortex
(LVC; V1-V3 combined; 1,000 voxels), or the subareas (400 voxels for each area)
were used as the input for the decoders (Fig. 6). To demonstrate dream content decoding, we performed three types of analysis, described in the following: classification, detection, and identification.
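As an illustration of this decoder construction, the following minimal sketch shows how one multivoxel input sample might be assembled and a linear SVM trained on a set of such samples. It uses Python with scikit-learn's LinearSVC as a stand-in for the LIBSVM implementation cited in this thesis; the array names, the toy data, the per-sample z-normalization, and the voxel counts are illustrative assumptions rather than the exact preprocessing pipeline.

```python
# Minimal sketch (not the thesis pipeline): average three fMRI volumes (9 s)
# within an ROI into one multivoxel sample, z-normalize it across voxels
# (assumed preprocessing), and train a linear SVM decoder on such samples.
import numpy as np
from sklearn.svm import LinearSVC

def make_sample(roi_volumes):
    """roi_volumes: hypothetical array of shape (3, n_voxels) for one 9-s block."""
    pattern = roi_volumes.mean(axis=0)                  # three-volume average
    return (pattern - pattern.mean()) / pattern.std()   # per-sample z-normalization

# Toy data standing in for stimulus-induced samples (e.g., 1,000 HVC voxels).
rng = np.random.default_rng(0)
X = np.stack([make_sample(rng.standard_normal((3, 1000))) for _ in range(100)])
y = np.tile([0, 1], 50)                                 # labels for two base synsets

decoder = LinearSVC(C=1.0).fit(X, y)                    # linear SVM decoder
print(decoder.score(X, y))                              # training accuracy (illustration only)
```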
3.2.1 Pairwise classification analysis
In the classification analysis, a binary classifier was first trained on the fMRI responses to stimulus images of two base synsets (three-volume averaged data corresponding to the 9-s stimulus block), and then tested on the sleep samples (three-volume [9-s] averaged data immediately before awakening) that contained exclusively one of the two synsets, ignoring other concurrent synsets (Fig. 8; stimulus-to-dream decoding analysis). We only used synset pairs in which one of the synsets appeared in at least 10 reports without co-occurrence with the other (201, 118, and 86 pairs for Subjects 1-3).
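The train/test logic of this pairwise stimulus-to-dream scheme can be sketched as follows. The function and data structures (stim_X, stim_y, sleep_X, content) are hypothetical stand-ins introduced only to make the sample selection explicit, not the actual code used in this work.

```python
# Sketch of pairwise stimulus-to-dream decoding under assumed data layouts:
# stim_X (samples x voxels) and stim_y hold 9-s averaged stimulus samples and
# synset labels; sleep_X holds pre-awakening sleep samples; content is a
# binary (samples x synsets) matrix of reported synsets.
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_stim_to_dream(stim_X, stim_y, sleep_X, content, syn_a, syn_b):
    # Train a binary classifier on stimulus responses to the two synsets.
    train_mask = np.isin(stim_y, [syn_a, syn_b])
    clf = LinearSVC().fit(stim_X[train_mask], stim_y[train_mask])

    # Test only on sleep samples containing exactly one of the two synsets
    # (other concurrent synsets are ignored).
    has_a, has_b = content[:, syn_a] > 0, content[:, syn_b] > 0
    test_mask = has_a ^ has_b
    y_true = np.where(has_a[test_mask], syn_a, syn_b)
    return np.mean(clf.predict(sleep_X[test_mask]) == y_true)
```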
Pairwise stimulus-to-dream decoding analysis
The distribution of the pairwise stimulus-to-dream decoding accuracies for HVC
is shown together with that from the decoders trained on the same stimulus-induced fMRI data with randomly shuffled synset labels (Fig. 13; fig. 29, individual subjects). The mean decoding accuracy was 60.0% (95% confidence
interval, CI, [59.0, 61.0]; three subjects pooled), significantly higher than that of
the label-shuffled decoders with both Wilcoxon rank-sum and permutation tests
(P < 0.001). The synsets of a pair can have unbalanced numbers of samples,
which could potentially lead to some bias. However, when the correct rates were
calculated for each of the paired synsets and then averaged, the averaged correct rates
were highly correlated with the correct rates for all pooled samples (correlation
coefficients for Subject 1-3, 0.96, 0.97, and 0.97, respectively). Therefore the bias,
if any, is likely to be small.
Figure 13. Distributions of pairwise decoding accuracy for stimulus-to-dream
decoding. Distributions of decoding accuracies with original and label-shuffled data for all pairs (light blue and gray) and selected pairs (dark blue and black) are shown (three subjects pooled; chance level = 50%).
Commonality of neural representation between perception and dreaming
To examine the commonality of brain activity between perception and sleep-onset dreaming, we focused on the synset pairs that produced content-specific patterns in each of the stimulus and sleep experiments (pairs with high cross-validation classification accuracy within each of the stimulus and sleep datasets;
Fig. 14; figs. 30, 31 individual subjects). With the selected pairs, even higher
accuracies were obtained (mean = 70.3%, CI [68.5, 72.1]; Fig. 13, dark blue; fig.
29, individual subjects; Tables S5-S7, lists of the selected pairs), indicating that
content-specific patterns are highly consistent between perception and sleep-onset
dreaming. The selection of synset pairs, which used knowledge of the test (sleep)
data, does not bias the null distribution by the label-shuffled decoders (Fig. 13,
black), because the content specificity in the sleep dataset alone does not imply
commonality between the two datasets. Accurate stimulus-to-dream decoding
requires that stimulus and dream data share similar content-specific patterns:
in the case of binary classification, the class boundary in dream data should be
similar to that in stimulus data.
Figure 14. Within dataset cross-validation decoding. The cross-validation analysis of the stimulus-induced and sleep datasets yielded the distribution of accuracies for (A) stimulus-to-stimulus (SS) pairwise decoding and (B) dream-to-dream
(DD) pairwise decoding (three subjects pooled, HVC; shown together with the
distribution from label-shuffled decoders). The mean decoding accuracies for SS
and DD decoding were 83.4% (95% CI [82.3, 84.6]) and 54.8% (95% CI [53.6,
56.0]), respectively (three subjects pooled). The fraction of pairs that showed
significantly better decoding performance than chance level (one-tailed binomial
test, uncorrected P < 0.05) for SS decoding was 96.0% (389/405) and for DD
decoding was 24.7% (100/405) (three subjects pooled). The performance was
significantly better than that from label-shuffled decoders for both SS and DD
decoding (Wilcoxon rank-sum test, P < 0.001). Note that although the dream-to-dream decoding shows marginally significant accuracies (depending on subjects,
see Fig. 31), it is not as accurate as the stimulus-to-dream decoding. This is presumably because training samples from the sleep dataset were fewer and noisier
than those from the stimulus-induced dataset, and thus decoders were not well
trained with the sleep dataset.
Representational similarity between stimulus perception and dreaming
To further examine the commonality of representations for perception and sleep-onset dreaming, we performed representational similarity analysis (RSA) (Kriegeskorte et al., 2008) using the accuracies from multiple pairs of both stimulus-to-stimulus (SS) decoding and dream-to-dream (DD) decoding (Fig. 14). RSA posits that if two modalities (in our case, perception and dreaming) share a common representation, the dissimilarities among categories calculated from each modality should be similar. Thus, treating the pairwise accuracies from SS and DD decoding as measures of representational dissimilarity, we calculated correlations between them and observed positive correlations in two of the three subjects when the activity of HVC was used to train decoders (Fig. 15; r = 0.14, P < 0.05 for Subject 1; r = 0.18, P < 0.05 for Subject 2; r = -0.01, P > 0.05 for Subject 3). Although the RSA results for Subject 3 did not show a positive correlation between SS and DD decoding accuracies, this could be attributed to the low DD decoding accuracy, which may stem from the small number of training samples.
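The comparison underlying this RSA reduces to correlating the two sets of pairwise accuracies over the synset pairs available in both analyses, roughly as sketched below; ss_acc and dd_acc are hypothetical dictionaries mapping a synset pair to its decoding accuracy, and scipy's pearsonr stands in for the correlation and its significance test.

```python
# Sketch: correlate pairwise SS and DD decoding accuracies (treated as
# representational dissimilarities) over the pairs computed in both analyses.
import numpy as np
from scipy.stats import pearsonr

def rsa_correlation(ss_acc, dd_acc):
    pairs = sorted(set(ss_acc) & set(dd_acc))   # pairs available in both analyses
    ss = np.array([ss_acc[p] for p in pairs])
    dd = np.array([dd_acc[p] for p in pairs])
    return pearsonr(ss, dd)                     # correlation coefficient and p-value
```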
Figure 15. Representational similarity analysis. (A) Accuracy matrices of SS
and DD decoding as a measure of neural dissimilarity (for DD decoding, three-volume [9-s] averaged data immediately before awakening in HVC were used). Each row and column indicates a base synset of each subject. Only the pairs that have an ample number of samples were used for this analysis (cyan cells
indicate uncalculated pairs). (B) Plot of accuracy for the DD decoding analysis
against accuracy for SS decoding analysis. Each dot indicates accuracy of a pair.
The blue line indicates the regression line. (C) Correlation coefficient between
accuracies of multiple pairs for SS and DD decoding (asterisk, P < 0.05).
Contributions of multivoxel patterns to dream content decoding
To quantitatively evaluate the benefit of using multivoxel patterns over fluctuations in the global signal level within each region, we performed the same pairwise decoding analysis but with the voxel values averaged within each data sample, and compared the decoding accuracy with averaged activity to that from the multivoxel decoders (Fig. 13, the top distribution). The performance with averaged activity was close to chance level and significantly worse than that with multivoxel activity (Fig. 16; Wilcoxon signed-rank test, P < 0.001, all subjects; fig. 32, individual subjects), revealing that the multivoxel pattern, rather than the average activity level, was critical for decoding.
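A minimal sketch of this control analysis is given below, under the assumption that the data are arranged as samples x voxels arrays; the only point it illustrates is that each sample is collapsed to its mean voxel value, so the decoder can exploit only the global signal level of the region.

```python
# Sketch of the averaged-activity control: reduce each multivoxel sample to
# its mean voxel value (computed before any per-sample z-normalization) and
# run the same pairwise decoding on this single-feature input.
import numpy as np
from sklearn.svm import LinearSVC

def averaged_activity_accuracy(train_X, train_y, test_X, test_y):
    train_mean = train_X.mean(axis=1, keepdims=True)   # one value per sample
    test_mean = test_X.mean(axis=1, keepdims=True)
    clf = LinearSVC().fit(train_mean, train_y)
    return np.mean(clf.predict(test_mean) == test_y)
```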
Figure 16. Decoding with averaged vs. multivoxel activity. (A) Histograms of
pairwise decoding accuracy with averaged and multivoxel activity. (B) Mean
pairwise decoding accuracies with averaged and multivoxel activity. The voxel
values in each data sample from HVC were averaged (before the z-transformation
in each sample), and the averaged activity was used as the input to the pairwise
decoders (error bar, 95% CI; averaged across all pairs).
Variations of decoding performance and semantic differences between
paired synsets
While the decoding performance was significantly above chance on average, a large variation in performance was observed. To explain this variation, we hypothesized that the semantic difference between the paired synsets affects the performance. We therefore grouped the pairwise decoding accuracies into pairs within and across meta-categories (human, object, scene, and others; Tables 2-4) and compared the two groups. The decoding accuracy for synsets paired across meta-categories was significantly higher than that for synsets paired within meta-categories, though the significance level varied across subjects (Fig. 17; Wilcoxon rank-sum test, P = 0.261 for Subject 1; P = 0.003 for Subject 2; P = 0.020 for Subject 3; P < 0.001 for the pooled data; error bar, 95% CI). However, even within a meta-category, the mean decoding accuracy significantly exceeded chance level, indicating specificity to fine object categories.
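The grouping and comparison can be sketched as below, assuming a hypothetical dictionary of pairwise accuracies and a mapping from each synset to its meta-category; scipy's ranksums implements the Wilcoxon rank-sum test used above, and within-others pairs are excluded as in the text.

```python
# Sketch: split pairwise accuracies into within- and across-meta-category
# groups and test the difference with a Wilcoxon rank-sum test.
from scipy.stats import ranksums

def compare_meta_category_pairs(pair_acc, meta):
    within, across = [], []
    for (a, b), acc in pair_acc.items():
        if meta[a] == meta[b] == 'others':
            continue                                    # heterogeneous category; excluded
        (within if meta[a] == meta[b] else across).append(acc)
    stat, p_value = ranksums(across, within)
    return sum(across) / len(across), sum(within) / len(within), p_value
```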
Figure 17. Mean accuracies for the pairs within and across meta-categories. Pairwise decoding accuracies grouped into the pairs within and across meta-categories
are shown (individual subjects and three subjects pooled). The numbers of available pairs are denoted in parentheses. Because the others category contains a
large variety of synsets in terms of semantic similarity, the pairs of synsets in the
others category were excluded from this analysis.
Pairwise decoding accuracy for each sleep state
To examine the dependence of pairwise decoding accuracy on sleep state, the samples from each sleep state (Fig. 12) were evaluated separately (three-volume [9-s] averaged data immediately before awakening). While the accuracy for samples from the awake state tended to be slightly lower than that for the other states (sleep stages 1 and 2), no consistent difference in decoding accuracy between sleep stages 1 and 2 was observed across the three subjects (Fig. 18; fig. 33, individual subjects). The slightly lower performance for the awake state may be explained by the possibility that contents reported after periods judged as awake reflect dream contents experienced long before awakening, whose information could not be decoded from the brain activity just before awakening. These results may indicate that the representations of objects or scenes are moderately stable and largely unaffected by sleep state.
[Figure 18 plot (three subjects pooled, HVC). Numbers of samples per sleep state: Stage 1 and 2 combined, 619; Awake, 53; Stage 1, 433; Stage 2, 186.]
Figure 18. Mean accuracies for the samples from each sleep state. The decoding
accuracy was separately evaluated for the samples from each sleep state judged
from the last 15-s epoch before awakening (Fig. 12). The number of samples is
denoted in parentheses.
Pairwise decoding accuracies across visual areas
To investigate the performance difference across visual areas, the mean decoding
accuracy from each subarea was evaluated (Fig. 19; fig. 34, individual subjects).
The results showed that the LVC scored 54.3% (CI [53.4, 55.2]) for all pairs and 57.2% (CI [54.2, 60.2]) for selected pairs (three subjects pooled). The performance was significantly above chance level but worse than that for HVC. Individual areas (V1-V3, LOC, FFA, and PPA) showed a gradual increase in accuracy along the visual processing pathway, mirroring the progressively complex response properties from low-level image features to object-level features (Kobatake and Tanaka, 1994).
Figure 19. Pairwise decoding accuracies across visual cortical areas. The numbers
of selected pairs for V1, V2, V3, LOC, FFA, PPA, LVC, and HVC were 45, 50,
55, 70, 48, 78, 55, and 97. Error bars indicate 95% CI, and black dashed lines
denote chance level.
Time course of decoding accuracy
To specify the timing of dreaming, we examined the time course of the pairwise decoding accuracy calculated from samples around the awakening.
When the time window was shifted around the awakening timing, the decoding
accuracy peaked around 0-10 s before awakening (Fig. 20 and fig. 35; no correction for hemodynamic delay). The high accuracies after awakening may be due
to hemodynamic delay and the large time window. Thus, verbal reports are likely
to reflect brain activity immediately before awakening. Note that fMRI signals
after awakening may be contaminated with movement artifacts and brain activity
associated with mental states during verbal reports. Mental states during verbal
reports are unlikely to explain the high accuracy immediately after awakening,
because the accuracy profile does not match the mean duration of verbal reports
(34 ± 19 s, mean ± SD; three subjects pooled).
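The sliding-window construction used for this time-course analysis can be sketched as follows; the array layout (one volume every 3 s, with the awakening volume at a known index), the offsets, and the omission of boundary handling are simplifying assumptions for illustration only.

```python
# Sketch: build three-volume (9-s) averaged samples centred at successive
# time points around awakening; decoders are then applied to each sample.
import numpy as np

def windowed_samples(volumes, awakening_idx, offsets_s, tr=3.0):
    """volumes: hypothetical (time x voxels) array for one awakening."""
    samples = []
    for offset in offsets_s:                            # e.g., range(-48, 13, 3)
        centre = awakening_idx + int(round(offset / tr))
        window = volumes[centre - 1:centre + 2]         # three volumes (boundary checks omitted)
        samples.append(window.mean(axis=0))             # 9-s averaged pattern
    return np.stack(samples)
```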
Figure 20. Time course of pairwise decoding accuracy. The time course of pairwise decoding accuracy is shown (three subjects pooled; shades, 95% CI; averaged
across all or selected pairs and subjects). Averages of three fMRI volumes (9 s;
HVC or LVC) around each time point were used as inputs to the decoders. The
performance is plotted at the center of the window. The gray region indicates
the time window used for the main analyses (the arrow denotes the performance
obtained from the time window). No corrections for hemodynamic delay were
conducted.
Pairwise decoding accuracy across whole cortical areas
To further investigate the potential for representing dream contents across the whole cortex, the time courses of mean pairwise decoding accuracies from local areas that collectively cover the whole cortical surface were evaluated (Fig. 7), and the accuracies were mapped on the cortical surface (Fig. 21 for the stimulus-to-stimulus decoding analysis; Figs. 22-24 for the stimulus-to-dream decoding analysis). The stimulus-to-dream decoding accuracy maps from the three subjects consistently showed high decoding accuracy in visual areas around the timing of awakening, as already shown in the previous analyses (Figs. 13, 19). Additionally, in all three subjects the parietal areas showed relatively high accuracy not only before but also after the awakening, during a period that roughly corresponds to the duration of the report (34 ± 19 s, mean ± SD; three subjects pooled).
Figure 21. Stimulus-to-stimulus decoding accuracy on whole cortical areas for the
three subjects. The mean accuracies of stimulus-to-stimulus decoding analysis
from the anatomically defined ROIs were mapped on a flattened cortical surface.
Note that the colorbar ranges for three subjects were different.
Figure 22. Time course of stimulus-to-dream decoding accuracy on whole cortical
areas for Subject 1. The mean accuracies of stimulus-to-dream decoding analysis
from the anatomically defined ROIs were mapped on a flattened cortical surface.
Averages of three fMRI volumes (9 s) around each time point were used as inputs
to the decoders. No corrections for hemodynamic delay were conducted.
Figure 23. Time course of stimulus-to-dream decoding accuracy on whole cortical
areas for Subject 2. The mean accuracies of stimulus-to-dream decoding analysis
from the anatomically defined ROIs were mapped on a flattened cortical surface.
Averages of three fMRI volumes (9 s) around each time point were used as inputs
to the decoders. No corrections for hemodynamic delay were conducted.
Figure 24. Time course of stimulus-to-dream decoding accuracy on whole cortical
areas for Subject 3. The mean accuracies of stimulus-to-dream decoding analysis
from the anatomically defined ROIs were mapped on a flattened cortical surface.
Averages of three fMRI volumes (9 s) around each time point were used as inputs
to the decoders. No corrections for hemodynamic delay were conducted.
3.2.2 Multilabel decoding analysis
While the pairwise classification analysis provided a benchmark performance by preselecting synsets and dream samples for binary classification, we next performed a multilabel decoding analysis to read out richer contents from arbitrary sleep data. In this analysis, the presence or absence of each base synset was predicted by a synset detector constructed from a combination of pairwise decoders (Fig. 9). The synset detector provided a continuous score indicating how likely the synset is to be present in each report.
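One way such a detector could be assembled from pairwise (1 vs 1) decoders is sketched below: the score for a target synset is taken as the average signed decision value of all pairwise classifiers involving it, counted positively when the target side is favoured. This combination rule and the data structure (a dictionary of fitted classifiers) are illustrative assumptions, not necessarily the exact scheme used here.

```python
# Sketch: continuous synset score from a combination of pairwise decoders.
# pair_clfs is a hypothetical dict {(syn_a, syn_b): fitted binary LinearSVC}.
import numpy as np

def synset_score(pair_clfs, x, target):
    scores = []
    for (a, b), clf in pair_clfs.items():
        if target not in (a, b):
            continue
        margin = clf.decision_function(x.reshape(1, -1))[0]
        # A positive decision value favours clf.classes_[1]; flip the sign otherwise.
        scores.append(margin if clf.classes_[1] == target else -margin)
    return np.mean(scores) if scores else float('nan')
```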
Receiver operating characteristic (ROC) curves and the area under the curve (AUC) for each synset
We calculated receiver operating characteristic (ROC) curves for each base synset
by shifting the detection threshold for the output score (Fig. 25; HVC; three-volume [9-s] averaged data immediately before awakening), and the detection
performance was quantified by the area under the curve (AUC). Although the
performance varied across synsets, 18 out of the total 60 synsets were accurately
detected (Wilcoxon rank-sum test, uncorrected P < 0.05; 7/26 synsets for Subject
1, 8/18 for Subject 2, and 3/16 for Subject 3), greatly exceeding the number of
synsets expected by chance (0.05 × 60 = 3). Although this analysis used only the samples with at least one visual report, we also compared, for each synset, the scores of the samples in which the synset was reported with those of the samples with no visual report; the former were significantly higher than the latter for 15/60 synsets (Wilcoxon rank-sum test, P < 0.05; three subjects pooled; 4/26 synsets for Subject 1, 7/18 for Subject 2, and 4/16 for Subject 3), a result similar to that obtained with the samples with visual reports, providing further support for our conclusions. In addition, while we used multilabel decoders constructed from combinations of pairwise decoders (1-vs-1 classifiers), the same analysis with multilabel decoders constructed in another way (1-vs-others classifiers) also showed similar performance (Wilcoxon rank-sum test, P < 0.05, 12/60 for three subjects pooled; 3/26 synsets for Subject 1, 8/18 for Subject 2, and 1/16 for Subject 3).
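In practice, sweeping the detection threshold and measuring the area under the resulting ROC curve reduces to a standard AUC computation over the synset scores, as in this minimal sketch; the arrays (scores per sleep sample and binary reported/unreported labels) are hypothetical, and scikit-learn's roc_auc_score stands in for the threshold sweep.

```python
# Sketch: AUC-based detection performance for one synset.
import numpy as np
from sklearn.metrics import roc_auc_score

def synset_auc(scores, labels):
    # labels: 1 if the synset was reported for a sample, 0 otherwise.
    if 0 < labels.sum() < len(labels):                  # both classes must be present
        return roc_auc_score(labels, scores)
    return np.nan
```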
[Figure 25 plots: ROC curves for each base synset, grouped by meta-category, for the three subjects. AUC values given in the figure legend: Subject 1: girl 0.724, performer 0.707, male 0.477, implement 0.552, chair 0.548, window 0.537, clothing 0.528, table 0.520, room 0.693, building 0.628, workplace 0.620, way 0.490, shape 0.805, picture 0.658, character 0.644, group 0.486, illustration 0.470, material 0.455, code 0.444. Subject 2: female 0.647, book 0.776, electronic equipment 0.700, region 0.794, street 0.774, mercantile establishment 0.760, building 0.664, point 0.647, dwelling 0.605, commodity 0.596, furniture 0.562, character 0.767, representation 0.448. Subject 3: male 0.688, car 0.624, display 0.506, table 0.475, room 0.600, house 0.545, way 0.539, communication 0.673, vertebrate 0.632, material 0.486.]
Figure 25. ROC analysis for the three subjects. The AUC for each synset is denoted beside the name of each base synset. Asterisks denote synsets with above-chance detection performance (Wilcoxon rank-sum test; uncorrected P < 0.05).
AUC averaged within meta-categories for different visual areas
Using the AUC, we compared the decoding performance for individual synsets
grouped into meta-categories in different visual areas. Overall, the performance
was better in HVC than in LVC, consistent with the pairwise decoding performance (Fig. 26; three subjects pooled; ANOVA, P = 0.003). While V1-V3 did
not show different performances across meta-categories, the higher visual areas
showed a marked dependence on meta-categories (Fig. 26; three subjects pooled).
In particular, FFA showed better performance with human synsets, while PPA
showed better performance with scene synsets (ANOVA [interaction], P = 0.001;
three subjects pooled), consistent with the known response characteristics of these
areas (Epstein and Kanwisher, 1998; Kanwisher et al., 1997). LOC and FFA showed
similar results, presumably because our functional localizers selected partially
overlapping voxels. Because the sample sizes are small in individual subjects,
evaluation with statistical tests for each subject is difficult. However, tendencies
similar to the pooled results are found in individual subjects: HVC tends to outperform LVC (ANOVA, P = 0.029 for Subject 1, P = 0.159 for Subject 2, and P
= 0.139 for Subject 3), and FFA and PPA tend to show better performances with
human and scene, respectively (ANOVA [interaction], P = 0.099 for Subject 1;
P = 0.028 for Subject 2; P = 0.044 for Subject 3).
Figure 26. AUC averaged within meta-categories for different visual areas. The mean AUC for the synsets within each meta-category is plotted for different visual areas (individual subjects and three subjects pooled; error bars, 95% CI). The numbers of available synsets are shown in parentheses.
Time course of synset scores
The output scores for individual synsets showed diverse and dynamic profiles in
each sleep sample (Fig. 27A and fig. 36 for other examples). These profiles may
reflect a dynamic variation of visual contents including those experienced even before the period near awakening. On average, there was a general tendency for the
scores for reported synsets to increase toward the time of awakening (Fig. 27B and
fig. 37 individual subjects). Interestingly, synsets that did not appear in reports
showed greater scores if they had a high co-occurrence relationship with reported
synsets (Fig. 27B; synsets with top 15% conditional probabilities given a reported
synset, calculated from the whole content vectors in each subject). The effect of
co-occurrence is rather independent of that of semantic similarity (Fig. 17) because both factors (high/low co-occurrence and within/across meta-categories)
had highly significant effects on the scores of unreported synsets (three-volume
[9-s] averaged data immediately before awakening; two-way ANOVA, P < 0.001,
three subjects pooled) with moderate interaction (P = 0.016). The scores for
reported synsets were significantly higher than those for unreported synsets even
within the same meta-category (Wilcoxon rank-sum test, P < 0.001). Verbal
reports are unlikely to describe full details of visual experience during sleep, and
it is possible that contents with high general co-occurrence (e.g., street and car)
tend to be experienced together even when all are not reported. Therefore, high
scores for the unreported synsets may indicate unreported but actual visual contents during sleep, and we may be able to detect implicit contents by scrutinizing
the score time course.
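The co-occurrence measure referred to above can be sketched as a conditional probability estimated from the binary content vectors, with the top 15% of synsets taken as the high-co-occurrence set; the array layout and the helper below are assumptions made for illustration.

```python
# Sketch: synsets with high conditional probability given a reported synset,
# estimated from the binary content vectors (awakenings x synsets).
import numpy as np

def high_cooccurrence(content, reported_idx, top_fraction=0.15):
    reported = content[:, reported_idx] > 0
    cond_prob = content[reported].mean(axis=0)          # P(synset | reported synset)
    cond_prob[reported_idx] = -np.inf                   # exclude the reported synset itself
    n_top = max(1, int(round(top_fraction * content.shape[1])))
    return np.argsort(cond_prob)[::-1][:n_top]          # indices of high-co-occurrence synsets
```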
[Figure 27 plots. The example in (A) is from Subject 2, 118th awakening; the verbal report began: "What I was just looking at was some kind of characters..."]
Figure 27. Synset score time course. (A) Example time course of synset scores
for a single dream sample (Subject 2, 118th; color legend as in Fig. 25; reported
synset, character, in bold). (B) Time course of averaged synset scores for reported
synsets (red) and unreported synsets with high/low (blue/gray) co-occurrence
with reported synsets (averaged across awakenings and subjects). Scores are
normalized by the mean magnitude in each subject.
Identification analysis
Finally, to explore the potential of multilabel decoding to distinguish numerous
contents simultaneously, we performed identification analysis (Kay et al., 2008; Miyawaki et al., 2008). The output scores (score vector) were used to identify
the true visual content vector among a variable number of candidates (true vector + random vectors with matched probabilities for each synset) by selecting
the candidate most correlated with the score vector (Fig. 28A; repeated 100
times for each sleep sample to obtain the correct identification rate). The performance exceeded chance level across all set sizes (Fig. 28B; HVC; three subjects
pooled; fig. 38, individual subjects), although the accuracies were not as high
as those achieved using stimulus-induced brain activity in previous studies (Kay et al., 2008; Miyawaki et al., 2008). The same analysis was performed with extended visual content vectors in which unreported synsets having a high co-occurrence with reported synsets (top 15% conditional probability) were assumed
to be present. The results showed that extended visual content vectors were better identified (Fig. 28B and fig. 39), suggesting that multilabel decoding outputs
may represent both reported and unreported contents.
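The identification procedure can be sketched as follows, with hypothetical inputs: the multilabel score vector for one sleep sample, its true binary content vector, and per-synset report probabilities used to draw the random distractor vectors. Candidates whose correlation is undefined are simply skipped here, whereas the analysis above excluded such samples.

```python
# Sketch: identify the true content vector among (set_size - 1) random
# candidates by picking the vector most correlated with the score vector,
# repeated n_repeats times to estimate the correct identification rate.
import numpy as np

def identification_rate(score_vec, true_vec, synset_probs, set_size, n_repeats=100, seed=0):
    rng = np.random.default_rng(seed)
    correct = 0
    for _ in range(n_repeats):
        distractors = (rng.random((set_size - 1, len(true_vec))) < synset_probs).astype(float)
        candidates = np.vstack([true_vec, distractors])
        corrs = [np.corrcoef(score_vec, c)[0, 1] for c in candidates]
        correct += int(np.nanargmax(corrs) == 0)        # true vector is candidate 0
    return correct / n_repeats
```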
Figure 28. Identification analysis. (A) Schematic view of the identification analysis. The correlation coefficients between the score vector from multilabel decoding
and each of the candidate vectors consisting of the true visual content vector and
a variable number of binary vectors generated randomly with a matched probability were calculated. The vector with the highest correlation coefficient was chosen
as the one representing the visual contents. (B) Identification performance as a
function of the candidate set size. Accuracies are plotted against candidate set
size for original and extended visual content vectors (averaged across awakenings
and subjects). Because Pearson’s correlation coefficient could not be calculated
for vectors with identical elements, such samples were excluded. The shades
indicate 95% CI, and dashed lines denote chance level.
4. Discussion
We have shown that visual dream contents experienced during sleep-onset periods can be read out from fMRI signals of the human visual cortex. Our decoding analyses revealed that accurate classification, detection, and identification of dream contents could be achieved with the higher visual cortex. This is the first demonstration of the neural basis of specific dream contents, and our results can be interpreted as evidence against the theory that dreams are made up only at the moment of awakening (Windt, 2013).
Commonality of neural representation between perception and dreaming
Because our decoders were trained using the fMRI responses induced by natural image viewing, the accurate performance of the pairwise and multilabel decoding analyses suggests that specific dream contents are represented in activity patterns that are shared with stimulus perception. Such representational commonality was also supported by the representational similarity analysis, though the results reached only marginal significance depending on the subject (Fig. 15). The common representational properties between perception and dreaming were also observed in additional analyses, including the higher performance for synset pairs across meta-categories than for those within meta-categories (Fig. 17), the gradual increase in decoding accuracy along the visual hierarchy (Fig. 19), and the marked dependence of the synset detection performance (quantified by AUCs) on meta-categories in the higher visual areas (Fig. 26). These results suggest that the principle of perceptual equivalence (Finke, 1989), which postulates a common neural substrate for perception and imagery, generalizes to spontaneously generated visual experience during sleep. Although we have demonstrated semantic decoding with the higher visual cortex, this does not rule out the possibility of decoding low-level features with the lower visual cortex. Furthermore, while we have focused on the semantic aspects of dreaming (nouns describing objects or scenes), dreaming consists of multiple aspects, such as actions (described by verbs) and emotions (described by adjectives). Further analyses focusing on those aspects will help to reveal the whole picture of dreaming in comparison with our waking experiences.
Decoding from multivoxel patterns
Our approach extends previous research on the (re)activation of the brain during sleep (Braun et al., 1998; Maquet, 2000; Yotsumoto et al., 2009; Wilson and McNaughton, 1994) and on the relationship between dreaming and brain activity (Dresler et al., 2011; Marzano et al., 2011; Miyauchi et al., 2009) by discovering links between complex brain activity patterns and unstructured verbal reports using database-assisted machine learning decoders. A major difference from the previous studies is the use of multivoxel patterns to analyze brain activity during sleep, whose contribution over the average activity level for reading out dream contents was demonstrated by our analysis (Fig. 16). Multivoxel pattern analysis requires a large number of samples to obtain stable results, which has been a major obstacle to studying dreaming with this technique. However, the multiple-awakening procedure and the active use of a lexical database to extract representative dream contents resolved this difficulty. We expect the present study to serve as a demonstration of a new, objective approach to studying dreaming.
Sleep-onset dreaming and REM dreaming
While we focused on dreaming during sleep-onset periods, analysis of dreaming during REM periods would be necessary to understand the nature of dreaming irrespective of sleep state. The similarity between REM and sleep-onset reports (Foulkes and Vogel, 1965; Vogel et al., 1972; Oudiette et al., 2012) and the visual cortical activation during REM sleep (Braun et al., 1998; Maquet, 2000; Miyauchi et al., 2009) suggest that the same approach could also be used to decode REM dreaming. Moreover, as our analysis did not detect any difference in accuracy across sleep states (Fig. 18), dream contents might be represented in the same manner across different sleep states. Our method may further work beyond the bounds of sleep stages to uncover the dynamics of spontaneous brain activity in association with stimulus representation. The decoding presented here is retrospective in nature: decoders were constructed after the sleep experiments on the basis of the collected reports. However, because the reported synsets largely overlap between the first and last halves of the experiments (59/60 base synsets appeared in both), the same decoders may apply to future sleep data. An interesting direction for training dream decoders is to use a large dream-report database (DreamBank; http://www.dreambank.net) that collects reports of dreaming from various people. Such a database could be used to effectively select representative dream contents for training decoders.
Is dreaming more similar to perception or visual imagery?
Our analyses demonstrate representational commonality between perception and dreaming in the higher visual areas at the timing of the visual experience. However, the commonalities in other areas and at other timings, such as the timing of dream generation, have not yet been thoroughly explored. Our analysis showed high stimulus-to-dream decoding accuracy in the parietal areas as well as the higher visual areas during the periods around awakening (Figs. 22-24), suggesting representational commonality in non-visual areas. Since our technique can detect the commonality between any type of visual experience and dreaming, decoders trained using fMRI responses induced by visual imagery during wakefulness could be used to reveal the commonality between visual imagery and dreaming. Thus, comparing the accuracy time courses of dream decoding across whole cortical areas using percept- or imagery-trained decoders could provide valuable insights into how dream contents are generated and represented in various brain areas. Furthermore, because dreaming and visual imagery both occur without visual stimuli, imagery-trained decoders might provide higher accuracy than percept-trained decoders in higher areas.
Decoding spontaneous brain activity
In contrast to previous studies demonstrating the decoding of stimulus- or task-induced brain activity (Cox and Savoy, 2003; Kamitani and Tong, 2005, 2006; Miyawaki et al., 2008; Stokes et al., 2009; Reddy et al., 2010; Harrison and Tong, 2009; Albers et al., 2013), the present study demonstrated decoding of spontaneous brain activity during sleep. As the results of the multilabel decoding analysis suggested (Figs. 27B, 28), one interesting possibility is that our approach can read out not only what subjects remember of their dreams but also what they forget or fail to report. Our method may thus detect contents implicitly represented in spontaneous brain activity. Several studies have reported representational overlaps between stimulus- or task-induced patterns and spontaneously emerging patterns during both wakefulness and sleep (Tsodyks et al., 1999; Kenet et al., 2003; Han et al., 2008; Luczak et al., 2009; Yotsumoto et al., 2009), but the functions of spontaneous brain activity are not yet clear. Our approach can discover the contents represented in spontaneous brain activity and will be a powerful tool for linking spontaneous activity patterns with our behavioral, physiological, and cognitive states. We expect that this study will lead to a better understanding of the functions not only of dreaming but also of other spontaneous neural events.
References
[1] H. W. Agnew, Jr, W. B. Webb, R. L. Williams, The first night effect: an
EEG study of sleep. Psychophysiology 2, 263-266 (1966)
[2] E. Aserinsky, N. Kleitman, Regularly occurring periods of eye motility, and
concomitant phenomena during sleep. Science 118, 273-274 (1953)
[3] G. W. Baylor, C. Cavallero, Memory Sources Associated with REM and
NREM Dream Reports Throughout the Night: A New Look at the Data.
Sleep 24, 165-170 (2001)
[4] G. Bonmassar et al., Motion and ballistocardiogram artifact removal for
interleaved recording of EEG and EPs during MRI. Neuroimage 16, 1127-1141 (2002)
[5] A. R. Braun, T. J. Balkin, N. J. Wesenten, R. E. Carson, M. Varga, P.
Baldwin, S. Selbie, G. Belenky, P. Herscovitch, Regional cerebral blood flow
throughout the sleep-wake cycle. An H2(15)O PET study. Brain 120, 1173-1179 (1997)
[6] A. R. Braun, T. J. Balkin, N. J. Wesensten, F. Gwadry, R. E. Carson, M.
Varga, P. Baldwin, G. Belenky, P. Herscovitch, Dissociated pattern of activity in visual cortices and their projections during human rapid eye movement
sleep. Science 279, 91-95 (1998)
[7] C. Cavallero, P. Cicogna, V. Natale, M. Occhionero, A. Zito, Slow wave sleep
dreaming. Sleep 15, 562-566 (1992)
[8] C. C. Chang, C. J. Lin, LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2 (2011). Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
[9] W. Dement, N. Kleitman, The relation of eye movements during sleep to
dream activity: An objective method for the study of dreaming. J. Exp.
Psychol. 53, 339-346 (1957)
[10] W. Dement, E. A. Wolpert. The relation of eye movements, body motility,
and external stimuli to dream content. J. Exp. Psychol. 55, 543-553 (1958)
[11] J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, L. Fei-Fei, ImageNet: A large-scale hierarchical image database. IEEE CVPR (2009)
[12] M. Dresler et al., Dreamed movement elicits activation in the sensorimotor
cortex. Curr. Biol. 21, 1833-1837 (2011)
[13] S. A. Engel et al., fMRI of human visual cortex. Nature 369, 525 (1994)
[14] R. Epstein, N. Kanwisher, A cortical representation of the local visual environment. Nature 392, 598-601 (1998)
[15] C. Fellbaum, Ed., WordNet: An Electronic Lexical Database. (MIT Press,
Cambridge, MA, 1998)
[16] R. A. Finke, Principles of Mental Imagery. (MIT Press, Cambridge, MA,
1989)
[17] D. Foulkes, G. Vogel, Mental activity at sleep onset. J. Abnorm. Psychol.
70, 231-243 (1965)
[18] W. D. Foulkes, Dream reports from different stages of sleep. J. Abnorm. Soc.
Psychol. 65, 14-25 (1962)
[19] A. Germain, T. A. Nielsen, EEG Power Associated with Early Sleep Onset
Images Differing in Sensory Content. Sleep Research Online 4, 83-90 (2001)
[20] J. J. Gugger, M. L. Wagner, Rapid eye movement sleep behaviour disorder.
Ann. Pharmacother. 41, 1833-1841 (2007)
[21] F. Han, N. Caporale, Y. Dan, Reverberation of recent visual experience in
spontaneous cortical waves. Neuron 60, 321-327 (2008)
[22] S. A. Harrison, F. Tong, Decoding reveals the contents of visual working
memory in early visual areas. Nature 458, 632-635 (2009)
[23] J. V. Haxby et al., Distributed and overlapping representations of faces and
objects in ventral temporal cortex. Science 293, 2425-2430 (2001)
[24] J. A. Hobson, REM sleep and dreaming: towards a theory of protoconsciousness. Nature Rev. Neurosci. 10, 803-813 (2009)
[25] J. A. Hobson, R. Stickgold, Dreaming: A Neurocognitive Approach. Conscious Cogn. 3, 1-15 (1994)
[26] T. Hori, M. Hayashi, T. Morikawa, in Sleep Onset: Normal and Abnormal
Processes. R. D. Ogilvie, J. R. Harsh, Eds. (American Psychological Association, Washington, 1994), pp. 237-253.
[27] C. C. Hong, J. C. Harris, G. D. Pearlson, J. S. Kim, V. D. Calhoun, J. H.
Fallon, X. Golay, J. S. Gillen, D. J. Simmonds, P. C. van Zijl, D. S. Xee, J.
J. Pekar, fMRI evidence for multisensory recruitment associated with rapid
eye movements during sleep. Hum. Brain Mapp. 30, 1705-1722 (2009)
[28] A. G. Huth, S. Nishimoto, A. T. Vu, J. L. Gallant, A continuous semantic
space describes the representation of thousands of object and action categories across the human brain. Neuron 76, 1210-1240 (2012)
[29] N. Kajimura, M. Uchiyama, Y. Takayama, S. Uchida, T. Uema, M. Kato,
M. Sekimoto, T. Watanabe, T. Nakajima, S. Horikoshi, K. Ogawa, M.
Nishikawa, M. Hiroki, Y. Kudo, H. Matsuda, M. Okawa, K. Takahashi, Activity of midbrain reticular formation and neocortex during the progression
of human non-rapid eye movement sleep. J. Neurosci. 19, 10065-10073 (1999)
[30] Y. Kamitani, F. Tong, Decoding the visual and subjective contents of the
human brain. Nature Neurosci. 8, 679-685 (2005)
[31] Y. Kamitani, F. Tong, Decoding seen and attended motion directions from
activity in the human visual cortex. Curr. Biol. 16, 1096-1102 (2006)
[32] N. Kanwisher, J. McDermott, M. M. Chun, The fusiform face area: a module
in human extrastriate cortex specialized for face perception. J. Neurosci. 17,
4302-4311 (1997)
[33] K. N. Kay, T. Naselaris, R. J. Prenger, J. L. Gallant, Identifying natural
images from human brain activity. Nature 452, 352-355 (2008)
[34] T. Kenet, D. Bibitchkov, M. Tsodyks, A. Grinvald, A. Arieli, Spontaneously
emerging cortical representations of visual attributes. Nature 425, 954-956
(2003)
[35] E. Kobatake, K. Tanaka, Neuronal selectivities to complex object features
in the ventral visual pathway of macaque cerebral cortex. J. Neurophysiol.
71, 856-867 (1994)
[36] Z. Kourtzi, N. Kanwisher, Cortical regions involved in perceiving object
shape. J. Neurosci. 20, 3310-3318 (2000)
[37] N. Kriegeskorte, M. Mur, D. A. Ruff, R. Kiani, J. Bodurka, H. Esteky, K.
Tanaka, P. A. Bandettini, Matching categorical object representations in
inferior temporal cortex of man and monkey. Neuron 60, 1126-1141 (2008)
[38] A. Luczak, P. Bartho, K. D. Harris, Spontaneous events outline the realm
of possible sensory responses in neocortical populations. Neuron 62, 413-425
(2009)
[39] C. Marzano et al., Recalling and forgetting dreams: theta and alpha oscillations during sleep predict subsequent dream recall. J. Neurosci. 31, 6674-6683
(2011)
[40] P. Maquet, J. Peters, J. Aerts, G. Delfiore, C. Degueldre, A. Luxen, G.
Franck, Functional neuroanatomy of human rapid-eye-movement sleep and
dreaming. Nature 383, 163-166 (1996)
[41] P. Maquet, Functional neuroimaging of normal human sleep by positron
emission tomography. J. Sleep Res. 9, 207-231 (2000)
[42] M. Minear, D. C. Park, A lifespan database of adult facial stimuli. Behavior
Research Methods, Instruments, and Computers. 36, 630-633 (2004)
[43] S. Miyauchi, M. Misaki, S. Kan, T. Fukunaga, T. Koike, Human brain activity time-locked to rapid eye movements during REM sleep. Exp. Brain Res.
192, 657-667 (2009)
[44] Y. Miyawaki et al., Visual image reconstruction from human brain activity
using a combination of multiscale local image decoders. Neuron 60, 915-929
(2008)
[45] T. H. Monk, D. J. Buysse, C. F. Reynolds, 3rd, D. J. Kupfer, Circadian
determinants of the postlunch dip in performance. Chronobiol. Int. 13, 123-133 (1996)
[46] Y. Nir, G. Tononi, Dreaming and the brain: from phenomenology to neurophysiology. Trends Cogn. Sci. 14, 88-100 (2010)
[47] R. D. Ogilvie, R. T. Wilkinson, The detection of sleep onset: behavioral and
physiological convergence. Psychophysiology 21, 510-520 (1989)
[48] D. Oudiette et al., Dreaming without REM sleep. Conscious Cogn. 21,
1129-1140 (2012)
[49] L. Palagini, A. Gemignani, I. Feinberg, M. Guazzelli, I. G. Campbell, Mental activity after early afternoon nap awakenings in healthy subjects. Brain
Research Bulletin 63, 361-368 (2004)
[50] S. M. Polyn, V. S. Natu, J. D. Cohen, K. A. Norman, Category-specific
cortical activity precedes retrieval during memory search. Science 310, 1963-1966 (2005)
[51] A. Rechtschaffen, A. Kales, A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects. (U.S. Dept.
of Health, Education, and Welfare, Public Health Services-National Institutes of Health, National Institute of Neurological Diseases and Blindness,
Neurological Information Network, Bethesda, Md., 1968)
[52] M. I. Sereno et al., Borders of multiple visual areas in humans revealed by
functional magnetic resonance imaging. Science 268, 889-893 (1995).
[53] F. Sharbrough et al., American electroencephalographic society guidelines
for standard electrode position nomenclature. J. Clin. Neurophysiol. 8, 200-202 (1991)
[54] M. Solms, Neuropsychology of dreams. Mahwah, NJ: Erlbaum (1997)
[55] R. Stickgold, A. Malia, D. Maguire, D. Roddenberry, M. O’Connor, Replaying the game: hypnagogic images in normals and amnesiacs. Science 290,
350-353 (2000)
[56] M. Stokes, R. Thompson, R. Cusack, J. Duncan, Top-down activation of
shape-specific population codes in visual cortex during mental imagery. J.
Neurosci. 29, 1565-1572 (2009)
[57] T. Takeuchi, A. Miyashita, M. Inugami, Y. Yamamoto, Intrinsic dreams are
not produced without REM sleep mechanisms: evidence through elicitation
of sleep onset REM periods. J. Sleep. Res. 10, 43-52 (2001)
[58] M. Tamaki, H. Nittono, M. Hayashi, T. Hori, Examination of the first-night
effect during the sleep-onset period. Sleep 28, 195-202 (2005)
[59] M. Tsodyks, T. Kenet, A. Grinvald, A. Arieli, Linking spontaneous activity
of single cortical neurons and the underlying functional architecture. Science
286 1943-1946 (1999)
[60] V. N. Vapnik, Statistical Learning Theory. (Wiley, New York, 1998)
[61] G. W. Vogel, B. Barrowclough, D. D. Giesler, Limited discriminability of
REM and sleep onset reports and its psychiatric implications. Arch. Gen.
Psychiatry 26, 449-455 (1972)
[62] E. J. Wamsley, K. Perry, I. Djonlagic, L. B. Reaven, R. Stickgold, Cognitive replay of visuomotor learning at sleep onset: temporal dynamics and
relationship to task performance. Sleep 33, 59-68 (2010)
[63] M. A. Wilson, B. L. McNaughton, Reactivation of hippocampal ensemble
memories during sleep. Science 265, 676-679 (1994)
[64] J.M. Windt, Reporting dream experience: Why (not) to be skeptical about
dream reports. Front. Hum. Neurosci. 7, 708 (2013)
[65] J. Xiao, J. Hays, K. Ehinger, A. Oliva, A. Torralba, SUN Database: Large-scale Scene Recognition from Abbey to Zoo. IEEE CVPR (2010)
[66] Y. Yotsumoto et al., Location-specific cortical activation changes during
sleep after training for perceptual learning. Curr. Biol. 19, 1278-1282 (2009)
Acknowledgements
First of all, I would like to express my deepest gratitude to Professor Yukiyasu Kamitani, head of the Department of Neuroinformatics at the ATR Computational Neuroscience Laboratories. From the time I began receiving his supervision as a master's student up to the present, he has given me a great deal of guidance, from the attitude required of a professional researcher to advice on experiments and analyses, for which I am sincerely thankful. Completing my doctoral course under Professor Kamitani will be the greatest asset in my future life as a researcher. I am also deeply grateful to Professor Mitsuo Kawato, not only for his guidance on my research but also for giving me the opportunity to work in such an excellent research environment. I thank Professor Kazushi Ikeda and Associate Professor Tomohiro Shibata of the Mathematical Informatics Laboratory for their valuable advice and comments on my seminar presentations; as a student belonging to an affiliated laboratory, their support was a great help to me. I thank Professor Yuji Matsumoto for his objective comments from a standpoint outside neuroscience and for serving as an examiner of this thesis. I also sincerely thank Dr. Masako Tamaki, now at Brown University, and Associate Professor Yoichi Miyawaki, now at the University of Electro-Communications, for their enormous cooperation in carrying out this project. The acquisition of the sleep data used in this thesis would not have been possible without Dr. Tamaki, and I am deeply grateful for her immense contribution to this research. The many discussions with Associate Professor Miyawaki about analysis methods were an invaluable experience through which I acquired a technical foundation useful not only for this project but also for my future research. I am also grateful to Yoshiyuki Onuki (大貫良幸), Yusuke Fujiwara (藤原祐介), Taylor Beck, and Takatomi Kubo (久保孝富), who were involved in this research project and contributed to its progress in many ways; the results of this thesis owe much to their efforts and trial and error. I thank Natsuko Nakano (中野奈月子) and Yuka Oshima (大島由香) for their help with scheduling the experiments. I thank the members of the Brain Activity Imaging Center for their cooperation with the experiments. Finally, I sincerely thank the members of the Department of Neuroinformatics at the ATR Computational Neuroscience Laboratories for their many helpful comments on both the content of this research and the organization of this thesis.
Appendix
A. Supplementary results
Figure 29. Distribution of pairwise decoding accuracies. The distribution of
the stimulus-to-dream pairwise decoding accuracies for the higher visual cortex
(HVC) is shown together with that from label-shuffled decoders. The format is
the same as in Fig. 13. The mean decoding accuracies for all pairs were 56.0%
(95% CI [54.7, 57.3]) for Subject 1, 66.9% (CI [65.1, 66.8]) for Subject 2, and
59.8% (CI [57.9, 61.7]) for Subject 3, and those for selected pairs were 65.3% (CI
[62.4, 68.2]) for Subject 1, 75.4% (CI [73.3, 77.4]) for Subject 2, and 66.1% (CI
[61.2, 71.0]) for Subject 3. The fraction of pairs that showed significantly better
decoding performance than chance level (one-tailed binomial test, uncorrected
P < 0.05) was 22.4% (45/201) for Subject 1, 57.6% (68/118) for Subject 2, and
27.9% (24/86) for Subject 3. The performance was significantly better than that
from label-shuffled decoders for all subjects for both of all and selected pairs
(Wilcoxon rank-sum test, P < 0.001).
Figure 30. Stimulus-to-stimulus pairwise decoding. The cross-validation analysis of the stimulus-induced dataset (see Methods "2.14. Synset pair selection by within-dataset cross-validation") yielded the distribution of accuracies for
stimulus-to-stimulus pairwise decoding (Subject 1-3, HVC; shown together with
the distribution from label-shuffled decoders). The mean decoding accuracies
were 85.6% (95% CI [84.1, 87.0]) for Subject 1, 87.0% (CI [85.4, 88.7]) for Subject 2, and 81.4% (CI [79.4, 83.4]) for Subject 3. The fraction of pairs that showed
significantly better decoding performance than chance level (one-tailed binomial
test, uncorrected P < 0.05) was 94.5% (190/201) for Subject 1, 99.8% (117/118)
for Subject 2, and 95.3% (82/86) for Subject 3. The performance was significantly
better than that from label-shuffled decoders for all subjects (Wilcoxon rank-sum
test, P < 0.001 for all three subjects).
Figure 31. Dream-to-dream pairwise decoding. The cross-validation analysis of the sleep dataset (see Methods "2.14. Synset pair selection by within-dataset cross-validation") yielded the distribution of accuracies for dream-to-dream pairwise decoding (Subjects 1-3, HVC; shown with the distribution from label-shuffled decoders). The
mean decoding accuracies were 52.9% (95% CI [51.2, 54.7]) for Subject 1, 60.1% (CI
[58.1, 62.1]) for Subject 2, and 51.3% (CI [49.1, 53.5]) for Subject 3. The fraction of
pairs that showed significantly better decoding performance than chance level (one-tailed binomial test, uncorrected P < 0.05) was 20.4% (41/201) for Subject 1, 40.7%
(48/118) for Subject 2, and 12.8% (11/86) for Subject 3. The results showed that the
performance was significantly better than that from label-shuffled decoders for two of
three subjects (Wilcoxon rank-sum test, P = 0.002 for Subject 1; P < 0.001 for Subject
2; P = 0.302 for Subject 3). Note that although the dream-to-dream decoding shows
marginally significant accuracies (depending on subjects), it is not as accurate as the
stimulus-to-dream decoding. This is presumably because training samples from the
sleep dataset were fewer and noisier than those from the stimulus-induced dataset, and
thus decoders were not well trained with the sleep dataset.
Figure 32. Decoding with averaged vs. multivoxel activity for individual subjects. (A) Histograms of pairwise decoding accuracy with averaged and multivoxel
activity. (B) Mean pairwise decoding accuracies with averaged and multivoxel
activity. The voxel values in each data sample from HVC were averaged (before
the z-transformation in each sample), and the averaged activity was used as the
input to the pairwise decoders (error bar, 95% CI; averaged across all pairs).
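A minimal sketch of how the two kinds of decoder inputs compared in Fig. 32 could be constructed, assuming (for illustration only) that the per-sample z-transformation is a z-score across voxels within each sample; the array name samples is hypothetical.

import numpy as np

def averaged_and_multivoxel_inputs(samples):
    """Build the averaged and multivoxel inputs compared in Fig. 32."""
    samples = np.asarray(samples, dtype=float)          # [n_samples, n_voxels]
    # Averaged input: mean across voxels, taken before any per-sample z-scoring,
    # so each sample is reduced to a single value (no spatial pattern left).
    averaged = samples.mean(axis=1, keepdims=True)
    # Multivoxel input: z-transform each sample across voxels, keeping the pattern.
    mu = samples.mean(axis=1, keepdims=True)
    sd = samples.std(axis=1, keepdims=True)
    multivoxel = (samples - mu) / sd
    return averaged, multivoxel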
[Figure 33 panels: mean decoding accuracy (%) by sleep state (Awake, Stage 1, Stage 2, and Stage 1,2 combined) for Subjects 1-3 (HVC), with sample counts in parentheses.]
Figure 33. Mean accuracies for the samples from each sleep state for individual
subjects. The decoding accuracy was separately evaluated for the samples from
each sleep state judged from the last 15-s epoch before awakening (Fig. 12). The
number of samples is denoted in parentheses.
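The per-state evaluation amounts to grouping awakenings by the sleep-stage label of the last 15-s epoch and averaging the decoding outcomes within each group; a minimal sketch with hypothetical variable names follows.

import numpy as np

def accuracy_by_sleep_state(correct, state):
    """Mean accuracy and sample count per sleep state (e.g., awake, stage 1, stage 2)."""
    correct = np.asarray(correct, dtype=float)   # 1 = correct, 0 = incorrect per sample
    state = np.asarray(state)                    # sleep-state label per sample
    summary = {}
    for s in np.unique(state):
        mask = state == s
        summary[s] = (float(correct[mask].mean()), int(mask.sum()))
    return summary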
[Figure 34 panels: pairwise decoding accuracy (%) for all and selected pairs across visual areas (V1, V2, V3, LOC, FFA, PPA, LVC, HVC), Subjects 1-3.]
Figure 34. Pairwise decoding accuracies across visual cortical areas. The decoding
accuracies with different visual areas are shown for individual subjects (error
bars, 95% CI). Selected pairs were determined on the basis of the cross-validation
analysis in each area (the numbers of selected pairs for V1, V2, V3, LOC, FFA,
PPA, LVC, and HVC: 24, 22, 29, 38, 36, 22, 27, and 39 pairs for Subject 1; 15,
23, 21, 24, 7, 47, 20, and 47 pairs for Subject 2; 6, 5, 5, 8, 5, 9, 8, and 11 pairs
for Subject 3, respectively).
[Figure 35 panels: pairwise decoding accuracy (%) vs. time to awakening (s) for HVC and LVC, all and selected pairs, Subjects 1-3.]
Figure 35. Time course of pairwise decoding accuracy. The time course of pairwise decoding accuracy is shown for individual subjects (shades, 95% CI; averaged
across all or selected pairs). Averages of three fMRI volumes (9 s; HVC or LVC)
around each time point were used as inputs to the decoders. The performance is
plotted at the center of the window. The gray region indicates the time window
used for the main analyses (the arrow denotes the performance obtained from the
time window). No corrections for hemodynamic delay were conducted. Note that
fMRI signals after awakening may be contaminated with movement artifacts and
brain activity associated with mental states during verbal reports. Mental states
during verbal reports are unlikely to explain the high accuracy immediately after
awakening, because the accuracy profile does not match the mean duration of
verbal reports (34 ± 19 s, mean ± SD; three subjects pooled).
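The sliding-window inputs can be sketched as follows, assuming a 3-s volume interval so that three consecutive volumes span 9 s; the variable names are hypothetical and this is not the code used in this work.

import numpy as np

def sliding_window_inputs(volumes, window=3):
    """Average `window` consecutive volumes (9 s at a 3-s interval) around each time point."""
    volumes = np.asarray(volumes, dtype=float)   # [n_volumes, n_voxels] for one awakening
    half = window // 2
    inputs = []
    for t in range(half, len(volumes) - half):
        inputs.append(volumes[t - half:t + half + 1].mean(axis=0))
    return np.stack(inputs)                      # one averaged pattern per time point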
[Figure 36 panels: synset score vs. time to awakening (s) for Subject 1, 202nd awakening (male); Subject 2, 144th awakening (female, male); and Subject 3, 34th awakening (display).]
Figure 36. Examples of the time courses of synset scores. The time courses
of synset scores from multilabel decoding analyses are shown for four individual
dream examples. The plots represent the scores for the reported synset(s) (bold
line with synset name) and the unreported synsets using the colors in the legend
for Fig. 26.
[Figure 37 panels: averaged synset score vs. time to awakening (s) for reported and unreported (high/low co-occurrence) synsets, Subjects 1-3.]
Figure 37. Time courses of averaged synset scores for each subject. Synset scores
were averaged across awakenings for reported (red) and unreported synsets with
high (blue) and low (gray) co-occurrence in individual subjects (shades, 95% CI).
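A minimal sketch of the averaging behind Fig. 37 (hypothetical array names; the further split of unreported synsets into high- and low-co-occurrence groups is omitted here for brevity).

import numpy as np

def mean_score_time_courses(scores, reported):
    """Average synset score time courses over reported and unreported synsets."""
    scores = np.asarray(scores, dtype=float)     # [n_awakenings, n_synsets, n_timepoints]
    reported = np.asarray(reported, dtype=bool)  # True where a synset was reported
    reported_mean = scores[reported].mean(axis=0)
    unreported_mean = scores[~reported].mean(axis=0)
    return reported_mean, unreported_mean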
[Figure 38 panels: identification accuracy (%) vs. candidate set size (2, 4, 8, 16, 32) for original and extended visual content vectors, Subjects 1-3.]
Figure 38. Identification performance for individual subjects. The identification performance with original and extended visual content vectors is shown for
individual subjects (shades, 95% CI; dashed line, chance level).
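Identification as a function of candidate set size is typically computed by comparing the decoded output with the true visual content vector and randomly drawn distractor vectors, scoring a hit when the true vector is the best match (chance level = 100%/set size). A minimal sketch under that reading, with correlation assumed as the similarity measure and hypothetical variable names, follows.

import numpy as np

def identification_accuracy(decoded, true_vec, distractors, set_size, n_repeats=100, seed=0):
    """Fraction of repeats in which the true content vector is the best match."""
    rng = np.random.default_rng(seed)
    distractors = np.asarray(distractors)        # [n_distractors, n_synsets]
    hits = 0
    for _ in range(n_repeats):
        idx = rng.choice(len(distractors), size=set_size - 1, replace=False)
        candidates = np.vstack([true_vec, distractors[idx]])
        sims = [np.corrcoef(decoded, c)[0, 1] for c in candidates]
        hits += int(np.argmax(sims) == 0)        # index 0 is the true vector
    return hits / n_repeats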