

We use a joint EEG-fMRI dataset originally recorded with the aim of investigating the effects of music on the emotions of the listener. The complete details of the experiments are described elsewhere in Ref.59 and the data is publicly available60,61. We briefly summarise the key details of the dataset below.


Participants were asked to listen to two different sets of music. The first set comprised a collection of generated pieces of piano music, which were generated to target specific affective states and were pre-calibrated to ensure they would induce the targeted affects in their listeners. The second set of music was a set of pre-existing classical piano music pieces, which were chosen for their ability to induce a range of different affects.

In this study we only make use of the neural data recorded during the generated music listening task. As detailed below, participants listened to a series of pieces of music over different types of trials. In some trials participants were asked to continuously report their emotions, while in others they were asked to simply listen to the music.


A total of 21 healthy adults participated in the study. All participants were aged between 20 and 30 years old and were right-handed with normal or corrected-to-normal vision and normal hearing. All participants were screened to ensure they could safely take part in a joint EEG-fMRI study. Ten of the participants were female. All participants received £20.00 (GBP) for their participation.


Ethical approval was granted for the study by the University of Reading research ethics committee, where the study was conducted. All experimental protocols and methods were performed in accordance with relevant ethical guidelines. Informed consent was obtained from all participants.


Functional magnetic resonance imaging (fMRI) was recorded using a 3 Tesla Siemens Magnetom Trio scanner with Syngo software (version MR B17) and a 37-channel head coil. The scanning sequence comprised a gradient echo planar localizer sequence followed by an anatomical scan (field of view: 256 × 256 × 176 voxels, TR = 2020 ms, TE = 2.9 ms, voxel dimensions = 0.9766 × 0.9766 × 1 mm, flip angle = 9°). This was then followed by a set of gradient echo planar functional sequences (TR = 2000 ms, echo time = 30 ms, field of view = 64 × 64 × 37 voxels, voxel dimensions = 3 × 3 × 3.75 mm, flip angle = 90°). The final sequence, applied after the music listening part of the experiment was completed, was another gradient echo planar sequence.


EEG was recorded via an MRI-compatible BrainAmp MR and BrainCap MR EEG system (BrainProducts Inc., Germany). EEG was recorded from 32 channels (31 channels for EEG and 1 channel for electrocardiogram) at a sample rate of 5000 Hz without filtering and with an amplitude resolution of 0.5 μV. A reference channel was positioned at location FCz on the international 10/20 system for electrode placement and impedances on all channels were kept below 15 kΩ throughout the experiment.

Co-registration of the timing of the EEG and fMRI recordings was achieved by a combination of the BrainVision recording software (BrainProducts, Germany), which recorded trigger signals from the MRI scanner, and custom-written stimulus presentation software written in Matlab (Mathworks, USA) with Psychtoolbox62.


The music played to the participants was generated with the intention of inducing a range of different affective states. In total 36 different musical pieces were generated to target 9 different affective states (combinations of high, neutral, and low valence and arousal).

Each piece of music was 40 s long and was generated by an affectively driven algorithmic composition system based on an artificial neural network63 that had been previously validated on an independent pool of participants64. The resulting music was a piece of monophonic piano music as played by a single performer.


The experiment was divided into a series of individual tasks for the participants to complete. These tasks fell into three different types:


Music only trials In these trials participants were asked to simply listen to a piece of music.


Music reporting trials In these trials participants were asked to listen to a piece of music and, as they listened, to continuously report their currently felt emotions on the valence-arousal circumplex65 via the FEELTRACE interface66.


Reporting only trials These trials were used to control for effects of motor control of the FEELTRACE interface. Participants were shown, on screen, a recording of a previous report they had made with FEELTRACE and were asked to reproduce their recorded movements as accurately as they could. No music was played during these trials.

Within each trial participants were first presented with a fixation cross, which was shown on screen for 1–3 s with a random, uniformly drawn, duration. The task then took 40 s to complete and was followed by a short 0.5 s break.

All sound was presented to participants via MRI-compatible headphones (NordicNeuroLab, Norway). Participants also wore ear-plugs to protect their hearing and the volume levels of the music were adjusted to a comfortable level for each participant before the start of the experiment.

The trials were presented in a pseudo-random order and were split over 3 runs, each of which was approximately 10 min long. A 1-min break was given between each pair of runs and each run contained 12 trials in total.


Both the EEG and the fMRI signals were pre-processed to remove artefacts and allow for further analysis.


The fMRI data was pre-processed using SPM12 software67 running in Matlab 2018a.

Slice time correction was applied first, using the first slice of each run as the reference image. This was followed by removal of movement-related artefacts from the images via a process of realignment and unwarping using the method originally proposed by Friston et al.67. The field maps recorded during the scan sequences were used to correct for image warping effects and remove movement artefacts. A 4 mm separation was used with a Gaussian smoothing kernel of 5 mm. A 2nd degree spline interpolation was then used for realignment and a 4th degree spline interpolation for unwarping the images.

We then co-registered the functional scans against the high-resolution anatomical scan for each participant before normalising the functional scans to the high-resolution anatomical scan.

Finally, the functional scans were smoothed with a 7 mm Gaussian smoothing kernel and a 4th degree spline interpolation function.
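As an illustration of this smoothing step, the following minimal Python sketch (our own, not the SPM implementation; the function name and use of scipy are assumptions) shows the standard conversion from a FWHM specified in mm to a per-axis Gaussian sigma in voxel units:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_volume(vol, fwhm_mm, voxel_size_mm):
    """Smooth a 3D volume with a Gaussian kernel specified by its FWHM in mm.

    SPM-style smoothing: the FWHM is converted to the standard deviation
    (sigma) of the Gaussian, expressed in voxel units for each axis.
    """
    fwhm_to_sigma = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # ≈ 0.4247
    sigmas = [fwhm_mm * fwhm_to_sigma / vs for vs in voxel_size_mm]
    return gaussian_filter(vol, sigma=sigmas)

# Example: smooth one functional volume (3 × 3 × 3.75 mm voxels) with a 7 mm kernel
vol = np.random.default_rng(0).standard_normal((64, 64, 37))
smoothed = smooth_volume(vol, fwhm_mm=7.0, voxel_size_mm=(3.0, 3.0, 3.75))
```

Note that the anisotropic voxel size means the kernel is narrower, in voxel units, along the slice axis.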


The fMRI scanning process induces considerable artefacts in the EEG. To remove these scanner-induced artefacts the Average Artefact Subtraction (AAS) algorithm was used68, as implemented in the Vision Analyzer software (BrainProducts, Germany). The cleaned EEG was then visually checked to confirm that all the scanner artefacts had been removed.
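The core idea of AAS can be illustrated with a short sketch. This is not the Vision Analyzer implementation; the function below is a hypothetical, simplified version (our own naming) that builds a sliding-window average of the gradient-artefact epochs locked to the scanner volume triggers and subtracts that local template from each epoch:

```python
import numpy as np

def average_artefact_subtraction(eeg, trigger_idx, tr_samples, window=20):
    """Simplified AAS sketch: subtract the locally averaged artefact template.

    eeg: (channels, samples) array; trigger_idx: sample index of each volume
    (TR) onset; tr_samples: samples per TR. Assumes every epoch is complete.
    """
    cleaned = eeg.copy()
    # epoch the data around each scanner trigger: (n_epochs, channels, tr_samples)
    epochs = np.stack([eeg[:, t:t + tr_samples] for t in trigger_idx])
    for i, t in enumerate(trigger_idx):
        lo = max(0, i - window // 2)
        hi = min(len(trigger_idx), i + window // 2)
        template = epochs[lo:hi].mean(axis=0)      # local average artefact
        cleaned[:, t:t + tr_samples] -= template   # subtract the template
    return cleaned
```

When the artefact is identical across epochs (as in a noiseless simulation) the subtraction removes it exactly; in practice the sliding window tracks slow changes in artefact shape.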

Physiological artefacts were then manually removed from the signals. The EEG was first decomposed into statistically independent components (ICs) by application of second order blind identification (SOBI)69, a variant of independent component analysis that identifies a de-mixing matrix maximising the statistical independence of the second-order statistics of the signals.

Each resulting IC was then manually inspected in the time, frequency, and spatial domains by a researcher with 10+ years of experience in EEG artefact removal (author ID). Components judged to contain artefacts (physiological or otherwise) were manually removed before reconstruction of the cleaned EEG. A final visual inspection of the cleaned EEG was performed to confirm that the resulting signals were free from artefacts.

fMRI analysis

The fMRI dataset was used to identify voxels with activity that significantly differs between music listening (trials in which participants listen to music only and trials in which participants both listen to music and report their current emotions) and non-music listening trials (trials in which participants only use the FEELTRACE interface without listening to music).

Specifically, a general linear model was constructed for each participant and used to identify voxels that significantly differ (T-contrast) between these two conditions. Family-wise error rate correction was used to correct for multiple comparisons (corrected p < 0.05). The resulting clusters of voxels were used to identify brain regions exhibiting activity that significantly co-varies with whether the participants were listening to music or not.

Source localisation

An fMRI-informed EEG source localisation approach was used to extract EEG features that are most likely to be informative for reconstructing the music participants listened to from their neural data. To this end we first constructed a high-resolution, accurate conductivity model of the head. We then used a beamformer source reconstruction method, implemented in Fieldtrip70, to estimate the activity at a set of individual source locations in the brain. These source locations were chosen based on the fMRI analysis results on a per-participant basis.

The complete process is illustrated in Fig. 6.

Figure 6

Analysis pipeline illustration. Anatomical MRI is used to construct head models, while fMRI is used to identify voxels that differ between music and no-music conditions. EEG is decomposed via ICA and fMRI-informed source analysis is used to characterise activity at fMRI-identified locations. The resulting feature set is used to train a biLSTM to recover the music a participant listened to. A cross-fold train and validation scheme is used for each participant.

Model construction

A detailed head model was constructed for each participant to model conductivity within the head from each participant's individual anatomical MRI scan. Fieldtrip was used to construct this model70.

The anatomical scan from each participant was first manually labelled to identify the positions of the nasion and the left and right pre-auricular points. The scan was then segmented into grey matter, white matter, cerebrospinal fluid, skull, and scalp tissue using the Fieldtrip toolbox70. Each segmentation was then used to construct a three-dimensional mesh model out of sets of vertices (3000 vertices for the grey matter and the cerebrospinal fluid, 2000 vertices for each of the other segments). These mesh models were then used to create a conductivity model of the head via the finite element method71,72. We specified the conductivity of each layer using the following standardised values: grey matter = 0.33 S/m, white matter = 0.14 S/m, cerebrospinal fluid = 1.79 S/m, skull = 0.01 S/m, and scalp = 0.43 S/m. These values were chosen based on recommendations in Refs.71,73,74.

The EEG channel locations were then manually fitted to the model by a process of successive rotations, translations, and visual inspection. Finally, a lead-field model of the dipole locations inside the conductivity model was computed from a grid of 1.5 × 1.5 × 1.5 cm voxels.

Source estimation

Source estimation was achieved by using the conductivity head model and the eLORETA source reconstruction method75,76 to estimate the electrophysiological activity at specific voxel locations within the head model. Specifically, voxel locations in the model were chosen based on the results of the analysis of the fMRI datasets (see the "fMRI analysis" section).

From the set of voxels identified, via the GLM, as containing activity that significantly differs between the music and no-music conditions, a sub-set of voxel cluster centres was identified as follows.


1. Begin with an empty set of voxel cluster locations V and a set of candidate voxels C, which contains all the voxels identified via our GLM-based fMRI analysis as significantly differing between the music and no-music trials.

2. Identify the candidate voxel with the largest T-value (i.e. the voxel that has the largest difference in variance between the music and no-music conditions).

3. Measure the Euclidean distance between the spatial location of this voxel in the head and all voxels currently in the set V. If the smallest distance is greater than our minimum distance m, add it to the set V.

4. Remove the candidate voxel from the set C.

5. Repeat steps 2–4 until the set V contains n_l voxels.

This process ensures that we select a sub-set of voxel locations that differentiate the music and no-music trials, while ensuring this set of voxels is spatially distinct from one another. This results in a set of n_l voxel locations that characterise the distributed network of brain regions involved in music listening. In our implementation we set the minimum distance m = 3 cm and n_l = 4 voxel locations.
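The selection procedure above can be sketched as a short greedy loop. The following Python function is an illustrative re-implementation rather than the original code; the function name and the coordinate/T-value inputs are our own assumptions:

```python
import numpy as np

def select_voxel_centres(coords, t_values, min_dist_mm=30.0, n_l=4):
    """Greedily select spatially distinct voxel cluster centres.

    coords: (n, 3) voxel locations in mm; t_values: (n,) GLM T-values.
    Repeatedly take the remaining voxel with the largest T-value and keep
    it only if it lies more than min_dist_mm from every voxel kept so far.
    """
    order = np.argsort(t_values)[::-1]   # candidates, largest T first
    selected = []
    for idx in order:                    # removing candidates = advancing the loop
        p = coords[idx]
        if all(np.linalg.norm(p - coords[j]) > min_dist_mm for j in selected):
            selected.append(idx)
        if len(selected) == n_l:
            break
    return selected
```

Since the selected set starts empty, the highest-T voxel is always kept; each later voxel is kept only if it clears the 3 cm spacing constraint.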

Feature set construction

To extract a set of features from the EEG to use for reconstructing the music played to participants, we first use independent component analysis (ICA) to separate the EEG into statistically independent components. Each independent component is then projected back to the EEG electrodes by multiplying the component by the inverse of the de-mixing matrix identified by the ICA algorithm. This gives an estimate of the EEG signals on each channel if only that independent component were present.

This IC projection is then used, together with the pre-calculated head model for the participant, to estimate the source activity at each of the n_l = 4 locations identified by our source estimation algorithm (see the "Source estimation" section). This results in a matrix of 4 × N_s sources for each IC projection, where N_s denotes the total number of samples in the recorded EEG signal set. These matrices are generated for each IC projection and concatenated together to form a feature matrix of dimensions (4 × M) × N_s, where M denotes the number of EEG channels (31 in our experiment). Thus, our final feature set is a matrix of EEG source projections of dimensions 124 × N_s.
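As a sketch of this feature construction, the following hypothetical Python function back-projects each IC individually and applies a stand-in linear source-mapping matrix in place of the actual eLORETA solution (all names are our own; the real pipeline uses Fieldtrip):

```python
import numpy as np

def build_feature_matrix(eeg, unmixing, source_filter):
    """Sketch of the feature construction described in the text.

    eeg: (M, N_s) channel data; unmixing: (M, M) ICA de-mixing matrix W
    (IC time courses = W @ eeg); source_filter: (n_l, M) linear spatial
    filter mapping channel data to the n_l source locations (a stand-in
    for the eLORETA solution).
    """
    M, N_s = eeg.shape
    mixing = np.linalg.inv(unmixing)     # mixing matrix A = W^-1
    sources = unmixing @ eeg             # (M, N_s) IC time courses
    feats = []
    for i in range(M):
        # back-project only component i onto the channels
        proj = np.outer(mixing[:, i], sources[i])   # (M, N_s)
        feats.append(source_filter @ proj)          # (n_l, N_s)
    return np.concatenate(feats, axis=0)            # ((n_l * M), N_s)
```

A useful sanity check is that summing the per-IC source estimates over all components recovers the source estimate of the full EEG, since the IC projections sum to the original data.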

Music prediction

Reconstruction of the music participants heard from fMRI-informed EEG sources is attempted via a deep neural network. Specifically, a stacked 4-layer bi-directional long short-term memory (biLSTM) network is constructed. The first layer is a sequence input layer with the same number of inputs as features (124). Four biLSTM layers are then stacked, each with 250 hidden units. A single fully connected layer is then added to the stack, followed by a regression layer. The architecture of the biLSTM network is illustrated in Fig. 7.
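For illustration, an equivalent architecture can be written in a few lines of PyTorch (the original network was built in Matlab; this sketch is our own and omits training details such as the loss and optimiser):

```python
import torch
import torch.nn as nn

class MusicDecoder(nn.Module):
    """Sketch of the decoder: a 4-layer bidirectional LSTM over the 124
    source-projection features, followed by a fully connected regression
    head predicting one audio sample per time step.
    """
    def __init__(self, n_features=124, hidden=250, layers=4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers,
                            bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, 1)   # forward + backward states

    def forward(self, x):                    # x: (batch, time, 124)
        out, _ = self.lstm(x)                # (batch, time, 500)
        return self.fc(out).squeeze(-1)      # (batch, time)

model = MusicDecoder()
y = model(torch.randn(2, 100, 124))
```

The bidirectional layers mean each output sample can draw on both past and future EEG context, which is appropriate here since decoding is performed offline on complete trials.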

Figure 7

Architecture of the biLSTM used to attempt to recover heard music from our fMRI-informed EEG source analysis.

The music played to each participant is down-sampled to the same sample rate as the EEG (1000 Hz). Both the music and the feature matrix (see the "Feature set construction" section) are then further down-sampled by a factor of 10, from 1000 to 100 Hz.
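This down-sampling step can be sketched with scipy's `decimate`, which low-pass filters before discarding samples to avoid aliasing (an illustrative example, not the original code):

```python
import numpy as np
from scipy.signal import decimate

fs_in, factor = 1000, 10                  # 1000 Hz → 100 Hz
t = np.arange(0, 2.0, 1.0 / fs_in)
audio = np.sin(2 * np.pi * 5.0 * t)       # stand-in for the music waveform

# decimate() low-pass filters before down-sampling to avoid aliasing;
# zero-phase filtering keeps the music and EEG features time-aligned
audio_100hz = decimate(audio, factor, zero_phase=True)
```

Applying the same decimation to the music and the feature matrix keeps the two time series sample-aligned for training.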

The network is trained and tested to predict this music from the EEG sources within a 3 × 3 cross-fold train and test scheme. Specifically, each of the three runs from the experiment is used once as the test set in each fold. The training and testing data comprise the time series of all EEG sample points and music samples from all time points when the participants listened to music (trial types 1 and 2, see the experiment description above) within each run.

Statistical analysis

We evaluate the performance of our decoding model in several ways.

First, we compare the time series of the reconstructed music with the original music played to the participants via visual inspection and via a correlation analysis in both the time and frequency domains. Specifically, the Pearson's correlation coefficient between the original and reconstructed music (down-sampled to 100 Hz) in the time domain is measured. We then compare the power spectra of the original and reconstructed music via Pearson's correlation coefficient. We also measure the structural similarity38 of the time–frequency spectrograms of the original and reconstructed music.

For each of these indices of similarity between the original and reconstructed music we measure the statistical significance via a bootstrapping approach. We first generate sets of reconstructed music under the null hypothesis that the reconstructed music is not related to the original music stimuli by shuffling the order of the reconstructed music trials. We repeat this 4000 times for each similarity measure (correlation coefficients and structural similarity) and measure the similarity between the original music and the shuffled reconstructed music in each case in order to generate null distributions. The probability that the measured similarity between the original music and the un-shuffled reconstructed music is drawn from this null distribution is then measured in order to estimate the statistical significance of the similarity measures.
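The shuffling procedure can be sketched as follows. This is an illustrative re-implementation under our own naming; the similarity function is passed in so the same routine covers both the correlation and the structural-similarity measures:

```python
import numpy as np

def bootstrap_p_value(similarity, reconstructed, originals, n_boot=4000, seed=0):
    """Trial-shuffling significance test sketch.

    Build a null distribution by pairing each original trial with a randomly
    shuffled reconstructed trial, then ask how often the null similarity
    meets or exceeds the observed (un-shuffled) similarity.

    reconstructed, originals: lists of per-trial signals;
    similarity: callable mapping a (reconstruction, original) pair to a score.
    """
    rng = np.random.default_rng(seed)
    observed = np.mean([similarity(r, m) for r, m in zip(reconstructed, originals)])
    null = np.empty(n_boot)
    for b in range(n_boot):
        perm = rng.permutation(len(reconstructed))
        null[b] = np.mean([similarity(reconstructed[p], m)
                           for p, m in zip(perm, originals)])
    return (np.sum(null >= observed) + 1) / (n_boot + 1)   # one-sided p
```

The +1 terms give the standard conservative permutation p-value, so the result can never be exactly zero.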

Second, we use the reconstructed music to attempt to identify which piece of music a participant was listening to within each trial. If the decoding model is able to reconstruct a reasonable approximation of the original music then it should be possible to use this reconstructed music to identify which specific piece of music a participant was listening to in each trial.

Specifically, we first z-score the decoded and original music time series in order to remove any differences in amplitude scaling. We then band-pass filter both signals in the range 0.035 Hz to 4.75 Hz. These parameters were chosen to preserve the visually apparent similarities in the amplitude envelopes of the original and decoded music, which were observed upon visually inspecting a subset of the data (participants 1 and 2).
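A minimal sketch of this normalisation and filtering step follows (our own implementation using scipy; the original text does not specify the filter design, so the Butterworth filter and its order are assumptions):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def zscore_bandpass(x, fs=100.0, lo=0.035, hi=4.75, order=3):
    """Z-score a signal, then zero-phase band-pass filter it between
    0.035 and 4.75 Hz to retain the slow amplitude-envelope structure.
    """
    z = (x - x.mean()) / x.std()
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, z)   # forward-backward filtering: no phase shift
```

Zero-phase filtering matters here because any phase shift would misalign the decoded and original envelopes before the trial-identification step.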

We then segmented the signals into individual trials as defined by the original experiment. Specifically, each trial is 40 s long and comprises a single piece of music. For a given trial the structural similarity is measured between the time–frequency spectra of the original music played to the participant in that trial and the spectra of the reconstructed music. The structural similarity is then also measured between the time–frequency spectra of the reconstructed music for that same trial and the time–frequency spectra of the original music played to the participant in all the other trials in which the participant heard a different piece of music. Specifically, we measure

$$\begin{aligned} C_{k,k} = \text{ssim}\left( R_k, M_k \right), \end{aligned}$$



$$\begin{aligned} C_{k,i} = \text{ssim}\left( R_k, M_i \right) \quad \forall i \in A, \end{aligned}$$


where R_k denotes the time–frequency spectrogram of the reconstructed music for trial k, M_i denotes the time–frequency spectrogram of the original music played to the participant in trial i, and ssim indicates the use of the structural similarity measure. For a given trial k the value of C_{k,i} is measured for all trials in the set A (i ∈ A), where A is defined as

$$\begin{aligned} A = \{1, \ldots, N_t\} \setminus \{k\}, \end{aligned}$$


and denotes the set of all trials 1, …, N_t (where N_t denotes the number of trials) excluding the trial k for which we reconstructed the music played to the participant via our decoding model.

We then order the set of structural similarity measures C = {C_{k,i} ∀ i ∈ 1, …, N_t} and identify the position of C_{k,k} in this ordered list in order to measure the rank accuracy of trial k. Rank accuracy measures the normalised position of C_{k,k} in the list and is equal to 0.5 under the null hypothesis that the music cannot be identified. In other words, rank accuracy measures the ability of our decoder to correctly decode our music by measuring how similar the decoded and original music are to one another compared to the similarity between the decoded music and all other possible pieces of music. Finally, we measure the statistical significance of our rank accuracy via the method described by Ref.77.
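Rank accuracy itself is simple to compute; the following sketch (our own naming and tie-handling convention) returns the normalised position of C_{k,k} among all candidate similarities:

```python
import numpy as np

def rank_accuracy(similarities, k):
    """Normalised rank of the matching trial k.

    similarities: the values C_{k,i} for all trials i, including i = k.
    Returns 1.0 when the true piece is the best match and 0.5 at chance.
    """
    c = np.asarray(similarities, dtype=float)
    # count how many non-matching trials C_{k,k} beats (or ties with),
    # normalised by the number of non-matching trials
    rank = np.sum(c[k] >= np.delete(c, k))
    return rank / (len(c) - 1)
```

Averaging this quantity over all trials gives the per-participant rank accuracy reported in the results.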

Effect of tempo

A number of studies have reported significant effects of music tempo on the EEG39,40,41,42. Therefore, we investigate whether the tempo of the music played to participants significantly affects the performance of our decoding model.

Specifically, we estimate the range of tempos within each 40 s long piece of music stimuli and the corresponding mean tempo. We then test whether the mean tempo of the music significantly affects the performance of our decoding model by measuring the Pearson's correlation coefficient between the mean tempo of the music played to the participant within each trial and the corresponding rank accuracy measure of the decoder's performance for that same trial. Additionally, we also measure the likelihood that the mean tempo for the music within a single trial was drawn from the distribution of mean tempos over all trials. This allows us to estimate whether the tempo of the music within a trial is 'typical' or less 'typical'. We measure the correlation between this measure of the typicality of the tempo of the music and the performance of the decoder on each trial to identify whether trials with unusual tempos (faster or slower than typical) are classified more (or less) accurately.
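As an illustration of this analysis, the sketch below correlates per-trial mean tempo with rank accuracy and uses a Gaussian density as a stand-in 'typicality' score; the Gaussian assumption and all names are our own, as the original text does not specify how the likelihood was estimated:

```python
import numpy as np
from scipy.stats import pearsonr, norm

def tempo_effect(mean_tempos, rank_accuracies):
    """Correlate decoder performance with tempo and with tempo typicality.

    mean_tempos, rank_accuracies: one value per trial. Returns the
    (r, p) pairs for the tempo correlation and the typicality correlation.
    """
    tempos = np.asarray(mean_tempos, dtype=float)
    accs = np.asarray(rank_accuracies, dtype=float)
    r_tempo, p_tempo = pearsonr(tempos, accs)
    # likelihood of each trial's tempo under the pooled tempo distribution
    # (Gaussian stand-in): high values = 'typical' tempos
    typicality = norm.pdf(tempos, loc=tempos.mean(), scale=tempos.std())
    r_typ, p_typ = pearsonr(typicality, accs)
    return (r_tempo, p_tempo), (r_typ, p_typ)
```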

In both cases we hypothesise that if our decoder is predominately making use of the tempo of the music there will be significant correlations between the decoder's performance and either the tempo of the music or the likelihood (typicality) of the tempo of the music.

Confound consideration

The use of headphones to play music to participants presents one potential confounding factor in our analysis. Although the headphones we used were electromagnetically shielded in a manner suitable for use within an fMRI scanning environment, there is a possibility that their proximity to the EEG electrodes led to some induced noise in the recorded EEG signals. This noise could be either electromagnetic noise from the electrical operation of the headphones or vibrotactile noise from the vibration of the headphones.

We anticipate that if this is the case the noise removal applied to the EEG should remove this noise. Indeed, our visual inspection of our cleaned EEG signals did not reveal any apparent induced noise. Nonetheless, we cannot discount the possibility that some residual noise from the headphones (either electromagnetic or vibrotactile in nature) remains in the EEG signal and that this is used as part of the decoding process.

The only way to verify that this was not the case is to attempt to repeat the experiment without the use of headphones. Therefore, we make use of another dataset recorded by our team78 using conventional speakers positioned over 1 m away from participants to play similar pieces of music. This dataset contains just EEG recorded from participants while they listened to similar sets of synthetic music stimuli in a separate experiment. As this dataset only contains EEG data, participant-specific fMRI-informed source analysis is not possible. Instead, we use the averaged fMRI results from all our participants in our EEG-fMRI experiments to provide an averaged head model and averaged source dipole locations for the fMRI-informed source analysis step in our decoding pipeline.

We first detail this dataset and then go on to describe how we adapted our analysis pipeline to attempt to decode the music played to participants in this experiment.


Our EEG-only dataset was originally recorded as part of a set of experiments to develop an online brain-computer music interface (BCMI). These experiments, their results, and the way the dataset was recorded are described in detail in Ref.78. We also describe the key details here.

A cohort of 20 healthy adults participated in our experiments. EEG was recorded from each participant via 32 EEG electrodes positioned according to the international 10/20 system for electrode placement at a sample rate of 1000 Hz.

Participants were invited to take part in several sessions to first calibrate, then train, and finally to test the BCMI. For our purposes in this present study we only use the EEG data recorded from participants during the calibration session.

In the calibration session a series of synthetic music clips was played to participants. Each clip was 20 s long and contained pre-generated piano music. The music was generated by the same process used for our EEG-fMRI experiments (see the "Stimuli" section). A total of 90 unique synthetic music clips were played to the participants in random order. Each clip was generated for the purpose of the experiment (ensuring the participants had never heard the clip before) and targeted a specific affective state. Participants were instructed to report their currently felt affect as they listened to the music using the FEELTRACE interface in the same manner as in the joint EEG-fMRI experiments described above.

Details of the dataset and accompanying stimuli are described in Refs.61,78. The data is also published in Ref.79.


Ethical approval for recording this second dataset was also granted by the University of Reading research ethics committee, where the study was originally conducted. All experimental protocols and methods were performed in accordance with relevant ethical guidelines. Informed consent was obtained from all participants.


Our decoding model is modified slightly to attempt to reconstruct the music played to participants in the EEG-only experiments. Specifically, we use the mean of the fMRI results from our cohort of participants in our joint EEG-fMRI dataset to identify the set of voxels for use in our fMRI-informed EEG analysis. Additionally, the head model used in the fMRI-informed EEG source analysis step in our decoding model is constructed from an averaged MRI anatomical scan provided within SPM1267.

All other stages of our decoding model and analysis pipeline (including EEG source localisation, biLSTM network structure, and statistical analysis) are the same.
