Data Scientists Uncovered: Episode 2

Before departing the DSI for the data-driven community of Silicon Valley in California, Dr. Lei Nie sat down to discuss his research interests and his paper concerning brain networks. He will be missed here at DSI, and we wish him much luck in the future! Click here for more on Lei Nie.


– What was your research focus before writing this paper?

Before this area of research, I was mostly focused on biological network inference. Inferring protein networks then directed my focus to brain networks. A network, put simply, is a set of nodes and the connections between them. For most networks, the nodes are well defined. For example, in a protein network, a node is a specific protein and the connections are the interactions between proteins. When my research began to focus on brain networks, I found that they are different. The definition of a node is unclear in brain networks. For example, a node could be a neuron, a piece of tissue, or a brain region. These vary in size and other properties. I wanted to find a proper way to define the nodes of brain networks.

– What are your main interests in research? What areas are you primarily interested in?

In most previous work, the nodes of a brain network are defined using brain parcellations. A brain parcellation divides a brain into several regions, each focusing on specific functions. A node represents a brain region, and a connection models the relationship between two brain regions.
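To make the idea concrete, here is a minimal sketch of how a parcellation can turn raw signals into a network. It is only an illustration under stated assumptions: each brain location ("vertex") has a short time series, the parcellation is a list of region labels, and the connection between two regions is taken to be the Pearson correlation of their averaged signals. The function names and the toy data are hypothetical, and the interview does not say which connection measure the paper uses.

```python
from math import sqrt

def region_time_series(signals, labels, n_regions):
    """Average the signals of all vertices assigned to each region.

    Assumes every region contains at least one vertex.
    """
    n_time = len(signals[0])
    sums = [[0.0] * n_time for _ in range(n_regions)]
    counts = [0] * n_regions
    for signal, region in zip(signals, labels):
        counts[region] += 1
        for t, value in enumerate(signal):
            sums[region][t] += value
    return [[s / counts[r] for s in sums[r]] for r in range(n_regions)]

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def brain_network(signals, labels, n_regions):
    """Return a region-by-region matrix of connection strengths."""
    nodes = region_time_series(signals, labels, n_regions)
    return [[pearson(nodes[i], nodes[j]) for j in range(n_regions)]
            for i in range(n_regions)]

# Toy data (made up): 4 vertices, 4 time points, 2 regions.
signals = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [4.0, 3.0, 2.0, 1.0],
    [5.0, 4.0, 3.0, 2.0],
]
labels = [0, 0, 1, 1]  # the parcellation: one region label per vertex
network = brain_network(signals, labels, 2)
```

Changing `labels` changes which vertices are averaged into each node, so the same signals can yield a different network under a different parcellation.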

In most previous studies of brain networks, brain parcellations are fixed across all subjects: every subject is assumed to have an identical parcellation. Recent studies show this is not true; parcellations vary across subjects. Assuming they are all exactly the same introduces extra noise when analyzing the brain networks. I wanted to design a method to infer the individual variations of the brain parcellations.

– What influenced you to write this paper?

When I began my research, I found some promising results, which encouraged me to continue. I also think the results will be helpful to other researchers and to brain research once they are published.

– Broadly, could you summarize your method and findings in the paper?

First, I designed an algorithm called joint k-means, which is based on the widely used clustering algorithm k-means. This method can parcellate the brains of a group of subjects simultaneously. During execution, the parcellations share information to improve robustness to noise. We then used this method to parcellate a large group of subjects from the Human Connectome Project, and I found something interesting.

  1. First, we found that the most variable area in the cortex had previously been found to be highly correlated with intelligence (IQ).
  2. Individual parcellations, unlike the group-average one, are highly unique to each subject. We can use the individual parcellations to identify subjects with 99.9% accuracy. This shows that each subject has a different parcellation, which makes the previous assumption inappropriate.
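The identification result in the second point can be pictured with a small sketch. This is not the procedure from the paper, which the interview does not describe; it assumes that each subject has a stored reference parcellation, that region labels mean the same thing across parcellations, and that a new scan is matched to the subject whose reference agrees with it on the most vertices. The names `agreement` and `identify` and the toy label lists are hypothetical.

```python
def agreement(a, b):
    """Fraction of vertices that receive the same region label in both maps."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

def identify(query, references):
    """Return the subject whose reference parcellation best matches `query`."""
    return max(references, key=lambda subject: agreement(query, references[subject]))

# Toy reference parcellations (made up): one region label per vertex.
references = {
    "subject_a": [0, 0, 1, 1, 2, 2],
    "subject_b": [0, 1, 1, 2, 2, 2],
}

# A parcellation from a second scan; it differs from subject_a at one vertex.
query = [0, 0, 1, 1, 2, 1]
best_match = identify(query, references)
```

If individual parcellations were all identical, as the fixed-parcellation assumption implies, every reference would match the query equally well and identification would be no better than chance.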

– What challenges arose during your research?

There were several challenges while completing my research. First, we were dealing with big data: the total data size was 10 terabytes. We had to simplify the data set to fit it into memory. Given the size of the data set, only simple algorithms work, not complex ones. Second, we used functional MRI to do the parcellation. Functional magnetic resonance imaging, or fMRI, is a technique for measuring brain activity. It is used to produce activation maps showing which parts of the brain are activated during a particular mental process. fMRI data is very noisy, so the majority of the work was fighting against the noise to extract useful information.

– What is the significance behind using the joint k-means method in your research?

Before creating the joint k-means method, I tried the most commonly used method, k-means. I chose k-means because it handles large data sets efficiently. But I found some drawbacks: k-means can only parcellate each cortex independently, and it is sensitive to noise. So even if two fMRI data sets are acquired from the same subject under the same conditions, k-means may produce very different parcellations. We didn’t want this.

Our assumption is that if the conditions are the same for the same subject, the parcellations should be very similar. Joint k-means is based on k-means, but we added some prior knowledge to guide the generation of parcellations. Moreover, the prior knowledge can be learned from the data. The idea is to parcellate a group of subjects simultaneously to get a distribution of parcellations; this distribution is then used to refine the parcellations. These two steps are executed iteratively.
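The two alternating steps can be sketched roughly as follows. This is not the published joint k-means algorithm, only a toy illustration of the idea under several assumptions: each vertex has a single numeric feature, all subjects share the same vertex order and start from the same initial centroids so that region labels line up, the "distribution of parcellations" is the smoothed frequency of each label at each vertex, and the prior enters the k-means cost as a log penalty scaled by a hypothetical `weight` parameter.

```python
from math import log

def joint_kmeans(subjects, init_centroids, weight=1.0, n_iter=10):
    """Parcellate all subjects together; return one label list per subject.

    subjects: one list of 1-D vertex features per subject (same vertex order).
    weight:   strength of the shared prior; 0.0 reduces to independent k-means.
    """
    k = len(init_centroids)
    n_vertices = len(subjects[0])
    centroids = [list(init_centroids) for _ in subjects]
    # Start from a flat prior: every region equally likely at every vertex.
    prior = [[1.0 / k] * k for _ in range(n_vertices)]
    labels = []
    for _ in range(n_iter):
        # Step 1: parcellate each subject, guided by the shared prior.
        labels = []
        for s, features in enumerate(subjects):
            subject_labels = []
            for v, x in enumerate(features):
                costs = [(x - centroids[s][c]) ** 2 - weight * log(prior[v][c])
                         for c in range(k)]
                subject_labels.append(costs.index(min(costs)))
            labels.append(subject_labels)
            # Standard k-means centroid update for this subject.
            for c in range(k):
                members = [x for x, l in zip(features, subject_labels) if l == c]
                if members:
                    centroids[s][c] = sum(members) / len(members)
        # Step 2: re-estimate the distribution of parcellations from all
        # subjects (add-one smoothing keeps every probability above zero).
        for v in range(n_vertices):
            for c in range(k):
                count = sum(1 for subject_labels in labels if subject_labels[v] == c)
                prior[v][c] = (count + 1) / (len(subjects) + k)
    return labels

# Toy data (made up): 3 subjects, 4 vertices; subject 3 has a noisy vertex 1.
subjects = [
    [0.1, 0.2, 0.9, 1.0],
    [0.0, 0.1, 0.8, 1.1],
    [0.2, 0.6, 0.9, 1.0],
]
independent = joint_kmeans(subjects, [0.0, 1.0], weight=0.0)
joint = joint_kmeans(subjects, [0.0, 1.0], weight=1.0)
```

On this toy data, `independent[2]` puts the noisy vertex in region 1, while `joint[2]` pulls it back to region 0 because the other two subjects agree on that vertex. The same prior that smooths noise can also suppress genuine individual differences if `weight` is set too high, so the value would need tuning on real data.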

– What further research could this paper lead to?

Our results show that variations in the parcellations across subjects are significant. What I want to happen next is for others to use these variations and determine whether they are correlated with people's behavior, e.g. IQ, or with their satisfaction or achievements in life.

For brain networks, I encourage future studies not to use fixed parcellations, because they are inappropriate. Using individual parcellations to form the nodes of brain networks may give more accurate results.

– What is next for you in research?

The study of the brain is very interesting to me. For most organs, e.g. the heart and lungs, we understand the mechanisms by which they carry out their functions. However, we don’t know that much about the brain. I also want to help bring about health check-ups for brains. We can get health checks on almost every organ except the brain. Moreover, there are transplants or replacements for almost every organ, but not for the brain. The brain is a very special organ, and we have to take more care of it.

