Co-head-researcher for the ENCODE project, head of the CRG Bioinformatics and Genomics Group and professor at Pompeu Fabra University
He is one of the most world-renowned Catalan scientists. He has participated in various international projects to analyze genome sequencing: identifying genes, automatically extracting information from databases, analyzing protein sequences and molecular evolution. He moved to the Los Alamos National Laboratory, where he did a postdoctoral fellowship and began to delve into issues related to genome analysis. He has worked as a researcher at the Hospital del Mar Research Institute and has coordinated the Bioinformatics and Genomics program at the Center for Genomic Regulation.
The Center for Genomic Regulation (CRG) is celebrating its tenth anniversary, which coincides with the publication of the results of one of the most important studies of the 21st century in the field of the life sciences, the decoding of the human genome map, which the Bioinformatics and Genomics research group led by Roderic Guigó took part in. The CRG is the only Catalan research center to participate in the first and second stages of this project (ENCODE) funded by the US National Institutes of Health, and is the only Spanish center to continue on in the third stage, which begins this December.
Catalonia is the most active Spanish community in genomics. The CRG, whose facilities are located at the emblematic Barcelona Biomedical Research Park (PRBB), is a benchmark center in this field. The level of excellence at this basic biomedical research institute can be measured according to many parameters, one of which is the 11 grants they have received from the European Research Council (3 Advanced and 8 Starting) in recent years.
How do you feel now, just a few weeks after presenting the results of the ENCODE project in London?
(Laughs) Happy to have been a part of it, but it’s a project that is finished and now we are working on others, including the continuation of ENCODE. We’re as busy now as we were a few weeks ago!
How can these studies improve quality of life for the general public?
It’s hard to say, or to give specific examples of the aspects of people’s lives that will improve, when we’re talking about such basic research. I’ll answer the question in more general terms: when we know how things work, we’re more capable of fixing them. In the case of ENCODE, if we know more about how living beings work, in theory this knowledge will help us correct body functions when they are not what we want. The translation from basic scientific knowledge to application, in this case the possibility of developing new therapies or approaches to disease, is a process that can take decades.
Will a day come when we will know that we are predisposed to certain diseases and, thus, able to prevent them?
Right now we know the genetic basis of many diseases and we know that in some specific cases (Mendelian disorders) the disease is caused by a single mutation. Doctors are already doing prenatal interventions to prevent embryos with these mutations from developing. However, there are diseases that aren’t triggered by just one gene but by more general changes in the genetic environment. In these cases, we can’t predict that an individual will suffer from a specific disease with any certainty, but we can calculate whether they have a higher than average probability of developing it. There are many companies that offer genetic testing to determine a patient’s predisposition to a specific disease.
Will the ENCODE results help us move forward in this area?
There are other projects that are working towards obtaining the genome sequences of many individuals and correlating changes in those sequences with changes in the biological characteristics of the organisms. These projects will allow us to draw links between mutation and disease, but how this link occurs, the series of biological events that lead from mutation to disease, is the path the ENCODE project can help us move forward on.
The data from the ENCODE project findings are available online to the scientific community as a whole. Why has this been done?
It couldn’t be any other way, as our research is publicly funded, meaning it is paid for by society as a whole. The research findings should be part of the public domain almost immediately because they are the result of all of our efforts. And, in fact, the most important breakthroughs in genomics and biology have been made possible thanks to the fact that researchers from around the world have had access to data. This has sped up progress exponentially.
How did the CRG come to participate in this project funded by the United States?
Our research group has been working in the field of computational genomics for years now, since I did my postdoctoral work in the United States, in fact. When I was there, in the early nineties, I began working with a program to develop methods to analyze and predict genome sequences. At that time there were very few genome sequences; it was the beginning of this controversial revolution. And, since then, our work in this area has become more widely recognized. We also participated in the human genome project and, after that, we participated more heavily in the mouse genome. When the US National Institutes of Health were considering the path to take after the human genome sequencing, one of the options was the ENCODE project, which began in 2003. We submitted a proposal, they accepted it and we participated in the first and second stages of the project (the results of which we have just recently presented), and now we have also been asked to participate in the third stage, which is set to begin in early December.
Is ENCODE a success story, an example of how research should be done nowadays, in teams, globally?
Well, there are many ways to do research. It’s true that this project has involved scientists from around the world (more than 400) and was more collaborative than competitive in nature. Scientists put a high value on being recognized for their discoveries, being the first, and this leads to a certain secrecy in some cases. With the ENCODE project, we have worked in a way that all ideas belong to the whole consortium, and it has yielded positive results. Nevertheless, there are other ways to do research in biology, in which researchers work in their own laboratories and hold on to their findings until they are published. I think we could create a model in which data is immediately accessible without obliging researchers to be the first to publish an article.
What are the next steps for ENCODE?
In the first meetings of the third stage we will define the lines to follow. One option is to study genome behavior in all types of human cells (brain, bone, muscle, liver, etc.) because so far the ENCODE project has only looked at a couple dozen cell types.
What funding will the project receive from the US Administration?
Roughly $1 million.
Do you expect to hold another scientific meeting in Barcelona, like the one in July 2010 organized under the framework of B·Debate (then known as the International Center for Scientific Debate)?
That’s something we’ll have to assess because it would put Barcelona on the global genome research map. It already is, but we don’t want to rest on our laurels! More than 95% of the participants in this new stage are from North America, so if we don’t get funds to cover part of the expense of the congress it will surely be held in the United States… Although we all love coming here!
Tell us about the group of scientists you lead.
It is multidisciplinary, including statisticians, computer scientists, biologists, physicists and more. It is a very dynamic team with a high turnover rate. Younger and younger people are joining the team, above all from other countries, whose training is intrinsically multidisciplinary, meaning that they have studied biology and computer science at the same time and aren’t really biologists or computer scientists, but specialists in bioinformatics or biologists and computer scientists.
A highly international environment.
Some 75% of the fifteen scientists that make up the team are foreigners, although the proportion from Catalonia has grown recently. As I mentioned, there are countries that offer simultaneous training in biology and computer science, and having a person on the team who knows programming, has experience in statistics, and also understands biology programs is a priority over someone who has to be trained from scratch in highly competitive projects like this one. I don’t think it’s bad that a large percentage of the team is from abroad because our world isn’t just Catalonia. We want to attract the best and scientists by nature are highly mobile. It’s important that the best students in the world count our centers, universities and research groups among their goals, on par, in terms of their scientific career, with Stanford, Cambridge, Oxford or Harvard.
Catalonia needs bioinformatics experts and you have said on occasion that we aren’t training students in this field.
Even though our research groups have serious financial problems right now, if there is one profile they are looking for it is someone with a background in bioinformatics. Right now, the only path is to complete an undergraduate degree in biology or computer science and then a graduate degree to specialize in bioinformatics. This process is a bit long and I think it would be good for there to be a more hybrid training option.
What other projects are you currently working on?
Our group’s research always seeks to understand how information is transferred from the DNA sequence to the amino-acid sequence in proteins, and all of the molecular processes involved. We have a project funded by the European Research Council (ERC) that is in the same line as the ENCODE project, but in this case we aim to see how genome function changes during the cell differentiation process, which is when a cell of one type turns into another type. It’s like a dynamic ENCODE. We are also working on two other very interesting projects that involve genome variation and how this is manifested in RNA production.