In this module we are working on data analysis, programming languages, exploratory data analysis, and functional annotation. But it may be worth raising a very timely question: how might foundation models change the way we analyse and interpret omics data?
In this context, foundation models are large-scale models trained on massive biological datasets, such as DNA sequences, transcriptomes, proteomes, or single-cell profiles, to learn general representations that can later be reused across many different tasks. In recent years they have started to be applied in genomics, functional annotation, variant prediction, and single-cell analysis, and they are increasingly presented as a new layer of infrastructure for bioinformatics.
This raises an interesting question: if these models can learn complex patterns at scale, could they transform tasks such as functional annotation, variant interpretation, multi-omics integration, or even part of exploratory data analysis? There are already studies showing promising results in genome annotation at single-nucleotide resolution and in the use of pretrained models for diverse downstream tasks with limited additional data.
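To make the "reuse with limited additional data" idea concrete, here is a minimal sketch of how such transfer typically looks in practice: a pretrained DNA foundation model serves as a frozen feature extractor, and a small classifier is trained on its embeddings from only a handful of labelled sequences. The checkpoint name, the mean-pooling step, and the toy sequences and labels are illustrative assumptions (based on publicly documented usage of the Nucleotide Transformer family), not a prescription from the papers cited below.

```python
# Sketch: pretrained DNA foundation model as a frozen feature extractor.
# Assumes the transformers and scikit-learn packages; the checkpoint id
# below is an assumption -- substitute whichever model you actually use.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM
from sklearn.linear_model import LogisticRegression

CHECKPOINT = "InstaDeepAI/nucleotide-transformer-500m-human-ref"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForMaskedLM.from_pretrained(CHECKPOINT)
model.eval()

def embed(sequences):
    """Mean-pool the last hidden layer into one vector per sequence."""
    batch = tokenizer(sequences, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**batch, output_hidden_states=True)
    hidden = out.hidden_states[-1]                # (batch, tokens, dim)
    mask = batch["attention_mask"].unsqueeze(-1)  # zero out padding tokens
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Toy labelled set: even a few examples can probe whether the pretrained
# representation already separates the classes of interest.
seqs = ["ATGCGTACGTTAGC", "GGGCCCTTTAAACG"]  # placeholder sequences
labels = [1, 0]                              # e.g. enhancer vs background
clf = LogisticRegression().fit(embed(seqs), labels)
```

The point of the sketch is the division of labour: the expensive pretraining is amortised once, while the task-specific part stays small enough to inspect and validate with the classical statistics this module covers.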
At the same time, an important caution emerges: greater predictive power does not automatically mean greater biological understanding. Recent benchmarking studies show real promise, but also make clear that major challenges remain in interpretability, rigorous evaluation, generalization, and biological reliability.
So perhaps the question is not only whether foundation models will change bioinformatics, but also what role expert knowledge will continue to play.
- Will EDA, statistical validation, and classical functional annotation remain just as important?
- Or will some of these tasks increasingly be mediated by ever more powerful pretrained models?
- And above all, how do we prevent a very powerful tool from also becoming a new black box?
References
· Dalla-Torre H, et al. Nucleotide Transformer: building and evaluating robust foundation models for human genomics. Nature Methods (2025).
· de Almeida BP, et al. Annotating the genome at single-nucleotide resolution with DNA foundation models. Nature Methods (2025).
· Feng H, et al. Benchmarking DNA foundation models for genomic and genetic tasks. Nature Communications (2025).
· Tomaz da Silva P, et al. Nucleotide dependency analysis of genomic language models. Nature Genetics (2025).