Project Vision & Research Goals

Applied Foundation Model Research (Appl-FM) - for Science and Industry

Foundation Models (FMs) are large, general-purpose Machine Learning (ML) models that are trained on very broad data and can then be adapted to many different tasks instead of being built for just one specific problem. FMs can be considered what economists refer to as a general-purpose technology. General-purpose technologies refer to technologies like the steam engine and electricity, which drive waves of transformation and productivity growth due to their pervasiveness, improvement over time, and ability to spawn complementary innovations (a host of products and services that revolve around one core product). While FMs may not be pervasive now, they seem poised to be the basis of widespread technological innovations and have the key hallmarks of a general-purpose technology. As a result, these models are likely to bring about important economic impacts in three broad areas: productivity, wage inequality, and ownership.

Project Vision

Future aspects of ML will be largely concerned with FMs, their improved generalizability, their multimodality, and their translation into real-world applications. Our aim of this interdisciplinary initiative is to understand and explore methods for FMs in application for research. We choose three research scenarios: Robotics and Behavioral Learning, Quantitative Biology, and Predictive Medicine. ML systems working across these scenarios require learning data efficiently from multiple modalities, scarce data, with sparse distributions, and in very high dimensionality.

FMs can address these requirements. This new, successful paradigm enables the training of one model on broad data at scale and adapting it to many applications. These powerful models are already generating photo-realistic images from text input or engaging in real-time communications at scale to name just a few. However, adapting these models to further applications is not trivial or even impossible, which blocks their use for a majority of industrial, medical and scientific use-cases, which is exactly the overarching issue we address.

Research Goals

Our scientific goals focus on understanding how these FMs can be adapted with small, sparse training data and novel brain-inspired architectures. We would like to understand how these models can learn continuously, how we can learn complex motion trajectories by demonstration, and finally how we can holistically evaluate their robustness, fairness, and explainability in an automated fashion.
We will focus on robust and data-efficient models for three research scenarios: robotics, quantitative biology, and predictive medicine (R1-R3). Our aim is to make advances in adaptation, applicability, and deployment of FMs for applications in the selected research scenarios. They serve as a demonstration of potential applications, where insights thus gained can be translated to other domains.

Infografic Overview research scenarios & Machine Learning methods within the project

We define challenges based on our research questions in the scenarios (R1-R3). The challenges lead to emerging requirements for intelligent data processing. These needs are then aligned with four key areas of the field of ML (M1-M4; research methods). Those in return will progress the research scenarios once they are successfully implemented.

Research Challenges

R1	In Robotics, a considerable challenge is to transfer human manipulation and locomotion skills using learning from demonstration to robotics. Recording with human interaction is time consuming and expensive. Ideally, the robot should learn directly from observing complex demonstrations.
R2	In Quantitative Biology & Smart Microscopy, a main challenge is to extract specific quantitative information from imbalanced, rare imaging data sets with the goal to improve effectiveness and reproducibility in clinically relevant fields.
R3	In Predictive Medicine, it is a challenge to support doctors with trustworthy decision support systems for comprehensive sets of diseases based on few, multimodal data.

Despite the diversity of the aforementioned challenges, the solutions to these challenges share fundamental data requirements:

Predicting, retrieving, and generalizing from rare, sparse, and continually arriving data.
Multimodality and event detection.
Data quality, explainability, robustness, and fairness.

We will investigate how FMs can be adapted and augmented to meet these requirements focusing on four key areas of the field of ML (M1-M4).

Key Areas of the Field of ML - Research Methods

To meet the challenges, satisfy the data requirements from the various research scenarios, and to achieve our scientific goals, we focus on the following methods (M1-M4):

M1	Continual Few-Shot Learning. We will research novel brain-inspired methods and architectures, with the aim of improving capabilities in few-shot and continual learning.
M2	Vision and Motion. We will study data-centric workflows for annotating vision models that support human feedback, incorporate multimodal data, and learn complex motion by demonstration.
M3	Holistic Evaluation, Fairness, Robustness. We will examine how to automate and holistically evaluate FMs, their robustness, fairness, and explainability.
M4	Adapting Text Embeddings. We will refine text presentations in a multimodal setup, based on existing FMs.

Long-term Impact

The long-term impact of FM research is difficult to predict with certainty, as it is an area that is still rapidly evolving and developing. However, it is likely that FMs will play an increasingly important role in a wide range of applications across various fields. Despite the research in FMs to date, it remains unclear if the standard architectures today could be truly applicable to specific requirements for multiple domains, tasks, and languages. Some potential impacts of our FM research in the long term could include:

Accessibility. One of the most prominent arguments for providing access to systems is to avoid allowing a few high-resource organizations to collect so much power, such that they are the only groups capable of developing and deploying these systems. Large technology companies can create powerful AI systems because of their access to training data, computing infrastructure, and commercial capabilities for deploying that system. This monopolization also gives these high-resource organizations more influence over AI development, the behavior of these systems, and the narrative and direction of the field. Our research intends to roll back this negative trend. Our long-term impact is in methods for adapting existing publicly available models for analyzing bias and harmfulness in a holistic and automated fashion, and in providing multimodal models to both well-studied and currently neglected niche domains.

Disciplinary diversity in our research group. Research on building FMs has occurred almost exclusively in industry. Given the importance of disciplinary diversity in understanding and solving problems that combine technical, ethical, legal, social, and political dimensions, we see academia playing a crucial role in investigating FMs.