Peptide Computational Methods and Applications
Location: CECAM-IRL
The global peptide market is rapidly expanding, with its value estimated at $14.4 billion, accounting for 1.5% of the total worldwide pharmaceutical market. These peptides are usually between 5 and 40 amino acids long and have a diverse range of applications and many advantages, including their ability to be produced at large scale. Significant advancements in synthetic and recombinant technologies have been a driving force in bringing bio-active peptides back to center stage as therapeutic and diagnostic tools. However, similar advances in artificial intelligence (AI) and data analytics methods for peptides are needed.
Recent advances in protein structure prediction have enabled rapid design of proteins with desired structural properties. However, understanding how to design proteins with specific functional or multi-functional properties is still challenging. This is critically important for smaller proteins (peptides), where there is often less secondary or tertiary structure to predict and functionality depends heavily on extrinsically bound factors.
New computational technology that leverages the AI advances seen in protein modelling is clearly needed to meet the requirements of the community. However, peptide datasets are generally much smaller than protein training datasets, making it harder to build relevant AI representations, especially for non-canonical amino acids. Scaling AI successfully in peptide studies requires collaboration that integrates data collection, software development, transfer learning and implementation.
This workshop seeks to bring together experts currently working in AI and peptide research. It will highlight current state-of-the-art machine learning techniques, including large language models (LLMs) and explainable AI, being applied to peptide analysis and prediction. There will be a focus on establishing open-source guidelines and best practices in light of the recent EU AI Act. Specifically, this workshop has the following objectives:
Session 1: AI and simulation approaches for peptide discovery. This session will focus on the discovery of peptides with target structures or functions. It will show uses of state-of-the-art AI methods, including LLMs and generative AI, for purposes including but not limited to:
- discovering linear, cyclic or other rigid-structure peptides.
- creating peptides with unusual amino acids.
- discovering regulatable peptides.
- discovering multi-functional peptides.
Session 2: Peptide creation with low data or in uncertain environments. During drug development of peptide leads, the most critical data often comprise relatively few compounds in a particular class with relevant assays. This session will highlight AI approaches that are designed for sparse data and have been successfully applied to peptide research. Certain assays may be prohibitively expensive or time-consuming, so that less accurate but more economical assays must be used; we will therefore also highlight AI approaches that can deal with uncertain or ambiguous results from peptide assays. Topics include:
- AI with uncertainty in assay measurement, such as peptide-MHC affinity, toxicity, etc.
- active learning for selection of important peptide training cases in low data experiments.
- human-in-the-loop AI for fitness evaluation.
- dataset augmentation using generative AI.
- approximate/partial peptide discovery.
- explainable AI applied to motif discovery.
Session 3: Data curation, collaboration methods, visualization methods and legal requirements in peptide AI research. This session will showcase collaboration technologies, including federated learning, which can enable safe and secure data sharing. Topics include:
- GUI-based tools and human-computer interaction design principles needed for peptide design.
- tools to visualize/inform of trade-offs in peptide functions.
- legal requirements for AI-based peptide drug development.
- peptide search space visualization methods.
- AI methods to ensure peptide safety (e.g. lowering toxicity, ensuring stability).
- how the EU AI Act will affect cross-border collaboration and data/model sharing.
- discussions on the legal status and specifics of using AI for peptide design.
- collaboration to overcome the hurdle of small dataset sizes, including federated learning across distributed proprietary datasets.
Organisers
Aidan Murphy (University College Dublin) - Organiser
Denis Shields (University College Dublin) - Organiser