Dear All,
I would like to circulate some current cheminformatics (and related) news to everyone, as follows. My apologies for the long gap since the last issue; I freely admit to wasting my summer largely on non-cheminformatics topics for the first time in quite a while.
But now I am very happy to report that the newsletter is back, of course stronger than ever – and as usual, if you have information for distribution, please just let me know, and I will be happy to include it on the next occasion!
So here we go…
Events
20 September 2023
Cambridge Cheminformatics Meeting
Cambridge, UK and on Zoom (Hybrid)
More information: http://www.c-inf.net
Direct Zoom registration: https://zoom.us/meeting/register/tJIqf-qhqjktHtSPZ0jtztLwDWnbp3CxmqUn
Programme
Benchmarking Structure-Based 3D Molecular Generative Models
Benoit Baillif, University of Cambridge and CCDC
https://www.ch.cam.ac.uk/person/bb596
Industrial Applications of Retrosynthesis Technologies – Shared Intermediates and Impurity Prediction
Hongbin Yang, Chemical.AI
https://chemical.ai/
Current Methods for Drug Property Prediction in the Real World
Ryan Greenhalgh, Deepmirror.ai
https://www.deepmirror.ai/
26 September 2023
3rd Munich-Leiden Virtual ChemBio Talks
Virtual Event
https://events.bizzabo.com/485752/home
3/4 October 2023
PhysChem Forum
Gothenburg, Sweden
http://physchem.org.uk/pcf2023/pcf2023.html
18 October 2023
TechBio UK: Data-driven discovery
London, UK
https://www.techbio.uk
27 October 2023
Broad Institute Machine Learning in Drug Discovery Symposium
Cambridge, MA and Virtual (Hybrid Mode)
https://www.broadinstitute.org/machine-learning-drug-discovery-symposium/machine-learning-drug-discovery-symposium
8 December 2023
Advancing Molecular Machine Learning – Overcoming Limitations
ELLIS Workshop, unofficial NeurIPS2023 side event (virtual)
https://moleculediscovery.github.io/workshop2023
Jobs
Director, Structure-based Drug Design
Exscientia
Cambridge, UK
https://www.linkedin.com/jobs/view/3710948682
Senior Computational Biologist
Turbine
Budapest, Hungary
https://turbineai.bamboohr.com/careers/47
Senior Scientist, NLP and Knowledge Discovery
Bristol Myers Squibb
Seville, Spain
https://www.linkedin.com/jobs/view/3648280978
Machine Learning Research Scientist – Explainable AI in Oncology and Drug Discovery
Bayer
Berlin, Germany
https://www.linkedin.com/jobs/view/3618581357
Senior Cheminformatics Scientist, Senior ML Researcher
CoSyne Therapeutics
London, UK
https://www.linkedin.com/jobs/view/3668861288
https://www.linkedin.com/jobs/view/3701399533
Computational Drug Discovery Research Scientist
Chemify
Scotland
https://www.linkedin.com/jobs/view/3708704108
Cheminformatician
FogPharma
Cambridge, MA
https://www.linkedin.com/jobs/view/3699628704
Junior professorship (W1) for Machine Learning in Computational Biology/Bioinformatics
University of Hamburg
Hamburg, Germany
https://www.nature.com/naturecareers/job/12805763/junior-professorship-w1-for-machine-learning-in-computational-biology-bioinformatics
Head of Biomedical Data Science
Bayer
Wuppertal, Germany
https://jobs.bayer.com/job/Wuppertal-Elberfeld-Head-of-Biomedical-Data-Science-%28mfd%29-Nort/880549001
Postdoctoral Researcher in Biomedical Artificial Intelligence
University of Zurich
Zurich, Switzerland
https://jobs.uzh.ch/offene-stellen/post-doctoral-researcher-in-biomedical-artificial-intelligence/5b99fde5-eb4c-4b96-b254-cce76b39cffe
Materials Informatics Scientist
Dunia
Berlin, Germany
https://www.linkedin.com/jobs/view/3702039508
Cheminformatics…
Chemoinformatics and Machine Learning for Drug Discovery
https://github.com/Aouidate/Chemoinformatics-tutos/tree/master
A series of introductory tutorials
Open code repositories of pharma and biotech companies heavily using AI/ML
https://github.com/chupvl/awesome-ls-ventures/blob/main/awesome-pharma-biotech-aiml.md
Compiled by Vladimir Chupakhin
Applied Mathematics and Informatics in Drug Discovery
http://amidd.ch
Course by University of Basel, all material online
pqsar2cpd – de novo generation of hit-like molecules from pQSAR pIC50 with AI-based generative chemistry
https://github.com/Novartis/pqsar2cpd
Code available on GitHub
PREFER: A New Predictive Modeling Framework for Molecular Discovery
https://pubs.acs.org/doi/10.1021/acs.jcim.3c00523
Code available on GitHub
Current Opinion in Structural Biology – Special Issue on “AI Methodologies in Structural Biology (2023)”
https://www.sciencedirect.com/journal/current-opinion-in-structural-biology/special-issue/1081K74ZW4G
Various articles of possible interest, freely accessible for 6 months
PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences
https://arxiv.org/abs/2308.05777
Always check what gets generated (I)
Benchmarking Generated Poses: How Rational is Structure-based Drug Design with Generative Models?
https://arxiv.org/abs/2308.07413
Always check what gets generated (II)
Ringtail
https://github.com/forlilab/Ringtail
Package for creating SQLite database from virtual screening results, performing filtering, and exporting results
Introduction to artificial intelligence and deep learning using interactive electronic programming notebooks
https://onlinelibrary.wiley.com/doi/10.1002/ardp.202200628
https://github.com/kochgroup/intro_pharma_ai
How accurately can one predict drug binding modes using AlphaFold models?
https://www.biorxiv.org/content/10.1101/2023.05.18.541346v2
AlphaFold predictions are valuable hypotheses, and accelerate but do not replace experimental structure determination
https://www.biorxiv.org/content/10.1101/2022.11.21.517405v2
COATI: multi-modal contrastive pre-training for representing and traversing chemical space
https://chemrxiv.org/engage/chemrxiv/article-details/64e8137fdd1a73847f73f7aa
https://github.com/terraytherapeutics/COATI
by Terray Therapeutics
Berlin Digital Science for Drug Discovery Meeting, 24 May 2023
Recording available at https://youtu.be/WiWTrtOdMd8 including:
Protein-Ligand Binding Kinetics in Drug Design: Prediction of Kinetic Rates for Kinases
Ariane Nunes Alves, TU Berlin
Reagent Prediction With a Transformer and Its Benefits for Reaction Product Prediction
Mikhail Andronov, SUPSI/Pfizer
Cambridge Cheminformatics Meeting, 7 June 2023
Recording available at https://youtu.be/H-NcX6xrpZY including:
Structure-based Drug Design with Equivariant Diffusion Models
Charlie Harris, University of Cambridge
DECIMER: Deep Learning for Scraping, Curating and Registering Compounds From the Primary Literature
Kohulan Rajan, Jena University
Distributed HPC Workflows with Covalent
Will Cunningham, Agnostiq
Explaining Blood–Brain Barrier Permeability of Small Molecules by Integrated Analysis of Different Transport Mechanisms
https://pubs.acs.org/doi/pdf/10.1021/acs.jmedchem.2c01824
Data and models available at https://github.com/bartwesterman/Cornelissen-et-al
RSC CICAG – Summer 2023 Newsletter
http://www.rsccicag.org/index_htm_files/CICAG%20Newsletter%20Summer%202023%20FINAL.pdf
Molecular Assays Simulator to Unravel Predictors Hacking in Goal-Directed Molecular Generations
https://pubs.acs.org/doi/10.1021/acs.jcim.3c00195
And yes – it’s not only about ‘pumping up the numbers’
Open-Source Machine Learning in Computational Chemistry
https://pubs.acs.org/doi/10.1021/acs.jcim.3c00643
Survey of 179 open-source software projects
… beyond cheminformatics …
Observing many researchers using the same data and hypothesis reveals a hidden universe of uncertainty
https://www.pnas.org/doi/10.1073/pnas.2203150119
The same data, the same hypothesis… gives you vastly different results
The Right Data for Good Results: Introducing the 5 ‘V’s of Drug Discovery Data
https://medium.com/@leowossnig/the-right-data-for-good-results-introducing-the-5-vs-of-drug-discovery-data-331e29c683c5
Successful pharmaceutical discovery: Paul Janssen’s concept of drug research
https://onlinelibrary.wiley.com/doi/epdf/10.1111/j.1467-9310.2007.00481.x
How to discover 79 drugs in 40 years… away from ‘process’ thinking
On Decision Making Frameworks
https://idealistwarriorlabs.com/on-decision-making-frameworks
Example from Recursion
Predictive validity in drug discovery: what it is, why it matters and how to improve it
https://www.nature.com/articles/s41573-022-00552-x
Is it about more shots at the goal? Or is it, maybe, about better shots at the goal?
MLOps-Basics
https://github.com/graviraja/MLOps-Basics
From PyTorch and Hydra to GitHub, AWS and Docker (and beyond)
Unlocking the Potential of AI in Drug Discovery
https://www.bcg.com/publications/2023/unlocking-the-potential-of-ai-in-drug-discovery
A joint Wellcome/BCG Report on the above topic
SOTA Seeking – A Knife Fight in a Phone Booth
https://biotechbio.substack.com/p/sota-seeking-a-knife-fight-in-a-phone
Is it about SOTA in ML? What really matters?
On the limitations of large language models in clinical diagnosis
https://www.medrxiv.org/content/10.1101/2023.07.13.23292613v1
GPT-4 will replace your doctor! Well, actually: It really depends on the completeness of input narratives
The Drug Discovery Game
https://drug-design-game.onrender.com/
Design a potent inhibitor of MMP12 in 30 weeks and with £100k
Engineering Biology: ML + Medicine—A Hammer in Search of Nails
https://www.digitalisventures.com/blog/engineering-biology-ml-medicine-a-hammer-in-search-of-nails
by Jacob Oppenheim
Pharma R&D Execs Offer Extravagant Expectations for AI But Few Proof Points
https://timmermanreport.com/2023/06/pharma-rd-execs-offer-extravagant-expectations-for-ai-but-few-proof-points/
by David Shaywitz
The Curse of Recursion: Training on Generated Data Makes Models Forget
https://arxiv.org/abs/2305.17493
Why Are the Majority of Active Compounds in the CNS Domain Natural Products? A Critical Analysis
https://pubmed.ncbi.nlm.nih.gov/29989814/
“20 natural products provided more than 400 clinically approved CNS drugs” – so when is novelty in chemical space actually needed? And which type, precisely?
… and clearly beyond cheminformatics
The gaming of citation and authorship in academic journals: a warning from medicine
https://journals.sagepub.com/doi/10.1177/05390184221142218
Pretty stark
On Good and Evil, the Mistaken Idea That Technology is Ever Neutral, and the Importance of the Double-charge Thesis
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4551487
“[…]the design of any technology is a moral act, no technology is ever neutral[…]”
Elon Musk’s Shadow Rule
https://www.newyorker.com/magazine/2023/08/28/elon-musks-shadow-rule
Are our politicians really in charge?
Safe and just Earth system boundaries
https://www.nature.com/articles/s41586-023-06083-8
Boundaries of one type…
Boundaries are suddenly everywhere. What does the squishy term actually mean?
https://www.theguardian.com/lifeandstyle/2023/jul/14/what-are-relationship-boundaries-jonah-hill
… and of another
Faster sorting algorithms discovered using deep reinforcement learning
https://www.nature.com/articles/s41586-023-06004-9
AlphaDev… another Nature paper by DeepMind!
And some assorted comments:
https://news.ycombinator.com/item?id=36231147
“Steve Ballmer promoting Windows 1.0”
https://www.youtube.com/watch?v=DgJS2tQPGKQ
Cypress Hill: Tiny Desk Concert
https://www.youtube.com/watch?v=tUApO77uUUk
(also check out the other Tiny Desk Concerts, they are all excellent)
I believe this is all from my side for now – if you have any information for me to circulate, or wish to present at one of our next Cambridge Cheminformatics or Digital Science for Drug Discovery Meetings, please just let me know, cheers!
Best wishes,
Andreas