Context-aware computer vision for chemical reaction state detection

  • Junru Ren
  • Abhijoy Mandal
  • Rama El-khawaldeh
  • Shi Xuan Leong
  • Jason Hein
  • Alán Aspuru-Guzik
  • Lazaros Nalpantidis
  • Kourosh Darvish*
  • *Corresponding author for this work
  • University of Toronto
  • University of British Columbia

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

Real-time monitoring of laboratory experiments is essential for automating complex workflows and enhancing experimental efficiency. Accurate detection and classification of chemicals in varying forms and states support a range of techniques, including liquid–liquid extraction, distillation, and crystallization. However, detecting chemical forms is challenging: some classes appear visually similar, and the classification of a form is often context-dependent. In this study, we adapt the YOLO model into a multi-modal architecture that integrates scene images and task context for object detection. With the help of Large Language Models (LLMs), the developed method reasons about the experimental process and uses the result of that reasoning as context guidance for the detection model. Experimental results show that introducing context during training and inference improves the performance of the proposed model, YOLO-text, across all classes, and that the model makes accurate predictions on visually similar areas. Compared to the baseline, our model increases overall mAP by 4.8% without context and by 7% with context. The proposed framework can classify and localize substances with and without contextual suggestions, thereby enhancing the adaptability and flexibility of the detection process.
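
To illustrate why task context can help separate visually similar classes, here is a minimal sketch. It is not the YOLO-text architecture, which integrates context inside the model during training and inference. Instead it uses a much simpler stand-in, late fusion, in which a detector's per-class scores are reweighted by a context hint. The function name `fuse_with_context`, the class names `solid` and `residue`, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional


def fuse_with_context(
    class_scores: Dict[str, float],
    context_weights: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Reweight visual class scores by context weights, then renormalize.

    Hypothetical late-fusion stand-in, NOT the method from the paper.
    Classes missing from context_weights keep a neutral weight of 1.0.
    """
    if context_weights is None:
        # No context supplied: rely on the visual scores alone.
        fused = dict(class_scores)
    else:
        fused = {
            name: score * context_weights.get(name, 1.0)
            for name, score in class_scores.items()
        }
    total = sum(fused.values())
    if total <= 0.0:
        # Nothing left to normalize: return the visual scores unchanged.
        return dict(class_scores)
    return {name: value / total for name, value in fused.items()}


# Two visually similar (hypothetical) classes with nearly tied scores.
scores = {"solid": 0.45, "residue": 0.55}

# Without context the visual evidence alone decides.
without_context = fuse_with_context(scores)

# With a hint favoring "solid" (for example, derived from the current
# step of the experiment), the ranking of the two classes flips.
with_context = fuse_with_context(scores, {"solid": 2.0})

print(max(without_context, key=without_context.get))
print(max(with_context, key=with_context.get))
```

The optional `context_weights` argument mirrors the abstract's point that detection should work both with and without contextual suggestions: when no hint is passed, the visual scores are returned unchanged apart from normalization.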

Original language: English
Journal: Digital Discovery
Number of pages: 13
ISSN: 2635-098X
DOIs
Publication status: Accepted/In press - 2026

Bibliographical note

Publisher Copyright:
This journal is © The Royal Society of Chemistry, 2026
