disc04/simplydrug

Welcome to simplydrug

To advance scientific communication and integrative drug discovery, we developed a set of open-source analysis workflows. These workflows describe the early stages of biological assay development and high-throughput screening, and provide a hands-on introduction to drug discovery for anyone with basic knowledge of biology, Python programming, or data science.

List of notebooks

The notebooks are built in a sequence and gradually introduce concepts of experimental design, QC, and data analysis of different biological assays.

  • 01a_enzyme_kinetics
    Topics: enzyme kinetics, enzyme assays, fluorometry, assay variability and confidence intervals, Z-factor, Z-score based normalization, plate heatmap, hit extraction, molecule visualization, importing molecule bioactivity data.

  • 01b_enzyme_kinetics_in_chain
    Topics: Running an enzymatic assay for a number of plates, generating a screen hit matrix, plotting all the plates in the screen.

  • 02a_ion_channel_development
    Topics: Introduction to ion channels and assay development, ion flux assay normalization, ion channel kinetics time-series.

  • 02b_ion_channel_cherry_picking
    Topics: Calcium influx assay, cherry picking, percent of activation or inhibition.

  • 02c_ion_channel_dose_response
    Topics: Introduction to dose-response, Hill equation.

  • 03a_yeast_growth_screen
    Topics: Running a yeast growth assay, growth curve, growth score, filtering out aberrant curves.

  • 03b_yeast_growth_in_chain
    Topics: Running a yeast growth assay for a number of plates, filtering, generating a screen hit matrix, plotting all the plates in the screen.

  • 03c_yeast_cherry_picking
    Topics: Running a yeast growth assay with different doses of the compounds, automatic generation of a ppt report.

  • 04a_imaging_screen
    Topics: High-content screening and image analysis, reporter system, cell viability, detection and correction of systematic errors.

  • 04b_imaging_assay_development
    Topics: Exploratory data analysis, PCA, batch effect.

  • 04c_imaging_dose_response
    Topics: Activity versus viability, fitting dose-response for imaging data.

  • 05_xtt_assay
    Topics: Dose-response assay for compound toxicity.

Install

There are several options:

  1. Run the notebooks from Binder

  2. Run > conda install -c conda-forge rdkit, and then run > pip install simplydrug

  3. Clone this repository: git clone https://github.com/disc04/simplydrug

Dependencies

The codebase relies on the following dependencies (tested version provided in parentheses):

  • python (3.6.1)
  • pubchempy (1.0.4)
  • scipy (1.4.1)
  • seaborn (0.10.0)
  • python-pptx (0.6.18)
  • wget (3.2)
  • xlrd (1.2.0)
  • rdkit (2019.09.3)

Example usage

In each experiment, we first merge the numerical data coming from the equipment with the plate layout (descriptors). We describe the experimental design in a layout Excel file; the names of the Excel sheets become the names of the columns in the final data table. Each sheet contains a table with the dimensions of the experiment plate (usually a 96- or 384-well plate) and represents one aspect of the layout: well ID, treatment, cell density, compound ID, compound concentration, etc.
The layout file must contain sheets named 'Well' and 'Status'. The 'Well' table lists the well IDs, and the 'Status' table marks each well as 'Sample', 'Positive' (control), 'Negative' (control), or 'Reference'. 'Reference' wells are excluded from calculations. The function add_layout merges measurements and layout on the 'Well' column.
import pandas as pd
import simplydrug as sd
from IPython.display import display

# Read the first sheet of the measurement file and keep the well ID and four time points.
data = pd.read_excel('hts_notebooks/test_data/enzyme_kinetics_data1.xlsx', sheet_name = 0)[['Well', '0s', '120s', '240s', '360s']]
layout_path = 'hts_notebooks/test_data/enzyme_kinetics_layout.xlsx'
chem_path = 'hts_notebooks/test_data/compounds/example_chemicals.csv'
chem_plate = 'ex_plate1'

# Merge the measurements with the plate layout and compound annotations by 'Well'.
results = sd.add_layout(data, layout_path, chem_path = chem_path, chem_plate = chem_plate)
display(results.head())
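
The merge described above can be illustrated with plain pandas. This is a minimal sketch and not the implementation of add_layout: the 2 x 3 mini-plate, the in-memory sheets (standing in for the Excel file), the readings, and the helper name flatten_layout are all hypothetical, chosen only to show how sheet names become columns and how 'Reference' wells are set aside.

```python
import pandas as pd

# Hypothetical 2 x 3 mini-plate; real plates usually have 96 or 384 wells.
# Each entry stands in for one Excel sheet shaped like the plate.
layout_sheets = {
    "Well": pd.DataFrame([["A1", "A2", "A3"], ["B1", "B2", "B3"]]),
    "Status": pd.DataFrame([["Negative", "Sample", "Positive"],
                            ["Reference", "Sample", "Sample"]]),
}

def flatten_layout(sheets):
    """Flatten plate-shaped sheets into one table with a column per sheet name."""
    return pd.DataFrame({name: table.to_numpy().ravel() for name, table in sheets.items()})

# Made-up instrument readings, one row per well.
measurements = pd.DataFrame({
    "Well": ["A1", "A2", "A3", "B1", "B2", "B3"],
    "120s": [10.0, 11.0, 50.0, 99.0, 12.0, 48.0],
})

# Merge measurements and layout on the shared 'Well' column.
merged = measurements.merge(flatten_layout(layout_sheets), on="Well", how="left")

# 'Reference' wells are kept in the table but left out of calculations.
usable = merged[merged["Status"] != "Reference"]
print(usable)
```

Keeping each layout aspect in its own plate-shaped sheet means the layout can be edited visually in a spreadsheet, while the analysis works on a single long table.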

To check our 384-well plate for systematic errors, we can use a plate heatmap:

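# Draw the 120 s reading of every well in plate format.
sd.heatmap_plate(df = results, layout_path = layout_path, features = ['120s'], path = None, save_as = None)

# Display a saved heatmap image (file name as given in the original example).
from IPython.display import Image
Image(filename = 'heatmap.png', width = 400)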

In this plate, most of the readings across the plate are close to the plate average, and four wells with high readings probably represent our hit compounds.
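
The idea of wells that stand out from the plate average can be made quantitative with the Z-score and Z-factor listed among the notebook topics. The sketch below uses the common textbook definitions, not simplydrug's own functions: the names z_scores and z_factor, the synthetic readings, the control values, and the hit threshold of 3 are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def z_scores(values):
    """Standardize readings against the plate mean and sample standard deviation."""
    values = pd.Series(values, dtype=float)
    return (values - values.mean()) / values.std(ddof=1)

def z_factor(positive, negative):
    """Z-factor = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)
    spread = 3 * (positive.std(ddof=1) + negative.std(ddof=1))
    return 1 - spread / abs(positive.mean() - negative.mean())

# Synthetic 384-well plate: 380 wells close to 100 and four wells at 500.
readings = pd.Series([98, 99, 100, 101, 102] * 76 + [500] * 4, dtype=float)
scores = z_scores(readings)

# Assumed hit rule: Z-score above 3.
hits = readings[scores > 3]

# Made-up control readings used to judge assay quality.
quality = z_factor(positive=[500, 510, 490], negative=[100, 105, 95])
print(len(hits), round(quality, 4))
```

A Z-factor close to 1 indicates well-separated controls. Note that the mean and standard deviation are themselves pulled by strong hits, so on plates with many active wells a robust variant (median and MAD) may be preferable.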

Please refer to the documentation page for more information.

Copyright

Copyright 2020 onwards, Blavatnik Center for Drug Discovery. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this project's files except in compliance with the License. A copy of the License is provided in the LICENSE file in this repository.