For my final-year MEng research project I worked within the University's Skin Research Group (ChemEng Surrey) to compare the performance of different co-solvency solubility models. I collected pharmaceutical cosolvency datasets, implemented the models in MATLAB (the code you are presented with), and assessed the performance of each model on each dataset. I wrote up my findings in a research paper as my final piece of assessment, which you can read here.
Included models (p: predictive, e: empirical):
- Log-Linear (e)
- Predictive Log-Linear (p)
- Jouyban-Acree (e)
- General Single Model / GSM (e)
- NRTL (e)
- UNIQUAC (e)
Log-Linear:
The Log-Linear model of Yalkowsky is a simple-to-use model that represents ideal cosolvent systems (i.e. systems with no solubility peak at intermediate solvent compositions). Needs: solubility data in the neat solvent and in the neat co-solvent.
This model is denoted in the code by LL_IMM
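As a rough illustration of the idea (this is not the repository's LL_IMM code), the log-linear relation interpolates linearly in log-solubility between the two neat-solvent end points. The variable names and all numbers below are made up for the example, and the fraction is assumed to be the co-solvent fraction of the solute-free solvent mixture.

```matlab
% Log-linear sketch: log10(Sm) = fc*log10(Sc) + (1 - fc)*log10(Sw)
% All values are hypothetical illustration values, not measured data.
Sw = 0.05;            % solubility in the neat solvent (e.g. water)
Sc = 20;              % solubility in the neat co-solvent
fc = (0:0.25:1)';     % co-solvent fraction in the solute-free mixture

% Anonymous function implementing the log-linear relation
logLinear = @(fc, Sw, Sc) 10.^(fc .* log10(Sc) + (1 - fc) .* log10(Sw));

Sm = logLinear(fc, Sw, Sc);
disp([fc, Sm]);       % columns: co-solvent fraction, estimated solubility
```

Because the relation is a straight line in log space, it cannot reproduce a solubility maximum at an intermediate composition.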
Predictive Log-Linear:
An extension of the log-linear model that replaces the co-solvent solubility with the co-solvent's solubilization power, calculated from literature constants and the logKow value of the solute. Values of the constants for common co-solvents can be found in A. Jouyban's 'Handbook of Solubility Data for Pharmaceuticals', Table 1.11 on page 33 of the 2010 edition. (This book is also a good source of solubility data to try the models out with.) This model is often very inaccurate (see my results) and I would not recommend using it beyond testing / initial studies.
This model is denoted in the code by LL_SIG
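A minimal sketch of how the predictive form is usually written, assuming the solubilization power is a linear function of logKow (sigma = s*logKow + t). This is not the repository's LL_SIG code, and the constants `s` and `t`, the logKow value and the solubility below are placeholders; take real constants from the handbook table mentioned above.

```matlab
% Predictive log-linear sketch: log10(Sm) = log10(Sw) + sigma*fc
% with sigma = s*logKow + t. All numbers are placeholders.
s      = 0.93;        % placeholder co-solvent constant (slope)
t      = 0.40;        % placeholder co-solvent constant (intercept)
logKow = 2.5;         % hypothetical solute logKow
Sw     = 0.05;        % hypothetical solubility in the neat solvent

sigma = s * logKow + t;          % solubilization power of the co-solvent
fc    = (0:0.25:1)';             % co-solvent fraction
Sm    = Sw .* 10.^(sigma .* fc); % estimated solubility in each mixture

disp([fc, Sm]);
```

Only the solubility in the neat solvent is needed as experimental input, which is what makes the model predictive and also why its accuracy can be poor.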
Jouyban-Acree Model:
A straightforward and highly effective correlative model that is more than adequate for modelling both ideal and non-ideal solubility systems. To use this model, the solubility data must include the solubility in each neat solvent ("end points") and a series of data points in between. (See SolubilityData - Propanol example.xlxs for an example of what this looks like.) Personally, unless you specifically need to estimate the activity coefficients of your components, I would recommend using this model.
This model is denoted in the code by JA_REG
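The sketch below shows one way the Jouyban-Acree constants can be obtained by linear least squares, assuming the common three-constant form log10(xm) = f1*log10(x1) + f2*log10(x2) + (f1*f2/T)*sum(J_i*(f1 - f2)^i), i = 0..2. It is not the repository's JA_REG code; the data are synthetic, generated from made-up constants so that the fit can be checked.

```matlab
% Jouyban-Acree sketch with synthetic data (all numbers hypothetical)
T  = 298.15;                    % temperature in kelvin
f1 = (0:0.1:1)';                % fraction of solvent 1 (solute-free basis)
f2 = 1 - f1;                    % fraction of solvent 2
x1 = 2e-2;                      % hypothetical solubility in neat solvent 1
x2 = 1e-4;                      % hypothetical solubility in neat solvent 2
Jtrue = [600; -300; 150];       % made-up constants used to generate data

% Each column of basis multiplies one J constant
basis = (f1 .* f2 ./ T) .* [ones(size(f1)), (f1 - f2), (f1 - f2).^2];
ideal = f1 .* log10(x1) + f2 .* log10(x2);   % log-linear part
logxm = ideal + basis * Jtrue;               % synthetic "measurements"

% Correlation step: solve the linear least-squares problem for J
Jfit = basis \ (logxm - ideal);

disp(Jfit');                    % recovered constants
```

The end points contribute rows of zeros to `basis`, so only the intermediate compositions inform the constants; this is why a data series between the end points is required.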
General Single Model:
A simplification of the Jouyban-Acree model which requires only the solubility data in one neat co-solvent and a data series to correlate from. Yields very similar results to the Jouyban-Acree model.
This model is denoted in the code by GSM
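As a hedged sketch, assuming the power-series form of the GSM, log10(xm) = K0 + K1*f1 + K2*f1^2 + K3*f1^3, the constants can be found with an ordinary polynomial fit. How the repository's GSM function is actually parameterised may differ; the data below are synthetic and every number is invented for illustration.

```matlab
% GSM sketch: cubic power series in the solvent fraction (assumed form)
f1    = (0:0.1:1)';                 % fraction of solvent 1
Ktrue = [-4.0, 3.5, 1.2, -0.9];     % made-up K0..K3 to generate data
logxm = Ktrue(1) + Ktrue(2) .* f1 + Ktrue(3) .* f1.^2 + Ktrue(4) .* f1.^3;

% polyfit returns coefficients in descending powers: [K3 K2 K1 K0]
Kdesc  = polyfit(f1, logxm, 3);
logfit = polyval(Kdesc, f1);        % back-calculated log-solubility

disp(fliplr(Kdesc));                % K0..K3 in ascending order
```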
Further reading: All four of the above models are discussed in more detail in the paper Review of the Cosolvency Models for Predicting Drug Solubility in Solvent Mixtures: An Update by A. Jouyban. I would recommend reading this review if you are interested in cosolvency modelling in general, want the algebraic form of the models, or wish to know the background of the above models.
NRTL Model:
The NRTL model is a popular activity coefficient model; in this project I have employed it in an empirical rather than a predictive fashion. There is currently a gap in the publicly available research when it comes to published generic NRTL BIPs (binary interaction parameters) for pharmaceutical compounds in co-solvent systems. With this model, you enter a series of solubility data points and an optimisation routine determines the BIPs for the system. When using the model, be sure to set the non-randomness factor alpha to a value between 0.1 and 0.5 for the best results.
This model is denoted in the code by NRTL
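For orientation, here is a minimal sketch of the multicomponent NRTL activity coefficient calculation that sits inside such an optimisation. It is not the repository's NRTL code: the function names are mine, a single alpha is assumed for all pairs, and the tau values (the dimensionless BIPs) are made up. Here `tau(i,j)` holds tau_ij, and implicit array expansion is used (MATLAB R2016b or later).

```matlab
% NRTL sketch for a ternary system (solute, solvent, co-solvent)
alpha = 0.3;                        % non-randomness factor, within 0.1 to 0.5
tau   = [0    1.2  0.8;             % hypothetical tau_ij, zero on the diagonal
         0.5  0    0.3;
         1.5  0.4  0  ];
x     = [0.02; 0.58; 0.40];         % mole fractions, must sum to 1

% S(j) = sum_k G(k,j)*x(k),  C(j) = sum_m x(m)*tau(m,j)*G(m,j)
nrtlTerms   = @(x, tau, G, S, C) C ./ S + (G .* (tau - (C ./ S)')) * (x ./ S);
nrtlLnGamma = @(x, tau, alpha) nrtlTerms(x, tau, exp(-alpha .* tau), ...
                  exp(-alpha .* tau)' * x, (tau .* exp(-alpha .* tau))' * x);

lnGamma = nrtlLnGamma(x, tau, alpha);
disp(exp(lnGamma)');                % activity coefficients of the 3 components
```

In a correlation, the tau values would be adjusted by an optimiser until the calculated solubilities match the measured data series.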
UNIQUAC Model:
The UNIQUAC model is another commonly used activity coefficient model that likewise lacks published BIP data for pharmaceutical and co-solvent systems. UNIQUAC works in this project in the same way as NRTL, except that it also requires group contribution data for each component. I used the Dortmund Data Bank online UNIFAC group assignment tool to obtain these. Unfortunately, because this is the regular UNIQUAC model, electrolytes and some drugs with complex structures are not supported.
This model is denoted in the code by UNIQUAC
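To show what the group contribution data is used for, the sketch below builds the UNIQUAC volume (r) and surface area (q) parameters of a molecule by summing group contributions, r = sum(nu_k*R_k) and q = sum(nu_k*Q_k). The R and Q values are the ones I know from the original UNIFAC group tables; treat them as an assumption and confirm them against the table that matches your group assignment, since modified UNIFAC variants use different values. This is not code from the repository.

```matlab
% UNIQUAC r and q from UNIFAC group counts (values assumed from the
% original UNIFAC tables; verify before use)
groups = {'CH3', 'CH2', 'OH', 'H2O'};
R = [0.9011, 0.6744, 1.0000, 0.9200];   % group volume parameters
Q = [0.848,  0.540,  1.200,  1.400];    % group surface area parameters

% Group counts per molecule (rows: ethanol, water; columns follow groups)
nu = [1 1 1 0;      % ethanol = CH3 + CH2 + OH
      0 0 0 1];     % water   = H2O

r = nu * R';        % molecular volume parameters
q = nu * Q';        % molecular surface area parameters

disp([r, q]);       % one row per molecule: [r, q]
```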
Further reading: Significant advances have been made in developing the NRTL & UNIQUAC models to cover their shortcomings. These include eNRTL, NRTL-SAC, eNRTL-SAC, Modified UNIQUAC, and eUNIQUAC, to name just a few. If the models in this project do not fit your needs, I would recommend searching for these models.
- data_files : stores solubility data
- program_files : scripts found here
- ModelScript.m
- SolModelTool.mlapp
- corr_funs : correlation functions for models
- pred_funs : predicting functions for models
- other_funs : functions that do other jobs
To perform a correlation on a dataset, the solubility data must be formatted in the same way as the data in SolubilityData - Propanol example.xlxs.
Note: For the NRTL & UNIQUAC models, the solubility data must be prepared as a ternary mole fraction system, with both the solvent composition and the solubility expressed as mole fractions, for truly representative modelling.
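A small sketch of one way to do that conversion, assuming the raw data gives the co-solvent mass fraction of the solute-free mixture and the solubility as grams of solute per 1000 g of solvent mixture. The solute molar mass and the data values are hypothetical; if your data is in other units (for example volume fractions or g/L) you will also need densities.

```matlab
% Convert one data point to ternary mole fractions (hypothetical values)
M  = [18.015, 46.07, 150];   % g/mol: water, ethanol, hypothetical solute
wE = 0.40;                   % ethanol mass fraction, solute-free basis
S  = 30;                     % g solute per 1000 g solvent mixture

masses = [1000 * (1 - wE), 1000 * wE, S];   % g of water, ethanol, solute
n      = masses ./ M;                       % moles of each component
x      = n ./ sum(n);                       % ternary mole fractions

disp(x);                     % [x_water, x_ethanol, x_solute]
```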
To interpolate existing solubility data with the available models, the UI-based SolModelTool.mlapp is the easiest option.
To incorporate the models within your own scripts, I would recommend first trying the models with ModelScript.m to see how they are called, and then writing your own code after adding the corr_funs, pred_funs, and other_funs folders to your directory.
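Instead of copying the folders, you can also put them on the MATLAB search path. The snippet below is only a sketch: it assumes you run it from the repository root and that the three function folders sit inside program_files, so adjust `repoRoot` and the folder layout to match your copy.

```matlab
% Add the model function folders to the MATLAB path (layout is assumed)
repoRoot = pwd;                                   % assumption: repository root
progDir  = fullfile(repoRoot, 'program_files');   % assumed parent folder
folders  = {'corr_funs', 'pred_funs', 'other_funs'};

for k = 1:numel(folders)
    d = fullfile(progDir, folders{k});
    if exist(d, 'dir') == 7
        addpath(d);                               % folder found: add to path
    else
        warning('Folder not found: %s', d);       % layout differs from assumption
    end
end
```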