Using Conda:
conda create -n spatial-ppi python=3.9.12
conda activate spatial-ppi
conda install tensorflow==2.9.1 scikit-learn==1.1.1 numpy pandas matplotlib
conda install -c conda-forge -c bioconda mmseqs2
pip install biopython==1.79 tqdm
pip install git+https://www.github.com/keras-team/keras-contrib.git
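After installation, it can help to confirm that the pinned versions were actually picked up. The following is a minimal sketch, not part of the repository: it assumes each package exposes standard distribution metadata under the names listed (this may not hold for every conda build), and `check_environment` is a hypothetical helper name.

```python
from importlib import metadata

# Pins taken from the install commands above; None means no version was pinned.
EXPECTED = {
    "tensorflow": "2.9.1",
    "scikit-learn": "1.1.1",
    "biopython": "1.79",
    "numpy": None,
    "pandas": None,
    "matplotlib": None,
    "tqdm": None,
}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, pinned in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # A package passes if it is installed and matches its pin (if any).
        ok = installed is not None and (pinned is None or installed == pinned)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "ok" if ok else "CHECK"
        print(f"{name:15s} {installed or 'not installed':15s} {status}")
```

Run it inside the activated `spatial-ppi` environment; any line marked `CHECK` points to a missing package or a version that differs from the pin.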
- Prepare a dataset in json format (see the example)
- Perform AlphaFold Multimer prediction
- Run preprocess.py to generate tensors and data list files
python preprocess.py \
--dataset [PATH to the json dataset] \
--data_dir [PATH to Alphafold Multimer prediction result folder] \
--work_dir [PATH to folder to save tensors and data files] \
--tensor_method [Methods for tensorization, choices: onehot, volume, distance, all] \
--split \
--relaxed \
--threads [Number of threads to run] \
--models_per_pair [Number of AlphaFold Multimer models generated per protein pair]

Here, --split generates split train, val and test datasets for 5-fold cross-validation, and --relaxed uses the relaxed AlphaFold Multimer predictions.
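When preprocessing is launched from a script rather than typed by hand, assembling the argument list programmatically avoids shell line-continuation mistakes. The sketch below uses only the flags listed above. It assumes `--split` and `--relaxed` are value-less switches, as they appear in the command; `build_preprocess_cmd` and the paths in the usage section are hypothetical.

```python
import shlex
import sys

# Choices listed for --tensor_method in the command above.
TENSOR_METHODS = ("onehot", "volume", "distance", "all")


def build_preprocess_cmd(dataset, data_dir, work_dir, tensor_method,
                         threads, models_per_pair, split=False, relaxed=False):
    """Build the argv list for preprocess.py (hypothetical helper)."""
    if tensor_method not in TENSOR_METHODS:
        raise ValueError(f"tensor_method must be one of {TENSOR_METHODS}")
    cmd = [
        sys.executable, "preprocess.py",
        "--dataset", str(dataset),
        "--data_dir", str(data_dir),
        "--work_dir", str(work_dir),
        "--tensor_method", tensor_method,
        "--threads", str(threads),
        "--models_per_pair", str(models_per_pair),
    ]
    # Assumption: both options are boolean switches that take no value.
    if split:
        cmd.append("--split")
    if relaxed:
        cmd.append("--relaxed")
    return cmd


if __name__ == "__main__":
    # Placeholder paths and counts; replace them with your own.
    cmd = build_preprocess_cmd(
        dataset="dataset.json",
        data_dir="af_multimer_results",
        work_dir="workdir",
        tensor_method="all",
        threads=8,
        models_per_pair=5,
        split=True,
        relaxed=True,
    )
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root, which sidesteps shell quoting entirely.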
Run train.py to train models
python train.py \
--model [Backbone model to use] \
--datapath [PATH to data tensors] \
--weights [PATH to weights to fine-tune] \
--savingPath [PATH to save trained models] \
--train_set [PATH to train set csv file generated in preprocess] \
--test_set [PATH to validation set csv file generated in preprocess]
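Because preprocessing with `--split` produces data for 5-fold cross-validation, training is typically repeated once per fold. The following sketch builds one `train.py` command per fold from the flags shown above. It assumes `--weights` may be omitted when not fine-tuning, and the helper names, the `BACKBONE` placeholder, and the CSV file names are all hypothetical; use the file names that `preprocess.py` actually writes to your work directory.

```python
import shlex
import sys
from pathlib import Path


def build_train_cmd(model, datapath, saving_path, train_set, val_set, weights=None):
    """Build the argv list for a single train.py run (hypothetical helper)."""
    cmd = [
        sys.executable, "train.py",
        "--model", model,
        "--datapath", str(datapath),
        "--savingPath", str(saving_path),
        "--train_set", str(train_set),
        # The validation csv is passed through --test_set, as in the command above.
        "--test_set", str(val_set),
    ]
    # Assumption: --weights is only needed when fine-tuning from existing weights.
    if weights is not None:
        cmd += ["--weights", str(weights)]
    return cmd


def build_fold_cmds(model, datapath, saving_root, folds, weights=None):
    """folds: iterable of (train_csv, val_csv) pairs, one pair per CV fold."""
    cmds = []
    for i, (train_csv, val_csv) in enumerate(folds):
        # Save each fold's models in its own subfolder to avoid overwriting.
        saving_path = Path(saving_root) / f"fold{i}"
        cmds.append(build_train_cmd(model, datapath, saving_path,
                                    train_csv, val_csv, weights))
    return cmds


if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    folds = [(f"workdir/train_{i}.csv", f"workdir/val_{i}.csv") for i in range(5)]
    for cmd in build_fold_cmds("BACKBONE", "workdir/tensors", "models", folds):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to inspect them before running each one, for example with `subprocess.run(cmd, check=True)`.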
Run test.py to make predictions with a trained model and evaluate them
python test.py \
--model [Backbone model to use] \
--datapath [PATH to data tensors] \
--weights [PATH to trained model weights] \
--output [PATH to save prediction results] \
--test_set [PATH to test set csv file generated in preprocess]
The data splits and weights trained from 5-fold cross-validation can be downloaded from here.
- Hu W, Ohue M. SpatialPPI: three-dimensional space protein-protein interaction prediction with AlphaFold Multimer. Computational and Structural Biotechnology Journal, 2024. doi: 10.1016/j.csbj.2024.03.009