Authors: Qian-Yuan Tang, Weitong Ren, Jun Wang, Kunihiko Kaneko
Data and code related to the paper "The Statistical Trends of Protein Evolution: A Lesson from AlphaFold Database" (https://doi.org/10.1093/molbev/msac197).
-
The 'Code' folder includes the Jupyter Notebook (Python) used to reproduce this research.
-
Folder 'Index' includes the text files listing the file names of the predicted protein structures for the 48 organisms (16 model organisms and 32 organisms related to global health). In the other data files, the entries follow the same order as the protein names in the files from this folder.
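Because the data files share their ordering with the index files, a value can be matched to its protein by line position. The following is a minimal sketch of that pairing, assuming one entry per line in both files. The helper names, the file names, and the file contents are invented for illustration and are not taken from the dataset.

```python
import tempfile
from pathlib import Path


def load_lines(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    text = Path(path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


def pair_with_index(index_path, data_path):
    """Pair each protein file name with the value on the same line number.

    Assumes (hypothetically) one name per line in the index file and one
    numeric value per line in the data file.
    """
    names = load_lines(index_path)
    values = [float(v) for v in load_lines(data_path)]
    if len(names) != len(values):
        raise ValueError("index and data files have different lengths")
    return list(zip(names, values))


# Demo on throwaway files; names and numbers are made up.
with tempfile.TemporaryDirectory() as tmp:
    index_file = Path(tmp) / "index_example.txt"
    data_file = Path(tmp) / "data_example.txt"
    index_file.write_text("protein_A.pdb\nprotein_B.pdb\n")
    data_file.write_text("17.5\n21.0\n")
    paired = pair_with_index(index_file, data_file)
```

The length check is a cheap guard against pairing a data file with the index of a different organism. Consult the Readme in each folder for the actual file layout before adapting this.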
-
The Readme file in each folder gives a brief introduction to the data in that folder.
-
Text files "KS-Rg-N250.txt" and "KS-Rg_norm-N_full.txt" include the results of two-sample Kolmogorov-Smirnov (KS) tests for the 48 organisms in our database. In "KS-Rg-N250.txt", the results are based on the distribution of the radius of gyration R_g for proteins with similar chain lengths (N ~ 250). In "KS-Rg_norm-N_full.txt", the results are based on the distribution of the normalized R_g (i.e., R_g/N^{1/3}). Further details are listed in the SI Method of the paper.
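As a rough illustration of the quantities involved, the sketch below normalizes R_g by N^{1/3} and computes the two-sample KS statistic, i.e. the largest gap between two empirical cumulative distributions. It is a pure-Python sketch on made-up numbers, not the authors' notebook code. The function names and sample values are hypothetical. In practice, scipy.stats.ks_2samp returns the same statistic together with a p-value.

```python
from bisect import bisect_right


def normalized_rg(rg, n):
    """Return R_g / N^(1/3) for a chain of n residues."""
    return rg / n ** (1.0 / 3.0)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The maximum gap is always attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up (R_g, N) pairs for two hypothetical organisms.
organism_a = [(18.0, 216), (20.0, 343), (16.0, 125), (22.0, 512)]
organism_b = [(24.0, 216), (28.0, 343), (20.0, 125), (32.0, 512)]

norm_a = [normalized_rg(rg, n) for rg, n in organism_a]
norm_b = [normalized_rg(rg, n) for rg, n in organism_b]
d_stat = ks_statistic(norm_a, norm_b)
```

A statistic near 0 means the two distributions are nearly indistinguishable, while a value near 1 means they barely overlap. The synthetic samples here were chosen to be fully separated.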
-
Should you have any other technical questions, please feel free to contact Qian-Yuan Tang (tangqianyuan_at_gmail.com).
-
Updated on: Sep 14, 2022