Import Libraries: The code imports necessary libraries from RDKit, PIL, NumPy, PyTorch, and OpenVINO. These libraries are used for chemical informatics, image processing, numerical operations, deep learning, and efficient model deployment respectively.
Function to Convert SMILES to Fingerprints: smiles_to_fp: This function takes a SMILES string and converts it into a molecular fingerprint using RDKit. The fingerprint is a binary vector representation that captures the presence or absence of certain molecular substructures. This is crucial for the next steps where machine learning models predict chemical properties based on these fingerprints.
Load and Compile the Machine Learning Model: The model (specified by 'lipophilicity_openvino.xml') is loaded and compiled using OpenVINO's Core object for deployment on an NPU (Neural Processing Unit). NPUs are specialized hardware designed to accelerate neural network computations. This step boosts the efficiency of the model inference.
Define Prediction Function: predict_lipophilicity: This function receives a SMILES string, converts it to a fingerprint, wraps this fingerprint into a NumPy array, then a PyTorch tensor, and finally a tensor that OpenVINO can use for inference. The function runs the pre-loaded OpenVINO model to predict the lipophilicity (a measure of how well a compound can dissolve in fats, oils, and non-polar solvents) of the molecule.
Example SMILES Strings: A list of SMILES strings representing different molecules is provided. These strings can be used to visualize molecular structures and predict their properties using the loaded AI model.
Select a SMILES String and Predict Lipophilicity: The code selects the first SMILES string from the list, predicts its lipophilicity using the defined function, and prints the predicted value.
Visualize the Molecule: The selected molecule is visualized using RDKit's drawing module which converts the SMILES string into a visual representation of the molecule. This image is then displayed using PIL's Image module. Overall, this code is a practical application of cheminformatics combined with machine learning to predict and visualize molecular properties efficiently, leveraging the computational power of NPUs for rapid model inference.