Script documentation
BioCompoundML’s primary script is bcml.py.
--input: Training File Input
--datain: Saved Data Input
--test_input: Testing File Input
--train: Train the model
--test: Test the model
--model: Output the model to a file
--dataout: Output all data structures
--pred: Prediction feature
--proxy: URL of http/s proxy
--cluster: Cluster the training data
--split_value: Threshold for classification of prediction feature
--random: User defined random seed
--verbose: Verbose output
--experimental: Extract experimental/computed features from PubChem
--fingerprint: Extract CACTVS fingerprints from PubChem
--chemofeatures: Run PaDEL-Descriptors
--user: User features are provided in training and/or test files
--distance: Calculate compound vs. compound distance matrix
--impute: Impute missing data using K Nearest Neighbors Imputation
--selection: Run Boruta Feature Selection to reduce uninformative features
--cv: Run 50% hold-out Cross-Validation 100 times
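As a rough illustration of how several of these options might be combined, the sketch below assembles a bcml.py command line in Python. It rests on assumptions that are not confirmed by the option list: that --input, --pred, and --random each take a value, and that --train, --fingerprint, and --cv are plain on/off switches. The helper name build_bcml_command, the file name training.txt, and the feature name RON are hypothetical placeholders.

```python
import shlex


def build_bcml_command(input_file, pred, train=True, fingerprint=False,
                       cv=False, random_seed=None):
    """Return an argument list for bcml.py (hypothetical helper).

    Assumption: --input, --pred and --random take one value each, while
    --train, --fingerprint and --cv are boolean switches.
    """
    cmd = ["python", "bcml.py", "--input", input_file, "--pred", pred]
    if train:
        cmd.append("--train")
    if fingerprint:
        cmd.append("--fingerprint")
    if cv:
        cmd.append("--cv")
    if random_seed is not None:
        # Pass the seed as a string, as command-line arguments must be text.
        cmd.extend(["--random", str(random_seed)])
    return cmd


# Hypothetical file and feature names, for illustration only.
cmd = build_bcml_command("training.txt", "RON", fingerprint=True, random_seed=42)
print(shlex.join(cmd))
```

The resulting list could be handed to subprocess.run(cmd, check=True) from the directory containing bcml.py. Building the command as a list rather than a single string avoids quoting problems when file names contain spaces.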