Database module¶
Submodules¶
Database.initialize_database¶
Database.query¶
-
class
Database.query.Connector(database)¶ Bases: object
Connects to a generated database
-
connect_to_database()¶
-
custom_query(query)¶ Takes a custom query and returns the results
-
get_all_compound_keggIDs()¶ Retrieves all compound Kegg IDs
-
get_all_compounds()¶ Retrieves all compounds in the database
-
get_all_cpd_chemicalformulas()¶ Retrieves all chemical formulas
-
get_all_cpd_with_chemicalformula(cf)¶ Retrieves all compounds with a given chemical formula
-
get_all_cpd_with_search(search)¶ Retrieves compound name for given search term (name/formula)
-
get_all_fba_models()¶ Retrieves all model IDs in the database
-
get_all_keggIDs()¶ Retrieves all KEGG IDs in the database
-
get_all_models()¶ Retrieves all model IDs in the database
-
get_all_reactions()¶ Retrieves all reactions in the database
-
get_catalysts(reaction_ID)¶ Retrieves the catalyst of reaction
-
get_compartment(compartment)¶ Retrieves the compartment ID
-
get_compound_ID(compound_name, strict=False)¶ Retrieves compound ID given a compound name
-
get_compound_compartment(compound_ID)¶ Retrieves the compartment that the compound is in
-
get_compound_name(compound_ID)¶ Retrieves compound name given a compound ID
-
get_compounds_in_model(organism_ID)¶ Retrieves all compounds in a metabolic model given model ID
-
get_cpd_casnumber(ID)¶ Retrieves CAS number for compound ID
-
get_cpd_chemicalformula(ID)¶ Retrieves chemical formula for compound ID
-
get_genes(reaction_ID, organism_ID)¶ Retrieves gene associations for a reaction of a given metabolic network (model ID)
-
get_kegg_cpd_ID(ID)¶ Retrieves kegg ID for a compound based on main ID
-
get_kegg_reaction_ID(ID)¶ Retrieves kegg ID for a reaction based on main ID
-
get_model_ID(file_name)¶ Retrieves model ID for given file_name
-
get_models_from_cluster(cluster)¶ Retrieves model IDs from a specified cluster in the database
-
get_organism_ID(organism_name)¶ Retrieves ID of metabolic model given a specific model name
-
get_organism_name(organism_ID)¶ Retrieves name of metabolic model given a specific model ID
-
get_pressure(reaction_ID)¶ Retrieves the pressure at which the reaction is performed
-
get_products(reaction_ID)¶ Retrieves products (compound IDs) of a given reaction
-
get_products_reactions(compound_ID)¶ Retrieves reactions that have a given compound (ID) as a product
-
get_proteins(reaction_ID, organism_ID)¶ Retrieves protein associations for a reaction of a given metabolic network (model ID)
-
get_reactants(reaction_ID)¶ Retrieves reactants (compound IDs) of a given reaction
-
get_reactants_reactions(compound_ID)¶ Retrieves reactions that have a given compound (ID) as a reactant
-
get_reaction_name(reaction_ID)¶ Retrieves name of the reaction given the reaction ID
-
get_reaction_species(reaction_ID)¶ Retrieves compound IDs that are in a given reaction
-
get_reaction_type(rxn)¶ Retrieves the type of a reaction
-
get_reactions(compound_ID, is_prod)¶ Retrieves reaction IDs that have a given compound ID as a reactant or product
-
get_reactions_based_on_type(rxntype)¶ Retrieves reactions based on type
-
get_reactions_in_model(organism_ID)¶ Retrieves all reactions in a metabolic model given model ID
-
get_reference(reaction_ID)¶ Retrieves the reference of reaction
-
get_solvents(reaction_ID)¶ Retrieves solvent of reaction
-
get_stoichiometry(reaction_ID, compound_ID, is_prod)¶ Retrieves stoichiometry of a compound for a given reaction
-
get_temperature(reaction_ID)¶ Retrieves the temperature at which the reaction is performed
-
get_time(reaction_ID)¶ Retrieves the time that is required to perform reaction
-
get_uniq_metabolic_clusters()¶ Retrieves unique metabolic clusters (organisms with the exact same metabolism) in the database
-
get_yield(reaction_ID)¶ Retrieves yield that was reported with reaction
-
is_reversible(organism_ID, reaction_ID)¶ Retrieves reversibility information of a reaction in a specified metabolic model (model ID)
-
is_reversible_all(reaction_ID)¶ Retrieves reversibility information of a reaction independent of model
-
-
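The reactant, product and stoichiometry getters listed above all key on a reaction ID, with is_prod selecting the side of the reaction. The sketch below illustrates that access pattern against an in-memory SQLite database. It is not the Connector implementation: the class name SketchConnector, the helper build_demo_database, and the table and column names (reaction_compound, cpd_ID, is_prod, stoichiometry) are assumptions made only for illustration.

```python
import sqlite3


class SketchConnector:
    """Illustrative stand-in for a Connector-like class (schema is assumed)."""

    def __init__(self, database):
        # `database` is a path to an SQLite file, or ":memory:" for a demo.
        self.cnx = sqlite3.connect(database)

    def _species(self, reaction_ID, is_prod):
        # Parameterized query: values are bound, never formatted into the SQL.
        rows = self.cnx.execute(
            "SELECT cpd_ID FROM reaction_compound "
            "WHERE reaction_ID = ? AND is_prod = ? ORDER BY cpd_ID",
            (reaction_ID, int(is_prod)),
        ).fetchall()
        return [row[0] for row in rows]

    def get_reactants(self, reaction_ID):
        return self._species(reaction_ID, is_prod=False)

    def get_products(self, reaction_ID):
        return self._species(reaction_ID, is_prod=True)

    def get_stoichiometry(self, reaction_ID, compound_ID, is_prod):
        row = self.cnx.execute(
            "SELECT stoichiometry FROM reaction_compound "
            "WHERE reaction_ID = ? AND cpd_ID = ? AND is_prod = ?",
            (reaction_ID, compound_ID, int(is_prod)),
        ).fetchone()
        # None signals that the compound is not on that side of the reaction.
        return row[0] if row else None


def build_demo_database():
    """Create a tiny in-memory database with one hypothetical reaction."""
    connector = SketchConnector(":memory:")
    connector.cnx.execute(
        "CREATE TABLE reaction_compound "
        "(reaction_ID TEXT, cpd_ID TEXT, is_prod INTEGER, stoichiometry REAL)"
    )
    connector.cnx.executemany(
        "INSERT INTO reaction_compound VALUES (?, ?, ?, ?)",
        [
            ("rxn1", "cpdA", 0, 1.0),
            ("rxn1", "cpdB", 0, 2.0),
            ("rxn1", "cpdC", 1, 1.0),
        ],
    )
    return connector
```

With the demo data, get_reactants("rxn1") lists cpdA and cpdB, and get_stoichiometry("rxn1", "cpdB", False) reads back the coefficient stored for cpdB on the reactant side.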
Database.query.fetching_all_query_results(Q, conn, cnx, db, query, count)¶
-
Database.query.fetching_one_query_results(Q, conn, cnx, db, query, count)¶
-
Database.query.test_db_4_error(conn, cnx, query, db, count)¶
Database.build_ATLAS_db¶
-
Database.build_ATLAS_db.build_atlas(atlas_dir, DBPath, inchidb, processors, rxntype='bio')¶ Add atlas database to RSA metabolic database
-
Database.build_ATLAS_db.extract_KEGG_data(url)¶ Extract Kegg db info
-
Database.build_ATLAS_db.fill_arrays_4_db(rxn_keggids, rxn_atlas, DBPath, inchidb, processors, rxntype)¶ fill arrays with ATLAS reactions to add to database
-
Database.build_ATLAS_db.fill_database(cnx, reactions, reaction_reversible, model_reaction, reaction_protein, reaction_genes, model_compound, compound, reaction_compound, original_db_cpd_new)¶ fill database with ATLAS information
-
Database.build_ATLAS_db.fill_dictionary(larray, dictionary, KEGGID=False)¶ Fill dictionaries with ATLAS reactions
-
Database.build_ATLAS_db.fill_dictionary_atlasbiochem(larray, dictionary)¶ Fill dictionaries with ATLAS reactions
-
Database.build_ATLAS_db.get_inchi_4_cpd(cpd, INCHI)¶ Get inchi for a compound if inchidb is True
-
Database.build_ATLAS_db.open_atlas_files(atlas_files)¶ open ATLAS files
-
Database.build_ATLAS_db.process_reactions(rxninfo, currentcpds, dbcpds, original_db_cpd_current, inchidb, rxntype, INCHI, output_queue)¶ Process ATLAS reactions
-
Database.build_ATLAS_db.process_substrates(rxn, cpd, is_prod, currentcpds, dbcpds, inchidb, original_db_cpd_current, model_compound_temp, reaction_compound_temp, model_reaction_temp, compound_temp, original_db_cpd_temp, INCHI)¶ Process ATLAS compounds
Database.build_kbase_db¶
-
Database.build_kbase_db.BuildKbase(sbml_dir, kbase2keggCPD_translate_file, kbase2keggRXN_translate_file, inchi, DBpath, rxntype='bio')¶ Inserts values from metabolic network XML files into the sqlite database
-
Database.build_kbase_db.extract_KEGG_data(url)¶ Extract Kegg db info
-
Database.build_kbase_db.get_KEGG_IDs(ID, KEGGdict)¶ Retrieve KEGG IDs
-
Database.build_kbase_db.get_compartment_info(ID)¶ Retrieve compartment information for compound
-
Database.build_kbase_db.get_inchi_values(file_name, inchi_pubchem, inchi_cf, inchi_cas, total_set, CPD2KEGG, CT, INCHI)¶ Retrieve InChI values for compounds in metabolic networks
-
Database.build_kbase_db.get_metabolic_clusters(org_cpds, org_rxns, model_id, metabolic_clusters, cluster_org)¶ Retrieves metabolic cluster info
-
Database.build_kbase_db.insert_comprehensive_model_results_2_db(DBpath, modelcompartments, modelcompounds_allinfo, rxn_info, all_rxn_cpds, keggdict, inchi, inchi_pubchem, filenum, rxntype)¶ Inserts comprehensive compound and reaction information into tables
-
Database.build_kbase_db.insert_individual_model_results_2_db(DBpath, modelcompounds, modelreactions, genelist, proteinlist, mi, filename)¶ Inserts individual model information into tables
-
Database.build_kbase_db.kegg2pubcheminchi(cpd)¶ Using KEGG ID to get inchi
-
Database.build_kbase_db.load_file_info_2_db(args)¶ Open and insert information from metabolic networks (xml files) into database
-
Database.build_kbase_db.open_translation_file(file_name)¶ opens and stores KEGG translation files
-
Database.build_kbase_db.parse_data_sbmlfile(inchi, CPD2KEGG, RXN2KEGG, file_name, inchi_pubchem, inchi_cf, inchi_cas)¶ Open metabolic network file (xml file) and parse information
-
Database.build_kbase_db.process_compartments(compartmentsoup)¶ Get compartments in xml file
-
Database.build_kbase_db.process_compounds(speciessoup, CPD2KEGG, mi, inchi, inchi_pubchem, inchi_cf, inchi_cas)¶ Parse compound information from metabolic network file (xml file)
-
Database.build_kbase_db.process_reactions(reaction_soup, RXN2KEGG, CPD2KEGG, mi, inchi, inchi_pubchem)¶ Parse reaction information from metabolic network file (xml file)
-
Database.build_kbase_db.retrieve_exact_inchi_values(m, total_set, inchi_pubchem, inchi_cf, inchi_cas, CPD2KEGG, CT, INCHI)¶ Retrieve InChI values
-
Database.build_kbase_db.retrieve_metabolic_clusters(DBpath)¶ Identifies and inserts metabolic clusters (organisms with the same compounds and reactions) into database
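A metabolic cluster is described above as a set of organisms with the same compounds and reactions. One way to compute such clusters is to build a hashable signature from each model's compound and reaction sets and group models by it. The sketch below shows that idea only; the function name group_metabolic_clusters and its input layout (dictionaries mapping model ID to lists of IDs) are assumptions, not the module's actual interface.

```python
from collections import defaultdict


def group_metabolic_clusters(model_compounds, model_reactions):
    """Group model IDs whose compound and reaction sets are identical.

    Both arguments map a model ID to an iterable of IDs (assumed layout).
    Returns a sorted list of clusters, each a sorted list of model IDs.
    """
    clusters = defaultdict(list)
    for model_id in sorted(model_compounds):
        # frozenset ignores ordering and duplicates, so two models match
        # whenever they contain exactly the same compounds and reactions.
        signature = (
            frozenset(model_compounds[model_id]),
            frozenset(model_reactions.get(model_id, ())),
        )
        clusters[signature].append(model_id)
    return sorted(clusters.values())
```

Models m1 and m2 below end up in one cluster because their sets match regardless of order, while m3 forms its own cluster:

```python
compounds = {"m1": ["c1", "c2"], "m2": ["c2", "c1"], "m3": ["c1"]}
reactions = {"m1": ["r1"], "m2": ["r1"], "m3": ["r1"]}
clusters = group_metabolic_clusters(compounds, reactions)
```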
Database.build_modelseed¶
-
class
Database.build_modelseed.BuildModelSeed(username, password, rxntype, inchidb, DBpath, output_folder, media='Complete', newdb=True, tokentype='patric', sbml_output=False, processors=4, verbose=False, patricfile='/Users/lwhitmo/software/RetSynth_lt/rs/Database/data/PATRIC_genome_complete_07152018.csv', previously_built_patric_models=False)¶ Bases: object
-
get_model_from_patric()¶ Builds models in patric and converts them to cobra models which are then imported into retsynth database
-
load_complete_genomes()¶ Loads a list of patric genome IDs that are complete genomes
-
process_cobra_model(model, genome_id)¶ Adds information from cobra model into appropriate arrays
-
-
class
Database.build_modelseed.LoadIntoDB(DBpath, verbose, inchidb)¶ Bases: object
-
add_all_info_existing(allcompounds, originalIDs, allreactions, reaction_reversibility, model_ids, model_compartments)¶ Loads unique compound, reaction, compartment and model information into preexisting database
-
add_all_info_new(allcompounds, originalIDs, allreactions, reaction_reversibility, model_ids, model_compartments)¶ Loads unique compound, reaction, compartment and model information into new database
-
add_cluster_info()¶ Loads cluster information into database
-
add_model_compounds(model_compounds)¶ Loads model compound information into database
-
add_model_reactions(model_reactions, reaction_genes, reaction_protein)¶ Loads reaction information into database
-
add_reaction_compound(reaction_compound, newdb)¶ Loads reaction compound information into database
-
get_model_sorted_cpds(model)¶
-
get_model_sorted_rxns(model)¶
-
-
Database.build_modelseed.build_patric_models(genome_id, genome_name, media, username)¶ Builds patric models on the patric server
-
Database.build_modelseed.extract_KEGG_data(url, verbose)¶ Extract Kegg db info
-
Database.build_modelseed.generate_sbml_output_folder(output_folder)¶ Generate output folder for sbml fba models if user has specified this option
-
Database.build_modelseed.get_KEGG_IDs(ID, compartment, KEGGdict)¶ Retrieve KEGG IDs
-
Database.build_modelseed.kegg2pubcheminchi(cpd, verbose)¶ Converts KEGG ID to InChI value
-
Database.build_modelseed.open_translation_file(file_name)¶ opens and stores KEGG translation files
-
Database.build_modelseed.retrieve_exact_inchi_values(new_cpd_keggid, raw_cpd_keggid, cpd_name, compart_info, inchi_pubchem, inchi_cf, inchi_cas, CT, INCHI, verbose)¶ Retrieve InChI values
-
Database.build_modelseed.verbose_print(verbose, line)¶ verbose print function
Database.build_user_rxns_db¶
Database.build_KEGG_db¶
-
Database.build_KEGG_db.BuildKEGG(types_orgs, inchidb, processors, currentcpds, num_organisms='all', num_pathways='all')¶ Build metabolic database from KEGG DB
-
class
Database.build_KEGG_db.CompileKEGGIntoDB(database, type_org, inchidb, processors, num_organisms, num_pathways, rxntype, add)¶ Bases: object
Add KEGG info to sqlite database
-
add_to_preexisting_db()¶ Add KEGG info to already developed database
-
fill_new_database()¶ Fill database
-
-
Database.build_KEGG_db.add_metabolite(reactionID, cpd, stoichiometry, is_prod, reactioninfo)¶ add metabolites to dictionary
-
Database.build_KEGG_db.extract_KEGG_data(url)¶ Extract Kegg db info
-
Database.build_KEGG_db.extract_KEGG_orgIDs(types_orgs, num_organisms)¶ Retrieve organism IDs in KEGG
-
Database.build_KEGG_db.extract_pathwayIDs(orgID, num_pathways, output_queue)¶ Retrieve pathway IDs
-
Database.build_KEGG_db.extract_reactionIDs(pathway, output_queue)¶ Extract reactions in pathways
-
Database.build_KEGG_db.process_compound(cpd, reactionID, reactioninfo, is_prod, inchidb, compoundinfo, cpd2inchi, inchi_cf, inchi_cas, currentcpds)¶ Extract compound info
-
Database.build_KEGG_db.process_reaction(reactionID, inchidb, compoundinfo, cpd2inchi, inchi_cf, inchi_cas, currentcpds, output_queue)¶ Extract reaction info
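add_metabolite takes a reaction ID, a compound, a stoichiometry, an is_prod flag and a dictionary to fill. The sketch below shows one plausible way such a dictionary could be filled from a reaction equation of the form "A + 2 B <=> C". It is an illustration under assumptions: the nested dictionary layout, the helper parse_equation, and the restriction to integer coefficients are all hypothetical, and only the add_metabolite argument order is taken from the signature above.

```python
def add_metabolite(reactionID, cpd, stoichiometry, is_prod, reactioninfo):
    """Record one compound of a reaction (dictionary layout is assumed)."""
    side = "products" if is_prod else "reactants"
    entry = reactioninfo.setdefault(
        reactionID, {"reactants": {}, "products": {}}
    )
    entry[side][cpd] = stoichiometry
    return reactioninfo


def parse_equation(reactionID, equation, reactioninfo=None):
    """Split 'A + 2 B <=> C' into reactants and products (hypothetical helper).

    Only plain integer coefficients are handled; a missing coefficient
    is treated as 1.
    """
    reactioninfo = {} if reactioninfo is None else reactioninfo
    left, right = equation.split("<=>")
    for text, is_prod in ((left, False), (right, True)):
        for term in text.split(" + "):
            parts = term.split()
            if len(parts) == 2:
                coefficient, cpd = int(parts[0]), parts[1]
            else:
                coefficient, cpd = 1, parts[0]
            add_metabolite(reactionID, cpd, coefficient, is_prod, reactioninfo)
    return reactioninfo
```

Calling parse_equation("R_demo", "cpdA + 2 cpdB <=> cpdC") with these placeholder IDs records cpdA and cpdB as reactants with coefficients 1 and 2, and cpdC as the single product.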
Database.build_metacyc_db¶
-
class
Database.build_metacyc_db.MetaCyc(DB, inchidb, cnx, verbose)¶ Bases: object
Opens and parses metacyc xml file
-
arrays(cpdID, compartment, name, KEGG_ID, chemicalformula, cas)¶ Adds compound to compound arrays
-
check_db(cpdID, kegg_id)¶ Checks the database for the table original_db_cpdIDs; if the table exists and the compound ID is in it, the InChI value for that compound is retrieved
-
check_reaction_difference(rxnID, rxn_compounds)¶ Checks whether promiscuous metabolites are the only difference between reactions
-
compound_translator(compound_ID, biocyc_ID, inchi_ID, KEGG_ID, name, compartment)¶ Checks whether the metacyc compound is in the database and adds it if it is not
-
fill_compound_arrays(ID, inchi, cpdID, KEGG_ID, name, compartment)¶ Determines whether or not to add compound to compound arrays which will later be added to the sqlite database
-
fill_temp_array(cpdID, is_prod, stoic, temp_all_rxn_compound)¶ Adds reaction information from metacyc xml file to temporary reaction list which later gets inserted into database
-
get_compounds_4_rxn(species)¶ get reaction compounds
-
get_fp_cf_info(inchi)¶
-
get_promiscuous_cpds(file_name)¶
-
multiple_copies_of_rxns(rxnID, rxn, name, genes, proteins, kegg, temp_all_rxn_compound)¶ Deals with rxns that have same catalytic enzyme but different substrates
-
read_metacyc_file(BIOCYC_translator, file_name)¶ Reads and parses metacyc SBML file
-
retrieve_compartment_4_compartment(temp_compartment, rxnID)¶
-
retrieve_rxn_info(rxn, rxnID, genes, proteins, kegg_ID, biocycID)¶ Parses reaction information from metacyc xml file
-
rxn_translator(reaction_ID, temp_all_rxn_compound, revers, name, genes, proteins, kegg)¶ Checks whether the metacyc reaction is in the database and adds it if it is not
-
-
class
Database.build_metacyc_db.Translate(DBPath, file_name, inchidb, rxntype, verbose, add=True)¶ Bases: object
Translates metacyc compound and reaction IDs to Kbase compound and reaction IDs if a kbase database is used (ONLY WORKS WITH KBASE, ModelSeed/patric)
-
add_metacyc_to_db()¶ Adds metacyc information to the database
-
-
Database.build_metacyc_db.extract_KEGG_data(url)¶ Extract Kegg db info
-
Database.build_metacyc_db.get_inchi_from_kegg_ID(cpd)¶
-
Database.build_metacyc_db.verbose_print(verbose, line)¶
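check_reaction_difference is described as testing whether promiscuous metabolites are the only difference between reactions. A set-based version of that test is sketched below: take the compounds that appear in exactly one of the two reactions and check that all of them are in the promiscuous set. The function name, its arguments, and the choice to treat identical reactions as passing the check are assumptions for illustration, not the MetaCyc class's actual logic.

```python
def differ_only_by_promiscuous(rxn_compounds_a, rxn_compounds_b, promiscuous_cpds):
    """Return True if every unshared compound is a promiscuous one.

    Identical compound sets also return True, since nothing outside the
    promiscuous set distinguishes the two reactions (a choice of this sketch).
    """
    # Symmetric difference: compounds present in exactly one of the reactions.
    difference = set(rxn_compounds_a) ^ set(rxn_compounds_b)
    return difference <= set(promiscuous_cpds)
```

For example, two reactions that differ only in which of two promiscuous cofactors they use pass the check, while reactions that differ in a non-promiscuous substrate do not.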
Database.build_SPRESI_db¶
-
Database.build_SPRESI_db.RDF_Reader(file_directory, DBpath, rxntype, compartment, processors, temp_option=False, pressure_option=False, yield_option=False, time_option=False, catalyst_option=True, solvent_option=True)¶ Adds data from RDF files into database (specifically works with SPRESI-formatted RDF files)
-
Database.build_SPRESI_db.add_individual_file_info(text_file, cnx, conn, rxntype, compartment, eliminate_duplicates, identification)¶ Adds information from an individual file to the database
-
Database.build_SPRESI_db.add_info_2_database(DBpath, rxntype, compartment)¶ Adds data from text files to the database
-
Database.build_SPRESI_db.check_refs(table, rxn_id, test_info_ref, new_rxn_info, larray, cnx)¶ check if reference info is already in database
-
Database.build_SPRESI_db.generate_mol_file(compounds, filenumber, substrates=False)¶ Generates a mol file and then reads it in using the Indigo API to get the SMILES string
-
Database.build_SPRESI_db.get_complex_reference_details(item, reference_parameters, full_citation_string, reference_details_array, reference_details_bool)¶ retrieves details for references with many parameters
-
Database.build_SPRESI_db.get_data(match, datatype, item, result_array, type_bool, DATA_TYPE=None)¶ retrieves a variety of other information for rxn (yield, reference etc…)
-
Database.build_SPRESI_db.get_mol_structure(match, item, type_bool, result_dict, count_item=0, GET_RXN=True)¶ retrieves rxn, solvents and catalyst information
-
Database.build_SPRESI_db.open_file(args)¶ Opens an RDF file
-
Database.build_SPRESI_db.parse_file(output_file, RDF_dict, filenumber, file_name, options)¶ Parses elements of an RDFile and outputs them to a new text file
-
Database.build_SPRESI_db.process_rxntext(rxntext_array)¶ processes string containing reaction information
Database.build_MINE_db¶
-
class
Database.build_MINE_db.BuildMINEdb(dumpdirectory, database, inchidb, rxntype)¶ Bases: object
Builds a MINE database or adds MINE data to an existing metabolic database
-
add2dictionary(temp)¶ Adds compound to file dictionary
-
extract_cpd_information(compoundid, INFO, tp)¶ Get compound information
-
extract_source_information(compoundid, tp)¶ extract operator information (EC number)
-
fill_database()¶ Generate arrays of database information and fill database with information
-
fill_reaction_components_dict(rxn, compound, typecpd)¶ Fill reaction info (substrates) into dictionary
-
generate_reactions()¶ Get reactions from MINE files
-
open_mspfile(filename)¶ Opens and reads msp files
-
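open_mspfile reads msp files. MSP files are plain text in which each record is a run of "Key: value" lines and records are separated by blank lines. The sketch below parses text in that general shape into a list of dictionaries. It is not the BuildMINEdb implementation: the function name parse_msp_records and the field names in the sample are assumptions, and lines without a colon (such as peak lists) are simply skipped.

```python
def parse_msp_records(text):
    """Parse blank-line-separated 'Key: value' records into dictionaries."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the record collected so far.
            if current:
                records.append(current)
                current = {}
            continue
        if ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


# Hypothetical sample; real MINE dumps may use different field names.
SAMPLE_MSP = (
    "Name: compound one\n"
    "Formula: C6H12O6\n"
    "\n"
    "Name: compound two\n"
    "Formula: C3H4O3\n"
)
```

Running parse_msp_records(SAMPLE_MSP) gives two dictionaries, one per record, each with a Name and a Formula key.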