A Unifying Network Modeling Approach for Codon Optimization

DESCRIPTION

This software is an implementation of the integer programming formulation for codon optimization proposed by Alper Şen, Oya Karaşan and Banu Tiryaki. 

INSTALLATION REQUIREMENTS

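To run the software, Python and Gurobi are required. The gurobipy (the Gurobi Optimization library for Python) and numpy packages have to be installed with a Python package manager. The math and time modules are part of the Python standard library and need no separate installation.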

USAGE

The program asks for the following inputs:

"Please enter protein's name:" Enter the name of a text file containing the amino acid sequence, with each amino acid represented by its one-letter code. For example: LDVELTVEER. Sample proteins are provided in the "Proteins" folder.

"Please enter the output file name:" Enter the name of the text file to which the model will write the final codon design.

"Please enter the objective function:" Enter "1" to maximize CPB, "2" to maximize CAI, "3" to minimize RCPB, or "4" to minimize RCB.

"Options for the constraints:" Enter a threshold value for each constraint that should be activated, or "N" for each constraint that should not be activated.

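The answers to the prompts above could be checked along the following lines. This is only a minimal sketch: the helper names clean_sequence, parse_objective and parse_threshold are hypothetical and are not taken from the software, and the software's own input handling may differ.

```python
# One-letter codes from the README's aminoacidset ('X' is its 21st entry).
AMINO_ACIDS = set("ILVFMCAGPTSYWQNHEDKRX")

# Objective choices as listed in the README: answer -> (direction, measure).
OBJECTIVES = {
    "1": ("max", "CPB"),
    "2": ("max", "CAI"),
    "3": ("min", "RCPB"),
    "4": ("min", "RCB"),
}


def clean_sequence(text):
    """Remove whitespace and check that every letter is a known one-letter code."""
    seq = "".join(text.split())
    unknown = sorted(set(seq) - AMINO_ACIDS)
    if not seq or unknown:
        raise ValueError(f"invalid protein sequence, unknown letters: {unknown}")
    return seq


def parse_objective(answer):
    """Map the answer "1".."4" to its (direction, measure) pair."""
    key = answer.strip()
    if key not in OBJECTIVES:
        raise ValueError(f"objective must be one of {sorted(OBJECTIVES)}")
    return OBJECTIVES[key]


def parse_threshold(answer):
    """Return None for "N" (constraint inactive), otherwise the threshold as a float."""
    answer = answer.strip()
    if answer == "N":
        return None
    return float(answer)  # raises ValueError on non-numeric input


# Example with the sequence from the README.
print(clean_sequence("LDVELTVEER\n"))
print(parse_objective("2"))
print(parse_threshold("N"), parse_threshold("0.8"))
```
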
The model uses Fitness Values, Codon Pair Bias values, Frequency Codon Pair values and Frequency Codon values for Homo sapiens. These values are stored in the cai, cpb, rcpb and rcb text files, respectively. To use values for a different organism, provide files in the following formats.

Fitness Value file: A text file containing the Fitness Value of each codon (1 by 64). Values should be separated by "," in the file.
Small 1 by 3 Example:  1, 0.795767933108405, 0.229059703681972

Frequency Codon file: A text file containing the Frequency of each codon (1 by 64). Values should be separated by "," in the file.
Small 1 by 3 Example:  0.421418, 0.538557, 0.578582
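
Before running the model, a comma-separated 1 by 64 file (Fitness Value or Frequency Codon) prepared for another organism can be checked with a short script. The following is a minimal sketch using only the standard library; the function name parse_codon_vector is hypothetical, and the software may read these files differently.

```python
def parse_codon_vector(text, expected=64):
    """Parse comma-separated numbers into a list of floats and check the count."""
    values = [float(tok) for tok in text.split(",") if tok.strip()]
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, found {len(values)}")
    return values


# The small 1 by 3 example from the README, so expected=3 here.
sample = "0.421418, 0.538557, 0.578582"
print(parse_codon_vector(sample, expected=3))
```

For a real file, pass the file's contents and leave expected at its default of 64.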

Codon Pair Bias file: A text file containing the Codon Pair Bias values (64 by 64). Values in each row should be separated by a tab in the file.
Small 3 by 3 Example:
-0.345	-0.221	0.125
0.341	0.204	-0.011
0.272	0.225	0.364

Frequency Codon Pair file: A text file containing the Frequency Codon Pair values (64 by 64). Values in each row should be separated by a tab in the file.
Small 3 by 3 Example:
-0.345	-0.221	0.125
0.341	0.204	-0.011
0.272	0.225	0.364
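
A tab-separated 64 by 64 file (Codon Pair Bias or Frequency Codon Pair) can be checked in the same spirit. This is a minimal sketch using only the standard library; parse_codon_pair_matrix is a hypothetical name and the software's own reader may differ.

```python
def parse_codon_pair_matrix(text, size=64):
    """Parse tab-separated rows into a list of float lists and check the shape."""
    rows = []
    for line in text.splitlines():
        line = line.strip()  # drops stray leading spaces and trailing tabs
        if not line:
            continue
        rows.append([float(tok) for tok in line.split("\t")])
    if len(rows) != size or any(len(row) != size for row in rows):
        raise ValueError(f"expected a {size} by {size} matrix")
    return rows


# The small 3 by 3 example from the README, so size=3 here.
sample = "-0.345\t-0.221\t0.125\n0.341\t0.204\t-0.011\n0.272\t0.225\t0.364\n"
matrix = parse_codon_pair_matrix(sample, size=3)
print(matrix[0])
```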

We assume the following ordering for the amino acid and codon sets. The values in the files described above are expected to follow this codon ordering:

#20 amino acids plus 'X' for the stop codons (21 entries)
aminoacidset=['I','L','V','F','M','C','A','G','P','T','S','Y','W','Q','N','H','E','D','K','R','X']

#64 codons
codonset=['AUU','AUA','AUC','CUA','CUC','CUG','CUU','UUA','UUG','GUU','GUA','GUC','GUG','UUU','UUC','AUG','UGU','UGC','GCA','GCC','GCG','GCU','GGU','GGC','GGA','GGG','CCU','CCC','CCA','CCG','ACU','ACC','ACA','ACG','UCU','UCC','UCA','UCG','AGU','AGC','UAU','UAC','UGG','CAA','CAG','AAU','AAC','CAU','CAC','GAA','GAG','GAU','GAC','AAA','AAG','CGU','CGC','CGA','CGG','AGA','AGG','UAA','UAG','UGA']
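
The ordering above can be turned into lookup tables, for example to find a codon's position in a 1 by 64 file or to list the codons of an amino acid. The sketch below is illustrative only: codon_index, degeneracy, synonymous and pair_value are hypothetical names, the codon counts come from the standard genetic code, and the row/column convention in pair_value is an assumption that should be checked against the data files.

```python
aminoacidset = ['I','L','V','F','M','C','A','G','P','T','S','Y','W','Q','N','H','E','D','K','R','X']

codonset = ['AUU','AUA','AUC','CUA','CUC','CUG','CUU','UUA','UUG','GUU','GUA','GUC','GUG',
            'UUU','UUC','AUG','UGU','UGC','GCA','GCC','GCG','GCU','GGU','GGC','GGA','GGG',
            'CCU','CCC','CCA','CCG','ACU','ACC','ACA','ACG','UCU','UCC','UCA','UCG','AGU',
            'AGC','UAU','UAC','UGG','CAA','CAG','AAU','AAC','CAU','CAC','GAA','GAG','GAU',
            'GAC','AAA','AAG','CGU','CGC','CGA','CGG','AGA','AGG','UAA','UAG','UGA']

# Number of codons per entry of aminoacidset in the standard genetic code
# (the last entry, 'X', covers the three stop codons).
degeneracy = [3, 6, 4, 2, 1, 2, 4, 4, 4, 4, 6, 2, 1, 2, 2, 2, 2, 2, 2, 6, 3]

# Position of each codon in codonset, i.e. its column in a 1 by 64 file.
codon_index = {codon: i for i, codon in enumerate(codonset)}

# Codons of each amino acid, read off as consecutive slices of codonset.
synonymous = {}
start = 0
for aa, count in zip(aminoacidset, degeneracy):
    synonymous[aa] = codonset[start:start + count]
    start += count


def pair_value(matrix, first, second):
    """Look up a codon-pair value.

    Assumption: rows index the first codon and columns the second.
    """
    return matrix[codon_index[first]][codon_index[second]]


print(codon_index['AUG'], synonymous['L'])
```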
