Statistical description and findings

The current statistics cover 123 bacteriocins: 113 secreted by Gram-positive bacteria and 10 by Gram-negative bacteria. Bacteriocins are produced by a variety of bacterial species, with the lactic acid bacteria (order Lactobacillales) being the predominant group of producers. Of the bacteriocins in our database, 86 are produced by lactic acid bacteria, two by halobacteria, six by actinobacteria, 18 by bacilli, four by clostridia and seven by proteobacteria. For 81.3% of the bacteriocins, the sequence length ranges from 20 to 60 amino acids (Figure 1).

Figure 1: Histogram of the distribution of peptide length in the BACTIBASE database.

 

Table 1 summarizes the amino acid percentages. Glycine is the most abundant amino acid, and 93.5% of these bacteriocins contain at least one glycine residue.

Amino acid           Number of residues   % of total residues
G (glycine)          717                  15.03
A (alanine)          491                  10.29
K (lysine)           360                  7.55
S (serine)           343                  7.19
V (valine)           299                  6.27
N (asparagine)       295                  6.18
T (threonine)        292                  6.12
I (isoleucine)       270                  5.66
C (cysteine)         254                  5.32
L (leucine)          253                  5.30
W (tryptophan)       160                  3.35
Y (tyrosine)         159                  3.33
F (phenylalanine)    149                  3.12
P (proline)          130                  2.73
Q (glutamine)        107                  2.24
H (histidine)        101                  2.12
E (glutamic acid)    100                  2.10
D (aspartic acid)    99                   2.08
R (arginine)         95                   1.99
M (methionine)       80                   1.68
X (variable)         16                   0.34

 

Table 1: Amino acid occurrence in the BACTIBASE database
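The percentages in Table 1 follow directly from the residue counts: the counts sum to 4770 residues, so glycine accounts for 717/4770, or about 15.03%. The sketch below shows one way such a composition table could be computed from a list of sequences. It is illustrative only: the function names and the toy sequences are hypothetical and are not taken from BACTIBASE or its code.

```python
from collections import Counter


def amino_acid_composition(sequences):
    """Return {residue: (count, percent_of_total)}, most frequent first."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())
    total = sum(counts.values())
    return {
        aa: (n, round(100.0 * n / total, 2))
        for aa, n in counts.most_common()
    }


def fraction_containing(sequences, residue):
    """Percentage of sequences containing at least one copy of `residue`."""
    if not sequences:
        raise ValueError("no sequences given")
    hits = sum(1 for seq in sequences if residue in seq.upper())
    return 100.0 * hits / len(sequences)


# Hypothetical toy sequences (NOT real BACTIBASE entries).
toy = ["GAKG", "AKSV", "GGCC"]
comp = amino_acid_composition(toy)
for aa, (n, pct) in comp.items():
    print(f"{aa}\t{n}\t{pct:.2f}")
print(f"{fraction_containing(toy, 'G'):.1f}% of sequences contain glycine")
```

The second helper corresponds to statements such as "93.5% of these bacteriocins contain at least one glycine residue", which count sequences rather than residues.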

 

The Pearson correlation coefficient (r = 0.635) revealed a moderate positive correlation between sequence length and number of glycine residues, suggesting that the proportion of glycine is fairly constant across peptides of different lengths (Figure 2).

Figure 2: Correlation between length and number of glycine residues among peptides in the BACTIBASE database.
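For reference, a Pearson coefficient of this kind can be computed from two paired lists, here the sequence lengths and the glycine counts. With r = 0.635, r squared is about 0.40, so under a linear model length alone accounts for roughly 40% of the variance in glycine count. The following is a minimal sketch using only the standard library. The helper names and toy sequences are assumptions made for illustration, and the source does not say which software was used for the original calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def length_vs_glycine(sequences):
    """Correlate sequence length with the number of glycine (G) residues."""
    lengths = [len(s) for s in sequences]
    glycines = [s.upper().count("G") for s in sequences]
    return pearson_r(lengths, glycines)


# Hypothetical toy sequences (NOT real BACTIBASE entries): glycine makes up
# exactly half of each peptide, so the correlation is perfect here.
toy_sequences = ["GA", "GGAA", "GGGAAA"]
r = length_vs_glycine(toy_sequences)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```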

 

Notably, 25% of the sequences contain no cysteine and about 32% contain only a single pair of cysteine residues (Figure 3).

Figure 3: Histogram of the distribution of cysteine residues among peptides in the BACTIBASE database.

 

We also note a low proline content, with over 74% of the sequences containing one proline residue or none. The majority (71%) of sequences have a net charge between 0 and +5, while less than 18% have a net charge greater than +5, the highest being +12 (BAC107) (Figure 4). Only 11.4% of the sequences have a net negative charge, the most negatively charged bacteriocin having a net charge of -4 (BAC097). Overall, the average net charge of all bacteriocins in BACTIBASE is +2.90.

Figure 4: Histogram of the distribution of the net charge among peptides in the BACTIBASE database.
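The text does not state how net charge was calculated. A common simplification, used here purely as an assumption, is to count lysine (K) and arginine (R) as +1 and aspartic acid (D) and glutamic acid (E) as -1, ignoring histidine, the termini and any post-translational modification. Under that assumption, the kind of summary reported above could be sketched as follows. All names and toy sequences are hypothetical.

```python
# Assumed charge model (not confirmed by the source): K, R = +1; D, E = -1.
BASIC = frozenset("KR")
ACIDIC = frozenset("DE")


def net_charge(seq):
    """Net charge as (number of basic residues) - (number of acidic residues)."""
    seq = seq.upper()
    basic = sum(1 for aa in seq if aa in BASIC)
    acidic = sum(1 for aa in seq if aa in ACIDIC)
    return basic - acidic


def charge_summary(sequences):
    """Mean net charge and the percentage of sequences in each charge band."""
    if not sequences:
        raise ValueError("no sequences given")
    charges = [net_charge(s) for s in sequences]
    n = len(charges)
    return {
        "mean": sum(charges) / n,
        "pct_0_to_5": 100.0 * sum(0 <= c <= 5 for c in charges) / n,
        "pct_above_5": 100.0 * sum(c > 5 for c in charges) / n,
        "pct_negative": 100.0 * sum(c < 0 for c in charges) / n,
    }


# Hypothetical toy sequences (NOT real BACTIBASE entries).
toy_peptides = ["KKRG", "DEGA", "KDGA", "GAVL"]
summary = charge_summary(toy_peptides)
print(summary)
```

A different charge model, for example one that includes histidine or terminal groups at a given pH, would shift individual values, so results from this sketch should not be expected to reproduce the figures in the text exactly.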

 

Figure 5 shows the distribution of basic and acidic residues. The majority of sequences display a basic pattern, with 43% having four to six basic residues. Acidic residues are less common: over 22% of the sequences contain no acidic amino acid and 83.7% contain two or fewer. The current analysis also shows that three quarters of the bacteriocins contain between five and 20 hydrophobic residues. Hydrophobicity and basicity are major criteria for bacteriocin activity [17]. Only 16 of the bacteriocins have 3D structures deposited in the PDB database, resolved by NMR spectroscopy or crystallography. Some of these have several PDB structures, bringing the total number of 3D entries to 24. These findings may be useful in isolating and characterizing novel bacteriocins, or in designing novel peptides with higher potency against pathogens or with broader antimicrobial spectra.

Figure 5: Bar graph of the distribution of acidic and basic amino acids among peptides in the BACTIBASE database.