The synthetic transposon Sleeping Beauty (SB) is widely used for the integration of gene sequences. It was derived from a consensus of sequences found in eight fish species, including the Atlantic salmon (Salmo salar). We used a recent high-quality assembly of the complete genome of this species to identify hundreds of examples of these sequences and to derive a new consensus from them [1]. The new consensus differed little from the SB sequence, and its transposition efficiency was not superior. The amplification of these sequences in the salmon genome may therefore have followed constraints unrelated to transposition efficiency.
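To illustrate what deriving a consensus from many genomic copies involves, the following is a minimal sketch and not the procedure used in [1]. It assumes the copies have already been aligned to the same length and takes a simple per-column majority vote; the function name `majority_consensus` and the toy sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned, gap="-"):
    """Return a majority-vote consensus of pre-aligned sequences.

    Assumption: every sequence has the same length (already aligned).
    Columns whose most frequent symbol is a gap are dropped.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned):
        counts = Counter(base.upper() for base in column)
        # most_common(1) returns the most frequent symbol; ties are
        # resolved in favour of the symbol seen first in the column.
        base, _ = counts.most_common(1)[0]
        if base != gap:
            consensus.append(base)
    return "".join(consensus)


# Toy example: four hypothetical aligned copies of a short element.
copies = ["ACGTTA", "ACGATA", "ACG-TA", "TCGTTA"]
print(majority_consensus(copies))
```

A real analysis would start from a multiple sequence alignment of the genomic copies and would typically weight or filter them, which this sketch does not attempt.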
We review the problems that DNA tandem repeats cause in genome assembly: the errors they generate and how these errors result in the propagation of erroneous sequences that are eventually deposited in public databases [2].
References
[1] Scheuermann, B., T. Diem, Z. Ivics and M.A. Andrade-Navarro. 2019. Evolution-guided evaluation of the inverted terminal repeats of the synthetic transposon Sleeping Beauty. Sci. Rep. 9, 1171.
[2] Tørresen, O.K., B. Star, P. Mier, M.A. Andrade-Navarro, A. Bateman, P. Jarnot, A. Gruca, M. Grynberg, A. Kajava, V. Promponas, M. Anisimova, K.S. Jakobsen and D. Linke. 2019. Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases. Nucleic Acids Res. 47, 10994-11006.