Facundo Martínez edited this page Jul 31, 2023 · 2 revisions

BioSeeker's Wiki

Glossary

  • Codon: a sequence of three consecutive DNA or RNA nucleotides that codes for a specific amino acid or for a stop signal.

  • Bicodon: two consecutive codons.

  • Conservation rate: the number of times a given codon is conserved across species of the same genus, divided by the number of times that codon is present in the reference sequence.

  • Genus: a principal taxonomic category that ranks above species and below family, and is denoted by a capitalized Latin name, e.g. Drosophila.

  • Species: a group of living organisms consisting of similar individuals capable of exchanging genes or interbreeding. The species is the principal natural taxonomic unit, ranking below a genus and denoted by a Latin binomial, e.g. Homo sapiens.

  • Drosophila: A genus of small fruit flies, used extensively in genetic research because of their large chromosomes, numerous varieties, and rapid rate of reproduction.

  • FlyDIVaS: An online database containing D. melanogaster-centric orthologous gene sets, CDS and protein alignments, divergence statistics (% gaps, dN, dS, dN/dS), and codon-based tests of positive Darwinian selection.

  • MSA file: Multiple sequence alignment file which contains aligned homologous genes from different species.

  • FASTA: Text-based format for representing either nucleotide sequences or amino acid (protein) sequences.

  • NumPy: Python library for numerical computing, built around fast n-dimensional arrays.

  • Pandas: Python library used to handle tabular datasets (built on top of NumPy).

  • ORF: Open Reading Frame. A reading frame is one of the three possible ways of dividing a nucleotide sequence on a given strand into consecutive, non-overlapping codons. Take a stretch of 15 DNA nucleotides: acttagccgggacta. Reading from the first letter, 'a', gives the first reading frame (act tag ccg gga cta); starting from the second or third letter gives the second and third reading frames. A reading frame is called "open" when it can be translated into a fully functioning polypeptide without being interrupted by stop codons, and "closed" when it contains multiple stop codons.

  • Reference sequence: DNA sequence belonging to the reference species. The reference species is the one to which all other species are compared.

  • CSV file: Comma-Separated Values file. A plain-text file that stores tabular data, one row per line, with fields separated by commas.
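
To make the codon, bicodon, and conservation rate entries concrete, here is a minimal Python sketch. It is not BioSeeker's actual implementation: the function names (`split_codons`, `bicodons`, `conservation_rates`) are hypothetical, and it assumes that sequences are already aligned, that they are read in the first reading frame, and that a codon counts as "conserved" at a position only when every other species has the identical codon there.

```python
from collections import Counter


def split_codons(seq):
    """Split a sequence into consecutive, non-overlapping codons (first frame).

    Trailing nucleotides that do not fill a complete codon are dropped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def bicodons(seq):
    """Return every pair of consecutive codons as a 6-nucleotide string."""
    codons = split_codons(seq)
    return [codons[i] + codons[i + 1] for i in range(len(codons) - 1)]


def conservation_rates(reference, others):
    """Conservation rate per codon of the reference sequence.

    Assumption: a reference codon is conserved at a position when all
    sequences in `others` carry the same codon at that aligned position.
    Reference codons containing an alignment gap ('-') are skipped.
    """
    ref_codons = split_codons(reference)
    other_codons = [split_codons(s) for s in others]

    present = Counter()    # times each codon appears in the reference
    conserved = Counter()  # times it is unchanged in every other species
    for pos, codon in enumerate(ref_codons):
        if "-" in codon:
            continue
        present[codon] += 1
        if all(pos < len(oc) and oc[pos] == codon for oc in other_codons):
            conserved[codon] += 1

    return {codon: conserved[codon] / count for codon, count in present.items()}


# Toy alignment: one reference sequence and two other species.
reference = "ATGAAAATGTAA"
others = ["ATGAAGATGTAA", "ATGAAAATCTAA"]
rates = conservation_rates(reference, others)
```

In this toy alignment, ATG appears twice in the reference but is unchanged in both other species only at the first position, so its rate is 0.5. The same counting could be applied to bicodons by swapping `split_codons` for `bicodons` inside `conservation_rates`.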
