As a beginner in bioinformatics, I’m often confused by basic biological concepts such as gene, genotype, phenotype, allele, SNP, locus, chromosome, and DNA, and by how they relate to each other. Here is a brief summary for further review. (Please feel free to correct me if there is any mistake.)
A chromosome is a long threadlike macromolecule consisting of deoxyribonucleic acid (DNA). Chromosomes are the carriers of biologically expressed hereditary characteristics. A locus on a chromosome is a section of the chromosome that we distinguish. A locus that encodes a particular biologically expressed hereditary characteristic is called a gene, and we say the chromosomes carry the genetic content of the organism. In eukaryotes, chromosomes are inside the nucleus.
A genome is a complete set of chromosomes in an organism. The human genome contains 23 chromosomes. A haploid cell contains one genome, so a human haploid cell contains 23 chromosomes. A diploid cell contains two genomes: each chromosome in one set is matched with a chromosome in the other set. Such a pair of chromosomes is called a homologous pair, and each chromosome in the pair is called a homolog. So a human diploid cell contains 2 × 23 = 46 chromosomes.
A somatic cell is one of the cells in the body of the organism. A haploid organism is an organism whose somatic cells are haploid. A diploid organism is an organism whose somatic cells are diploid. Humans are diploid organisms.
A gamete is a mature sexual reproductive cell that unites with another gamete to become a zygote, which eventually grows into a new organism. A gamete is always haploid. The gamete produced by a male is called a sperm, whereas the gamete that is produced by the female is called an egg. Germ cells are precursors of gametes. They are diploid.
Generally speaking:
- Chromosome: a long DNA molecule carrying part or all of the genetic material of an organism. Humans have 23 in a haploid set (46 in a diploid cell).
- Gene: instructions for building a specific protein or set of proteins, as shown in the pictures below.
- Alleles: the different versions of a gene, which vary from person to person. A gene can have two or more alleles. At the smallest scale, two alleles can differ by a single nucleotide (a SNP). For example, the ABO blood type is controlled by three alleles, IA, IB, and i, which determine compatibility for blood transfusions. Any individual has one of six possible genotypes (IAIA, IAi, IBIB, IBi, IAIB, and ii), which produce one of four possible phenotypes (A, B, AB, and O).
- SNP: single-nucleotide polymorphism. ‘A single-nucleotide polymorphism (SNP, pronounced snip) is a DNA sequence variation occurring when a single nucleotide, adenine (A), thymine (T), cytosine (C), or guanine (G), in the genome (or other shared sequence) differs between members of a species or paired chromosomes in an individual. For example, two sequenced DNA fragments from different individuals, AAGCCTA and AAGCTTA, contain a difference in a single nucleotide. In this case we say that there are two alleles: C and T. Almost all common SNPs have only two alleles.’
- Locus (genetics)/Loci: https://en.wikipedia.org/wiki/Locus_(genetics)
- Allele: https://en.wikipedia.org/wiki/Allele
- Exon/Intron of genes: ‘Introns and exons are nucleotide sequences within a gene. Introns are removed by RNA splicing as RNA matures, meaning that they are not expressed in the final messenger RNA (mRNA) product, while exons go on to be covalently bonded to one another in order to create mature mRNA.’
- Pleiotropy: when one gene influences two or more seemingly unrelated phenotypic traits.
- LD (linkage disequilibrium): the non-random association of alleles at different loci in a given population.
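To make the SNP entry above concrete, here is a minimal Python sketch that compares the two fragments quoted there, AAGCCTA and AAGCTTA, position by position. The function name `find_snps` is my own, and the sketch assumes the two sequences are already aligned and of equal length; real data would need an alignment step first.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_a, base_b) for every site where two
    aligned, equal-length sequences differ. Positions are 0-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# The two fragments from the SNP definition above.
snps = find_snps("AAGCCTA", "AAGCTTA")
print(snps)  # [(4, 'C', 'T')]
```

The single hit at position 4 corresponds to the two alleles C and T mentioned in the quote.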
These pictures from the internet give a helpful summary of how these concepts relate.
Other basic concepts from a book.
Mitosis & Meiosis (Check the figure recording)
Gene. A gene is a locus (section of a chromosome) that is functional in the sense that it is responsible for some biological expression in the organism. Genes are responsible for both the structure and the processes of the organism. A single chromosome can carry hundreds to thousands of genes; humans have roughly 20,000 protein-coding genes in total.
Allele. An allele is any of several forms of a gene, usually arising through mutation. Alleles are responsible for hereditary variation. Some alleles are recessive, while some of them are dominant.
Genotype & Phenotype. The genotype of an organism is its genetic makeup. The phenotype of an organism is its appearance resulting from the interaction of the genotype and environment. For example, if one individual had a blue allele and a brown allele, and another individual had two brown alleles, they would have different genotypes relative to the bey2 gene. In the simplified model where brown is dominant, both would still have the same phenotype (brown eyes).
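The many-to-one mapping from genotype to phenotype can be sketched with the ABO alleles from the glossary above (IA, IB, i). This is only an illustration: the function name `abo_phenotype` is my own, and it encodes the usual textbook rule that IA and IB are codominant while i is recessive. It ignores the environment and every other gene.

```python
from itertools import combinations_with_replacement

ALLELES = ["IA", "IB", "i"]


def abo_phenotype(genotype):
    """Map an unordered pair of ABO alleles to a blood type.
    Assumes IA and IB are codominant and i is recessive."""
    alleles = set(genotype)
    if not alleles <= set(ALLELES):
        raise ValueError(f"unknown allele in {genotype!r}")
    if alleles == {"IA", "IB"}:
        return "AB"
    if "IA" in alleles:
        return "A"
    if "IB" in alleles:
        return "B"
    return "O"


# Enumerate all unordered allele pairs: six genotypes, four phenotypes.
genotypes = list(combinations_with_replacement(ALLELES, 2))
for g in genotypes:
    print("".join(g), "->", abo_phenotype(g))
```

Genotypes IAIA and IAi both print A, which is the same situation as the two brown-eyed individuals: different genotypes, same phenotype.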
Crossing-over. Recall that during meiosis homologous pairs of chromosomes line up. Often the chromatids from the two homologs that are next to each other exchange corresponding segments of genetic material.
Gene conversion. Gene conversion (nonreciprocal recombination) is similar to crossing-over except that one of the homologs has a segment of chromosome replaced by the homologous segment from the other homolog, whereas the other homolog is unchanged.
DNA. Chromosomes consist of the compound deoxyribonucleic acid (DNA). DNA is composed of four basic molecules called nucleotides. Each nucleotide contains a pentose sugar (deoxyribose), a phosphate group, and a purine or pyrimidine base. The purines, adenine (A) and guanine (G), are similar in structure, as are the pyrimidines, cytosine (C) and thymine (T). Adenine always pairs with thymine, and guanine always pairs with cytosine. Each such pair is called a canonical base pair (bp), and A, G, C and T are called bases. Authors often use the terms ‘nucleotide’ and ‘base’ interchangeably. When a cell divides, each chromosome first duplicates itself. The two strands of DNA separate, and then each strand serves as a template for synthesis of a new complementary strand. This is called semiconservative replication because each new chromosome consists of an original parent strand and a newly synthesized child strand.
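The pairing rule (A with T, G with C) and semiconservative replication can be sketched in a few lines of Python. This is a simplification: the names `complement` and `replicate` are my own, strands are plain strings paired position by position, and strand direction (5' to 3') is ignored.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the strand that base-pairs with `strand`, position by position."""
    return "".join(PAIRS[base] for base in strand)


def replicate(duplex):
    """Semiconservative replication of a (top, bottom) duplex.
    Each parent strand is kept and used as the template for a new partner."""
    top, bottom = duplex
    child_from_top = (top, complement(top))           # old top + new bottom
    child_from_bottom = (complement(bottom), bottom)  # new top + old bottom
    return child_from_top, child_from_bottom


parent = ("AAGCCTA", complement("AAGCCTA"))
child1, child2 = replicate(parent)
print(parent)   # ('AAGCCTA', 'TTCGGAT')
print(child1 == parent and child2 == parent)  # True
```

Each child duplex keeps one original strand and gains one newly built strand, yet both carry the same sequence as the parent.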
Ribonucleic acid (RNA) differs from DNA by having ribose instead of deoxyribose as the sugar, having the base uracil (U) instead of thymine (T), and usually being single-stranded.
Gene and proteins. A gene is a section of a chromosome, often consisting of thousands of base pairs, but the size of genes varies a great deal. Most genes, called protein-coding genes, produce a protein, which is a macromolecule composed of amino acids. The coding regions of a gene, which are the regions that determine the resultant protein, are called exons; any space between two exons is called an intron.
How genes produce proteins.
Transcription. First, the gene produces messenger RNA (mRNA) by a process called transcription. Transcription starts when as many as 50 different protein transcription factors bind to promoter sites on the DNA. An enzyme called RNA polymerase then binds to the complex of transcription factors. Working together, they open the DNA helix. One of the resulting strands of DNA is called the antisense strand; the complementary strand is called the sense strand. The antisense strand serves as the template from which RNA polymerase synthesizes a single strand of RNA called precursor messenger RNA (pre-mRNA).
In the transcription process, multiple RNA polymerases can bind on a single DNA template, resulting in multiple rounds of transcription.
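As a rough sketch of the transcription step, the RNA is built by pairing against the antisense (template) strand, with U taking the place of T. So it comes out reading like the sense strand with T replaced by U. The short sequence below is made up for illustration and `transcribe` is my own name; splicing of the pre-mRNA is not modeled.

```python
# RNA base that pairs with each DNA base on the template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(antisense):
    """Build the RNA complementary to the antisense (template) strand.
    The strand is given position by position, aligned with the sense strand."""
    return "".join(DNA_TO_RNA[base] for base in antisense)


sense = "ATGGCCTAA"      # hypothetical sense strand
antisense = "TACCGGATT"  # its base-paired partner (the template)

rna = transcribe(antisense)
print(rna)                             # AUGGCCUAA
print(rna == sense.replace("T", "U"))  # True
```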
Translation. An mRNA molecule consists of triplets of sequential nucleotides called codons. Each sense codon codes for exactly one amino acid. During translation, the codons are read in sequence, and each one specifies the next amino acid to be added to the protein. This is accomplished via transfer RNA (tRNA) molecules. A tRNA is a small molecule that carries both an anticodon, which is complementary to a codon, and the corresponding amino acid. Its job is to deliver the amino acid to the codon.
Codons. In total, there are 4 × 4 × 4 = 64 different codons. In the standard code, 61 of them are sense codons, which code for amino acids, and 3 are stop codons. Codons that code for the same amino acid are called synonymous. With a few exceptions, the correspondence between codons and amino acids is the same in all organisms. For that reason it is called the universal genetic code.
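The codon count and the idea of synonymous codons can be checked with a small Python sketch. It builds the standard genetic code from the usual compact 64-letter layout (bases in the order U, C, A, G; `*` marks a stop codon; amino acids are one-letter codes). The names `CODON_TABLE` and `translate` are my own, and the input mRNA is a made-up example. Translation here simply starts at the first base and ignores start-codon search.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order (UUU, UUC, UUA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Read codons from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
leucine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")

print(len(CODON_TABLE))        # 64
print(stop_codons)             # ['UAA', 'UAG', 'UGA']
print(len(leucine_codons))     # 6
print(translate("AUGGCCUAA"))  # MA
```

The six leucine codons are an example of synonymous codons: different triplets, same amino acid.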
Gene expression level. The level of mRNA produced by a gene is called the gene expression level. Recall that the transcription process starts with protein transcription factors binding to promoter sites on the DNA. Although all cells in an organism contain the same genetic information, their protein composition is quite different. Different cells have different protein transcription factors, which means that the same gene may be expressed at a different level in one cell than in another. Since transcription factors are themselves proteins produced by genes, the protein produced by one gene can have a causal effect on the level of mRNA of another gene. How do cells come to differ in the first place? One mechanism is the asymmetric localization of cytoplasmic molecules within the zygote before it first divides into two cells. For example, in the fruit fly, bicoid mRNA from the mother is translated into bicoid protein in the zygote. The protein diffuses through the egg, forming a gradient. The result is a high concentration of bicoid protein at one end of the egg and a low concentration at the other end. Bicoid is a transcription factor that turns on different genes at different levels.
Mutations. These include substitution mutations, insertions, and deletions.
Terms to note (shown in the figure): transversion, synonymous, missense.
Also: insertion mutation, deletion mutation, unequal crossing-over, replication slippage (slip-strand mispairing).
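One of the terms above, transversion, is easy to pin down in code. A substitution that swaps a purine for a purine (A and G) or a pyrimidine for a pyrimidine (C and T) is called a transition; a swap between the two classes is a transversion. The sketch below applies that standard definition; the function name `substitution_type` is my own.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-base DNA substitution as a transition or a transversion."""
    if ref == alt:
        raise ValueError("ref and alt are identical: not a substitution")
    if not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError("bases must be one of A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(substitution_type("C", "T"))  # transition
print(substitution_type("A", "C"))  # transversion
```

The C/T difference between AAGCCTA and AAGCTTA in the SNP example earlier is therefore a transition.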
Transposition. Some DNA sequences, called transposable elements, possess the intrinsic ability to move from one genomic location, called the donor site, to another genomic location, called the target site. This type of insertion mutation is called transposition. When the sequence is excised from the donor site, we call the transposition conservative, and when it is retained at the donor site (so that a copy is inserted at the target site), we call the transposition duplicative. So duplicative transposition increases the number of copies of the transposable element.
Process of transposition. First the target site splits open, resulting in a sequence of unpaired nucleotides at each end of the two pieces. That is, there is a gap in one of the strands where there are no nucleotides. Then the transposable element is inserted, and finally the synthesis of new nucleotides repairs the gaps. Ordinarily, transposable elements have inverted repeats at each of their ends. Figure 4.15 in the book shows this for the transposable element called Ds in maize (corn): at one end we have the sequence TAGGGATGAAA on the top strand from left to right; at the other end we have the same sequence on the bottom strand from right to left. Transposable elements are not uncommon.
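Inverted repeats can be tested on a single strand: if the bottom strand reads the same sequence from right to left at the far end, then on the top strand the far end is the reverse complement of the near end. The sketch below checks this for the Ds end sequence TAGGGATGAAA. The interior of the element is a made-up placeholder, not the real Ds sequence, and the function names are my own.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Complement each base and reverse the order."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_inverted_repeats(element, k):
    """True if the last k bases are the reverse complement of the first k."""
    return element[-k:] == reverse_complement(element[:k])


left_end = "TAGGGATGAAA"  # Ds end sequence quoted above
interior = "ACGTTGCA"     # hypothetical placeholder, NOT the real Ds interior
element = left_end + interior + reverse_complement(left_end)

print(reverse_complement(left_end))                 # TTTCATCCCTA
print(has_inverted_repeats(element, len(left_end)))  # True
```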
Horizontal gene transfer. Horizontal gene transfer is a form of transposition in which genetic material is transferred from one species to another. A common form of horizontal gene transfer in bacteria is transformation, in which a cell takes up free DNA from its environment. Another type of horizontal gene transfer is conjugation, in which the donor organism and recipient organism must physically interact. In bacterium-to-bacterium conjugation, a plasmid transports the DNA from the donor to the recipient. Transduction is similar to conjugation in that a vehicle transports the DNA, but the donor and recipient need not be in physical contact.
References:
Neapolitan, Richard E. Probabilistic Methods for Bioinformatics: With an Introduction to Bayesian Networks. Morgan Kaufmann, 2009. Other sources were also consulted.