25 Nov. - 6 Dec. 2024
00-445 (N33) NatFak Hauptgebäude
Day 1 / 25 November Monday / 9:00 - 17:00
Proteins coded in genomes and protein annotations. Homology.
You are expected to take notes during the class. We will also write out some exercises by hand (pen and paper) during the class.
Before the course starts:
Make sure that you are ready to run Python and are familiar with the JGU Jupyter server: https://cbdm-01.zdv.uni-mainz.de/~muro/teaching/p4b/em-book/c0_set_up/c0_jgu_jupyter_notebook_server.html We will use the "Biology environment" that has been set up for this class.
We will run some Python programs. Review in advance the material we learned in the bachelor studies (chapter 2 only): https://cbdm-01.zdv.uni-mainz.de/~muro/teaching/p4b/em-book/c2_printing_and_manipulating_text/c2_printing_and_manipulating_text.html
See this example, which analyzes the whole proteome of SARS-CoV-2:
EM_gzip_fasta_averageProtein_sars2.ipynb (download and rename to remove the _.txt extension)
See this example, which calculates the average amino acid (aa) mass:
EM_average_aa_mass.ipynb (download and rename to remove the _.txt extension)
See this example, which calculates the real average aa mass in a whole proteome:
EM_average_aa_mass_proteome_students.ipynb (download and rename to remove the _.txt extension)
Or this other example where you can create a simple phylogenetic tree:
EM_simple_phylogenetic_tree.ipynb (download and rename to remove the _.txt extension)
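As a rough preview of the kind of analysis in the proteome notebooks above, here is a minimal sketch. It is not the code from the course notebooks. It assumes a FASTA file (optionally gzip-compressed) with one-letter protein sequences, and it uses approximate textbook values for the average residue masses in daltons. The function names `read_fasta` and `proteome_stats` and the file name in the usage note are hypothetical.

```python
import gzip
from collections import Counter

# Approximate average residue masses in Da (amino acid minus one water).
# Assumption: standard textbook values, not taken from the course material.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open text handle in FASTA format."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def proteome_stats(path):
    """Return (number of proteins, mean protein length, mean residue mass in Da)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    n_proteins = 0
    total_length = 0
    with opener(path, "rt") as handle:
        for _, seq in read_fasta(handle):
            n_proteins += 1
            total_length += len(seq)
            counts.update(seq)
    # Only the 20 standard residues enter the mass average;
    # other symbols (X, B, *, ...) are skipped.
    n_standard = sum(counts[aa] for aa in RESIDUE_MASS)
    total_mass = sum(counts[aa] * RESIDUE_MASS[aa] for aa in RESIDUE_MASS)
    mean_length = total_length / n_proteins if n_proteins else 0.0
    mean_mass = total_mass / n_standard if n_standard else 0.0
    return n_proteins, mean_length, mean_mass
```

With a downloaded proteome saved under a hypothetical name such as `proteome.fasta.gz`, calling `proteome_stats("proteome.fasta.gz")` returns the three values as a tuple. The mass average is weighted by how often each residue occurs in the proteome, which is why it can differ from the plain mean of the 20 table values.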
For more functionality, or in case the JGU Jupyter server fails, we will be ready to use Google Colab. Please open a Colab account in advance: https://colab.research.google.com/
We will probably use some AI during the class. Open a free account at https://chatgpt.com/ and try it in advance. Note: do not use it on the day of the class before the class starts, because the number of queries per day is very limited.
Some useful links we will use during the class:
- UniProt: https://www.uniprot.org/
- UCSC: https://genome.ucsc.edu/
- BLAT (UCSC): https://genome.ucsc.edu/cgi-bin/hgBlat
- BLAT (UCSC European mirror): https://genome-euro.ucsc.edu/cgi-bin/hgBlat
- NCBI: https://www.ncbi.nlm.nih.gov/
- NCBI BLAST: https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins
- Blastp (NCBI): https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&BLAST_SPEC=&LINK_LOC=blasttab&LAST_PAGE=blastp
- T-Coffee (MSA): https://tcoffee.crg.eu/
- Clustal Omega: https://www.ebi.ac.uk/jdispatcher/msa/clustalo
- Seaview: https://doua.prabi.fr/software/seaview
- Dotlet: http://dotlet.vital-it.ch
- NCBI taxonomy: https://www.ncbi.nlm.nih.gov/taxonomy
- Gene Ontology: https://geneontology.org/
- QuickGO: https://www.ebi.ac.uk/QuickGO/help
- AmiGO 2: https://amigo.geneontology.org/amigo
- Timetree: http://www.timetree.org/
Day 2 / 26 November Tuesday / 9:00 - 17:00
Biostatistics/Statistical genetics.
Software required: R / RStudio Desktop
R-script: Basics_in_R_1.R (download and rename to remove the .txt extension)
Slides: biostatistics_course_2024
Exercises: excercises_biostatistics
Solutions: Solutions_Basics_in_R.R
Day 3 / 27 November Wednesday / 9:00 - 17:00
Pairwise alignment. Multiple sequence alignment. Phylogeny.
Same requirements as Day 1 (25 November).
Day 4 / 28 November Thursday / 9:00 - 17:00
Protein structure, representation, protein domains, disorder.
Software required: Chimera / JalView
Slides (Andrade): lesson1_chimera_7 | lesson2_domains_11
Slides (Eric Schumbera): IDR_and_LLPS_lecture_2024_Eric_Schumbera
Day 5 / 29 November Friday / 9:00 - 17:00
Protein structure prediction, low complexity, repeats. miRNAs.
Data files (Andrade; repeats): MR1_fasta
Slides (Andrade): lesson3_model3D_12_MSc | lesson4_repeats_6 | lesson5_repeatsdbs_7
Data files (Mert Cihan; miRNA): sequences_mirna.fa
Slides (Mert Cihan): mirna_2024_december
Day 6 / 2 December Monday / 9:00 - 17:00
Protein interaction networks
Software required: Cytoscape
- Slides (Katja Luck):
- Files (Katja Luck):
Slides (Emily Vagiona): MSc_module_P&B_2024_Function_PPIs
Day 7 / 3 December Tuesday / 9:00 - 17:00
Programming with R (I)
Software required: R / RStudio Desktop
Tutorial (Johannes Wolter): Programming_with_R_Students
Day 8 / 4 December Wednesday / 9:00 - 17:00
RNAseq.
Slides (Federico Marini): https://seafile.rlp.net/d/aa1de6f4a9f746978989/
Dynamic modelling.
Slides (Alex Anyaegbunam): DynamicModelling_ProteinKinetics
Day 9 / 5 December Thursday / 9:00 - 17:00
Programming with R (II).
Data Mining
Exercise material (Piyush More): PMo-DataMining-Exercise
Slides (Piyush More): 01_PPT-DataMining-Biomedicine
Day 10 / 6 December Friday / 9:00 - 13:00
Proteomics.
Slides (Ute Distler): https://seafile.rlp.net/f/3c17c782a03446e19026/?dl=1
Slides (Stefan Tenzer): https://seafile.rlp.net/f/53a584868c3a419bb7ac/
Software required:
Please install the software on your remote desktop and test it before the corresponding lesson, using https://apps.zdv.uni-mainz.de/. If you have problems installing any of the listed software, email the corresponding contact person (in brackets).
R: (Programming with R; Johannes Wolter)
RStudio Desktop: (Programming with R; Johannes Wolter)
Chimera: (protein 3D representation; Miguel Andrade)
JalView: (alignment and structure representation; Miguel Andrade)
Cytoscape: (Protein networks; Katja Luck)
Links:
Protein Data Bank (PDB): http://www.rcsb.org/
UniProt: https://www.uniprot.org/