Machine learning has become part of our everyday lives, from simple product recommendations to personal electronic
assistants to self-driving cars. More recently, with the advent of powerful hardware and cheap computational power,
“Deep Learning” has become a popular and powerful tool for learning from complex, large-scale data.
In this course, we will discuss the fundamentals of deep learning and its application to various fields.
We will learn about the power, but also the limitations, of deep neural networks. At the end of the course,
students will be familiar with the subject and will be able to apply the techniques they have learned to a
broad range of fields.
Mainly, the following topics will be covered:
- Linear and Logistic Regression
- Feedforward neural networks
- Recurrent neural networks
- Convolutional neural networks
- Backpropagation algorithm
- Modern architectures
The Deep Learning course comes in two different flavors:
- 10 ECTS course for Computer Science Students (DM873)
- 5 ECTS course for Data Science Students (DS809)
The data science version of the course consists of the first half of the larger 10 ECTS version. While DM873 involves
two projects, the DS809 version has only one project. DS809 students are nevertheless invited to join the lectures
after their course has been completed.
Course materials and the schedule will be published on the itslearning platform during the semester.
Deep Learning Summer School
We also offer Deep Learning as a summer school course, normally held in the 2nd and 3rd week of August. Data Science
students are invited to take this course instead of DS809. As a regular summer school visitor, you will have to submit
a final project to pass the course. As a Data Science student, you will submit the project and additionally
have to pass the oral exam of DS809 in order to make these two courses interchangeable.
How to Apply
Are you thinking of applying for the SDU summer school and joining two weeks of fun and learning? Please follow the
instructions on how to apply on the official website: https://www.sdu.dk/summerschool.
In principle, all students in the 2nd or 3rd year of their Bachelor education are eligible for the summer course. You
should be knowledgeable in programming, in particular in Python. You should also not be afraid of a bit of math in
your curriculum, so that you can fully understand and follow the course. When in doubt, just write us an email and we
can help you with your decision.
Introduction to Bioinformatics
The purpose of this course is to give an understanding of computational problems in modern biomedical research.
We will start with concrete medical questions, develop a formal problem description, set up an algorithmic/statistical
model, solve it, and subsequently derive real-world answers from the solved model. The course aims to give a
basic understanding of which problems arise in modern molecular biology and clinical research, and how these problems
can be solved with appropriate computational tools. It is a class that needs regular attendance. Admission to the
exam requires the preparation of the exercise sheets as well as the course project.
Expected Learning Outcome
- Explain and understand the central dogma of molecular biology, central aspects of gene regulation, the basic principle of epigenetic DNA modifications, and the special features of bacterial and phage genetics
- Model ontologies for biomedical data dependencies
- Design systems biology databases
- Explain and implement DNA & amino acid sequence analysis methods (HMMs, scoring matrices, and efficient statistics with them on data structures like suffix arrays)
- Explain and implement statistical learning methods on biological networks (network enrichment)
- Explain the special features of bacterial genetics (the operon prediction trick)
- Explain and implement methods for suffix trees, suffix arrays, and the Burrows-Wheeler transformation
- Explain de novo sequence pattern screening with the EM algorithm and entropy models
- Explain and implement basic methods for supervised and unsupervised data mining, as well as their application to biomedical OMICS data sets
The following main topics are contained in the course:
- Central dogma of molecular genetics, epigenetics, and bacterial and phage genetics
- DNA and amino acid sequence pattern models (HMMs, scoring matrices, mixed models, efficient statistics with them on big data sets)
- Special features of bacterial genetics (sequence models and functional models for operon prediction)
- De novo identification of transcription factor binding motifs (recursive expectation maximization, entropy-based models)
- Analysis of next-generation DNA sequencing data sets (memory-aware short sequence read mapping with the Burrows-Wheeler transformation and suffix arrays, bi-modal peak calling)
- Visualization of biological networks (graph layout: small but highly variable graphs vs. huge but rather static graphs)
- Systems biology and statistics on networks (network enrichment with CUSP, jActiveModules and KeyPathwayMiner)
- Basic supervised and unsupervised classification methods for OMICS data analysis
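To give a flavor of the sequence methods listed above, the following is a minimal Python sketch of the Burrows-Wheeler transformation and its inverse. It is an illustration only, not course material: the function names `bwt` and `inverse_bwt` and the `$` end marker are assumptions made for this example, and the naive approach of sorting all rotations is far too slow and memory-hungry for real sequencing data, where suffix arrays are used instead.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transformation via sorted rotations.

    The sentinel (an assumption of this sketch) marks the end of the text
    and must sort before every other character.
    """
    if sentinel in text:
        raise ValueError("text must not contain the sentinel character")
    s = text + sentinel
    # Build every cyclic rotation of s and sort them lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Invert the transformation by rebuilding the rotation table column by column."""
    n = len(transformed)
    table = [""] * n
    for _ in range(n):
        # Prepend the transformed column to every row, then re-sort.
        table = sorted(transformed[i] + table[i] for i in range(n))
    # The original text is the row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    encoded = bwt("banana")
    print(encoded)               # annb$aa
    print(inverse_bwt(encoded))  # banana
```

The transformed string tends to group identical characters together, which is what makes it useful for compression and for memory-aware read mapping.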
The course introduces the student to the architecture of general purpose computers, from the logic level through the
microprogramming level to the conventional ISA level. The major components of the storage hierarchy, bus
architectures, and the organization of pipelined CPUs are also presented. In addition, the main aspects of a system
programming language are introduced.
The student will obtain insight into the organization of modern computers and their CPUs, in order to be able to
compare and evaluate their performance on a level independent of the specific technology. More specifically, the
course provides the following competences:
- to understand basic logic diagrams, and to express the functionality of basic CPU components in terms of such diagrams.
- to express the functionality of an ISA level instruction by interpretation on an underlying (micro)architecture.
- to be able to interpret ordinary binary integer and floating point number representations, and to be able to convert between these.
- to know and be able to explain the properties and limitations of the different storage components, including their addressing, and to evaluate the performance of a multi-level storage hierarchy.
- to be able to explain and discuss the exploitation of parallelism in the form of pipelining, its limitations, and the distribution of tasks on multiple functional units.
- to be able to explain and discuss the internal organization and internal communication paths at a high level, including communication with external units and interrupts from these.
- to express the functionality of a given algorithm as an assembler program, and to bring such a program to execution on a specific machine.
- to express the functionality of a given algorithm as a system program, and to bring such a program to execution on a specific machine.
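The competence of interpreting binary integer and floating point representations can be illustrated with a short Python sketch. It assumes two's complement for signed integers and the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits) for floating point numbers; the course description does not name specific formats, and the function names are hypothetical.

```python
import struct


def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's complement bit string of a signed integer."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit into {bits} bits")
    # Masking maps negative values onto their unsigned bit pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(bit_string: str) -> int:
    """Interpret a bit string as a two's complement signed integer."""
    raw = int(bit_string, 2)
    # A leading 1 means the value is negative: subtract 2**bits.
    if bit_string[0] == "1":
        raw -= 1 << len(bit_string)
    return raw


def float32_fields(x: float) -> tuple[int, int, int]:
    """Split a number into IEEE 754 single-precision sign, exponent, fraction."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    exponent = (raw >> 23) & 0xFF   # stored with a bias of 127
    fraction = raw & 0x7FFFFF       # 23 bits after the implicit leading 1
    return sign, exponent, fraction


if __name__ == "__main__":
    print(to_twos_complement(-5))            # 11111011
    print(from_twos_complement("11111011"))  # -5
    print(float32_fields(-0.15625))          # (1, 124, 2097152)
```

In the last call, -0.15625 equals -1.25 × 2^-3, so the stored exponent is 127 - 3 = 124 and the fraction field encodes the 0.25 after the implicit leading 1.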
One trend can be observed over almost all fields of informatics: we have to cope with an ever-increasing amount of
available data of all kinds. This amount of data renders it impossible to inspect the dataset “by hand”, or even
deduce knowledge from the given data, without sophisticated computer-aided help. In this course we will discuss one
of the most common mechanisms of unsupervised machine learning for investigating datasets: Clustering. Clustering
separates a given dataset into groups of similar objects, the clusters, and thus allows for a better understanding
of the data and their structure. We discuss a number of clustering methods and their application to various
different fields such as biology, economics or sociology.
- Mathematical Foundations
- Detecting Clusters Graphically
- Principal Component Analysis & Principal Coordinate Analysis
- Proximity Measures
- Hierarchical Clustering
- Optimization Based Clustering
- Gaussian Mixture Models
- Cluster Analysis Pipeline
- Advanced clustering methods, such as bi-clustering and subspace clustering
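As an illustration of optimization-based clustering, the following is a minimal Python sketch of k-means (Lloyd's algorithm) using Euclidean distance as the proximity measure. It is one possible example of the topic, not necessarily the method taught in the course. The function name `kmeans` is hypothetical, and taking the first k points as initial centroids is a simplifying assumption made to keep the example deterministic; practical implementations use random or k-means++ initialization.

```python
import math


def kmeans(points, k, iterations=100):
    """Cluster points into k groups; returns (assignment, centroids)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption of this sketch: the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed its cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return assignment, centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centers = kmeans(data, k=2)
    print(labels)  # [0, 0, 0, 1, 1, 1]
```

Each iteration can only lower the sum of squared distances between points and their centroids, which is why the procedure is regarded as an optimization-based method; it converges to a local optimum that depends on the initialization.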