Courses of the Chair of Efficient Algorithms
Data Science (CITHN1007)
| Lecturer (assistant) | |
|---|---|
| Number | 0000002458 |
| Type | Lecture |
| Duration | 4 SWS (semester hours per week) |
| Semester | Winter semester 2025/26 |
| Language of instruction | English |
| Position within curricula | See TUMonline |
Participation criteria
Description
As larger and more complex datasets become widely available, it is increasingly important for practitioners to understand the fundamentals of data science. The field covers a broad range of topics relating to data: acquiring it, processing it, analyzing it, visualizing it, and communicating it. A key benefit of data science is that applying rigorous mathematical methods to observed real-world phenomena lets us make better decisions, predict outcomes, and improve efficiency. We will cover the following topics:
- Introduction to Data Science: Defining data, data science, and key terminology. Understanding the role of data in decision-making and the data analysis pipeline.
- Data Types and Structures: Identifying numerical, ordinal, categorical, textual, and relational data, along with best practices for data cleaning, representation, and encoding. Using Python, Jupyter notebooks, and common libraries.
- Statistical Analysis and Testing: Scalar numerical analysis, A/B testing, hypothesis testing (parametric, non-parametric, bootstrap), and linear regression.
- Multivariate Analysis: Techniques for dimensionality reduction (linear and non-linear). Clustering algorithms and classification methods may also be studied.
- Relational Data Analysis: Understanding graph structures, layout algorithms, and centrality measures. Block models, diffusion processes, clustering, and anomaly detection may also be studied.
- Machine Learning for Data Science: Neural networks for regression and classification. Graph neural networks may also be studied.
- Interpreting Results: Techniques for understanding correlations, probabilities, confidence intervals, and the significance of findings.
- Visual Analytics: Principles of effective data visualization, interactivity, and data communication.
- Ethical Considerations: Awareness of common data science pitfalls, including p-hacking, data privacy, and the ethical implications of data-driven decisions.
Previous knowledge expected
Foundations of Algorithms and Data Structures
Computational Mathematics I: Linear Algebra
Computational Mathematics II: Calculus