This repository collects the projects and assignments I completed during an intensive Data Science bootcamp. It covers the full data analysis lifecycle, from environment setup and data engineering to machine learning modeling.
Its goal is to demonstrate a solid foundation in Python, SQL, and machine learning, along with the ability to solve practical analytical problems.
| Module | Topic | Key Skills & Tools |
|---|---|---|
| DSB1-DSB3 | Software Engineering Basics | UNIX, Bash, Python OOP, Syntax & Semantics |
| DSB4-DSB5 | Development Practices | Package management (pip/conda), Efficient code |
| DSB6 | MovieLens Analytics | Exploratory Data Analysis (EDA), Recommender basics |
| DSB7-DSB9 | Data Manipulation & SQL | Pandas, SQL queries, Data Visualization |
| DSB10-DSB11 | Machine Learning | Supervised/Unsupervised Learning, Feature Engineering |
| DSB12 | Food & Nutrition Project | Recommendation System & Domain-specific insights |
- Implemented various ML algorithms to solve classification and regression tasks.
- Focused on model interpretability and hyperparameter tuning.
- Applied feature engineering techniques to improve model performance.
- Built complex SQL queries to extract insights from relational databases.
- Built data cleaning and transformation pipelines with Pandas.
- Automated repetitive data processing tasks.
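
As an illustration of the kind of SQL query work listed above, here is a minimal sketch using Python's built-in `sqlite3` module. It is not taken from the repository: the `movies` and `ratings` tables, their columns, the sample rows, and the helper names `build_demo_db` and `top_rated_movies` are all hypothetical, loosely modeled on a MovieLens-style schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical MovieLens-style schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE movies (
            movie_id INTEGER PRIMARY KEY,
            title    TEXT NOT NULL
        );
        CREATE TABLE ratings (
            user_id  INTEGER,
            movie_id INTEGER REFERENCES movies(movie_id),
            rating   REAL
        );
        """
    )
    # Made-up sample rows, used only to make the sketch runnable.
    conn.executemany(
        "INSERT INTO movies VALUES (?, ?)",
        [(1, "Alpha"), (2, "Beta"), (3, "Gamma")],
    )
    conn.executemany(
        "INSERT INTO ratings VALUES (?, ?, ?)",
        [(1, 1, 5.0), (2, 1, 4.0), (1, 2, 3.0), (2, 2, 2.0), (3, 2, 4.0), (1, 3, 5.0)],
    )
    conn.commit()
    return conn


def top_rated_movies(conn, min_ratings=2):
    """Return (title, average rating, rating count) rows, highest average first.

    Movies with fewer than `min_ratings` ratings are filtered out with HAVING,
    so a single enthusiastic rating cannot dominate the ranking.
    """
    query = """
        SELECT m.title,
               ROUND(AVG(r.rating), 2) AS avg_rating,
               COUNT(*)                AS n_ratings
        FROM movies AS m
        JOIN ratings AS r ON r.movie_id = m.movie_id
        GROUP BY m.movie_id, m.title
        HAVING COUNT(*) >= ?
        ORDER BY avg_rating DESC, m.title
    """
    # The threshold is passed as a bound parameter rather than formatted into the SQL.
    return conn.execute(query, (min_ratings,)).fetchall()


if __name__ == "__main__":
    connection = build_demo_db()
    for row in top_rated_movies(connection):
        print(row)
```

The join plus `GROUP BY`/`HAVING` pattern is one common way to turn raw rating rows into a ranked summary; the threshold of two ratings is an arbitrary choice for this demo data.
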
- Languages: Python, SQL, Bash
- Data Analysis: Pandas, NumPy
- Machine Learning: Scikit-learn
- Visualization: Matplotlib, Seaborn, Plotly
- Tools: Git, Jupyter Notebooks, Visual Studio Code
- Clone the repository: `git clone https://github.com/knight99rus/Data-Science-Bootcamp.git`
- Explore the modules: each folder contains a self-contained project or set of exercises.
- Author: Ivan Terekhov
- GitHub: knight99rus
- LinkedIn: vanya_terekhov
- Telegram: knight99rus