3. Context and Scope
============================

This section describes the technical framework and the dataset used in this project. The **MLOps pipeline** developed here ensures that machine learning models can be trained, managed, and deployed efficiently.

==================================
3.1 Adult Income Context
==================================

The **Adult Income Dataset** contains demographic and income-related information. The goal is to build a classification model that predicts whether an individual's income is above or below $50,000.

To better understand the dataset, several Jupyter Notebook analyses are available:

- 📊 Environment for Exploration
- 📊 Data Exploration & Validation
- 📊 Data Preparation
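The classification goal described in this subsection amounts to a binary target. As a minimal sketch, assuming the raw labels use the strings ``>50K`` and ``<=50K`` as in the public release of the dataset (some files add a trailing period), the label could be encoded as follows. The function name ``encode_income`` is hypothetical and not taken from the project code.

```python
def encode_income(label: str) -> int:
    """Map a raw income label to 1 (above 50K) or 0 (at or below 50K).

    Assumption: labels are the strings ">50K" and "<=50K", optionally
    surrounded by whitespace and followed by a trailing period.
    """
    cleaned = label.strip().rstrip(".")
    if cleaned == ">50K":
        return 1
    if cleaned == "<=50K":
        return 0
    # Fail loudly on anything unexpected instead of guessing a class.
    raise ValueError(f"unexpected income label: {label!r}")


# Example usage on a few raw label variants.
raw_labels = [">50K", " <=50K", ">50K.", "<=50K."]
targets = [encode_income(value) for value in raw_labels]
print(targets)
```

Raising an error on unknown labels is a design choice of this sketch: a silent default would hide data quality problems that the validation notebooks are meant to surface.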
==========================
3.2 Technical Context
==========================

.. raw:: html
   :file: _images/context_view.html

🖥️ **Streamlit User Interface:**

- A simple web frontend where users can enter their data.

🚀 **FastAPI (Backend):**

- Receives user inputs and processes them.
- Communicates with the Data Processor to generate predictions.
- Returns the prediction result to the user.

🧠 **Data Processor (Core Component):**

- Trains and stores ML models.
- Performs predictions.
- Stores all data processing steps in MinIO and manages model versions with MLflow.

📦 **MLflow (Model Management):**

- Manages different model versions.
- Stores model metrics and artifacts.

🗄️ **MinIO (Data Versioning):**

- Stores different dataset versions.
- Ensures reproducibility of the pipeline.

**Why is this Context Important?**

This MLOps pipeline is designed to ensure that machine learning models are not just trained once but can be continuously improved and efficiently deployed.
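The request flow between the user interface, the backend, and the Data Processor can be illustrated with a framework-free sketch. It uses only the Python standard library so that it runs anywhere; in the actual system the handler would presumably be a FastAPI route and the model would be loaded through MLflow. All names here (``DataProcessor``, ``handle_prediction_request``, ``toy_model``, the feature keys) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DataProcessor:
    """Hypothetical stand-in for the core component that performs predictions."""

    model: Callable[[Dict[str, object]], int]

    def predict(self, features: Dict[str, object]) -> str:
        # Assumed encoding: the model returns 1 for "above 50K", 0 otherwise.
        return ">50K" if self.model(features) == 1 else "<=50K"


def handle_prediction_request(
    payload: Dict[str, object], processor: DataProcessor
) -> Dict[str, str]:
    """Play the backend's role: receive input, delegate, return the result."""
    if not payload:
        raise ValueError("payload must contain at least one feature")
    return {"prediction": processor.predict(payload)}


def toy_model(features: Dict[str, object]) -> int:
    # Toy rule so the sketch runs end to end; this is not a trained model.
    return 1 if features.get("hours_per_week", 0) > 45 else 0


processor = DataProcessor(model=toy_model)
response = handle_prediction_request({"age": 39, "hours_per_week": 50}, processor)
print(response)
```

Keeping the backend handler separate from the component that owns the model mirrors the split described above: the frontend only ever sees the response dictionary, so the model behind ``DataProcessor`` can be replaced by a newer version without changing the interface.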