Home Backend Development Python Tutorial Magic and Muscles: ETL with Magic and DuckDB with data from my powerlifting training

Magic and Muscles: ETL with Magic and DuckDB with data from my powerlifting training

Jul 17, 2024 pm 12:39 PM

You can acecces the full pipeline here

Mage

In my last post, i writed about a dashboard I builted using Python and Looker Studio, to visualize my powrlifting training data. In this post I will walk you through the step by step of a ETL (Extract, Transform, Load) pipeline, using the same dataset.

To build the pipeline we will use Mage to orchestrate the pipeline and Python for transforming and loading the data, as a last step we will export the transformed data into a DuckDB database.

To execute Mage, we're a gointo use the oficial docker image:

docker pull mageai/mageai:latest
Copy after login

The pipeline will look like this:

Image description

Extract

The extraction will be simple, we just have to read a csv file and create a pandas dataframe with it, so we can procide to the next steps. Using the Data loader block, we already have a tamplate to work with, just remember to set the "sep" parameter int he read_csv( ) function, só the data is loaded correct.

from mage_ai.io.file import FileIO
import pandas as pd

if 'data_loader' not in globals():

    from mage_ai.data_preparation.decorators import data_loader

if 'test' not in globals():

    from mage_ai.data_preparation.decorators import test

@data_loader
def load_data_from_file(*args, **kwargs):

    filepath = 'default_repo/data_strong.csv'
    df = pd.read_csv(filepath, sep=';')  

    return df

@test
def test_output(output, *args) -> None:
    assert output is not None, 'The output is undefined'`

Copy after login

Transform

In this step, using the Transformer block, that have a lot of templates to chose from, we will select a custom template.

The transformation we have to do is basicly the maping of the Exercise Name column, so we can identify which body part correspond to the specific exercise.

import pandas as pd

if 'transformer' not in globals():

    from mage_ai.data_preparation.decorators import transformer

if 'test' not in globals():

    from mage_ai.data_preparation.decorators import test

body_part = {'Squat (Barbell)': 'Pernas',

    'Bench Press (Barbell)': 'Peitoral',

    'Deadlift (Barbell)': 'Costas',

    'Triceps Pushdown (Cable - Straight Bar)': 'Bracos',

    'Bent Over Row (Barbell)': 'Costas',

    'Leg Press': 'Pernas',

    'Overhead Press (Barbell)': 'Ombros',

    'Romanian Deadlift (Barbell)': 'Costas',

    'Lat Pulldown (Machine)': 'Costas',

    'Bench Press (Dumbbell)': 'Peitoral',

    'Skullcrusher (Dumbbell)': 'Bracos',

    'Lying Leg Curl (Machine)': 'Pernas',

    'Hammer Curl (Dumbbell)': 'Bracos',

    'Overhead Press (Dumbbell)': 'Ombros',

    'Lateral Raise (Dumbbell)': 'Ombros',

    'Chest Press (Machine)': 'Peitoral',

    'Incline Bench Press (Barbell)': 'Peitoral',

    'Hip Thrust (Barbell)': 'Pernas',

    'Agachamento Pausado ': 'Pernas',

    'Larsen Press': 'Peitoral',

    'Triceps Dip': 'Bracos',

    'Farmers March ': 'Abdomen',

    'Lat Pulldown (Cable)': 'Costas',

    'Face Pull (Cable)': 'Ombros',

    'Stiff Leg Deadlift (Barbell)': 'Pernas',

    'Bulgarian Split Squat': 'Pernas',

    'Front Squat (Barbell)': 'Pernas',

    'Incline Bench Press (Dumbbell)': 'Peitoral',

    'Reverse Fly (Dumbbell)': 'Ombros',

    'Push Press': 'Ombros',

    'Good Morning (Barbell)': 'Costas',

    'Leg Extension (Machine)': 'Pernas',

    'Standing Calf Raise (Smith Machine)': 'Pernas',

    'Skullcrusher (Barbell)': 'Bracos',

    'Strict Military Press (Barbell)': 'Ombros',

    'Seated Leg Curl (Machine)': 'Pernas',

    'Bench Press - Close Grip (Barbell)': 'Peitoral',

    'Hip Adductor (Machine)': 'Pernas',

    'Deficit Deadlift (Barbell)': 'Pernas',

    'Sumo Deadlift (Barbell)': 'Costas',

    'Box Squat (Barbell)': 'Pernas',

    'Seated Row (Cable)': 'Costas',

    'Bicep Curl (Dumbbell)': 'Bracos',

    'Spotto Press': 'Peitoral',

    'Incline Chest Fly (Dumbbell)': 'Peitoral',

    'Incline Row (Dumbbell)': 'Costas'}


@transformer
def transform(data, *args, **kwargs):
    strong_data = data[['Date', 'Workout Name', 'Exercise Name', 'Weight', 'Reps',    'Workout Duration']]
    strong_data['Body part'] = strong_data['Exercise Name'].map(body_part)

    return strong_data

@test
def test_output(output, *args) -> None:
    assert output is not None, 'The output is undefined'

Copy after login

An intresting feature of Mage is that we can visualize the changes we are making inside the Tranformer block, using the Add chart opting it's possible to generate a pie graph using the column Body part.

Image description

Load

Now is time to load the data to DuckDB. In the docker image we already have DuckDB, so we just have to include another block in our pipeline. Lets inlcude the Data Exporter block, with a SQL template so we can create a table and insert the data.

CREATE OR REPLACE TABLE powerlifting 
(
    _date DATE,
    workout_name STRING,
    exercise_name STRING,
    weight STRING,
    reps STRING,
    workout_duration STRING,
    body_part STRING
);

INSERT INTO powerlifting SELECT * FROM {{ df_1 }};
Copy after login

Conclusion

Mage is a powrfull tool to orchestrate pipelines, provide a complete set of templates for developing especific tasks involving ETL. In this post we had a breathh explanation about how to get start using Mage to build pipelines of data. In future post we're gointo explore more about this amizing framework.

The above is the detailed content of Magic and Muscles: ETL with Magic and DuckDB with data from my powerlifting training. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Roblox: Bubble Gum Simulator Infinity - How To Get And Use Royal Keys
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Nordhold: Fusion System, Explained
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Mandragora: Whispers Of The Witch Tree - How To Unlock The Grappling Hook
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial
1672
14
PHP Tutorial
1276
29
C# Tutorial
1256
24
Python vs. C  : Learning Curves and Ease of Use Python vs. C : Learning Curves and Ease of Use Apr 19, 2025 am 12:20 AM

Python is easier to learn and use, while C is more powerful but complex. 1. Python syntax is concise and suitable for beginners. Dynamic typing and automatic memory management make it easy to use, but may cause runtime errors. 2.C provides low-level control and advanced features, suitable for high-performance applications, but has a high learning threshold and requires manual memory and type safety management.

Learning Python: Is 2 Hours of Daily Study Sufficient? Learning Python: Is 2 Hours of Daily Study Sufficient? Apr 18, 2025 am 12:22 AM

Is it enough to learn Python for two hours a day? It depends on your goals and learning methods. 1) Develop a clear learning plan, 2) Select appropriate learning resources and methods, 3) Practice and review and consolidate hands-on practice and review and consolidate, and you can gradually master the basic knowledge and advanced functions of Python during this period.

Python vs. C  : Exploring Performance and Efficiency Python vs. C : Exploring Performance and Efficiency Apr 18, 2025 am 12:20 AM

Python is better than C in development efficiency, but C is higher in execution performance. 1. Python's concise syntax and rich libraries improve development efficiency. 2.C's compilation-type characteristics and hardware control improve execution performance. When making a choice, you need to weigh the development speed and execution efficiency based on project needs.

Python vs. C  : Understanding the Key Differences Python vs. C : Understanding the Key Differences Apr 21, 2025 am 12:18 AM

Python and C each have their own advantages, and the choice should be based on project requirements. 1) Python is suitable for rapid development and data processing due to its concise syntax and dynamic typing. 2)C is suitable for high performance and system programming due to its static typing and manual memory management.

Which is part of the Python standard library: lists or arrays? Which is part of the Python standard library: lists or arrays? Apr 27, 2025 am 12:03 AM

Pythonlistsarepartofthestandardlibrary,whilearraysarenot.Listsarebuilt-in,versatile,andusedforstoringcollections,whereasarraysareprovidedbythearraymoduleandlesscommonlyusedduetolimitedfunctionality.

Python: Automation, Scripting, and Task Management Python: Automation, Scripting, and Task Management Apr 16, 2025 am 12:14 AM

Python excels in automation, scripting, and task management. 1) Automation: File backup is realized through standard libraries such as os and shutil. 2) Script writing: Use the psutil library to monitor system resources. 3) Task management: Use the schedule library to schedule tasks. Python's ease of use and rich library support makes it the preferred tool in these areas.

Python for Scientific Computing: A Detailed Look Python for Scientific Computing: A Detailed Look Apr 19, 2025 am 12:15 AM

Python's applications in scientific computing include data analysis, machine learning, numerical simulation and visualization. 1.Numpy provides efficient multi-dimensional arrays and mathematical functions. 2. SciPy extends Numpy functionality and provides optimization and linear algebra tools. 3. Pandas is used for data processing and analysis. 4.Matplotlib is used to generate various graphs and visual results.

Python for Web Development: Key Applications Python for Web Development: Key Applications Apr 18, 2025 am 12:20 AM

Key applications of Python in web development include the use of Django and Flask frameworks, API development, data analysis and visualization, machine learning and AI, and performance optimization. 1. Django and Flask framework: Django is suitable for rapid development of complex applications, and Flask is suitable for small or highly customized projects. 2. API development: Use Flask or DjangoRESTFramework to build RESTfulAPI. 3. Data analysis and visualization: Use Python to process data and display it through the web interface. 4. Machine Learning and AI: Python is used to build intelligent web applications. 5. Performance optimization: optimized through asynchronous programming, caching and code

See all articles