Python Data Engineer (Pipelines, Data Quality, Cloud Workflows)
Automated Data Pipelines, Data Validation (Great Expectations), and Cloud-Native Workflows.
Remote / Hybrid
Full-Time
The Mission
In a modernized enterprise, "Garbage In, Garbage Out" is a business-killing risk. At DuskByte, we treat data as code. As a Python Data Engineer, you will be the architect of our "Data Reliability" layer. You will build the sophisticated Python-based pipelines that ensure data from legacy systems is cleaned, transformed, and delivered to modern cloud environments with 100% integrity and zero manual intervention.
What You Will Do (The Role)
Automated Pipeline Engineering
Design and implement robust ELT/ETL workflows using Python (Pandas, PySpark, Dask) and modern orchestrators like Airflow, Dagster, or Prefect.
Data Quality Frameworks
Build automated "Circuit Breakers" for data using frameworks like Great Expectations or Pandas Profiling to halt pipelines if data quality drops.
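The "circuit breaker" idea can be sketched in a few lines of plain pandas: a quality gate that raises rather than passing suspect data downstream. This is a minimal illustration of the pattern; in practice a framework like Great Expectations would replace the hand-rolled checks, and the thresholds here are assumptions.

```python
import pandas as pd

class DataQualityError(Exception):
    """Raised to halt the pipeline when a quality check fails."""

def quality_gate(df: pd.DataFrame, max_null_fraction: float = 0.05) -> pd.DataFrame:
    """Circuit breaker: stop the run instead of shipping bad data downstream."""
    null_fraction = df.isna().mean().max()  # null rate of the worst column
    if null_fraction > max_null_fraction:
        raise DataQualityError(
            f"Null fraction {null_fraction:.1%} exceeds {max_null_fraction:.1%}"
        )
    if df.duplicated().any():
        raise DataQualityError("Duplicate rows detected")
    return df  # checks passed; downstream steps may proceed

# Example: 50% nulls in one column trips the breaker
batch = pd.DataFrame({"id": [1, 2], "amount": [10.0, None]})
try:
    quality_gate(batch)
except DataQualityError as e:
    print(f"Pipeline halted: {e}")
```

The key design choice is that the gate raises an exception the orchestrator can see, so a failed check marks the task red and pages someone instead of failing silently.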
Legacy-to-Cloud Integration
Write custom Python connectors to extract data from aging APIs, flat files, or legacy SQL databases and stream them into AWS, GCP, or Azure.
Workflow Modernization
Transition brittle, manual cron jobs into scalable, containerized cloud workflows using Docker and Kubernetes.
Performance Tuning
Optimize Python processing scripts for memory efficiency and execution speed, ensuring cost-effective cloud resource usage.
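One common memory-efficiency technique is chunked streaming: processing a large file in fixed-size pieces so peak memory stays bounded regardless of input size. A minimal sketch with pandas (the column names `region` and `amount` are illustrative assumptions):

```python
import pandas as pd

def aggregate_in_chunks(path: str, chunksize: int = 100_000) -> pd.Series:
    """Stream a large CSV in fixed-size chunks, aggregating as we go,
    so memory usage is bounded by chunksize rather than file size."""
    partial_sums: list[pd.Series] = []
    for chunk in pd.read_csv(path, chunksize=chunksize):  # lazy iterator of DataFrames
        partial_sums.append(chunk.groupby("region")["amount"].sum())
    # Combine the per-chunk partial aggregates into one final result
    return pd.concat(partial_sums).groupby(level=0).sum()
```

The same shape scales to cloud workloads: bounded chunks mean a small, predictable container memory request instead of provisioning for the whole dataset.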
The Data Architect Tech Stack
You are a master of the Python data ecosystem:
The Core
Advanced Python 3.x (Asynchronous programming, Generators, and Type Hinting).
Data Processing
Pandas, PySpark, Polars, or Dask.
Orchestration
Apache Airflow, Dagster, or Prefect.
Database Mastery
Expert SQL (PostgreSQL/MySQL) and experience with NoSQL (Redis, MongoDB).
Cloud Infrastructure
Professional experience with AWS Glue/Lambda, GCP Cloud Functions/Dataflow, or Azure Data Factory.
Validation
Great Expectations, Pydantic, or dbt tests.
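As a taste of the validation layer, here is a minimal Pydantic sketch: records that match the schema become typed models, and rejects are quarantined rather than silently dropped. The `Order` schema and its fields are hypothetical, purely for illustration.

```python
from pydantic import BaseModel, ValidationError

class Order(BaseModel):
    """Illustrative schema contract for one incoming record."""
    order_id: int
    customer: str
    amount: float

def validate_records(records: list[dict]) -> tuple[list[Order], list[dict]]:
    """Split raw rows into validated models and a reject pile
    (a dead-letter set to inspect, instead of a silent failure)."""
    valid, rejected = [], []
    for row in records:
        try:
            valid.append(Order(**row))
        except ValidationError:
            rejected.append(row)
    return valid, rejected

valid, rejected = validate_records([
    {"order_id": 1, "customer": "acme", "amount": "19.99"},  # string coerced to float
    {"order_id": "oops", "customer": "x", "amount": 5},      # fails int coercion
])
```

The same contract doubles as documentation: the schema lives in code, is versioned with the pipeline, and fails loudly the moment an upstream system changes shape.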
Who You Are (Requirements)
The "Clean Code" Dataist
You don't just write scripts; you write maintainable, tested, and documented Python code. You believe in PEP 8 and CI/CD for data.
The Integrity Obsessive
You hate "silent failures." You build monitoring and alerting into every stage of the data lifecycle.
The Systems Architect
You understand how data moves through a network and how to handle API rate limits, retries, and backoffs gracefully.
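The rate-limit handling mentioned above usually means exponential backoff with jitter. A self-contained stdlib sketch (the retried exception type and delay constants are assumptions; real connectors would retry on their HTTP client's specific errors):

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_backoff(call: Callable[[], T], max_attempts: int = 5,
                 base_delay: float = 0.5) -> T:
    """Retry a flaky call with exponential backoff plus jitter, the
    standard defense against API rate limits and transient faults."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure
            # Delay doubles each attempt; jitter desynchronizes parallel retriers
            delay = base_delay * (2 ** attempt) * (0.5 + random.random())
            time.sleep(delay)
    raise RuntimeError("unreachable")
```

Jitter matters as much as the doubling: without it, every worker that hit the same rate limit retries at the same instant and re-triggers it.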
Experience
8+ years of professional Python development with a heavy focus on data engineering or backend systems.
Why This Role is Critical at DuskByte
You are the "Protector of Truth." You ensure that when our MLOps Engineers build a model or our Full-Stack Engineers build a dashboard, the data they are using is accurate, fresh, and secure. You turn "Messy Legacy Data" into "High-Value Enterprise Assets."