We're Hiring
MLOps Engineer (Deployment, Monitoring, Model Risk Controls)
LLM Lifecycle Management, Model Observability, and AI Safety
Remote / Hybrid
Full-Time
The Mission
The jump from a "Local AI Prototype" to an "Enterprise SaaS Feature" is where most AI projects fail.
At DuskByte, we bridge that gap. As an MLOps Engineer, you are the architect of the AI production line. You will ensure that the machine learning models and LLMs we integrate into modernized SaaS platforms are as stable, secure, and observable as the core legacy code we refactor.
What You Will Do (The Role)
Model Deployment Pipelines
Design and maintain automated CI/CD/CT (Continuous Training) pipelines for LLMs and custom ML models using Kubeflow, BentoML, or SageMaker.
AI Observability & Monitoring
Implement real-time tracking for Model Drift, Latency, and Hallucination rates to ensure the AI remains an asset, not a liability.
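To give a flavor of the drift-tracking work, here is a minimal sketch of one common drift signal, the Population Stability Index (PSI), which compares the distribution of a feature or model score in live traffic against a reference window. The function name `psi`, the equal-width binning, and the `eps` floor are illustrative assumptions, not a description of DuskByte's actual tooling.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a reference and a live sample.

    Bins are equal-width and derived from the reference sample only
    (an assumption of this sketch); live values outside that range
    are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        eps = 1e-6  # floor so empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    e = fractions(expected)
    a = fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A PSI near zero means the live distribution matches the reference; larger values indicate a shift. The alerting threshold is a judgment call that depends on the feature and the cost of a false alarm.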
Model Risk Controls
Build "Guardrail" systems (using NeMo Guardrails or custom logic) to prevent prompt injection, data leakage, and biased outputs.
Vector Database Engineering
Optimize and scale high-performance Vector DBs (Pinecone, Weaviate, or pgvector) to power RAG (Retrieval-Augmented Generation) at enterprise scale.
GPU & Cost Optimization
Manage the high cost of AI by right-sizing inference clusters and implementing caching layers to reduce token consumption.
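As an illustration of the caching idea, the sketch below shows an exact-match response cache that avoids paying for the same completion twice. The `PromptCache` class and the `generate` callable are hypothetical stand-ins for whatever client actually calls the model; a production layer would also need expiry, size limits, and likely semantic (embedding-based) matching.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Exact-match cache: repeated prompts reuse a stored completion."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate  # hypothetical function that calls the model
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Normalize case and whitespace so trivial variations share one entry.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def complete(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: pay for one model call and remember the result.
        self.misses += 1
        result = self._generate(prompt)
        self._store[key] = result
        return result
```

The hit and miss counters make it straightforward to report how many model calls, and therefore tokens, the cache saved.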
The MLOps Tech Stack
You are the master of the AI infrastructure.
Orchestration & Workflow
Kubeflow, Airflow, or MLflow.
Model Serving
AWS SageMaker, Google Vertex AI, or Azure Machine Learning.
Vector Infrastructure
Pinecone, Weaviate, Milvus, or pgvector.
LLM Tooling
LangChain, LlamaIndex, and OpenAI/Anthropic Enterprise APIs.
Monitoring & Safety
Weights & Biases, Arize Phoenix, or WhyLabs for model health.
Core Languages
Python (mastery), Go, and Bash.
Who You Are (Requirements)
The "Stability-First" Data Scientist
You care more about a model's P99 latency and reliability than its sheer complexity.
The AI Safety Advocate
You understand the legal and ethical risks of AI in the enterprise and build systems that can be audited and "turned off" instantly if they deviate.
The Production Specialist
You have moved models out of Jupyter Notebooks and into high-traffic production environments where millions of users interact with them daily.
Experience
8+ years in DevOps or Data Science, with at least 2 years focused specifically on MLOps or AI infrastructure.
Why This Role is Critical at DuskByte
You are the reason our "Modernization" isn't just about catching up to the present—it's about leaping into the future. You enable us to add StartDeck-level intelligence to legacy platforms safely. You turn AI from a "cool experiment" into a Mission-Critical Enterprise Feature.
© 2026 DuskByte. Engineering stability for complex platforms.