Data Engineer
About VOX
VOX is a founder-led company at the forefront of flashcall and telecom carrier services, transforming the way businesses communicate, authenticate, and connect. As a hyper-growth company, VOX achieved over 25% YoY revenue growth last year and is aiming for $100M+ in revenue this year. VOX is looking for a team of growth-driven individuals to take the company to new heights.
VOX's cutting-edge technology and dedicated customer service team ensure that telcos and enterprises maintain secure, fast, and reliable connections while protecting their networks. VOX's promise of a hassle-free experience and superior customer support enables telcos and enterprises to focus on success. As a company, VOX focuses on solutions that monetize the assets of mobile network operators.
Joining VOX offers the opportunity to work with the industry's leading technologies and to help telcos and enterprises stay ahead and keep innovating with a comprehensive suite of flashcall and telecom carrier services. VOX is committed to providing its employees with a dynamic, forward-thinking work environment, competitive compensation and benefits, vacation and time-off packages, and stock options. This is a once-in-a-lifetime opportunity for highly ambitious individuals, as VOX plans to expand its solutions portfolio and go public in the next 3-5 years.
About the Role
VOX is building a multi-tenant Customer Data Platform for mobile network operators across multiple countries. Our platform ingests billions of events from telecom traffic and transforms them into actionable insights, segmentation, and campaign activation.
As a Data Engineer on the VOX CDP team, you will work across Kafka ingestion, Spark processing, Iceberg/Nessie lakehouse modeling, Dremio/dbt transformations, and Kubernetes-based multi-tenant deployments.
This is a role for someone who wants to work deeply with high-volume event data and a modern, cloud-native analytics architecture.
Responsibilities
Event Ingestion & Streaming (Kafka KRaft)
Build and maintain Kafka ingestion pipelines
Define topic structures, partition strategies, retention policies, and consumer logic for multi-tenant setups
Manage data contracts and schema evolution
Develop idempotent ingestion services that land data into Iceberg tables
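To give a flavor of this work, here is a minimal sketch of an idempotent consumer, assuming confluent-kafka and a per-tenant topic naming scheme; broker address, topic, and field names are illustrative, not VOX's actual ones:

```python
# Minimal idempotent-ingestion sketch (all names are assumptions).
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "kafka:9092",   # assumption: in-cluster KRaft broker
    "group.id": "cdp-ingest-tenant-a",   # one group per tenant keeps offsets isolated
    "enable.auto.commit": False,         # commit only after a successful landing
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["events.tenant-a"])  # hypothetical per-tenant topic

seen = set()  # toy dedupe store; a real service would MERGE on event_id in Iceberg

def land(batch):
    """Stand-in for the Iceberg write path (e.g. a Spark or pyiceberg append)."""
    print(f"landing {len(batch)} events")

batch = []
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())
    if event["event_id"] in seen:        # drop replays so the landing stays idempotent
        continue
    seen.add(event["event_id"])
    batch.append(event)
    if len(batch) >= 500:
        land(batch)
        consumer.commit(msg)             # offsets advance only after the write succeeds
        batch.clear()
```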
Lakehouse Architecture (Iceberg + Nessie)
Design and optimize Iceberg tables (partitioning, compaction, clustering, retention rules)
Work with Nessie branches/tags to manage multi-environment (dev/test/prod) and multi-MNO deployments
Implement Python/Spark loaders writing from Kafka → Iceberg
Manage Iceberg compaction, metadata pruning, snapshot control, and performance tuning
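As an illustration, a minimal PySpark loader sketch, assuming a Nessie-backed Iceberg catalog registered as `nessie`, the iceberg-spark and spark-sql-kafka packages on the classpath, and a pre-created, day-partitioned `raw_events` table; endpoints, branch, and names are all assumptions:

```python
# Kafka -> Iceberg loader sketch on a Nessie branch (names are assumptions).
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = (
    SparkSession.builder.appName("kafka-to-iceberg")
    .config("spark.sql.catalog.nessie", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.nessie.catalog-impl", "org.apache.iceberg.nessie.NessieCatalog")
    .config("spark.sql.catalog.nessie.uri", "http://nessie:19120/api/v1")  # assumed endpoint
    .config("spark.sql.catalog.nessie.ref", "dev")  # one Nessie branch per environment
    .getOrCreate()
)

schema = StructType([
    StructField("event_id", StringType()),
    StructField("msisdn", StringType()),
    StructField("event_ts", TimestampType()),
])

events = (
    spark.read.format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")
    .option("subscribe", "events.tenant-a")          # hypothetical topic
    .option("startingOffsets", "earliest")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Append into the assumed Iceberg table on the "dev" branch.
events.writeTo("nessie.cdp.raw_events").append()
```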
Distributed Processing & Enrichment (Spark)
Develop Spark jobs (batch + micro-batch where needed) for:
Cleaning and normalizing events
Categorizing senders
Engagement signals
Identity stitching and grouping
Audience enrichment and behavioral metrics
Ensure Spark jobs scale efficiently across large volumes of event data (see the sketch after this list)
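A cleaning-and-enrichment sketch of the kind described above; column names, tables, and the engagement metric are assumptions:

```python
# Normalize MSISDNs, dedupe events, and derive a toy per-subscriber engagement signal.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("enrich-events").getOrCreate()
raw = spark.table("nessie.cdp.raw_events")  # hypothetical Iceberg table

clean = (
    raw.dropDuplicates(["event_id"])                              # safe to re-run
    .withColumn("msisdn", F.regexp_replace("msisdn", r"\D", ""))  # keep digits only
    .filter(F.length("msisdn").between(10, 15))                   # crude E.164 sanity check
)

# Toy engagement metric: event volume and recency per subscriber.
engagement = clean.groupBy("msisdn").agg(
    F.count("*").alias("event_count"),
    F.max("event_ts").alias("last_seen"),
)
engagement.writeTo("nessie.cdp.engagement_daily").createOrReplace()
```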
Query Layer & Transformations (Dremio + dbt)
Build dbt models on top of Iceberg datasets via Dremio
Deliver telecom-specific analytical models including:
Descriptive Analytics
Quality/quantity audience scoring
Campaign performance metrics
RFU relevance scoring
Cohort segmentation pipelines
Optimize Dremio queries using reflections, column pruning, and Iceberg metadata (see the query sketch below)
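For example, a downstream consumer might read a dbt-built mart through Dremio's Arrow Flight endpoint. A pyarrow sketch, with host, port, credentials, and the mart name all assumed:

```python
# Query a hypothetical dbt-built mart via Dremio's Arrow Flight endpoint.
from pyarrow import flight

client = flight.FlightClient("grpc+tcp://dremio:32010")  # assumed Flight port
token = client.authenticate_basic_token("svc_cdp", "secret")  # yields a bearer header
options = flight.FlightCallOptions(headers=[token])

sql = "SELECT segment, audience_size FROM cdp.marts.audience_scores"  # hypothetical mart
info = client.get_flight_info(flight.FlightDescriptor.for_command(sql), options)
table = client.do_get(info.endpoints[0].ticket, options).read_all()
print(table.to_pandas().head())
```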
Kubernetes, Helm, and CI/CD
Maintain Helm charts for each VOX deployment (multiple clusters)
Build CI/CD pipelines (GitHub Actions/GitLab/Argo) for:
Ingestion services
Spark job deployment
Kafka topic configs
dbt model updates
Helm releases into customer clusters
Automate rollouts, config updates, and monitoring installation (see the rollout sketch below)
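A rollout-helper sketch driving `helm upgrade --install` across tenant clusters from CI; kubeconfig contexts, chart path, and the values-file layout are assumptions:

```python
# Roll a chart version across per-customer clusters (all names are assumptions).
import subprocess

TENANTS = {
    "mno-a": "ctx-mno-a",   # hypothetical kubeconfig contexts, one per customer cluster
    "mno-b": "ctx-mno-b",
}

def deploy(tenant: str, context: str, version: str) -> None:
    subprocess.run(
        [
            "helm", "upgrade", "--install", f"cdp-{tenant}", "./charts/cdp",
            "--kube-context", context,
            "--namespace", f"cdp-{tenant}", "--create-namespace",
            "-f", f"values/{tenant}.yaml",     # per-tenant overrides
            "--set", f"image.tag={version}",
            "--atomic", "--timeout", "10m",    # roll back automatically on failure
        ],
        check=True,
    )

for tenant, context in TENANTS.items():
    deploy(tenant, context, version="1.4.2")
```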
Observability, Quality & Governance
Implement monitoring for ingestion lag, consumer errors, Iceberg table health, Spark jobs, and Dremio performance
Implement custom Python-based data validation checks where needed
Handle all PII with strict tenant isolation and encryption (Vault)
Ensure compliant and governed data flows across all deployments
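A custom validation sketch in the spirit of the Python-based checks above, assuming a UTC Spark session timezone and illustrative thresholds and table names:

```python
# Fail the pipeline when null rates or ingestion freshness drift out of bounds.
from datetime import datetime, timedelta
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.table("nessie.cdp.raw_events")  # hypothetical Iceberg table

total = df.count()
nulls = df.filter(F.col("msisdn").isNull()).count()
latest = df.agg(F.max("event_ts")).first()[0]

if total and nulls / total > 0.01:                 # >1% null MSISDNs is suspicious
    raise ValueError(f"msisdn null rate {nulls / total:.2%} exceeds 1% threshold")
if latest is None or datetime.utcnow() - latest > timedelta(hours=2):
    raise ValueError("stale ingestion: no events landed in the last 2 hours")
print("data quality checks passed")
```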
Collaboration & Product Development
Work with product teams to translate telecom and marketing requirements into scalable data models
Support analysts and ML engineers with clean, enriched datasets from Iceberg
Collaborate with DevOps on cluster performance, scaling, and stability
Requirements
3+ years of experience as a Data Engineer or equivalent, with demonstrated experience working with large datasets
Strong Python engineering skills (must-have)
Experience building pipelines on Apache Kafka (KRaft mode preferred)
Strong SQL + experience with Iceberg table design and optimization
Experience with Spark for large-scale processing
Experience with dbt and SQL modeling on lakehouse storage
Experience working with Dremio, Trino, or similar query engines
Experience with Kubernetes, Helm, and Git-based CI/CD
Understanding of PII handling, encryption, and compliance requirements
Ability to work in distributed, multi-environment setups (dev/test/prod + multi-deployments)
Nice to Have
Experience with telecom data structures (CDRs)
Experience with Nessie catalogs (branching, tagging, schema versioning)
Understanding of audience-building, scoring, or marketing activation models
Experience tuning object storage (S3/MinIO)
Join the team and help shape the future of the telecom industry!
- Department: Product AdTech
- Locations: Brasov, Bucharest, Craiova, and Iasi, Romania
- Remote status: Fully Remote