Machinify is the leading provider of AI-powered software products that transform healthcare claims and payment operations. Each year, the healthcare industry generates over $200B in claims mispayments, creating incredible waste, friction, and frustration for all participants: patients, providers, and especially payers. Machinify’s revolutionary AI platform has enabled the company to develop and deploy, at light speed, industry-specific products that increase the speed and accuracy of claims processing by orders of magnitude.

Why This Role Matters

Our data pipelines power payment decisions, product insights, ML models, and customer operations, so our data engineering infrastructure must evolve to support scale and efficiency.

As a Staff/Lead Data Engineer, Infra, you will play a pivotal role in enabling every data engineer to be faster, more reliable, and more productive. You will build the core frameworks, tools, and observability systems that:

Abstract common pipeline patterns into reusable components
Implement robust monitoring and testing across the platform
Drive testability and data quality at scale
Unify tooling and standards across our post-merger environment
Explore the use of GenAI to further accelerate data engineering productivity

You’ll collaborate deeply with Data Engineers, Data Scientists, Platform Engineers, ML, Product, and SMEs to shape the foundation of our next-generation data platform.

What You’ll Do

🛠 Build Core Data Engineering Infrastructure

Develop and maintain internal DE SDKs / libraries to abstract and standardize Spark + Airflow patterns.
Design and implement pipeline testing frameworks to enable CI/CD-based data validation.
Create pipeline observability & monitoring systems (Grafana, ELK, Datadog) to ensure reliability and visibility.
Drive adoption of data validation frameworks to automate and scale data quality enforcement.

🚀 Drive Platform & Unification Initiatives

Lead initiatives to unify our data platform post-merger by defining scalable standards and patterns.
Partner with Data Engineers and SMEs to improve canonical modeling, schema evolution, and data versioning.
Help architect centralized metadata management to replace fragmented, ad-hoc systems.

🤖 Innovate with GenAI & Modern Practices

Leverage GenAI and modern tooling to improve pipeline development, monitoring, and debugging.
Prototype new ways to improve developer productivity, data quality, and pipeline reliability using emerging technologies.

🤝 Collaborate Across Teams

Work with core Data Engineers to identify and address productivity bottlenecks and scaling challenges.
Support customer data onboarding frameworks indirectly through improved tooling and processes.
Partner closely with Platform/Server Engineering to build or request new features.

What You Bring

Required Technical Skills & Experience

7+ years of experience as a Data Engineer / Software Engineer / Platform Engineer, with strong expertise building internal tooling and frameworks.
Proficient in Python and SQL.
Deep expertise with Apache Spark (core processing engine today).
Advanced experience with Apache Airflow.
Experience building monitoring & observability systems (Grafana, ELK Stack, or Datadog).
Experience designing and implementing data validation & testing frameworks at scale.
Strong understanding of AWS (primary cloud environment).
Proficient in Kubernetes and modern orchestration patterns.
Deep understanding of schema evolution, data modeling, and versioning for large-scale data platforms.
Proven ability to operate as a Staff/Lead: driving technical strategy, mentoring others, and collaborating cross-functionally.

Bonus Experience (Nice to Have)

Experience working with Kafka, Spark Streaming, or other modern streaming platforms.
Scala experience (Spark internals, performance tuning, advanced transformations).
Familiarity with metadata management tools such as DataHub, Amundsen, or OpenMetadata.
Experience using GenAI to improve data engineering workflows and developer experience.
Experience building unified data platforms post-merger.
Exposure to GCP or multi-cloud environments.

Why Join Us

Real impact: your work will make the entire data engineering organization dramatically more effective.
Total ownership: build the core frameworks and standards that define the future of our data platform.
Opportunity to innovate: drive adoption of GenAI and modern data engineering practices.
Cross-functional leadership: collaborate across Data, ML, Platform, Product, and SMEs at a pivotal stage of company growth.
Fast-growing environment: contribute at a moment when building scalable, unified data infrastructure will have outsized impact.

Equal Employment Opportunity at Machinify

Machinify is committed to hiring talented and qualified individuals with diverse backgrounds for all of its positions. Machinify believes that the gathering and celebration of unique backgrounds, qualities, and cultures enriches the workplace.