Vetted Data Engineers

Pre-screened and vetted.

SL

Senior Software Engineer specializing in backend microservices and data platforms

Colorado Springs, CO12y exp
ZocdocUniversity of Minnesota
View profile
Mounika S - Senior Machine Learning Engineer specializing in MLOps and Generative AI in St. Louis, Missouri

Senior Machine Learning Engineer specializing in MLOps and Generative AI

St. Louis, Missouri7y exp
Emerson
View profile
PK

Senior Data Engineer specializing in multi-cloud data platforms and generative AI

Weston, FL5y exp
UKGUniversity of Alabama at Birmingham
View profile
VV

Vaishnavi Veerkumar

Screened ReferencesStrong rec.

Mid-level AI Engineer specializing in GenAI and RAG systems

Boston, MA4y exp
VizitNortheastern University

AI engineer who built a production e-commerce system that analyzes product images alongside sales and demographic data to generate actionable creative recommendations, now used by 20+ clients. Also built orchestrated document/agent pipelines (Airflow, LangGraph) including a compliance drift detector auditing 401 compliance documents, with an emphasis on traceability, logging, and production integration.

View profile
DW

David Wisdom

Screened

Mid-level Data & Machine Learning Engineer specializing in production ML and data platforms

San Francisco, CA5y exp
Spice DataWilliam & Mary

Built and deployed a production LLM system that scraped Google Maps menu photos, extracted structured prices via OpenAI, and cross-validated them against website-scraped data to automate data-quality verification at scale (replacing costly manual contractor checks). Demonstrates strong reliability instincts—precision-first prompting, output gating with image-quality metadata, and fuzzy matching/RAG techniques—plus solid orchestration (Dagster/Airflow) and observability (Sentry, Prometheus/Grafana).

View profile
NK

Senior Data Engineer specializing in Palantir Foundry and Snowflake for regulated industries

USA5y exp
American ExpressUniversity of Massachusetts Boston

Data engineer focused on high-volume transaction pipelines (2M+ per day) using Snowflake/Snowpipe, Spark/PySpark, Kafka, and Airflow, with a strong emphasis on schema/data-quality enforcement and reliability improvements. Also built a greenfield compliance-focused RAG solution, using CloudWatch monitoring and adding ingestion validation to prevent malformed OCR documents from degrading search quality.

View profile
MG

Senior Data Engineer specializing in cloud data platforms and real-time streaming

6y exp
HCA HealthcareWright State University

Data engineer in healthcare (HCA) who owned end-to-end Azure-based pipelines at very large scale (50M+ daily claims/patient records). Strong focus on reliability: schema-drift fail-fast validation, quarantine layers, and Python/SQL data quality checks that reduced issues ~25%, plus performance tuning in Databricks/PySpark and versioned serving in Synapse for downstream consumers.

View profile
Cristian Vega - Senior AI/ML Engineer specializing in Generative AI and RAG in California, null

Cristian Vega

Screened

Senior AI/ML Engineer specializing in Generative AI and RAG

California, null9y exp
Morf HealthUniversity of Texas at Austin

ML/NLP practitioner at Morf Health focused on unifying fragmented healthcare data by linking structured patient/encounter records with unstructured clinical notes. Has hands-on experience with transformer embeddings, vector databases, and domain fine-tuning, plus rigorous evaluation (precision/recall) and human-in-the-loop validation with clinical SMEs to make pipelines production-grade.

View profile
Sagar Patel - Mid-level Full-Stack Python Developer & Data Engineer specializing in ETL and web platforms in Arizona, United States

Sagar Patel

Screened

Mid-level Full-Stack Python Developer & Data Engineer specializing in ETL and web platforms

Arizona, United States6y exp
GoDaddyCampbellsville University

Backend engineer who led major modernization efforts at GoDaddy, migrating legacy Perl services to Python/FastAPI with an incremental rollout strategy, containerization (Docker/Kubernetes), and CI/CD (Jenkins/GitHub Actions). Strong focus on secure, reliable API design (JWT, RBAC, PostgreSQL row-level security), rigorous testing, and data integrity—plus experience hardening an automated web-scraping pipeline against changing site structures and downtime.

View profile
SK

Mid-level Data Analyst and Data Engineer specializing in healthcare and financial analytics

3y exp
UnitedHealth GroupUniversity of North Texas

Analytics professional with healthcare and operations experience who turns messy enterprise data from platforms like Teradata, GCP, SQL Server, and Snowflake into trusted reporting layers and reproducible analysis workflows. They combine SQL, Python, PySpark, Power BI, and Tableau to improve reporting accuracy and performance, including a 30% dashboard refresh improvement and 20-25% accuracy gains in healthcare reporting.

View profile
Pavan Punna - Mid-level AI/ML Engineer specializing in LLMs, MLOps, and healthcare-fintech AI in Dallas, TX

Pavan Punna

Screened

Mid-level AI/ML Engineer specializing in LLMs, MLOps, and healthcare-fintech AI

Dallas, TX5y exp
Federal Soft SystemsConcordia University

Built and owned a production GPT-4 RAG assistant for clinical and enterprise query resolution, taking it from initial experiment to deployment, monitoring, and iterative improvement. Their work cut resolution time from 45 minutes to under 2 minutes, achieved roughly 95% accuracy, and scaled to thousands of additional monthly queries while emphasizing safety and trust in a sensitive clinical domain.

View profile
DF

Staff Machine Learning Engineer specializing in NLP, LLMs, and document intelligence

Austin, TX9y exp
PNCUniversity of Cincinnati

ML/AI engineer at PNC who has shipped enterprise-grade RAG and document intelligence systems for compliance and policy workflows. Stands out for combining LLM product thinking with production rigor—owning FastAPI/Kubernetes deployments, monitoring, evaluation, and human-feedback loops that drove measurable gains like 40% faster policy search and 30% faster compliance review.

View profile
PV

Mid-level Machine Learning Engineer specializing in LLM agents, RAG, and MLOps

New York City, NY6y exp
AvanadeUniversity of North Texas

Built a production AI-driven contract/document extraction system combining OCR, normalization, and LLM schema-guided extraction, orchestrated with PySpark and Azure Data Factory and loaded into PostgreSQL for analytics. Emphasizes reliability at scale—using strict JSON schemas, confidence scoring, targeted retries, and multi-layer validation to control hallucinations while processing thousands of PDFs per hour—and partners closely with non-technical business teams to refine fields and deliver usable dashboards.

View profile
VM

Senior DevOps & Release Engineer specializing in CI/CD automation and AWS IaC

Raleigh, NC12y exp
VidmobUniversity of Central Missouri

Infrastructure/DevOps engineer (Vidmob) focused on AWS + containers, owning GitLab CI/CD and Terraform-managed environments. Led a high-impact CI incident by correlating runner queue time, Docker pull latency, and NAT egress; implemented ECR pull-through caching and VPC endpoints to restore performance and then standardized the fix in Terraform for future scale-ups.

View profile
MR

Mid-level GenAI Engineer specializing in production AI agents and evaluation pipelines

Overland Park, Kansas5y exp
MinutentagWilmington University

Built and shipped a production LLM-powered internal operations automation platform using LangChain RAG (Pinecone) and FastAPI microservices, deployed on AWS EKS, serving 10k+ daily interactions. Implemented a rigorous evaluation/observability stack (golden datasets, prompt regression tests, MLflow, retrieval metrics, hallucination monitoring) that drove hallucinations below 2% and improved reliability, and partnered closely with non-technical ops leaders to cut manual lookup work by 60%+.

View profile
RK

Ram Kottala

Screened

Mid-level Data & GenAI Engineer specializing in lakehouse, streaming, and RAG platforms

Michigan, USA5y exp
FordWebster University

Built a production internal LLM-powered knowledge assistant using a RAG architecture (Python, LLM APIs, cloud services) that answers employee questions with sourced, grounded responses from internal documents. Demonstrates strong practical depth in retrieval tuning (chunking/metadata filters), orchestration with LangChain, and production reliability practices (latency optimization, automated embedding refresh, evaluation metrics, logging/monitoring) while partnering closely with non-technical operations teams.

View profile
NY

Naga Yanala

Screened

Mid-level Data Engineer specializing in cloud data pipelines and analytics platforms

Texas, USA5y exp
Molina HealthcareSoutheast Missouri State University

Data engineer with healthcare and enterprise experience (Molina Healthcare, Dell Technologies) building and operating high-volume batch + streaming pipelines across AWS and Azure. Strong focus on data quality (schema validation, fail-fast checks), reliability (monitoring/alerts, retries), and performance tuning (Spark/partitioning), with measurable runtime reduction and improved downstream trust.

View profile
SK

Mid-level Data Engineer specializing in cloud data pipelines and financial services warehousing

Chicago, IL4y exp
Charles SchwabDePaul University

Data engineer (Charles Schwab) who took ownership of an unstable, ambiguous nightly financial data pipeline and rebuilt it into a reliable, incremental AWS Glue/Airflow/Redshift system feeding Power BI. Created a custom Python data-quality framework with hard-stop gating and schema drift detection, improving integrity (99.9%), cutting runtime (~20%), and reducing incidents/tickets (35% fewer schema-related dashboard incidents; 30% fewer investigations).

View profile
Mohan Naik Megavath - Mid-level Data Engineer specializing in real-time pipelines and cloud data platforms in Remote, USA

Mid-level Data Engineer specializing in real-time pipelines and cloud data platforms

Remote, USA4y exp
TruistElmhurst University

Backend engineer with hands-on experience building secure Python/Flask services (sessions, JWT, RBAC) and optimizing PostgreSQL/SQLAlchemy performance, including custom SQL using CTEs/window functions profiled via EXPLAIN ANALYZE. Also integrates LLM features via OpenAI/Azure into backend systems and improves scalability with RabbitMQ-driven async processing, caching, and multi-tenant data isolation patterns.

View profile
Sri Harsha patallapalli - Mid-level Machine Learning & Data Infrastructure Engineer specializing in MLOps on AWS in Boston, MA

Mid-level Machine Learning & Data Infrastructure Engineer specializing in MLOps on AWS

Boston, MA5y exp
Dextr.aiNortheastern University

Built and deployed a fine-tuned Qwen 2.5 14B model into production at Dextr.ai as the backbone for hotel-operations agentic workflows, running on AWS EKS with Triton and TensorRT-LLM. Demonstrates strong cost-aware LLM engineering (QLoRA, FP8/BF16 on H100) plus rigorous benchmarking/observability (Prometheus, LangSmith) with reported sub-30ms TTNT. Previously handled long-running ETL orchestration with Airflow at GE Healthcare and Lowe's.

View profile
UMESH KAMISETTY - Mid-level Data Engineer specializing in cloud lakehouse and streaming platforms in Seattle, WA

Mid-level Data Engineer specializing in cloud lakehouse and streaming platforms

Seattle, WA5y exp
First United BankCleveland State University

Data engineer focused on building production-grade pipelines on AWS (Kafka/Kinesis/Glue/S3) through to curated serving layers in Snowflake and Delta Lake. Emphasizes automated data quality validation (PySpark + CI/CD), modular dbt transformations for analytics (customer spending, risk metrics), and operational reliability with CloudWatch and DLQs; data consumed by BI tools and ML pipelines for fraud detection and risk analytics.

View profile
Harshitha Parupalli - Mid-level Data Engineer specializing in multi-cloud real-time and batch data pipelines in Jersey City, NJ

Mid-level Data Engineer specializing in multi-cloud real-time and batch data pipelines

Jersey City, NJ4y exp
Elevance HealthNJIT

Data engineer with healthcare domain experience who owned 100M+ record pipelines end-to-end (Kafka/Kinesis/ADF → PySpark/dbt validation → Spark SQL transforms → Snowflake/Power BI serving). Built production-grade reliability practices (Airflow orchestration, CloudWatch/Grafana monitoring, pytest + contract/regression tests, idempotent ingestion/backfills) and delivered measurable improvements: 35% lower latency and 40% better query performance.

View profile
KP

Mid-level Data Engineer specializing in capital markets post-trade data platforms

Whippany, NJ3y exp
BarclaysUniversity of Connecticut

Data/streaming engineer in capital markets who led an end-to-end trade settlement data product (Kafka→MongoDB→data lake) with rigorous data-quality logic and ~$175K first-year operational impact. Also built a low-latency Go-based CME market data engine feeding SOFR curve generation, using MSK on EKS with performance tuning (idempotency, compression, partitioning) to achieve sub-100ms delivery.

View profile
KE

Kamal Ede

Screened

Mid-level Data Engineer specializing in cloud data platforms, Spark, and streaming pipelines

MO, USA4y exp
S&P GlobalUniversity of Central Missouri

Data/MLOps engineer (Cognizant background) who owned an AWS/Airflow/Snowflake healthcare transactions pipeline processing ~8–10M records/day and cut pipeline/data-quality incidents by ~33%. Also built and deployed a production FastAPI model-inference service on Kubernetes (Docker, HPA) with strong observability (Prometheus/Grafana), versioned endpoints, and resilient backfill/idempotent external data ingestion patterns.

View profile

Need someone specific?

AI Search