“AI engineer who built a production e-commerce system that analyzes product images alongside sales and demographic data to generate actionable creative recommendations, now used by 20+ clients. Also built orchestrated document/agent pipelines (Airflow, LangGraph) including a compliance drift detector auditing 401 compliance documents, with an emphasis on traceability, logging, and production integration.”

Agile, AI Agents, Amazon EC2, Amazon S3, Apache Airflow, Data Engineering, +137 more

David Wisdom

Screened

Mid-level Data & Machine Learning Engineer specializing in production ML and data platforms

San Francisco, CA

5y exp

Spice Data

William & Mary

“Built and deployed a production LLM system that scraped Google Maps menu photos, extracted structured prices via OpenAI, and cross-validated them against website-scraped data to automate data-quality verification at scale (replacing costly manual contractor checks). Demonstrates strong reliability instincts—precision-first prompting, output gating with image-quality metadata, and fuzzy matching/RAG techniques—plus solid orchestration (Dagster/Airflow) and observability (Sentry, Prometheus/Grafana).”

Python, SQL, Ruby, Snowflake, BigQuery, dbt, +74 more

Neimisha Konda

Screened

Senior Data Engineer specializing in Palantir Foundry and Snowflake for regulated industries

USA

5y exp

American Express

University of Massachusetts Boston

“Data engineer focused on high-volume transaction pipelines (2M+ per day) using Snowflake/Snowpipe, Spark/PySpark, Kafka, and Airflow, with a strong emphasis on schema/data-quality enforcement and reliability improvements. Also built a greenfield compliance-focused RAG solution, using CloudWatch monitoring and adding ingestion validation to prevent malformed OCR documents from degrading search quality.”

Snowflake, SQL, PostgreSQL, MySQL, NoSQL, Apache Spark, +109 more

Manasa Gunreddy

Screened

Senior Data Engineer specializing in cloud data platforms and real-time streaming

6y exp

HCA Healthcare

Wright State University

“Data engineer in healthcare (HCA) who owned end-to-end Azure-based pipelines at very large scale (50M+ daily claims/patient records). Strong focus on reliability: schema-drift fail-fast validation, quarantine layers, and Python/SQL data quality checks that reduced issues ~25%, plus performance tuning in Databricks/PySpark and versioned serving in Synapse for downstream consumers.”

AWS, AWS CloudFormation, AWS CodePipeline, AWS Glue, AWS IAM, AWS Lambda, +136 more

Cristian Vega

Screened

Senior AI/ML Engineer specializing in Generative AI and RAG

California

9y exp

Morf Health

University of Texas at Austin

“ML/NLP practitioner at Morf Health focused on unifying fragmented healthcare data by linking structured patient/encounter records with unstructured clinical notes. Has hands-on experience with transformer embeddings, vector databases, and domain fine-tuning, plus rigorous evaluation (precision/recall) and human-in-the-loop validation with clinical SMEs to make pipelines production-grade.”

Python, R, Java, JavaScript, SQL, MySQL, +154 more

Sagar Patel

Screened

Mid-level Full-Stack Python Developer & Data Engineer specializing in ETL and web platforms

Arizona, United States

6y exp

GoDaddy

Campbellsville University

“Backend engineer who led major modernization efforts at GoDaddy, migrating legacy Perl services to Python/FastAPI with an incremental rollout strategy, containerization (Docker/Kubernetes), and CI/CD (Jenkins/GitHub Actions). Strong focus on secure, reliable API design (JWT, RBAC, PostgreSQL row-level security), rigorous testing, and data integrity—plus experience hardening an automated web-scraping pipeline against changing site structures and downtime.”

Python, SQL, JavaScript, Django, Flask, FastAPI, +73 more

Sreenija Karnati

Screened

Mid-level Data Analyst and Data Engineer specializing in healthcare and financial analytics

3y exp

UnitedHealth Group

University of North Texas

“Analytics professional with healthcare and operations experience who turns messy enterprise data from platforms like Teradata, GCP, SQL Server, and Snowflake into trusted reporting layers and reproducible analysis workflows. They combine SQL, Python, PySpark, Power BI, and Tableau to improve reporting accuracy and performance, including a 30% dashboard refresh improvement and 20-25% accuracy gains in healthcare reporting.”

SQL, Python, Power BI, Snowflake, Databricks, Business Intelligence, +89 more

Pavan Punna

Screened

Mid-level AI/ML Engineer specializing in LLMs, MLOps, and healthcare-fintech AI

Dallas, TX

5y exp

Federal Soft Systems

Concordia University

“Built and owned a production GPT-4 RAG assistant for clinical and enterprise query resolution, taking it from initial experiment to deployment, monitoring, and iterative improvement. Their work cut resolution time from 45 minutes to under 2 minutes, achieved roughly 95% accuracy, and scaled to thousands of additional monthly queries while emphasizing safety and trust in a sensitive clinical domain.”

Python, SQL, Java, Scala, Bash, PyTorch, +124 more

Duncan Freeman

Screened

Staff Machine Learning Engineer specializing in NLP, LLMs, and document intelligence

Austin, TX

9y exp

PNC

University of Cincinnati

“ML/AI engineer at PNC who has shipped enterprise-grade RAG and document intelligence systems for compliance and policy workflows. Stands out for combining LLM product thinking with production rigor—owning FastAPI/Kubernetes deployments, monitoring, evaluation, and human-feedback loops that drove measurable gains like 40% faster policy search and 30% faster compliance review.”

Machine Learning, Data Science, Natural Language Processing, Large Language Models, Computer Vision, Time-Series Forecasting, +169 more

PAVAN VARMA PENMETHSA

Screened

Mid-level Machine Learning Engineer specializing in LLM agents, RAG, and MLOps

New York City, NY

6y exp

Avanade

University of North Texas

“Built a production AI-driven contract/document extraction system combining OCR, normalization, and LLM schema-guided extraction, orchestrated with PySpark and Azure Data Factory and loaded into PostgreSQL for analytics. Emphasizes reliability at scale—using strict JSON schemas, confidence scoring, targeted retries, and multi-layer validation to control hallucinations while processing thousands of PDFs per hour—and partners closely with non-technical business teams to refine fields and deliver usable dashboards.”

Machine Learning, Generative AI, Large Language Models (LLMs), Prompt Engineering, Retrieval-Augmented Generation (RAG), Embeddings, +131 more

Veera Mallipudi

Screened

Senior DevOps & Release Engineer specializing in CI/CD automation and AWS IaC

Raleigh, NC

12y exp

Vidmob

University of Central Missouri

“Infrastructure/DevOps engineer (Vidmob) focused on AWS + containers, owning GitLab CI/CD and Terraform-managed environments. Led a high-impact CI incident by correlating runner queue time, Docker pull latency, and NAT egress; implemented ECR pull-through caching and VPC endpoints to restore performance and then standardized the fix in Terraform for future scale-ups.”

Agentic AI, Claude, CI/CD, GitLab CI, Jenkins, Git, +168 more

Manichandra Reddy Bethi

Screened

Mid-level GenAI Engineer specializing in production AI agents and evaluation pipelines

Overland Park, Kansas

5y exp

Minutentag

Wilmington University

“Built and shipped a production LLM-powered internal operations automation platform using LangChain RAG (Pinecone) and FastAPI microservices, deployed on AWS EKS, serving 10k+ daily interactions. Implemented a rigorous evaluation/observability stack (golden datasets, prompt regression tests, MLflow, retrieval metrics, hallucination monitoring) that drove hallucinations below 2% and improved reliability, and partnered closely with non-technical ops leaders to cut manual lookup work by 60%+.”

A/B Testing, Alerting, AWS, AWS Lambda, BERT, CI/CD, +120 more

Ram Kottala

Screened

Mid-level Data & GenAI Engineer specializing in lakehouse, streaming, and RAG platforms

Michigan, USA

5y exp

Ford

Webster University

“Built a production internal LLM-powered knowledge assistant using a RAG architecture (Python, LLM APIs, cloud services) that answers employee questions with sourced, grounded responses from internal documents. Demonstrates strong practical depth in retrieval tuning (chunking/metadata filters), orchestration with LangChain, and production reliability practices (latency optimization, automated embedding refresh, evaluation metrics, logging/monitoring) while partnering closely with non-technical operations teams.”

Python, PySpark, Scala, Java, R, SQL, +173 more

Naga Yanala

Screened

Mid-level Data Engineer specializing in cloud data pipelines and analytics platforms

Texas, USA

5y exp

Molina Healthcare

Southeast Missouri State University

“Data engineer with healthcare and enterprise experience (Molina Healthcare, Dell Technologies) building and operating high-volume batch + streaming pipelines across AWS and Azure. Strong focus on data quality (schema validation, fail-fast checks), reliability (monitoring/alerts, retries), and performance tuning (Spark/partitioning), with measurable runtime reduction and improved downstream trust.”

Python, SQL, PySpark, Bash, ETL, Data pipelines, +85 more

Sai Kavyusha Ponnagant

Screened

Mid-level Data Engineer specializing in cloud data pipelines and financial services warehousing

Chicago, IL

4y exp

Charles Schwab

DePaul University

“Data engineer (Charles Schwab) who took ownership of an unstable, ambiguous nightly financial data pipeline and rebuilt it into a reliable, incremental AWS Glue/Airflow/Redshift system feeding Power BI. Created a custom Python data-quality framework with hard-stop gating and schema drift detection, improving integrity (99.9%), cutting runtime (~20%), and reducing incidents/tickets (35% fewer schema-related dashboard incidents; 30% fewer investigations).”

Python, SQL, Amazon S3, AWS Glue, Amazon Redshift, Amazon Athena, +73 more

Mohan Naik Megavath

Screened

Mid-level Data Engineer specializing in real-time pipelines and cloud data platforms

Remote, USA

4y exp

Truist

Elmhurst University

“Backend engineer with hands-on experience building secure Python/Flask services (sessions, JWT, RBAC) and optimizing PostgreSQL/SQLAlchemy performance, including custom SQL using CTEs/window functions profiled via EXPLAIN ANALYZE. Also integrates LLM features via OpenAI/Azure into backend systems and improves scalability with RabbitMQ-driven async processing, caching, and multi-tenant data isolation patterns.”

Amazon Athena, Amazon DynamoDB, Amazon EC2, Amazon Redshift, Amazon S3, AngularJS, +137 more

Sri Harsha patallapalli

Screened

Mid-level Machine Learning & Data Infrastructure Engineer specializing in MLOps on AWS

Boston, MA

5y exp

Dextr.ai

Northeastern University

“Built and deployed a fine-tuned Qwen 2.5 14B model into production at Dextr.ai as the backbone for hotel-operations agentic workflows, running on AWS EKS with Triton and TensorRT-LLM. Demonstrates strong cost-aware LLM engineering (QLoRA, FP8/BF16 on H100) plus rigorous benchmarking/observability (Prometheus, LangSmith) with reported sub-30ms TTNT. Previously handled long-running ETL orchestration with Airflow at GE Healthcare and Lowe's.”

Python, Java, C++, SQL, JavaScript, Bash, +113 more

UMESH KAMISETTY

Screened

Mid-level Data Engineer specializing in cloud lakehouse and streaming platforms

Seattle, WA

5y exp

First United Bank

Cleveland State University

“Data engineer focused on building production-grade pipelines on AWS (Kafka/Kinesis/Glue/S3) through to curated serving layers in Snowflake and Delta Lake. Emphasizes automated data quality validation (PySpark + CI/CD), modular dbt transformations for analytics (customer spending, risk metrics), and operational reliability with CloudWatch and DLQs; data consumed by BI tools and ML pipelines for fraud detection and risk analytics.”

Python, PySpark, SQL, Shell Scripting, AWS, Amazon S3, +146 more

Harshitha Parupalli

Screened

Mid-level Data Engineer specializing in multi-cloud real-time and batch data pipelines

Jersey City, NJ

4y exp

Elevance Health

NJIT

“Data engineer with healthcare domain experience who owned 100M+ record pipelines end-to-end (Kafka/Kinesis/ADF → PySpark/dbt validation → Spark SQL transforms → Snowflake/Power BI serving). Built production-grade reliability practices (Airflow orchestration, CloudWatch/Grafana monitoring, pytest + contract/regression tests, idempotent ingestion/backfills) and delivered measurable improvements: 35% lower latency and 40% better query performance.”

Python, SQL, Shell Scripting, R, Scala, Java, +160 more

Kamalesh Ponnivalavan

Screened

Mid-level Data Engineer specializing in capital markets post-trade data platforms

Whippany, NJ

3y exp

Barclays

University of Connecticut

“Data/streaming engineer in capital markets who led an end-to-end trade settlement data product (Kafka→MongoDB→data lake) with rigorous data-quality logic and ~$175K first-year operational impact. Also built a low-latency Go-based CME market data engine feeding SOFR curve generation, using MSK on EKS with performance tuning (idempotency, compression, partitioning) to achieve sub-100ms delivery.”

Amazon Athena, Amazon DynamoDB, Amazon Redshift, Amazon S3, Apache Hadoop, Apache Kafka, +118 more

Kamal Ede

Screened

Mid-level Data Engineer specializing in cloud data platforms, Spark, and streaming pipelines

MO, USA

4y exp

S&P Global

University of Central Missouri

“Data/MLOps engineer (Cognizant background) who owned an AWS/Airflow/Snowflake healthcare transactions pipeline processing ~8–10M records/day and cut pipeline/data-quality incidents by ~33%. Also built and deployed a production FastAPI model-inference service on Kubernetes (Docker, HPA) with strong observability (Prometheus/Grafana), versioned endpoints, and resilient backfill/idempotent external data ingestion patterns.”

Python, PySpark, SQL, Scala, Batch Processing, Data Transformation, +119 more
