“Data engineer focused on AWS-based enterprise data platforms, owning end-to-end pipelines from multi-source batch/stream ingestion (Glue/Kinesis/StreamSets/Airflow) through PySpark transformations into curated datasets for Redshift/Snowflake. Emphasizes production reliability with strong monitoring/observability and data quality gates, and reports ~30% performance improvement plus improved SLAs and latency after optimization.”

Amazon Athena, Amazon DynamoDB, Amazon EMR, Amazon EKS, Amazon Kinesis, Amazon Redshift, +138 more

Parvinder Singh

Screened

Mid-level Data Engineer specializing in AWS lakehouse platforms and scalable ETL/ELT

Texas, USA | 4y exp

Humana | University of Texas at Dallas

“Data engineer focused on reliable, production-grade pipelines and data services: has owned end-to-end ingestion-to-serving workflows processing millions of records/day, using Airflow, Python/SQL, and PySpark. Demonstrates strong operational rigor (monitoring, retries, idempotency, backfills) and measurable outcomes (98% stability, ~30% faster processing), plus experience exposing curated warehouse data via versioned REST APIs.”

Data Engineering, Data Pipelines, AWS, Databricks, Snowflake, ETL, +88 more

Varshitha K

Screened

Mid-level Data Engineer specializing in cloud data platforms and lakehouse architectures

Lakewood, CO | 4y exp

First Bank | University of Central Missouri

“Data engineer in a banking context who has owned end-to-end Azure lakehouse pipelines ingesting financial/vendor data from APIs, Azure SQL, and flat files into Databricks/Delta (bronze-silver-gold). Emphasizes production reliability via schema-drift validation, data quality controls, monitoring/alerting, retries/checkpointing, and Spark/Delta performance tuning, with outputs served to BI/reporting teams (e.g., Tableau).”

Python, Scala, Java, C++, SQL, PL/SQL, +173 more

Apurva Banka

Screened

Mid-level Full-Stack & AI Engineer specializing in cloud, data platforms, and LLM automation

Houston, TX | 5y exp

Jay Logistics & Trade LLC | University at Buffalo

“Software engineer/product builder who has owned an agentic affiliate lead-gen platform end-to-end (Django + React/TypeScript) and deployed it on Kubernetes in anticipation of 10x user growth from ~5K DAUs. Also has healthcare claims microservices experience using Kafka, including hands-on performance tuning to address consumer lag and broker pressure, and built an internal downtime alerting tool adopted across the organization.”

Python, JavaScript, TypeScript, SQL, Java, React, +91 more

Sharanya Rao

Screened

Mid-level AI/ML Engineer specializing in NLP, LLMs, and RAG for finance and healthcare

Remote, USA | 3y exp

Ally Financial | University of Maryland, Baltimore County

“Built an AI lending assistant (RAG + DeBERTa) used by credit analysts to retrieve policies and past loan decisions, tackling real production issues like hallucinations, document quality, and sub-second latency. Deployed a modular, Dockerized AWS architecture (ECS/EMR + load balancer) with load testing, caching/precomputed embeddings, and CloudWatch monitoring, and used Airflow to automate scheduled data/embedding/vector DB refresh pipelines with retries and alerts.”

Python, PySpark, SQL, Pandas, NumPy, Scikit-learn, +133 more

Srilekha Pothula

Screened

Mid-level Data Engineer specializing in cloud data pipelines for healthcare and financial services

Bloomfield, CT | 4y exp

Cigna | Pace University

“Data engineer with ~4 years of experience (Cigna) building and operating Azure Data Factory pipelines for healthcare claims/member/provider data at 2–3M records/day. Emphasizes reliability and downstream safety via schema/data-quality validation, quarantine workflows, idempotent processing, and backfills; also improved runtime ~20% through SQL optimization and served curated datasets through versioned views and well-documented, analyst-friendly interfaces.”

Apache Airflow, Apache Kafka, Apache Spark, AWS, AWS Glue, AWS Lambda, +71 more

Agna Antony

Screened

Mid-level Data Engineer specializing in cloud-native healthcare and enterprise data platforms

Michigan, USA | 5y exp

MedStar Health | APJ Abdul Kalam Technological University

“Data Engineer (TCS) who owned an end-to-end CRM analytics pipeline for Bayer’s eSalesWeb integration, ingesting from Salesforce APIs/databases/S3 and serving analytics-ready datasets via PostgreSQL/S3 for Tableau. Drove measurable outcomes: ~60% reduction in manual data-quality effort, ~30% lower latency through SQL optimization, and ~35% improved stability via monitoring, retries, and idempotent processing.”

SDLC, Agile, Scrum, Kanban, Waterfall, DevOps, +124 more

Surya Pavan

Screened

Mid-level Machine Learning Engineer specializing in Generative AI and LLM applications

Baltimore, MD | 5y exp

Acer | California State University, Northridge

“GenAI engineer who has deployed production LLM/RAG chatbots for internal document search, focusing on reliability (hallucination reduction via prompt guardrails + retrieval filtering) and performance (latency improvements via caching). Experienced with LangChain/LangGraph orchestration for multi-step agent workflows and iterates using monitoring/logs and benchmark-driven evaluation while partnering closely with product and business teams.”

Agentic AI, Amazon EC2, Amazon EMR, Amazon S3, AWS IAM, AWS Lambda, +153 more

Ankush Banthia

Screened

Senior Data & Platform Engineer specializing in cloud-native streaming and distributed systems

USA | 10y exp

JPMorgan Chase | New York Institute of Technology

“Financial data engineer who has built and operated high-volume batch + streaming pipelines (200–300 GB/day; 5–10k events/sec) using AWS, Spark/Delta, Airflow, Kafka, and Snowflake, with strong emphasis on data quality and reliability. Demonstrated measurable impact via 99.9% SLA adherence, major reductions in bad records/nulls, MTTR improvements, and significant latency/runtime/query performance gains; also built a distributed web-scraping system processing 5–10M records/day with anti-bot and schema-drift defenses.”

Team Building, Onboarding, Mentoring, Agile, Scrum, Jira, +150 more

Gaurav Pawar

Screened

Junior Backend/Full-Stack Software Engineer specializing in cloud microservices and AI apps

Miami, FL | 2y exp

Marketeq Digital | Cal State Fullerton

“Accenture engineer who owned an insurance e-application end-to-end and drove incremental releases that reduced recurring production issues. Also built a TypeScript/React (Next.js) + NestJS microservices platform using PostgreSQL, Redis, Stripe, and Kafka, with strong focus on decoupling, eventual consistency, and scaling consumers under load. Created a hackathon chat-based internal assistant that used live form context and documentation-grounded answers to help agents resolve customer queries during form filling.”

Python, Java, JavaScript, SQL, Flask, Spring Boot, +91 more

Bheema Sabilla

Screened

Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems

Remote, USA | 3y exp

Discover | University of South Dakota

“Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.”

Python, PySpark, SQL, Scala, Pandas, NumPy, +157 more

Thrinesh Thode

Screened

Mid-level AI/ML Engineer specializing in MLOps and LLM applications

New York, NY | 4y exp

BNY Mellon | University at Albany

“BNY Mellon engineer who has built and operated production AI systems end-to-end: a LangChain/Pinecone RAG platform scaled via FastAPI + Kubernetes to 1000 RPM with 99.9% uptime, supported by monitoring and data-drift detection. Also deep in data/infra orchestration (Airflow, Dagster, Terraform on AWS/EMR/EC2), processing 500GB+ daily and delivering measurable reliability and performance gains, plus strong compliance-facing model explainability using SHAP and Tableau.”

A/B Testing, Agentic AI, Apache Kafka, Apache Spark, AWS, AWS Lambda, +86 more

Lingyi Wu

Screened

Mid-level Financial/Data Analyst specializing in analytics, forecasting, and healthcare/MarTech data

Los Angeles, CA | 4y exp

MINISO | Westcliff University

“Growth/creative marketer from Esleydunn Games who uses Google Analytics to integrate cross-channel performance data (TikTok, YouTube, LinkedIn, Facebook) and run structured A/B tests on video ad length and layout. Reported reducing CPA by 20 per customer when leveraging YouTube and TikTok, and improved CTR through CTA/button placement testing and ongoing user-feedback loops (forum/WeChat topics).”

Python, SQL, R, Machine Learning, Deep Learning, Feature Engineering, +104 more

Koushik Gunjala

Screened

Senior AI Engineer specializing in Agentic AI and distributed systems

Charlotte, NC | 4y exp

UnitedHealth Group | University of North Carolina at Charlotte

“LLM/agentic workflow engineer with healthcare domain experience who built a HIPAA-compliant multi-agent RAG system for clinical review automation at UnitedHealth Group, achieving 92% precision and cutting latency 40% through async orchestration and Redis semantic caching. Also has strong data engineering orchestration background (Airflow on AWS EMR with Great Expectations) and a proven clinician-in-the-loop feedback process that improved model faithfulness by 18%.”

Agentic AI, Distributed Systems, Retrieval-Augmented Generation (RAG), GPT-4, LangChain, LangGraph, +95 more

Hema Edavalapati

Screened

Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI

Florida, USA | 6y exp

LexisNexis | University of South Florida

“AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.”

SQL, SQL query optimization, Python, Pandas, NumPy, PySpark, +159 more

Chethan Thimapuram

Screened

Mid-level AI Engineer specializing in LLMs, MLOps, and healthcare NLP

4y exp

HCA Healthcare | University of South Florida

“Built a production, real-time clinical documentation system at HCA that converts doctor–patient conversations into structured clinical summaries using speech-to-text, LLM summarization, and RAG. Demonstrated measurable gains from medical-domain fine-tuning (clinical concept recall +18%, ROUGE-L 0.62 to 0.74) while meeting HIPAA constraints via PHI anonymization and encryption, and deployed via Docker/FastAPI with CI/CD and monitoring.”

Python, PyTorch, Machine Learning, Generative AI, Large Language Models, OpenAI, +182 more

Harsha Sikha

Screened

Mid-level AI/ML Engineer specializing in Generative AI and data engineering

Armonk, New York | 4y exp

IBM | Saint Peter's University

“IBM engineer who built and deployed a production RAG-based LLM assistant using LangChain/FAISS with a fine-tuned LLaMA model, served via FastAPI microservices on Kubernetes, achieving 99%+ uptime. Demonstrates strong practical expertise in reducing hallucinations (semantic chunking + metadata-driven retrieval) and managing latency, plus mature MLOps practices (Airflow/dbt pipelines, MLflow tracking, monitoring, A/B and shadow deployments) and effective collaboration with non-technical stakeholders.”

A/B Testing, Agile, Anomaly Detection, API Development, Apache Hadoop, Apache Hive, +157 more

Harsha Chimirala

Screened

Mid-level Data Engineer specializing in cloud data platforms and scalable ETL pipelines

USA, USA | 3y exp

HCLTech | University of New Haven

“Data engineer (~4 years) with full-stack delivery experience (Next.js App Router/TypeScript + React) building a real-time operations monitoring dashboard backed by Kafka and orchestrated data pipelines. Strong production focus: Airflow + CloudWatch monitoring, automated Python/SQL validation (99.5% accuracy), and CI/CD with Jenkins/Docker; has delivered measurable improvements in latency, pipeline reliability, and query performance (Postgres/Redshift).”

Python, SQL, PySpark, Scala, Bash, Apache Spark, +80 more
