Senior Data Engineer specializing in scalable data pipelines and API-driven data services
“Data engineer focused on building scalable, reliable end-to-end data pipelines and backend REST data services, spanning API ingestion and batch/stream processing with Airflow, Kafka, Spark/PySpark, and SQL. Emphasizes strong data quality validation, monitoring/fault tolerance, and performance tuning for large datasets, with experience deploying in cloud environments using containerization and CI/CD.”
Entry-level Data Engineer specializing in ETL, analytics, and anomaly detection
“Worked on industrial pump analytics at SitePro, building an anomaly detector from messy sensor and pump data and using historical failure and maintenance-cost analysis to make the business case to stakeholders. Combines SQL/Python data preparation with practical stakeholder communication around metrics like churn and operational impact.”
Senior AI/ML Data Engineer specializing in LLMs, RAG, and intelligent platforms
“Built and owned end-to-end production systems for a healthcare platform, including a predictive task recommendation feature (React + FastAPI + ML on AWS ECS) that cut backlog 20% and saved coordinators ~10 hours/week. Also productionized an AI-native RAG system (vector DB + LLM) delivering 40% faster query resolution, and led phased modernization of a monolithic FastAPI service into async microservices using feature flags and canary releases.”
Mid-level Data Scientist specializing in machine learning, MLOps, and cloud analytics
“Data scientist with ~5 years’ experience building production ML/NLP systems in finance (Wells Fargo) and deep learning for sensor analytics in connected vehicles (Medtronic). Has delivered end-to-end platforms combining time-series forecasting with transformer-based NLP, including automated drift monitoring/retraining (MLflow + Airflow) and standardized Docker/CI/CD deployments; achieved a reported 22% precision improvement after domain fine-tuning.”
Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems
“Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.”
Senior AI Engineer specializing in Agentic AI and distributed systems
“LLM/agentic workflow engineer with healthcare domain experience who built a HIPAA-compliant multi-agent RAG system for clinical review automation at UnitedHealth Group, achieving 92% precision and cutting latency 40% through async orchestration and Redis semantic caching. Also has strong data engineering orchestration background (Airflow on AWS EMR with Great Expectations) and a proven clinician-in-the-loop feedback process that improved model faithfulness by 18%.”
Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI
“AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.”
Mid-level Data Engineer specializing in cloud data platforms and real-time analytics
“Customer-facing data engineering professional who builds and deploys real-time reporting/dashboard solutions, gathering reporting and compliance requirements through direct stakeholder engagement. Experienced with Google Cloud IAM governance, secure integrations (encryption, audit logging), and fast production troubleshooting of ETL/pipeline failures with follow-on monitoring and automated recovery improvements; motivated by hands-on, travel-oriented customer work.”
Junior Software Engineer specializing in machine learning and data science
“Python backend engineer who built a personal LLM-powered AI code review tool that parses code into context-preserving diff chunks and uses the OpenAI API to analyze and summarize changes. Has hands-on Kubernetes deployment experience (replicas, rolling updates, ConfigMaps/Secrets, health probes) and follows GitOps-style, declarative CI/CD workflows; also has experience designing streaming/event-style processing with attention to reliability and observability.”
Mid-level Data Analyst/Data Engineer specializing in BI, ETL pipelines, and cloud analytics
“Data engineer focused on marketing/web analytics and external API pipelines, handling ~10M records/week. Built Azure-based ingestion and PySpark transformations with rigorous data quality checks, then served curated datasets into Synapse/Redshift for Power BI. Also designed an Airflow-orchestrated crypto REST API pipeline with monitoring, retries/exponential backoff, schema-change detection, and backfill-friendly reprocessing.”
Senior Solutions Engineer specializing in blockchain governance and compliance analytics
“Consulting background (Accenture) delivering technically complex solutions involving on-chain data and strict government security standards, including building isolated sandbox environments to move from PoC to production. Experienced in debugging agentic/LLM-style workflows (e.g., document scanning issues) with deterministic guardrails, preprocessing, and strong logging/monitoring. Has led large-scale crypto wallet workshops (including for the CFTC) and helped win business via clear, layered technical demos; also built internal marketing taxonomy tooling and drove adoption through cross-functional alignment.”
Mid-level Data Engineer specializing in cloud data platforms and scalable ETL pipelines
“Data engineer (~4 years) with full-stack delivery experience (Next.js App Router/TypeScript + React) building a real-time operations monitoring dashboard backed by Kafka and orchestrated data pipelines. Strong production focus: Airflow + CloudWatch monitoring, automated Python/SQL validation (99.5% accuracy), and CI/CD with Jenkins/Docker; has delivered measurable improvements in latency, pipeline reliability, and query performance (Postgres/Redshift).”
Mid-level AI Engineer specializing in LLM orchestration, RAG, and multi-agent systems
“Research Assistant at the University of Houston who built and live-deployed a production RAG system for 1000+ research documents, using hybrid retrieval (dense+BM25+RRF) with cross-encoder reranking and RAGAS-based evaluation; reported 66% MRR, 0.85+ faithfulness, and 68% lower LLM inference costs. Also built a deployed LangGraph multi-agent research system (Researcher/Critic/Writer) with tool integrations (Tavily, arXiv) and dual memory (ChromaDB + Neo4j), plus freelance automation work delivering a WhatsApp chatbot and n8n workflows for a wholesale clothing business.”
Mid-level Data Engineer specializing in cloud ETL/ELT and lakehouse architecture
“Data engineer focused on sales/marketing analytics pipelines, owning ingestion from CRMs/ad platforms through warehouse serving and dashboards at ~hundreds of thousands of records/day. Built reliability-focused systems including dbt/SQL/Python data quality gates with alerting, a resilient web-scraping pipeline (retries/backoff, anti-bot tactics, schema-change detection, backfills), and a versioned internal REST API with caching and strong developer usability.”
Mid-level Data Engineer specializing in real-time streaming and cloud data platforms
“Data engineer with Wells Fargo experience owning an end-to-end lakehouse ETL pipeline on Databricks/Azure Data Factory, processing ~480GB daily and implementing robust data quality/reconciliation across 40+ tables to reach ~99.3% reliability. Strong in performance optimization (cut runtime 5.5h→3.8h), CI/CD and monitoring, and resilient external/API ingestion with retries, schema validation, and backfills.”
Senior Data Engineer specializing in Spark, Kafka, and Databricks Lakehouse platforms
“Data engineer at Fidelity who built and operated a real-time financial transactions lakehouse on AWS/Databricks, processing millions of records daily with Kafka streaming. Demonstrated strong reliability and data quality practices (watermarking, idempotent Delta writes, validation/reconciliation, observability) and delivered measurable improvements (~30% faster jobs and ~30% fewer data issues) while enabling trusted gold-layer analytics for downstream teams.”
Mid-level Data Engineer specializing in AWS cloud data platforms
“Data engineer with Charter Communications experience modernizing large-scale AWS data lake pipelines: ingesting S3 data, validating against legacy systems, transforming with PySpark/Spark SQL, and serving via Iceberg/Delta tables. Worked at 50M–300M record scale, delivered >99.5% data match, and built monitoring/alerting (CloudWatch/SNS) plus retry orchestration (Step Functions) and data quality gates (Great Expectations).”
Mid-level Solutions Architect/Engineer specializing in AI and data integrations
“Solutions Engineer specializing in taking LLM copilots from demo to production, with a strong emphasis on enterprise security (RBAC/OAuth), grounded RAG behavior (cite-or-refuse), and operational readiness (eval loops, logging, runbooks). Experienced in real-time diagnosis of agentic/LLM workflow failures and in partnering with Sales/CS to run integration-first POCs that clear security and reliability concerns and accelerate rollout.”
Mid-level Data Engineer specializing in cloud lakehouse/warehouse pipelines
“Data engineer with HCA Healthcare experience building and operating end-to-end AWS-based pipelines for clinical and operational reporting (50–100 GB/day), serving curated data into Redshift/Snowflake for Power BI/Tableau. Emphasizes production reliability (Airflow SLAs/retries/alerting, logging/observability) and strong data quality controls (reconciliations, schema/null/duplicate checks), and has shipped versioned REST APIs to expose warehouse data to downstream systems.”
Mid-level Data Engineer specializing in cloud lakehouse platforms and ETL/ELT
“Accenture data engineer who greenfielded a supply-chain lakehouse platform, building an end-to-end medallion/Delta pipeline ingesting ~1.4TB/day from 17+ ERP/WMS/TMS/shipment sources. Delivered Gold datasets to Redshift/Synapse/Databricks SQL powering Power BI/Tableau with a 99.5% SLA, while cutting runtime 30% and cloud costs 16% through Spark/Delta optimizations and robust data quality controls.”
Senior Data Engineer specializing in cloud data platforms and real-time analytics
“Data engineer (Credit One) who built and owned real-time financial transaction and credit risk/fraud data systems end-to-end on AWS + Snowflake. Delivered high-scale pipelines (150k events/hour; ~2TB/week), raised data accuracy to 99%, and cut Snowflake costs 42% while adding strong observability, schema-drift handling, and production-grade APIs/documentation.”
Junior Software Engineer specializing in backend systems and data engineering
Mid-level Data Engineer specializing in cloud data pipelines and healthcare analytics
Mid-level Data/Platform Engineer specializing in healthcare insurance analytics and API-first AI systems