Pre-screened and vetted.
Mid-level Data Engineer specializing in cloud data platforms and real-time streaming
“Onboarded a Middle East logistics client processing thousands of invoices per month, building a production-ready pipeline that routes known vendor PDFs to deterministic regex parsers via Tax ID matching and falls back to LlamaParse for unknown layouts. Added financial-consistency validation, human-in-the-loop review, and logging/metrics to continuously reduce LLM usage and improve template coverage.”
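The routing pattern this summary describes — Tax-ID lookup into a registry of deterministic parsers, with an LLM parser only as fallback — can be sketched in a few lines. This is a minimal illustration, not the candidate's code; the Tax-ID format, vendor names, and parser functions are all hypothetical.

```python
import re

# Hypothetical Tax-ID pattern; real invoices would need per-jurisdiction formats.
TAX_ID_RE = re.compile(r"Tax\s*ID[:\s]+(\d{9,15})")

def parse_acme(text: str) -> dict:
    """Deterministic regex parser for one known vendor layout (illustrative)."""
    total = re.search(r"Total[:\s]+([\d.]+)", text)
    return {"vendor": "acme", "total": float(total.group(1)) if total else None}

# Registry keyed by vendor Tax ID: invoices from known layouts never touch the LLM path.
KNOWN_PARSERS = {"123456789": parse_acme}

def llm_fallback_parse(text: str) -> dict:
    """Stand-in for an LLM/LlamaParse call on unknown layouts."""
    return {"vendor": None, "total": None, "needs_review": True}

def route_invoice(text: str) -> dict:
    match = TAX_ID_RE.search(text)
    parser = KNOWN_PARSERS.get(match.group(1)) if match else None
    result = parser(text) if parser else llm_fallback_parse(text)
    result["parser"] = parser.__name__ if parser else "llm_fallback"
    return result
```

Because every invoice records which parser handled it, logging those routing decisions is what lets template coverage grow and LLM usage shrink over time.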
Senior Data Engineer specializing in multi-cloud data platforms and streaming pipelines
“Data platform engineer with hands-on ownership of high-volume financial data pipelines (millions of transactions/day) on Azure (ADF, Databricks, Delta Lake, Synapse), emphasizing schema-drift protection and automated data-quality gates. Also built resilient web scraping pipelines with anti-bot and backfill strategies, and shipped a versioned FastAPI + Redis data API with autoscaling, testing, and CI/CD via GitHub Actions.”
Senior Software Engineer specializing in AWS cloud infrastructure and microservices
Senior Software Engineer specializing in distributed systems and cloud infrastructure
Mid-level Data Engineer specializing in cloud-native ETL and data warehousing
Senior Data Scientist specializing in LLMs, NLP, and anomaly detection
Senior AI/ML Engineer specializing in GenAI, MLOps, and healthcare analytics
Mid-level Data Engineer specializing in AWS lakehouse and Spark pipelines
Senior Data Engineer specializing in cloud data platforms and generative AI
Mid-level Data Engineer specializing in GCP, Spark, and healthcare analytics
Junior Data Engineer specializing in Azure data platforms and GenAI analytics
“Data/ML practitioner with experience spanning medical imaging (retinal vessel analysis for hypertension/CVD risk prediction) and enterprise data engineering at Carl Zeiss. Built large-scale SAP data cleaning/validation pipelines (10M+ daily records, ~99% accuracy) and RAG-based semantic search with LangChain/vector DBs that cut manual querying by 82%, plus automation that reduced data onboarding from 8 hours to 12 minutes.”
Mid-level Data Engineer specializing in experimentation, analytics, and AI-driven product experiences
“Built production LLM automations using the Claude API, including a sales enablement workflow that summarizes playbooks and incorporates sales call metadata into strategic one-pagers. Experienced in orchestrating and scheduling data pipelines with SnapLogic, Airflow, and Databricks, and in scaling LLM API calls via parallel/batch processing. Also partnered with HR to deliver prompt-tuned, automated Slack messaging aligned to business tone and acceptance criteria.”
Junior Data Engineer specializing in BI, governed metrics, and workflow automation
“Built and shipped LLM/OCR/NLP-driven document-intelligence workflows in operational environments (EnvoyX and UPS), emphasizing production readiness via explicit state-machine orchestration, confidence gates, and human-in-the-loop review. Demonstrated strong business impact in customs brokerage/document ingestion: 50% fewer customs rejects, 30% higher throughput, SLA adherence improved from 71% to 96%, and platform reliability reaching 99.6% with 78% fewer bad-data incidents.”
Mid-level Data Engineer specializing in cloud data pipelines and enterprise data platforms
“Data engineer/backend engineer who owns large-scale, real-time event pipelines on AWS end-to-end, including a petabyte-scale CDC ingestion flow from multiple Postgres DBs into Redshift. Re-architected a legacy DynamoDB+S3 approach into a Delta Lake + DuckDB/PyArrow-compatible design, improving performance dramatically (e.g., ~600s to ~10s for 1k records) and increasing reliability at high file volumes.”
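A CDC ingestion flow like the one described above hinges on ordered, idempotent application of change events, so that replays and duplicate deliveries are harmless. A minimal plain-Python sketch of that pattern (the event fields and names are hypothetical, not the candidate's actual code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CdcEvent:
    lsn: int             # Postgres log sequence number (illustrative)
    op: str              # "insert" | "update" | "delete"
    key: str
    row: Optional[dict]

def apply_cdc(target: dict, events: list) -> dict:
    """Apply CDC events in LSN order; replaying a batch is a no-op because
    each key only advances to a strictly higher LSN (last-write-wins)."""
    applied_lsn = {}
    for ev in sorted(events, key=lambda e: e.lsn):
        if ev.lsn <= applied_lsn.get(ev.key, -1):
            continue  # duplicate/replayed event: skip
        if ev.op == "delete":
            target.pop(ev.key, None)
        else:
            target[ev.key] = ev.row
        applied_lsn[ev.key] = ev.lsn
    return target
```

At warehouse scale the same semantics are usually expressed as a Delta Lake or Redshift MERGE keyed on primary key and LSN rather than hand-rolled Python, but the invariant is identical.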
Principal Cloud & Infrastructure Engineer specializing in reliability and regulated data platforms
“Founder/CTO-type startup leader who has built cloud-native data and AI platforms from scratch while owning both technical vision and product direction. Brings rare end-to-end startup experience spanning zero-to-one building, growth-stage execution, and fundraising from early stage through exit, with a strong ability to translate technical complexity into clear investor narratives.”
Mid-level Data Engineer specializing in financial and trading data
“Quant Data Engineer at ASX who is also building FinishKit, a full-stack SaaS that scans AI-generated codebases for bugs and production-readiness issues. Combines React/TypeScript, Supabase/serverless, Fly.io, and Postgres with strong product instincts, rapid iteration, and prior experience building secure multi-tenant data and dashboard systems across enterprise teams.”
Mid-level AI/ML Engineer specializing in fraud detection and risk analytics in financial services
“At JP Morgan Chase, built and deployed a production LLM-powered RAG knowledge assistant to help fraud investigators and risk analysts quickly navigate regulatory updates and internal policies, reducing investigation delays and compliance risk. Strong focus on secure retrieval (RBAC filtering), reliability (layered testing + observability), and production constraints (latency/SLOs), with Airflow-orchestrated, auditable ML pipelines.”
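The “RBAC filtering” called out above means filtering retrieval results by user role before anything reaches the LLM prompt. A toy sketch of that idea, assuming a small in-memory corpus with precomputed embeddings (all role names and documents are hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy corpus: each chunk carries an embedding and the roles allowed to read it.
CORPUS = [
    {"text": "AML escalation policy", "vec": [1.0, 0.0], "roles": {"fraud_investigator"}},
    {"text": "Regulatory update Q3",  "vec": [0.8, 0.6], "roles": {"fraud_investigator", "risk_analyst"}},
    {"text": "HR compensation memo",  "vec": [0.9, 0.1], "roles": {"hr"}},
]

def retrieve(query_vec, user_roles, k=2):
    """RBAC-filtered retrieval: drop chunks the user may not see *before*
    ranking, so restricted text can never land in the LLM prompt."""
    allowed = [c for c in CORPUS if c["roles"] & set(user_roles)]
    allowed.sort(key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in allowed[:k]]
```

Filtering before ranking (rather than after generation) is the design choice that makes the retrieval auditable: access decisions are made on metadata, not on model output.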
“GenAI/data engineering practitioner with production experience across Equinix, Optum, and Citibank—built an Azure OpenAI (GPT-4) + LangChain document intelligence platform processing 1.5M+ docs/month and a HIPAA-compliant Airflow healthcare pipeline handling 5M+ claims/day. Also delivered a real-time fraud detection + explainability system using LightGBM and a fine-tuned T5 NLG component, improving fraud accuracy by 15%+ while partnering closely with compliance stakeholders.”
Mid-level Data Engineer specializing in streaming and cloud data platforms for financial services
“Data engineering-focused candidate (internship/project experience) who built end-to-end pipelines processing a few million transactional records/day for fraud detection and reporting, using Airflow, Python/SQL, and PySpark with strong emphasis on data quality gates, idempotency, and monitoring. Also implemented an external web/API data collection system with anti-bot tactics and schema-change quarantine, and shipped a versioned Flask API to serve curated warehouse data.”
Mid-level AI/ML Engineer specializing in GenAI, RAG pipelines, and cloud MLOps
“Built and deployed a production LLM + vector search clinical decision support system at UnitedHealth Group, retrieving medical evidence and patient context in real time for prior authorization and risk scoring. Strong in end-to-end RAG architecture (Hugging Face embeddings, Pinecone/FAISS, SageMaker, Redis) plus orchestration (Airflow/Kubeflow) and rigorous evaluation/monitoring, with demonstrated ability to align solutions with clinical operations stakeholders.”
Mid-level Data Engineer specializing in cloud data warehousing and analytics
“Data engineer at American Express who owned end-to-end pipelines for transaction and customer data used in finance reporting and risk analytics, processing ~5–8M records/day. Built Airflow-orchestrated ingestion (including external APIs/web sources) with strong data quality controls, monitoring/alerts, and resilient backfill/retry patterns, and also shipped a versioned REST API serving aggregated metrics to analytics teams.”
Senior Data Scientist / ML Engineer specializing in cloud ML pipelines and GenAI
“ML/NLP practitioner with experience building a transformer-failure prediction system that combines sensor signals with unstructured maintenance comments using LLM-based extraction and similarity validation. Strong emphasis on production readiness—data leakage controls, SQL-driven data quality tiers, and rigorous bias/fairness validation (including contract/spec evaluation across diverse company profiles).”
Mid-level Data Engineer specializing in Analytics & AI/ML
“Data engineer with experience at Sony and Walmart building high-volume, near-real-time analytics and ingestion systems. Has owned end-to-end pipelines from Kafka/Spark streaming through S3/Parquet and Redshift/Looker, emphasizing data quality (Great Expectations), observability (CloudWatch/Azure Monitor), and reliability (Airflow SLAs, retries, checkpointing), including measurable performance and latency improvements.”