Vetted Data Engineers

Pre-screened and vetted.

SC

Mid-level Data Engineer specializing in cloud data platforms and real-time streaming

5y exp
Vertisage Technologies
Carnegie Mellon University

Worked on onboarding a Middle East logistics client processing thousands of invoices/month, building a production-ready pipeline that routes known vendor PDFs to deterministic regex parsers via Tax ID matching and falls back to LlamaParse for unknown layouts. Added financial consistency validation plus human-in-the-loop review and logging/metrics to continuously reduce LLM usage and improve template coverage.

HS

Senior Data Engineer specializing in multi-cloud data platforms and streaming pipelines

4y exp
Northern Trust
University of Texas at Arlington

Data platform engineer with hands-on ownership of high-volume financial data pipelines (millions of transactions/day) on Azure (ADF, Databricks, Delta Lake, Synapse), emphasizing schema-drift protection and automated data-quality gates. Also built resilient web scraping pipelines with anti-bot and backfill strategies, and shipped a versioned FastAPI + Redis data API with autoscaling, testing, and CI/CD via GitHub Actions.

Gautham Yerroju

Senior Software Engineer specializing in AWS cloud infrastructure and microservices

CA, USA
12y exp
SoftQuip Technologies
University of Nevada, Reno
EO

Senior Software Engineer specializing in distributed systems and cloud infrastructure

U.S.A.
12y exp
Elastic
University of Georgia
GN

Mid-level Data Engineer specializing in cloud-native ETL and data warehousing

Remote, USA
4y exp
PayPal
Lamar University
BS

Senior Data Scientist specializing in LLMs, NLP, and anomaly detection

Foster City, CA
9y exp
Visa
University at Buffalo
MS

Senior AI/ML Engineer specializing in GenAI, MLOps, and healthcare analytics

Chicago, IL
13y exp
Wezom
Rice University
RS

Mid-level Data Engineer specializing in AWS lakehouse and Spark pipelines

Minneapolis, MN
4y exp
Optum
Concordia University
PP

Senior Data Engineer specializing in Cloud Data Platforms and Generative AI

Brooklyn, NY
11y exp
JPMorgan Chase
Osmania University
KP

Mid-level Data Engineer specializing in GCP, Spark, and healthcare analytics

New York, NY
3y exp
CVS Health
Columbia University

Adithya Rajendra

Screened
References
Strong rec.

Junior Data Engineer specializing in Azure data platforms and GenAI analytics

Bengaluru, India
1y exp
ZEISS
UC Irvine

Data/ML practitioner with experience spanning medical imaging (retinal vessel analysis for hypertension/CVD risk prediction) and enterprise data engineering at Carl Zeiss. Built large-scale SAP data cleaning/validation pipelines (10M+ daily records, ~99% accuracy) and RAG-based semantic search with LangChain/vector DBs that cut manual querying by 82%, plus automation that reduced data onboarding from 8 hours to 12 minutes.


Timothy Wong

Screened

Mid-level Data Engineer specializing in experimentation, analytics, and AI-driven product experiences

4y exp
ZoomInfo
University of Texas at Austin

Built production LLM automations using the Claude API, including a sales enablement workflow that summarizes playbooks and incorporates sales call metadata into strategic one-pagers. Experienced in orchestrating and scheduling data pipelines with SnapLogic, Airflow, and Databricks, and in scaling LLM API calls via parallel/batch processing. Also partnered with HR to deliver prompt-tuned, automated Slack messaging aligned to business tone and acceptance criteria.


Edwin Tse

Screened

Junior Data Engineer specializing in BI, governed metrics, and workflow automation

Berkeley, CA
3y exp
EnvoyX
UC San Diego

Built and shipped LLM/OCR/NLP-driven document-intelligence workflows in operational environments (EnvoyX and UPS), emphasizing production readiness via explicit state-machine orchestration, confidence gates, and human-in-the-loop review. Demonstrated strong business impact in customs brokerage/document ingestion: 50% fewer customs rejects, 30% higher throughput, SLA adherence improved from 71% to 96%, and platform reliability reaching 99.6% with 78% fewer bad-data incidents.

AP

Mid-level Data Engineer specializing in cloud data pipelines and enterprise data platforms

4y exp
ConnectiveRx
University of Pennsylvania

Data engineer/backend engineer who owns large-scale, real-time event pipelines on AWS end-to-end, including a petabyte-scale CDC ingestion flow from multiple Postgres DBs into Redshift. Re-architected a legacy DynamoDB+S3 approach into a Delta Lake + DuckDB/PyArrow-compatible design, improving performance dramatically (e.g., ~600s to ~10s for 1k records) and increasing reliability at high file volumes.

AA

Principal Cloud & Infrastructure Engineer specializing in reliability and regulated data platforms

Remote, USA
10y exp
Khipu, LLC
North Carolina State University

Founder/CTO-type startup leader who has built cloud-native data and AI platforms from scratch while owning both technical vision and product direction. Brings rare end-to-end startup experience spanning zero-to-one building, growth-stage execution, and fundraising from early stage through exit, with a strong ability to translate technical complexity into clear investor narratives.

Darshan Patel

Screened

Mid-level Data Engineer specializing in financial and trading data

Sydney, Australia
4y exp
Australian Securities Exchange
UNSW Sydney

Quant Data Engineer at ASX who is also building FinishKit, a full-stack SaaS that scans AI-generated codebases for bugs and production-readiness issues. Combines React/TypeScript, Supabase/serverless, Fly.io, and Postgres with strong product instincts, rapid iteration, and prior experience building secure multi-tenant data and dashboard systems across enterprise teams.

GB

Mid-level AI/ML Engineer specializing in fraud detection and risk analytics in Financial Services

USA
5y exp
JPMorgan Chase
Trine University

At JPMorgan Chase, built and deployed a production LLM-powered RAG knowledge assistant to help fraud investigators and risk analysts quickly navigate regulatory updates and internal policies, reducing investigation delays and compliance risk. Strong focus on secure retrieval (RBAC filtering), reliability (layered testing + observability), and production constraints (latency/SLOs), with Airflow-orchestrated, auditable ML pipelines.


Harini Kv

Screened

Mid-level AI/ML Engineer specializing in GenAI, NLP, and MLOps

Dallas, TX
7y exp
Equinix
Fitchburg State University

GenAI/data engineering practitioner with production experience across Equinix, Optum, and Citibank—built an Azure OpenAI (GPT-4) + LangChain document intelligence platform processing 1.5M+ docs/month and a HIPAA-compliant Airflow healthcare pipeline handling 5M+ claims/day. Also delivered a real-time fraud detection + explainability system using LightGBM and a fine-tuned T5 NLG component, improving fraud accuracy by 15%+ while partnering closely with compliance stakeholders.

SG

Mid-level Data Engineer specializing in streaming and cloud data platforms for financial services

Edison, NJ
3y exp
Morgan Stanley
Pace University

Data engineering-focused candidate (internship/project experience) who built end-to-end pipelines processing a few million transactional records/day for fraud detection and reporting, using Airflow, Python/SQL, and PySpark with strong emphasis on data quality gates, idempotency, and monitoring. Also implemented an external web/API data collection system with anti-bot tactics and schema-change quarantine, and shipped a versioned Flask API to serve curated warehouse data.

Pooja Dokuri

Screened

Mid-level AI/ML Engineer specializing in GenAI, RAG pipelines, and cloud MLOps

Remote, USA
4y exp
UnitedHealth Group
East Texas A&M University

Built and deployed a production LLM + vector search clinical decision support system at UnitedHealth Group, retrieving medical evidence and patient context in real time for prior authorization and risk scoring. Strong in end-to-end RAG architecture (Hugging Face embeddings, Pinecone/FAISS, SageMaker, Redis) plus orchestration (Airflow/Kubeflow) and rigorous evaluation/monitoring, with demonstrated ability to align solutions with clinical operations stakeholders.

Samatha Amsala

Mid-level Data Engineer specializing in cloud data warehousing and analytics

Omaha, NE
6y exp
American Express
Bellevue University

Data engineer at American Express who owned end-to-end pipelines for transaction and customer data used in finance reporting and risk analytics, processing ~5–8M records/day. Built Airflow-orchestrated ingestion (including external APIs/web sources) with strong data quality controls, monitoring/alerts, and resilient backfill/retry patterns, and also shipped a versioned REST API serving aggregated metrics to analytics teams.

BG

Senior Data Scientist / ML Engineer specializing in cloud ML pipelines and GenAI

Baltimore, MD
17y exp
Intel
Illinois Institute of Technology

ML/NLP practitioner with experience building a transformer-failure prediction system that combines sensor signals with unstructured maintenance comments using LLM-based extraction and similarity validation. Strong emphasis on production readiness—data leakage controls, SQL-driven data quality tiers, and rigorous bias/fairness validation (including contract/spec evaluation across diverse company profiles).

NM

Mid-level Data Engineer specializing in Analytics & AI/ML

Virginia, USA
6y exp
Sony
Fitchburg State University

Data engineer with experience at Sony and Walmart building high-volume, near-real-time analytics and ingestion systems. Has owned end-to-end pipelines from Kafka/Spark streaming through S3/Parquet and Redshift/Looker, emphasizing data quality (Great Expectations), observability (CloudWatch/Azure Monitor), and reliability (Airflow SLAs, retries, checkpointing), including measurable performance and latency improvements.
