
Vetted Azure Data Factory Professionals

Pre-screened and vetted.

Azure Data Factory, Python, SQL, Docker, CI/CD, Amazon S3

Tate Mara

Senior Data Engineer specializing in cloud data platforms and big data pipelines

Austin, TX · 11y exp

Accenture

Agile, Amazon CloudFront, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon ECS, +208

Leela Tikkisetty

Screened

Mid-level Software Engineer specializing in ML platforms and cloud-native backend systems

San Francisco, CA · 5y exp

City and County of San Francisco · San Francisco State University

“Software engineer with experience at Google and the City and County of San Francisco building production AI systems, including a RAG-based internal support chatbot and ML-driven ticket priority tagging. Has scaled data/ML platforms with Airflow on GCP (1M+ records/day, 99.9% SLA) and deployed multi-component systems with Docker and Kubernetes (GKE), using modern LLM tooling (LangChain/CrewAI, Claude/OpenAI, Pinecone/ChromaDB, Bedrock/Ollama).”

A/B Testing, Agile, Amazon Bedrock, Amazon EKS, Amazon Redshift, Authentication, +198

Sahithi K

Screened

Mid-level Data Engineer specializing in cloud data platforms and streaming pipelines

Boston, MA · 4y exp

Moderna · University of Massachusetts Dartmouth

“Data engineer with experience at Moderna and Block owning high-volume (≈10TB/day) production pipelines on AWS, using Kafka/S3/Glue/dbt/Snowflake with strong data quality and observability practices (schema validation, anomaly detection, CloudWatch monitoring). Also built external financial API ingestion with Airflow retries, throttling/token rotation, and schema versioning, and helped stand up an early-stage biomedical data platform with CI/CD and incident debugging.”

Python, SQL, PySpark, Apache Spark, Apache Kafka, Amazon Kinesis, +94

Lalithya Manasa Patri

Screened

Senior Data Engineer specializing in cloud ETL and real-time streaming pipelines

Austin, TX · 5y exp

eBay · Texas Tech University

“Data engineer with eBay experience owning end-to-end pipelines for real-time order and user behavior analytics at 10M+ records/day. Strong in PySpark/SQL transformations, Airflow reliability patterns, and production observability (CloudWatch), with measurable outcomes including improved data quality and 30–40% query performance gains. Also built Python data APIs for analytics/ML consumers with versioning and backward compatibility.”

Python, SQL, Java, Scala, R, Apache Spark, +97

Travoy Spelling

Screened

Senior Data Scientist / ML Engineer specializing in GenAI, LLMs, and NLP

Texarkana, TX · 10y exp

Tredence · University of Texas at Austin

“ML/NLP engineer focused on production GenAI and data linking systems: built a large-scale RAG pipeline over millions of support docs using LangChain/Pinecone and added a LangGraph-based validation layer to cut hallucinations ~40%. Also built scalable PySpark entity resolution (95%+ accuracy) and fine-tuned Sentence-BERT embeddings with contrastive learning for ~30% relevance lift, with strong CI/CD and observability practices (OpenTelemetry, Prometheus/Grafana).”

A/B Testing, API Development, AWS, AWS Lambda, AWS Step Functions, Azure Data Factory, +247

Byron Pineda

Screened

Staff/Lead Data Scientist specializing in Generative AI, NLP/LLMs, and MLOps

Pascagoula, MS · 10y exp

Turing · Mississippi State University

“Lead Data Scientist (10+ years) with recent work in healthcare data: built production pipelines that unify EHR, genomics, and clinical notes using NLP (spaCy/BERT/BioBERT) and scalable Spark-based processing. Also led development of domain-specific LLM/NLP systems for chatbots and semantic search, deploying models via FastAPI/Flask and improving retrieval with FAISS-backed, fine-tuned clinical embeddings and RAG-style workflows.”

Python, R, SQL, Pandas, NumPy, Scikit-learn, +132

Saiteja Gaddam

Screened

Mid-Level Data Engineer specializing in cloud data platforms and streaming analytics

3y exp

Intuit · University at Buffalo

“Data engineer (Intuit) who owned an end-to-end telemetry and subscription analytics platform processing ~22M events/day, built on Kinesis/S3/Glue/Spark/Airflow/Redshift. Strong focus on reliability and data quality (schema drift controls, quarantine layers, idempotent reruns) and performance tuning, achieving a reporting latency reduction from ~15 minutes to under 4 minutes while enabling revenue and churn analytics for business teams.”

Scala, Hibernate, JSON, HTML, CSS, SQL, +120

Melbourne Brown

Screened

Senior Software Engineer specializing in AI-driven cloud-native platforms

Atlanta, GA · 12y exp

McKinsey & Company · Kennesaw State University

“Engineer with unusual breadth: from a tiny startup building racehorse medical-record systems on credit-card chips for live racetrack demos to modern AI-powered contract intelligence platforms in production. Brings hands-on full-stack and backend depth across React, Python, .NET, PostgreSQL, Kubernetes, and Azure, with a track record of making complex, reliability-sensitive systems work in real-world conditions.”

.NET, Python, C#, C++, Java, JavaScript, +196

Prachi Jain

Screened

Mid-level Machine Learning Engineer specializing in NLP, LLMs, and MLOps

Remote, US · 6y exp

JPMorgan Chase · University of Massachusetts Amherst

“Built and productionized a RAG-based analytics Q&A assistant for a financial analytics team, enabling natural-language querying across 200+ datasets (SQL tables, PDFs, compliance docs, wikis) and cutting turnaround time by 60%. Deep experience delivering regulated, audit-ready LLM systems on Azure (Azure OpenAI + LangChain) with strict grounding/citations, hybrid retrieval, and AKS-based low-latency deployment, plus strong collaboration with compliance analysts and auditors via iterative Gradio demos.”

Python, C, C++, SQL, MATLAB, HTML, +129

Sai Venkata

Screened

Senior Data Engineer specializing in cloud lakehouse and real-time streaming pipelines

Texas, USA · 6y exp

CVS Health · University of Central Missouri

“Senior data engineer with experience in both healthcare (CVS Health) and financial services (Bank of America), building large-scale Azure lakehouse pipelines (30+ EHR sources, ~5TB) and real-time streaming services (Event Hubs/Kafka) for patient vitals. Strong focus on reliability and data quality (Great Expectations, monitoring/alerting, schema drift automation), with measurable outcomes like 50% runtime reduction and 99%+ uptime for regulatory reporting pipelines.”

Python, SQL, Scala, Java, Shell Scripting, Apache Spark, +117

Jahnavi Vasala

Screened

Mid-level Data Engineer specializing in cloud data platforms and streaming pipelines

San Diego, CA · 6y exp

Intuit · Cleveland State University

“Data engineer with Intuit experience owning end-to-end, high-volume financial data pipelines (API/S3 ingestion, Airflow orchestration, Spark/PySpark + SQL transforms, Snowflake marts). Strong focus on reliability and data quality—achieved 99.8% SLA and cut discrepancies by 35% using Great Expectations, reconciliation, schema versioning, and automated backfills; also built near real-time Kafka/API data services with CI/CD and observability.”

Python, SQL, PySpark, Scala, Shell scripting, Apache Spark, +87

Biplob Bidari

Screened

Senior Data Engineer specializing in FinTech analytics and ML data platforms

USA · 5y exp

Goldman Sachs · University of the Cumberlands

“ML/AI engineer with Goldman Sachs experience building production fraud detection and RAG-based trading insights systems end-to-end. Stands out for combining real-time ML infrastructure, GenAI retrieval systems, and compliance-aware design, with measurable impact including nearly 25% false-positive reduction and improved analyst productivity.”

Python, Pandas, NumPy, PySpark, SQL, Bash, +139

Kevin Lim

Screened

Intern Software Engineer specializing in data science and machine learning

Remote · 2y exp

StylistGem · UC Berkeley

“Backend engineer with hands-on experience building Flask REST APIs (auth, CRUD, S3 media uploads) and driving measurable Postgres/SQLAlchemy performance gains (p95 reduced to 200–400ms by eliminating N+1s and switching to keyset pagination). Implemented multi-tenant isolation with strict tenant scoping plus Postgres RLS, and built an OpenAI-powered quiz generation pipeline using queued workers, structured JSON outputs, and Celery/Redis optimizations to stabilize high-throughput workloads.”

API Development, AWS, Azure Functions, CI/CD, Cloud Computing, CSS, +108

Sandeep Reddy Karumudi

Screened

Mid-level Data & Business Analyst specializing in analytics engineering and BI

6y exp

Adobe · University of Wisconsin–Madison

“Data/analytics professional with experience across manufacturing and enterprise environments (Wisconsin School of Business project with CNH Industrial; roles/projects at Ascensia Technologies, S&C, and Adobe). Has hands-on work combining warranty/lifecycle tables with technician free-text notes using TF-IDF + tree models (XGBoost/Random Forest), and deep experience in entity resolution/reconciliation across mismatched financial systems using Python/SQL and fuzzy matching, with production-grade pipeline practices in Azure Data Factory/Databricks.”

Python, Pandas, NumPy, scikit-learn, R, SQL, +119

Palak Siroya

Screened

Senior Site Reliability Engineer specializing in Azure cloud reliability and data analytics

Renton, WA · 10y exp

Microsoft · Central Washington University

“AppSec-focused customer advisor with hands-on experience integrating SAST/DAST/SCA into production CI/CD (Azure DevOps) and designing secure agent/scanning deployments in AWS (least-privilege IAM, private subnets, VPC endpoints). Demonstrates strong incident troubleshooting using logs/metrics/traces to diagnose load-related failures (timeouts/retry storms) and drive durable fixes, while tailoring risk/tradeoff communication across engineering, security, and leadership stakeholders.”

Automation, Azure Data Factory, Azure DevOps, Azure SQL Database, CI/CD, C, +125

Praveen Nutulapati

Screened

Mid-level Generative AI Engineer specializing in LLM fine-tuning, RAG, and agentic systems

New York, NY · 6y exp

JPMorgan Chase · University of Central Missouri

“Built and deployed a production multi-agent RAG system at JPMorgan Chase to automate regulated credit analysis and compliance clause discovery across large internal policy/document libraries. Implemented LangGraph-based supervisor orchestration with structured state management (Azure OpenAI) to support long-running, resumable workflows, plus hybrid retrieval + re-ranking and guardrails for reliability. Strong at evaluation/observability (trace logging, LLM-judge, HITL) and at communicating results to non-technical stakeholders via Power BI embeds and Streamlit prototypes.”

A/B Testing, Agile, Amazon Bedrock, Amazon EC2, Amazon RDS, Amazon SageMaker, +184

Tori Cole

Screened

Senior Full-Stack Software Engineer specializing in .NET, cloud collaboration, and enterprise platforms

Houston, TX · 13y exp

The Bryant Heritage LLC · University of Houston-Downtown

“Serial entrepreneur with 15+ years in the VC, studio, and accelerator ecosystem who has founded multiple startups, raised capital previously, and built a consulting business running since 2008. Currently building a pre-seed SaaS marketplace for long-term housing in Texas with plans to expand across the U.S. and into Portugal, bringing a notably strategic focus on long-term market trends and exit planning.”

C#, JavaScript, React, Angular, JSON, Microsoft Azure, +122

Sirisha Maddikunta

Screened

Mid-level Generative AI Engineer specializing in enterprise LLM and healthcare AI solutions

O'Fallon, MO · 6y exp

Mastercard · University of Texas at Arlington

“Built and owned an end-to-end LLM-powered fraud investigation assistant that automated case summaries and risk analysis, cutting analyst investigation/documentation time by 40%. Stands out for translating RAG concepts into a production-grade internal platform with strong evaluation, monitoring, and reusable Python service architecture that improved both analyst trust and engineering velocity.”

Generative AI, Natural Language Processing, Computer Vision, Prompt Engineering, Retrieval-Augmented Generation, LoRA, +234

Rahul Reddy

Screened

Senior Data Engineer specializing in cloud data platforms and big data pipelines

New York, NY · 6y exp

CVS Health · Southern Arkansas University

“Data engineer with healthcare (CVS Health) experience who migrated production PySpark workloads to native BigQuery SQL and built a Great Expectations-based validation microservice on GKE (Flask + REST) integrated into Cloud Composer. Has operated high-volume pipelines (~300–400GB/day) and designed external vendor ingestion on AWS (Lambda/Step Functions/Glue) with schema-drift detection, alerting, and backfill-safe controls to protect downstream Snowflake/BigQuery tables.”

Python, Java, SQL, MySQL, PostgreSQL, Apache Hive, +118

Bhanu Chander

Screened

Senior Data Engineer specializing in cloud data platforms and real-time pipelines

New York, NY · 6y exp

Disney · Indiana Wesleyan University

“Data engineer focused on reliability and observability, building end-to-end pipelines processing millions of records/day from sources like S3 and Kafka. Has hands-on experience with Airflow-based data quality automation, PySpark/Databricks transformations, and shipping versioned Python REST APIs deployed via Docker/Kubernetes with CI/CD (Jenkins) and monitoring (CloudWatch/Azure Logs).”

Python, SQL, Scala, C#, JavaScript, Java, +140

Shalini Jeela

Screened

Senior Data Engineer specializing in data pipelines, APIs, and machine learning

Austin, TX · 6y exp

Expedia · Trine University

“Data engineer with experience at Expedia building SQL Server and Azure Data Factory pipelines for business reporting and analytics. Stands out for pragmatic end-to-end pipeline ownership in ambiguous environments, with a strong emphasis on data quality, rerunnability, query performance, and making downstream datasets reliable for other teams.”

Python, SQL, Java, C#, JavaScript, R, +100

Apoorva Nanabolu

Screened

Senior Data Scientist / Generative AI Engineer specializing in fraud, risk, and MLOps

5y exp

PayPal · University of New Haven

“Built and deployed a production LLM/RAG fraud investigation system to replace manual investigator workflows, combining transaction data, historical cases, and policy documents with agent-style steps and LoRA fine-tuning. Demonstrates strong reliability engineering (grounding, citations, abstention paths), performance optimization (retrieval/indexing/caching), and end-to-end MLOps orchestration using Azure ML Pipelines/MLflow plus Kubernetes/Argo with canary and rollback deployments.”

Python, R, SQL, NoSQL, Snowflake, BigQuery, +178
