Vetted Azure Data Factory Professionals

Pre-screened and vetted.

Mid-level Full-Stack Engineer specializing in FinTech and cloud-native applications

Lewisville, TX | 3y exp
Fidelity Investments | Northeastern University

Senior Data Scientist specializing in NLP, MLOps, and cloud ML platforms

Westfield Center, OH | 7y exp
Westfield Insurance

Mid-level Data Scientist & Generative AI Engineer specializing in LLMs and RAG

Auburn Hills, MI | 4y exp
Stellantis | University of Cincinnati

ML/NLP practitioner who built a retrieval-augmented generation (RAG) system for large financial and operational document sets using Sentence-Transformers (all-mpnet-base-v2) and a vector DB (e.g., Pinecone), with a strong focus on retrieval evaluation and chunking strategy optimization. Experienced in entity resolution (rules + embedding similarity with type-specific thresholds) and in productionizing scalable Python data workflows using Airflow/Dagster and Spark.

Senior Data Engineer specializing in cloud data platforms and ML pipelines

Atlanta, GA | 8y exp
Berkshire Hathaway | University of Alabama at Birmingham

Data engineer focused on AWS-based enterprise data platforms, owning end-to-end pipelines from multi-source batch/stream ingestion (Glue/Kinesis/StreamSets/Airflow) through PySpark transformations into curated datasets for Redshift/Snowflake. Emphasizes production reliability with strong monitoring/observability and data quality gates, and reports ~30% performance improvement plus improved SLAs and latency after optimization.

Varshitha K

Screened

Mid-level Data Engineer specializing in cloud data platforms and lakehouse architectures

Lakewood, CO | 4y exp
First Bank | University of Central Missouri

Data engineer in a banking context who has owned end-to-end Azure lakehouse pipelines ingesting financial/vendor data from APIs, Azure SQL, and flat files into Databricks/Delta (bronze-silver-gold). Emphasizes production reliability via schema-drift validation, data quality controls, monitoring/alerting, retries/checkpointing, and Spark/Delta performance tuning, with outputs served to BI/reporting teams (e.g., Tableau).

Sharanya Rao

Screened

Mid-level AI/ML Engineer specializing in NLP, LLMs, and RAG for finance and healthcare

Remote, USA | 3y exp
Ally Financial | University of Maryland, Baltimore County

Built an AI lending assistant (RAG + DeBERTa) used by credit analysts to retrieve policies and past loan decisions, tackling real production issues like hallucinations, document quality, and sub-second latency. Deployed a modular, Dockerized AWS architecture (ECS/EMR + load balancer) with load testing, caching/precomputed embeddings, and CloudWatch monitoring, and used Airflow to automate scheduled data/embedding/vector DB refresh pipelines with retries and alerts.

Srilekha Pothula

Mid-level Data Engineer specializing in cloud data pipelines for healthcare and financial services

Bloomfield, CT | 4y exp
Cigna | Pace University

Data engineer with ~4 years of experience (Cigna) building and operating Azure Data Factory pipelines for healthcare claims/member/provider data at 2–3M records/day. Emphasizes reliability and downstream safety via schema/data-quality validation, quarantine workflows, idempotent processing, and backfills; also improved runtime ~20% through SQL optimization and served curated datasets through versioned views and well-documented, analyst-friendly interfaces.

Agna Antony

Screened

Mid-level Data Engineer specializing in cloud-native healthcare and enterprise data platforms

Michigan, USA | 5y exp
MedStar Health | APJ Abdul Kalam Technological University

Data Engineer (TCS) who owned an end-to-end CRM analytics pipeline for Bayer’s eSalesWeb integration, ingesting from Salesforce APIs/databases/S3 and serving analytics-ready datasets via PostgreSQL/S3 for Tableau. Drove measurable outcomes: ~60% reduction in manual data-quality effort, ~30% lower latency through SQL optimization, and ~35% improved stability via monitoring, retries, and idempotent processing.

Senior AI/ML Engineer specializing in healthcare AI and MLOps

Mansfield, TX | 16y exp
McKesson | Sam Houston State University

Healthcare AI engineer with hands-on ownership of production ML and LLM systems at McKesson, spanning clinical risk prediction and RAG-based documentation tools. Stands out for combining deep clinical-data experience, HIPAA-aware deployment practices, and measurable impact through reduced readmissions, clinician workflow gains, and 20% to 30% faster ML delivery for engineering teams.

Harika M

Screened

Mid-level Full-Stack Developer specializing in FinTech and enterprise platforms

Plano, TX | 6y exp
PennyMac | University of Central Missouri

Engineer with a pragmatic, production-focused approach to AI-assisted development, using tools like Copilot and ChatGPT to accelerate coding while maintaining strict validation for correctness, security, and performance. Particularly notable for building a multi-agent incident-resolution workflow for a financial platform, with specialized agents for log analysis, root cause identification, fix suggestions, and test generation.

Senior Data & Platform Engineer specializing in cloud-native streaming and distributed systems

USA | 10y exp
JPMorgan Chase | New York Institute of Technology

Financial data engineer who has built and operated high-volume batch + streaming pipelines (200–300 GB/day; 5–10k events/sec) using AWS, Spark/Delta, Airflow, Kafka, and Snowflake, with strong emphasis on data quality and reliability. Demonstrated measurable impact via 99.9% SLA adherence, major reductions in bad records/nulls, MTTR improvements, and significant latency/runtime/query performance gains; also built a distributed web-scraping system processing 5–10M records/day with anti-bot and schema-drift defenses.

Mid-level Data Engineer specializing in multi-cloud data platforms for healthcare and finance

USA | 6y exp
Cigna | University of Cincinnati

Data engineer with Cigna experience building and operating an end-to-end AWS-based healthcare claims pipeline processing ~2TB/day, using Glue/Kafka/PySpark/SQL into Redshift. Strong focus on data quality and reliability (schema validation, monitoring/alerting, retries/checkpointing/backfills), reporting improved accuracy (~99%) and reduced latency, plus experience serving real-time Kafka/Spark data to downstream analytics with documented data contracts.

Mid-Level Full-Stack Software Engineer specializing in cloud-native and GenAI solutions

Remote, USA | 5y exp
Capital One | University of North Carolina at Charlotte

Built and shipped production RAG-based LLM agents automating multi-step document query workflows, emphasizing reliability via monitoring, retries, structured exception handling, and fallback retrieval (alternative embeddings/keyword search). Demonstrated measurable gains (18% latency improvement, 25% retrieval efficiency, 12% precision) and has experience integrating agents with messy tax and transaction data at RSM using validation/cleaning and idempotent design.

Mid-level Data Engineer specializing in cloud ETL and real-time streaming

New York, NY | 6y exp
PNC | Rochester Institute of Technology

Data engineer focused on AWS + Spark/Databricks pipelines, including an end-to-end nightly loan-data ingestion flow (~2.2M records) from Postgres/S3 through Glue and Databricks into a DWH with layered validation and alerting. Also built real-time streaming with Kafka + Spark Structured Streaming and a master’s project streaming Reddit data for sentiment analysis under ambiguous requirements and tight budget constraints.

Mohammad Sami

Screened

Mid-level Data Analyst specializing in financial services and fraud analytics

Beaverton, OR | 3y exp
Facteus, Inc | University of Tampa

Analytics candidate currently at Facteus with hands-on experience turning messy transactional data into trusted reporting layers in Snowflake and Power BI. They combine SQL and Python automation with strong validation, performance tuning, and stakeholder-facing metric design, including cohort-based retention and segmentation work that improved trust and adoption of analytics.

Mid-level Data Analyst specializing in healthcare and business intelligence

Michigan, USA | 4y exp
Banner Health | Trine University

Healthcare analytics candidate with hands-on experience turning messy EHR, billing, and operational data into validated SQL datasets and automated Python/Airflow pipelines. They appear strongest in hospital KPI reporting—especially length of stay, readmissions, retention, and bed utilization—and have owned projects from metric definition through Power BI delivery and impact measurement.

Alekya Battu

Screened

Mid-level Data Scientist specializing in machine learning, MLOps, and cloud analytics

USA | 5y exp
Wells Fargo | Wilmington University

Data scientist with ~5 years’ experience building production ML/NLP systems in finance (Wells Fargo) and deep learning for sensor analytics in connected vehicles (Medtronic). Has delivered end-to-end platforms combining time-series forecasting with transformer-based NLP, including automated drift monitoring/retraining (MLflow + Airflow) and standardized Docker/CI/CD deployments; achieved a reported 22% precision improvement after domain fine-tuning.

Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems

Remote, USA | 3y exp
Discover | University of South Dakota

Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.

Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI

Florida, USA | 6y exp
LexisNexis | University of South Florida

AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.

Somil Shah

Screened

Mid-level AI/ML Engineer specializing in generative AI, RAG platforms, and LLM agents

San Francisco, CA | 4y exp
INTERACT Animal Lab | Northeastern University

AI/LLM engineer who has shipped 10+ production applications, including InvestIQ on GCP—a production-grade RAG due-diligence engine that ethically scrapes web/PDF sources, builds a ChromaDB knowledge base, and delivers analyst-style dashboards plus a citation-backed chat copilot. Deep focus on reliability (evidence-only answers, hard citations, refusal gating), retrieval tuning, and orchestration (Airflow/Cloud Composer), plus multi-agent systems (CrewAI with 7 specialized finance agents).

Brian Mar

Screened

Senior Data Engineer specializing in data infrastructure and marketing/CRM analytics

San Mateo, CA | 8y exp
Full Circle Insights | UC Davis

Salesforce-focused implementation/solutions engineer from Full Circle Insights who owned end-to-end campaign attribution and reporting deployments for multiple customers at once (3–5 concurrently), including sandbox testing, KPI monitoring, and rollback-safe migrations from legacy reporting. Also builds personal multi-agent workflows and uses Claude Code to rapidly scaffold data/analytics scripts like an advertising optimization parser over CSV/XLSX inputs.

Mid-level Data Analyst/Data Engineer specializing in BI, ETL pipelines, and cloud analytics

4y exp
Verizon | Lindsey Wilson College

Data engineer focused on marketing/web analytics and external API pipelines, handling ~10M records/week. Built Azure-based ingestion and PySpark transformations with rigorous data quality checks, then served curated datasets into Synapse/Redshift for Power BI. Also designed an Airflow-orchestrated crypto REST API pipeline with monitoring, retries/exponential backoff, schema-change detection, and backfill-friendly reprocessing.

Mid-level GenAI Engineer specializing in LLM fine-tuning, RAG, and MLOps

Glassboro, NJ | 5y exp
HCLTech | Rowan University

Healthcare-focused LLM engineer who deployed a production triage and clinical knowledge retrieval assistant using RAG and LangGraph-orchestrated multi-agent workflows. Emphasizes clinical safety and compliance with robust hallucination controls, HIPAA/PHI protections (tokenization, encryption, audit logging, zero-retention), and human-in-the-loop escalation; reports a 75% latency reduction in a healthcare agent system.

Senior AI/ML Engineer specializing in Generative AI, LLMs, and MLOps

Tampa, FL | 9y exp
Verizon | Jawaharlal Nehru Technological University

Telecom (Verizon) AI/ML practitioner who built a production multimodal system that ingests messy customer issue reports (calls, chats, emails, screenshots, videos) and turns them into confidence-scored incident summaries with reproducible steps and evidence links. Also built KPI/alarm-to-ticket correlation to rank likely root-cause domains (RAN/Core/Transport), cutting triage from hours to minutes and improving MTTR.
