Vetted PySpark Professionals

Pre-screened and vetted.

SK

Mid-level Full-Stack Python Developer specializing in cloud, data engineering, and AI/ML

Washington, USA4y exp
Fannie MaeSt. Francis College

Full stack Python developer who actively integrates AI coding assistants into day-to-day engineering work, including code generation, debugging, testing, and documentation. Has also coordinated multi-agent workflows across backend, frontend, testing, and code review, showing an applied, productivity-focused approach to AI-enabled software delivery.

View profile
VB

Entry Data Scientist specializing in data engineering and automotive analytics

Bangalore, India1y exp
Tata ElxsiUniversity of Cincinnati

Frontend-focused candidate with hands-on experience building React and TypeScript dashboards for searching, filtering, and analyzing large datasets in real time. Demonstrates practical performance tuning skills using React DevTools, memoization, debouncing, and pagination, and has also built a Mapbox-based location data dashboard with interactive markers and popups.

View profile
MB

Mid-level Python Developer specializing in FinTech and banking platforms

USA3y exp
IntuitUniversity of Bridgeport

Built and owned an AI-powered real-time financial fraud detection and monitoring platform end-to-end, spanning product decisions, backend architecture, frontend dashboards, deployment, and production support. Their work scaled to 120M transactions/day and materially improved fraud detection accuracy from 78% to 94%, showing rare breadth across distributed systems, observability, and React-based operational analytics.

View profile
FB

Farhath Banu

Screened

Senior Software Engineer specializing in AI-driven marketing and data platforms

Boston, MA7y exp
PostscriptShadan College of Engineering and Technology

Backend/data engineer who builds production FastAPI microservices and AWS serverless/Glue pipelines for SMS analytics and marketing segmentation. Led a legacy batch modernization into modular services (FastAPI + Glue/Athena + ClickHouse) using shadow-mode parity checks, feature flags, and incremental rollout. Demonstrated measurable performance wins (12s to sub-second SQL; ~40% CPU reduction) and strong incident ownership with proactive schema-drift prevention.

View profile
SJ

Mid-level AI/ML Engineer specializing in fraud detection and healthcare predictive analytics

Missouri, USA4y exp
KPMGUniversity of Central Missouri

Built and deployed a production LLM-powered calorie-counting chatbot that turns plain-English meal descriptions into normalized food entities, quantities, and calorie estimates using a hybrid transformer + rule-engine pipeline. Emphasizes reliability with schema/constraint guardrails, confidence-based routing (including embedding similarity search fallbacks), and strong observability/metrics (hallucination rate, calibration, latency, cost). Partnered closely with nutritionists to encode domain standards into mappings and validation logic.

View profile
AM

Alex Manni

Screened

Senior Agile/Product Delivery Leader specializing in enterprise transformation, data and cybersecurity

London, UK39y exp
OfcomPolitecnico di Milano

Built a web-based online Sudoku game in JavaScript (multiplayer format supporting up to 6 teams with up to 5 players each) and demonstrates strong product/analytics orientation. Uses a KPI-driven approach (DAU/WAU, ARPU, session duration, LTV) and structured prioritization methods (MoSCoW, story mapping, cost of delay, DFV) to iterate toward targets; seeking a remote role around $70k/year.

View profile
GC

Mid-Level Full-Stack Software Engineer specializing in healthcare, cloud, and data platforms

Sunnyvale, CA5y exp
Intuitive SurgicalStevens Institute of Technology

Backend/platform engineer who owned a real-time customer analytics microservice stack in Python/FastAPI with Kafka streaming into PostgreSQL, including schema enforcement (Avro) and high-throughput optimizations. Strong Kubernetes + GitOps practitioner (EKS/GKE, Helm, Argo CD) who has handled CI/CD reliability issues with automated pre-deploy checks and rollbacks, and supported major migrations (on-prem to AWS; VM to EKS) with blue-green cutover planning.

View profile
SP

Mid-level Machine Learning Engineer specializing in LLM systems and healthcare data automation

California, USA2y exp
Prime HealthcareUSC

React performance-focused engineer who contributed performance patches back to an open-source context+reducer state helper after profiling and fixing excessive re-renders in an enterprise project management platform at Easley Dunn Productions. Also built an end-to-end LLM-driven pipeline at Prime Healthcare to normalize millions of supply-chain records, reducing defects by 80% and saving 160+ hours/month.

View profile
NJ

Mid-level Data & AI Engineer specializing in healthcare data pipelines and MLOps

FL, USA4y exp
HumanaFlorida State University

Built and deployed a production LLM-powered clinical note summarization system used by care managers to speed review of 5–20 page unstructured medical records. Implemented safety-focused validation (prompt constraints, rule-based and section-level checks, human-in-the-loop) to reduce hallucinations while maintaining low latency and meeting privacy/regulatory constraints, integrating via APIs into existing clinical tools.

View profile
SM

Mid-level AI/ML Engineer specializing in Generative AI and NLP

Dallas, TX5y exp
Gilead SciencesUniversity of North Texas

AI/LLM engineer with production experience building secure, scalable compliance-focused generative AI systems (GPT-3/4, BERT) including RAG over internal regulatory document bases. Has delivered end-to-end pipelines on AWS with PySpark/Airflow/Kubernetes/FastAPI, emphasizing privacy controls, monitoring, and iterative evaluation (A/B testing). Also partnered closely with bank compliance officers using prototypes to refine NLP summarization/classification and reduce document review time.

View profile
HR

Mid-level Data Engineer specializing in scalable ETL, streaming analytics, and cloud data platforms

Remote, USA7y exp
Dreamline AICalifornia State University, Fullerton

At Dreamline AI, built and productionized an AWS-based incentive intelligence platform that uses Llama-2/GPT-4 to extract eligibility rules from unstructured state policy documents into structured JSON, then processes them with Glue/PySpark and serves results via Lambda/SageMaker/API Gateway. Designed state-specific ingestion connectors plus schema validation and automated checks/alerts to handle frequent policy/format changes without breaking the pipeline, and partnered with business/analytics stakeholders to deliver interpretable eligibility decisions via explanations and dashboards.

View profile
RK

Mid-level AI/ML Engineer specializing in MLOps and LLM-powered applications

Mountain View, CA5y exp
IntuitUniversity of Central Missouri

AI/ML engineer with production experience building a RAG-based internal analytics assistant (Databricks + ADF ingestion, Pinecone vector store, LangChain orchestration) deployed via Docker on AWS SageMaker with CI/CD and MLflow. Strong focus on real-world constraints—latency/cost optimization (LoRA ~60% compute reduction), hallucination control with citation grounding, and enterprise security/governance. Previously at Intuit, delivered an interpretable churn prediction system (PySpark/Databricks, Airflow/Azure ML) that improved retention targeting ~12%.

View profile
PM

Mid-level AI/ML Engineer specializing in NLP, Generative AI, and MLOps in Financial Services

Austin, TX5y exp
Charles SchwabUniversity of Central Missouri

ML/LLM engineer at Charles Schwab who built a production loan-advisor chatbot integrated with internal knowledge and loan-calculator APIs, adding strict numeric validation to prevent rate hallucinations and optimizing context to control costs. Also runs ~40 Airflow DAGs orchestrating retraining/ETL/drift monitoring with an automated Snowflake→SageMaker→auto-deploy pipeline, and uses rigorous testing plus canary rollouts tied to business metrics and compliance constraints.

View profile
NK

Senior Data Scientist / ML Engineer specializing in NLP, anomaly detection, and cloud ML platforms

Remote, CA10y exp
EmotionallNMIMS University

ML/NLP practitioner who built customer-feedback topic modeling (NMF + TF-IDF) to diagnose chatbot-to-agent handovers and drove product/ops changes that reduced operational costs by 20%. Also developed LSTM-based intent recognition using Word2Vec/GloVe embeddings for semantic linking, and deployed an LSTM autoencoder for fraud anomaly detection that cut false positives by 25% while capturing 15% more fraud in A/B testing.

View profile
SM

Mid-level AI/ML Engineer specializing in GenAI agents, RAG pipelines, and MLOps

USA6y exp
UnitedHealthcareKent State University

AI/ML engineer who built a production RAG-based internal document intelligence assistant (LangChain + Pinecone) to let employees query enterprise reports in natural language. Demonstrated hands-on pipeline orchestration with Apache Airflow and tackled real production issues like retrieval grounding and latency using tuning, caching, and token optimization, while partnering closely with non-technical business stakeholders through iterative demos.

View profile
RW

Ruijing Wang

Screened

Intern Data Scientist specializing in healthcare AI and experimentation

Boulder, CO1y exp
EchoPlus AIStevens Institute of Technology

Human-AI Design Lab practitioner who productionized a wearable-health anomaly detection system by evolving a standalone autoencoder into a hybrid autoencoder + GPT-based approach, backed by PySpark ETL and MLOps on AWS SageMaker/MLflow. Also has applied LLM troubleshooting experience (fine-tuned FLAN-T5 summarization) and partnered with BI teams to run A/B tests and improve retention via feature stores and experimentation.

View profile
DD

Mid-level Data Scientist specializing in Generative AI, RAG systems, and ML engineering

Amherst, MA6y exp
University of Massachusetts AmherstUniversity of Massachusetts Amherst

AI/LLM engineer who built a production QA RAG for a University of Massachusetts faculty success initiative, cutting service tickets by 70%. Strong end-to-end RAG implementation skills (LangChain, Qdrant, hybrid/HyDE retrieval, FastAPI) with rigorous evaluation (RAGAS, LLM-as-judge) and practical handling of constraints like API rate limits and cost. Prior cross-functional delivery experience collaborating with SMEs and business owners at TCS and IBM.

View profile
HG

Senior Data Engineer specializing in cloud-native data platforms for finance and healthcare

Charlotte, NC4y exp
Bank of AmericaUniversity of Cincinnati

Data engineer/backend data services practitioner with Bank of America experience building real-time and batch transaction-monitoring pipelines and APIs (Kafka + databases, REST/GraphQL). Highlights include a reported 45% response-time improvement through performance optimizations and use of Delta Lake schema evolution plus CI/CD (GitHub Actions/Jenkins) and operational reliability patterns like CloudWatch monitoring and dead-letter queues.

View profile
MV

Senior Data Engineer specializing in cloud data platforms and big data pipelines

Seattle, WA8y exp
SafecoFitchburg State University

Data engineer focused on building reliable, production-grade pipelines and external data collection systems on AWS (S3/Lambda/SQS/Glue/EMR) using PySpark/SQL, serving curated datasets to Snowflake/Redshift for finance and fraud teams. Has operated a large-scale crawler ingesting millions of records/day with anti-bot tactics, schema versioning/quarantine, and CloudWatch/Datadog monitoring, and also shipped a versioned REST API with caching and query optimization.

View profile
AA

Aayush Anand

Screened

Intern Full-Stack/Software Engineer specializing in web apps, cloud, and data/ML systems

New York, NY1y exp
The NorthStar GroupNYU

Built and productionized LLM-driven content intelligence/SEO agents for a high-traffic media platform, automating tagging/summarization/metadata with FastAPI + async orchestration and strict JSON-schema outputs. Demonstrated measurable impact (40% faster publishing, +20% organic traffic in 3 months) and strong reliability practices (offline evals, shadow mode, canaries, fallbacks, idempotency, and monitoring).

View profile
SR

Mid-level Data Engineer specializing in cloud ETL/ELT and big data pipelines

Columbus, OH4y exp
Western Alliance BankUniversity of Missouri-Kansas City

Data engineer focused on production-grade pipelines and data services: ingests millions of records/day into S3, performs SQL/Python quality validation and PySpark/SQL transformations, and serves curated datasets via Athena/Redshift. Has experience hardening external data collection with retries/rate-limit handling and shipping versioned internal data APIs with backward compatibility, monitoring, and CI/CD in early-stage environments.

View profile
NS

Mid-level ML Data Engineer specializing in MLOps and scalable healthcare data pipelines

Boston, MA5y exp
CignaNortheastern University

Data/ML platform engineer with healthcare (Cigna) experience owning an end-to-end pipeline spanning Airflow + Debezium CDC ingestion, PySpark/SQL transformations, rigorous data quality gates, and feature-store/API serving for ML training and inference. Worked at 10+ TB scale and cites a ~30% latency reduction plus stronger reliability via idempotent design, monitoring, and backfill-safe reprocessing; also built pragmatic early-stage data pipelines at Frankenbuild Ventures.

View profile
Molli Dinesh - Mid-level AI/ML Engineer specializing in NLP, LLMs, and MLOps in Remote, USA

Molli Dinesh

Screened

Mid-level AI/ML Engineer specializing in NLP, LLMs, and MLOps

Remote, USA4y exp
Marsh McLennanIllinois Institute of Technology

Built an AI-driven insurance policy summarization platform at Marsh, taking it end-to-end from messy PDF ingestion/OCR and custom extraction through LLM fine-tuning and AWS SageMaker deployment. Delivered measurable impact (25% reduction in manual review time, 99% uptime) and demonstrated strong production MLOps/LLMOps practices with Airflow/Step Functions orchestration, rigorous evaluation (ROUGE + human review), and continuous monitoring for drift, latency, and hallucinations.

View profile
SUMIT MAMTANI - Mid-level Data Scientist specializing in ML, MLOps, and customer analytics in Tempe, AZ

SUMIT MAMTANI

Screened

Mid-level Data Scientist specializing in ML, MLOps, and customer analytics

Tempe, AZ4y exp
QlikArizona State University

ML/NLP practitioner focused on insurance/claims analytics for a large financial firm, working with millions of fragmented structured and unstructured records. Built production-grade pipelines for entity extraction, entity resolution, and semantic search using Sentence-BERT + vector DB, including fine-tuning with contrastive learning (reported ~15% recall lift) and scalable ETL/containerized deployment on Kubernetes.

View profile

Need someone specific?

AI Search