Vetted Apache Spark Professionals

Pre-screened and vetted.

Apache Spark, Python, Docker, SQL, AWS, CI/CD

Narayanaroyal Marisetty

Screened

Mid-level Data Scientist/ML Engineer specializing in healthcare AI and MLOps

USA | 4y exp

CVS Health | University at Buffalo

“Designed and deployed an enterprise LLM-powered clinical/pharmacy policy knowledge assistant at CVS Health, replacing manual searches across PDFs/Word/SharePoint with a HIPAA-compliant RAG system. Built end-to-end ingestion and orchestration (Airflow + Azure ML/Data Lake + vector index) with PHI masking, versioned re-embedding, and production monitoring (Prometheus/Grafana), and partnered closely with clinicians/compliance to ensure policy-grounded, auditable answers.”

A/B Testing, Apache Airflow, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, +132 more

Neeraj Jawahirani

Screened

Mid-level Data & AI Engineer specializing in healthcare data pipelines and MLOps

FL, USA | 4y exp

Humana | Florida State University

“Built and deployed a production LLM-powered clinical note summarization system used by care managers to speed review of 5–20 page unstructured medical records. Implemented safety-focused validation (prompt constraints, rule-based and section-level checks, human-in-the-loop) to reduce hallucinations while maintaining low latency and meeting privacy/regulatory constraints, integrating via APIs into existing clinical tools.”

Agile, Amazon CloudWatch, Amazon Redshift, Amazon S3, Amazon SageMaker, Ansible, +122 more

Siva Manikanta Lakumarapu

Screened

Mid-level AI/ML Engineer specializing in Generative AI and NLP

Dallas, TX | 5y exp

Gilead Sciences | University of North Texas

“AI/LLM engineer with production experience building secure, scalable compliance-focused generative AI systems (GPT-3/4, BERT) including RAG over internal regulatory document bases. Has delivered end-to-end pipelines on AWS with PySpark/Airflow/Kubernetes/FastAPI, emphasizing privacy controls, monitoring, and iterative evaluation (A/B testing). Also partnered closely with bank compliance officers using prototypes to refine NLP summarization/classification and reduce document review time.”

A/B Testing, Agile, Amazon EC2, Amazon Redshift, Amazon S3, Apache Airflow, +164 more

kesav boob

Screened

Mid-Level Full-Stack Java Engineer specializing in microservices and cloud

San Francisco, California | 5y exp

Dell Technologies | Cal State LA

“Full-stack developer who built an end-to-end Hotel Management System using React and Spring Boot with MongoDB and AWS. Has hands-on experience debugging API/data-fetching issues with Postman and validating results against the database, plus exposure to handling large data workloads with chunking and monitoring via Grafana/Tabula.”

Java, SQL, C, C++, C#, Python, +129 more

Niveditha A

Screened

Mid-level AI/ML Engineer specializing in healthcare ML and LLM/RAG systems

USA | 4y exp

UnitedHealth Group | Bowling Green State University

“AI/LLM engineer with recent production experience at UnitedHealth Group building an end-to-end RAG system over structured EMR data and unstructured clinical notes, including evidence retrieval, GPT/LLaMA-based reasoning, and a validation layer for reliability. Strong in orchestration (Kubeflow/Airflow/MLflow), prompt engineering for noisy healthcare text, and rigorous evaluation/monitoring with gold-standard benchmarking, plus close collaboration with clinical operations stakeholders.”

Python, NumPy, Pandas, JSON, SQL, PostgreSQL, +152 more

Hrishikesh Raghunath

Screened

Mid-level Data Engineer specializing in scalable ETL, streaming analytics, and cloud data platforms

Remote, USA | 7y exp

Dreamline AI | California State University, Fullerton

“At Dreamline AI, built and productionized an AWS-based incentive intelligence platform that uses Llama-2/GPT-4 to extract eligibility rules from unstructured state policy documents into structured JSON, then processes them with Glue/PySpark and serves results via Lambda/SageMaker/API Gateway. Designed state-specific ingestion connectors plus schema validation and automated checks/alerts to handle frequent policy/format changes without breaking the pipeline, and partnered with business/analytics stakeholders to deliver interpretable eligibility decisions via explanations and dashboards.”

A/B Testing, Amazon CloudWatch, Amazon Kinesis, Amazon Redshift, Amazon S3, Amazon SageMaker, +114 more

Rakesh Kolagani

Screened

Mid-level AI/ML Engineer specializing in MLOps and LLM-powered applications

Mountain View, CA | 5y exp

Intuit | University of Central Missouri

“AI/ML engineer with production experience building a RAG-based internal analytics assistant (Databricks + ADF ingestion, Pinecone vector store, LangChain orchestration) deployed via Docker on AWS SageMaker with CI/CD and MLflow. Strong focus on real-world constraints—latency/cost optimization (LoRA ~60% compute reduction), hallucination control with citation grounding, and enterprise security/governance. Previously at Intuit, delivered an interpretable churn prediction system (PySpark/Databricks, Airflow/Azure ML) that improved retention targeting ~12%.”

A/B Testing, Amazon S3, Apache Airflow, AWS Glue, AWS Lambda, AWS Step Functions, +126 more

Pooja Murigappa

Screened

Mid-level AI/ML Engineer specializing in NLP, Generative AI, and MLOps in Financial Services

Austin, TX | 5y exp

Charles Schwab | University of Central Missouri

“ML/LLM engineer at Charles Schwab who built a production loan-advisor chatbot integrated with internal knowledge and loan-calculator APIs, adding strict numeric validation to prevent rate hallucinations and optimizing context to control costs. Also runs ~40 Airflow DAGs orchestrating retraining/ETL/drift monitoring with an automated Snowflake→SageMaker→auto-deploy pipeline, and uses rigorous testing plus canary rollouts tied to business metrics and compliance constraints.”

Amazon DynamoDB, Apache Airflow, Apache Kafka, Apache Spark, AWS, AWS Glue, +183 more

Supriya Mattapelly

Screened

Mid-level AI/ML Engineer specializing in GenAI agents, RAG pipelines, and MLOps

USA | 6y exp

UnitedHealthcare | Kent State University

“AI/ML engineer who built a production RAG-based internal document intelligence assistant (LangChain + Pinecone) to let employees query enterprise reports in natural language. Demonstrated hands-on pipeline orchestration with Apache Airflow and tackled real production issues like retrieval grounding and latency using tuning, caching, and token optimization, while partnering closely with non-technical business stakeholders through iterative demos.”

A/B Testing, Amazon CloudWatch, Amazon EC2, Amazon Redshift, Amazon S3, Apache Airflow, +152 more

Ruijing Wang

Screened

Intern Data Scientist specializing in healthcare AI and experimentation

Boulder, CO | 1y exp

EchoPlus AI | Stevens Institute of Technology

“Human-AI Design Lab practitioner who productionized a wearable-health anomaly detection system by evolving a standalone autoencoder into a hybrid autoencoder + GPT-based approach, backed by PySpark ETL and MLOps on AWS SageMaker/MLflow. Also has applied LLM troubleshooting experience (fine-tuned FLAN-T5 summarization) and partnered with BI teams to run A/B tests and improve retention via feature stores and experimentation.”

Python, Pandas, Scikit-Learn, PyTorch, TensorFlow, SQL, +97 more

Vinay Nadella

Screened

Mid-level Java Full-Stack Developer specializing in microservices and cloud-native web apps

Wichita, Kansas | 5y exp

Koch Industries | University of Central Missouri

“Full-stack engineer who has shipped and owned production analytics dashboards using Next.js App Router + TypeScript, combining server components for data-heavy pages with client components for interactive charts/filters. Also built a Temporal-orchestrated payment reconciliation workflow with versioning, idempotency, and exponential-backoff retries, and has hands-on Postgres query/index optimization using EXPLAIN ANALYZE.”

Agile, Ansible, Angular, AngularJS, Apache Kafka, Apache Maven, +122 more

Ankit Patra

Screened

Mid-Level Software Engineer specializing in cloud, microservices, and AI/ML

New York, NY | 6y exp

Binghamton University | Binghamton University

“Backend/API engineer with ~4 years experience building production services in .NET Core/PostgreSQL/Redis/Docker and optimizing real-world latency issues (claims ~60% response-time improvement). Also built and owned an end-to-end RAG-based AI assistant using Python/FastAPI, OpenAI APIs, and Pinecone, plus agentic workflows with reliability guardrails (retries, confidence thresholds, monitoring). Currently pursuing a master’s degree and targeting a $150k base salary.”

Agile, Ansible, Apache Kafka, Apache Spark, AWS, AWS Lambda, +120 more

HarshaSree gudapati

Screened

Senior Data Engineer specializing in cloud-native data platforms for finance and healthcare

Charlotte, NC | 4y exp

Bank of America | University of Cincinnati

“Data engineer/backend data services practitioner with Bank of America experience building real-time and batch transaction-monitoring pipelines and APIs (Kafka + databases, REST/GraphQL). Highlights include a reported 45% response-time improvement through performance optimizations and use of Delta Lake schema evolution plus CI/CD (GitHub Actions/Jenkins) and operational reliability patterns like CloudWatch monitoring and dead-letter queues.”

Azure Data Factory, AWS, Amazon S3, AWS Glue, Amazon Redshift, AWS Lambda, +125 more

Madhav Vaddepalli

Screened

Senior Data Engineer specializing in cloud data platforms and big data pipelines

Seattle, WA | 8y exp

Safeco | Fitchburg State University

“Data engineer focused on building reliable, production-grade pipelines and external data collection systems on AWS (S3/Lambda/SQS/Glue/EMR) using PySpark/SQL, serving curated datasets to Snowflake/Redshift for finance and fraud teams. Has operated a large-scale crawler ingesting millions of records/day with anti-bot tactics, schema versioning/quarantine, and CloudWatch/Datadog monitoring, and also shipped a versioned REST API with caching and query optimization.”

Agile, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon Redshift, Amazon RDS, +192 more

Aayush Anand

Screened

Intern Full-Stack/Software Engineer specializing in web apps, cloud, and data/ML systems

New York, NY | 1y exp

The NorthStar Group | NYU

“Built and productionized LLM-driven content intelligence/SEO agents for a high-traffic media platform, automating tagging/summarization/metadata with FastAPI + async orchestration and strict JSON-schema outputs. Demonstrated measurable impact (40% faster publishing, +20% organic traffic in 3 months) and strong reliability practices (offline evals, shadow mode, canaries, fallbacks, idempotency, and monitoring).”

Agile, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, AWS, +112 more

Sushanth Reddy

Screened

Mid-level Data Engineer specializing in cloud ETL/ELT and big data pipelines

Columbus, OH | 4y exp

Western Alliance Bank | University of Missouri-Kansas City

“Data engineer focused on production-grade pipelines and data services: ingests millions of records/day into S3, performs SQL/Python quality validation and PySpark/SQL transformations, and serves curated datasets via Athena/Redshift. Has experience hardening external data collection with retries/rate-limit handling and shipping versioned internal data APIs with backward compatibility, monitoring, and CI/CD in early-stage environments.”

Python, SQL, R, Node.js, ETL, Data pipelines, +57 more

Nivedita Shainaj Nair

Screened

Mid-level ML Data Engineer specializing in MLOps and scalable healthcare data pipelines

Boston, MA | 5y exp

Cigna | Northeastern University

“Data/ML platform engineer with healthcare (Cigna) experience owning an end-to-end pipeline spanning Airflow + Debezium CDC ingestion, PySpark/SQL transformations, rigorous data quality gates, and feature-store/API serving for ML training and inference. Worked at 10+ TB scale and cites a ~30% latency reduction plus stronger reliability via idempotent design, monitoring, and backfill-safe reprocessing; also built pragmatic early-stage data pipelines at Frankenbuild Ventures.”

Agile, Alerting, Anomaly Detection, Apache Airflow, Apache Kafka, Apache Spark, +135 more

Molli Dinesh

Screened

Mid-level AI/ML Engineer specializing in NLP, LLMs, and MLOps

Remote, USA | 4y exp

Marsh McLennan | Illinois Institute of Technology

“Built an AI-driven insurance policy summarization platform at Marsh, taking it end-to-end from messy PDF ingestion/OCR and custom extraction through LLM fine-tuning and AWS SageMaker deployment. Delivered measurable impact (25% reduction in manual review time, 99% uptime) and demonstrated strong production MLOps/LLMOps practices with Airflow/Step Functions orchestration, rigorous evaluation (ROUGE + human review), and continuous monitoring for drift, latency, and hallucinations.”

Python, Pandas, NumPy, Scikit-learn, R, SQL, +132 more

SUMIT MAMTANI

Screened

Mid-level Data Scientist specializing in ML, MLOps, and customer analytics

Tempe, AZ | 4y exp

Qlik | Arizona State University

“ML/NLP practitioner focused on insurance/claims analytics for a large financial firm, working with millions of fragmented structured and unstructured records. Built production-grade pipelines for entity extraction, entity resolution, and semantic search using Sentence-BERT + vector DB, including fine-tuning with contrastive learning (reported ~15% recall lift) and scalable ETL/containerized deployment on Kubernetes.”

Python, Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch, +117 more

Pravalika Kasojjala

Screened

Mid-level AI/ML Engineer specializing in LLM, RAG/GraphRAG, and fraud analytics

Charlotte, NC | 5y exp

Bank of America | University of Wisconsin–Milwaukee

“LLM/agent engineer who has deployed a production internal assistant to reduce employee inquiry resolution time while maintaining regulatory compliance. Experienced with RAG, hallucination risk triage, and graph-based orchestration (LangGraph) for enterprise/banking-style workflows, emphasizing schema-validated, citation-backed, tool-constrained agent designs and tight collaboration with non-technical business/compliance stakeholders.”

A/B Testing, Agile, Amazon Bedrock, Amazon CloudWatch, Amazon EC2, Amazon ECS, +190 more

Saniya Shinde

Screened

Mid-level Data Scientist specializing in NLP, LLMs, and RAG systems

Washington, DC | 4y exp

World Bank | George Washington University

“Built and deployed a production-style vision-language pipeline that generates structured medical reports from chest X-rays using BioViLT embeddings, an image-text alignment module, and BiGPT fine-tuned with LoRA, delivered via Streamlit and hosted on AWS EC2. Also has collaboration experience presenting EDA findings, feature importance, and model performance to Ford managers while working with vehicle parts data at Bimcon.”

Python, SQL, R, C++, PyTorch, TensorFlow, +93 more

Arthi R

Screened

Mid-level Full-Stack Software Engineer specializing in FinTech and cloud-native microservices

Remote – Washington, D.C. | 5y exp

Fannie Mae | Wright State University

“Backend engineer with fintech/banking experience (e.g., Canara Bank) building secure Python/Flask microservices for financial reporting and unified data access. Strong in Postgres/SQLAlchemy performance optimization (including materialized views) and in productionizing ML services on AWS (Lambda/ECS/CloudWatch) with Docker, model registries, and blue-green deployments, plus multi-tenant isolation via JWT-based middleware.”

Python, JavaScript, TypeScript, C, C++, Go, +129 more

Shweta Gupta

Screened

Senior Backend Software Engineer specializing in Java microservices, Kafka, and AWS

Seattle, WA | 6y exp

EasyBee AI | UC Irvine

“AI engineer who shipped a production chat assistant for a storage company by building the underlying RAG-style knowledge base (document ingestion, chunking/embeddings, FAISS vector store) and an admin update interface to keep content current. Also has full-stack delivery experience (Python REST APIs + React/TypeScript UI) and AWS operations using Terraform/Jenkins, including handling a real production performance incident by optimizing DB queries and adding auto-scaling.”

A/B Testing, Agile, API Testing, AWS, Bash, Batch Processing, +111 more

Kiran M

Screened

Mid-Level Full-Stack Software Engineer specializing in cloud-native microservices and data platforms

Bentonville, AR | 5y exp

Walmart | Northern Arizona University

“Backend/ML integration engineer with experience at Accenture and Walmart building Flask-based analytics and prediction APIs on PostgreSQL/MySQL. Strong focus on performance and scalability—uses precomputed aggregates, Redis caching, query tuning (indexes/partitioning/EXPLAIN), and async/background processing; also designs secure multi-tenant isolation with JWT and schema/db-per-tenant strategies.”

API Gateway, AWS, AWS Glue, AWS Lambda, Bitbucket, BigQuery, +145 more
