Vetted Apache Spark Professionals

Pre-screened and vetted.

Apache Spark, Python, Docker, SQL, AWS, CI/CD

Susendranath Musani

Screened

Mid-level AI/ML Engineer specializing in GenAI, NLP, and MLOps

Connecticut, USA | 5y exp

Pfizer | University of New Haven

“Built and deployed an enterprise GenAI knowledge assistant over thousands of internal PDFs/reports using a RAG stack (GPT-4 + Hugging Face embeddings + vector DB) to reduce manual search and SME escalations. Uses LangGraph/LangChain to orchestrate modular agent workflows with relevance filtering and fallback handling, and applies rigorous evaluation (golden datasets, edge cases, A/B tests) with production monitoring metrics.”

A/B Testing, Agile, Apache Kafka, Apache Spark, AWS Lambda, BERT (+103 more)

Srinivas Matta

Screened

Mid-Level Full-Stack Software Developer specializing in cloud-native web platforms

Paducah, KY | 4y exp

Intuit | Southeast Missouri State University

“Software engineer at Capital One who owned and shipped AI-driven personalization and internal insights dashboards end-to-end, emphasizing fast iteration with feature flags and tight user feedback loops. Built a TypeScript/React + Spring Boot/Python document automation platform with compute-heavy NLP microservices, async workflows, and production-scale reliability/performance practices (Kafka/RabbitMQ-style queues, Redis caching, tracing).”

Ajax, Apache Airflow, Apache Kafka, Apache Spark, AWS CloudFormation, AWS Lambda (+130 more)

Aisha Sartaj

Screened

Mid-level AI Engineer specializing in LLM systems, RAG, and MLOps

Remote | 3y exp

ILMAscent | UCLA

“Built an LLM multi-agent ‘ingredient safety’ analyzer for cosmetics that cuts consumer research time from ~20+ minutes to minutes, using LangGraph orchestration, hybrid retrieval (Qdrant + Tavily), and safety-focused critic validation (false rejections reduced ~30%→~8%). Also has research-internship experience building computer-vision pipelines to classify emerald color/clarity by translating gem-expert heuristics into quantitative model features.”

A/B Testing, API Gateway, AWS, AWS Glue, AWS Lambda, CI/CD (+118 more)

Avijit Saha

Screened

Junior Software Engineer specializing in cloud-native microservices and AI/ML observability

Bedford, TX | 3y exp

JPMorgan Chase | University of the Cumberlands

“Engineer with banking and industrial/IoT experience who has deployed a payment-processing microservice with zero downtime, handling Protobuf schema evolution and sensitive data migration via dual-write/checksum techniques. Demonstrates strong cross-stack troubleshooting (pinpointed intermittent distributed timeouts to a failing ToR switch port) and customer-facing Python ETL customization using plugin-based parsers and Pydantic validation, plus hands-on monitoring/alerting improvements with operators.”

Agile, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon EKS, Amazon S3 (+103 more)

Harshavardhan Reddy

Screened

Mid-level AI/ML Data Scientist specializing in NLP, computer vision, and risk analytics

Albany, NY | 5y exp

Capital One | Pace University

“ML/AI engineer with Capital One experience building production-grade customer segmentation and fraud detection systems combining NLP (transformers) and anomaly detection. Strong MLOps and orchestration background (PySpark ETL, MLflow, Airflow, Docker/Kubernetes, Azure ML) with real-time monitoring/alerting and performance optimizations like quantization and caching, plus proven ability to deliver business-facing insights through Power BI/Tableau for marketing stakeholders.”

Python, R, SQL, PySpark, Scala, Java (+105 more)

Bhuvan Chandi

Screened

Mid-level Data Engineer specializing in AI/ML data platforms

NY, NY | 6y exp

BlackRock | Webster University

“Built and productionized an LLM-powered PDF document Q&A system to eliminate manual searching through long documents, focusing on scalability and answer reliability. Implemented semantic chunking (using headings/paragraphs/tables), overlap, and preprocessing/quality checks to reduce hallucinations, and orchestrated the end-to-end pipeline with Airflow using retries, alerts, and parallel tasks.”

Python, SQL, Shell Scripting, Apache Spark, PySpark, Apache Hadoop (+103 more)

Sravani Kasaraneni

Screened

Mid-level Machine Learning Engineer specializing in NLP and cloud MLOps

CT, USA | 4y exp

ServiceNow | Rivier University

“Built and deployed a production LLM-powered internal documentation assistant using embeddings, a vector database, and a RAG pipeline to reduce time spent searching PDFs/manuals. Experienced in orchestrating end-to-end LLM workflows with Airflow/LangChain, improving reliability via monitoring/error handling, and driving measurable quality through retrieval and hallucination-focused evaluation metrics.”

SDLC, Agile, Waterfall, Python, R, Java (+104 more)

Kevin Fang

Screened

Intern Software Engineer specializing in full-stack and data systems

Beverly Hills, CA | 1y exp

Alo Yoga | UC Irvine

“Software developer with healthcare operations experience at Epic Systems (Referrals & Authorizations), delivering customer-facing tooling to speed manual insurance authorization/denial documentation and support future automation. Also supported an HRIS migration to Workday at Alo Yoga, solving legacy ID interoperability via scripting and mapping, and demonstrates strong production debugging and test-driven maintainability practices.”

Apache Hadoop, Apache Kafka, API Development, AWS, C, C# (+79 more)

Min-Han Shih

Screened

Junior Machine Learning Engineer specializing in speech and multimodal AI

Taipei, Taiwan | 2y exp

Furbo | USC

“New grad who has shipped a production vision-language recommendation feature for a pet camera/mobile app, including building a tagged video dataset with human annotators and optimizing inference by FPS downsampling under device compute limits. Also built a multimodal MLLM benchmark using an LLM-as-judge (GPT-5-thinking) with a feedback loop, validated against human scoring, and measured post-feedback quality gains (12% average score improvement).”

Python, C, C++, MySQL, Go, Apache Spark (+61 more)

Rohit Khoja

Screened

Mid-level Full-Stack Engineer specializing in cloud microservices and NLP/LLM systems

Tempe, AZ | 4y exp

Citigroup | Arizona State University

“Full-stack engineer with 3+ years using Java/Spring Boot (Citi) and React, who built a production observability dashboard monitoring 53 microservices across 17 clusters with real-time health/latency tracing and significant performance improvements (cut load time from ~10s). Also designed a serverless AWS face-recognition system (Lambda/S3/SQS) built to handle burst traffic (~1000 concurrent requests), demonstrating strength in scalable, event-driven architectures.”

Agile, Amazon EC2, Amazon S3, Amazon SQS, Apache Kafka, AWS Lambda (+106 more)

Pavan Kalyan Padala

Screened

Mid-level Data Scientist specializing in predictive and generative AI

Daytona Beach, Florida | 4y exp

2725 Hospitality LLC | Yeshiva University

“AI/ML engineer with production LLM experience in regulated financial services (J.P. Morgan Chase), building a customer response engine to automate first-contact resolution while addressing privacy, bias, compliance, and scale. Strong MLOps/orchestration background (Airflow, Docker/Kubernetes, AWS Step Functions, Azure ML/SageMaker) plus proven ability to integrate with legacy systems and drive stakeholder adoption through dashboards, auditability, and training.”

Python, Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch (+98 more)

Shanmukh Sai Madhu

Screened

Mid-level Data Engineer specializing in real-time pipelines and cloud analytics

Chicago, IL | 5y exp

JPMorgan Chase | University of South Dakota

“Researcher from the University of South Dakota who built a production medical RAG system to help interpret model predictions by retrieving relevant clinical notes and medical literature, overcoming retrieval accuracy and imaging-dataset challenges through semantic chunking and metadata-driven indexing. Also has hands-on orchestration experience with Airflow and Azure Data Factory, plus a pragmatic approach to LLM evaluation and stakeholder-driven iteration.”

Agile, Amazon EMR, Apache Airflow, Apache Kafka, Apache Spark, AWS (+122 more)

Akshit Modi

Screened

Mid-level AI/ML Engineer specializing in healthcare NLP and MLOps

Remote, USA | 5y exp

Tempus | Arizona State University

“Healthcare/clinical ML practitioner who built and productionized ClinicalBERT-based pipelines to extract and standardize oncology EHR data, improving downstream model F1 from 0.81 to 0.92 while controlling training cost via LoRA/QLoRA. Experienced orchestrating real-time AWS ETL/ML workflows (Glue, Lambda, SageMaker) and partnering with clinicians using SHAP-based interpretability, contributing to an 18% reduction in readmissions and full adoption.”

Python, SQL, C++, Java, NumPy, Pandas (+166 more)

Ramtin Khorrami

Screened

Principal Software Engineer specializing in AI/ML and cloud-native backend systems

New York, NY | 16y exp

McKinsey & Company | NJIT

“McKinsey data/ML practitioner who led production deployment of an entity resolution + semantic search platform for unstructured finance and healthcare data, integrating with legacy systems under HIPAA constraints. Deep hands-on stack across transformers (spaCy/HF BERT), embeddings + FAISS, and production MLOps/workflow tooling (Airflow, Docker, CI/CD, Prometheus/Grafana), with reported gains of +30% decision speed and +25% search relevance.”

Python, SQL, R, Ruby, Java, JavaScript (+124 more)

Sai Raja Ramya Bhavana Thota

Screened

Senior Data Scientist specializing in machine learning and customer analytics

Illinois, USA | 7y exp

Northern Trust | Bradley University

“Data/ML practitioner with experience applying NLP and classical ML to large-scale customer data (2B+ records) for segmentation, prediction, and survey-text classification, delivering measurable business impact (~18% engagement efficiency). Has hands-on entity resolution across multi-source datasets and has built embedding-based semantic search using SentenceBERT + a vector database with domain fine-tuning (~20% relevance improvement), plus production workflow experience with Spark/Airflow and cloud tooling (AWS/Azure).”

A/B Testing, Analytics, Azure Machine Learning, Bash, BigQuery, C (+195 more)

Guna Jaswanth Maduri

Screened

Mid-level Machine Learning Engineer specializing in MLOps, NLP, and Computer Vision

USA | 5y exp

Walmart | University of New Haven

“ML/AI engineer with production experience across retail and healthcare: built a real-time computer-vision shelf monitoring system at Walmart and optimized edge inference latency by ~30% using TensorRT/ONNX and pruning. Also partnered with CVS Health clinical/pharmacy teams to deliver a medication-adherence predictive model, using Streamlit explainability dashboards and achieving an 18% adherence improvement.”

Python, C++, SQL, Shell Scripting, TensorFlow, PyTorch (+102 more)

Yufan Wei

Screened

Intern AI Engineer specializing in LLM agents, RAG, and applied biostatistics

Beijing, China | 0y exp

Siemens | Emory University

“Siemens AI engineer who shipped production multi-agent LLM systems across cybersecurity and sustainability, including a vulnerability automation agent that cut manual work 70%. Deep in orchestration (LangGraph supervisor-worker state machines), reliability engineering (async fault tolerance, retries, spike handling), and rigorous evaluation (offline benchmarks, LLM-as-a-Judge improving label agreement 28.9%) with measurable production guardrails.”

Python, JavaScript, TypeScript, SQL, R, HTML (+70 more)

Rahul Hatkar

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG pipelines, and MLOps

San Francisco, CA | 6y exp

Scale AI | Webster University

“AI/ML engineer who has shipped production AI systems end-to-end, including an automated multi-channel (Gmail/WhatsApp/voice) candidate interviewing workflow and an enterprise RAG knowledge search platform. Demonstrates strong production rigor (monitoring, A/B tests, guardrails, schema validation, shadow testing) with quantified impact: ~60–70% reduction in interview evaluation time and ~20–30% relevance gains in RAG retrieval.”

A/B Testing, Agile, Anomaly Detection, Ansible, Apache Hadoop, Apache Spark (+167 more)

Aditya Jaiswal

Screened

Intern Software Engineer specializing in cloud, DevOps, and applied AI

Carlsbad, CA | 1y exp

Viasat | USC

“Full-stack engineer with startup ownership experience (Aiir) building 15+ TypeScript/Go microservice APIs on GCP Cloud Run with Kafka-based async event streaming and React CRM integrations for billing/analytics. Strong post-launch operator who tuned Oracle performance (partitioning/indexing/query optimization) and validated a 23% retrieval-time reduction via AWR, and has a quality/DevSecOps mindset (94% Pytest coverage, GitHub Actions, SonarQube, Twistlock, CloudWatch) including migrating 18+ production CI/CD pipelines.”

A/B Testing, Apache Kafka, Apache Spark, Artificial Intelligence, AWS, AWS IAM (+125 more)

Kanaka Chalam Volety

Screened

Staff DevOps/SRE Engineer specializing in AWS, Kubernetes, and GitOps

San Jose, CA | 24y exp

Zoom | Thompson Rivers University

“Infrastructure-focused engineer with Vonage experience modernizing early-stage cloud architecture (Terraform modularization, blue-green deployments, containerization, and zero-downtime database migration planning to Aurora). Also built a local end-to-end side project, Vastu AI, combining a custom-trained YOLO model (Roboflow-labeled data) with a locally hosted LLM via Ollama to generate a vastu compliance report from floor-plan images.”

Agile, Amazon CloudWatch, Amazon EC2, Amazon EKS, Amazon RDS, Amazon S3 (+190 more)

Avinash Pancheneni

Screened

Mid-level Machine Learning Engineer specializing in fraud detection and LLM applications

Charlotte, NC | 5y exp

Bank of America | University of North Carolina at Charlotte

“Unreal Engine UI engineer focused on scalable, production-ready UI architecture (C++/Slate/UMG/CommonUI) with strong designer enablement via decoupled, interface-driven patterns and MVVM. Demonstrated measurable performance wins: replaced 200+ per-frame Blueprint bindings to cut UI prepass/paint from 4.2ms to 0.5ms and reduced VRAM by ~120MB using texture streaming proxies.”

Machine Learning, Artificial Intelligence, Supervised Learning, Unsupervised Learning, Predictive Modeling, Fraud Detection (+119 more)

Devender Kunta

Screened

Senior Data Engineer specializing in Azure Lakehouse, Databricks/Spark, and Snowflake

Richardson, TX | 6y exp

PwC | University of Central Missouri

“Data engineer/platform builder with experience across PwC and Liberty Mutual delivering high-volume, production-grade pipelines and real-time data services. Has owned end-to-end streaming + batch architectures on AWS and Azure, including web scraping systems, with quantified reliability gains (99.9% availability, 90%+ error reduction, 30% latency reduction) and strong observability/CI-CD practices.”

AWS, Databricks, Apache Spark, PySpark, Scala, Python (+109 more)

Deepthi Mundarinti

Screened

Mid-level Generative AI Engineer specializing in decision intelligence and RAG for regulated enterprises

5y exp

JPMorgan Chase | Saint Louis University

“Healthcare GenAI engineer who built a HIPAA-compliant, auditable RAG-based claims decision support system at Molina Healthcare, processing 3M claims and delivering major impact (48% faster manual reviews, 43% higher decision accuracy). Deep hands-on experience with LangChain orchestration, vector search (ChromaDB/FAISS), embedding fine-tuning, and safety controls (confidence scoring, rule validation, human-in-the-loop escalation) for clinical workflows.”

Generative AI, GPT-4, OpenAI API, Prompt Engineering, Retrieval-Augmented Generation (RAG), Machine Learning (+96 more)

Akashreddy Madduri

Screened

Senior Backend Engineer specializing in real-time data platforms for FinTech and Healthcare

Plano, Texas | 6y exp

JPMorgan Chase | Northern Arizona University

“Backend/data engineer with experience at JPMorgan building near real-time payment risk and fraud scoring pipelines using Python, Spark Structured Streaming, and Delta Lake, emphasizing auditability, security, and data correctness (dedupe/late events) to reduce false positives. Also led a legacy-to-cloud migration of claims/eligibility data at Cogna with parallel runs, phased rollout, and healthcare-specific validation (ICD-CPT mapping).”

Python, FastAPI, Flask, SQL, PySpark, Shell Scripting (+102 more)
