Vetted Apache Spark Professionals

Pre-screened and vetted.

Apache Spark, Python, Docker, SQL, AWS, CI/CD

Harideep Balusa

Screened

Mid-level AI/ML Engineer specializing in FinTech risk, fraud detection, and GenAI/RAG systems

USA · 6y exp

Freddie Mac · University of Wisconsin

“Built and productionized Azure-based LLM/RAG systems for regulatory/compliance use cases, including automating analyst research and compliance report generation across large unstructured document sets. Demonstrates strong practical depth in hallucination mitigation, hybrid retrieval tuning (BM25 + embeddings), and production MLOps (Databricks, Cognitive Search, AKS, Airflow/MLflow), plus proven ability to deliver auditable, explainable solutions with non-technical compliance teams.”

Python, R, SQL, Scala, Machine Learning, Deep Learning, +125

Ojasmitha Pedirappagari

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and agentic platforms

Jersey City, NJ · 5y exp

Nurture Holdings · UC Santa Cruz

“Built and shipped a production RAG-based assistant that lets parents ask natural-language questions about their child’s learning progress, using pgvector retrieval (child-id filtered) and Redis caching to hit ~180ms latency. Implemented real-world guardrails and compliance (Llama Guard, COPPA, retrieval thresholds, fallbacks) with 99.5% uptime, and ran human-in-the-loop eval loops that improved satisfaction from 3.8 to 4.2 while serving 60k+ monthly users and reducing costs significantly.”

Python, SQL, C#, TypeScript, JavaScript, AWS, +83

Srilekha Jakkula

Screened

Senior Data Engineer specializing in scalable data pipelines and API-driven data services

Chicago, IL · 5y exp

Northern Trust · Northern Illinois University

“Data engineer focused on building scalable, reliable end-to-end data pipelines and backend REST data services, spanning API ingestion plus batch/stream processing with Airflow, Kafka, Spark/PySpark, and SQL. Emphasizes strong data quality validation, monitoring/fault tolerance, and performance tuning for large datasets, with experience deploying in cloud environments using containerization and CI/CD.”

Python, SQL, REST APIs, API Integration, JSON, XML, +51

Srijitha Katkuri

Screened

Mid-level Data Analyst specializing in healthcare and business intelligence

Michigan, USA · 4y exp

Banner Health · Trine University

“Healthcare analytics candidate with hands-on experience turning messy EHR, billing, and operational data into validated SQL datasets and automated Python/Airflow pipelines. They appear strongest in hospital KPI reporting—especially length of stay, readmissions, retention, and bed utilization—and have owned projects from metric definition through Power BI delivery and impact measurement.”

SQL, Python, Pandas, NumPy, Power BI, Tableau, +70

Alekya Battu

Screened

Mid-level Data Scientist specializing in machine learning, MLOps, and cloud analytics

USA · 5y exp

Wells Fargo · Wilmington University

“Senior data scientist with ~5 years’ experience building production ML/NLP systems in finance (Wells Fargo) and deep learning for sensor analytics in connected vehicles (Medtronic). Has delivered end-to-end platforms combining time-series forecasting with transformer-based NLP, including automated drift monitoring/retraining (MLflow + Airflow) and standardized Docker/CI/CD deployments; achieved a reported 22% precision improvement after domain fine-tuning.”

Python, SQL, R, Classification, XGBoost, Random Forest, +171

AadarshKumar Vanga

Screened

Mid-level Full-Stack .NET Developer specializing in healthcare and financial platforms

Bethesda, MD · 5y exp

Accompany Health · Trine University

“Backend/ML systems engineer who built a Flask + PostgreSQL internal ticketing platform and demonstrates strong database/ORM performance depth (indexes, partitioning, RLS multi-tenancy). Notably optimized a high-throughput attachment OCR/embedding pipeline with batching, deduplication, and Redis caching, cutting median latency from 45s to 10s and reducing worker cost by 35% while increasing throughput 4x.”

C#, SQL, PL/SQL, TypeScript, PowerShell, JavaScript, +215

Hard Parikh

Screened

Mid-level Software Engineer specializing in data platforms, distributed systems, and applied AI

Austin, TX · 3y exp

Compass Group · UC Riverside

“AI/full-stack product engineer currently owning Fleck Intelligent Survey Chatbot at E15, a production RAG analytics assistant embedded in Compass Group dashboards for 300+ field operators. Stands out for combining LLM orchestration, analytics engineering, and strong systems thinking—cutting hallucinated numeric answers from 14% to 2%, reducing backlog 62%, and previously delivering a low-level protocol redesign at Amadeus that cut P99 latency by 56%.”

Python, SQL, C++, Java, TypeScript, JavaScript, +113

BHEEMA SABILLA

Screened

Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems

Remote, USA · 3y exp

Discover · University of South Dakota

“Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.”

Python, PySpark, SQL, Scala, Pandas, NumPy, +157

Thrinesh Thode

Screened

Mid-level AI/ML Engineer specializing in MLOps and LLM applications

New York, NY · 4y exp

BNY Mellon · University at Albany

“BNY Mellon engineer who has built and operated production AI systems end-to-end: a LangChain/Pinecone RAG platform scaled via FastAPI + Kubernetes to 1000 RPM with 99.9% uptime, supported by monitoring and data-drift detection. Also deep in data/infra orchestration (Airflow, Dagster, Terraform on AWS/EMR/EC2), processing 500GB+ daily and delivering measurable reliability and performance gains, plus strong compliance-facing model explainability using SHAP and Tableau.”

A/B Testing, Agentic AI, Apache Kafka, Apache Spark, AWS, AWS Lambda, +86

Lingyi Wu

Screened

Mid-level Financial/Data Analyst specializing in analytics, forecasting, and healthcare/MarTech data

Los Angeles, CA · 4y exp

MINISO · Westcliff University

“Growth/creative marketer from Esleydunn Games who uses Google Analytics to integrate cross-channel performance data (TikTok, YouTube, LinkedIn, Facebook) and run structured A/B tests on video ad length and layout. Reported reducing CPA by 20 per customer when leveraging YouTube and TikTok, and improved CTR through CTA/button placement testing and ongoing user-feedback loops (forum/WeChat topics).”

Python, SQL, R, Machine Learning, Deep Learning, Feature Engineering, +104

Koushik Gunjala

Screened

Senior AI Engineer specializing in Agentic AI and distributed systems

Charlotte, NC · 4y exp

UnitedHealth Group · University of North Carolina at Charlotte

“LLM/agentic workflow engineer with healthcare domain experience who built a HIPAA-compliant multi-agent RAG system for clinical review automation at UnitedHealth Group, achieving 92% precision and cutting latency 40% through async orchestration and Redis semantic caching. Also has strong data engineering orchestration background (Airflow on AWS EMR with Great Expectations) and a proven clinician-in-the-loop feedback process that improved model faithfulness by 18%.”

Agentic AI, Distributed Systems, Retrieval-Augmented Generation (RAG), GPT-4, LangChain, LangGraph, +95

Akshata Vijay Kulkarni

Screened

AI & Full-Stack Software Engineer specializing in LLM-powered applications

Atlanta, GA · 4y exp

PRGX · Arizona State University

“Full-stack engineer focused on productionizing LLM applications, including an Android privacy-policy risk summarization app (Kotlin/React Native + FastAPI + Ollama) that cut response times from ~10s to ~5–6s via batching, caching, async, and event-driven architecture. Currently at PRGX building an LLM-based legal contract clause extraction system, partnering closely with legal/procurement SMEs to create schemas, labeled datasets, and evaluation pipelines that improved accuracy from 70% to 85%. Also has experience architecting real-time voice/LLM systems with streaming microservices (Kafka, Kubernetes, gRPC/WebSockets) and an avatar chatbot pipeline (TalkingHead, Google TTS, AnythingLLM).”

Python, JavaScript, TypeScript, Java, SQL, C++, +95

Hema Edavalapati

Screened

Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI

Florida, USA · 6y exp

LexisNexis · University of South Florida

“AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.”

SQL, SQL query optimization, Python, Pandas, NumPy, PySpark, +159

Vardhan Addakattu

Screened

Mid-level Data Scientist specializing in Generative AI and NLP for financial risk

Glassboro, NJ · 4y exp

S&P Global · Rowan University

“Built and shipped production generative AI/RAG assistants in regulated financial contexts (S&P Global), automating compliance-oriented Q&A over earnings reports/filings with grounded answers and citations. Experienced across the full stack—AWS-based ingestion (PySpark/Glue), vector retrieval + LangChain agents, GPT-4/Claude model selection, and production reliability (monitoring, caching, retries) plus rigorous evaluation and regression testing.”

Python, R, SQL, PySpark, Pandas, Apache Spark, +111

Nandini Kosgi

Screened

Mid-level AI/ML Engineer specializing in NLP, RAG systems, and real-time risk modeling

PA, USA · 4y exp

Capital One · Robert Morris University

“AI/ML Engineer with 4+ years of experience (Capital One, Odin Technologies) and a master’s in Data Analytics (4.0 GPA) who has deployed LLM/RAG systems to production for compliance/risk and document review. Strong in orchestration and MLOps (Airflow, Kubernetes, MLflow, GitHub Actions) and in tackling real-world LLM constraints like latency, context limits, and data privacy, with measurable impact (20%+ manual review reduction; 33% faster release cycles).”

Agentic AI, Anomaly Detection, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, +115

Sridharan Kairmaknoda

Screened

Mid-level Data Engineer specializing in cloud data platforms and real-time analytics

Saint Louis, MO · 5y exp

Cigna · Saint Louis University

“Customer-facing data engineering professional who builds and deploys real-time reporting/dashboard solutions, gathering reporting and compliance requirements through direct stakeholder engagement. Experienced with Google Cloud IAM governance, secure integrations (encryption, audit logging), and fast production troubleshooting of ETL/pipeline failures with follow-on monitoring and automated recovery improvements; motivated by hands-on, travel-oriented customer work.”

SDLC, Agile, Waterfall, Python, SQL, Jupyter Notebook, +137

Daniel Berhane Araya

Screened

Senior AI/ML Engineer specializing in production-grade LLM systems for regulated finance

Fairfax, VA · 9y exp

George Mason University · George Mason University

“AI/LLM engineer with published work who built FinVet, a production financial misinformation detection system using multi-pipeline RAG, confidence-based voting, and evidence-backed outputs (F1 0.85, +37% vs baseline). Also built NexusForest-MCP, a Dockerized Model Context Protocol server exposing structured global deforestation/carbon data via SQL tools for reliable LLM tool use. Previously delivered borrower risk-rating (PD) models at BMO Financial Group that were validated and integrated into an enterprise credit system through close collaboration with credit officers and portfolio managers.”

Python, NumPy, Pandas, SQL, PostgreSQL, SQLite, +112

Maheen Adeeb

Screened

Senior Machine Learning Engineer specializing in LLMs, speech AI, and RAG systems

Chicago, IL · 3y exp

Vosyn · DePaul University

“AI engineer with production experience building multilingual speech-to-speech translation pipelines (ASR + LLM) for enterprise/media, focused on reliability at scale. Has hands-on orchestration experience (including IBM Watson contexts) and emphasizes production evaluation/monitoring using a mix of traditional metrics and LLM-based evaluators to catch quality regressions while balancing latency and cost.”

Python, SQL, JavaScript, TypeScript, C++, PyTorch, +116

Hritvik Gupta

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and healthcare AI

San Francisco, CA · 3y exp

Penn Medicine · UC Riverside

“Built and scaled an AI-powered voice/chat patient engagement platform at Penn Medicine from early prototype into production clinical workflows, focusing on latency, edge cases, and user trust. Strong in LLM reliability engineering (structured prompts, validation/fallbacks), real-time troubleshooting with observability, and cross-functional enablement through pilots, demos, and sales/customer partnership.”

AWS, AWS Lambda, C++, CI/CD, Communication, Data Engineering, +78

Mohithkumar Bolisetti

Screened

Mid-level Java Full-Stack Developer specializing in microservices and cloud platforms

Strongsville, Ohio · 5y exp

PNC · University of Dayton

“Full-stack engineer focused on modernizing legacy financial/compliance platforms into cloud-native, domain-driven microservices. Deep hands-on experience across Spring Boot/Kafka/Redis/Postgres-Mongo backends and React/Angular frontends, with strong CI/CD and Kubernetes/OpenShift deployment practices for real-time, high-volume workloads.”

Java, Python, JavaScript, TypeScript, SQL, PL/SQL, +169

Bala Venkateswarlu K

Screened

Mid-level Data Scientist specializing in Generative AI, NLP, and MLOps

USA · 5y exp

MetLife · Harrisburg University of Science and Technology

“Built and deployed an LLM-powered claims-document summarization system (insurance domain) that cut agent review time from 4–5 minutes to under 2 minutes and saved 1,200+ hours per quarter. Hands-on across orchestration and production infrastructure (Airflow retraining DAGs, Kubernetes, SageMaker endpoints, FastAPI) and recent RAG workflows using n8n + Pinecone, with a strong focus on reliability, cost, and explainability for non-technical stakeholders.”

A/B Testing, Agile, Apache Kafka, Apache Spark, Auto Scaling, AWS, +148

Sathvika Meka

Screened

Mid-level Data Analyst specializing in BI, analytics, and healthcare data

Remote, USA · 4y exp

CVS Health · University of South Florida

“Analytics professional at Optum with hands-on experience turning messy healthcare claims data from SQL, Excel, and CRM systems into validated reporting datasets and Power BI dashboards. They also built reproducible Python workflows for claims analysis and owned an end-to-end project focused on improving claims processing efficiency through metric design, segmentation, and stakeholder-driven operational improvements.”

Data Analysis, Data Cleaning, Data Transformation, Statistical Analysis, Data Validation, Data Governance, +74

Chethan Thimapuram

Screened

Mid-level AI Engineer specializing in LLMs, MLOps, and healthcare NLP

4y exp

HCA Healthcare · University of South Florida

“Built a production, real-time clinical documentation system at HCA that converts doctor–patient conversations into structured clinical summaries using speech-to-text, LLM summarization, and RAG. Demonstrated measurable gains from medical-domain fine-tuning (clinical concept recall +18%, ROUGE-L 0.62 to 0.74) while meeting HIPAA constraints via PHI anonymization and encryption, and deployed via Docker/FastAPI with CI/CD and monitoring.”

Python, PyTorch, Machine Learning, Generative AI, Large Language Models, OpenAI, +182

John Chance

Screened

Senior Machine Learning Engineer specializing in conversational AI and healthcare ML

Greenwood, LA · 9y exp

Elevance Health · Medaille University

“ML/AI engineer with hands-on ownership of both classical recommender systems and safety-sensitive LLM agent platforms. They combine production MLOps depth with behavioral health domain experience, including clinical safety validation, explainability, and multi-agent orchestration, and cite measurable impact in both business metrics and latency reduction.”

Python, SQL, Scala, PyTorch, TensorFlow, Scikit-learn, +93
