Vetted PySpark Professionals

Pre-screened and vetted.

Pavan Punna

Screened

Mid-level AI/ML Engineer specializing in LLMs, MLOps, and healthcare-fintech AI

Dallas, TX | 5y exp
Federal Soft Systems | Concordia University

Built and owned a production GPT-4 RAG assistant for clinical and enterprise query resolution, taking it from initial experiment to deployment, monitoring, and iterative improvement. Their work cut resolution time from 45 minutes to under 2 minutes, achieved roughly 95% accuracy, and scaled to thousands of additional monthly queries while emphasizing safety and trust in a sensitive clinical domain.

PS

Mid-level AI/ML Engineer specializing in NLP, MLOps, and FinTech

Remote, USA | 4y exp
Accenture | University of Houston

ML/AI engineer with production experience at S&P Global and Accenture, focused on regulated, enterprise-grade systems. Built end-to-end financial risk and credit default models with >90% precision and 12% fewer false positives, and is currently developing secure RAG pipelines on AWS SageMaker for enterprise insight extraction.

DF

Staff Machine Learning Engineer specializing in NLP, LLMs, and document intelligence

Austin, TX | 9y exp
PNC | University of Cincinnati

ML/AI engineer at PNC who has shipped enterprise-grade RAG and document intelligence systems for compliance and policy workflows. Stands out for combining LLM product thinking with production rigor—owning FastAPI/Kubernetes deployments, monitoring, evaluation, and human-feedback loops that drove measurable gains like 40% faster policy search and 30% faster compliance review.

RT

Mid-level Machine Learning Engineer specializing in NLP, computer vision, and LLMs

New York City, NY | 3y exp
Wayfair | Stevens Institute of Technology

Wayfair ML/AI engineer who has shipped and operated production LLM systems for both internal analytics and customer-facing assistants. Stands out for combining strong RAG/retrieval engineering with production-grade platform work—improving trust, reducing latency by ~30%, and cutting ad hoc reporting demand by ~50%.

Siva Harini Sri Janaki Raman

Mid-level Data Engineer specializing in cloud data platforms

Dallas, TX | 3y exp
CVS Health | Texas Tech University

Built an AI-powered internal support assistant at CVS Health using GPT-4, LangChain, and Pinecone, applying RAG, validation, and monitoring to reduce repetitive support tickets while protecting sensitive healthcare data. Stands out for a pragmatic approach to AI engineering: using multi-agent and LLM workflows to accelerate development while keeping systems constrained, observable, and production-friendly.

CP

Director of Software Engineering specializing in AI, data platforms, and cloud architecture

Washington, DC | 29y exp
ZipRecruiter | American University

Veteran software engineering leader who started as an early internet engineer in the mid-1990s and has since grown into Director/VP-level leadership across legacy web platforms, logistics systems, and modern data engineering. Particularly compelling for companies needing a hands-on leader who can modernize complex Perl/UNIX monoliths, manage large cross-functional teams, and deliver operational systems in warehouse, marketplace, and reverse-logistics environments.

KS

Entry Data Scientist specializing in ML, NLP, and GenAI

Hyderabad, India | 1y exp
Kofluence | Rowan University

AI/full-stack engineer who has built a production-style LLM knowledge assistant from scratch, combining FastAPI, LangChain, FAISS, semantic retrieval, and a user-facing chat interface. Stands out for owning both the technical architecture and the product usability layer, including latency optimization, prompt refinement, and source-backed responses to improve trust for non-technical users.

Evan Teague

Screened

Senior Software Engineer specializing in backend and data platforms

Bethesda, MD | 10y exp
Spatial Data Logic | University of Virginia

Series A startup engineer with broad full-stack ownership across backend, data, and frontend, including a real-time ingestion platform that scaled to 10x higher daily volume without downtime while cutting latency from minutes to seconds. Brings strong fintech and B2B SaaS experience building auditable, high-throughput systems for analysts, operations, and compliance teams in regulated environments.

Rohith Kollu

Screened

Senior Software Engineer specializing in backend microservices, cloud, and full-stack systems

Dallas, TX | 7y exp
Cisco | Indiana Wesleyan University

Backend/platform engineer who has built and scaled production Java/Spring Boot + Kafka services on AWS/Kubernetes (1M+ msgs/day) and led reliability/performance fixes that restored SLAs (25–30% latency improvement; 99.9% uptime). Also shipped an AI customer-support chatbot end-to-end using retrieval + guardrails and rigorous evaluation/observability, improving resolution time 40% and satisfaction 25%, with a strong plan/execute/verify approach to agentic workflow reliability.

Priya Shah

Screened

Mid-level DevOps Engineer specializing in AWS cloud infrastructure and CI/CD automation

OH | 6y exp
ServiceNow | Sardar Patel University

Backend/data engineer with production experience building a SaaS analytics platform: FastAPI-based microservices with Redis caching and reliability patterns (RBAC, retries/backoff, centralized error handling). Also delivered AWS data pipelines (Glue/PySpark to Redshift) and owned real production incidents using CloudWatch/SNS, plus hands-on PostgreSQL query tuning on multi-million-row reporting workloads.

PK

Senior GenAI/ML Engineer specializing in LLMs, RAG, and multimodal generative AI

USA | 4y exp
GE HealthCare | Franklin University

LLM/RAG engineer with production deployments in highly regulated domains (Frost Bank and GE HealthCare). Built secure, explainable document-grounded Q&A systems using LoRA fine-tuning, strict RAG with confidence thresholds, and citation-based responses; also established evaluation/monitoring (golden QA sets, hallucination tracking, drift) and achieved ~40% latency reduction through retrieval/prompt tuning.

Aditya Sairam

Screened

Mid-Level Software Engineer specializing in cloud data platforms and AI search

Troy, MI | 6y exp
Robotics Technologies LLC | Cleveland State University

Open-source JavaScript contributor focused on data visualization, extending Chart.js/React with custom plugins for real-time streaming dashboards. Designed an end-to-end telemetry pipeline using Apache Kafka and Azure Cosmos DB, optimizing partitioning, batching, caching, and client throttling to keep latency low and support thousands of concurrent users. Demonstrates strong ownership in fast-changing environments, including building full-stack AI applications and ingestion/ETL pipelines at Robotics Technologies LLC.

PV

Mid-level Machine Learning Engineer specializing in LLM agents, RAG, and MLOps

New York City, NY | 6y exp
Avanade | University of North Texas

Built a production AI-driven contract/document extraction system combining OCR, normalization, and LLM schema-guided extraction, orchestrated with PySpark and Azure Data Factory and loaded into PostgreSQL for analytics. Emphasizes reliability at scale—using strict JSON schemas, confidence scoring, targeted retries, and multi-layer validation to control hallucinations while processing thousands of PDFs per hour—and partners closely with non-technical business teams to refine fields and deliver usable dashboards.

VM

Mid-level Machine Learning Engineer specializing in LLMs, RAG, and Clinical AI

Chicago, Illinois | 4y exp
Optum | Illinois Institute of Technology

Built and productionized a HIPAA-compliant LLM+RAG Clinical AI assistant at Optum, fine-tuning GPT/LLaMA on de-identified patient notes and integrating FAISS/Pinecone for sub-second retrieval; reported to cut diagnosis time by ~20 minutes per case. Experienced in orchestrating ML pipelines (Airflow, AWS Step Functions, Azure Data Factory) and in reliability techniques for LLM systems (grounding, citations, confidence filters, monitoring) while partnering closely with clinicians and compliance teams.

BB

Mid-level Data Analyst specializing in healthcare and finance analytics

New Jersey, USA | 5y exp
Omada Health | Rowan University

Built an end-to-end Alexa smart-home IoT application controlling a Wi-Fi bulb, including ESP32 firmware (MQTT) and an AWS serverless backend (IoT Core/Device Shadow, Lambda, DynamoDB) with a REST API. Demonstrates strong real-time scalability patterns (streaming ingestion, stateless processing, partition-key design) and full-stack delivery with Spring Boot + React (JWT auth, CORS, data-heavy dashboards).

SP

Mid-level Data Analyst specializing in AI/ML and advanced analytics

USA | 3y exp
Accenture | Murray State University

Accenture data/ML practitioner who deployed a retail churn prediction and BERT-based sentiment analysis system to production, integrating behavioral + feedback data and operationalizing it with ETL automation, orchestration, and CI/CD. Experienced managing 2TB+ multi-source data, monitoring drift in Databricks, and translating results into Power BI dashboards for marketing teams (including K-means customer segmentation).

AP

Mid-level Full-Stack Java Developer specializing in cloud-native microservices

USA | 4y exp
Epic Systems | Webster University

Full-stack Java developer with IBM and Epic Systems experience modernizing legacy enterprise apps into microservices and delivering customer-facing healthcare claims workflows at very high scale (2M+ transactions/day). Strong blend of product engineering (APIs + React/TypeScript UI) and production operations on AWS, including performance incident remediation via query optimization, indexing, and autoscaling.

Rohan Gore

Screened

Intern AI/ML Engineer specializing in agentic systems and full-stack development

New York City, NY | 0y exp
MARV Capital | NYU

Built and scaled a multi-agent LLM automation pipeline during a fintech internship, growing from a rapid 1-week proof-of-concept to a 15+ agent hierarchical system that cut market brief report generation time from ~5 hours to under 30 minutes. Hands-on with agent frameworks (Haystack, CrewAI, LangChain) and experienced in debugging agent communication issues via sandboxed modular testing and context/token management; also regularly gives architecture-first technical demos at multiple hackathons and university events.

NV

Mid-level AI/ML Engineer specializing in Generative AI, RAG, and real-time fraud detection

4y exp
U.S. Bank | University of Massachusetts Dartmouth

GenAI/ML engineer who has shipped production agentic systems in highly regulated and high-throughput environments, including an AWS Bedrock-based fraud/compliance workflow at U.S. Bank with PII redaction and hallucination detection that cut investigation time by 50%+. Also built and evaluated RAG and recommendation systems at Target, using RAGAS-driven testing, hybrid retrieval with re-ranking, and SHAP explainability dashboards to align model behavior with merchandising business KPIs.

KK

Mid-level Data Scientist specializing in MLOps, LLM/RAG applications, and deep learning

United States | 5y exp
Citigroup | University of North Texas

Built and deployed a production compliance automation RAG system (at Citi) that generates citation-backed, schema-validated risk summaries for regulatory document review. Emphasizes regulated-environment reliability with retrieval-only grounding, abstention, confidence thresholds, and immutable audit logging, plus orchestration using LangChain/LangGraph and Airflow. Reported ~60% reduction in compliance review effort while maintaining high precision and traceability.

MR

Mid-level GenAI Engineer specializing in production AI agents and evaluation pipelines

Overland Park, Kansas | 5y exp
Minutentag | Wilmington University

Built and shipped a production LLM-powered internal operations automation platform using LangChain RAG (Pinecone) and FastAPI microservices, deployed on AWS EKS, serving 10k+ daily interactions. Implemented a rigorous evaluation/observability stack (golden datasets, prompt regression tests, MLflow, retrieval metrics, hallucination monitoring) that drove hallucinations below 2% and improved reliability, and partnered closely with non-technical ops leaders to cut manual lookup work by 60%+.

Ram Kottala

Screened

Mid-level Data & GenAI Engineer specializing in lakehouse, streaming, and RAG platforms

Michigan, USA | 5y exp
Ford | Webster University

Built a production internal LLM-powered knowledge assistant using a RAG architecture (Python, LLM APIs, cloud services) that answers employee questions with sourced, grounded responses from internal documents. Demonstrates strong practical depth in retrieval tuning (chunking/metadata filters), orchestration with LangChain, and production reliability practices (latency optimization, automated embedding refresh, evaluation metrics, logging/monitoring) while partnering closely with non-technical operations teams.

Naga Yanala

Screened

Mid-level Data Engineer specializing in cloud data pipelines and analytics platforms

Texas, USA | 5y exp
Molina Healthcare | Southeast Missouri State University

Data engineer with healthcare and enterprise experience (Molina Healthcare, Dell Technologies) building and operating high-volume batch + streaming pipelines across AWS and Azure. Strong focus on data quality (schema validation, fail-fast checks), reliability (monitoring/alerts, retries), and performance tuning (Spark/partitioning), with measurable runtime reduction and improved downstream trust.

SK

Mid-level Data Engineer specializing in cloud data pipelines and financial services warehousing

Chicago, IL | 4y exp
Charles Schwab | DePaul University

Data engineer (Charles Schwab) who took ownership of an unstable, ambiguous nightly financial data pipeline and rebuilt it into a reliable, incremental AWS Glue/Airflow/Redshift system feeding Power BI. Created a custom Python data-quality framework with hard-stop gating and schema drift detection, improving integrity (99.9%), cutting runtime (~20%), and reducing incidents/tickets (35% fewer schema-related dashboard incidents; 30% fewer investigations).
