Vetted PySpark Professionals

Pre-screened and vetted.

SM

Mid-level Full-Stack Engineer specializing in cloud-native FinTech analytics

McKinney, TX · 5y exp
Martingale Solution Group · University of Texas at Dallas

Full-stack/ML-leaning engineer who has shipped production-grade real-time analytics and an internal AI support assistant using RAG over enterprise documentation. Demonstrates strong systems thinking across scalability, reliability, observability, and LLM safety/evaluation (thresholded retrieval, RBAC, response validation, regression-gated evals), with concrete iteration based on performance metrics and user feedback.

JJ

Mid-level Data Engineer specializing in cloud data platforms and real-time pipelines

Denton, TX · 5y exp
Real Dynamics · University of North Texas

Data engineer who has owned production pipelines end-to-end—from Kafka/Airflow ingestion through SQL/Python validation and dbt transformations into Redshift/BI. Also built and operated a large-scale distributed web scraping platform (50–100 sites daily, ~5–10M records/day) with Kubernetes, Kafka queues, robust retries/DLQ, anti-bot measures, and backfill-safe raw HTML storage.

Vikram Sandigaru

Mid-level AI Engineer specializing in AI agents, RAG pipelines, and LLM evaluation

Boston, US · 3y exp
FounderWay · Northeastern University

Built and shipped production LLM systems at Founderbay, including a low-latency voice agent and a graph-based multi-agent research assistant. Strong focus on reliability in real workflows—hybrid SERP + full-site scraping RAG, grounding guardrails, validation checkpoints, and transcript-driven evaluation—plus performance tuning with async FastAPI, Redis caching, and containerization. Also partnered with a non-technical ops lead to automate post-call follow-ups via call summarization, field extraction, and tool-triggered actions.

Chandankumar Ramamurthy

Junior Full-Stack Engineer specializing in LLM-powered products

Washington, D.C. · 3y exp
Data Science for Sustainable Development (DSSD) · George Washington University

Built multiple systems from scratch at DSSD and Aglint, including an NGO sustainability reporting dashboard and a production LLM-powered phone screening agent using Twilio/Retell AI with RAG grounded in PostgreSQL candidate/job data. Strong focus on real-world reliability: guardrails, monitoring, and lightweight eval/regression loops that reduced recruiter score overrides by ~30%. Currently on OPT through May 2026 (plans STEM OPT extension) and committed to relocating to NYC for in-person work; seeking $90k–$120k base with meaningful equity for founding engineer roles.

MP

Mid-level Data Engineer specializing in FinTech data platforms

California, USA · 4y exp
Alloy · University of Massachusetts Dartmouth

Backend-focused engineer with experience at Ramp, Easebuzz, and George Mason University, spanning data pipelines, workflow automation, and production reliability. Stands out for quantifiable performance gains, strong debugging instincts in distributed job systems, and translating ambiguous finance operations processes into measurable automation outcomes.

Keeravani Chekuri

Mid-level AI/ML Engineer specializing in LLM systems and MLOps

Boston, MA · 3y exp
Nexoraschool.ai · University of Massachusetts

Built and deployed an AI tutoring assistant end-to-end at Nexora School, spanning discovery with school districts, multi-agent LangGraph/RAG architecture, AWS Bedrock migration, and post-launch stabilization. Stands out for combining hands-on LLM systems engineering with strong educator-facing trust building, FERPA-driven architecture decisions, and disciplined production practices around evals, logging, and messy document ingestion.

Vivek Y

Screened

Junior Software Engineer specializing in full-stack development and machine learning

Tallahassee, FL · 1y exp
Florida State University · Florida State University

Built a production Apple-focused LLM Q&A bot that answers user issues using similar past discussion records, including large-scale scraping and cleaning of thousands of forum threads. Used BeautifulSoup + Playwright for static/dynamic extraction, PySpark + NLP for preprocessing, and LangChain RAG with a custom response-likeliness metric to evaluate performance.

LC

Mid-level Data Scientist specializing in NLP, recommender systems, and ML deployment

Fairfax, VA · 4y exp
ProvenBase · NJIT

At ProvenBase, built and shipped a production LLM-powered semantic search and candidate matching platform (RAG with GPT-4/Gemini, multi-agent orchestration, Elasticsearch vector search) to scale sourcing across 10M+ candidate records and 1,000+ data sources. Drove sub-second performance, cut LLM spend 30% with routing/caching, and improved recruiting outcomes (+45% sourcing accuracy; +38% visibility of underrepresented talent) through bias-aware ranking and tight collaboration with recruiting stakeholders.

Puruhuthika Y

Screened

Mid-level Software Engineer specializing in backend engineering and applied AI workflows

Austin, TX · 4y exp
Western Union · Northwest Missouri State University

Backend engineer with fintech/transaction-processing experience who built and optimized a Spring Boot + PostgreSQL + AWS service handling money transactions, resolving peak-traffic latency via query/index and connection pool tuning. Shipped an LLM-driven risk-flagging workflow integrated via a FastAPI Python service, owning prompt design, validation guardrails, monitoring, and human-in-the-loop escalation to reduce false positives and improve precision over time.


Krishna K

Screened

Junior Machine Learning Engineer specializing in multimodal systems and LLMs

Jersey City, NJ · 2y exp
JerseySTEM · University at Buffalo

Built and productionized a domain-specific LLM-powered RAG knowledge assistant at JerseySTEM for answering questions over large internal document corpora, owning the full stack from FAISS retrieval and LoRA/QLoRA fine-tuning to AWS autoscaling GPU deployment. Drove measurable gains (28% accuracy lift, 25% latency reduction) and improved reliability through hybrid retrieval, grounded decoding, preference-model reranking, and Airflow-orchestrated pipelines (35% faster runtime), while partnering closely with non-technical stakeholders to define success metrics and ensure adoption.

Prabhdeep Gandhi

Mid-level Software Engineer specializing in real-time IoT and event-driven platforms

5y exp
Eagl Technology · Savitribai Phule Pune University

Founding engineer at a startup building LLM/agentic workflows for public-safety customers, with hands-on experience delivering a hybrid on-prem + secure cloud solution to meet strict compliance needs. Implemented OpenTelemetry observability for multimodal agentic systems behind closed networks and used the resulting traces to optimize prompting/token usage for customer-specific security integrations. Regularly runs technical workshops and supports pre/post-sales by translating integration feedback into product roadmap decisions.

Hetvi Patel

Screened

Mid-level Software/Data Engineer specializing in cloud ETL pipelines and data infrastructure

New Jersey · 5y exp
Plore AI · Avila University

Backend/data engineer who built a production analytics data service (Python/FastAPI on AWS/Postgres with PySpark ETL) handling millions of records per day and drove major latency improvements (10–15s to <2s) via indexing, Redis caching, and shifting aggregations into ETL. Also shipped an LLM-based natural-language-to-SQL assistant end-to-end with strong guardrails (schema restrictions, read-only validation, RBAC, masking) and designed a multi-step agent workflow with verification and fallback logic.

HC

Mid-level Data Engineer specializing in cloud data platforms and ETL automation

Atlanta, GA · 4y exp
Blue Diamond Technologies · University of Texas at Arlington

Data engineer who has owned high-volume production pipelines end-to-end (200–300 GB/day) on AWS, implementing strong data quality/observability and achieving 99.9% reliability while cutting data issues ~33%. Also built a large-scale external data collection system ingesting millions of records/day with anti-bot/rate-limit handling and backfill tooling, and shipped a versioned REST service exposing curated Snowflake data to downstream teams.

Aditya Sharma

Screened

Intern Machine Learning Engineer specializing in deep learning and LLM systems

Tempe, AZ · 0y exp
Arizona State University · Arizona State University

Built and shipped a personal LLM-powered news aggregation platform (Clear Brief) that scrapes ~200 articles per cycle, clusters them into ~15–30 consolidated stories, and supports on-demand deep dives via a Next.js API route. Emphasizes production-minded reliability (token/cost controls, timeouts, graceful frontend degradation) and database-backed orchestration using SQLite with retry + exponential backoff for burst processing.

SN

Mid-level Software Engineer specializing in full-stack and machine learning systems

Clemson, SC · 4y exp
MyUI.ai · Clemson University

Full-stack product engineer who led system design and backend/cloud architecture for a senior-living platform spanning an Android kiosk and admin web portal. They combine Azure microservices expertise with strong accessibility instincts, and their UI/UX improvements for seniors and wheelchair users reportedly helped drive 21% revenue growth and a new customer through word of mouth.

SG

Mid-level Data Analyst specializing in ETL pipelines and business intelligence

Albany, NY · 4y exp
Office of the New York State Comptroller · University at Albany

Analytics-focused candidate with hands-on experience building compliance and contract utilization reporting from messy contract, vendor, subcontractor, and payment data. They combine SQL and Python automation to improve reporting speed and accuracy, and show strong stakeholder discipline through validation sessions, documentation, and dashboard adoption.

Raunaq Bedi

Screened

Entry-level Software Engineer specializing in AI, data engineering, and cloud DevOps

San Francisco, CA · 1y exp
mParticle · Rochester Institute of Technology

Product-minded full-stack engineer with strong React/TypeScript, serverless AWS, and Postgres depth, highlighted by owning real-time personalization and onboarding experiences at mParticle. Stands out for combining deep performance debugging with measurable product impact—improving activation by 28%, reducing time-to-insights by 35%, and building reusable internal platform primitives adopted by 12 teams.

Aryan Karadia

Intern Full-Stack Software Engineer specializing in web apps and edge ML

Calgary, AB · 1y exp
Engineered Air · University of Calgary

Sakshi Ravindra Asati

Junior Data Engineer and ML Engineer specializing in backend systems and applied AI

Boulder, CO · 3y exp
Innovation & Entrepreneurship Initiative, CU Boulder · University of Colorado Boulder

VN

Mid-level Data Analyst specializing in AML, fraud detection, and cloud data pipelines

Remote · 5y exp
Outlier · Governors State University

DV

Senior Data Engineer specializing in cloud lakehouse and AI/ML pipelines

Boulder, CO · 4y exp
CLD-9 · University of Colorado Boulder

SR

Mid-level Software Engineer specializing in full-stack and distributed systems

Colorado Springs, Colorado · 3y exp
University of Texas at Dallas · University of Texas at Dallas

Kiran Ranganalli

Junior Data Engineer specializing in cloud data pipelines and warehousing

San Francisco, CA · 2y exp
San Francisco State University · San Francisco State University
