Vetted PySpark Professionals

Pre-screened and vetted.

Gautam Agrawal

Mid-Level Software Engineer specializing in backend systems, cloud, and applied LLM/NLP

IN, USA · 4y exp
Project 990 · Indiana University Bloomington

Applied LLMs to classify long nonprofit mission statements into 8 segments without labeled data, using an ensemble of clustering/embedding methods plus zero-shot RoBERTa/BART and a Tree-of-Thought prompting pipeline with LLM-as-judge evaluation (Gemma). Also built LangChain/LlamaIndex agentic RAG workflows including a text-to-SQL data analysis assistant grounded on DB schema with retries and performance optimizations on an HPC cluster.

Vengalarao Pachava

Junior AI Data Engineer specializing in Azure Databricks lakehouse and GenAI RAG systems

Irving, TX · 2y exp
Cloud Rack Systems · Illinois Institute of Technology

Backend/applied AI engineer from Cloud Rack Systems who built production GenAI/RAG and data platforms on Azure/Databricks at enterprise scale (2.5M records/day). Known for making LLM systems behave like deterministic services via strict retrieval contracts, citation-based validation, and strong observability—shipping a knowledge assistant used daily by 50+ users while driving hallucinations near zero and materially improving latency and cost.

PN

Intern Software Developer specializing in ML, NLP, and data engineering

India · 1y exp
Karmanye Tech · University of Texas at Dallas

Robotics competition (ABU Robocon) team member who programmed two robots for a rugby-style game, integrating IoT sensors and real-time decision-making. Implemented low-latency, secure inter-robot communication by moving from Bluetooth to ESP8266/NodeMCU WiFi (with Bluetooth as backup) and used OpenCV plus CNN training workflows for vision-related tasks; no practical ROS/ROS2 experience.

Neel Thiru

Screened

Mid-level Data Analyst specializing in analytics engineering and financial services

3y exp
Lipdub Ai · Seneca Polytechnic

Data-driven growth and partnerships professional with experience leading an analytics/reporting vendor rollout end-to-end (vendor selection via stakeholder interviews and PoC, then negotiating scope/pricing/support and tracking adoption/efficiency/accuracy KPIs). At PC Financial, built regression and segmentation models to optimize multi-channel targeting (in-app/email/push), driving +15% campaign engagement and +10% PC Optimum offer loads, and ran behavior-triggered lifecycle experiments that lifted upsell conversion by 20%.

Sriram Krishna

Mid-Level Software Engineer specializing in AI/ML and cloud-native platforms

Redmond, WA · 5y exp
Quadrant Technologies · Seattle University

Backend/AI engineer who has built production LLM orchestration and agentic workflow systems in Python/FastAPI on Kubernetes across AWS/Azure. Demonstrated strong reliability engineering by debugging a real-world memory retention issue that caused latency spikes/timeouts, and strong data/performance chops with a PostgreSQL optimization that cut query latency from ~1.2s to ~15ms. Targets roles building scalable, guardrailed AI-driven workflow automation with robust observability and human-in-the-loop controls.

Manas Agarwal

Screened

Junior Full-Stack Software Engineer specializing in Python APIs, React, and cloud AI integrations

Superior, CO · 2y exp
VertexOne · University of New Haven

Customer-facing software engineer who builds and deploys practical AI/RAG solutions (e.g., an AI assistant for searching billing PDFs) by deeply understanding support workflows and iterating with users. Demonstrates strong production instincts—quickly stabilizing peak-traffic API timeouts with caching/background jobs, then implementing durable fixes with proper monitoring and maintainable code practices.

RR

Mid-level Data Scientist specializing in AI, analytics, and predictive modeling

Boston, MA · 4y exp
Humanitarians.AI · Northeastern University

Data analytics and BI professional with experience turning messy institutional and customer data into decision-ready reporting and predictive systems. They combine strong SQL/Python execution with end-to-end ownership of churn analytics, stakeholder alignment, and operational rollout into dashboards and CRM workflows.

RN

Junior Machine Learning Engineer specializing in data science and automation

Seattle, WA · 2y exp
Seattle University · Seattle University

Built and shipped an end-to-end AI-powered portfolio chatbot, owning the React frontend, FastAPI backend, and FAISS-based retrieval layer. Demonstrates hands-on full-stack product thinking with attention to UI performance, TypeScript maintainability, and post-launch iteration on response relevance and speed.

KP

Mid-level AI/ML Software Engineer specializing in GPU-optimized LLM inference and cloud microservices

Seattle, WA · 5y exp
DVR Softek · San José State University

Built and deployed a production RAG-based multilingual analytics assistant for healthcare operations, enabling non-technical teams to query claims/EHR and risk metrics with grounded explanations. Demonstrates strong end-to-end LLM system engineering (retrieval tuning, re-ranking, hallucination controls, verification layers) plus workflow orchestration (Airflow/Composer/Step Functions) and stakeholder-driven iteration via prototypes and dashboards.

BP

Intern Data Scientist specializing in GenAI agents, RAG, and ML platforms

Chicago, IL · 3y exp
Immerso.ai · Illinois Institute of Technology

LLM/agent systems builder who deployed a production hybrid router for immerso.ai that dynamically selects retrieval vs reasoning vs generative pathways, achieving an 82% factual-accuracy lift. Deep hands-on experience optimizing local Mistral 7B inference (4–5 bit GGUF quantization, KV-cache reuse) and building reliable RAG/agent workflows with LangChain/LangGraph/AutoGen across GCP Cloud Run and AWS (ECS/Lambda).

Vamsi Krishna

Screened

Senior Machine Learning Engineer specializing in MLOps and Generative AI

Austin, TX · 7y exp
Tungsten Automation · University of Central Missouri

Built and deployed a production generative-AI copilot at Tungsten that automates invoice/form extraction template creation, reducing weeks of manual model-building work. Combines fine-tuned LLMs (PyTorch/HuggingFace) with OpenCV layout grounding to reduce hallucinations, and runs an end-to-end Kubeflow-based MLOps pipeline with drift monitoring, canary releases, and automated retraining.

Dinal Dholiya

Screened

Mid-level Full-Stack Engineer specializing in AI-powered and cloud-native systems

Remote · 4y exp
Zentrais · University at Buffalo

Product-minded engineer who has owned features end-to-end, including a full onboarding redesign that lifted completion ~25% and a production LLM/RAG report-generation system with strong guardrails (schema-constrained JSON, confidence gating, logging) and an automated eval/regression loop built from real user queries. Also built a scalable research data pipeline ingesting messy PDFs/JSON/CSVs with normalization, idempotent reruns, observability, and cost/latency tradeoffs.

SR

Mid-level Backend Engineer specializing in Python APIs and cloud-native services

Texas, USA · 5y exp
Verveba Telecom · Northern Arizona University

Data engineer with experience at Morgan Stanley and Star Health owning production-grade lakehouse pipelines for credit risk and healthcare datasets. Built Azure/Databricks/Delta/Snowflake-based platforms processing millions of records per day with strong data quality, observability (Monte Carlo/Azure Monitor), and reliability practices, plus experience delivering curated data services with performance tuning and backward-compatible versioning.

Sampath Achalla

Mid-level Python Full-Stack Engineer specializing in AI microservices and cloud data platforms

USA · 3y exp
DoJaGa · Illinois Institute of Technology

Backend-leaning full-stack engineer in fintech/payments who shipped an end-to-end Stripe payments + webhook system for a financial microservices platform, emphasizing ledger accuracy via idempotency, transactional writes, retries, and DLQs. Also delivered a real-time React/TypeScript payment status dashboard informed by user interviews, and cut production p95 latency by 35% through PostgreSQL tuning and Redis caching on AWS.

Gomathy Selvamuthiah

Junior Data/AI Engineer specializing in MLOps, real-time pipelines, and LLM applications

Portland, US · 2y exp
SBD Technologies · Northeastern University

Built an LLM-driven MLOps agent at SBD Technologies that automated an EV-charging prediction workflow end-to-end, integrating with real-time Kafka/FastAPI systems supporting 120K+ chargers at 99.99% event delivery. Addressed frequent schema drift by implementing SQLAlchemy/Flyway validation (60% reduction in drift issues) and deployed as Kubernetes microservices with GitHub Actions CI/CD; also has Airflow-based ingestion/crawling experience into Snowflake and stakeholder-facing delivery via a Fleetcharge PWA.

AA

Senior Full-Stack AI/ML Engineer specializing in MLOps and GenAI

Belmont, Michigan · 10y exp
AvaSure · Capitol Technology University

Senior backend/data engineer who has built and maintained HIPAA-compliant, real-time clinical FastAPI services on AWS, orchestrating ML/LLM and vector DB calls with strong reliability patterns (auth, timeouts/retries, graceful degradation, idempotency). Also delivered AWS IaC/CI-CD (Terraform/Helm/GitHub Actions) across EKS/Lambda/SageMaker and built Glue/Spark ETL with schema evolution and data quality controls, plus demonstrated large SQL performance wins (15 min to <9 sec) and hands-on incident ownership.

Akram Ali

Screened

Senior Full-Stack Developer specializing in Node.js/TypeScript, cloud, and data engineering

Chicago, United States · 10y exp
BeyondMinds · University of the Punjab

Frontend/fullstack lead who inherited a messy psychological app with production issues, drove a rapid stabilization (2–3 weeks) and major performance/architecture overhaul (Redux Toolkit, memoization, caching, lazy loading, CDN offload to S3/CloudFront). Also owns delivery and infrastructure practices (multi-env, Docker, GitHub Actions CI/CD, AWS ECS + load balancing) and led a 1-week POC for an AI-powered trucking management system (app.neblo.ai).

Sam Sharif

Screened

Senior AI Engineer specializing in machine learning, GenAI, and MLOps

Drexel Hill, PA · 8y exp
Tech Prysm · Temple University

Built an end-to-end agentic population health strategy copilot for healthcare leadership, turning broad chronic disease questions into structured, evidence-backed strategy briefs. Stands out for combining healthcare domain knowledge with production-grade GenAI implementation, including LangGraph orchestration, Databricks/MLflow deployment, human review, and quality gates focused on citations, metrics, risks, and safety.

VC

Mid-level AI Engineer specializing in GenAI, agentic workflows, and RAG systems

USA · 6y exp
Federal Home Loan Bank · Indiana Tech

Built a production multi-agent RAG assistant using LangChain/LangGraph with OpenAI embeddings and FAISS, focusing on retrieval quality and latency (Redis caching, parallel retrieval, precomputed embeddings). Experienced orchestrating ETL/ML pipelines with Airflow and Databricks Workflows, and has delivered an AI assistant for business ops to extract insights from policy/compliance documents through close non-technical stakeholder collaboration.

PK

Senior AI/Data Engineer specializing in Agentic AI and Advanced RAG on Azure Databricks

United States · 7y exp
Spark Data Solutions · University of Cincinnati

Built production LLM/agent systems for procurement and contract spend controls, including a proactive contract value leakage detection platform that moved an organization from reactive audits to pre-payment rejection. Combines multi-agent orchestration (Semantic Kernel/LangChain/AutoGen), document AI benchmarking (Textract vs Azure DI), and MLOps/testing (MLflow, QTest/Pytest) with strong security practices (RAG-grounded responses to prevent prompt injection). Integrated anomaly alerts directly into SAP SES workflows and Power BI dashboards, citing ~$38M leakage addressed across large spend environments.

VC

Mid-level AI Engineer specializing in Generative AI, LLM fine-tuning, and RAG systems

Edison, NJ · 4y exp
EliteUS Software Solutions · Rutgers University

Built and deployed production LLM applications including a natural-language-to-read-only-SQL system focused on ambiguity handling and query safety (schema whitelisting, intent validation, confidence checks, deterministic execution). Experienced with LangChain-based, modular agent orchestration and RAG document QA for large PDFs, with a metrics-driven testing/evaluation approach and cross-functional delivery with marketing on an AI content recommendation/search tool.

Rizwana Shaik

Screened

Mid-level Full-Stack Software Engineer specializing in cloud-native apps and AI copilots

Dallas, TX · 4y exp
Integrated Digital Solutions · University of North Texas

Internship project building and deploying a LLaMA-based, RAG-enabled copilot inside a Professional Services Automation platform, enabling natural-language navigation, text-to-SQL reporting, and project/resource/budget insights across multiple modules. Addressed real production issues like context drift and vague queries with hybrid search, metadata enrichment, and an intent classification/rewriting layer, orchestrated via Apache Airflow—ultimately cutting PMO reporting time by 40%.

Ahmad Alomari

Screened

Senior Data Scientist & Machine Learning Engineer specializing in computer vision and production ML

Cleveland, OH · 7y exp
Cleveland State University · Cleveland State University

PhD in computer engineering who has built production-oriented ML/NLP systems for space-weather prediction using Spark-based ETL on noisy satellite sensor logs. Strong in entity resolution and semantic search—fine-tuned E5 embeddings with contrastive learning and deployed to Pinecone, improving top-5 retrieval precision by 25%—and emphasizes scalable, observable pipelines with Airflow, Docker, and CI/CD.

Sam Nick

Screened

Senior AI/ML Engineer specializing in Generative AI, LLMs, and NLP

Dallas, TX · 10y exp
Rexus Group

ML/AI engineer with hands-on experience building healthcare and fraud-detection systems from experimentation through deployment, monitoring, and retraining. Stands out for combining real-time IoT pipelines, cloud-native MLOps, and GenAI/RAG in regulated healthcare settings, with reported impact including reduced emergency response times and a 25% reduction in manual diagnosis time.
