Vetted Databricks Professionals

Pre-screened and vetted.

Hari Billa

Screened

Mid-level Data Scientist specializing in machine learning, NLP, and healthcare AI

USA | 3y exp
HCA Healthcare | Southern Arkansas University

Data scientist with hands-on ownership of production ML and GenAI systems across enterprise churn, clinical Q&A, and real-time fraud detection. Stands out for combining strong MLOps discipline with measurable business impact, including $2M+ retained revenue, 10K TPS low-latency fraud infrastructure, and a clinician-reviewed RAG system that improved retrieval accuracy by ~38%.

Ruturaj Dixit

Screened

Junior Data Scientist specializing in AI/ML and product analytics

New York, NY | 2y exp
Pace University | Pace University

Applied ML/data scientist who has owned backend-heavy AI systems end-to-end, including a market-signal platform on FastAPI/AWS and rapid MVP delivery in medical computer vision. Particularly interesting for teams needing someone who can combine model development, backend APIs, production debugging, and pragmatic low-latency architecture decisions.

Vaishnavi K

Screened

Mid-level AI/ML Engineer specializing in GenAI, MLOps, and anomaly detection

USA | 5y exp
TCS | University of New Haven

LLM/MLOps engineer who has shipped a production RAG-based technical documentation assistant (FastAPI) cutting manual review by 45%, with deep hands-on retrieval optimization in Pinecone/LangChain (HNSW, hybrid + multi-query search, caching). Also brings healthcare domain experience—building Airflow-orchestrated EHR pipelines and delivering FDA-auditability-friendly predictive maintenance solutions using SHAP/LIME explainability surfaced in Power BI.

Thilak P

Screened

Mid-level Data Engineer specializing in cloud ETL/ELT and big data pipelines

5y exp
W. R. Berkley | Sacred Heart University

Backend/data engineer who builds Python (FastAPI) data-processing API services for internal analytics/reporting, emphasizing modular architecture, async performance tuning, and reliability patterns (health checks, retries, observability). Also migrated legacy on-prem ETL pipelines to Azure using ADF/Data Lake/Functions and implemented a near-real-time ingestion flow with Event Hubs plus watermarking to handle late events and deduplication.

YP

Mid-level AI Engineer specializing in LLMs, RAG, and data engineering

Boston, MA | 5y exp
Humanitarians.AI | Northeastern University

AI Engineer Co-Op at Northeastern University who built a production Patient Persona Chat Bot to help nursing students practice clinical interactions, fine-tuning Llama 3 and integrating a LangChain + Pinecone RAG pipeline deployed on Amazon Bedrock. Emphasizes clinical accuracy and reliability with guardrails, retrieval filtering, and continuous evaluation, and also brings strong data engineering/orchestration experience (Airflow, EMR/PySpark, ADF, dbt, Databricks, Snowflake).

NB

Mid-level Data Scientist specializing in ML, NLP, and LLM-powered solutions

Tampa, FL | 4y exp
Lumen | University of South Florida

AI/NLP-focused practitioner who built a zero-/few-shot LLM event extraction system on the long-tail MAVEN dataset, combining prompt-structured outputs with LoRA/QLoRA fine-tuning and rigorous F1 evaluation. Also implemented entity resolution/data cleaning pipelines and embedding-based semantic search using Sentence-BERT + FAISS, and has healthcare experience delivering a multilingual speech/translation mobile prototype using HIPAA-compliant Azure Cognitive Services.

Snehitha Penumaka

Mid-level AI/ML Engineer specializing in predictive modeling and cloud ML pipelines

Dallas, TX | 3y exp
Cambard LLC | University of Texas at Dallas

LLM engineer/data engineer who has deployed production RAG systems for internal-document Q&A, building end-to-end ingestion, embedding, vector search, and FastAPI serving while actively reducing hallucinations and latency through rigorous retrieval tuning and caching. Also experienced in orchestrating cloud data pipelines (Airflow, AWS Glue, Azure Data Factory) and partnering with non-technical business teams to deliver AI solutions like automated document review.

AK

Mid-level AI/ML Engineer specializing in production ML, RAG systems, and MLOps

KS, USA | 4y exp
Black & Veatch | University of Central Missouri

Built and shipped a widely adopted, production-grade RAG internal search assistant that unified scattered engineering knowledge, deployed as a FastAPI service on Kubernetes with FAISS + LangChain. Demonstrates deep practical expertise in retrieval tuning (chunking, hybrid search, re-ranking) and in making LLM workflows reliable in production via guardrails, monitoring, and evaluation, plus strong cross-functional delivery with non-technical operations teams.

Riya Chaddha

Screened

Mid-level Data Engineer and Business Analyst specializing in cloud ETL and analytics

Remote, US | 5y exp
Mellicell | Northeastern University

Data analyst with cross-industry experience spanning insurance analytics at L&T Infotech and experimental imaging analytics at Mylyser. Stands out for building scalable SQL/PySpark data pipelines, standardizing business-critical metrics like claims lifecycle and policy retention, and delivering measurable impact such as 50%+ faster query performance and a 15% reduction in claims settlement time.

David Alvarado

Junior Business Analyst specializing in data analytics and BI

Orlando, FL | 3y exp
Chubb | University of Central Florida

Analytics candidate with insurance domain experience at Chubb, combining strong SQL/Python data engineering for claims reporting with business-facing metric design in Power BI. Also built an MLB game outcome predictor that beat Vegas implied probabilities using public data, showing strong product thinking and applied modeling ability beyond standard BI work.

MK

Mid-level AI/ML Engineer specializing in Generative AI and MLOps

Arlington, TX | 4y exp
micro1 | University of Texas at Austin

Built and shipped a production RAG assistant using GPT-4, LangChain, and Pinecone/FAISS to search 50K+ institutional documents, with a strong focus on groundedness and hallucination reduction through retrieval optimization and re-ranking. Pairs this with a metrics-driven evaluation/monitoring approach (BLEU/ROUGE, manual sampling, logging) and workflow automation via Airflow, and has experience translating stakeholder needs into iterative AI prototypes.

Andrew Clayman

Senior Data Scientist specializing in ML, NLP, and production AI systems

Remote | 8y exp
Appstem | University of Southampton

Machine learning/NLP engineer with deep Azure stack experience (Data Factory, Databricks/Spark, Delta Lake, Azure OpenAI, Azure AI Search) who built end-to-end production systems for semantic clustering, entity resolution, and hybrid search. Demonstrated measurable gains from embedding fine-tuning (~15% retrieval precision, ~10–12% nDCG@10) and designed scalable, quality-checked pipelines with MLOps best practices.

Somasekhar G

Screened

Mid-level Data Engineer specializing in cloud big data and streaming pipelines

California, USA | 4y exp
Smarc Solutions Inc | University of Colorado Boulder

Data engineer focused on large-scale financial data platforms, with hands-on ownership of an AWS + Databricks + Snowflake pipeline processing ~2TB/day. Strong in data quality (Great Expectations), schema drift automation, and production reliability (99.9%), plus measurable performance/cost wins (4h→1.2h, ~25% cost reduction). Also built an async Python crawling/ingestion framework with anti-bot mitigation, retries, and Airflow-driven backfills.

Jimmy Dani

Screened

Mid-level AI Researcher specializing in privacy-preserving ML and applied cryptography

College Station, TX | 6y exp
Texas A&M University | Texas A&M University

Graduate researcher who builds production-grade AI systems spanning LLM security evaluation and on-device RAG. Created HoneyLearner, a self-learning attack framework using GPT-4-class models as structured black-box attackers against honeywords defenses, with rigorous metrics and reproducible orchestration (Airflow/Spark/Kafka/Docker). Also partnered with agriculture scientists at Texas A&M–Corpus Christi to deliver UAV + 3D point-cloud crop-stress maps that cut time-to-insight ~40% and enabled ~30% earlier interventions.

SK

Mid-level AI Developer & Machine Learning Engineer specializing in LLM and MLOps systems

Champaign, IL | 5y exp
Centene | Eastern Illinois University

Built and deployed an enterprise RAG application at Centene to help clinical teams retrieve insights from large internal policy document sets, cutting manual research by 30–40%. Implemented custom domain-adapted embeddings (SageMaker + BERT transfer learning) and hybrid retrieval (BM25 + Pinecone) to drive a 22% relevance lift, and ran the system in production on AWS EKS with CI/CD, MLflow, and Prometheus monitoring (99% uptime, ~40% latency reduction).

LG

Junior Business Analytics & SAP BASIS professional specializing in AI and predictive modeling

Denton, TX | 3y exp
University of North Texas | University of North Texas

Built and deployed a production LLM-powered email assistant (“wood flow”) for a local pet resort to automate after-hours inbound email handling, including email categorization and context-aware auto-responses. Uses n8n for orchestration and applies CRISP-DM, load/edge-case testing, and RAG-based context retrieval, and has experience presenting AI solutions with budgeting and ROI to a non-technical founder.

SB

Mid-level AI/ML & Data Engineer specializing in MLOps and cloud data pipelines

Remote, USA | 4y exp
Merkle | University of North Carolina at Charlotte

AI/ML engineer (Merkle) with hands-on experience deploying RAG-based LLM applications and real-time recommendation engines into production. Strong in cloud/on-prem architectures, GPU autoscaling, caching, and network optimization—delivered measurable latency reductions (40–70%) and improved retrieval relevance by systematically benchmarking chunking/embedding configurations and validating pipelines via CI/CD.

KG

Mid-Level Forward Deployed AI Engineer specializing in RAG systems and backend microservices

Austin, TX | 4y exp
Sequretek | Stevens Institute of Technology

LLM solutions practitioner with SOC/alert-triage experience who takes LLM prototypes to production using RAG (Pinecone), FastAPI services, guardrails, CI/CD, monitoring, and robust fallback logic. Known for rapid real-time debugging of embedding/vector and agent workflow issues, and for driving adoption through code-first workshops and sales-aligned custom demos with measurable improvements (35% faster triage; 40% increase in correct tool usage).

LD

Mid-level AI/ML Engineer specializing in LLMs, RAG pipelines, and MLOps

Atlanta, GA | 3y exp
AIG | Kennesaw State University

Data professional with ~4 years of experience, most recently at AIG (insurance), building ML/NLP systems for fraud detection and policy automation using transformers, CNNs, and clustering/anomaly detection. Also developed a RAG-based knowledge retrieval system, iterating across embedding models and moving to production based on precision and latency SLAs, then containerizing and deploying with SageMaker and CI/CD.

Praharsha Jandhyala

Mid-level Data Scientist/Data Analyst specializing in ML, BI dashboards, and ETL pipelines

Dallas, TX | 4y exp
Humana | Arizona State University

Data/ML practitioner with experience at Humana and Hexaware, focused on turning messy, semi-structured datasets into production-ready pipelines. Built an age-prediction model from book ratings using heavy feature engineering and multiple regression models, and has hands-on entity resolution (deterministic + fuzzy matching) plus embeddings/vector DB approaches for linking and search relevance.

Rakesh Thota

Screened

Mid-level Data Engineer specializing in multi-cloud real-time data pipelines

California, USA | 5y exp
Molina Healthcare | University at Buffalo

Data engineer with healthcare/clinical trial domain experience who owned a 100TB+/month AWS pipeline end-to-end (Glue/S3/Redshift/Airflow) and drove measurable outcomes (20% lower latency, 99.9% reliability, 40% less manual reporting). Also built production data services and API-based ingestion on GCP (Cloud Run/Functions/BigQuery) with strong validation, versioning, and safe migration practices, and launched an early-stage RAG solution (LangChain + GPT-4) for researchers.

GM

Mid-level Data Engineer specializing in Azure, Spark, and scalable ETL/ELT pipelines

Charleston, IL | 4y exp
Eastern Illinois University | Eastern Illinois University

Data engineer with banking FP&A experience who led an end-to-end migration of 10+ TB from Teradata to Azure (ADF + Data Lake + Databricks/PySpark + Synapse). Emphasizes reliability (multi-stage validation, monitoring/alerts) and performance (Spark tuning, incremental loads, autoscaling), reporting ~99.5% pipeline reliability while supporting downstream consumers with stable schemas and clear change management.

Anay Dongre

Screened

Junior Machine Learning Engineer specializing in GenAI and LLM fine-tuning

Pomona, California | 1y exp
Aerolift.AI | Cal Poly Pomona

Robotics software engineer focused on hard real-time autonomy for legged robots, building a quadruped navigation stack that combines vision SLAM with MPC and maintains a deterministic 500Hz control loop. Deep performance optimization experience across CUDA (sub-2ms perception latency), ROS 2/DDS real-time tuning, and motion planning (cut 500ms spikes to sub-5ms). Also designed distributed ROS 2 + Zenoh communications between quadrupeds and aerial drones and validated robustness under lossy wireless conditions.

Ambuk Rehani

Screened

Mid-level AI/Backend Engineer specializing in RAG and data platforms

Dallas, TX | 7y exp
EAB | Arizona State University

Built and shipped a production LLM-powered financial Q&A interface that extracts precise numeric data from PDFs using a hybrid AWS Textract + LLM normalization pipeline, with confidence gating and guardrails to prevent unreliable answers. Experienced with LangChain-based RAG orchestration (chunking, memory, structured outputs) and collaborated closely with PMs/analysts on IRS Form 990 extraction requirements.

