Vetted Data Preprocessing Professionals

Pre-screened and vetted.

Ansh Krishna

Screened

Intern Data Scientist specializing in ML systems and LLM-powered analytics

Noida, India · 1y exp
Data Security Council of India · USC

Built an autonomous decision analytics LLM agent for end-to-end tabular binary classification, using RAG (FAISS) to retain context across multi-step queries. Deployed as a FastAPI service with production-style reliability features (schema-aware validation, fallbacks, retries, structured outputs) plus offline/online evaluation and monitoring to reduce analysis time and improve consistency versus stateless approaches.

Rushir Bhavsar

Intern AI/ML Engineer specializing in LLMs, MLOps, and distributed training

1y exp
Cadence Design Systems · Arizona State University

Founding AI engineer (June 2024) at Talon Labs who built and productionized an LLM-powered chatbot for interacting with proprietary supply-chain documents, deployed at scale (25 to 100,000 users). Experienced with RAG/LLM orchestration (LangChain, LlamaIndex, Groq AI) and production ops tooling (Kubernetes, Docker, Kubeflow, Airflow), with a metrics-driven approach to evaluation, observability, and stakeholder alignment.

Aniket Janrao

Screened

Junior Data Scientist specializing in healthcare ML and clinical NLP/LLMs

Houma, LA · 2y exp
Objective Medical Systems LLC · University at Buffalo

Healthcare-focused LLM engineer who has built two production clinical applications: an automated structured clinical report generator from physician-patient conversations and a RAG-based chatbot for retrieving patient history (procedures, allergies, etc.). Demonstrates strong applied RAG expertise (overlapping chunking, entity dependency graphs, temporal filtering, graph RAG) to reduce hallucinations/omissions and partners closely with clinicians to automate hospital workflows.

Arya Mane

Screened

Junior Full-Stack & AI/ML Engineer specializing in LLMs and multimodal document processing

Dallas, Texas · 1y exp
Receptro.AI · University of Texas at Dallas

Built a production RAG-based NBA player scouting assistant that embeds player profiles into FAISS, orchestrates retrieval and LLM recommendations with LangChain, and surfaces results via embedded Tableau dashboards. Demonstrates strong focus on evaluation/monitoring (batch tests, LLM-as-judge, latency/failure/token metrics) and has experience translating non-technical founder goals into DAPT + fine-tuning plans on curated data.

Ankita A Khartmol

Junior Backend Software Engineer specializing in conversational AI and cloud APIs

Bangalore, India · 1y exp
Harman · USC

Backend/ML-focused software engineer who built and evolved a Python/FastAPI backend for a large-scale conversational AI platform, decoupling API and inference services to improve stability and deployment velocity. Experienced in production hardening (timeouts/fallbacks/monitoring), secure multi-tenant systems (JWT/RBAC/RLS), and low-risk migrations using shadow deployments and incremental traffic ramp-ups.

PK

Junior Software Engineer specializing in AI/LLM backend systems

Los Angeles, CA · 2y exp
Easley-Dunn Productions · USC

Built production AI systems in high-stakes domains, including a medical RAG chatbot focused on reducing hallucinations and a document-processing workflow that automated manual PDF extraction. Demonstrates strong end-to-end ownership across backend services, APIs, LLM integration, and iterative reliability improvements based on real usage and failure analysis.

KP

Mid-level Data Analytics & ML Engineer specializing in NLP, LLMs, and cloud data platforms

Dallas, TX · 5y exp
Mattel · Kennesaw State University

At KPMG, built and productionized a secure RAG-based LLM assistant that lets business and risk stakeholders query data warehouses in natural language, reducing dependence on data engineers for ad-hoc analysis. Demonstrates strong production rigor (Airflow orchestration, CI/CD, containerization), retrieval/embedding tuning (rechunking, semantic abstraction for structured data), and reliability controls (confidence thresholds, refusal behavior, monitoring and canary evals).

Manav Bhasin

Screened

Junior Full-Stack Machine Learning Engineer specializing in production ML systems

San Jose, CA · 2y exp
AgroFocal Technologies Inc · San José State University

Software engineer who owned end-to-end delivery of customer-facing agricultural forecast reporting (crop yield/health) and iterated quickly via rigorous edge-case testing and customer feedback. Also built an internal ML training platform (TypeScript/React + Flask/Python + MongoDB) used by every developer, with architecture designed to stay responsive under heavy compute load.

Yogi Makadiya

Screened

Mid-Level Full-Stack Software Engineer specializing in cloud-native microservices and DevSecOps

Seattle, WA · 3y exp
CuraJoy · University of Maryland, College Park

Backend-leaning product engineer with DevSecOps depth who has shipped real-time, Kafka-driven data pipelines and AI-enabled customer-facing features to production on AWS. Built a Spring Boot API layer serving real-time predictions at 100K+ requests/day, improving latency by 35% and user task completion by ~25%, and delivered a React/TypeScript dashboard plus a Postgres audit/history model optimized for search and large event volumes.

Harini Vinu

Screened

Intern Software Engineer specializing in cloud, big data, and test automation

New York, United States · 1y exp
Qualitest · NYU

Interned at Qualitest, building and deploying an LLM-powered test automation system that reduced manual test creation and improved efficiency by roughly 40%. Demonstrates strong production engineering for LLM systems (timeouts/retries/monitoring/caching, prompt optimization, batching) and has scaled workflows to 100+ concurrent jobs; also has orchestration experience with AWS Step Functions and Kubernetes.

Fangjian Xiong

Junior Machine Learning Engineer specializing in NLP and biomedical entity extraction

Boston, MA · 2y exp
Northeastern University · Northeastern University

Built and deployed a production LLM-powered biomedical knowledge extraction pipeline that processed millions of papers to identify tools/techniques and produce a unified knowledge graph via active learning NER (Prodigy + spaCy transformers) and entity linking (Bio-tools/Wikidata). Addressed hard NLP engineering challenges like WordPiece span-offset alignment and scaled inference over ~1.5M documents using batching/caching, containerized services, async workers, and orchestration with Prefect/Airflow.

Nishad Kane

Screened

Mid-level Data Scientist & AI Engineer specializing in RAG, agentic AI, and production ML

5y exp
Xtrium AI · Arizona State University

AI/data engineer who built a production LLM-powered schema drift detection system (LangChain/LangGraph) to catch semantic data changes before they break downstream analytics/ML. Deployed on AWS with Docker/S3 and implemented an LLM-as-a-judge evaluation framework to improve trust, reduce hallucinations, and control false positives/alert fatigue. Collaborated with non-technical risk/business analytics stakeholders at EY by delivering human-readable drift explanations that improved confidence in financial analytics dashboards.

RK

Mid-level AI/ML Engineer specializing in MLOps and LLM-powered applications

Mountain View, CA · 5y exp
Intuit · University of Central Missouri

AI/ML engineer with production experience building a RAG-based internal analytics assistant (Databricks + ADF ingestion, Pinecone vector store, LangChain orchestration) deployed via Docker on AWS SageMaker with CI/CD and MLflow. Strong focus on real-world constraints—latency/cost optimization (LoRA ~60% compute reduction), hallucination control with citation grounding, and enterprise security/governance. Previously at Intuit, delivered an interpretable churn prediction system (PySpark/Databricks, Airflow/Azure ML) that improved retention targeting ~12%.

PM

Mid-level AI/ML Engineer specializing in NLP, Generative AI, and MLOps in Financial Services

Austin, TX · 5y exp
Charles Schwab · University of Central Missouri

ML/LLM engineer at Charles Schwab who built a production loan-advisor chatbot integrated with internal knowledge and loan-calculator APIs, adding strict numeric validation to prevent rate hallucinations and optimizing context to control costs. Also runs ~40 Airflow DAGs orchestrating retraining/ETL/drift monitoring with an automated Snowflake→SageMaker→auto-deploy pipeline, and uses rigorous testing plus canary rollouts tied to business metrics and compliance constraints.

NK

Senior Data Scientist / ML Engineer specializing in NLP, anomaly detection, and cloud ML platforms

Remote, CA · 10y exp
Emotionall · NMIMS University

ML/NLP practitioner who built customer-feedback topic modeling (NMF + TF-IDF) to diagnose chatbot-to-agent handovers and drove product/ops changes that reduced operational costs by 20%. Also developed LSTM-based intent recognition using Word2Vec/GloVe embeddings for semantic linking, and deployed an LSTM autoencoder for fraud anomaly detection that cut false positives by 25% while capturing 15% more fraud in A/B testing.

Ruijing Wang

Screened

Intern Data Scientist specializing in healthcare AI and experimentation

Boulder, CO · 1y exp
EchoPlus AI · Stevens Institute of Technology

Human-AI Design Lab practitioner who productionized a wearable-health anomaly detection system by evolving a standalone autoencoder into a hybrid autoencoder + GPT-based approach, backed by PySpark ETL and MLOps on AWS SageMaker/MLflow. Also has applied LLM troubleshooting experience (fine-tuned FLAN-T5 summarization) and partnered with BI teams to run A/B tests and improve retention via feature stores and experimentation.

Saniya Shinde

Screened

Mid-level Data Scientist specializing in NLP, LLMs, and RAG systems

Washington, DC · 4y exp
World Bank · George Washington University

Built and deployed a production-style vision-language pipeline that generates structured medical reports from chest X-rays using BioViLT embeddings, an image-text alignment module, and BiGPT fine-tuned with LoRA, delivered via Streamlit and hosted on AWS EC2. Also has collaboration experience presenting EDA findings, feature importance, and model performance to Ford managers while working with vehicle parts data at Bimcon.

Saicharitha Yanamandala

Mid-Level Software Developer specializing in Java, Cloud, and Microservices

Chicago, IL · 6y exp
Capital One · Chicago State University

Backend/Python engineer who owned an end-to-end FastAPI + AWS internal natural-language document Q&A system (Textract extraction, embeddings/vector DB, LLM integration) with strong focus on reliability and latency. Hands-on with Kubernetes + GitOps (Argo CD, Helm, rolling updates/auto-rollback) and built/optimized Kafka streaming pipelines using Prometheus/Grafana. Also supported a zero-downtime on-prem to cloud migration with parallel run and gradual traffic cutover.

Karan Javali

Screened

Mid-Level Full-Stack Software Engineer specializing in FinTech and cloud-native web platforms

Salt Lake City, Utah · 5y exp
Goldman Sachs · Arizona State University

Software engineer with experience at Goldman Sachs and Arizona State University’s Learning Engineering Institute, shipping production backend systems including a vendor equities invoice-generation service designed for extensibility across multiple vendors. Built Django REST + PostgreSQL backends with JWT auth and Pytest coverage, and delivered data-heavy, responsive Angular dashboards; also has exposure to AWS EC2 deployments and GitLab CI/CD automation.

Anthony Thanpoovong

Entry-level Software Engineer specializing in cloud, AI, and full-stack development

Toronto, Canada · 1y exp
TELUS · Toronto Metropolitan University

Backend/AI engineer with hands-on experience building LLM-powered data products and AI platform workflows, including a project that turns tabular datasets into graphs, summaries, and chat-based insights with 1-2 second latency. Also contributed at TELUS to a Sovereign AI Factory self-serve onboarding platform tied to 100+ NVIDIA H200 GPUs, combining applied LLM, platform, and infrastructure exposure.

VS

Senior AI/ML Engineer specializing in Generative AI and agentic systems

Texas, USA · 5y exp
Bank of America · Wichita State University

Built and deployed an agentic RAG assistant in production to automate enterprise knowledge search and multi-step workflows with tool calling, tackling real-world issues like hallucinations, retrieval accuracy, and latency. Demonstrates strong LLMOps and orchestration depth (MLflow, Airflow, LangGraph/LangChain/LlamaIndex) plus a metrics-driven approach to agent testing/evaluation and cross-functional delivery with business stakeholders.

AR

Mid-level Data Analyst specializing in healthcare and financial analytics

Texas, USA · 5y exp
Blue Cross Blue Shield · University of North Texas

Healthcare analytics candidate with hands-on experience turning messy claims and clinical data into validated SQL/Python pipelines and Power BI dashboards. They have delivered measurable impact in revenue cycle operations, including 15-18% improvement in reimbursement accuracy and 40-45% reduction in manual reporting effort.
