Vetted Data Cleaning Professionals

Pre-screened and vetted.

Data Cleaning, Python, SQL, pandas, AWS, Docker

Hendrix Crouther

Screened

Junior economics and statistics analyst specializing in healthcare and market research

Berkeley, CA · 2y exp

Berkeley Economic Review · UC Berkeley

“Candidate brings a cross-functional mix of early-stage startup consulting, marketing analytics, and outbound/GTM exposure, including work with a radiology startup on market positioning and investor-facing materials. They stand out for combining research and data analysis with clear communication, and have a strong self-driven interest in B2B SaaS, workflow automation, and scalable outbound systems.”

Python, R, SQL, Data Cleaning, Google Analytics, Canva, +44 more

Yashodhara Dixit

Screened

Senior Project Manager specializing in Healthcare IT and SaaS implementations

South Brunswick, NJ · 13y exp

CaseWorthy · Savitribai Phule Pune University

“PMP-certified implementation project manager with 7+ years leading enterprise SaaS rollouts in government-funded healthcare and human services environments. At CaseWorthy, they have owned full-lifecycle deployments of ClientTrack and the Azure OpenAI-based CARA AI copilot, managing complex data migration, compliance, and stakeholder governance challenges across multiple concurrent projects.”

Project Management, SDLC, Requirements Gathering, Data Migration, Data Validation, Agile, +150 more

Keerthana Priya

Screened

Mid-level Data Analytics & ML Engineer specializing in NLP, LLMs, and cloud data platforms

Dallas, TX · 5y exp

Mattel · Kennesaw State University

“At KPMG, built and productionized a secure RAG-based LLM assistant that lets business and risk stakeholders query data warehouses in natural language, reducing dependence on data engineers for ad-hoc analysis. Demonstrates strong production rigor (Airflow orchestration, CI/CD, containerization), retrieval/embedding tuning (rechunking, semantic abstraction for structured data), and reliability controls (confidence thresholds, refusal behavior, monitoring and canary evals).”

SQL, Python, R, PySpark, Apache Spark, Pandas, +123 more

Ralish Routray

Screened

Mid-level Data Scientist & Machine Learning Engineer specializing in fraud and forecasting

USA · 5y exp

JPMorgan Chase · University of Texas at Dallas

“ML/LLM practitioner who has shipped production RAG systems (summarization + Q&A) and end-to-end Airflow-orchestrated demand forecasting pipelines at NEON IT. Strong focus on reliability—uses evaluation scripts, retrieval/chunking tuning, validation/retries/alerts, and stakeholder-driven iteration to make AI workflows consistent and usable.”

SQL, Python, Pandas, NumPy, Machine Learning, Classification, +64 more

Iaroslav Kovalchuk

Screened

Junior ML Engineer specializing in energy forecasting and battery optimization

San Carlos, CA · 3y exp

ElectricFish · University of Michigan

“Backend/ML engineer working on a battery energy storage system operations dashboard: built a Flask backend integrated with OAuth and a separate FastAPI optimization/simulation service, deployed via Docker CI/CD to Azure Container Apps. Strong in productionizing ML (AzureML to batch endpoints) and in performance/scalability patterns (Postgres indexing/JSONB, per-unit data isolation, async throttling + caching for year-long CPU-intensive simulations across 40+ scenarios).”

Azure Machine Learning, Bash, CI/CD, C, C++, Computer Vision, +78 more

Neha P

Screened

Mid-level Full-Stack Java Developer specializing in cloud-native microservices

Texas, State · 4y exp

Bank of America · University of Central Missouri

“Full-stack engineer with Bank of America experience modernizing a large-scale financial reporting platform. Built React frontends and Java/Spring Boot microservice APIs end-to-end, optimized data-heavy SQL performance (indexing/caching/pagination), and implemented an AI feature for forecasting and anomaly detection using Python/scikit-learn, with deployments supported on AWS.”

Amazon EC2, Amazon S3, Ansible, Azure DevOps, Bash, Bootstrap, +125 more

Hanish Kukkala

Screened

Mid-level Data Scientist specializing in Generative AI and NLP

USA · 6y exp

CVS Health · University of Central Missouri

“ML/GenAI engineer with recent CVS Health experience building a production RAG system over unstructured financial/research documents using LangChain, FAISS, and Pinecone, plus LoRA/PEFT fine-tuning of GPT/LLaMA for domain-aware summarization. Demonstrates strong applied MLOps and data engineering skills (Airflow/Prefect, Docker/Kubernetes, CI/CD, MLflow) and measurable impact (sub-second retrieval, ~40% better context retrieval, ~25% entity matching improvement).”

A/B Testing, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, AWS, +170 more

Sailaja Lokasani

Screened

Mid-level Data Engineer specializing in cloud ETL/ELT and healthcare analytics

Dallas, TX · 5y exp

Lightbeam Health Solutions · Syracuse University

“Healthcare-focused data engineer/ML practitioner with experience at Lightbeam Health Solutions and Humana building production entity-resolution and semantic similarity pipelines across EMR, lab, and claims data. Uses NLP/ML (spaCy, scikit-learn, BioBERT/LightGBM) plus Snowflake/Airflow and vector search (Pinecone) to improve linkage accuracy (reported 90%) and semantic match quality (reported +12–15%), while reducing manual cleanup by 40%+.”

Apache Airflow, AWS, AWS Glue, AWS Lambda, Agile, C++, +134 more

Vamshi Arempula

Screened

Senior AI/ML Engineer specializing in Generative AI, RAG, and agentic systems

6y exp

Wellmark Blue Cross and Blue Shield · Indiana Wesleyan University

“GenAI/LLM ML engineer (currently at Webprobo) building an enterprise GenAI platform with document intelligence and automation on AWS and blockchain. Has hands-on experience with RAG, LLM evaluation tooling, and orchestrating production LLM workflows with Apache Airflow, plus deep exposure to reliability challenges in globally distributed/edge deployments. Also partnered with business/marketing stakeholders at a banking client to deliver an AI-driven customer retention insights solution.”

A/B Testing, Agile, Amazon API Gateway, Amazon Bedrock, Amazon CloudWatch, Amazon Redshift, +212 more

Mayur Komaravelly

Screened

Senior Data Analyst specializing in data pipelines, web scraping, and legal data enrichment

Illinois, USA · 5y exp

The Hartford · Indiana Wesleyan University

“Data engineer focused on reliable, scalable analytics pipelines and external data collection. Has owned end-to-end pipelines processing 5–10M records/day, serving Snowflake data marts to Power BI/Tableau, and reports ~99% reliability through strong validation/monitoring. Also shipped versioned REST APIs for curated data with query optimization and caching.”

Apache Airflow, Apache Kafka, Apache Spark, Ansible, API Design, AWS Glue, +140 more

Fangjian Xiong

Screened

Junior Machine Learning Engineer specializing in NLP and biomedical entity extraction

Boston, MA · 2y exp

Northeastern University · Northeastern University

“Built and deployed a production LLM-powered biomedical knowledge extraction pipeline that processed millions of papers to identify tools/techniques and produce a unified knowledge graph via active learning NER (Prodigy + spaCy transformers) and entity linking (Bio-tools/Wikidata). Addressed hard NLP engineering challenges like WordPiece span-offset alignment and scaled inference over ~1.5M documents using batching/caching, containerized services, async workers, and orchestration with Prefect/Airflow.”

AI Agents, AWS, BigQuery, C#, C++, Data Preprocessing, +94 more

Sai Vardhan Reddy

Screened

Mid-Level Data Engineer specializing in cloud data platforms and governed analytics

5y exp

Optum · University of Central Missouri

“Data engineer with Optum experience building end-to-end healthcare data pipelines for HL7/FHIR, processing millions of records daily across Kafka streaming and Databricks/Spark batch. Strong focus on data quality (schema enforcement/validations), reliability (Airflow monitoring/alerts), and analytics-ready serving in Snowflake powering Power BI/Tableau, with CI/CD via Git and Jenkins.”

AWS, Amazon EC2, AWS Lambda, AWS Glue, Amazon S3, Amazon Kinesis, +94 more

Tanvi Dasaripally

Screened

Mid-level Cloud Data Engineer specializing in Azure/AWS pipelines and medallion architecture

USA · 4y exp

UnitedHealth Group · Southern Illinois University Carbondale

“Data engineer focused on reliability and data quality, owning end-to-end pipelines processing ~100k–300k records/day. Implemented robust validation and monitoring that cut reporting issues by ~30%, and built stable external data collection with anti-bot measures, backfills, and schema-change detection while maintaining backward-compatible internal data services.”

Python, SQL, PySpark, Apache Kafka, Azure Data Factory, AWS, +72 more

Sriraj Samala

Screened

Mid-level Data Analyst specializing in business analytics and BI

Dayton, OH · 3y exp

University of Dayton · University of Dayton

“Analytics professional with higher education experience at the University of Dayton, focused on turning inconsistent operational data into standardized metrics and recurring dashboards. They combine SQL, Python, and Power BI to automate reporting, improve data integrity, and reduce manual reporting by 30%, with outputs adopted in semester planning and cross-department performance tracking.”

Power BI, Tableau, Microsoft Excel, Python, Pandas, NumPy, +61 more

Nihar Turlapati

Screened

Intern-level Software Engineer specializing in AI/ML systems

Frankfort, KY · 2y exp

UPS · Purdue University

“Built production LLM/RAG systems during a UPS internship, including a shipment knowledge agent used across 15+ hubs worldwide and a multi-agent PDF RAG workflow. Stands out for combining hands-on enterprise integration with rigorous evaluation, hallucination reduction, and efficient fine-tuning techniques like LoRA.”

C, C++, C#, Python, SQL, Java, +108 more

Dhruv Pandoh

Screened

Junior Full-Stack Software Engineer specializing in AI, FinTech, and e-commerce

New York, USA · 2y exp

MIO Partners · NYU

“Built both traditional internal tooling and LLM-powered systems during an internship, including a React/Python/AWS calculator onboarding platform and a production-style ROS2 RAG assistant over 10K+ documents. Stands out for combining full-stack delivery, stakeholder coordination, and practical AI reliability work like retrieval tuning, source-grounded answers, and low-confidence fallbacks.”

Python, JavaScript, SQL, C++, Java, Django, +142 more

Polam Srija

Screened

Mid-level AI/ML Engineer specializing in Generative AI and FinTech

Texas, USA · 3y exp

Fidelity Investments · University of North Carolina at Charlotte

“AI Engineer with hands-on ownership of a production multi-agent RAG platform in financial services, spanning experimentation, architecture, deployment, monitoring, and iterative optimization. Stands out for measurable impact: 35% retrieval relevance improvement and nearly 50% reduction in manual operational analysis effort, plus strong experience making enterprise LLM systems safer and more reliable in production.”

Python, SQL, Java, C, C++, JavaScript, +176 more

Supreet Purthpli

Screened

Mid-level AI/ML Software Engineer specializing in cloud-native MLOps and FinTech

San Francisco, CA · 4y exp

JPMorgan Chase · University of Kansas

“Software engineer with JPMorgan Chase experience delivering end-to-end fintech features (Next.js/React/Node/Postgres on AWS) and measurable performance gains. Built and productionized an AI-native credit decisioning workflow combining LLMs, vector retrieval, and a rules engine with strong governance (bias checks, auditability, human-in-loop), improving precision and cutting underwriting turnaround time by 40%.”

Python, Java, SQL, PySpark, JavaScript, React, +293 more

Vinodini Bassetti

Screened

Entry Data Scientist specializing in data engineering and automotive analytics

Bangalore, India · 1y exp

Tata Elxsi · University of Cincinnati

“Frontend-focused candidate with hands-on experience building React and TypeScript dashboards for searching, filtering, and analyzing large datasets in real time. Demonstrates practical performance tuning skills using React DevTools, memoization, debouncing, and pagination, and has also built a Mapbox-based location data dashboard with interactive markers and popups.”

Python, SQL, PySpark, Shell Scripting, Git, GitHub, +73 more

Soham Kukkar

Screened

Mid-level Software Engineer specializing in AI and FinTech backend systems

Oakland, CA · 4y exp

Capital One · Clark University

“Full-stack and AI engineer with Capital One experience spanning real-time customer dashboards and production fraud-analysis systems. They combine TypeScript/Next.js/Node.js product engineering with LangChain-based RAG architecture over a 400 GB credit-report corpus, delivering measurable impact including 35% lower frontend latency and 45% faster analyst workflows.”

Python, Java, JavaScript, TypeScript, SQL, Shell Scripting, +113 more

Kishan Peesapati

Screened

Senior AI Engineer specializing in Generative AI and RAG applications

8y exp

Keurig Dr Pepper · George Mason University

“AI engineer who has shipped production LLM systems across customer service and marketing use cases—building a RAG app on Azure OpenAI and speeding retrieval with Redis caching tied to Okta sessions. Also implemented a LangGraph multi-agent workflow that pulls image context from Figma to generate structured HTML marketing emails, adding a verification agent to improve image-selection accuracy while optimizing solution cost for business stakeholders.”

Generative AI, Machine Learning, Deep Learning, Retrieval-Augmented Generation (RAG), Predictive Modeling, Model Monitoring, +86 more

Jahnavi Lasyapriya Vavilala

Screened

Junior Machine Learning Engineer specializing in LLMs, NLP, and computer vision

Bengaluru, Karnataka · 2y exp

PwC · Arizona State University

“Built a production, agentic multi-agent pharmaceutical intelligence system for US oncology (breast cancer) conference/news intelligence, automating MSL-style information gathering and summarization for pharma and healthcare stakeholders. Uses CrewAI + LangChain orchestration, custom scraping across ~15 pharma newsrooms, and a grounding-score evaluation approach (sentence transformers/cosine similarity) to mitigate hallucinations.”

Python, SQL, R, Java, JavaScript, Snowflake, +121 more

Narayanaroyal Marisetty

Screened

Mid-level Data Scientist/ML Engineer specializing in healthcare AI and MLOps

USA · 4y exp

CVS Health · University at Buffalo

“Designed and deployed an enterprise LLM-powered clinical/pharmacy policy knowledge assistant at CVS Health, replacing manual searches across PDFs/Word/SharePoint with a HIPAA-compliant RAG system. Built end-to-end ingestion and orchestration (Airflow + Azure ML/Data Lake + vector index) with PHI masking, versioned re-embedding, and production monitoring (Prometheus/Grafana), and partnered closely with clinicians/compliance to ensure policy-grounded, auditable answers.”

A/B Testing, Apache Airflow, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, +132 more

Ramiz Qudsi

Screened

Principal Data Scientist & Software Engineer specializing in space mission data systems

Boston, MA · 13y exp

Boston University · University of Delaware

“Space/heliophysics ML engineer who built a PyTorch GRU model to propagate solar wind from L1 to the magnetopause with probabilistic outputs for uncertainty quantification, achieving ~25% better CRPS than standard approaches. Also developed production-grade Python ETL and an open-source telemetry processing package for a mission (LEXI), using Docker and GitHub Actions CI/CD and iterating with scientist/engineer stakeholders.”

Python, MATLAB, Bash, SQL, PyTorch, Scikit-learn, +75 more
