
Vetted Data Validation Professionals

Pre-screened and vetted.

Data Validation, Python, SQL, AWS, CI/CD, Git

Subhasmita Maharana

Screened

Mid-level Data Scientist specializing in NLP/LLMs, time series forecasting, and MLOps

New York, NY · 6y exp

Citigroup · Kent State University

“Data/ML practitioner with hands-on experience building NLP systems from prototype to production: delivered a Twitter sentiment classifier with robust preprocessing, SVM modeling, and Power BI reporting, and built entity-resolution pipelines for messy multi-source customer data (reporting ~95% improvement in unique entity identification). Also implemented semantic linking/search using SBERT embeddings with FAISS vector retrieval and domain fine-tuning (reported ~15% precision lift), and applies production workflow best practices (Airflow/Prefect, Docker, Azure ML/Databricks, Great Expectations).”

A/B Testing, Apache Airflow, Azure Machine Learning, BERT, CI/CD, Clustering, +170 more

Rishitha Reddy Buddala

Screened

Mid-level Full-Stack Developer specializing in cloud-native microservices and event-driven systems

4y exp

Molina Healthcare · University at Buffalo

“Software engineer with experience at Molina Healthcare and Target, owning production features end-to-end across backend, data pipelines, and UI. Built an event-driven claims validation system (Python/Java/Spring Boot/Kafka) with strong observability, and shipped embeddings-based semantic product search with evaluation loops (CTR/top-k + human review) and guardrails like keyword-search fallback.”

Java, Python, SQL, JavaScript, TypeScript, Spring Boot, +121 more

Yu Liu

Screened

Senior Big Data Engineer specializing in AML/KYC compliance and cloud data platforms

New York, NY · 17y exp

Citigroup · University of Missouri

“Data engineer with experience delivering an end-to-end pipeline handling ~3.5TB in a star-schema setup (fact + dimensions) and producing business-facing tables in Hive/Spark. Identified and resolved UAT-reported duplicate issues caused by joins through root-cause analysis, and also built automation to run Spark SQL metrics on weekly/monthly/quarterly cadences and distribute results to users.”

Python, JavaScript, Shell Scripting, SQL, MySQL, PostgreSQL, +110 more

Nafeezuddin Mohammed

Screened

Mid-level Data Engineer specializing in Analytics & AI/ML

Virginia, USA · 6y exp

Sony · Fitchburg State University

“Data engineer with experience at Sony and Walmart building high-volume, near-real-time analytics and ingestion systems. Has owned end-to-end pipelines from Kafka/Spark streaming through S3/Parquet and Redshift/Looker, emphasizing data quality (Great Expectations), observability (CloudWatch/Azure Monitor), and reliability (Airflow SLAs, retries, checkpointing), including measurable performance and latency improvements.”

Agile, Amazon Athena, Amazon CloudWatch, Amazon EMR, Amazon Redshift, Amazon S3, +124 more

Bhavya Sree Ganja

Screened

Senior Data Engineer specializing in cloud lakehouse platforms and streaming analytics

Pittsburgh, PA · 8y exp

First National Bank · Texas A&M University-Corpus Christi

“Data engineer focused on fraud and banking analytics who has owned end-to-end batch + streaming pipelines at very large scale (hundreds of millions of records/day). Built robust data quality/observability layers (schema validation, anomaly detection, alerting) and delivered low-latency serving via AWS Lambda/API Gateway with DynamoDB + Redis, plus external data ingestion/scraping pipelines orchestrated in Airflow with anti-bot protections.”

Agile, Amazon API Gateway, Amazon Athena, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, +210 more

Akshaya Chiduruppa

Screened

Mid-level Quality Assurance Engineer specializing in AI/ML and Apple ecosystem testing

Seattle, WA · 4y exp

Apple · Missouri University of Science and Technology

“QA automation engineer with end-to-end ownership of a regression suite for a warehouse loan management platform (.NET/Angular), using Selenium (Java/Cucumber/POM) and Cypress. Improved suite stability and expanded risk-based coverage (DB/API/SQL, RBAC approval workflows), catching critical financial defects like EMI calculation errors and cutting regression effort by ~50% while gating releases via GitLab CI/CD with actionable Slack reporting.”

Quality Assurance, Manual Testing, Test Planning, Test Case Design, Functional Testing, Regression Testing, +77 more

Bhanu Prakash Reddy Dakilli

Screened

Mid-level Data Engineer specializing in Azure ETL/ELT and data warehousing

Framingham, MA · 4y exp

Bank of America · New England College

“Data engineer who has owned end-to-end production pipelines for customer transaction data (~2–5 GB/day) using Python/PySpark/SQL and Airflow, delivering major reliability and speed gains (70% faster reporting; 60–70% fewer data issues). Also built a daily external web-scraping system with anti-bot handling and safe, idempotent Airflow-driven backfills, plus a Python data API optimized with indexing/caching and tested for correctness.”

Python, SQL, PySpark, Apache Spark, Java, Power BI, +97 more

Ruthvik Bacha

Screened

Mid-level Data Engineer specializing in financial data pipelines and reliability

North Carolina, USA · 7y exp

Wells Fargo · University of South Florida

“Systems/robotics-oriented software engineer focused on real-time orchestration and reliability: built a central control layer coordinating multiple concurrent agents with safe state machines, failure isolation, and recovery. Has hands-on ROS/ROS 2 integration experience in simulation (DDS/QoS, lifecycle, nodes in Python/C++) and emphasizes observability (structured JSON logs, correlation IDs) and low-latency control-loop performance under load.”

Python, Distributed systems, State management, Docker, Containerization, Debugging, +85 more

Sanjana Duvva

Screened

Mid-level AI/ML Engineer specializing in Generative AI, LLMOps, and MLOps

5y exp

Wells Fargo · University of North Texas

“Built and deployed an AWS-based LLM/RAG ticket triage and knowledge retrieval system (Pinecone/FAISS + Step Functions + MLflow) that cut support resolution time by 20%. Demonstrates strong production focus on hallucination reduction, PII security, and low-latency orchestration, with measurable evaluation improvements (e.g., ~25% grounding accuracy gain via re-ranking) and proven collaboration with support operations stakeholders.”

Python, SQL, Java, Scala, Shell Scripting, TypeScript, +153 more

Ishaan Nanal

Screened

Intern-level Software Engineer specializing in backend systems and AI/ML

Ithaca, NY · 1y exp

QuorAgra · Cornell University

“Built and shipped an LLM-powered RAG research copilot used by 20+ users across biology, physics, and ML, cutting literature review from days to minutes. Strong focus on production reliability—iterated on chunking/retrieval/prompting, added validation and modular pipelines for debuggability, and is now containerizing and scaling the system with Docker and GCP.”

Python, SQL, JavaScript, Java, C, C++, +75 more

Deepthi Mundarinti

Screened

Mid-level Data Engineer specializing in real-time analytics and regulated domains

NC, USA · 5y exp

JPMorgan Chase · Saint Louis University

“Data platform engineer focused on large-scale, real-time fraud systems, with hands-on ownership of streaming architectures using Kafka, Spark, Snowflake, and Databricks. Stands out for combining performance tuning and platform automation with LLM/RAG-based enrichment, delivering measurable gains in latency, fraud accuracy, false positives, and analyst decision speed.”

Python, NumPy, Pandas, PySpark, Scikit-learn, TensorFlow, +120 more

Ashok Reddy Kalli

Screened

Mid-level Business Analyst specializing in BI, reporting, and data insights

5y exp

Coca-Cola · University of Massachusetts Boston

“Healthcare analytics professional with experience at UnitedHealth Group, focused on turning messy claims, eligibility, and provider data into clean reporting datasets and Power BI dashboards. Combines SQL and Python automation with strong stakeholder alignment around KPI definitions, helping operations teams improve claim turnaround visibility and cost efficiency.”

SQL, Data Cleaning, Python, Pandas, NumPy, Exploratory Data Analysis, +93 more

Spandana Bellamkonda

Screened

Mid-level Data Analyst specializing in financial and telecom analytics

Remote, USA · 5y exp

AT&T · Lewis University

“Analytics candidate with hands-on experience at AT&T building SQL/Python pipelines for churn, usage, billing, and network-performance data at multi-million-row scale. Stands out for combining strong data quality and reconciliation practices with measurable operational impact, including a 30% query runtime improvement and ~8 hours/week of reporting automation savings.”

SQL, Power BI, Python, SQL Query Optimization, Data Cleaning, Pandas, +107 more

Harrishkumar Loganathan

Screened

Mid-level AI/Machine Learning Engineer specializing in FinTech and Generative AI

Remote, USA · 3y exp

Socure · Arizona State University

“AI/ML engineer with hands-on ownership of enterprise LLM deployments at Freshworks, including a large-scale RAG chatbot serving 15,000+ users across six departments. Stands out for combining deep production engineering skills—AWS microservices, Kubernetes, observability, retrieval quality, and faithfulness evaluation—with strong cross-functional stakeholder leadership and prior large-scale fraud data pipeline experience at Socure.”

Python, R, PySpark, Node.js, JavaScript, TypeScript, +135 more

Yunjie Liu

Screened

Junior Software Engineer specializing in bioinformatics and full-stack development

Remote · 3y exp

Baylor Genetics · Cornell University

“Built and stabilized production data pipelines in clinical genomics, including integrating a qPCR step into Baylor Genetics' workflow with a focus on reliability, turnaround time, and reducing manual intervention. Also has hands-on LLM production experience, creating a Python/OpenAI-based translation evaluation pipeline that reduced manual review time by 70% and improved scoring consistency.”

Python, Java, JavaScript, SQL, C#, Bash, +65 more

Prasad Deshpande

Screened

Mid-level Full-Stack Engineer specializing in enterprise SaaS and optimization platforms

Redwood City, CA · 5y exp

C3 AI · Northeastern University

“Full-stack engineer with strong enterprise delivery experience across manufacturing and semiconductor use cases, owning deployments from discovery through post-launch support. Stands out for combining traditional product engineering with applied GenAI workflows and data pipeline reliability work, including a manufacturing app that reportedly saved a Fortune 500 customer about $6M and an AI chat panel adopted by 70% of pricing analysts.”

Python, Java, JavaScript, TypeScript, React, Redux, +122 more

Drew Dunn

Screened

Senior AI Engineer specializing in generative AI and production ML systems

Aledo, TX · 14y exp

Elevance Health · Texas Tech University

“ML/AI engineer with hands-on ownership of production computer vision, speech, and legal RAG systems. Notably improved a key-duplication CV pipeline enough to unblock commercial launch and remove specialist manual measurement, and also shipped a live Quran recitation detection feature for a product with 1M+ users.”

Large Language Models, Generative AI, PyTorch, TensorFlow, FAISS, Transformers, +113 more

Rithvik Mysore Suresh

Screened

Junior Full-Stack Software Engineer specializing in React and AI-powered applications

Bloomington, IN · 4y exp

Indiana University · Indiana University Bloomington

“Full-stack/AI-focused builder who shipped a production Career Advisor app using LLMs + RAG + vector DB (React/Node/MongoDB/Claude API) and grew it to 2000+ users, handling real deployment issues and CI/CD on Vercel/Render. Also developing an AI-powered iOS ‘3D World Explorer’ (text-to-3D) and has cloud experience across Azure and AWS (S3/SageMaker/EC2).”

Python, JavaScript, TypeScript, C, SQL, HTML, +96 more

Sathyavarthan Balachandar

Screened

Mid-level Data Engineer specializing in scalable pipelines, Spark, and cloud data warehousing

Boston, USA · 3y exp

Fidelity Investments · Northeastern University

“Backend/data platform engineer who recently owned an end-to-end large-scale financial data platform delivering real-time decision support for finance and operations. Has hands-on experience modernizing legacy batch pipelines into AWS cloud-native ELT with parallel-run cutovers, strong data quality controls (dbt-style tests, reconciliation), and measurable improvements in runtime, cost, and SLA compliance. Also builds scalable, secure FastAPI microservices using Docker, ALB-based horizontal scaling, Redis caching, and managed auth with Cognito/Supabase plus Postgres RLS.”

Python, SQL, Go, Apache Spark, PySpark, Databricks, +125 more

Avijit Saha

Screened

Junior Software Engineer specializing in cloud-native microservices and AI/ML observability

Bedford, TX · 3y exp

JPMorgan Chase · University of the Cumberlands

“Engineer with banking and industrial/IoT experience who has deployed a payment-processing microservice with zero downtime, handling Protobuf schema evolution and sensitive data migration via dual-write/checksum techniques. Demonstrates strong cross-stack troubleshooting (pinpointed intermittent distributed timeouts to a failing ToR switch port) and customer-facing Python ETL customization using plugin-based parsers and Pydantic validation, plus hands-on monitoring/alerting improvements with operators.”

Agile, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon EKS, Amazon S3, +103 more

Bhuvan Chandi

Screened

Mid-level Data Engineer specializing in AI/ML data platforms

NY, NY · 6y exp

BlackRock · Webster University

“Built and productionized an LLM-powered PDF document Q&A system to eliminate manual searching through long documents, focusing on scalability and answer reliability. Implemented semantic chunking (using headings/paragraphs/tables), overlap, and preprocessing/quality checks to reduce hallucinations, and orchestrated the end-to-end pipeline with Airflow using retries, alerts, and parallel tasks.”

Python, SQL, Shell Scripting, Apache Spark, PySpark, Apache Hadoop, +103 more

Shanmukh Sai Madhu

Screened

Mid-level Data Engineer specializing in real-time pipelines and cloud analytics

Chicago, IL · 5y exp

JPMorgan Chase · University of South Dakota

“Researcher from the University of South Dakota who built a production medical RAG system to help interpret model predictions by retrieving relevant clinical notes and medical literature, overcoming retrieval accuracy and imaging-dataset challenges through semantic chunking and metadata-driven indexing. Also has hands-on orchestration experience with Airflow and Azure Data Factory, plus a pragmatic approach to LLM evaluation and stakeholder-driven iteration.”

Agile, Amazon EMR, Apache Airflow, Apache Kafka, Apache Spark, AWS, +122 more

Mihir Trivedi

Screened

Junior Machine Learning & Quant Research Engineer specializing in low-latency data and trading systems

New York, NY · 3y exp

Astera Holdings · Columbia University

“Applied ML to physical EV fleet systems at ST Labs, building a real-time CNN-LSTM fault prediction pipeline from streaming vehicle telemetry and addressing live data alignment issues via resampling/interpolation and buffered inference. Also developed a V2G/G2V energy transfer algorithm to automate charging/discharging for profit optimization, and made high-impact low-latency pipeline decisions at Astera Holdings using profiling, replay testing, and live A/B validation.”

AWS Glue, BigQuery, C++, CUDA, Data Cleaning, Data Engineering, +109 more

Sai Raja Ramya Bhavana Thota

Screened

Senior Data Scientist specializing in machine learning and customer analytics

Illinois, USA · 7y exp

Northern Trust · Bradley University

“Data/ML practitioner with experience applying NLP and classical ML to large-scale customer data (2B+ records) for segmentation, prediction, and survey-text classification, delivering measurable business impact (~18% engagement efficiency). Has hands-on entity resolution across multi-source datasets and has built embedding-based semantic search using SentenceBERT + a vector database with domain fine-tuning (~20% relevance improvement), plus production workflow experience with Spark/Airflow and cloud tooling (AWS/Azure).”

A/B Testing, Analytics, Azure Machine Learning, Bash, BigQuery, C, +195 more
