“ML/NLP practitioner who built a retrieval-augmented generation (RAG) system for large financial and operational document sets using Sentence-Transformers (all-mpnet-base-v2) and a vector DB (e.g., Pinecone), with a strong focus on retrieval evaluation and chunking strategy optimization. Experienced in entity resolution (rules + embedding similarity with type-specific thresholds) and in productionizing scalable Python data workflows using Airflow/Dagster and Spark.”

Python SQL R Pandas NumPy SciPy+177

View profile

Sri Niyati Kompella

Screened

Senior Data Engineer specializing in cloud data platforms and ML pipelines

Atlanta, GA8y exp

Berkshire HathawayUniversity of Alabama at Birmingham

“Data engineer focused on AWS-based enterprise data platforms, owning end-to-end pipelines from multi-source batch/stream ingestion (Glue/Kinesis/StreamSets/Airflow) through PySpark transformations into curated datasets for Redshift/Snowflake. Emphasizes production reliability with strong monitoring/observability and data quality gates, and reports ~30% performance improvement plus improved SLAs and latency after optimization.”

Amazon Athena Amazon DynamoDB Amazon EMR Amazon EKS Amazon Kinesis Amazon Redshift+138

View profile

Varshitha K

Screened

Mid-level Data Engineer specializing in cloud data platforms and lakehouse architectures

Lakewood, CO4y exp

First BankUniversity of Central Missouri

“Data engineer in a banking context who has owned end-to-end Azure lakehouse pipelines ingesting financial/vendor data from APIs, Azure SQL, and flat files into Databricks/Delta (bronze-silver-gold). Emphasizes production reliability via schema-drift validation, data quality controls, monitoring/alerting, retries/checkpointing, and Spark/Delta performance tuning, with outputs served to BI/reporting teams (e.g., Tableau).”

Python Scala Java C++SQL PL/SQL+173

View profile

vineetha Pulipati

Screened

Mid-level Software Engineer specializing in backend microservices and cloud data pipelines

MO, USA4y exp

Morgan StanleyWebster University

“Backend engineer with Morgan Stanley experience building and owning an end-to-end Python FastAPI microservice for high-volume market data used by trading and risk systems. Strong in performance tuning and reliability (PySpark, Redis caching, async APIs), real-time streaming with Kafka, and production operations (Docker/Kubernetes, GitOps-style CI/CD, monitoring). Has led cloud/on-prem migration work across AWS and Azure, including fixing Azure Synapse performance issues via query and pipeline redesign.”

Python SQL Bash Shell Scripting TypeScript C+++129

View profile

Agna Antony

Screened

Mid-level Data Engineer specializing in cloud-native healthcare and enterprise data platforms

Michigan, USA5y exp

MedStar HealthAPJ Abdul Kalam Technological University

“Data Engineer (TCS) who owned an end-to-end CRM analytics pipeline for Bayer’s eSalesWeb integration, ingesting from Salesforce APIs/databases/S3 and serving analytics-ready datasets via PostgreSQL/S3 for Tableau. Drove measurable outcomes: ~60% reduction in manual data-quality effort, ~30% lower latency through SQL optimization, and ~35% improved stability via monitoring, retries, and idempotent processing.”

SDLC Agile Scrum Kanban Waterfall DevOps+124

View profile

Fernando Mosqueda

Screened

Senior AI/ML Engineer specializing in healthcare AI and MLOps

Mansfield, TX16y exp

McKessonSam Houston State University

“Healthcare AI engineer with hands-on ownership of production ML and LLM systems at McKesson, spanning clinical risk prediction and RAG-based documentation tools. Stands out for combining deep clinical-data experience, HIPAA-aware deployment practices, and measurable impact through reduced readmissions, clinician workflow gains, and 20% to 30% faster ML delivery for engineering teams.”

Python R JavaScript Go PyTorch TensorFlow+193

View profile

Yogendra Nalam

Screened

Mid-level Data Scientist specializing in ML, NLP, and Generative AI

Michigan, USA3y exp

Ally FinancialUniversity of Michigan-Dearborn

“GenAI/ML engineer with production experience at Cognizant and Ally Financial, building end-to-end LLM/RAG systems and ML pipelines. Delivered a domain chatbot trained from 90k tickets and 45k docs, improving intent accuracy (65%→83%), scaling to 800+ concurrent users with 99.2% uptime and sub-150ms latency, and driving +14% customer satisfaction. Strong in Azure ML + DevOps CI/CD, Dockerized deployments, and explainable/PII-safe modeling using SHAP/LIME to satisfy stakeholder trust and GDPR needs.”

Agile Anomaly Detection API Development AWS Azure DevOps Azure Machine Learning+107

View profile

Ankush Banthia

Screened

Senior Data & Platform Engineer specializing in cloud-native streaming and distributed systems

USA10y exp

JPMorgan ChaseNew York Institute of Technology

“Financial data engineer who has built and operated high-volume batch + streaming pipelines (200–300 GB/day; 5–10k events/sec) using AWS, Spark/Delta, Airflow, Kafka, and Snowflake, with strong emphasis on data quality and reliability. Demonstrated measurable impact via 99.9% SLA adherence, major reductions in bad records/nulls, MTTR improvements, and significant latency/runtime/query performance gains; also built a distributed web-scraping system processing 5–10M records/day with anti-bot and schema-drift defenses.”

Team Building Onboarding Mentoring Agile Scrum Jira+150

View profile

Madhupal Singu

Screened

Mid-level Data Engineer specializing in multi-cloud data platforms for healthcare and finance

USA6y exp

CignaUniversity of Cincinnati

“Data engineer with Cigna experience building and operating an end-to-end AWS-based healthcare claims pipeline processing ~2TB/day, using Glue/Kafka/PySpark/SQL into Redshift. Strong focus on data quality and reliability (schema validation, monitoring/alerting, retries/checkpointing/backfills), reporting improved accuracy (~99%) and reduced latency, plus experience serving real-time Kafka/Spark data to downstream analytics with documented data contracts.”

Python Pandas PySpark SQL Scala Java+88

View profile

Abhishek Gawali

Screened

Mid-level Data Engineer specializing in cloud ETL and real-time streaming

New York, NY6y exp

PNCRochester Institute of Technology

“Data engineer focused on AWS + Spark/Databricks pipelines, including an end-to-end nightly loan-data ingestion flow (~2.2M records) from Postgres/S3 through Glue and Databricks into a DWH with layered validation and alerting. Also built real-time streaming with Kafka + Spark Structured Streaming and a master’s project streaming Reddit data for sentiment analysis under ambiguous requirements and tight budget constraints.”

SDLC Agile Waterfall Python SQL R+105

View profile

Srijitha Katkuri

Screened

Mid-level Data Analyst specializing in healthcare and business intelligence

Michigan, USA4y exp

Banner HealthTrine University

“Healthcare analytics candidate with hands-on experience turning messy EHR, billing, and operational data into validated SQL datasets and automated Python/Airflow pipelines. They appear strongest in hospital KPI reporting—especially length of stay, readmissions, retention, and bed utilization—and have owned projects from metric definition through Power BI delivery and impact measurement.”

SQL Python Pandas NumPy Power BI Tableau+70

View profile

Alekya Battu

Screened

Mid-level Data Scientist specializing in machine learning, MLOps, and cloud analytics

USA5y exp

Wells FargoWilmington University

“Senior data scientist with ~5 years’ experience building production ML/NLP systems in finance (Wells Fargo) and deep learning for sensor analytics in connected vehicles (Medtronic). Has delivered end-to-end platforms combining time-series forecasting with transformer-based NLP, including automated drift monitoring/retraining (MLflow + Airflow) and standardized Docker/CI/CD deployments; achieved a reported 22% precision improvement after domain fine-tuning.”

Python SQL R Classification XGBoost Random Forest+171

View profile

BHEEMA SABILLA

Screened

Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems

Remote, USA3y exp

DiscoverUniversity of South Dakota

“Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.”

Python PySpark SQL Scala Pandas NumPy+157

View profile

Data Engineers Machine Learning Engineers Data Scientists Data Analysts Software Engineers AI Engineers Data & Analytics AI & Machine Learning Engineering Executive & Leadership

Need someone specific?

AI Search

Related

Need someone specific?