Vetted Web Scraping Professionals

Pre-screened and vetted.

Web Scraping, Python, SQL, Docker, AWS, JavaScript

Yash Priyadarshi

Screened

Junior Software Engineer specializing in distributed systems and cloud infrastructure

Bengaluru, India | 2y exp

Ericsson | Penn State University

“Backend/distributed-systems engineer who built a Golang distributed key-value store on AWS using Multi-Paxos, WAL, and non-blocking gRPC replication (cutting write latency ~40%) and proactively addressed tricky failure modes like leader-election livelock. Also developed a Python/Kubernetes cost-optimization scaling engine deployed with Helm/Terraform, delivering ~$40K annual savings while sustaining 99.99% uptime, and drives contract-first API development (OpenAPI/Swagger) to speed frontend integration.”

Python, C++, Go, SQL, JavaScript, TypeScript (+195 more)

Shashank Garg

Screened

Engineering leader specializing in FinTech ML/AI platforms

San Francisco, CA | 12y exp

TravelBank | San José State University

“Engineering Manager/player-coach leading Data Infrastructure, ML/DS, and AI Engineering pods who recently shipped multiple production agentic GenAI features. Built privacy-preserving LLM workflows (PII redaction via Microsoft Presidio) and drove an AI expense-approval agent from ambiguous ask to GA, cutting approval time from ~2.5 days to <4 hours with >85% accuracy. Also owned a major LLM cost overrun incident and implemented cost observability plus circuit breakers to prevent runaway agent loops.”

Leadership, Team Building, Agile, Generative AI, MLOps, LangGraph (+102 more)

Mayur Komaravelly

Screened

Senior Data Analyst specializing in data pipelines, web scraping, and legal data enrichment

Illinois, USA | 5y exp

The Hartford | Indiana Wesleyan University

“Data engineer focused on reliable, scalable analytics pipelines and external data collection. Has owned end-to-end pipelines processing 5–10M records/day, serving Snowflake data marts to Power BI/Tableau, and reports ~99% reliability through strong validation/monitoring. Also shipped versioned REST APIs for curated data with query optimization and caching.”

Apache Airflow, Apache Kafka, Apache Spark, Ansible, API Design, AWS Glue (+140 more)

Ansh Harjai

Screened

Junior Software Engineer specializing in AI, RAG systems, and backend development

Brooklyn, NY | 1y exp

New York University | NYU

“Built an NYU software engineering capstone called “Smart Cash AI,” a multi-agent LLM-powered web app that curates offline-ready podcasts/articles/videos/news based on user preferences and commute schedules. Architected agent orchestration (discovery/downloader/summarizer), real-time progress via WebSockets, and an ETL normalization layer across RSS/YouTube and other sources with GUID-based deduplication, retries, and failure isolation to keep the system predictable.”

Python, C++, SQL, JavaScript, HTML, CSS (+79 more)

Kunal Kulkarni

Screened

Intern AI/ML Researcher specializing in computer vision and data engineering

Palo Alto, CA | 1y exp

TieSet | UCLA

“Built a production-oriented multimodal RAG "Fix Assistant" with FastAPI, Tavily search, BM25 + cross-encoder reranking, and a local Phi-3.5 model, emphasizing strict grounding and fallback/verification modes to prevent hallucinations. Also has hands-on federated learning experience using STADLE to orchestrate edge-node training and aggregation for EV telemetry data, plus experience communicating AI results to non-technical stakeholders (traffic RL/congestion outcomes).”

AWS, Bash, C, C++, CI/CD, Computer Vision (+128 more)

Hadi Jaffery

Screened

Junior Data Engineer specializing in Snowflake and investment data platforms

Boston, MA | 3y exp

Liberty Mutual | University of Maryland, College Park

“Private markets/private credit data engineer owning core Snowflake/AWS data infrastructure (S3 → ActiveBatch → Snowflake) with automated iceDQ quality checks and curated datasets for internal Power BI/React reporting. Drove major reliability and delivery improvements, including cutting DB CI/CD deploy time 50% and reducing downstream table errors by 90%+, and also built an internal React/FastAPI app to visualize the team’s data infrastructure in an ambiguous early-stage environment.”

AWS, AWS Lambda, CI/CD, C, C++, Data Engineering (+84 more)

Bhargavi Kondaveeti

Screened

Mid-level Data Engineer specializing in big data pipelines and real-time streaming

Dallas, TX | 6y exp

Johnson & Johnson | University of North Texas

“Data engineer who has owned end-to-end production pipelines processing a few million records/day, using Python/Airflow/SQL/PySpark with Snowflake serving to BI (Power BI). Built resilient external web data collection systems (anti-bot, schema-change detection, backfills) and shipped versioned REST APIs for internal consumers, improving pipeline success rates to 99% through monitoring, retries, and idempotent design.”

Agile, Amazon CloudWatch, Amazon DynamoDB, Amazon Redshift, Amazon S3, Amazon SQS (+101 more)

Tanvi Dasaripally

Screened

Mid-level Cloud Data Engineer specializing in Azure/AWS pipelines and medallion architecture

USA | 4y exp

UnitedHealth Group | Southern Illinois University Carbondale

“Data engineer focused on reliability and data quality, owning end-to-end pipelines processing ~100k–300k records/day. Implemented robust validation and monitoring that cut reporting issues by ~30%, and built stable external data collection with anti-bot measures, backfills, and schema-change detection while maintaining backward-compatible internal data services.”

Python, SQL, PySpark, Apache Kafka, Azure Data Factory, AWS (+72 more)

Aditya Jain

Screened

Senior Design Engineer and Front-End Developer specializing in interactive data experiences

Brooklyn, NY | 8y exp

The Washington Post | NYU

“Lead engineer/designer behind The Washington Post's internal live-tracker tooling and election coverage interfaces. They combine cloud architecture, frontend/data-viz craftsmanship, and close newsroom stakeholder collaboration to ship real-time, high-traffic journalism products that improved internal efficiency and supported major audience and subscriber outcomes.”

Artificial Intelligence, React, Design systems, Full-stack development, ETL pipelines, Serverless architecture (+73 more)

Ping-Hao Liu

Screened

Senior venture analyst specializing in FinTech, InsurTech, and HealthTech

Phoenix, AZ | 10y exp

ManchesterStory Capital | Washington University in St. Louis

“Early-stage investor from ManchesterStory Capital with a strong sourcing engine, generating 500–600 deals per year through referrals, events, thematic research, and AI-driven stealth startup discovery. Particularly differentiated in private payer healthcare and founder relationship-building, with examples of converting value-add outbound outreach into investments through hands-on market guidance and rigorous diligence.”

Market research, Web scraping, Business development, Go-to-market strategy, Budgeting, Deal sourcing (+77 more)

Amaan Mohammed

Screened

Entry-level Machine Learning Engineer specializing in generative AI and applied ML

College Park, MD | 1y exp

CNPC | University of Maryland, College Park

“Built and deployed LLM-powered agentic systems including a multi-agent travel planning assistant using LangChain, RAG (FAISS), real-time APIs, and a supervisor agent to manage coordination and reduce hallucinations. Also developed a Text-to-SQL system with schema-aware validation guardrails, and collaborated with drilling domain experts at CNPC USA to build an ML model predicting rate of penetration (ROP).”

Python, R, SQL, Go, TypeScript, PyTorch (+143 more)

Jahnavi Lasyapriya Vavilala

Screened

Junior Machine Learning Engineer specializing in LLMs, NLP, and computer vision

Bengaluru, Karnataka | 2y exp

PwC | Arizona State University

“Built a production, agentic multi-agent pharmaceutical intelligence system for US oncology (breast cancer) conference/news intelligence, automating MSL-style information gathering and summarization for pharma and healthcare stakeholders. Uses CrewAI + LangChain orchestration, custom scraping across ~15 pharma newsrooms, and a grounding-score evaluation approach (sentence transformers/cosine similarity) to mitigate hallucinations.”

Python, SQL, R, Java, JavaScript, Snowflake (+121 more)

Hrishikesh Raghunath

Screened

Mid-level Data Engineer specializing in scalable ETL, streaming analytics, and cloud data platforms

Remote, USA | 7y exp

Dreamline AI | California State University, Fullerton

“At Dreamline AI, built and productionized an AWS-based incentive intelligence platform that uses Llama-2/GPT-4 to extract eligibility rules from unstructured state policy documents into structured JSON, then processes them with Glue/PySpark and serves results via Lambda/SageMaker/API Gateway. Designed state-specific ingestion connectors plus schema validation and automated checks/alerts to handle frequent policy/format changes without breaking the pipeline, and partnered with business/analytics stakeholders to deliver interpretable eligibility decisions via explanations and dashboards.”

A/B Testing, Amazon CloudWatch, Amazon Kinesis, Amazon Redshift, Amazon S3, Amazon SageMaker (+114 more)

Jason Meno

Screened

Senior Full-Stack Software Engineer specializing in digital health and AI

San Francisco, CA | 7y exp

Feeling Great | Purdue University

“ML practitioner with hands-on experience in healthcare time-series modeling (CGM-based blood glucose prediction) including a novel ICA-based blind source separation approach and robust data-cleaning for noisy, missing sensor data. Also built an embeddings + LLM-powered podcast recommendation workflow using YouTube transcript scraping and Vellum AI document indexing, with a strong emphasis on production-grade engineering practices (TDD, monitoring) and realistic rolling validation for forecasting.”

Ruby, Python, JavaScript, TypeScript, SQL, React (+77 more)

Ankit Patra

Screened

Mid-Level Software Engineer specializing in cloud, microservices, and AI/ML

New York, NY | 6y exp

Binghamton University | Binghamton University

“Backend/API engineer with ~4 years of experience building production services in .NET Core/PostgreSQL/Redis/Docker and optimizing real-world latency issues (claims ~60% response-time improvement). Also built and owned an end-to-end RAG-based AI assistant using Python/FastAPI, OpenAI APIs, and Pinecone, plus agentic workflows with reliability guardrails (retries, confidence thresholds, monitoring). Currently pursuing a master’s degree and targeting a $150k base salary.”

Agile, Ansible, Apache Kafka, Apache Spark, AWS, AWS Lambda (+120 more)

Yuvraj Singh Chauhan

Screened

Entry-level AI/ML Engineer specializing in LLMs, RAG, and DevOps automation

Bangalore, India | 1y exp

RapidFort | Thapar Institute of Engineering and Technology

“Built and owned a production-scale AI-driven software release/version intelligence platform orchestrated via GitHub Actions that tracks 1000+ upstream repositories and automatically generates SLA-bound JIRA upgrade tickets for hardened container images. Replaced brittle regex/PEP440 parsing with an LLM-based semantic filtering layer plus deterministic validation to handle noisy/inconsistent GitHub tags at scale, with monitoring for coverage, latency, and correctness validated against upstream ground truth.”

API Integration, Bash, Computer Vision, C, C++, Data Analytics (+71 more)

Madhav Vaddepalli

Screened

Senior Data Engineer specializing in cloud data platforms and big data pipelines

Seattle, WA | 8y exp

Safeco | Fitchburg State University

“Data engineer focused on building reliable, production-grade pipelines and external data collection systems on AWS (S3/Lambda/SQS/Glue/EMR) using PySpark/SQL, serving curated datasets to Snowflake/Redshift for finance and fraud teams. Has operated a large-scale crawler ingesting millions of records/day with anti-bot tactics, schema versioning/quarantine, and CloudWatch/Datadog monitoring, and also shipped a versioned REST API with caching and query optimization.”

Agile, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon Redshift, Amazon RDS (+192 more)

Nivedita Shainaj Nair

Screened

Mid-level ML Data Engineer specializing in MLOps and scalable healthcare data pipelines

Boston, MA | 5y exp

Cigna | Northeastern University

“Data/ML platform engineer with healthcare (Cigna) experience owning an end-to-end pipeline spanning Airflow + Debezium CDC ingestion, PySpark/SQL transformations, rigorous data quality gates, and feature-store/API serving for ML training and inference. Worked at 10+ TB scale and cites a ~30% latency reduction plus stronger reliability via idempotent design, monitoring, and backfill-safe reprocessing; also built pragmatic early-stage data pipelines at Frankenbuild Ventures.”

Agile, Alerting, Anomaly Detection, Apache Airflow, Apache Kafka, Apache Spark (+135 more)

Aswani D

Screened

Mid-level Software Engineer specializing in cloud microservices and data pipelines

5y exp

Johnson & Johnson | Indiana Wesleyan University

“Data engineer/platform builder who has owned production pipelines end-to-end processing millions of records/day, with strong emphasis on data quality (quarantine workflows) and reliability (monitoring, retries, incremental loads). Also designed large-scale external data collection/crawling with anti-bot handling and backfills, and shipped versioned REST data services optimized for performance and developer usability in an early-stage environment.”

Python, SQL, PL/SQL, WebSockets, Spring Boot, Spring MVC (+144 more)

Sujith Julakanti

Screened

Junior MLOps Engineer specializing in LLMs and cloud infrastructure

College Station, TX | 3y exp

Texas A&M University | Texas A&M University

“Built a production multimodal LLM system (Gemini on GCP) to automate behavioral coding of family-involved science experiment videos, including preprocessing for inconsistent lighting/audio and LangGraph-orchestrated parallel workflows. Also developed rubric-based AI grading workflows and partnered closely with non-technical education stakeholders through explainability-focused walkthroughs and manual-vs-AI evaluation alignment.”

Python, SQL, C++, C, HTML, CSS (+75 more)

Shweta Gupta

Screened

Senior Backend Software Engineer specializing in Java microservices, Kafka, and AWS

Seattle, WA | 6y exp

EasyBee AI | UC Irvine

“AI engineer who shipped a production chat assistant for a storage company by building the underlying RAG-style knowledge base (document ingestion, chunking/embeddings, FAISS vector store) and an admin update interface to keep content current. Also has full-stack delivery experience (Python REST APIs + React/TypeScript UI) and AWS operations using Terraform/Jenkins, including handling a real production performance incident by optimizing DB queries and adding auto-scaling.”

A/B Testing, Agile, API Testing, AWS, Bash, Batch Processing (+111 more)

Akhilesh Babu Tumati

Screened

Mid-level Full-Stack Software Engineer specializing in cloud-native and AI-integrated systems

3y exp

Virginia Tech | Virginia Tech

“Built and deployed a Virginia Tech CS department blog/archive application using a MERN/Next.js stack and a fully serverless AWS architecture (Lambda, API Gateway, S3, CloudFront, Route 53), including CI/CD via the Serverless Framework. Implemented RBAC for student/faculty/admin users and added an article export feature backed by MongoDB.”

TypeScript, Python, Java, Spring Boot, SQL, C++ (+95 more)

Nikitha Margadi

Screened

Mid-level Data Engineer specializing in cloud lakehouse, streaming, and MLOps

Texas, USA | 5y exp

AT&T | Cal State Fullerton

“Data engineer at AT&T focused on large-scale telecom (5G/IoT) data platforms, owning end-to-end pipelines from Kafka/Azure ingestion through Databricks/Delta Lake transformations to serving analytics and ML. Has operated at very high volumes (~50+ TB/day) and delivered measurable performance gains (25–30% faster processing) plus improved reliability via Airflow monitoring, robust data quality checks, and resilient external data collection patterns (rate limiting, retries, dynamic schemas).”

Python, SQL, PL/SQL, PySpark, Apache Spark, Apache Kafka (+114 more)

Sasank Kuppili

Screened

Mid-level Software Engineer specializing in backend systems and data-driven APIs

Remote | 6y exp

What’s The Move | NYU

“Candidate approaches AI-assisted coding like a senior developer supervising junior contributors: they define precise technical requirements, enforce code quality and documentation, and review outputs before approval. They also actively lead multi-agent workflows using OpenClaw and a Kanban-style AI project management setup, coordinating both coding and non-technical agents.”

Node.js, Python, Java, C++, SQL, REST APIs (+66 more)
