Pre-screened and vetted.
Mid-level Data Engineer specializing in cloud data platforms and real-time streaming
“Onboarded a Middle East logistics client processing thousands of invoices per month, building a production-ready pipeline that routes known vendor PDFs to deterministic regex parsers via Tax ID matching and falls back to LlamaParse for unknown layouts. Added financial-consistency validation, human-in-the-loop review, and logging/metrics to continuously reduce LLM usage and improve template coverage.”
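The routing pattern this summary describes — Tax-ID lookup into a registry of deterministic parsers, with an LLM parser only as fallback — can be sketched in a few lines. This is a minimal illustration, not the candidate's code; the Tax-ID format, vendor names, and parser functions are all hypothetical.

```python
import re

# Hypothetical Tax-ID pattern; real invoices would need per-jurisdiction formats.
TAX_ID_RE = re.compile(r"Tax\s*ID[:\s]+(\d{9,15})")

def parse_acme(text: str) -> dict:
    """Deterministic regex parser for one known vendor layout (illustrative)."""
    total = re.search(r"Total[:\s]+([\d.]+)", text)
    return {"vendor": "acme", "total": float(total.group(1)) if total else None}

# Registry keyed by vendor Tax ID: invoices from known layouts never touch the LLM path.
KNOWN_PARSERS = {"123456789": parse_acme}

def llm_fallback_parse(text: str) -> dict:
    """Stand-in for an LLM/LlamaParse call on unknown layouts."""
    return {"vendor": None, "total": None, "needs_review": True}

def route_invoice(text: str) -> dict:
    match = TAX_ID_RE.search(text)
    parser = KNOWN_PARSERS.get(match.group(1)) if match else None
    result = parser(text) if parser else llm_fallback_parse(text)
    result["parser"] = parser.__name__ if parser else "llm_fallback"
    return result
```

Because every invoice records which parser handled it, logging those routing decisions is what lets template coverage grow and LLM usage shrink over time.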
Senior Data Engineer specializing in multi-cloud data platforms and streaming pipelines
“Data platform engineer with hands-on ownership of high-volume financial data pipelines (millions of transactions/day) on Azure (ADF, Databricks, Delta Lake, Synapse), emphasizing schema-drift protection and automated data-quality gates. Also built resilient web scraping pipelines with anti-bot and backfill strategies, and shipped a versioned FastAPI + Redis data API with autoscaling, testing, and CI/CD via GitHub Actions.”
Senior Software Engineer specializing in AWS cloud infrastructure and microservices
Senior Software Engineer specializing in distributed systems and cloud infrastructure
Mid-level Data Engineer specializing in cloud-native ETL and data warehousing
Senior Data Scientist specializing in LLMs, NLP, and anomaly detection
Senior AI/ML Engineer specializing in GenAI, MLOps, and healthcare analytics
Mid-level Data Engineer specializing in AWS lakehouse and Spark pipelines
Senior Data Engineer specializing in cloud data platforms and generative AI
Mid-level Data Engineer specializing in GCP, Spark, and healthcare analytics
Junior Data Engineer specializing in Azure data platforms and GenAI analytics
“Data/ML practitioner with experience spanning medical imaging (retinal vessel analysis for hypertension/CVD risk prediction) and enterprise data engineering at Carl Zeiss. Built large-scale SAP data cleaning/validation pipelines (10M+ daily records, ~99% accuracy) and RAG-based semantic search with LangChain/vector DBs that cut manual querying by 82%, plus automation that reduced data onboarding from 8 hours to 12 minutes.”
Mid-level Data Engineer specializing in experimentation, analytics, and AI-driven product experiences
“Built production LLM automations using the Claude API, including a sales enablement workflow that summarizes playbooks and incorporates sales call metadata into strategic one-pagers. Experienced in orchestrating and scheduling data pipelines with SnapLogic, Airflow, and Databricks, and in scaling LLM API calls via parallel/batch processing. Also partnered with HR to deliver prompt-tuned, automated Slack messaging aligned to business tone and acceptance criteria.”
Junior Data Engineer specializing in BI, governed metrics, and workflow automation
“Built and shipped LLM/OCR/NLP-driven document-intelligence workflows in operational environments (EnvoyX and UPS), emphasizing production readiness via explicit state-machine orchestration, confidence gates, and human-in-the-loop review. Demonstrated strong business impact in customs brokerage/document ingestion: 50% fewer customs rejects, 30% higher throughput, SLA adherence improved from 71% to 96%, and platform reliability reaching 99.6% with 78% fewer bad-data incidents.”
Mid-level Data Engineer specializing in cloud data pipelines and enterprise data platforms
“Data engineer/backend engineer who owns large-scale, real-time event pipelines on AWS end-to-end, including a petabyte-scale CDC ingestion flow from multiple Postgres DBs into Redshift. Re-architected a legacy DynamoDB+S3 approach into a Delta Lake + DuckDB/PyArrow-compatible design, improving performance dramatically (e.g., ~600s to ~10s for 1k records) and increasing reliability at high file volumes.”
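A CDC ingestion flow like the one described above hinges on ordered, idempotent application of change events, so that replays and duplicate deliveries are harmless. A minimal plain-Python sketch of that pattern (the event fields and names are hypothetical, not the candidate's actual code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CdcEvent:
    lsn: int             # Postgres log sequence number (illustrative)
    op: str              # "insert" | "update" | "delete"
    key: str
    row: Optional[dict]

def apply_cdc(target: dict, events: list) -> dict:
    """Apply CDC events in LSN order; replaying a batch is a no-op because
    each key only advances to a strictly higher LSN (last-write-wins)."""
    applied_lsn = {}
    for ev in sorted(events, key=lambda e: e.lsn):
        if ev.lsn <= applied_lsn.get(ev.key, -1):
            continue  # duplicate/replayed event: skip
        if ev.op == "delete":
            target.pop(ev.key, None)
        else:
            target[ev.key] = ev.row
        applied_lsn[ev.key] = ev.lsn
    return target
```

At warehouse scale the same semantics are usually expressed as a Delta Lake or Redshift MERGE keyed on primary key and LSN rather than hand-rolled Python, but the invariant is identical.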
Principal Cloud & Infrastructure Engineer specializing in reliability and regulated data platforms
“Founder/CTO-type startup leader who has built cloud-native data and AI platforms from scratch while owning both technical vision and product direction. Brings rare end-to-end startup experience spanning zero-to-one building, growth-stage execution, and fundraising from early stage through exit, with a strong ability to translate technical complexity into clear investor narratives.”
Mid-level Data Engineer specializing in financial and trading data
“Quant Data Engineer at ASX who is also building FinishKit, a full-stack SaaS that scans AI-generated codebases for bugs and production-readiness issues. Combines React/TypeScript, Supabase/serverless, Fly.io, and Postgres with strong product instincts, rapid iteration, and prior experience building secure multi-tenant data and dashboard systems across enterprise teams.”
Mid-level AI/ML Engineer specializing in fraud detection and risk analytics in financial services
“At JP Morgan Chase, built and deployed a production LLM-powered RAG knowledge assistant to help fraud investigators and risk analysts quickly navigate regulatory updates and internal policies, reducing investigation delays and compliance risk. Strong focus on secure retrieval (RBAC filtering), reliability (layered testing + observability), and production constraints (latency/SLOs), with Airflow-orchestrated, auditable ML pipelines.”
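The “RBAC filtering” called out above means filtering retrieval results by user role before anything reaches the LLM prompt. A toy sketch of that idea, assuming a small in-memory corpus with precomputed embeddings (all role names and documents are hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy corpus: each chunk carries an embedding and the roles allowed to read it.
CORPUS = [
    {"text": "AML escalation policy", "vec": [1.0, 0.0], "roles": {"fraud_investigator"}},
    {"text": "Regulatory update Q3",  "vec": [0.8, 0.6], "roles": {"fraud_investigator", "risk_analyst"}},
    {"text": "HR compensation memo",  "vec": [0.9, 0.1], "roles": {"hr"}},
]

def retrieve(query_vec, user_roles, k=2):
    """RBAC-filtered retrieval: drop chunks the user may not see *before*
    ranking, so restricted text can never land in the LLM prompt."""
    allowed = [c for c in CORPUS if c["roles"] & set(user_roles)]
    allowed.sort(key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in allowed[:k]]
```

Filtering before ranking (rather than after generation) is the design choice that makes the retrieval auditable: access decisions are made on metadata, not on model output.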
“GenAI/data engineering practitioner with production experience across Equinix, Optum, and Citibank—built an Azure OpenAI (GPT-4) + LangChain document intelligence platform processing 1.5M+ docs/month and a HIPAA-compliant Airflow healthcare pipeline handling 5M+ claims/day. Also delivered a real-time fraud detection + explainability system using LightGBM and a fine-tuned T5 NLG component, improving fraud accuracy by 15%+ while partnering closely with compliance stakeholders.”
Mid-level Data Engineer specializing in streaming and cloud data platforms for financial services
“Data engineering-focused candidate (internship/project experience) who built end-to-end pipelines processing a few million transactional records/day for fraud detection and reporting, using Airflow, Python/SQL, and PySpark with strong emphasis on data quality gates, idempotency, and monitoring. Also implemented an external web/API data collection system with anti-bot tactics and schema-change quarantine, and shipped a versioned Flask API to serve curated warehouse data.”
Mid-level AI/ML Engineer specializing in GenAI, RAG pipelines, and cloud MLOps
“Built and deployed a production LLM + vector search clinical decision support system at UnitedHealth Group, retrieving medical evidence and patient context in real time for prior authorization and risk scoring. Strong in end-to-end RAG architecture (Hugging Face embeddings, Pinecone/FAISS, SageMaker, Redis) plus orchestration (Airflow/Kubeflow) and rigorous evaluation/monitoring, with demonstrated ability to align solutions with clinical operations stakeholders.”
Mid-level Data Engineer specializing in cloud data warehousing and analytics
“Data engineer at American Express who owned end-to-end pipelines for transaction and customer data used in finance reporting and risk analytics, processing ~5–8M records/day. Built Airflow-orchestrated ingestion (including external APIs/web sources) with strong data quality controls, monitoring/alerts, and resilient backfill/retry patterns, and also shipped a versioned REST API serving aggregated metrics to analytics teams.”
Senior Data Scientist / ML Engineer specializing in cloud ML pipelines and GenAI
“ML/NLP practitioner with experience building a transformer-failure prediction system that combines sensor signals with unstructured maintenance comments using LLM-based extraction and similarity validation. Strong emphasis on production readiness—data leakage controls, SQL-driven data quality tiers, and rigorous bias/fairness validation (including contract/spec evaluation across diverse company profiles).”
Mid-level Data Engineer specializing in Analytics & AI/ML
“Data engineer with experience at Sony and Walmart building high-volume, near-real-time analytics and ingestion systems. Has owned end-to-end pipelines from Kafka/Spark streaming through S3/Parquet and Redshift/Looker, emphasizing data quality (Great Expectations), observability (CloudWatch/Azure Monitor), and reliability (Airflow SLAs, retries, checkpointing), including measurable performance and latency improvements.”