Zamal Babar

Wiesbaden, Germany

Building autonomous systems

I work on post-training, cloud infrastructure, and multi-agent architecture. I design evaluation frameworks for industrial-scale generative systems at Deutsche Börse Group, building natural language pipelines for high-stakes financial document processing on GCP.

LangChain Ambassador for Germany. I sometimes post tutorials and breakdowns on YouTube.


Now

Research

Exploring recursive language models and reinforcement learning with verifiable rewards. I actively contribute to open source, building patches and fixing packages when something is off.

Work

Working on Agentic AI at Deutsche Börse Group, industrializing generative systems on GCP for financial document processing.

Open Source
  • Maintaining DeepGit (842 stars, thousands of users on HuggingFace)
  • Maintaining LangCode (439 stars)
  • Maintaining documentation for cvzone (817k downloads), a computer vision library
  • Maintaining augmentimg (26k downloads), a no-code image augmentation tool for CV tasks supporting YOLO and COCO formats
  • Contributing to Google DeepMind's genai-processors and HuggingFace Cookbook
  • Published 8 models and 2 datasets on HuggingFace including fine-tuned VLMs, NER models, and LoRA adapters

Experience

Associate ML Engineer, Agentic AI, Aug 2024 to Present
Deutsche Börse Group, Frankfurt
  • Defined the end-to-end strategy for industrializing GenAI pipelines on GCP. Technical lead for high-stakes financial document processing.
  • Architected a proprietary evaluation framework using Gemini and ScaNN to define success metrics for model accuracy and deployment reliability.
  • Led alignment between legal, risk, and IT departments to translate regulatory requirements into technical constraints for AI Agents (DPO/GRPO).
  • Established data privacy operating processes (CMEK/GDPR) for fail-safe LLM integration into core banking systems.
Research Project (Projektarbeit), Autonomous Agent Systems, Apr 2024 to Sep 2024
FAPS Lab, FAU, Erlangen
  • Designed a multi-agent architecture linking LLMs with ROS, Gazebo, Catkin, and MoveIt for autonomous context-aware motion planning in simulated industrial environments.
  • Built a natural-language orchestration layer between ROS and the robot control stack, translating plain-text commands directly into movement planning and actions in Gazebo.
Germany Ambassador and Community Lead, 2025 to Present
LangChain
  • Driving the Agentic AI strategy for the Munich developer ecosystem. Organizing knowledge exchange on multi-agent architectures and RAG best practices.
  • Shaping the roadmap for open-source developer tooling and orchestration frameworks.

Projects

842 stars
DeepGit: Deep Research Agent for GitHub

A LangGraph-powered agent for deep repository analysis. Uses hybrid dense retrieval, ColBERT v2 re-ranking, and agentic tool orchestration to surface relevant repositories that keyword search misses. Thousands of users on HuggingFace.

439 stars
LangCode: Agentic CLI Tool

Gemini CLI or Claude Code? Why not both. A terminal-native coding agent with ReAct and Deep agent modes, multi-LLM support across Gemini, Anthropic, OpenAI, and Ollama, and MCP integration for extensible tool use.

163 stars
Agentic Email Manager

An email manager that prioritizes messages, reads attachments, and drafts replies so you can focus on what matters.

Open Source

Contributed to the core infrastructure of multimodal processing, enhancing the scalability of agentic workflows.

817k downloads
cvzone: Computer Vision Library

A computer vision package that makes hand tracking, face detection, and pose estimation simple. Maintaining the documentation and developer guides.

26k downloads
augmentimg: Image Augmentation Tool

A no-code image augmentation tool for computer vision tasks. Supports YOLO and COCO annotation formats with a streamlined pipeline for dataset preparation.

Writing

How I Built a Deep Research Agent for GitHub
Mar 2025 · 5 min read

GitHub has over 100 million repositories, and basic search relies on keywords and star counts. DeepGit uses a LangGraph-powered agentic workflow with hybrid dense retrieval, cross-encoder re-ranking, and documentation intelligence to surface repositories that keyword search will never find.

Exploring LongWriter: Ultra-Long Text Generation
Aug 2024 · 4 min read

Most LLMs cap out at around 2,000 words of output despite handling 100k-token contexts. LongWriter from Tsinghua introduces AgentWrite, a pipeline that breaks long-form generation into planned subtasks, and a dataset of 6,000+ examples with outputs of up to 32,000 words.

"Everything that makes human life easy is worth building."

just a saying I like

Skills

Agentic Systems
Multi-Agent Orchestration, LangGraph, DSPy, MCP Servers, Tool-Use Strategy, Human-in-the-Loop, Context Engineering, RAG / ColBERT / ScaNN
LLM Post-Training and Alignment
Alignment
RLHF (PPO, GRPO), DPO, Verifiable Rewards
Fine-Tuning
PEFT (LoRA, QLoRA), Full Parameter Fine-Tuning
Optimization
Post-Training Quantization, Model Distillation
Cloud and Infrastructure
GCP, Vertex AI, Terraform, Docker, Kubeflow, CI/CD, Cloud Functions, Cloud Run, Load Balancer, AlloyDB / CloudSQL, CMEK Encryption, MLOps
Evaluation and Safety
LLM-as-a-Judge, Synthetic Data Pipelines, Hallucination Detection, AI Safety Frameworks, RAG Evaluation, GDPR Compliance
Languages and Tools
Python, TypeScript, C++, ROS, REST APIs, Vector Search

Education

Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
M.Sc. Electromobility, ACES (Spec: AI and Sustainable Mobility)
Germany, 2023 to 2025
Osmania University
B.E. Automobile Engineering, University Rank: 3
India, 2018 to 2022

Resources I Learn From

ML research paper analysis and commentary
Visual explainers on diffusion, transformers, attention, and 3D vision
Hands-on tutorials on fine-tuning, VLMs, and deep learning from scratch