Zamal Babar

Wiesbaden, Germany

Building autonomous systems

I work on post-training, cloud infrastructure, and multi-agent architecture. I design evaluation frameworks for industrial-scale generative systems at Deutsche Börse Group, building natural language pipelines for high-stakes financial document processing on GCP.

LangChain Ambassador for Germany. I sometimes post tutorials and breakdowns on YouTube.

Email LinkedIn GitHub HuggingFace YouTube

Now

Research

Exploring recursive language models and reinforcement learning with verifiable rewards. I actively contribute to open source, building patches and fixing packages when something is off.

Work

Working on Agentic AI at Deutsche Börse Group, industrializing generative systems on GCP for financial document processing.

Open Source

Maintaining DeepGit (842 stars, thousands of users on HuggingFace)
Maintaining LangCode (439 stars)
Maintaining documentation for cvzone (817k downloads), a computer vision library
Maintaining augmentimg (26k downloads), a no-code image augmentation tool for CV tasks supporting YOLO and COCO formats
Contributing to Google DeepMind's genai-processors and HuggingFace Cookbook
Published 8 models and 2 datasets on HuggingFace including fine-tuned VLMs, NER models, and LoRA adapters

Experience

Associate ML Engineer, Agentic AI (Aug 2024 to Present)

Deutsche Börse Group, Frankfurt

Defined the end-to-end strategy for industrializing GenAI pipelines on GCP. Technical lead for high-stakes financial document processing.
Architected a proprietary evaluation framework using Gemini and ScaNN to define success metrics for model accuracy and deployment reliability.
Led alignment between legal, risk, and IT departments to translate regulatory requirements into technical constraints for AI Agents (DPO/GRPO).
Established data privacy operating processes (CMEK/GDPR) for fail-safe LLM integration into core banking systems.

Research Project (Projektarbeit), Autonomous Agent Systems (Apr 2024 to Sep 2024)

FAPS Lab, FAU, Erlangen

Designed a multi-agent architecture linking LLMs with ROS, Gazebo, Catkin, and MoveIt for autonomous context-aware motion planning in simulated industrial environments.
Built a natural-language orchestration layer between ROS and the robot control stack, translating plain-text commands directly into movement planning and actions in Gazebo.

Germany Ambassador and Community Lead (2025 to Present)

LangChain

Driving the Agentic AI strategy for the Munich developer ecosystem. Organizing community exchange on multi-agent architectures and RAG best practices.
Shaping the roadmap for open-source developer tooling and orchestration frameworks.

Projects

DeepGit

842 stars

Deep Research Agent for GitHub

A LangGraph-powered agent for deep repository analysis. Uses hybrid dense retrieval, ColBERT v2 re-ranking, and agentic tool orchestration to surface relevant repositories that keyword search misses. Thousands of users on HuggingFace.

LangCode

439 stars

Agentic CLI Tool

Gemini CLI or Claude Code? Why not both. A terminal-native coding agent with ReAct and Deep agent modes, multi-LLM support across Gemini, Anthropic, OpenAI, and Ollama, and MCP integration for extensible tool use.

InboxHero

163 stars

Agentic Email Manager

An email manager that prioritizes messages, reads attachments, and drafts replies so you can focus on what matters.

Google DeepMind, genai-processors

Contributor

Open Source

Contributed to the core infrastructure of multimodal processing, enhancing the scalability of agentic workflows.

cvzone

817k downloads

Computer Vision Library

A computer vision package that makes hand tracking, face detection, and pose estimation simple. I maintain the documentation and developer guides.

augmentimg

26k downloads

Image Augmentation Tool

A no-code image augmentation tool for computer vision tasks. Supports YOLO and COCO annotation formats with a streamlined pipeline for dataset preparation.

Writing

How I Built a Deep Research Agent for GitHub

Mar 2025 · 5 min read

GitHub has over 100 million repositories, and basic search relies on keywords and star counts. DeepGit uses a LangGraph-powered agentic workflow with hybrid dense retrieval, cross-encoder re-ranking, and documentation intelligence to surface repositories that keyword search will never find.

Exploring LongWriter: Ultra-Long Text Generation

Aug 2024 · 4 min read

Most LLMs cap out at roughly 2,000 words of output despite handling 100k-token contexts. LongWriter from Tsinghua introduces AgentWrite, a pipeline that breaks long-form generation into planned subtasks, and a dataset of 6,000+ examples with outputs of up to 32,000 words.

"Everything that makes human life easy is worth building."

just a saying I like

Skills

Agentic Systems

Multi-Agent Orchestration, LangGraph, DSPy, MCP Servers, Tool-Use Strategy, Human-in-the-Loop, Context Engineering, RAG / ColBERT / ScaNN

LLM Post-Training and Alignment

Alignment

RLHF (PPO, GRPO), DPO, Verifiable Rewards

Fine-Tuning

PEFT (LoRA, QLoRA), Full Parameter Fine-Tuning

Optimization

Post-Training Quantization, Model Distillation

Cloud and Infrastructure

GCP, Vertex AI, Terraform, Docker, Kubeflow, CI/CD, Cloud Functions, Cloud Run, Load Balancer, AlloyDB / CloudSQL, CMEK Encryption, MLOps

Evaluation and Safety

LLM-as-a-Judge, Synthetic Data Pipelines, Hallucination Detection, AI Safety Frameworks, RAG Evaluation, GDPR Compliance

Languages and Tools

Python, TypeScript, C++, ROS, REST APIs, Vector Search

Education

Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)

M.Sc. Electromobility, ACES (Specialization: AI and Sustainable Mobility)

Germany, 2023 to 2025

Osmania University

B.E. Automobile Engineering, University Rank: 3

India, 2018 to 2022

Resources I Learn From

Yannic Kilcher

ML research paper analysis and commentary

Jia-Bin Huang

Visual explainers on diffusion, transformers, attention, and 3D vision

Neural Breakdown with AVB

Hands-on tutorials on fine-tuning, VLMs, and deep learning from scratch