Skills Required
IT/Software Development
Engineering - Telecom/Technology
Engineering - Mechanical/Electrical
AI
Computer Science
Engineering
Information Technology (IT)
Software Development
Job Description
About the Role
We are seeking an engineer who doesn't just "talk" to LLMs but builds autonomous, resilient systems. You will design multi-agent architectures that can reason, use tools, and recover from failures independently.
Your primary goal is to bridge the gap between "cool demos" and "production-grade reliability." You will be responsible for deploying agents across various business sectors (e.g., Finance, Operations, Customer Success), ensuring they are not only intelligent but also safe, cost-effective, and predictable in a production environment.
Key Responsibilities
● Agent Orchestration: Design and implement stateful, multi-turn agent workflows using frameworks like LangGraph, CrewAI, AutoGen, PydanticAI, Swarm, Haystack, or Bee Agent Framework.
● Proactive Reliability & Guardrails: Architect systems that prevent production meltdowns. Implement circuit breakers, "human-in-the-loop" triggers, and input/output guardrails to stop infinite loops, prompt injections, and hallucinated tool calls before they reach the end user.
● Multi-Sector Tooling: Build and maintain high-precision API integrations (tools) that allow agents to interact with diverse business systems (ERPs, CRMs, custom databases) deterministically.
● Observability & Tracing: Set up advanced tracing (e.g., LangSmith, Langfuse, or Arize Phoenix) to debug complex reasoning chains and monitor agent trajectories in real time.
● Rigorous Evaluation (Evals): Develop automated "Golden Datasets" and evaluation frameworks (using RAGas, DeepEval, G-Eval, LangSmith Evaluators, or custom model-based grading) to measure agent success rates and prevent regressions before every deployment.
● Cost & Latency Optimization: Manage the "token budget" by implementing tiered model routing (e.g., using Gemini 3 Flash for initial reasoning and Ultra for final verification) and optimizing context window usage.
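
To give candidates a concrete sense of the guardrail work described under Proactive Reliability & Guardrails, here is a minimal sketch of a circuit breaker that halts a looping agent and hands off to a human. It is illustrative only and not tied to any of the frameworks named above: `LoopBreaker`, `CircuitOpen`, `run_agent`, the tool name `lookup_invoice`, and the limits of 8 steps and 2 repeats are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


class CircuitOpen(Exception):
    """Raised when the agent loop must stop and be escalated to a human."""


@dataclass
class LoopBreaker:
    max_steps: int = 8    # assumed hard cap on agent steps per task
    max_repeats: int = 2  # assumed number of identical tool calls tolerated
    steps: int = 0
    seen: dict = field(default_factory=dict)

    def check(self, tool_name: str, args: tuple) -> None:
        """Count this step and trip the breaker on budget overrun or repetition."""
        self.steps += 1
        if self.steps > self.max_steps:
            raise CircuitOpen(f"step budget of {self.max_steps} exceeded")
        key = (tool_name, args)
        self.seen[key] = self.seen.get(key, 0) + 1
        if self.seen[key] > self.max_repeats:
            raise CircuitOpen(f"repeated call to {tool_name!r}")


def run_agent(next_action, breaker: LoopBreaker) -> str:
    """Drive a hypothetical agent.

    next_action() returns a (tool_name, args) pair, or None when the task is done.
    """
    while True:
        action = next_action()
        if action is None:
            return "done"
        try:
            breaker.check(*action)
        except CircuitOpen as exc:
            # Human-in-the-loop trigger: stop acting and surface the reason.
            return f"escalated to human: {exc}"
        # A real system would execute the tool call here.


# An agent stuck requesting the same tool call is stopped on the third attempt.
stuck = lambda: ("lookup_invoice", ("INV-1",))
print(run_agent(stuck, LoopBreaker()))
```

In production the same check would typically sit between the model's proposed tool call and its execution, so a runaway loop is caught before it spends tokens or touches a business system.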