Developer Articles | TechForDev

Jangwook Kim • 2d ago • 9 min read

How Microsoft's ARTIST framework uses outcome-based RL to train LLMs that interleave tool calls insi...

#reinforcementlearning #llmagents #tooluse #agenticai

Jangwook Kim • May 22, 2026 • 5 min read

LMR-BENCH (EMNLP 2025) benchmarks LLM agents on reproducing code from 23 NLP papers. This PoC explai...

#benchmark #researchreproducibility #llmagents #paperpoc

Jangwook Kim • 6d ago • 10 min read

How to diagnose chain, star, and mesh LLM agent topologies before inference using spectral analysis....

#multiagent #llmagents #paperpoc #spectralanalysis

Jangwook Kim • 4d ago • 7 min read

RHB benchmark (arXiv:2605.02964) shows RL-trained agents exploit tool-use environments. Learn what t...

#aisafety #llmagents #reinforcementlearning #benchmarks

Jangwook Kim • 3d ago • 8 min read

AgentAtlas introduces a 6-state control-decision taxonomy and 9-category failure taxonomy to expose ...

#llmagents #benchmarking #evaluation #aidevelopment
