{"notes":[{"content":{"title":{"value":"Learning to Orchestrate Agents in Natural Language with the Conductor"},"authors":{"value":["Stefan Nielsen","Edoardo Cetin","Peter Schwendeman","Qi Sun","Jinglue Xu","Yujin Tang"]},"authorids":{"value":["~Stefan_Nielsen1","~Edoardo_Cetin1","~Peter_Schwendeman1","~Qi_Sun10","~Jinglue_Xu1","~Yujin_Tang1"]},"keywords":{"value":["RL","reasoning","LLM","tool use","prompting"]},"abstract":{"value":"Powerful large language models (LLMs) from different providers have been expensively trained and finetuned to specialize across varying domains. In this work, we introduce a new kind of Conductor model trained with reinforcement learning to automatically discover powerful coordination strategies among LLMs. Our Conductor learns not only to design targeted communication topologies for effective agent-to-agent collaboration, but also to prompt engineer focused instructions to the LLMs to maximally leverage their individual capabilities.  We show that, by learning optimal coordination strategies over pools of powerful worker LLMs, a 7B Conductor achieves significant performance gains beyond any individual worker, attaining state-of-the-art results in challenging reasoning benchmarks, such as LiveCodeBench and GPQA. By training with randomized agent pools, our conductor effectively adapts to arbitrary sets of open- and closed-source agents, meeting any user requirements. 
Furthermore, allowing the Conductor to select itself as a worker gives rise to recursive topologies, elevating performance with a new form of dynamic test-time scaling through online iterative adaptation.\nMore broadly, ours is among the early work demonstrating language model coordination can be unlocked through RL, where powerful coordination strategies emerge naturally in LLMs through pure end-to-end reward maximization."},"primary_area":{"value":"foundation or frontier models, including LLMs"},"venue":{"value":"ICLR 2026 Poster"},"venueid":{"value":"ICLR.cc/2026/Conference"},"TLDR":{"value":"We introduce the Conductor, a new kind of language model trained with reinforcement learning to automatically discover powerful coordination strategies among LLMs"},"pdf":{"value":"/pdf/4a133f1e2ca67ceaedb45c3a123cc8125c694ff5.pdf"},"supplementary_material":{"value":"/attachment/3f1baa2cc471b59b9f00c669f088e44db12ab45c.zip"},"_bibtex":{"value":"@inproceedings{\nnielsen2026learning,\ntitle={Learning to Orchestrate Agents in Natural Language with the Conductor},\nauthor={Stefan Nielsen and Edoardo Cetin and Peter Schwendeman and Qi Sun and Jinglue Xu and Yujin Tang},\nbooktitle={The Fourteenth International Conference on Learning Representations},\nyear={2026},\nurl={https://openreview.net/forum?id=U23A2BUKYt}\n}"},"paperhash":{"value":"nielsen|learning_to_orchestrate_agents_in_natural_language_with_the_conductor"}},"id":"U23A2BUKYt","forum":"U23A2BUKYt","license":"CC BY 
4.0","signatures":["ICLR.cc/2026/Conference/Submission1742/Authors"],"readers":["everyone"],"writers":["ICLR.cc/2026/Conference","ICLR.cc/2026/Conference/Submission1742/Authors"],"number":1742,"odate":1759896705795,"invitations":["ICLR.cc/2026/Conference/-/Submission","ICLR.cc/2026/Conference/-/Post_Submission","ICLR.cc/2026/Conference/Submission1742/-/Full_Submission","ICLR.cc/2026/Conference/Submission1742/-/Rebuttal_Revision","ICLR.cc/2026/Conference/-/Edit","ICLR.cc/2026/Conference/Submission1742/-/Camera_Ready_Revision"],"domain":"ICLR.cc/2026/Conference","tcdate":1756913992841,"cdate":1756913992841,"tmdate":1775876890581,"mdate":1775876890581,"pdate":1769435666481,"version":2,"details":{"directReplies":[{"content":{"summary":{"value":"This paper introduces the conductor, which can automatically divide the task and assign subtasks to different models, and a method to training the conductor with RL. With the conductor, small LLMs surpass large LLMs on multiple tasks."},"soundness":{"value":3},"presentation":{"value":3},"contribution":{"value":3},"strengths":{"value":"- The concept of conductor is novel. 
The workflow is fully automatic and doesn't need human design.\n- The quantitative result is promising.\n- The extra analysis is detailed."},"weaknesses":{"value":"N/A"},"questions":{"value":"N/A"},"flag_for_ethics_review":{"value":["No ethics review needed."]},"rating":{"value":8},"confidence":{"value":2},"code_of_conduct":{"value":"Yes"}},"id":"RF6eM0pMq6","forum":"U23A2BUKYt","replyto":"U23A2BUKYt","signatures":["ICLR.cc/2026/Conference/Submission1742/Reviewer_og9K"],"nonreaders":[],"readers":["everyone"],"writers":["ICLR.cc/2026/Conference","ICLR.cc/2026/Conference/Submission1742/Reviewer_og9K"],"number":1,"invitations":["ICLR.cc/2026/Conference/Submission1742/-/Official_Review","ICLR.cc/2026/Conference/-/Edit"],"domain":"ICLR.cc/2026/Conference","tcdate":1760873860388,"cdate":1760873860388,"tmdate":1762915876369,"mdate":1762915876369,"parentInvitations":"ICLR.cc/2026/Conference/-/Official_Review","license":"CC BY 4.0","version":2},{"content":{"summary":{"value":"The paper presents a major conceptual step: turning a small LLM into a meta-agent that learns to orchestrate larger, specialized LLMs through reinforcement learning. 
Instead of directly solving problems, the Conductor learns to design agentic workflows—decomposing tasks into subtasks, assigning them to specialized worker models, and defining how agents communicate and share context.\nIt establishes new SOTA reasoning results on tasks such as LiveCodeBench, GPQA-Diamond, and AIME 2025 and introduces a flexible, extensible framework for autonomous multi-agent coordination via language."},"soundness":{"value":3},"presentation":{"value":4},"contribution":{"value":2},"strengths":{"value":"- While prior work (e.g., Mixture-of-Agents [Wang et al., 2024], or Smoothie [Guha et al., 2024]) explores routing or fixed multi-agent topologies, this work chooses to learn such coordination end-to-end using pure reward maximization, without predefined roles or handcrafted scaffolds.\n- The idea of recursively calling itself as part of the agentic workflow is interesting and open pathways for further studies.\n- The empirical results are strong, beating top frontier models.\n- I like the in-domain and out of domain analysis. Really put things in perspectives. It is nice to know that Conductor is not just overfitting on specific task or formats."},"weaknesses":{"value":"```One concern I have is whether the benefits from Conductor came from orchestration or agentic workflow planning, or purely prompt engineering?```\n\nThe Conductor orchestrates the agentic flow with respective subtasks. And RL is used to train the model to do better at this task which can be roughly divided to two tasks: 1) manage and assign the best match between model and tasks. 2) create better subtask directions/prompts. There is no ablation that fixes 1), that is, use only one model (like GPT-5) and then train Conductor with RL. If Conductor is still doing pretty well, that means benefits from orchestrating is minimal. I would highly suggest doing such an ablation which would make your case a lot stronger. 
\n\n```The tasks themselves are not very multi-agent dependent```\n\nTasks like Math500, AIME or LiveCodeBench are not really suitable tasks for multi-agent. Ideally, a multi-agent task or a task where multi-agent can have some leverage or just single agent are tasks that naturally can be divided to subtasks that are heterogenous. I don't find these tasks to have those properties. Like do you write first part of the math solution and then write the second part using another model? I saw some examples where Conductor ask models to refine or verify. But those are not really something that a model have specialty in. The task selection is a bit off.\n\n```Lack of other benefits```\n\nTo continue from last point. A weird task selection would cause the benefit of multi-agent to diminish. Multi-agent frameworks comes in with three big pros -- efficiency, safety and performance. There is no efficiency benefits for this because everything is sequential and one model has to wait for last model to finish (correct me if I misunderstood). This limits the contribution of the Conductor. I would love to see such a pipeline work in a meaningful agentic setup."},"questions":{"value":"Typos:\n- Line 200 \"to its own parent output\"\n\nQuestion:\n\n- How sensitive is the Conductor’s learning to the choice of reward granularity (binary correctness vs. partial credit or verification-based rewards)? Would denser reward shaping improve stability or lead to different emergent strategies? What reward strategies have you guys tried? I am generally really curious about this. \n\n- What are the different modes of orchestration the Conductor learned? 
Did you guys do some simple categorization?"},"flag_for_ethics_review":{"value":["No ethics review needed."]},"rating":{"value":4},"confidence":{"value":4},"code_of_conduct":{"value":"Yes"}},"id":"ROn0QfMSlZ","forum":"U23A2BUKYt","replyto":"U23A2BUKYt","signatures":["ICLR.cc/2026/Conference/Submission1742/Reviewer_c7Bs"],"nonreaders":[],"readers":["everyone"],"writers":["ICLR.cc/2026/Conference","ICLR.cc/2026/Conference/Submission1742/Reviewer_c7Bs"],"number":2,"invitations":["ICLR.cc/2026/Conference/Submission1742/-/Official_Review","ICLR.cc/2026/Conference/-/Edit"],"domain":"ICLR.cc/2026/Conference","tcdate":1761884999376,"cdate":1761884999376,"tmdate":1762915875068,"mdate":1762915875068,"parentInvitations":"ICLR.cc/2026/Conference/-/Official_Review","license":"CC BY 4.0","version":2},{"content":{"summary":{"value":"This paper introduces a conductor model trained with RL to \"orchestrate\" between different underlying LLMs to perform work. The conductor attains SOTA on reasoning benchmarks and outperforms all component models. The authors also enable online recursive adaptation of the conductor agent. This unlocks a new form of dynamic test-time compute scaling."},"soundness":{"value":3},"presentation":{"value":3},"contribution":{"value":4},"strengths":{"value":"This is an excellent paper. The training setup is sound and the conductor is trained end-to-end with RL. This strategy unlocks new SoTAs on top reasoning benchmarks. The training setup also clearly observes novel strategies from the conductor (e.g., having sub-LLMs validate each other's outputs and plan with each other). There are a number of ablations to prove robustness and follow-ups to results, and many of them have interesting results of their own (e.g., showing the impact of spending more agents on harder tasks). Figures are clean. 
Examples are thorough."},"weaknesses":{"value":"It is unclear if these techniques improve performance compared to other test-time-compute strategies (e.g., pass@k, best-of-N, consensus, or other heavily prompted setups to increase test-time-compute put into solving the same problem). Because the incremental lift from this strategy seems relatively small, this paper would be significantly strengthened by including comparisons to other test-time-compute strategies to show that it's worth it compared to e.g., pass@k/BoN/cons/prompting/etc.\n\nMoreover, it would be useful to understand \"cost-normalized\" performance; quantitatively, much more latency and cost is incurred from using this strategy and is it worth the performance gained? There is a plot with number of agent calls but latency in terms of time/tokens and cost in terms of actual API pricing or equivalent would be much more interpretable here to baseline the tradeoff needed to get the performance wins."},"questions":{"value":"What is the \"non-trained\" baseline (e.g., BoN/pass@)? 
What is cost-normalized performance?"},"flag_for_ethics_review":{"value":["No ethics review needed."]},"rating":{"value":8},"confidence":{"value":4},"code_of_conduct":{"value":"Yes"}},"id":"VoDjvWvifc","forum":"U23A2BUKYt","replyto":"U23A2BUKYt","signatures":["ICLR.cc/2026/Conference/Submission1742/Reviewer_5Z3X"],"nonreaders":[],"readers":["everyone"],"writers":["ICLR.cc/2026/Conference","ICLR.cc/2026/Conference/Submission1742/Reviewer_5Z3X"],"number":3,"invitations":["ICLR.cc/2026/Conference/Submission1742/-/Official_Review","ICLR.cc/2026/Conference/-/Edit"],"domain":"ICLR.cc/2026/Conference","tcdate":1761965939844,"cdate":1761965939844,"tmdate":1762915874426,"mdate":1762915874426,"parentInvitations":"ICLR.cc/2026/Conference/-/Official_Review","license":"CC BY 4.0","version":2},{"content":{"summary":{"value":"The paper proposes training a small (7B) Conductor model using reinforcement learning (RL) to orchestrate multiple strong LLMs for complex reasoning tasks. The model is trained using GRPO for 200 iterations. The experiments demonstrate that the Conductor achieves better performance on these reasoning tasks compared to any single model. Overall, the experimental results largely support the claim that the Conductor model improves reasoning performance in practice, although the margin of improvement is not always substantial."},"soundness":{"value":3},"presentation":{"value":1},"contribution":{"value":2},"strengths":{"value":"- The core idea of using reinforcement learning to train a dedicated task planner/assigner (the \"Conductor\") is a potentially interesting direction to combine different models’ strengths.\n- The experimental results demonstrate that this approach is effective, showing performance gains over individual state-of-the-art models."},"weaknesses":{"value":"- **The presentation has significant flaws.** Some expressions are imprecise, lack formal explanations, or are unsuitable for an academic paper. 
For instance, the GRPO formula in Equation (1) appears to be incorrect, as it seems to be missing the clipping mechanism characteristic of PPO-style algorithms. Also, the paper frequently uses the term \"agentic workflows\" without providing a formal definition within the context of this study. This term is often over-used, and the authors need to clarify precisely what it entails in their framework. Several other key terms are left undefined or used imprecisely. For example:\n    - What do the authors mean by the \"latent capability\" of LLMs (line 37)?\n    - What constitutes the \"unconstrained\" setting for evaluation (line 267)?\n- **The motivation for training a separate Conductor model is not fully convincing given the experimental results.** While Table 1 and Figure 4 show that the Conductor improves upon the best single agent (model), the performance foundation clearly comes from the powerful frontier models it orchestrates (e.g., GPT-5 achieves 90.8 on unseen task AIME25, while Conductor reaches 93.3). The marginal gain seems small. This raises a critical missing baseline: what is the performance of a strong model (like GPT-5 or Gemini 2.5 Pro) when simply prompted to act as the task planner and assigner? Furthermore, the results for other multi-agent baselines in Figure 4 are confusing. Why do established frameworks like MoA and MASRouter show results inferior to a single-agent Gemini 2.5 Pro? Were these baselines tested with the same powerful set of worker models as the Conductor? This needs clarification to fairly assess the Conductor's contribution."},"questions":{"value":"- The reward mechanism is described at a high level. Could you elaborate on the credit assignment? Is the final reward applied to the entire sequence of tokens generated by the Conductor? How does this scalar reward effectively train the complex, multi-step workflow generation?\n- The premise of the paper is that different models have specialized, complementary skills. 
Could you provide concrete examples of this from your experiments? For instance, are there cases where a generally \"weaker\" model (like Qwen3-32B) correctly solves a sub-task that a \"stronger\" model (like GPT-5) fails at, demonstrating true complementary specialization?\n- How much more computational cost (e.g., inference rounds or total tokens) does the Conductor framework introduce compared to the single-model / single-agent scenarios? A clear analysis of the performance-cost trade-off is needed."},"flag_for_ethics_review":{"value":["No ethics review needed."]},"rating":{"value":2},"confidence":{"value":3},"code_of_conduct":{"value":"Yes"}},"id":"URJg6LMukD","forum":"U23A2BUKYt","replyto":"U23A2BUKYt","signatures":["ICLR.cc/2026/Conference/Submission1742/Reviewer_TwCE"],"nonreaders":[],"readers":["everyone"],"writers":["ICLR.cc/2026/Conference","ICLR.cc/2026/Conference/Submission1742/Reviewer_TwCE"],"number":4,"invitations":["ICLR.cc/2026/Conference/Submission1742/-/Official_Review","ICLR.cc/2026/Conference/-/Edit"],"domain":"ICLR.cc/2026/Conference","tcdate":1762023147680,"cdate":1762023147680,"tmdate":1764359350952,"mdate":1764359350952,"parentInvitations":"ICLR.cc/2026/Conference/-/Official_Review","license":"CC BY 4.0","version":2},{"content":{"title":{"value":"Global Response"},"comment":{"value":"Dear Reviewers and AC, \n\nWe thank the reviewers for taking the time to review our work and for their constructive feedback and suggestions. We’re particularly appreciative of the endorsements of our work, such as its strong empirical results including SOTA performance (all reviewers), the conceptual novelty of our framework (reviewers og9K, c7Bs, 5Z3X), and scope for interesting future directions through recursion and adaptivity (reviewers 5Z3X, c7Bs). \n\nTwo common themes emerged from reviewer responses, which we’d like to address in this global response. 
The first was the request for additional details regarding performance-efficiency analysis, in particular, adding token usage and average cost to our efficiency analysis. Following this request, we’ve added this information to our existing performance-efficiency comparison with multi-agent baselines. We also ran an additional single-agent inference-time framework, consensus [1], and added to our existing self-reflection inference-time baselines the performance-efficiency tradeoffs of GPT-5, Gemini 2.5 Pro, and Claude Sonnet 4 in comparison with our Conductor. These results are shown in Tables A1 and A2, where we find that the Conductor offers substantive performance and efficiency improvements over both single-agent inference-time scaling and multi-agent baselines. \n\nThe second was the request for additional ablations of the Conductor framework, where we 1) use a powerful frontier model as the Conductor to perform agent selection and subtask assignment, and 2) replace all agent workers with a single powerful frontier model. We present these results in Table B and C, where we find that the Conductor far surpasses these additional baselines, which we believe offers strong evidence of the Conductor’s learned ability to first discern the true underlying capabilities of the available agents, and then to intelligently combine and compose the differing agents to deliver performance beyond any individual worker across a wide range of tasks. \n\nWe provide the main analysis, results, and figures about these points and beyond in the individual reviewer responses for easier reference. 
In case there are any outstanding comments or suggestions following our rebuttal, we hope the reviewers will not hesitate to get back to us, and we will focus on addressing them right away.\n\nThanks,\nAuthors\n\nTable A1\n\n| Model                          | Performance | Token Usage | Cost      | Cost-adjusted performance |\n|--------------------------------|-------------|-------------|-----------|---------------------------|\n| Claude Sonnet 4 5x consensus   | 91          | 1412.8      | 0.021192  | 42.94073235               |\n| Claude Sonnet 4 5x reflect     | 90.66       | 2516.992    | 0.02080128| 43.58385638               |\n| Gemini 2.5 Pro 5x consensus    | 91.6        | 1658.4      | 0.016584  | 55.23396044               |\n| Gemini 2.5 Pro 5x reflect      | 88.33       | 2919.776    | 0.01675976| 52.70361867               |\n| GPT 5 5x consensus             | 91.3        | 1376.3      | 0.013763  | 66.33728112               |\n| GPT 5 5x reflect               | 91.79       | 2457.132    | 0.01424907| 64.41823923               |\n| Conductor                      | 93.14       | 735.2       | 0.009     | 103.4888889               |\n\n\n\nTable A2\n\n| Model   | Performance | Token Usage | Cost |\n|----------------|---------|----------|------------------|\n| MoA            | 62.13 | 11203 |  $0.048554  |\n| Smoothie    | 56.48  |9909   | $0.039291  |\n| RDC            | 52.41  | 840    | $0.005613 |\n| MasRouter  | 56.89 | 4970   | $0.01345   |\n| Conductor   | 72.35 | 1820   | $0.02384   |\n\n\nTable B\n\n| Model                     | LCB   | AIME | BigCodeBench | GPQA-D | Avg.    
|\n|---------------------------|-------|------|--------------|--------|---------|\n| GPT-5 conduct 7 models   | 50.86 | 76.67 | 34.5       | 77.78 | 59.9525  |\n| GPT-5 conduct  3 models           | 67.43 | 93.3 | 33.1         | 86.36  | 70.0475 |\n| Gemini 2.5 Pro conduct 3 models   | 70.29 | 93.3 | 35.13        | 87.62  | 71.585  |\n| Conductor                 | 83.93 | 93.3 | 37.86        | 87.5   | 75.6475 |\n\nTable C\n\n| Model                  | AIME | BigCodeBench | GPQA-D | Avg.        |\n|------------------------|------|--------------|--------|-------------|\n| Claude Sonnet 4        | 74.3 | 37.16        | 77.7   | 63.05333333 |\n| Gemini 2.5 Pro         | 78.3 | 37.51        | 84.8   | 66.87       |\n| GPT-5                  | 90.8 | 32.75        | 82.3   | 68.61666667 |\n| Conductor with only GPT-5 | 93.33| 33.5         | 82.6   | 69.81       |\n| Conductor              | 93.3 | 37.86        | 87.5   | 72.88666667 |\n\n\n[1] Wang, Xuezhi, et al. \"Self-consistency improves chain of thought reasoning in language models.\" International Conference on Learning Representations (ICLR) (2023)"}},"id":"12mlXch7Cz","forum":"U23A2BUKYt","replyto":"U23A2BUKYt","signatures":["ICLR.cc/2026/Conference/Submission1742/Authors"],"readers":["everyone"],"writers":["ICLR.cc/2026/Conference","ICLR.cc/2026/Conference/Submission1742/Authors"],"number":1,"invitations":["ICLR.cc/2026/Conference/Submission1742/-/Official_Comment"],"domain":"ICLR.cc/2026/Conference","tcdate":1763646989696,"cdate":1763646989696,"tmdate":1763701449012,"mdate":1763701449012,"parentInvitations":"ICLR.cc/2026/Conference/-/Official_Comment","license":"CC BY 4.0","version":2},{"content":{"summary":{"value":"This paper introduces \"The Conductor,\" a 7B model trained via reinforcement learning to manage and coordinate a team of larger, specialized models (like GPT-5 and Gemini) to solve difficult reasoning tasks. 
The reviewers were impressed by the paper's strong empirical results, which achieved state-of-the-art performance on benchmarks like AIME and LiveCodeBench, demonstrating that a small model can effectively \"orchestrate\" much larger ones. Initial concerns focused on the lack of cost and efficiency analysis, the need for stronger baselines (such as using a frontier model like GPT-5 as the conductor), and whether the gains were due to orchestration or just better prompting. Because the authors provided thorough evidence showing the Conductor is more efficient than standard methods and outperforms frontier-model planners, I recommend the paper be accepted."},"reviewer_concerns":{"value":"The authors successfully addressed the majority of the reviewers' technical concerns during the rebuttal. Specifically, they added a comprehensive performance-efficiency analysis that proved the Conductor uses fewer tokens and costs less than single-agent \"consensus\" strategies. They also included a \"GPT-5 as Conductor\" baseline, which showed that a specifically trained smaller model actually makes better coordination decisions than a raw frontier model. While some concerns remain regarding whether certain math and coding tasks are the best fit for a \"multi-agent\" approach, and some terminology like \"latent capability\" remains slightly vague, the authors' new qualitative examples of \"planners\" and \"executors\" working together effectively demonstrated the value of their framework."},"reviewer_scores":{"value":"The reviewers responded very positively to the rebuttal materials. Reviewer TwCE was the most convinced, raising their score significantly from a 2 (Reject) to a 6 (Weak Accept) after their questions on baselines and efficiency were answered. Reviewers og9K and 5Z3X maintained their high scores of 8, remaining confident in the work’s contribution to agentic workflows. 
Reviewer c7Bs remained the most skeptical with a 4, but given that the authors provided the specific \"GPT-5 only\" ablation they requested—which proved orchestration is better than just prompting—it is highly likely this reviewer would have moved toward a 5 or 6 had they participated in the final discussion."}},"id":"DouxKBeSOa","forum":"U23A2BUKYt","replyto":"U23A2BUKYt","signatures":["ICLR.cc/2026/Conference/Submission1742/Area_Chair_niX2"],"nonreaders":[],"readers":["everyone"],"writers":["ICLR.cc/2026/Conference","ICLR.cc/2026/Conference/Submission1742/Senior_Area_Chairs","ICLR.cc/2026/Conference/Submission1742/Area_Chair_niX2"],"number":1,"invitations":["ICLR.cc/2026/Conference/Submission1742/-/Meta_Review","ICLR.cc/2026/Conference/-/Edit"],"domain":"ICLR.cc/2026/Conference","tcdate":1767738129072,"cdate":1767738129072,"tmdate":1770618835738,"mdate":1770618835738,"parentInvitations":"ICLR.cc/2026/Conference/-/Meta_Review","license":"CC BY 4.0","version":2},{"content":{"title":{"value":"Paper Decision"},"decision":{"value":"Accept (Poster)"},"comment":{"value":""}},"id":"yoqZZjPrmg","forum":"U23A2BUKYt","replyto":"U23A2BUKYt","signatures":["ICLR.cc/2026/Conference/Program_Chairs"],"nonreaders":[],"readers":["everyone"],"writers":["ICLR.cc/2026/Conference","ICLR.cc/2026/Conference/Program_Chairs"],"number":1,"invitations":["ICLR.cc/2026/Conference/Submission1742/-/Decision","ICLR.cc/2026/Conference/-/Edit"],"domain":"ICLR.cc/2026/Conference","tcdate":1769416475361,"cdate":1769416475361,"tmdate":1770356548698,"mdate":1770356548698,"parentInvitations":"ICLR.cc/2026/Conference/-/Decision","license":"CC BY 4.0","version":2}]}}]}