Zero-Shot Policy Transfer in Multi-Agent Reinforcement Learning via Trusted Federated Explainability

Authors

  • Mohan Siva Krishna Konakanchi, Independent Researcher, USA.

DOI:

https://doi.org/10.63282/3050-9246.IJETCSIT-V6I3P118

Keywords:

Multi-Agent Reinforcement Learning, Zero-Shot Transfer, Federated Learning, Trust Metrics, Explainable AI, Integrity, Accountability, Policy Transfer

Abstract

Zero-shot policy transfer in multi-agent reinforcement learning (MARL) aims to reuse learned behaviors across new tasks, agent populations, or environments without additional training. While promising for scalable autonomy, real-world MARL deployments are typically siloed: data, simulators, and operational telemetry are separated across business units, regions, or vendors, and cannot be centrally pooled. This creates a core tension: policy transfer benefits from shared learning, yet safety, privacy, and organizational boundaries demand decentralization. Further, transfer decisions in high-stakes settings must be explainable and auditable, but adding explainability mechanisms can reduce performance or increase operational cost. Finally, federated settings are vulnerable to integrity failures (e.g., faulty or malicious updates) that can degrade global transfer quality. This paper proposes TFX-MARL (Trusted Federated Explainability for MARL), a governance-inspired framework for zero-shot policy transfer across silos using trust-metric-based federated learning (FL) and explainability controls. TFX-MARL contributes: (i) a trust metric that quantifies participant integrity and accountability using provenance, update consistency, local evaluation reliability, and safety-compliance signals; (ii) a trust-aware federated aggregation protocol that reduces poisoning risk and emphasizes high-accountability participants; and (iii) a trade-off controller that explicitly quantifies and optimizes the explainability–performance balance using a simple, operationally interpretable budgeting mechanism. We evaluate TFX-MARL using a controlled simulation of heterogeneous MARL domains with non-IID task distributions, partial observability, and adversarial participants. Results show that trust-aware FL improves robust zero-shot transfer compared to standard FedAvg baselines, while explainability budgets maintain stable, actionable explanations with limited performance degradation. We conclude with engineering guidance for deploying trusted federated policy transfer in multi-agent systems requiring integrity, accountability, and explainable decision justification.
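As a reading aid, the Python sketch below illustrates the kind of mechanism the abstract describes: a scalar trust score built from the four named integrity and accountability signals, used to weight (or exclude) participant updates in place of FedAvg's uniform averaging, plus a toy explainability-budget check. Every name and constant here (trust_score, trust_weighted_aggregate, within_explainability_budget, the equal signal weights, the min_trust threshold) is an illustrative assumption and not the paper's actual implementation.

import numpy as np

# Illustrative sketch only; all weights and thresholds are assumed, not taken
# from the TFX-MARL paper.

def trust_score(provenance, consistency, eval_reliability, safety,
                weights=(0.25, 0.25, 0.25, 0.25)):
    # Combine the four signals named in the abstract (each assumed to be
    # pre-normalized to [0, 1]) into one accountability score in [0, 1].
    signals = np.array([provenance, consistency, eval_reliability, safety])
    return float(np.clip(np.dot(weights, signals), 0.0, 1.0))

def trust_weighted_aggregate(updates, scores, min_trust=0.3):
    # Drop participants below the trust threshold, then average the surviving
    # policy updates weighted by trust rather than uniformly (as FedAvg would).
    kept = [(u, s) for u, s in zip(updates, scores) if s >= min_trust]
    if not kept:
        raise ValueError("no participant met the trust threshold")
    total = sum(s for _, s in kept)
    return sum(s * u for u, s in kept) / total

def within_explainability_budget(explain_cost, perf_drop, max_cost, max_drop):
    # Toy version of a budgeting check: keep an explanation mechanism enabled
    # only while its runtime cost and measured performance loss stay under caps.
    return explain_cost <= max_cost and perf_drop <= max_drop

# Example: the second (anomalous, low-trust) update is excluded from the mean.
updates = [np.array([0.10, 0.20]), np.array([5.0, -4.0]), np.array([0.12, 0.18])]
scores = [trust_score(0.9, 0.9, 0.8, 1.0),   # well-behaved silo
          trust_score(0.2, 0.1, 0.1, 0.0),   # suspected poisoner
          trust_score(0.8, 0.9, 0.7, 1.0)]
aggregated = trust_weighted_aggregate(updates, scores)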


References

[1] M. E. Taylor and P. Stone, “Transfer learning for reinforcement learning domains: A survey,” Journal of Machine Learning Research, vol. 10, pp. 1633–1685, 2009.

[2] F. A. Oliehoek and C. Amato, A Concise Introduction to Decentralized POMDPs. Springer, 2016.

[3] R. Lowe et al., “Multi-agent actor-critic for mixed cooperative-competitive environments,” in Proc. NeurIPS, 2017.

[4] J. Foerster et al., “Counterfactual multi-agent policy gradients,” in Proc. AAAI, 2018.

[5] T. Rashid et al., “QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning,” in Proc. ICML, 2018.

[6] A. A. Rusu et al., “Policy distillation,” in Proc. ICLR, 2016.

[7] H. B. McMahan et al., “Communication-efficient learning of deep networks from decentralized data,” in Proc. AISTATS, 2017.

[8] J. Konečný, B. McMahan, and D. Ramage, “Federated optimization: Distributed optimization beyond the datacenter,” arXiv preprint arXiv:1511.03575, 2015.

[9] P. Kairouz et al., “Advances and open problems in federated learning,” arXiv preprint arXiv:1912.04977, 2019.

[10] K. Bonawitz et al., “Practical secure aggregation for privacy-preserving machine learning,” in Proc. ACM CCS, 2017.

[11] P. Blanchard, E. M. El Mhamdi, R. Guerraoui, and J. Stainer, “Machine learning with adversaries: Byzantine tolerant gradient descent,” in Proc. NeurIPS, 2017.

[12] D. Yin, Y. Chen, K. Ramchandran, and P. Bartlett, “Byzantine-robust distributed learning: Towards optimal statistical rates,” in Proc. ICML, 2018.

[13] M. T. Ribeiro, S. Singh, and C. Guestrin, “Why should I trust you?: Explaining the predictions of any classifier,” in Proc. ACM KDD, 2016.

[14] S. M. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” in Proc. NeurIPS, 2017.

[15] M. Sundararajan, A. Taly, and Q. Yan, “Axiomatic attribution for deep networks,” in Proc. ICML, 2017.

[16] M. T. Ribeiro, S. Singh, and C. Guestrin, “Anchors: High-precision model-agnostic explanations,” in Proc. AAAI, 2018.

[17] C. Rudin, “Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead,” Nature Machine Intelligence, vol. 1, no. 5, pp. 206–215, 2019.

[18] E. Androulaki et al., “Hyperledger Fabric: A distributed operating system for permissioned blockchains,” in Proc. EuroSys, 2018.

[19] B. Putz, F. Menges, and G. Pernul, “A secure and auditable logging infrastructure based on a permissioned blockchain,” Computers & Security, vol. 87, 2019.

Published

2025-08-26

Issue

Vol. 6 No. 3 (2025)

Section

Articles

How to Cite

Konakanchi MSK. Zero-Shot Policy Transfer in Multi-Agent Reinforcement Learning via Trusted Federated Explainability. IJETCSIT [Internet]. 2025 Aug. 26;6(3):121-7. Available from: https://ijetcsit.org/index.php/ijetcsit/article/view/574
