Published 2026-04-30

This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
Multi-objective optimization has become a central design principle for cloud-native intelligent systems because the dominant workloads of the current cycle of distributed computing (large language model inference, LLM-assisted scheduling, observability-driven AIOps, privacy-preserving federated learning, and edge/IoT reasoning) no longer admit a single "best" objective. Instead, practitioners must optimize under persistent trade-offs among latency, throughput, service-level objective (SLO) compliance, energy, carbon emissions, cloud expenditure, fairness, privacy, trust, robustness, and remediation time. Recent work makes this shift explicit. In the LLM-serving literature, DeepServe models joint SLO-cost scheduling as a contextual bandit; BOute co-optimizes heterogeneous model routing and GPU deployment with multi-objective Bayesian optimization; ECCOS combines predictive quality-cost estimation with constrained optimization; and the Liao preprint, framed around prompt-level cost prediction and SLO awareness, presents CAPS, a bi-objective carbon-aware scheduler for online LLM inference. In adjacent domains, cloud autoscaling has been reformulated as risk-constrained reinforcement learning, microservice rate limiting as deep RL under throughput-latency tension, and federated cloud analytics as a joint optimization of accuracy, communication cost, trust, and privacy. To ground these themes, two simulation experiments are reported. The first studies policy search for multi-pool LLM serving under cost, carbon, and SLO-met goodput objectives. The second examines autoscaling under workload drift, comparing a threshold policy, a latency-only reactive controller, and a risk-aware controller. The experiments are not benchmark replications; they are explanatory simulations designed to make the trade-offs in the reviewed literature concrete.
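The kind of trade-off the first experiment explores can be sketched with a toy Pareto filter over candidate serving policies. Everything below (the random candidate generator, the objective tuple, the carbon budget, the weights) is a hypothetical illustration, not code or data from any of the cited systems:

```python
import random

random.seed(0)

# Hypothetical policy candidates for a multi-pool serving simulator:
# each tuple is (cost, carbon, slo_goodput). Values are illustrative.
candidates = [(random.uniform(1, 10),
               random.uniform(1, 10),
               random.uniform(0, 1)) for _ in range(200)]

CARBON_BUDGET = 5.0  # assumed hard constraint on the carbon objective
feasible = [c for c in candidates if c[1] <= CARBON_BUDGET]

def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly
    better on at least one (minimize cost and carbon, maximize goodput)."""
    no_worse = a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2]
    strictly = a[0] < b[0] or a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly

# Pareto front: feasible candidates not dominated by any other feasible one.
pareto = [c for c in feasible
          if not any(dominates(o, c) for o in feasible)]

# A fixed-weight scalarization collapses the three objectives to one score
# and commits to a single point on the front.
w_cost, w_carbon, w_goodput = 0.4, 0.3, 0.3
scalar_best = max(feasible,
                  key=lambda c: w_goodput * c[2] - w_cost * c[0] - w_carbon * c[1])

best_goodput = max(c[2] for c in pareto)
print(f"Pareto front size: {len(pareto)}")
print(f"best goodput on front: {best_goodput:.3f}, "
      f"scalarized pick goodput: {scalar_best[2]:.3f}")
```

With strictly positive weights the scalarized pick always lands somewhere on the Pareto front, but it commits to one point; the front preserves the full menu of budget-feasible trade-offs, which is why Pareto-oriented search can surface higher-goodput options that a fixed weighting would silently discard.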
They show that Pareto-oriented search can recover better budgeted goodput than simple scalarization under a carbon budget, and that risk-aware capacity control can achieve substantially lower tail latency than a latency-only controller at a modest capacity premium. These findings align with the broader literature's movement away from single-objective heuristics and toward closed-loop, multi-objective, and observability-aware decision systems.
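A similarly minimal sketch contrasts the two control philosophies from the second experiment: a reactive controller that scales only after latency has already degraded, versus a risk-aware controller that provisions for a high quantile of recently observed demand. The drifting workload model, the M/M/1-style latency proxy, and every constant here are illustrative assumptions, not the article's simulator:

```python
import random

random.seed(1)

STEPS = 2000
# Drifting workload: a slow ramp plus bursty exponential noise (illustrative).
demand = [50 + 0.02 * t + random.expovariate(1 / 10) for t in range(STEPS)]

def latency(d, cap):
    # Crude M/M/1-style proxy: latency blows up as utilization nears 1.
    util = min(d / cap, 0.999)
    return util / (1 - util)

def run(policy):
    cap, lats, caps, window = 80.0, [], [], []
    for d in demand:
        window = (window + [d])[-100:]       # sliding demand history
        lats.append(latency(d, cap))         # capacity acts with one-step lag
        caps.append(cap)
        cap = policy(cap, d, window)
    lats.sort()
    return lats[int(0.99 * STEPS)], sum(caps) / STEPS  # p99 latency, mean cap

def reactive(cap, d, window):
    # Latency-only: adjust capacity after the latency signal degrades.
    l = latency(d, cap)
    if l > 2.0:
        return cap * 1.1
    if l < 0.5:
        return cap * 0.95
    return cap

def risk_aware(cap, d, window):
    # Provision for the 95th percentile of recent demand at 70% utilization.
    q = sorted(window)[int(0.95 * len(window))]
    return max(q / 0.7, 1.0)

p99_r, cap_r = run(reactive)
p99_q, cap_q = run(risk_aware)
print(f"reactive   p99={p99_r:.2f}  mean_cap={cap_r:.1f}")
print(f"risk-aware p99={p99_q:.2f}  mean_cap={cap_q:.1f}")
```

Which controller wins, and at what capacity premium, depends entirely on the drift rate, burst distribution, and quantile chosen; the sketch's only purpose is to make the tail-latency-versus-headroom tension concrete.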