Authors
Sai Guruju, Abhishek Niranjan, Krishnachaitanya Gogineni, and Jithendra Vepa
Observe.AI, USA
Abstract
In modern systems, especially those that process multimedia content such as audio and video, handling varying workloads efficiently is crucial for maintaining performance and cost-effectiveness. This paper presents a Load-Aware Smart Consumer system designed for the transcription of customer-agent conversations, a process often hampered by the wide variance in audio call duration. The system dynamically adjusts its processing concurrency based on the current load, which keeps it stable and makes efficient use of resources. By monitoring the processing load across instance-workers, the system makes informed decisions about whether to accept and process new tasks, leading to improved resilience and cost savings. The approach is not limited to transcription engines; it can be applied to any multimedia processing system facing similar challenges of input variability and resource constraints.
Keywords
high-latency systems, concurrency control, cost-efficiency, stability, dynamic resource allocation, multimedia processing