The success of today’s AI applications requires not only model training (model-centric) but also data engineering (data-centric). In data-centric AI, active learning
(AL) plays a vital role, but current AL tools cannot perform AL tasks efficiently.
To this end, this paper presents an efficient MLOps system for AL, named ALaaS
(Active-Learning-as-a-Service). Specifically, ALaaS adopts a server-client architecture
to support an AL pipeline and implements stage-level parallelism for high
efficiency. In addition, caching and batching techniques are employed to further accelerate
the AL process. Beyond efficiency, ALaaS ensures accessibility by following
the configuration-as-a-service design philosophy. It also abstracts
an AL process into several components and provides rich APIs for advanced users
to extend the system to new scenarios. Extensive experiments show that ALaaS
outperforms all other baselines in terms of latency and throughput. Further ablation
studies demonstrate the effectiveness of our design as well as ALaaS’s ease of use.
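
To make the ideas of stage-level parallelism, caching, and batching concrete, the following is a minimal sketch and not the ALaaS implementation. It assumes a three-stage pipeline (fetch, preprocess, score) in which each stage runs in its own thread and passes items through queues; `fetch`, `preprocess`, `score_batch`, and `run_pipeline` are hypothetical names introduced only for illustration, and the scoring function is a placeholder for real model inference.

```python
import queue
import threading
from functools import lru_cache

_SENTINEL = object()  # marks the end of the stream between stages


@lru_cache(maxsize=1024)
def fetch(uri: str) -> str:
    # Hypothetical stand-in for downloading a sample.
    # Caching: a repeated URI is served from memory instead of refetched.
    return f"data({uri})"


def preprocess(raw: str) -> int:
    # Hypothetical stand-in for decoding/resizing; returns a toy feature.
    return len(raw)


def score_batch(batch):
    # Hypothetical stand-in for batched model inference that returns
    # one uncertainty-like score per item.
    return [1.0 / (1 + x) for x in batch]


def stage(fn, inbox, outbox):
    # Generic per-item stage: consume, transform, forward.
    while True:
        item = inbox.get()
        if item is _SENTINEL:
            outbox.put(_SENTINEL)
            return
        outbox.put(fn(item))


def batched_stage(fn, inbox, outbox, batch_size):
    # Batching: group items so the model is called once per batch.
    batch = []
    while True:
        item = inbox.get()
        done = item is _SENTINEL
        if not done:
            batch.append(item)
        if batch and (done or len(batch) == batch_size):
            for result in fn(batch):
                outbox.put(result)
            batch = []
        if done:
            outbox.put(_SENTINEL)
            return


def run_pipeline(uris, batch_size=4):
    # Stage-level parallelism: all three stages run concurrently, so
    # fetching sample k+1 can overlap with scoring sample k.
    q_in, q_raw, q_feat, q_out = (queue.Queue() for _ in range(4))
    workers = [
        threading.Thread(target=stage, args=(fetch, q_in, q_raw)),
        threading.Thread(target=stage, args=(preprocess, q_raw, q_feat)),
        threading.Thread(
            target=batched_stage,
            args=(score_batch, q_feat, q_out, batch_size),
        ),
    ]
    for w in workers:
        w.start()
    for uri in uris:
        q_in.put(uri)
    q_in.put(_SENTINEL)

    scores = []
    while True:
        item = q_out.get()
        if item is _SENTINEL:
            break
        scores.append(item)
    for w in workers:
        w.join()
    return scores
```

Because each stage here is a single thread reading from a FIFO queue, the output order matches the input order. A real system would likely use processes or an inference server for the scoring stage, since Python threads do not run CPU-bound work in parallel.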
arXiv:2207.09109v1 [cs.LG] 19 Jul 2022
Table 1: Comparison of open-source Active Learning (AL) tools. ALaaS provides a Machine-Learning-as-a-Service experience and substantially improves AL efficiency.