Transformers Serving Experiment
This experiment measures model load time and inference time for each serving method (tf_serving, torch_serving, transformers_docker, optimum_onnx) and each task_type (doc_cls, doc_pair_cls, doc_multi_cls, token_cls), across some of our most widely used models.
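
As a rough illustration of the two quantities being observed, the sketch below times model loading and per-document inference for a single task_type (doc_cls) using a local transformers pipeline as a stand-in for the serving methods listed above; it is not the experiment's actual harness, and the model name and sample inputs are hypothetical placeholders.

    # Minimal timing sketch (assumption: not the repo's real benchmark code).
    import time

    from transformers import pipeline

    # Hypothetical doc_cls setup: placeholder model and sample documents.
    MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"
    SAMPLE_INPUTS = ["This is a sample document."] * 32

    # Load time: constructing the pipeline (weights + tokenizer).
    t0 = time.perf_counter()
    classifier = pipeline("text-classification", model=MODEL_NAME)
    load_time = time.perf_counter() - t0

    # Inference time: average latency per document over the sample batch.
    t0 = time.perf_counter()
    _ = classifier(SAMPLE_INPUTS)
    infer_time = (time.perf_counter() - t0) / len(SAMPLE_INPUTS)

    print(f"load_time={load_time:.3f}s  avg_inference={infer_time * 1000:.1f}ms/doc")

The same load/inference split applies to the other serving methods: load time covers container or server startup plus model initialization, while inference time is measured per request once the server is warm.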