Some invocations of a Lambda function take much longer despite a warm start
Here is my AWS Lambda function:
    import json
    import time
    import pickle


    def ms_now():
        return int(time.time_ns() / 1000000)


    class Timer():
        def __init__(self):
            self.start = ms_now()

        def stop(self):
            return ms_now() - self.start


    timer = Timer()

    from punctuators.models import PunctCapSegModelONNX
    model_name = "pcs_en"
    model_sentences = PunctCapSegModelONNX.from_pretrained(model_name)

    with open('model_embeddings.pkl', 'rb') as file:
        model_embeddings = pickle.load(file)

    cold_start = True
    init_time = timer.stop()
    print("Time to initialize:", init_time, flush=True)


    def segment_text(texts):
        sentences = model_sentences.infer(texts)
        sentences = [
            [(s, len(model_embeddings.tokenizer.encode(s))) for s in el]
            for el in sentences]
        return sentences


    def get_embeddings(texts):
        return model_embeddings.encode(texts)


    def compute(body):
        command = body['command']
        if command == 'ping':
            return 'success'

        texts = body['texts']
        if command == 'embeddings':
            result = get_embeddings(texts)
            return [el.tolist() for el in result]
        if command == 'sentences':
            return segment_text(texts)
        assert(False)


    def lambda_handler(event, context):
        global cold_start
        global init_time

        stats = {'cold_start': cold_start, 'init_time': init_time}
        cold_start = False
        init_time = 0

        stats['started'] = ms_now()
        result = compute(event['body'])
        stats['finished'] = ms_now()
        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json'
            },
            'body': {'result': result, 'stats': stats}
        }
This Lambda function, along with the packages and the models (so that those don't need to be downloaded), is deployed as a docker image.
In addition to the timestamps of when the function started and finished (not including the cold start initialization), the response contains the information about whether it was a cold start and how long it took to initialize. I have another function, which invokes this function 15 times in parallel.
The anomaly happens with the first of these parallel invocations. Usually, it takes ~300 ms (computed as the difference of the timestamps in the response). But sometimes it takes 900 ms or longer (with the same input).
This is not caused by a cold start, since I have init_time==0 in the response (when a cold start occurs, init_time>6000). It happens both with command == 'embeddings' and with command == 'sentences'.
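For reference, the check described above can be sketched as follows. This is only an illustration: the helper name `find_warm_spikes` and the 600 ms threshold are assumptions, and `bodies` stands for the list of `body` dictionaries returned by the parallel invocations.

```python
def find_warm_spikes(bodies, threshold_ms=600):
    """Return (index, duration_ms) for warm invocations slower than threshold_ms.

    `bodies` is assumed to be a list of the 'body' dicts returned by
    lambda_handler; the threshold value is an arbitrary illustration.
    """
    spikes = []
    for i, body in enumerate(bodies):
        stats = body["stats"]
        # Duration as measured inside the function, excluding initialization.
        duration = stats["finished"] - stats["started"]
        # Only report invocations that were not cold starts.
        if not stats["cold_start"] and duration > threshold_ms:
            spikes.append((i, duration))
    return spikes
```

Because the duration comes from timestamps taken inside the handler, network latency and invocation queuing are excluded from it.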
What could be the explanation for these spikes? With a warm start, what can cause a Lambda function to take much longer than usual?
P.S. This question is also posted on Stack Overflow.
asked 3 years ago, 574 views
It's probably Python garbage collection. On a warm start the container is reused, so over a series of invocations the garbage collector is likely to kick in at some point and make that invocation take longer. I was at a presentation last night about using Rust in Lambda where graphs were shown with this exact thing, comparing Lambda execution times of Rust vs. a language that has garbage collection. In this case the other language was TypeScript, and there was a spike in execution time every 2 minutes due to the garbage collector.
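One way to test this hypothesis would be to measure how much time each invocation spends in garbage collection. The following is a minimal sketch using the standard `gc.callbacks` hook; the names `gc_pauses`, `_gc_callback`, and `gc_ms_during` are made up for illustration and are not part of the original function.

```python
import gc
import time

gc_pauses = []    # one (generation, duration_ms) entry per collection
_gc_start = None  # start time of the collection currently in progress


def _gc_callback(phase, info):
    """Record how long each garbage collection takes."""
    global _gc_start
    if phase == "start":
        _gc_start = time.perf_counter()
    elif phase == "stop" and _gc_start is not None:
        elapsed_ms = (time.perf_counter() - _gc_start) * 1000
        gc_pauses.append((info["generation"], elapsed_ms))
        _gc_start = None


# CPython calls every function in gc.callbacks before and after a collection.
gc.callbacks.append(_gc_callback)


def gc_ms_during(fn, *args):
    """Run fn(*args) and return (result, milliseconds spent in GC meanwhile)."""
    before = len(gc_pauses)
    result = fn(*args)
    spent = sum(ms for _, ms in gc_pauses[before:])
    return result, spent
```

In the handler, something like `result, gc_ms = gc_ms_during(compute, event['body'])` could then add `gc_ms` to `stats`, so that slow invocations can be compared against the time they spent collecting.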
- AlwaysLearning, 3 years ago
Disabling automatic garbage collection with gc.disable() helped! But can you come up with an explanation for why this almost always happened in the first invocation?
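A minimal sketch of what this could look like at module level, placed after the models are loaded. Only `gc.disable()` is confirmed by the comment above; the `gc.collect()` and `gc.freeze()` calls and the `collect_if_idle` helper are assumptions added for illustration.

```python
import gc

# Module level: runs once per cold start, after the models have been loaded.
gc.collect()  # assumption: clean up garbage left over from initialization
gc.freeze()   # assumption (Python 3.7+): exclude surviving objects from future collections
gc.disable()  # as in the comment above: no automatic collections during invocations


def collect_if_idle(command):
    """Hypothetical helper: run a full collection only on cheap 'ping' calls.

    Returns the number of unreachable objects found, or None if no
    collection was run.
    """
    if command == "ping":
        return gc.collect()
    return None
```

With automatic collection disabled, reference counting still frees most objects immediately; only objects in reference cycles accumulate until an explicit `gc.collect()`, so memory use is worth watching in a long-lived container.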
