URL: https://repost.aws/questions/QUI-eP97dOQ-m6USeb4NP30g/some-invocations-of-a-lambda-function-take-much-longer-despite-a-warm-start

Some invocations of a Lambda function take much longer despite a warm start | AWS re:Post


Here is my AWS Lambda function:

import json
import time
import pickle

def ms_now():
    # Current wall-clock time in whole milliseconds.
    return time.time_ns() // 1_000_000

class Timer():
    def __init__(self):
        self.start = ms_now()

    def stop(self):
        # Milliseconds elapsed since the timer was created.
        return ms_now() - self.start

# Started before the heavy import below so that init_time includes it.
timer = Timer()

from punctuators.models import PunctCapSegModelONNX
model_name = "pcs_en"
model_sentences = PunctCapSegModelONNX.from_pretrained(model_name)

with open('model_embeddings.pkl', 'rb') as file:
    model_embeddings = pickle.load(file)

# Module-level state: set once per execution environment, reset by the first invocation.
cold_start = True
init_time = timer.stop()
print("Time to initialize:", init_time, flush=True)

def segment_text(texts):
    sentences = model_sentences.infer(texts)
    # Pair each sentence with its token count under the embedding model's tokenizer.
    return [
        [(s, len(model_embeddings.tokenizer.encode(s))) for s in el]
        for el in sentences
    ]

def get_embeddings(texts):
    return model_embeddings.encode(texts)

def compute(body):
    command = body['command']

    if command == 'ping':
        return 'success'

    texts = body['texts']

    if command == 'embeddings':
        result = get_embeddings(texts)
        return [el.tolist() for el in result]

    if command == 'sentences':
        return segment_text(texts)

    # Fail explicitly; a bare assert would be stripped when Python runs with -O.
    raise ValueError(f"Unknown command: {command!r}")

def lambda_handler(event, context):
    global cold_start
    global init_time

    # Report the cold-start flag and init time once, then clear them for warm invocations.
    stats = {'cold_start': cold_start, 'init_time': init_time}
    cold_start = False
    init_time = 0

    stats['started'] = ms_now()
    result = compute(event['body'])
    stats['finished'] = ms_now()
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json'
        },
        'body': {'result': result, 'stats': stats}
    }

This Lambda function is deployed as a Docker image that bundles the packages and the models, so nothing needs to be downloaded at run time.

In addition to the timestamps of when the function started and finished (not including the cold start initialization), the response contains the information about whether it was a cold start and how long it took to initialize. I have another function, which invokes this function 15 times in parallel.

The anomaly happens with the first of these parallel invocations. Usually, it takes ~300 ms (computed as the difference of the timestamps in the response). But sometimes, with the same input, it takes 900 ms or longer.

This is not caused by a cold start, since init_time == 0 in the response (when a cold start occurs, init_time > 6000). It happens both with command == 'embeddings' and with command == 'sentences'.
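For readers who want to reproduce the measurement, here is a minimal sketch of the fan-out and of how the durations can be derived from the returned stats. The invoking function is not shown in the question, so `fan_out`, `summarize`, the injected `invoke_one` callable and the `spike_ms` threshold are all assumptions made for illustration; only the response shape comes from the handler above.

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(invoke_one, payload, n=15):
    """Call invoke_one(payload) n times in parallel; results keep submission order."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(lambda _: invoke_one(payload), range(n)))

def summarize(responses, spike_ms=600):
    """Return one row per response: duration, cold-start flag, warm-spike flag.

    spike_ms is a hypothetical cut-off between the usual ~300 ms and the
    900 ms outliers described in the question.
    """
    rows = []
    for i, resp in enumerate(responses):
        stats = resp['body']['stats']
        duration = stats['finished'] - stats['started']
        rows.append({
            'index': i,
            'duration_ms': duration,
            'cold_start': stats['cold_start'],
            # Only flag slow invocations that were NOT cold starts.
            'warm_spike': (not stats['cold_start']) and duration >= spike_ms,
        })
    return rows
```

In a real setup `invoke_one` would wrap whatever client call is used to invoke the function; passing it in keeps the sketch independent of that choice.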

What could be the explanation for these spikes? With a warm start, what can cause a Lambda function to take much longer than usual?

P.S. The same question is also posted on Stack Overflow.

asked 3 years ago · 574 views

1 Answer
Accepted Answer

It's probably Python garbage collection. On a warm start the container is reused, so over a series of invocations the garbage collector is likely to kick in at some point and make that invocation take longer. I was at a presentation last night about using Rust in Lambda where graphs showed exactly this, comparing Lambda execution times of Rust against a garbage-collected language. In that case the other language was TypeScript, and there was a spike in execution time every 2 minutes due to the garbage collector.
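A rough way to check whether a slow invocation coincides with a collection is to time collections through `gc.callbacks` from the standard library. This is only a sketch: the names `gc_pauses`, `_track_gc` and `run_with_gc_stats` are made up here, and it observes the cyclic collector only, not ordinary reference-count deallocation.

```python
import gc
import time

gc_pauses = []    # one dict per completed collection
_gc_started = {}  # generation -> start time of the collection in progress

def _track_gc(phase, info):
    """gc callback: record how long each collection takes."""
    gen = info['generation']
    if phase == 'start':
        _gc_started[gen] = time.perf_counter()
    elif phase == 'stop' and gen in _gc_started:
        elapsed_ms = (time.perf_counter() - _gc_started.pop(gen)) * 1000
        gc_pauses.append({
            'generation': gen,
            'ms': elapsed_ms,
            'collected': info['collected'],
        })

gc.callbacks.append(_track_gc)

def run_with_gc_stats(fn, *args):
    """Run fn(*args) and return its result plus the GC pauses seen while it ran."""
    before = len(gc_pauses)
    result = fn(*args)
    return result, gc_pauses[before:]
```

Wrapping the `compute(event['body'])` call this way and adding the pauses to `stats` would show whether the 900 ms invocations contain a long collection.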

answered 3 years ago

  • AlwaysLearning
    3 years ago

    Disabling automatic garbage collection with gc.disable() helped! But can you explain why this almost always happened in the first invocation?
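As a sketch of what the workaround might look like in code: automatic collection is switched off once at module load, and a collection is run explicitly at a predictable point instead. The `handle` wrapper is hypothetical, and the `gc.freeze()` call (Python 3.7+) is an optional addition not mentioned in the thread; whether it helps for this workload is an assumption, not something measured here.

```python
import gc

gc.disable()   # no more automatic, threshold-triggered collections
gc.collect()   # clear garbage left over from initialization
gc.freeze()    # assumption: keep objects alive now (e.g. loaded models)
               # out of all later collections

def handle(compute, body):
    """Run compute(body) with automatic GC off, then collect explicitly."""
    try:
        return compute(body)
    finally:
        # Runs before the response is returned, so it still counts toward
        # the invocation's duration, but at a known point rather than in
        # the middle of inference.
        gc.collect()
```

In the handler from the question this would replace the direct call with `handle(compute, event['body'])`. With collection fully disabled and never run manually, objects caught in reference cycles are not reclaimed, so memory use is worth watching.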