Waiting for an OpenSearch Snapshot • Jamie Phillips

I ran into an interesting challenge while doing some automation tasks. Taking manual snapshots of OpenSearch is an interesting challenge that I just didn’t expect. Everything isn’t very clear and I learned a lot figuring this out. Given that it was more challenging than expected I thought I would create a post to share what I discovered for future reference for myself and everyone else that will come across this post.

The Problem

There were additional steps that needed to happen after taking a manual snapshot which created an interesting problem. You make a manual snapshot request, then the snapshot needs to happen which can happen quickly, or it can take several minutes depending on the index and other conditions. The additional steps that I needed to perform only needed to occur after the snapshot had been completed. What I needed to do was wait for the snapshot to be complete before proceeding.

Existing Code Samples

Amazon and OpenSearch have good code samples to get you started using both CLI tools like cURL and Python. Here is one of the examples for taking a manual snapshot. Here are the links to the docs:

I have included a basic snippet derived from the docs above showing how to initiate the snapshot from Python:

import boto3
import requests
from requests_aws4auth import AWS4Auth

host = ''
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, "us-west-1", "es", session_token=credentials.token)

url = f"{host}/_snapshot/<repository>/<snapshot>"

response = requests.put(url, auth=awsauth)
print(response.text)

The above code does indeed create the snapshot. However, the snapshot isn’t complete before the response has been returned. If you want to perform additional processing then you have to take additional steps which we will review below.

Waiting for a manual snapshot

There are two options that I have found on how to make sure that the snapshot is complete before you take any additional steps. The first option is to poll an endpoint checking the status and the second option is to not submit the snapshot request as an asynchronous request. Either option solves the problem and it really depends on your needs and how long a snapshot takes to complete. Let’s go over these solutions.

Checking the Snapshot Status

This is the first option that I mentioned. This is where we poll the status endpoint checking the state of the status. The docs state that the following statuses are possible:

SUCCESS
IN_PROGRESS
PARTIAL
FAILED

We can make a call to _snapshot/<repository>/<snapshot>/_status to get the status. The Python for that would look like this:

import boto3
import requests
from requests_aws4auth import AWS4Auth

host = ''
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, "us-west-1", "es", session_token=credentials.token)

url = f"{host}/_snapshot/<repository>/<snapshot>"

response = requests.put(f"{url}?wait_for_completion=true", auth=awsauth)

if response.status_code != 200:
    return

status_url = f"{host}/_snapshot/<repository>/<snapshot>/_status"
snapshot_response = requests.get(status_url, auth=awsauth)
snapshot_status = snapshot_response.json()
current_state = snapshot_status["state"]

while snapshot_status.state == "IN_PROGRESS"
    status_resposne = requests.get(status_url)
    status = status_resposne.json()
    current_state = status["state"]
    

if current_state == "SUCCESS":
    print("Snapshot complete!")
else:
    print(f"Snapshot failed: {current_state}")

That is a rough implementation of how to check for the status. Checking for status is probably the correct solution if you have snapshots that take a long time.

Waiting during the HTTP Request

This is the second option where instead of submitting an asynchronous API request, we add the query parameter wait_for_completion=true which will make the HTTP call a synchronous request which will not return until the snapshot has been completed. Let’s see what that looks like in Python.

import boto3
import requests
from requests_aws4auth import AWS4Auth

host = ''
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, "us-west-1", "es", session_token=credentials.token)

url = f"{host}/_snapshot/<repository>/<snapshot>"

response = requests.put(f"{url}?wait_for_completion=true", auth=awsauth)
print(response.text)

Not much of a change, but now the request is blocking so you know when it completes that the snapshot is complete.

Wrapping Up

I hope you found this helpful and it saves you a little time. I know that I will be referencing it in the future as I’m sure I will run across this scenario again.

Thanks for reading,

Jamie

If you enjoy the content, then consider buying me a coffee.