How to use your Gremlin reliability score in Jenkins to ensure reliable releases

Introduction

Adding Gremlin to your CI/CD pipeline is a key step in automating your reliability efforts. We previously wrote a tutorial on how to run a Chaos Engineering experiment as part of a Jenkins pipeline. The result ran a chaos experiment every time you deployed your code to a test environment. But this approach has a limitation: you have to either wait for the test to finish and check the results programmatically, or allow the build process to continue regardless of the results.

This tutorial expands on the previous one by using the Gremlin reliability score, which is a more proactive indicator of reliability. A reliability score is calculated by running a series of experiments (called a Test Suite). The main benefits are:

  • We can run these experiments at any time, not just at deployment time.
  • The score is standardized across all services, so we can set a single minimum score to apply to all services.

In this tutorial, we'll create a complete Jenkins pipeline that checks a service's reliability score using the Gremlin API. We'll compare the score against a required minimum score, and if the service passes, we'll promote it to production. You'll learn how to create API keys in Gremlin and use the Gremlin API. And while this tutorial uses code specific to Jenkins, you can use the same concepts with any CI/CD tool.

Overview

This tutorial will show you how to:

  • Use the Gremlin REST API
  • Create a Jenkins Pipeline using Groovy
  • Check and compare a service's reliability score using the Gremlin API and Groovy

Prerequisites

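Before starting this tutorial, you’ll need the following:

  • A Gremlin account with access to the Gremlin web app
  • A service defined in Gremlin that has a reliability score
  • A Jenkins instance where you can create Pipeline jobs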

Step 1 - Download the Jenkins pipeline template

The first step is to define the Jenkins pipeline. We already wrote a simple Groovy file that you can download from GitHub. Copy and paste the contents of the file to your computer, or use the "Download raw file" button. Alternatively, you can copy the file contents from the code block below:

groovy
/*
This Pipeline example demonstrates how to use the Gremlin API to check the Gremlin Score of a service
before promoting it to production. The Gremlin Score is a measure of the reliability of a service.
If the Gremlin Score is less than the value set, the pipeline will fail and the service will not be promoted to production.
 */

pipeline {
    agent any

    stages {
        stage('Check Gremlin Score') {
            steps {
                script {
                    def serviceId = 'Replace with your service ID'
                    def teamId = 'Replace with your team ID'
                    def apiUrl = "https://api.gremlin.com/v1/services/${serviceId}/score?teamId=${teamId}"
                    def apiToken = 'Bearer Replace with your Bearer token or API token'
                    def minScore = 80.0 // Replace with your minimum Gremlin Score

                    // First call: returnStatus gives curl's exit code, so this only checks that curl itself ran successfully
                    def response = sh(script: "curl -s -X GET '${apiUrl}' -H 'Authorization: ${apiToken}' -H 'accept: application/json'", returnStatus: true)

                    if (response != 0) {
                        error("API call to Gremlin failed with curl exit code: ${response}")
                    } else {
                        // Second call: capture the response body
                        def apiResponse = sh(script: "curl -s -X GET '${apiUrl}' -H 'Authorization: ${apiToken}' -H 'accept: application/json'", returnStdout: true).trim()

                        echo "API Response: ${apiResponse}" // Debug logging

                        // Attempt to capture numbers using a permissive regex
                        def scoreMatches = (apiResponse =~ /(\d+(\.\d+)?)/)

                        if (scoreMatches) {
                            def score = null

                            for (match in scoreMatches) {
                                def potentialScore = match[0]
                                try {
                                    score = Float.parseFloat(potentialScore)
                                    break
                                } catch (NumberFormatException e) {
                                    // Continue searching for a valid score
                                }
                            }

                            if (score != null) {
                                echo "Gremlin Score: ${score}" // Debug logging

                                if (score < minScore) {
                                    error("Gremlin Score ${score} is less than defined ${minScore}. Cannot promote to production.")
                                }
                            }
                        } else {
                            echo "No valid score found in API response." // Debug logging
                            error("Unable to extract Gremlin Score from the API response.")
                        }
                    }
                }
            }
        }

        stage('Promote to Production') {
            steps {
                // Add the steps to promote to production here
                // This could involve deployment and other production-related tasks
                // You can replace this comment with the actual steps for your deployment process
                echo "Promoting to production..."
            }
        }
    }

    post {
        failure {
            echo "The pipeline has failed. Not promoting to production."
        }
        success {
            echo "The pipeline has succeeded. Promoting to production."
        }
    }
}
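
The score extraction in the pipeline above can be tried outside Jenkins. The following is a minimal sketch in plain Groovy, assuming (as the template's regex does) that the first number in the response body is the score. The name extractScore is a hypothetical helper for illustration, not part of Gremlin or Jenkins.

```groovy
// Hypothetical helper: returns the first number found in the response body,
// or null when the body is empty or contains no number at all.
Float extractScore(String body) {
    if (body == null) {
        return null
    }
    def matcher = (body =~ /\d+(\.\d+)?/)
    return matcher.find() ? Float.parseFloat(matcher.group()) : null
}

println extractScore('87.5')
println extractScore('no score here')
```

Because the regex accepts the first number it sees, an error body that happens to contain a number would also be read as a score, so it is worth checking what the API returns for your own service before relying on this approach.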

Step 2 - Create a Gremlin API key and add it to the file

In order to use Gremlin's REST API, we need to add our authentication details to the script. You'll need two things:

  • A Gremlin API key (you can create a new one or reuse an existing one), and
  • Your Gremlin team ID (we'll retrieve this in Step 3).

To get the API key:

  • Log into the Gremlin web app if you haven't yet.
  • Open your account settings by clicking the user icon in the top-right corner of the page and selecting Account Settings.
  • Click API Keys. If you already have an API key you want to reuse, simply click the Copy icon next to the key name.
  • If you want to create a new API key, click New API Key.
    • Enter a name for the API key, such as "Jenkins score check". You can also enter a more detailed description, but this is optional.
    • Click Save, and copy the API key that appears in the modal.

Once you have the API key, paste it into the following line in the releasePipeline.groovy file:

groovy
def apiToken = 'Bearer Replace with your Bearer token or API token'

Save the file.

Step 3 - Retrieve your Gremlin team ID and service ID

You'll need two additional pieces of information from Gremlin: your team ID and the service ID. The team ID is the unique ID for your Gremlin team, and the service ID is the unique ID of the service you want to check the score for.

We'll start with the team ID. To get the team ID, look in the bottom-left corner of the Gremlin web app. You'll see your name, and underneath that, your team name. Click the icon next to the team name to copy your team ID to your clipboard. From there, open your releasePipeline.groovy file and paste it in the following line:

groovy
def teamId = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'

For the service ID:

  • Find the service in the Services list, then click on it to see its overview page.
  • Click Settings at the top of the page.
  • Under Details, look for the Service ID box. Highlight and copy the ID shown in the box, or click the icon at the right side to copy the ID directly to your clipboard.
  • Paste the ID on the following line:

groovy
def serviceId = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'

Save the file.
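
It is easy to run the pipeline with one of the placeholder values still in place. As a rough sketch, assuming the placeholders keep the template's "Replace with" wording, a small guard could build the score URL and fail early when an ID has not been filled in. buildScoreUrl is a hypothetical helper name; the URL pattern is the one used in the template.

```groovy
// Hypothetical helper: builds the score URL used in the template and rejects
// values that are empty or still contain the template's placeholder text.
String buildScoreUrl(String serviceId, String teamId) {
    [serviceId: serviceId, teamId: teamId].each { name, value ->
        if (!value || value.startsWith('Replace with')) {
            throw new IllegalArgumentException("${name} has not been set")
        }
    }
    return "https://api.gremlin.com/v1/services/${serviceId}/score?teamId=${teamId}"
}

println buildScoreUrl('svc-123', 'team-456')
```

The IDs svc-123 and team-456 are made-up values for illustration; real IDs come from the Gremlin web app as described above.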

Step 4 - Add the Groovy file to your Jenkins pipeline

In this step, we'll create a pipeline using our Groovy file. But before we do, there's one last tweak we need to make: we need to set the score threshold.

The score threshold is the minimum reliability score the service must have before it can deploy to production. This is defined in the minScore variable. In the sample file, we set minScore = 80.0, which means the service must have a score of at least 80% to deploy. Anything below this score will stop the pipeline and raise an error. You can change this threshold to any value between 0 and 100 by editing this line:

groovy
def minScore = 80.0 // Replace with your minimum Gremlin Score
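
The comparison itself is simple enough to sketch in plain Groovy. The example below assumes scores and thresholds are on the 0 to 100 scale described above; meetsThreshold is a hypothetical helper name used only for illustration.

```groovy
// Hypothetical helper mirroring the pipeline's comparison: a service passes
// when its score is greater than or equal to the threshold.
boolean meetsThreshold(double score, double minScore) {
    if (minScore < 0 || minScore > 100) {
        throw new IllegalArgumentException("minScore must be between 0 and 100, got ${minScore}")
    }
    return score >= minScore
}

println meetsThreshold(80.0d, 80.0d)
println meetsThreshold(79.9d, 80.0d)
```

A score exactly equal to the threshold passes, which matches the template's check of failing only when the score is less than minScore.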

Now we're ready to add this file to our Jenkins Pipeline. To do this:

  • Open your Jenkins web application.
  • From the Dashboard, click on New Item.
  • Enter a name for the pipeline (e.g. "[service name]-gremlin-release-gate").
  • Select Pipeline as the type, then click OK.
  • Click the Pipeline tab at the top of the page to scroll down to the Pipeline section.
  • Enter the contents of the Groovy file in the Script text area.
  • Click Save.

Step 5 - Run your Jenkins pipeline

After you click Save in the previous step, click Build Now to run the pipeline. Jenkins will retrieve the service's score from Gremlin and check whether it is greater than or equal to minScore. If so, it will mark the build as successful. Otherwise, it will mark it as failed.

From here, you can make changes to better integrate the pipeline into your build process. Instead of hard-coding values like your service ID, use environment variables so you can pass different IDs for each service, and use Jenkins credentials to store your Gremlin API key.
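
One possible shape for this, as a sketch rather than a definitive setup: the fragment below uses build parameters for the IDs and the declarative credentials() helper for the key. The parameter names SERVICE_ID and TEAM_ID, the variable GREMLIN_API_KEY, and the credential ID gremlin-api-key are all assumptions; you would need to create a "Secret text" credential with that ID in Jenkins first.

```groovy
pipeline {
    agent any

    parameters {
        // Hypothetical parameter names; pass different IDs for each service.
        string(name: 'SERVICE_ID', defaultValue: '', description: 'Gremlin service ID')
        string(name: 'TEAM_ID', defaultValue: '', description: 'Gremlin team ID')
    }

    environment {
        // 'gremlin-api-key' is an assumed ID of a "Secret text" credential.
        GREMLIN_API_KEY = credentials('gremlin-api-key')
    }

    stages {
        stage('Check Gremlin Score') {
            steps {
                script {
                    // Single-quoted Groovy string: the shell, not Groovy, expands
                    // the variables, so the key is not interpolated into the script text.
                    def apiResponse = sh(
                        script: 'curl -s -X GET "https://api.gremlin.com/v1/services/$SERVICE_ID/score?teamId=$TEAM_ID" -H "Authorization: Bearer $GREMLIN_API_KEY" -H "accept: application/json"',
                        returnStdout: true
                    ).trim()
                    echo "API Response: ${apiResponse}"
                }
            }
        }
    }
}
```

The score parsing and threshold check from the original template would follow the echo step unchanged.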

We've also included a section in the Groovy script where you can enter commands for deploying your service to production. This runs immediately after Jenkins compares the service's reliability score against minScore:

groovy
stage('Promote to Production') {
    steps {
        // Add the steps to promote to production here
        // This could involve deployment and other production-related tasks
        // You can replace this comment with the actual steps for your deployment process
        echo "Promoting to production..."
    }
}

Lastly, you can change the "failure" condition to perform other steps, such as notifying the service's owner by sending an email or calling a service like PagerDuty. You can also track the status of your builds by integrating with a monitoring tool like Datadog and alert on failed builds that way.
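
As an illustration of the email option, the post section might look like the sketch below. It assumes Jenkins has a working SMTP configuration and uses the built-in mail step; the recipient address is a placeholder, and the subject and body wording are only examples.

```groovy
post {
    failure {
        // Assumes an SMTP server is configured in Jenkins; the address is a placeholder.
        mail to: 'service-owner@example.com',
             subject: "Reliability gate failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
             body: "The Gremlin score check failed. See ${env.BUILD_URL} for details."
    }
    success {
        echo "The pipeline has succeeded."
    }
}
```

This block replaces the post section at the end of the template. Integrations such as PagerDuty or Datadog typically need their own Jenkins plugins or API calls, which are not shown here.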

Conclusion

Congratulations on setting up a reliability gate in Jenkins! This will ensure that your service only gets pushed to production if it meets your minimum reliability score.

To ensure your scores stay up to date, make sure to autoschedule reliability tests on your service to run at least once a week. Going longer than one week without re-running a test will cause that test to expire, reducing your score. Remember that you can also use the Run All button to re-run all of the service's tests and regenerate its score.
