Highlight Chaos Engineering experiments with AppDynamics and Gremlin webhooks

Categories: Chaos Engineering

Chaos Engineering with Gremlin is a powerful way to tune your monitoring to ensure you are gathering actionable data and to train your teams to leverage these tools so that observability expertise isn't siloed in your organization. AppDynamics is an application performance management tool used by companies worldwide to monitor their workloads. Used in combination, these two tools can help you lower your mean time to detection (MTTD) and increase the availability of your applications. This tutorial will walk you through how you can correlate an attack from Gremlin to its impact in AppDynamics.

Prerequisites

Step 1: Make a custom event role in AppDynamics

First, add a role that lets Gremlin post custom events to AppDynamics. In AppDynamics, go to Settings -> Administration.

Select Roles -> Create and enter the Name Gremlin Events. Then select Applications, check “View”, click “Edit”, and check “Create Events”.

Click “Save”.

Step 2: Make a custom event user

Now you need a user that has the Gremlin Events role. Go to Users, set Display Users from to “AppDynamics”, and click “Create”.

Enter Username: gremlin, Email: {your_email}, Name: Gremlin Events, Password: {password}.

Under Roles, add “Gremlin Events”.

Click “Save”.

Step 3: Gather your account name and app ID

Next, gather the account name and application ID that make up the login and the endpoint for posting events to AppDynamics. In AppDynamics, go to Settings -> License.

Go to Account. Take note of your Account Name next to “Name”; you’ll use it in Step 4.

Open the Applications dashboard and select the application you wish to experiment on. In the URL, you’ll find application={app_id}. In my case, the value is 10691. Note that app_id number for Step 5.
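If you would rather not pick the number out of the address bar by eye, the lookup can be sketched in a few lines of Python. This is a minimal sketch: the function name extract_app_id and the example URL are hypothetical, and the exact shape of your controller's dashboard URL may differ. The only thing taken from this tutorial is that the URL contains application={app_id}.

```python
import re


def extract_app_id(url: str) -> str:
    """Return the digits that follow 'application=' in a dashboard URL."""
    # The parameter may follow '?', '&', '#', or '/' depending on the URL shape.
    match = re.search(r"[?&#/]application=(\d+)", url)
    if match is None:
        raise ValueError("no application=<id> parameter found in URL")
    return match.group(1)


# Hypothetical dashboard URL; copy yours from the browser address bar.
example_url = (
    "https://example.saas.appdynamics.com/controller/"
    "#/location=APP_DASHBOARD&application=10691&dashboardMode=force"
)
print(extract_app_id(example_url))
```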

Step 4: Encode your login key

Now that you’ve gathered that information, you’ll need to encode it for the Authorization header in the webhook. In your terminal (Mac or Linux) enter:

echo -n 'gremlin@{account_name}:{password}' | base64

or, in PowerShell (Windows), enter:

[Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes('gremlin@{account_name}:{password}'))

Save that output for the next step.
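The same encoding can be done on any platform with Python's standard library. The sketch below assumes the login format used above (username@account_name:password). The helper names and the sample account name and password are hypothetical placeholders, not real values.

```python
import base64


def encode_login_key(account_name: str, password: str, username: str = "gremlin") -> str:
    """Base64-encode 'username@account_name:password' for HTTP Basic auth."""
    credentials = f"{username}@{account_name}:{password}"
    return base64.b64encode(credentials.encode("utf-8")).decode("ascii")


def authorization_header_value(encoded_key: str) -> str:
    """Build the value that goes in the webhook's Authorization header."""
    return f"Basic {encoded_key}"


# Placeholder values; substitute your own Account Name and password.
key = encode_login_key("customer1", "s3cret")
print(authorization_header_value(key))
```

Decoding the key with base64.b64decode should give back the original login string, which is a quick way to check for stray whitespace or quoting mistakes.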

Step 5: Create 2 Gremlin webhooks

The next step is to create two webhooks: one for when the Gremlin attack starts and one for when it finishes. In Gremlin, go to Settings -> Team Settings.

Select Webhooks -> New Webhook. Enter the Name AppD Basic Webhook Start, a Description, and the following URL, substituting your own controller’s address and the app_id from Step 3:

https://{controller_address}/controller/rest/applications/{app_id}/events?severity=INFO&summary=gremlinStart&eventtype=CUSTOM&customeventtype=gremlinStart

Then add a header with the key Authorization and, as its value, the encoded key from Step 4:

Basic {your_encoded_key}

Finally, select the “Attack Running” event and click Save.

Add a second webhook with the Name AppD Basic Webhook Finish, a Description, and the following URL, again substituting your own controller’s address and app_id:

https://{controller_address}/controller/rest/applications/{app_id}/events?severity=INFO&summary=gremlinFinish&eventtype=CUSTOM&customeventtype=gremlinFinish

Then add a header with the key Authorization and, as its value, the encoded key from Step 4:

Basic {your_encoded_key}

Finally, select the “Attack Finished” event and click Save.
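Before relying on the webhooks, it can help to assemble the same request by hand and confirm that the URL and header are well formed. The sketch below only builds the request and leaves the line that sends it commented out. The function names and the example controller address are hypothetical. The query parameters mirror the two URLs above, and POST is assumed as the method for creating an event.

```python
from urllib.parse import urlencode
from urllib.request import Request


def custom_event_url(controller_address: str, app_id: int, event_name: str,
                     severity: str = "INFO") -> str:
    """Build the custom-event URL used by the Gremlin webhooks."""
    query = urlencode({
        "severity": severity,
        "summary": event_name,
        "eventtype": "CUSTOM",
        "customeventtype": event_name,
    })
    return (f"https://{controller_address}/controller/rest/"
            f"applications/{app_id}/events?{query}")


def build_event_request(url: str, encoded_key: str) -> Request:
    """Attach the Basic Authorization header; the request is not sent here."""
    return Request(url, method="POST",
                   headers={"Authorization": f"Basic {encoded_key}"})


# Placeholder controller address, app_id, and key.
start_url = custom_event_url("example.saas.appdynamics.com", 10691, "gremlinStart")
request = build_event_request(start_url, "your_encoded_key")
print(request.get_method(), request.full_url)
# To actually post the event: urllib.request.urlopen(request)
```

If posting by hand succeeds but the webhook does not, the difference is most likely in the header value or the selected Gremlin event rather than in the URL.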

Step 6: Create a dashboard in AppDynamics

You’ll need a way to visualize the attack. In AppDynamics, go to Dashboards & Reports -> Create Dashboard. Enter the Name Gremlin Attack Dashboard.

Click “Add a Widget” -> “Time Series Graph” and click the + sign under Data. Under “Select Data Source” select “Servers” and under “Select a Metric” select “Hardware Resources|CPU|%Busy” and Save.

Under Events, select “Show Events” and, for the Data Source, select the application whose app_id you noted in Step 3. Under “Filter Criteria”, unselect all items and select “Custom”. Click Save.

Click “Add Widget” again and select “Health Rules & Events” then “Event List”.

Under Events, select Show As “Timeline” and set the Data Source to the application you chose in Step 3. Under “Filter Criteria”, unselect all items and select “Custom”. Click Save.

Your simple dashboard is all set up.

Step 7: Run a CPU attack

In Gremlin, go to Attacks -> New Attack. Select the host(s) where you have the AppDynamics agent(s) installed. Select “Choose a Gremlin” and Resource -> CPU. Set the length to 300 seconds, CPU Capacity of 60%, and All Cores.

Click “Unleash Gremlin” and head back over to your AppDynamics dashboard. There you can see the event marking where the attack started, followed by the CPU spike, and then the event marking where the attack finished, followed by the CPU winding down.
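If the start and finish markers do not show up on the dashboard, one way to check whether the webhooks reached the controller is to list recent custom events over the same REST endpoint. This is a sketch under assumptions: the function name is hypothetical, and the query parameter names (time-range-type, duration-in-mins, event-types, severities, output) are taken from memory of the AppDynamics events API and should be confirmed against the documentation for your controller version. The code only builds the URL. Fetch it with a GET request and the same Authorization header used by the webhooks.

```python
from urllib.parse import urlencode


def recent_custom_events_url(controller_address: str, app_id: int,
                             minutes: int = 15) -> str:
    """Build a URL that lists INFO-severity custom events from the last N minutes."""
    query = urlencode({
        "time-range-type": "BEFORE_NOW",   # assumed parameter name
        "duration-in-mins": minutes,       # assumed parameter name
        "event-types": "CUSTOM",
        "severities": "INFO",
        "output": "JSON",
    })
    return (f"https://{controller_address}/controller/rest/"
            f"applications/{app_id}/events?{query}")


# Placeholder controller address and app_id.
print(recent_custom_events_url("example.saas.appdynamics.com", 10691))
```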

Conclusion

The CPU attack is a great first attack, but with Gremlin and AppDynamics together you can run many more experiments, such as injecting a small amount of backend latency and tracing how it affects front-end latency to check whether delays compound as they propagate. You can also use Gremlin to test your thresholds and tune your AppDynamics alerting to prevent noisy alerts: fire up an attack and make sure your alerts fire at the appropriate time. Target random hosts to make sure you cover your whole application.

We look forward to seeing what you build!
