How to implement an automated MS Sentinel content checker in a CI/CD pipeline

Introduction

In my previous post I briefly demonstrated how to parse KQL using Python and C#. Since then I have released a CLI tool called prevalidate, which performs SIEM content validation in your CI/CD pipeline so you can make sure everything works before you deploy your queries. For now it only includes full syntax validation for Microsoft Sentinel.

Config validation using Pydantic

Pydantic is a popular data validation framework and is also used by FastAPI for request/response checking.

Data validation and settings management using Python type annotations. pydantic enforces type hints at runtime, and provides user friendly errors when data is invalid. Define how data should be in pure, canonical Python; validate it with pydantic.

Prevalidate uses Pydantic to validate analytics rules stored in the YAML-based format used in the Microsoft Sentinel repository. To start, prevalidate contains a Detection model specifying which fields need to be present in the configuration file.

Using Python’s type hints we define the structure of a Microsoft Sentinel detection configuration file. A field without a default value is required. When prevalidate opens a file, it first checks whether the content matches the Detection model. If it does, the detection is added to a list of SentinelDetections.
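To make the idea concrete, here is a minimal sketch of what such a model could look like. The field selection is an assumption based on the public Sentinel rule format and is not copied from prevalidate; load_detections is a hypothetical helper, and the rules are passed in as dictionaries that have already been parsed from YAML.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Detection(BaseModel):
    # Assumed subset of fields; no default value means the field is required.
    id: str
    name: str
    severity: str
    query: str
    tactics: List[str]
    # A default value makes the field optional.
    description: Optional[str] = None


def load_detections(raw_rules: List[dict]) -> List[Detection]:
    """Hypothetical helper: keep rules that match the model, report the rest."""
    detections = []
    for raw in raw_rules:
        try:
            detections.append(Detection(**raw))
        except ValidationError as exc:
            rule_name = raw.get("name", "<unnamed>")
            print(f"invalid rule {rule_name}: {len(exc.errors())} error(s)")
    return detections
```

A rule that lacks one of the required fields raises a ValidationError and is reported instead of being added to the list.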

KQL Syntax checking

The KQL class inherits from the previously shown Detection model and adds extra features for validating our detection content. Specifically, it adds a validate method which checks the configuration file’s query field (the part containing the KQL body) for syntax errors.

When running prevalidate, each configuration file is checked against the Pydantic models, and the validate method returns any KQL syntax errors. The validate method also expects a workspace as input. The workspace instance contains the available functions (i.e. ASIM parsers) and the tables/fields which are configured in your Sentinel environment.
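The relationship between the model, the workspace and the check can be sketched roughly as follows. This is not prevalidate's implementation: the Workspace class and its attributes are hypothetical, the check is a toy lookup of the query's source table that stands in for a real KQL parser, and the method is named check_query here only to avoid clashing with Pydantic's own validate classmethod.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

from pydantic import BaseModel


@dataclass
class Workspace:
    """Hypothetical container for what a Sentinel workspace exposes."""
    tables: Dict[str, Set[str]] = field(default_factory=dict)  # table -> columns
    functions: Set[str] = field(default_factory=set)           # e.g. ASIM parsers


class Detection(BaseModel):
    # Reduced to two fields to keep the sketch short.
    name: str
    query: str


class KQL(Detection):
    def check_query(self, workspace: Workspace) -> List[str]:
        """Return error messages; an empty list means nothing was found."""
        # Toy check only: treat the text before the first pipe as the source
        # and verify that it is a known table or function.
        source = self.query.split("|", 1)[0].strip()
        if source in workspace.tables or source in workspace.functions:
            return []
        return [f"{self.name}: unknown table or function '{source}'"]
```

For example, a query starting with a misspelled table name yields one error message, while a query on a table present in the workspace yields an empty list.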

Prevalidate contains a sync feature to pull these details via the Log Analytics API or you can fetch a copy of all the defaults from my repo. More background info on how the syntax checking works can be found in my previous article.

Running tests

The workspace configuration and syntax checking mentioned previously are wrapped inside two (typer) CLI commands: prevalidate sentinel sync and prevalidate sentinel test. The sync command pulls the functions/tables present in your Sentinel environment and saves them in the specified directory.

The test command validates the detection content stored inside a directory against the workspace schema and saves the output in a JUnit XML file. Help output is available for both commands to show which arguments are expected.

The test command executes pytest and writes the parsing/syntax errors to a file called results.xml. This file uses the JUnit XML format, which can be read by services such as Azure DevOps and GitHub in order to create policies that prevent failing content from being merged into branches.
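If you want to inspect such a report yourself, the JUnit XML format is simple to read with Python's standard library. The sketch below is an illustration only: failed_cases is a hypothetical helper and SAMPLE is a made-up report in the general shape pytest produces, not real prevalidate output.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def failed_cases(junit_xml: str) -> Tuple[int, List[str]]:
    """Return the number of test cases and the names of those that failed."""
    root = ET.fromstring(junit_xml)
    total = 0
    failed = []
    for case in root.iter("testcase"):
        total += 1
        # A failed or errored test case carries a <failure> or <error> child.
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(case.get("name", ""))
    return total, failed


# Made-up sample report, used only to demonstrate the helper.
SAMPLE = """<testsuites>
  <testsuite name="pytest" tests="2" failures="1" errors="0">
    <testcase classname="tests" name="rule_a.yaml"/>
    <testcase classname="tests" name="rule_b.yaml">
      <failure message="syntax error">details</failure>
    </testcase>
  </testsuite>
</testsuites>"""

total, failed = failed_cases(SAMPLE)
print(f"{len(failed)} of {total} test cases failed: {failed}")
```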

The pytest framework makes it easy to write small, readable tests, and can scale to support complex functional testing for applications and libraries.

CI/CD integration

Now comes the important bit. I created the sentinel-cicd repository on GitHub, which contains an example workflow, an export of Sentinel’s default functions/tables and a directory with default detections. Essentially only three steps are needed:

  • Install the Python dependencies (i.e. pip install prevalidate);
  • Run the unit tests with the correct arguments, where the first argument points to the directory containing the detection content and the second argument points to the directory containing the Sentinel functions and tables;
  • Upload the results stored in the JUnit XML file using Enrico’s GitHub action.

When a new commit is pushed, the pipeline runs and shows any config and/or KQL syntax errors in the Actions tab.

And that’s it for now. I’ll be adding more functionality and fixing some bugs I found in the meantime. If you have any feedback or suggestions, please open an issue on the project’s GitHub page.