How to parse KQL by combining Python and C#

Introduction

As part of a previous project I helped out with developing a CI/CD pipeline to validate and deploy Splunk apps using Gitlab. One part of this pipeline checked whether the configurations/application bundles contain specific errors using Splunk’s appinspect. I was pretty happy with the results, but what if you’re using a different SIEM like Microsoft Sentinel?

Unfortunately there is not a lot of documentation available to build a similar pipeline to validate Sentinel content locally and/or want to parse the KQL query itself.

What follows is a bit of a hack to use Microsoft’s KustoQueryLanguage code and merge it with a bit of Python using Python.NET to perform simple and local syntax validation on a raw KQL query. Since the script will run locally there is no context available about the available tables, watchlists and/or custom functions yet.

Requirements

  • Python version 3.8 or higher until 3.11
  • .NET (core) runtime 7
  • .NET SDK 7
  • Git
  • Any editor of your choice

Set up a Python dev environment

Set up a new virtual environment for Python using your favourite environment/version manager. I personally use pyenv in combination with Poetry to manage my Python projects.

Using your new virtual environment install Python.NET as a dependency (ex. pip install pythonnet).

Python.NET is a package that gives Python programmers nearly seamless integration with the .NET Common Language Runtime (CLR) and provides a powerful application scripting tool for .NET developers. It allows Python code to interact with the CLR, and may also be used to embed Python into a .NET application.

Clone/build the required DLL

Microsoft maintains a git repository containing all the necessary methods to parse KQL and perform syntax validation. The code is written in C# so we’ll use the .NET SDK to build a DLL which we can use to load in our Python application to parse queries later on.

Clone the KustoQueryLanguage repository and open the src/Kusto.Language directory on your prefered terminal. Execute dotnet build . to build the DLL.

The build should succeed and display the path where the DLL is stored. Copy the file to the same location as your previously created Python project.

Run the code

Create a new Python file (ex. main.py) and copy/paste the provided example.

The above code snippets uses Python.NET to reflectively load the Kusto.Language DLL. The sample query is parsed and checked for errors using the getDiagnostics method. If one or multiple errors are found, the severity, message, location and issue will be printed. The issue displays the actual part of the query that is deemed incorrect.

As you can see the GetDiagnostics method found 2 issues with our query since I mispelled the ‘sort’ operator as ‘sorty’.

Using the KustoCode.Parse method only checks for basic KQL syntax errors and not if refered tables and/or fields actually exist. This can be achieved with the ParseAndAnalyze method but you will need to supply an overview of the existing schema in the form of a GlobalState.

So how does the GlobalState work? The globals specify what tables and fields (+types) exist. If these are specified, the ParseAndAnalyze method will perform more in-depth analysis to check, for example, if specific operators will work on the specified field types. Lets take a look at some examples and add the additional code to run the ParseAndAnalyze method.

Example 1: ParseAndAnalyze without GlobalState

Running the ParseAndAnalyze method instead of the Parse method on our example query without a GlobalState specified now returns an additional error, specifying that the SigninLogs table does not exist.

Now lets add a GlobalState and specify what tables and fields exist.

Example 2: ParseAndAnalyze with missing fields and incorrect field type

The example above runs the ParseAndAnalyze method on our example query with a GlobalState specified but with an incorrect specification of the TimeGenerated field type (which should be of type datetime). This correctly returns an error that the >= operator can not be used between a string and datetimeobject (ago).

To fix all of the issues, lets do a final change and specify the correct fields and types.

Complete example: ParseAndAnalyze with a correct GlobalState

This last example correctly specifies the used tables, fields and types used and returns no output.

And thats all that’s for now! There is much more functionality to explore and the KustoQueryLanguage repository contains a lot more usefull methods apart from error checking / diagnostics for KQL.