NL Queries and Aliases

A Guide to using natural language with Selector


Natural language (NL) queries allow more free-form requests for information from the Selector S2 platform. Instead of the strict syntax of the Selector Software Query Language (S2QL), NL queries read more like ordinary questions.

So, instead of entering a complex S2QL string such as:

device:DeviceName in device_health as honeycomb where site=~CC00|CC-CITY1|SITE1|DEV1

you can enter an NL query in the GUI or in Slack:

GUI NL Command: device status

Slack NL Command: /select device status

Both NL commands return a honeycomb grid of device status that the user can drill into for further details.

Use of Aliases

NL processing works by mapping each NL query to one of a series of aliases that stand for the S2QL equivalent. Aliases are best thought of as “query shortcuts.” This process lets the NL process see that the queries “Where are we driving?” and “What place are we going in the car?” are very similar, so the LLM can map them to the same alias for S2QL, such as “destination in automobile.”

Any sentence can be mapped to any alias. However, users must still be aware of the role that aliases play. Because the process tries to map NL to aliases, the better the context provided, the easier it is for the LLM to find the closest alias to the NL query.

For example, suppose the following aliases have been stored:

bgp status
bgp health

If the user asks, “How is BGP doing?” the system has a hard time choosing the correct alias. The problem is that “status” and “health” are very similar words, so the system struggles to separate them. Therefore, when creating aliases, distinguish them with context that adds NL similarity and not NL ambiguity. In this case, better aliases would be:

bgp overall status
bgp device health

Once more context is added, the NL query “How is BGP doing?” would correctly map to bgp overall status.
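As a rough illustration of the naming problem above, the following sketch flags alias pairs that differ by a single word. It is only a crude word-level check written for this guide, not the similarity measure the Selector NL process uses; the function names `one_word_apart` and `ambiguous_pairs` are hypothetical.

```python
from itertools import combinations


def one_word_apart(alias_a: str, alias_b: str) -> bool:
    """Return True when two aliases have the same word count and differ in exactly one word."""
    words_a, words_b = alias_a.lower().split(), alias_b.lower().split()
    if len(words_a) != len(words_b):
        return False
    differences = sum(1 for a, b in zip(words_a, words_b) if a != b)
    return differences == 1


def ambiguous_pairs(aliases: list[str]) -> list[tuple[str, str]]:
    """List alias pairs separated by a single word; these deserve a second look."""
    return [(a, b) for a, b in combinations(aliases, 2) if one_word_apart(a, b)]


print(ambiguous_pairs(["bgp status", "bgp health"]))
print(ambiguous_pairs(["bgp overall status", "bgp device health"]))
```

The first call reports the `bgp status` / `bgp health` pair, while the second returns an empty list because the improved aliases differ in two words.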

Named Entity Recognition and Structured Labels

Named Entity Recognition (NER) is the NL feature that allows specific queries (such as “What is the status of device XTZ123?”) to be generalized (“What is the status of device yyy?”, where “yyy” can stand for any device).

The challenge here is to make NER aware of all possible device label replacements. The process starts with entity scraping, which takes place as the Selector software runs.

Entity Scraping, Enumerations, and Label Autopopulation

Entities for NL processing are structured labels and their corresponding values that are extracted from incoming deployment data. They serve as identifiable data points that can be referenced in structured queries.

Enumerations or Enums are a complete list of all possible values for a given label, as gathered by the Selector software as it runs.

Label Autopopulation, also called automatic label population, is the way the Selector software uses the label values gathered as it runs to fill in labels automatically.

Together, these three aspects of NER allow NL queries. Each of these concepts is described in this section.

Entities

Examples of entities include:

Region: USA
Router ID: XYZ123

These entities help users interact with the system dynamically by allowing queries to reference known data points.

How Does Entity Scraping Work?

The NL query process continuously “scrapes” incoming data from various sources to identify and store entities. This process is performed by periodic pollers that extract relevant information from:

Loki (log aggregation, etc.)
Prometheus (metrics monitoring, etc.)
MongoDB (database collections)
Dashboard variables

These pollers can be enabled or disabled based on deployment configurations. In MongoDB, specific collections can also be targeted for entity extraction.

Entity scraping settings, including poller activation and MongoDB collection targeting, can be adjusted in the configuration section of the deployment.
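The poller model described above can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `Poller` type, the `scrape_entities` function, the shape of the `enabled` map, and the canned data do not come from the Selector configuration, which is not shown in this guide.

```python
from collections import defaultdict
from typing import Callable, Iterable

# Hypothetical poller type: returns (label, value) pairs found in one data source.
Poller = Callable[[], Iterable[tuple[str, str]]]


def scrape_entities(pollers: dict[str, Poller], enabled: dict[str, bool]) -> dict[str, set[str]]:
    """Run every enabled poller and collect the values seen for each label."""
    entities: dict[str, set[str]] = defaultdict(set)
    for name, poll in pollers.items():
        if not enabled.get(name, False):
            continue  # disabled pollers contribute nothing
        for label, value in poll():
            entities[label].add(value)
    return dict(entities)


# Stand-in pollers with canned data; real pollers would query the data source.
pollers = {
    "loki": lambda: [("router_id", "XYZ123")],
    "prometheus": lambda: [("region", "USA"), ("router_id", "XYZ123")],
    "mongodb": lambda: [("s2_inst", "s2dev")],
}
enabled = {"loki": True, "prometheus": True, "mongodb": False}

print(scrape_entities(pollers, enabled))
```

Because the `mongodb` poller is disabled in this example, no `s2_inst` values are collected, and the duplicate `router_id` value reported by two sources is stored once.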

Enumerations

Enumerations (Enums) provide a complete list of all known values for a given label. For example, when querying enums for s2_inst, the system returns all detected instances, such as:

customer1-s2m
customer1-staging
s2dev

Enumerations Improve the Querying Experience

To illustrate enumeration use, consider the following querybuilder example:

[Figure: NL Querybuilder Example]

When a user selects a filter value for a label (such as ticker), the NL process automatically retrieves all known values for that label. This provides a list of valid options for the selection, enhancing usability and reducing errors.

While users can select from the suggested values, they can also enter a custom value if needed.
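The suggestion behavior can be sketched as a lookup into the stored enumerations. The `suggest_values` function and the prefix filtering are assumptions made for illustration; the guide does not say how the querybuilder narrows the list as the user types.

```python
def suggest_values(enums: dict[str, list[str]], label: str, typed: str = "") -> list[str]:
    """Return the known values for a label that start with the text typed so far."""
    return sorted(value for value in enums.get(label, []) if value.startswith(typed))


# Enumeration values taken from the s2_inst example in this guide.
enums = {"s2_inst": ["customer1-s2m", "customer1-staging", "s2dev"]}

print(suggest_values(enums, "s2_inst"))
print(suggest_values(enums, "s2_inst", "customer1"))
```

An empty result does not block the user in this sketch: a value that is not in the list can still be entered as a custom value.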

Label Autopopulation

Entity storage in Selector NL processing enables automatic label population in various features. This functionality is leveraged in two situations:

Maintenance Window (MW) creation (covered in this section)
Alias resolution (covered in the next section)

Autopopulation in Maintenance Window Creation

When creating a Maintenance Window (MW), users can enable label autopopulation.

Note that:

If enabled, when a user provides a description during MW creation, the Event Consolidator Service (ECS) queries the NL process for any detected entities.
If a match is found in the s2_entities table, the MW is automatically assigned a label.

For example, suppose you create a Maintenance Window with the description:

“Ticket opened for the machine 12345”

If “12345” is a recognized router_id in s2_entities, the NL process automatically assigns the MW label router_id = 12345.

This feature streamlines the MW creation process by ensuring labels are applied correctly without manual input.
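The matching step can be sketched as a lookup of each word of the description against the stored entity values. This is a simplified illustration, assuming `s2_entities` can be viewed as a mapping from label to known values; the `autopopulate_labels` function and the extra sample values are hypothetical, and the real exchange between ECS and the NL process is not shown here.

```python
import re


def autopopulate_labels(description: str, s2_entities: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (label, value) pairs for every known entity value found in the description."""
    tokens = set(re.findall(r"[\w-]+", description))
    labels = []
    for label, values in s2_entities.items():
        for value in sorted(values & tokens):  # values that appear as whole words
            labels.append((label, value))
    return labels


# Hypothetical entity store; only router_id 12345 comes from the example above.
s2_entities = {"router_id": {"12345", "67890"}, "region": {"USA"}}

print(autopopulate_labels("Ticket opened for the machine 12345", s2_entities))
```

A description that mentions no stored value yields an empty list, in which case no label would be assigned.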

Aliases and Alias Resolution

Aliases are a core feature of NL processing and enable you to create shorter, more human-readable, and easier-to-remember versions of complex S2QL queries. This simplifies query execution, making it more intuitive and user-friendly.

Example of a Basic Alias

Alias: all kpi violations
S2QL Equivalent: s2ap_infra_health_by_kpi as honeycomb where s2ap_infra_health_by_kpi_violation > 0 show-by kpi_name, s2_inst group-by role, kpi_name, s2_inst

Once this alias is created and saved, you can invoke the query by simply typing “all kpi violations” instead of manually entering the full S2QL query.

When a user submits a query, the Collab Service first checks for alias resolution. If an alias match is found, it is automatically converted to its S2QL equivalent before execution.
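A minimal sketch of this check-then-convert step is a dictionary lookup on a normalized phrase. The `normalize` and `resolve` helpers are hypothetical, the lookup here is exact (the lenient matching described later in this guide is not modeled), and passing unmatched input through unchanged is an assumption.

```python
def normalize(text: str) -> str:
    """Lowercase the text and collapse repeated whitespace."""
    return " ".join(text.lower().split())


# Alias table holding the basic alias from this guide.
ALIASES = {
    "all kpi violations": (
        "s2ap_infra_health_by_kpi as honeycomb "
        "where s2ap_infra_health_by_kpi_violation > 0 "
        "show-by kpi_name, s2_inst group-by role, kpi_name, s2_inst"
    ),
}


def resolve(query: str, aliases: dict[str, str]) -> str:
    """Return the S2QL for a matching alias, or the query unchanged if none matches."""
    return aliases.get(normalize(query), query)


print(resolve("All KPI  violations", ALIASES))
```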

Dynamic Alias Resolution

The Selector NL process supports dynamic alias resolution, allowing aliases to include template-like placeholders for entities. These placeholders enable flexible query execution based on user input.

Example of a Dynamic Alias

Alias: show kpi violations for {{S2_INST1}}
S2QL Equivalent: s2ap_infra_health_by_kpi as honeycomb where s2_inst={{S2_INST1}} show-by kpi_name

In this example, the string {{S2_INST1}} acts as a dynamic placeholder that represents an entity value.

How It Works:

The user types: show kpi violations for s2m
The NL process detects the S2_INST1 placeholder and substitutes s2m in its place.
The final S2QL query executed: s2ap_infra_health_by_kpi as honeycomb where s2_inst=s2m show-by kpi_name
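The steps above can be sketched with a regular expression built from the alias template. This is one possible way to implement placeholder substitution, not necessarily how the NL process does it; `alias_to_pattern` and `resolve_dynamic` are hypothetical names, and the sketch assumes each entity value is a single word without spaces.

```python
import re
from typing import Optional

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def alias_to_pattern(alias: str) -> re.Pattern:
    """Turn an alias template into a regex with one named group per placeholder."""
    parts = []
    position = 0
    for match in PLACEHOLDER.finditer(alias):
        parts.append(re.escape(alias[position:match.start()]))
        parts.append(f"(?P<{match.group(1)}>\\S+)")  # placeholder captures one word
        position = match.end()
    parts.append(re.escape(alias[position:]))
    return re.compile("".join(parts), re.IGNORECASE)


def resolve_dynamic(query: str, alias: str, s2ql: str) -> Optional[str]:
    """Fill the S2QL template from the query, or return None if the alias does not match."""
    found = alias_to_pattern(alias).fullmatch(query.strip())
    if found is None:
        return None
    return PLACEHOLDER.sub(lambda m: found.group(m.group(1)), s2ql)


alias = "show kpi violations for {{S2_INST1}}"
s2ql = "s2ap_infra_health_by_kpi as honeycomb where s2_inst={{S2_INST1}} show-by kpi_name"
print(resolve_dynamic("show kpi violations for s2m", alias, s2ql))
```

The printed query is the same final S2QL shown in the last step above.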

Accessing and Editing Aliases

Aliases are accessed on the UI through their settings:

[Figure: NL Accessing Aliases]

When selected, a list of aliases is displayed:

[Figure: NL Alias List]

Each alias entry contains the phrase (alias), metadata such as timestamp and creator, and the S2QL.

You can perform CRUD operations from this page. All CRUD operations go through the same endpoint.

One important field to note is the source field. Sources can be these types:

Users: an alias created by a user
Widgets: an alias created from a widget, using the widget name as the alias and the widget S2QL as the alias S2QL

Users

S2AP has an integration function called the Natural Language Phrases List where you can edit, add, and test all aliases. This list can be found on the right-hand side of the query bar at the top:

[Figure: NL Query Bar Alias List]

Through the Actions button you can add an alias like this:

[Figure: NL Phrases]

The first entry line is for the actual S2QL query. The second line gives the alias name to which the LLM maps.

Widgets

Selector uses widgets in dashboards to show all the necessary data for the customer and the deployment. Widgets form the core representation of what users want to query. For this reason, all widgets are added as aliases where the widget title is the alias name.

Because of this widget aliasing, it is important to establish a good naming convention for widgets to have a good representation of aliases. If a widget is added, edited, or deleted, the change is reflected shortly in the Natural Language Phrases List.

Note that widget aliases cannot be edited in the Natural Language Phrases List editor. To change an alias whose source is a widget, users must edit the widget itself.
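One way to picture the two alias sources and the read-only rule for widget aliases is a small record type. The `Alias` and `Source` classes, their field names, and the widget title below are hypothetical and are not the Selector data model.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    USER = "user"
    WIDGET = "widget"


@dataclass
class Alias:
    phrase: str
    s2ql: str
    source: Source
    creator: str = ""

    def rename(self, new_phrase: str) -> None:
        """Change the phrase, unless the alias mirrors a widget title."""
        if self.source is Source.WIDGET:
            raise PermissionError("edit the widget title instead")
        self.phrase = new_phrase


# Hypothetical widget title; the S2QL string is the device events example from this guide.
widget_alias = Alias("Device Events", "device_events_ml as table where device=abc or device=def", Source.WIDGET)
user_alias = Alias("device events for abc", "device_events_ml as table where device=abc", Source.USER)

user_alias.rename("device events for device abc")
print(user_alias.phrase)
```

Calling `rename` on `widget_alias` raises `PermissionError` in this sketch, mirroring the rule that the widget itself must be edited.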

Intelligent Alias Matching

One of the key advantages of aliases is that they do not require an exact match. You can enter queries with:

Minor misspellings
Slight variations in wording
Different phrasing (e.g., “violations for device” vs. “violations of device”)

NL processing intelligently matches the input to the most relevant alias and processes the query accordingly. This allows for a more flexible and user-friendly querying experience.
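To give a feel for tolerant matching, the sketch below uses Python's standard `difflib` module to pick the closest stored alias. This is a character-level stand-in chosen for illustration; this guide describes the real matching as LLM-based, and the `closest_alias` function and the 0.6 cutoff are assumptions.

```python
import difflib
from typing import Optional


def closest_alias(query: str, aliases: list[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the stored alias most similar to the query, or None if nothing is close."""
    matches = difflib.get_close_matches(query.lower(), aliases, n=1, cutoff=cutoff)
    return matches[0] if matches else None


aliases = ["all kpi violations", "violations for device", "bgp overall status"]

print(closest_alias("all kpi violatons", aliases))     # minor misspelling
print(closest_alias("violations of device", aliases))  # different phrasing
```

Unlike an LLM, this stand-in cannot see that two differently worded questions mean the same thing; it only tolerates small textual differences.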

Limitations of Entity/Filter Matching

While alias matching is lenient, entity and filter values (for example, the values of router_id, s2_inst, or region) require exact spelling.

Since the NL process performs a direct substitution for these values, they must be spelled correctly and match exactly as stored in the system.

Alias Label Autopopulation

While aliases significantly streamline the query process, their manual construction can sometimes be complex or unintuitive. Users must follow specific rules when inserting template placeholders, and mistakes can result in nonfunctional aliases.

For example, suppose you want the alias:

“device events for devices abc and def”

to resolve to the following S2QL query:

device_events_ml as table where device=abc or device=def

You might struggle with how to define the dynamic fields needed:

Should you use {{DEVICE}}, {{DEVICE1}}, or {{DEVICE_1}}?
How should it handle multiple devices (such as {{DEVICE2}})?

To address this, the NL process supports alias label autopopulation using entity recognition. Instead of requiring you to manually insert template fields, the NL process can automatically detect and normalize them as follows.

Create an alias with a natural language query (such as “device events for devices abc and def”) but without template placeholders.
Enter the corresponding S2QL query as usual.
The NL process scans the S2QL query for potential filter values (such as abc and def for device).
The NL process searches for these values in the alias text and dynamically assigns template placeholders where applicable.
The alias is normalized so that future queries can be processed dynamically without manual intervention.
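The normalization steps above can be sketched as follows. This is a simplified illustration under stated assumptions: filters are assumed to be written as `label=value` with no spaces, placeholder names follow the `{{S2_INST1}}` pattern used earlier in this guide (uppercase label plus a counter), and the lookup against stored entities is skipped. The `normalize_alias` function is hypothetical.

```python
import re
from collections import defaultdict

FILTER = re.compile(r"(\w+)=([\w.-]+)")


def normalize_alias(alias: str, s2ql: str) -> tuple[str, str]:
    """Replace filter values shared by the alias and the S2QL with numbered placeholders."""
    counters: dict[str, int] = defaultdict(int)
    for label, value in FILTER.findall(s2ql):
        value_in_alias = re.compile(rf"\b{re.escape(value)}\b")
        if not value_in_alias.search(alias):
            continue  # value is not mentioned in the phrase, so it stays fixed
        counters[label] += 1
        placeholder = "{{" + f"{label.upper()}{counters[label]}" + "}}"
        alias = value_in_alias.sub(placeholder, alias, count=1)
        s2ql = s2ql.replace(f"{label}={value}", f"{label}={placeholder}", 1)
    return alias, s2ql


alias, s2ql = normalize_alias(
    "device events for devices abc and def",
    "device_events_ml as table where device=abc or device=def",
)
print(alias)
print(s2ql)
```

In this sketch the two devices become `{{DEVICE1}}` and `{{DEVICE2}}` in both the alias and the S2QL, so the user never has to choose a placeholder name.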

Here is an example:

[Figure: NL Phrase Example]

After clicking save and creating the alias, the alias page appears as follows:

[Figure: NL Alias Page]

This autopopulation feature is especially useful for those inexperienced with alias creation or dealing with aliases that are overly complex. It can also be used to test the validity of an alias: if the NL process cannot resolve and normalize the alias, then the entity does not exist in the s2_entities collection.

Best Practices for NL Queries

Here is a brief list of best practices when using NL queries and aliases.

When building NL queries:

Start from an existing widget:

1. Use the "NL Phrase" button for the widget.

2. Type in an NL phrase that can be used to invoke the widget. Reference any entities from the S2QL in the NL phrase, and the NL phrase is automatically generalized, such as:

    “Show me the status of router xxx interface yyy”

Use the query builder:

1. Get the query built as the user prefers, including filters, sorting, groups, and so on.

2. Select "Add Phrase".

3. For any entity filters that need to be generalized, ensure that the NL phrase contains a reference to those entities.

Use Copilot to build dashboards.

When building NL Aliases:

Make NL queries at least 3 words long.
The NL queries should be meaningful.
Create entity-specific NL queries so that the system learns the broader label names (the generic query is generated from the specific query in the created alias).
Avoid semantic overlap when possible (for example, status of devices and health of devices).
Avoid use of phrases like Show me…, What is the…? For example, instead of using show me the status of devices, just use status of devices. The extra words should not cause problems, but it is better to limit the alias to meaningful words.
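Two of these guidelines, the minimum length and the filler phrases, are mechanical enough to check automatically. The sketch below is a hypothetical helper written for this guide, not a Selector feature; the `check_alias` function and its warning texts are assumptions.

```python
FILLER_PREFIXES = ("show me", "what is the")


def check_alias(phrase: str) -> list[str]:
    """Return warnings for an alias phrase; an empty list means no issue was found."""
    warnings = []
    normalized = " ".join(phrase.lower().split())
    if len(normalized.split()) < 3:
        warnings.append("use at least 3 words")
    if normalized.startswith(FILLER_PREFIXES):
        warnings.append("drop the leading filler words")
    return warnings


print(check_alias("status of devices"))
print(check_alias("show me the status of devices"))
print(check_alias("bgp status"))
```

Whether a phrase is meaningful or overlaps semantically with another alias still needs human judgment.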