Wazuh Common Schema generator
The generation of the Wazuh Common Schema is automated using a set of scripts and Docker projects.
- compose.yml: Docker Compose file to define the services for the schema generator.
- generate_schema.sh: generates the complete schema. The list of modules to generate is read from the module_list.txt file. Copies the generated files to the appropriate folders. The index templates are copied to Setup plugin's resources/ folder, while the CSV files are copied to each module's docs/ folder.
- push_schema.sh: commits and pushes the changes in the schema to the repository. This script is meant to be used by our GH Workflow. Do not use it locally.
- run_generator.sh: Script to start the Docker Compose project. This is the main entry point for the schema generation.
- update_module_list.sh: generates the module_list.txt file, by scanning the ecs/ folder. Run this script whenever a new module is added.
- images/Dockerfile: Dockerfile to build the image used for the schema generation. Clones the ECS repository, which contains the main tooling.
- images/generator.sh: our actual schema generation script. It is executed inside the container. Contains post-processing steps to make the templates compatible with OpenSearch and to adapt them to our needs.
- images/schema_sanitizer.py: Python script that modifies the ECS source mapping files to meet WCS requirements before generating the final templates. It is executed during the image build process.
- count_and_update_total_fields.sh: counts fields in a generated index template and proposes (or applies with --apply) an updated mapping.total_fields.limit rounded up to the next 500.
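As an illustration of the rounding rule described for count_and_update_total_fields.sh, the following Python sketch counts the fields of a mapping and rounds the result up to a multiple of 500. It is not the script's actual implementation: the helper names count_fields and next_limit are hypothetical, and both the counting method (every entry under properties and fields counts once) and the handling of exact multiples (500 stays 500) are assumptions.

```python
import math


def count_fields(properties: dict) -> int:
    """Count mapping entries, descending into object 'properties' and multi-field 'fields'.

    Assumption: objects and multi-fields each count as one field.
    """
    total = 0
    for definition in properties.values():
        total += 1
        total += count_fields(definition.get("properties", {}))
        total += count_fields(definition.get("fields", {}))
    return total


def next_limit(field_count: int, step: int = 500) -> int:
    """Round field_count up to a multiple of step, never returning less than step."""
    return max(step, math.ceil(field_count / step) * step)


# Hypothetical mapping fragment, used only to exercise the helpers.
sample_properties = {
    "agent": {
        "properties": {
            "id": {"type": "keyword"},
            "name": {"type": "keyword", "fields": {"text": {"type": "text"}}},
        }
    }
}
print(count_fields(sample_properties), next_limit(count_fields(sample_properties)))
```

The result of next_limit would be the value proposed for mapping.total_fields.limit.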
Requirements
Usage
The generator is run automatically by our GH Workflow on pull requests that modify any of the modules. However, it can also be run locally. To do so, follow these steps.
1. Update the modules list:
   ./update_module_list.sh
2. Run the generator for all modules:
   ./generate_schema.sh
3. Update the total fields limit of each module:
   ./count_and_update_total_fields.sh all --apply
The scripts can be invoked from any location. When successful, all the generated files will be copied to their corresponding folders.
A new mappings folder will be created inside the module's folder, containing all the generated files.
The files are versioned using the ECS version, so different versions of the same module can be generated.
For our use case, the most important files are under mappings/<ECS_VERSION>/generated/elasticsearch/legacy/:
- template.json: Elasticsearch compatible index template for the module.
- opensearch-template.json: OpenSearch compatible index template for the module.
The original output is template.json, which is not compatible with OpenSearch by default.
In order to make this template compatible with OpenSearch, the following changes are made:
- The order property is renamed to priority.
- The mappings and settings properties are nested under the template property.
The tooling takes care of these changes automatically, generating the opensearch-template.json file as a result.
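The two changes can be sketched in Python as follows. This is only an illustration of the transformation, not the code used in images/generator.sh; the function name to_opensearch_template and the sample template are hypothetical.

```python
import copy
import json


def to_opensearch_template(legacy: dict) -> dict:
    """Return a copy of a legacy template adapted as described above."""
    converted = copy.deepcopy(legacy)

    # 1. Rename "order" to "priority".
    if "order" in converted:
        converted["priority"] = converted.pop("order")

    # 2. Nest "settings" and "mappings" under "template".
    template = {}
    for key in ("settings", "mappings"):
        if key in converted:
            template[key] = converted.pop(key)
    converted["template"] = template

    return converted


# Hypothetical, heavily trimmed legacy template.
legacy_template = {
    "index_patterns": ["wazuh-states-vulnerabilities*"],
    "order": 1,
    "settings": {"index": {"number_of_shards": "1"}},
    "mappings": {"properties": {"agent": {"properties": {"id": {"type": "keyword"}}}}},
}
print(json.dumps(to_opensearch_template(legacy_template), indent=2))
```

The input dictionary is deep-copied, so the legacy template stays available unchanged alongside the converted one, mirroring how both files are kept in the output folder.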
If a module has been removed from the repository, the generate_schema.sh script will skip it gracefully, issuing a warning message instead of failing.
Uploading templates to the Indexer
You can upload the index template either with cURL or through the UI (dev tools).
curl -u admin:admin -k -X PUT "https://indexer:9200/_index_template/wazuh-states-vulnerabilities" -H "Content-Type: application/json" -d @opensearch-template.json
curl -u admin:admin -k -X PUT "https://indexer:9200/_template/wazuh-states-vulnerabilities" -H "Content-Type: application/json" -d @template.json
Notes:
- PUT and POST are interchangeable.
- The name of the index template does not matter. Any name can be used.
- Adjust credentials and URL accordingly.
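For scripted uploads, the first cURL command can be approximated with Python's standard library. This is a minimal sketch rather than part of the tooling: the function names build_upload_request and send are hypothetical, and the URL and credentials are the same placeholders used in the cURL examples.

```python
import base64
import json
import ssl
import urllib.request


def build_upload_request(base_url, name, template, user, password):
    """Build a PUT request for the _index_template endpoint with basic auth."""
    body = json.dumps(template).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/_index_template/{name}",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def send(request):
    """Send the request without certificate verification (like curl -k)."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return response.status
```

A typical call would load opensearch-template.json with json.load, pass the result to build_upload_request together with "https://indexer:9200" and the template name, and hand the returned request to send. Disabling certificate verification is only reasonable for local test environments.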
Creating new modules
The easiest way to create a new module is to take an existing one as a base. Copy a similar module and rename it. Then, edit the fields files to match the new module's fields.
The name of the folder will be the name of the module to be passed to the script. All 3 files are required.
- fields/subset.yml: This file contains the subset of ECS fields to be used for the module.
- fields/template-settings-legacy.json: This file contains the legacy template settings for the module.
- fields/template-settings.json: This file contains the composable template settings for the module.
- fields/custom: folder containing custom fields for the module. This folder is optional.
Important
Add the new module to the SetupPlugin.java file, so it is included in the installation process.
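Before running the generator on a new module, the three required files listed above can be checked with a short script. The sketch below is not part of the repository; the function name missing_module_files is hypothetical, and the module path passed to it is up to the caller.

```python
from pathlib import Path

# The three required files named in this section, relative to the module folder.
REQUIRED_FILES = (
    "fields/subset.yml",
    "fields/template-settings-legacy.json",
    "fields/template-settings.json",
)


def missing_module_files(module_dir):
    """Return the required files that do not exist under module_dir."""
    root = Path(module_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means the module folder has everything the generator needs. The optional fields/custom folder is deliberately not checked.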
Event generators
Each module contains a Python script to generate events for its module. The script prompts for the required parameters, so it can be launched without arguments:
./event_generator.py
The script will generate a JSON file with the events, and will also ask whether to upload them to the indexer. If the upload option is selected, the script will ask for the indexer URL and port, credentials, and index name. The script also writes to a log file; check it for debugging or additional information.
The run_event_generators.sh script can be used to run all the event generators in sequence. It will prompt for the indexer details only once, and will use them for all the modules.
GitHub Workflow
The schema generation is automated using a GitHub Workflow, defined in the 5_builderpackage_schema.yml file.
Troubleshooting
- In case you added a new module and the script is not recognizing it, make sure to run the update_module_list.sh script to update the modules list AND to stage the changes in Git, as the script compares the current Git working directory with the last state in main.