
Logstash

Logstash is a log aggregator that collects data from various input sources, applies transformations and enrichments, and then ships the data to one of many supported output destinations (Elasticsearch, a file, a database, etc.).

I. Install Logstash with Docker

First, create a folder named configuration/logstash to store configuration files, then create the following files in the newly created folder.

logstash.yml

The main Logstash settings file, containing the following configuration:

xpack.monitoring.enabled: false
Info: This setting is used to enable or disable X-Pack monitoring of the Logstash instance; it is disabled here.
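Other commonly used settings can also live in this file if needed. The values below are illustrative examples of real logstash.yml options, not part of the SAMO setup, which is why they are shown commented out:

```yaml
# Illustrative extras (not required for SAMO):
# api.http.host: 0.0.0.0   # bind address of the Logstash monitoring API
# log.level: info          # logging verbosity
# pipeline.workers: 2      # default number of pipeline worker threads
```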

pipelines.yml

This file lists all the pipeline configuration files for this Logstash instance. Currently SAMO has two pipelines, one for auditing messages and one for logging messages:

- pipeline.id: pipeline1
  path.config: "/usr/share/logstash/config/dev-audit-pipeline.conf"

- pipeline.id: pipeline2
  path.config: "/usr/share/logstash/config/dev-logs-pipeline.conf"
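Each entry in pipelines.yml can also carry per-pipeline settings that override the defaults from logstash.yml. The extra keys below are an illustrative sketch, not part of the SAMO configuration:

```yaml
- pipeline.id: pipeline1
  path.config: "/usr/share/logstash/config/dev-audit-pipeline.conf"
  pipeline.workers: 1   # illustrative: dedicate a single worker thread to auditing
  queue.type: memory    # illustrative: keep the default in-memory event queue
```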

dev-audit-pipeline.conf

In Logstash, a pipeline is a core concept that defines the flow of data processing. It consists of three main components: inputs, filters, and outputs.

input {
  kafka {
    codec             => json
    bootstrap_servers => "kafka:9092"
    topics            => ["dev_audit"]
  }
}

filter {
}

output {
  elasticsearch {
    hosts                 => ["elasticsearch8:9200"]
    data_stream           => "true"
    data_stream_dataset   => "dev"
    data_stream_namespace => "audit"
  }
}

Where:

  • Inputs are the entry points where Logstash receives data. This data can come from a variety of sources such as files, syslog, HTTP, Beats, Kafka, or even databases. In this example, Logstash only receives data from the Kafka topic dev_audit.
  • Filters are used to process and transform the data. They can modify the structure of the data, parse it, add or remove fields, or apply complex transformations. In this example, we do not apply any filter.
  • Outputs define where the processed data should be sent. This could be an Elasticsearch cluster, a file, a database, or any other supported destination. In this implementation, the output is the Elasticsearch instance installed in the previous step.
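Although the pipeline above leaves the filter block empty, this is where transformations would go. As an illustrative sketch (the mutate filter exists in Logstash, but the field names here are invented for the example and are not part of the SAMO pipelines):

```
filter {
  mutate {
    add_field    => { "environment" => "dev" }   # illustrative: tag every event
    remove_field => ["@version"]                 # illustrative: drop a field before indexing
  }
}
```

On the output side, with the data stream options shown above, events should land in a data stream named following the type-dataset-namespace convention, i.e. logs-dev-audit, since the data_stream_type option defaults to "logs".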

dev-logs-pipeline.conf

input {
  kafka {
    codec             => json
    bootstrap_servers => "kafka:9092"
    topics            => ["dev_logs"]
  }
}

filter {
}

output {
  elasticsearch {
    hosts                 => ["elasticsearch8:9200"]
    data_stream           => "true"
    data_stream_dataset   => "dev"
    data_stream_namespace => "logs"
  }
}

Modify docker-compose.yml

Finally, add the following logstash service to docker-compose.yml, where ${configuration_dir} points to the configuration/logstash folder created above:

logstash:
  image: docker.asseco-ce.com/samo/server/samo-logstash:8.12.2
  restart: always
  volumes:
    - ${configuration_dir:-./configuration/logstash}:/usr/share/logstash/config
  depends_on:
    - kafka
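Optionally, the container's JVM heap can be capped through the LS_JAVA_OPTS environment variable, which the official Logstash Docker images honor. The heap sizes below are illustrative, not a SAMO requirement:

```yaml
logstash:
  image: docker.asseco-ce.com/samo/server/samo-logstash:8.12.2
  restart: always
  environment:
    LS_JAVA_OPTS: "-Xms512m -Xmx512m"   # illustrative heap limits; size to your workload
  volumes:
    - ${configuration_dir:-./configuration/logstash}:/usr/share/logstash/config
  depends_on:
    - kafka
```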

II. Install Logstash on Ubuntu Server

1. Install Logstash

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg

sudo apt-get install apt-transport-https

echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-8.x.list

sudo apt-get update && sudo apt-get install logstash

2. Set Up Logstash Configuration Files

Create a directory to store Logstash configuration:

sudo mkdir -p /etc/logstash/configuration
cd /etc/logstash/configuration

Now create and edit the necessary configuration files.

a. logstash.yml

sudo nano /etc/logstash/logstash.yml

Add:

xpack.monitoring.enabled: false

b. pipelines.yml

sudo nano /etc/logstash/pipelines.yml

Add (defines pipeline1 for audit logs and pipeline2 for application logs):

- pipeline.id: pipeline1
  path.config: "/etc/logstash/configuration/dev-audit-pipeline.conf"

- pipeline.id: pipeline2
  path.config: "/etc/logstash/configuration/dev-logs-pipeline.conf"

c. dev-audit-pipeline.conf

sudo nano /etc/logstash/configuration/dev-audit-pipeline.conf

Add (configures Logstash to read Kafka topic dev_audit and send data to Elasticsearch):

input {
  kafka {
    codec             => json
    bootstrap_servers => "kafka:9092"
    topics            => ["dev_audit"]
  }
}

filter {
}

output {
  elasticsearch {
    hosts                 => ["http://localhost:9200"]
    data_stream           => "true"
    data_stream_dataset   => "dev"
    data_stream_namespace => "audit"
  }
}

d. dev-logs-pipeline.conf

sudo nano /etc/logstash/configuration/dev-logs-pipeline.conf

Add (configures Logstash to read Kafka topic dev_logs and send data to Elasticsearch):

input {
  kafka {
    codec             => json
    bootstrap_servers => "kafka:9092"
    topics            => ["dev_logs"]
  }
}

filter {
}

output {
  elasticsearch {
    hosts                 => ["http://localhost:9200"]
    data_stream           => "true"
    data_stream_dataset   => "dev"
    data_stream_namespace => "logs"
  }
}
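Note that Elasticsearch 8 enables security (authentication and TLS) by default, so a plain http://localhost:9200 output may be rejected. In that case the output needs credentials; the user, password, and CA path below are placeholders, and the ssl_certificate_authorities option assumes a recent version of the elasticsearch output plugin:

```
output {
  elasticsearch {
    hosts    => ["https://localhost:9200"]
    user     => "logstash_writer"                                      # placeholder user
    password => "changeme"                                             # placeholder password
    ssl_certificate_authorities => ["/etc/logstash/certs/http_ca.crt"] # placeholder CA path
    data_stream           => "true"
    data_stream_dataset   => "dev"
    data_stream_namespace => "logs"
  }
}
```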

3. Start and Enable Logstash Service

Once the configuration is in place, enable Logstash at boot and start the service:

sudo systemctl enable logstash
sudo systemctl start logstash
Tip: Use sudo systemctl status logstash to check whether Logstash is running correctly.