Logstash
Logstash is a log aggregator that collects data from a variety of input sources, applies transformations and enrichments to it, and then ships it to one of many supported output destinations (Elasticsearch, a file, a database, etc.).
I. Install Logstash with Docker
First, create a folder named configuration/logstash to store configuration files, then create the following files in the newly created folder.
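For example, from the directory that contains your docker-compose.yml:
mkdir -p configuration/logstash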
logstash.yml
The main Logstash settings file, with the following configuration:
xpack.monitoring.enabled: false
This setting enables or disables X-Pack monitoring of the Logstash instance; it is disabled here.
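Other commonly tuned settings can live in the same file. The values below are illustrative only and are not part of the SAMO configuration:
log.level: info                  # logging verbosity
pipeline.workers: 2              # worker threads per pipeline
config.reload.automatic: false   # reload pipeline files on change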
pipelines.yml
This file registers every pipeline configuration file for this Logstash instance. SAMO currently runs two pipelines, one for audit messages and one for log messages:
- pipeline.id: pipeline1
  path.config: "/usr/share/logstash/config/dev-audit-pipeline.conf"
- pipeline.id: pipeline2
  path.config: "/usr/share/logstash/config/dev-logs-pipeline.conf"
dev-audit-pipeline.conf
In Logstash, a pipeline is a core concept that defines the flow of data processing. It consists of three main components: inputs, filters, and outputs.
input {
  kafka {
    codec => json
    bootstrap_servers => "kafka:9092"
    topics => ["dev_audit"]
  }
}

filter {
}

output {
  elasticsearch {
    hosts => ["elasticsearch8:9200"]
    data_stream => "true"
    data_stream_dataset => "dev"
    data_stream_namespace => "audit"
  }
}
Where:
- Inputs are the entry points where Logstash receives data. Data can come from a variety of sources such as files, syslog, HTTP, Beats, Kafka, or even databases. In this example, Logstash receives data only from the Kafka topic dev_audit.
- Filters process and transform the data. They can modify the structure of the data, parse it, add or remove fields, or apply complex transformations. In this example, no filter is applied; see the sketch after this list for what one could look like.
- Outputs define where the processed data should be sent. This could be an Elasticsearch cluster, a file, a database, or any other supported destination. In this implementation, the output is the Elasticsearch instance installed in the previous step.
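Although SAMO's pipelines leave the filter block empty, here is a sketch of what one could contain. The event_time field is hypothetical and purely for illustration; it is not part of the actual SAMO message schema:
filter {
  # Hypothetical: parse an ISO8601 timestamp from a field named
  # "event_time" into the standard @timestamp field.
  date {
    match  => ["event_time", "ISO8601"]
    target => "@timestamp"
  }
  # Tag every event that passes through this pipeline.
  mutate {
    add_tag => ["audit"]
  }
}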
dev-logs-pipeline.conf
input {
  kafka {
    codec => json
    bootstrap_servers => "kafka:9092"
    topics => ["dev_logs"]
  }
}

filter {
}

output {
  elasticsearch {
    hosts => ["elasticsearch8:9200"]
    data_stream => "true"
    data_stream_dataset => "dev"
    data_stream_namespace => "logs"
  }
}
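With these settings, and assuming the elasticsearch output's default data_stream_type of logs, events should land in the data streams logs-dev-audit and logs-dev-logs. Once data is flowing, and assuming Elasticsearch's HTTP port 9200 is reachable from where you run the command, you can confirm this with the data stream API:
curl "http://localhost:9200/_data_stream/logs-dev-*"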
Modify docker-compose.yml
Finally, add the following logstash service to docker-compose.yml, where ${configuration_dir} points to the configuration/logstash folder created above:
logstash:
  image: docker.asseco-ce.com/samo/server/samo-logstash:8.12.2
  restart: always
  volumes:
    - ${configuration_dir:-./configuration/logstash}:/usr/share/logstash/config
  depends_on:
    - kafka
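After updating the compose file, start the service and watch its logs (docker compose is assumed here; older installations use docker-compose):
docker compose up -d logstash
docker compose logs -f logstash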
II. Install Logstash on Ubuntu Server
1. Install Logstash
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
sudo apt-get install apt-transport-https
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-8.x.list
sudo apt-get update && sudo apt-get install logstash
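Optionally verify the installation; the Debian package places the binary under /usr/share/logstash:
/usr/share/logstash/bin/logstash --version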
2. Set Up Logstash Configuration Files
Create a directory to store Logstash configuration:
sudo mkdir -p /etc/logstash/configuration
cd /etc/logstash/configuration
Now create and edit the necessary configuration files.
a. logstash.yml
sudo nano /etc/logstash/logstash.yml
Add:
xpack.monitoring.enabled: false
b. pipelines.yml
sudo nano /etc/logstash/pipelines.yml
Add (defines pipeline1 for audit logs and pipeline2 for application logs):
- pipeline.id: pipeline1
  path.config: "/etc/logstash/configuration/dev-audit-pipeline.conf"
- pipeline.id: pipeline2
  path.config: "/etc/logstash/configuration/dev-logs-pipeline.conf"
c. dev-audit-pipeline.conf
sudo nano /etc/logstash/configuration/dev-audit-pipeline.conf
Add (configures Logstash to read Kafka topic dev_audit and send data to Elasticsearch):
input {
  kafka {
    codec => json
    bootstrap_servers => "kafka:9092"
    topics => ["dev_audit"]
  }
}

filter {
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    data_stream => "true"
    data_stream_dataset => "dev"
    data_stream_namespace => "audit"
  }
}
d. dev-logs-pipeline.conf
sudo nano /etc/logstash/configuration/dev-logs-pipeline.conf
Add (configures Logstash to read Kafka topic dev_logs and send data to Elasticsearch):
input {
  kafka {
    codec => json
    bootstrap_servers => "kafka:9092"
    topics => ["dev_logs"]
  }
}

filter {
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    data_stream => "true"
    data_stream_dataset => "dev"
    data_stream_namespace => "logs"
  }
}
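Before starting the service, you can validate each pipeline file with --config.test_and_exit, which checks the configuration syntax without processing any events:
sudo /usr/share/logstash/bin/logstash --path.settings /etc/logstash --config.test_and_exit -f /etc/logstash/configuration/dev-audit-pipeline.conf
sudo /usr/share/logstash/bin/logstash --path.settings /etc/logstash --config.test_and_exit -f /etc/logstash/configuration/dev-logs-pipeline.conf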
3. Start and Enable Logstash Service
Once the configuration is in place, enable Logstash to start on boot and start the service:
sudo systemctl enable logstash
sudo systemctl start logstash
Use sudo systemctl status logstash to check if Logstash is running correctly.
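If the service fails to start or you want to follow its output, the systemd journal is the quickest place to look:
sudo journalctl -u logstash -f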