Filebeat is a lightweight log shipper from the Beats family. It collects logs from servers and ships them to Elasticsearch or Logstash.

Advantages of Filebeat

Beats vs. Logstash

| Feature | Filebeat | Logstash |
|---------|----------|----------|
| Resource usage | Low (~50 MB RAM) | High (~1 GB RAM) |
| Configuration | Simple | Complex |
| Transformation | Limited | Extensive |
| Backpressure | Yes | Yes |
| Modules | Prebuilt | None |

When to use Filebeat

- Log collection from many servers
- Resource-constrained environments
- Standard log formats (Nginx, Apache, Syslog)
- Direct connection to Elasticsearch

Installation

Debian/Ubuntu

# Add the Elastic repository
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | gpg --dearmor -o /usr/share/keyrings/elasticsearch.gpg
echo "deb [signed-by=/usr/share/keyrings/elasticsearch.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | tee /etc/apt/sources.list.d/elastic-8.x.list

# Install
apt update
apt install filebeat

CentOS/RHEL

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

cat > /etc/yum.repos.d/elastic.repo << EOF
[elastic-8.x]
name=Elastic repository
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
EOF

dnf install filebeat

Service

systemctl enable filebeat
systemctl start filebeat

Basic configuration

filebeat.yml

# /etc/filebeat/filebeat.yml

filebeat.inputs:
  - type: filestream
    id: my-logs
    enabled: true
    paths:
      - /var/log/*.log

output.elasticsearch:
  hosts: ["localhost:9200"]
  username: "elastic"
  password: "password"

setup.kibana:
  host: "localhost:5601"

Testing the configuration

# Check the syntax
filebeat test config

# Test the output
filebeat test output

Inputs

Filestream (recommended)

filebeat.inputs:
  - type: filestream
    id: nginx-access
    enabled: true
    paths:
      - /var/log/nginx/access.log
    fields:
      log_type: nginx-access
    fields_under_root: true

Multiple inputs

filebeat.inputs:
  - type: filestream
    id: nginx-access
    paths:
      - /var/log/nginx/access.log
    tags: ["nginx", "access"]

  - type: filestream
    id: nginx-error
    paths:
      - /var/log/nginx/error.log
    tags: ["nginx", "error"]

  - type: filestream
    id: syslog
    paths:
      - /var/log/syslog
      - /var/log/messages
    tags: ["system"]

Exclude/Include

filebeat.inputs:
  - type: filestream
    id: app-logs
    paths:
      - /var/log/app/*.log
    exclude_files: ['\.gz$', '\.bak$']
    include_lines: ['^ERR', '^WARN']
    exclude_lines: ['^DEBUG']

Multiline Logs

filebeat.inputs:
  - type: filestream
    id: java-logs
    paths:
      - /var/log/app/app.log
    parsers:
      - multiline:
          type: pattern
          pattern: '^\d{4}-\d{2}-\d{2}'
          negate: true
          match: after

JSON Logs

filebeat.inputs:
  - type: filestream
    id: json-logs
    paths:
      - /var/log/app/*.json
    parsers:
      - ndjson:
          target: ""
          add_error_key: true

Outputs

Elasticsearch

output.elasticsearch:
  hosts: ["https://es1:9200", "https://es2:9200"]
  username: "elastic"
  password: "${ES_PASSWORD}"
  ssl:
    certificate_authorities: ["/etc/filebeat/ca.crt"]
  index: "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}"
  # Note: a custom index usually also requires setup.template.name/pattern

Logstash

output.logstash:
  hosts: ["logstash:5044"]
  ssl:
    certificate_authorities: ["/etc/filebeat/ca.crt"]
  loadbalance: true

Kafka

output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092"]
  topic: 'logs-%{[fields.log_type]}'
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip

File (Debugging)

output.file:
  path: "/tmp/filebeat"
  filename: filebeat

Modules

Available modules

# List modules
filebeat modules list

# Enable modules
filebeat modules enable nginx mysql system

# Disable a module
filebeat modules disable nginx

Nginx module

# /etc/filebeat/modules.d/nginx.yml

- module: nginx
  access:
    enabled: true
    var.paths: ["/var/log/nginx/access.log*"]

  error:
    enabled: true
    var.paths: ["/var/log/nginx/error.log*"]

System module

# /etc/filebeat/modules.d/system.yml

- module: system
  syslog:
    enabled: true
    var.paths: ["/var/log/syslog*", "/var/log/messages*"]

  auth:
    enabled: true
    var.paths: ["/var/log/auth.log*", "/var/log/secure*"]

MySQL module

# /etc/filebeat/modules.d/mysql.yml

- module: mysql
  error:
    enabled: true
    var.paths: ["/var/log/mysql/error.log*"]

  slowlog:
    enabled: true
    var.paths: ["/var/log/mysql/slow.log*"]

Apache module

# /etc/filebeat/modules.d/apache.yml

- module: apache
  access:
    enabled: true
    var.paths: ["/var/log/apache2/access.log*"]

  error:
    enabled: true
    var.paths: ["/var/log/apache2/error.log*"]

Running setup

# Dashboards and index templates
filebeat setup

# Dashboards only
filebeat setup --dashboards

# Index template only
filebeat setup --index-management

Processors

Adding/removing fields

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded

  - add_cloud_metadata: ~

  - add_docker_metadata: ~

  - add_fields:
      target: ''
      fields:
        environment: production

  - drop_fields:
      fields: ["agent.ephemeral_id", "ecs.version"]

Conditional processing

processors:
  - drop_event:
      when:
        contains:
          message: "DEBUG"

  - add_tags:
      tags: ["error"]
      when:
        contains:
          message: "ERROR"

Dissect/Script

processors:
  - dissect:
      tokenizer: "%{client_ip} %{method} %{path} %{status}"
      field: "message"
      target_prefix: "http"

  # Or with a script processor
  - script:
      lang: javascript
      source: >
        function process(event) {
          var msg = event.Get("message");
          if (msg.includes("error")) {
            event.Put("level", "error");
          }
        }

Autodiscover

Docker Autodiscover

filebeat.autodiscover:
  providers:
    - type: docker
      hints.enabled: true
      hints.default_config:
        type: container
        paths:
          - /var/lib/docker/containers/${data.container.id}/*.log

Kubernetes Autodiscover

filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      hints.default_config:
        type: container
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log

Hints annotations

# Pod annotations
annotations:
  co.elastic.logs/enabled: "true"
  co.elastic.logs/module: nginx
  co.elastic.logs/fileset.stdout: access
  co.elastic.logs/fileset.stderr: error
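In a Kubernetes manifest, these hints sit under `metadata.annotations` of the Pod. A minimal sketch (pod name and image are placeholders, not from the original text):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                      # placeholder name
  annotations:
    co.elastic.logs/enabled: "true"
    co.elastic.logs/module: nginx
    co.elastic.logs/fileset.stdout: access
    co.elastic.logs/fileset.stderr: error
spec:
  containers:
    - name: nginx
      image: nginx:1.25          # placeholder image
```

With hints enabled in the autodiscover provider above, Filebeat picks up these annotations and applies the nginx module to the container's stdout/stderr.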

Ingest Pipelines

Pipeline in Elasticsearch

PUT _ingest/pipeline/nginx-pipeline
{
  "description": "Nginx log parsing",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{COMBINEDAPACHELOG}"]
      }
    },
    {
      "date": {
        "field": "timestamp",
        "formats": ["dd/MMM/yyyy:HH:mm:ss Z"]
      }
    },
    {
      "remove": {
        "field": "timestamp"
      }
    }
  ]
}
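Before wiring the pipeline into Filebeat, it can be dry-run against a sample document with the `_simulate` API (the log line below is an illustrative example; requires a running cluster):

```
POST _ingest/pipeline/nginx-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "192.0.2.1 - - [10/Oct/2024:13:55:36 +0000] \"GET / HTTP/1.1\" 200 612 \"-\" \"curl/8.0\""
      }
    }
  ]
}
```

The response shows the document after each processor has run, which makes grok pattern mistakes visible before any real data flows through.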

Pipeline in Filebeat

output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: "nginx-pipeline"

Pipeline per Input

filebeat.inputs:
  - type: filestream
    id: nginx
    paths:
      - /var/log/nginx/access.log
    pipeline: nginx-pipeline

  - type: filestream
    id: app
    paths:
      - /var/log/app/*.log
    pipeline: app-pipeline

Monitoring

Self-Monitoring

monitoring.enabled: true
monitoring.elasticsearch:
  hosts: ["localhost:9200"]

Metrics

# Filebeat metrics
curl localhost:5066/stats

# Registry
ls /var/lib/filebeat/registry/
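The stats endpoint only responds when Filebeat's local HTTP endpoint is enabled, which it is not by default. A minimal sketch for filebeat.yml:

```yaml
# Enable the local HTTP endpoint for metrics (disabled by default)
http.enabled: true
http.host: localhost
http.port: 5066
```

After a restart, `curl localhost:5066/stats` returns the internal counters as JSON.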

High availability

Load balancing

output.elasticsearch:
  hosts: ["es1:9200", "es2:9200", "es3:9200"]
  loadbalance: true

# Or to Logstash
output.logstash:
  hosts: ["logstash1:5044", "logstash2:5044"]
  loadbalance: true

Retry and queue

# Memory Queue
queue.mem:
  events: 4096
  flush.min_events: 512
  flush.timeout: 5s

# Retry on errors
output.elasticsearch:
  hosts: ["localhost:9200"]
  bulk_max_size: 50
  backoff.init: 1s
  backoff.max: 60s

Troubleshooting

Debug mode

# With debug output
filebeat -e -d "*"

# Only specific components
filebeat -e -d "publish"

Registry

# Inspect the registry
cat /var/lib/filebeat/registry/filebeat/log.json | jq

# Reset the registry
systemctl stop filebeat
rm -rf /var/lib/filebeat/registry/
systemctl start filebeat

Common problems

# No logs are being sent
# → Check the paths
# → Check permissions
# → Test the output: filebeat test output

# Logs are sent twice
# → Corrupted registry → delete it and restart

# High latency
# → Tune the queue settings
# → Increase the bulk size
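The path and permission checks above can be scripted. A minimal sketch; the paths are examples, substitute the paths from your inputs and run it as the user Filebeat runs as:

```shell
#!/bin/sh
# Report whether each given log path exists and is readable
# by the current user.
check_log() {
  if [ -r "$1" ]; then
    echo "ok: $1"
  else
    echo "problem: $1 missing or unreadable"
  fi
}

check_log /var/log/syslog            # example path
check_log /var/log/nginx/access.log  # example path
```

A "problem" line for a path that exists usually points at file permissions rather than at the Filebeat configuration.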

Summary

| Component | Function |
|-----------|----------|
| Inputs | Define log sources |
| Processors | Transform data |
| Outputs | Define destinations |
| Modules | Preconfigured solutions |

| Command | Function |
|---------|----------|
| filebeat test config | Check the config |
| filebeat test output | Test the output |
| filebeat modules list | List modules |
| filebeat setup | Load dashboards |
| filebeat -e -d "*" | Debug mode |

| File | Description |
|------|-------------|
| /etc/filebeat/filebeat.yml | Main configuration |
| /etc/filebeat/modules.d/ | Module configs |
| /var/lib/filebeat/registry/ | State tracking |

Conclusion

Filebeat is ideal for collecting logs with minimal resource usage. Its modules cover most standard use cases, and it can ship directly to Elasticsearch or Logstash. For complex transformations, Logstash remains the better choice, but for plain log shipping Filebeat is hard to beat in efficiency. The autodiscover feature makes it particularly attractive for container environments.