Centralizing Logs with Lumberjack, Logstash, and Elasticsearch

We've all done it. When we needed to find something in a log, we just did a cat or a tail and piped it to grep. Maybe we told grep to also show us X lines before and after what it found. That works fine and dandy when you've got just one or two servers. But what about 30, or 300, or 3000?

I've told myself for a long time that I needed to implement centralized logging, and up until mid-2012 I just kept putting it off, reminding myself that it needed to be in place before starting a Drupal hosting company, Anrhizan. Then out of nowhere, a friend of mine called me up and asked me to manage a site of his that was running 150,000+ pageviews per day. One part of the deal was that he wanted me to host it for him, as the host he had at the time was utter crap. And with the popularity of the site, he wanted the downtime to stop. That meant at least a near-high-availability deployment, and that meant multiple servers.

That really showed the utility of centralized logging. We ran 2-3 web servers on typical days behind a load balancer, and if users reported a problem we needed to know whether it was across all servers or isolated to one. That meant pulling up each server in tmux and checking the logs on each one manually. That kind of work gets old really, really fast.

So I asked the good folks in Freenode:#infra-talk what the current best option was for centralized logging, keeping in mind that I didn't have to do any migrations because I was starting from scratch. Within seconds I had multiple answers all saying the same thing: "logstash/elasticsearch/kibana". So I figured they must be right. Or at the very least, there'd be ample support available.

Elasticsearch

Elasticsearch is a RESTful search server based on Apache Lucene. It's easily scalable and uses push replication in distributed environments. But really, the Logstash/Elasticsearch/Kibana integration is great! If it weren't for how well these applications integrate with one another, I probably would have looked into putting my logs into SOLR (which I'm more familiar with).

As of Logstash 1.1.9, the Elasticsearch output plugin works with the 0.20.2 version of Elasticsearch, so even though Elasticsearch currently has a newer version available, we're going to use 0.20.2. (If you really need to use the most current version of Elasticsearch, you can use the elasticsearch_http output plugin to make Logstash interface with Elasticsearch's REST API. If you're using the elasticsearch output plugin, your versions must match between Logstash and Elasticsearch.)
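
For what it's worth, an elasticsearch_http output stanza is about as minimal as it gets. This is just a sketch; the host and port below are placeholders, so check the plugin documentation for your Logstash version.

output {
  elasticsearch_http {
    host => "127.0.0.1"
    port => 9200
  }
}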

First, you'll need to install a headless Java runtime environment. We run Squeeze around here, so we're using apt-get install default-jre-headless.

Then, you'll need to download the .deb package.

 

wget http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.20.2.deb
dpkg -i elasticsearch-0.20.2.deb 

 

You should see that Elasticsearch is now running on your server. Stop it for now with /etc/init.d/elasticsearch stop, because we want to set up some configuration first. For example, you're probably going to be generating a lot of data. And even if you're not, I highly recommend using external block storage if you're on cloud hosting. That way, your data persists when your instance dies.

On some hosts, you may need to run modprobe acpiphp before attaching the block storage device for the first time. By the way, keep in mind that you will generate a LOT of I/O requests. I do NOT recommend setting this up at places like Amazon or HP Cloud where you get billed per 1,000,000 requests. That 1,000,000 may sound like a lot of requests, but if you're stashing 100GB of logs per month, don't be surprised when your I/O bill is more than $3,000. Just a heads up.
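
As a rough sketch, attaching and preparing a volume for the Elasticsearch data might look like the following. The device name and mount point are assumptions; check dmesg after attaching to see what your host actually calls the device.

modprobe acpiphp                    # only needed on some hosts, before the first attach
mkfs.ext4 /dev/xvdb                 # assuming the volume shows up as /dev/xvdb
mkdir -p /srv/elasticsearch-data
echo '/dev/xvdb /srv/elasticsearch-data ext4 defaults,noatime 0 2' >> /etc/fstab
mount /srv/elasticsearch-data
chown -R elasticsearch:elasticsearch /srv/elasticsearch-data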

Next, you'll want to edit the configuration in /etc/elasticsearch/elasticsearch.yml. If you're on cloud hosting and running more than one node, odds are multicast won't work so you'll have to specify the IPs of your Elasticsearch nodes. You'll also need to make sure that the data directory has been properly specified. By the way, Rackspace does support multicast using Cloud Networks. If you've got a multicast-enabled network, then all that you'll need to do for your nodes to communicate with each other is ensure they are configured with the same cluster name.
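
As a sketch, the relevant bits of /etc/elasticsearch/elasticsearch.yml might look something like this. The cluster name matches the Logstash output further down; the data path and node IPs are placeholders for your own environment.

cluster.name: my-logstash-cluster
path.data: /srv/elasticsearch-data

# On cloud networks without multicast, disable it and list your nodes explicitly.
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.0.0.11", "10.0.0.12"]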

After configuring Elasticsearch, you can go ahead and start it. If you're running more than one node, I recommend waiting a few minutes between starting each one (or check the health of the cluster first).
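
A quick way to check on the cluster before starting the next node is the cluster health API:

curl -s 'http://localhost:9200/_cluster/health?pretty=true'
# wait for "status" to report "green" (or at least "yellow") before starting another node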

Logstash

Once again, you'll need a JRE to run this.

First, you'll want to generate a self-signed SSL certificate on the Logstash server:

 

openssl req -x509 -newkey rsa:2048 -keyout /etc/ssl/logstash.key -out /etc/ssl/logstash.pub -nodes -days 365

 

You will need the certificate (the .pub file above) for your Lumberjack shippers.

Next, you'll need to get the Logstash jar.

 

mkdir /opt/logstash
wget https://logstash.objects.dreamhost.com/release/logstash-1.1.9-monolithic.jar -O /opt/logstash/logstash.jar

 

You'll need to put together your logstash.conf file according to the documentation. Be sure that you have an Elasticsearch output if you're going to store your logs there. If you put it in /etc/logstash/logstash.conf, you can use the following init script to run it. Otherwise, you'll need to run /usr/bin/java -jar /opt/logstash/logstash.jar agent -f <path-to-your.conf> -l <path-to-where-you-want-the.log>

 

#! /bin/sh

### BEGIN INIT INFO
# Provides:          logstash
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start daemon at boot time
# Description:       Enable service provided by daemon.
### END INIT INFO

. /lib/lsb/init-functions

name="logstash"
logstash_bin="/usr/bin/java -- -jar /opt/logstash/logstash.jar"
logstash_conf="/etc/logstash/logstash.conf"
logstash_log="/var/log/logstash.log"
pid_file="/var/run/$name.pid"

start () {
        command="${logstash_bin} agent -f $logstash_conf --log ${logstash_log}"

        log_daemon_msg "Starting $name" "$name"
        if start-stop-daemon --start --quiet --oknodo --pidfile "$pid_file" -b -m --exec $command; then
                log_end_msg 0
        else
                log_end_msg 1
        fi
}

stop () {
        log_daemon_msg "Stopping $name" "$name"
        start-stop-daemon --stop --quiet --oknodo --pidfile "$pid_file"
}

status () {
        status_of_proc -p $pid_file "" "$name"
}

case $1 in
        start)
                if status; then exit 0; fi
                start
                ;;
        stop)
                stop
                ;;
        reload)
                stop
                start
                ;;
        restart)
                stop
                start
                ;;
        status)
                status && exit 0 || exit $?
                ;;
        *)
                echo "Usage: $0 {start|stop|restart|reload|status}"
                exit 1
                ;;
esac

exit 0

 

By the way, any inputs in your conf file that accept data from a Lumberjack shipper need to use the lumberjack input. While it may be tempting to receive logs over UDP to reduce overhead, I highly recommend using TCP. UDP is great if you don't care whether you actually receive every message, but we want to receive all of our logs. TCP is the better choice here since it will resend any data that wasn't acknowledged as received.

Also, here is a sample config for /etc/logstash/logstash.conf:

input {
  lumberjack {
    port => 5000
    type => "syslog"
    ssl_certificate => "/etc/ssl/logstash.pub"
    ssl_key => "/etc/ssl/logstash.key"
  }
  lumberjack {
    type => "nginx_access"
    port => 5001
    ssl_certificate => "/etc/ssl/logstash.pub"
    ssl_key => "/etc/ssl/logstash.key"
  }
  lumberjack {
    type => "varnishncsa"
    port => "5002"
    ssl_certificate => "/etc/ssl/logstash.pub"
    ssl_key => "/etc/ssl/logstash.key"
  }
}

filter {
  # Lumberjack sends custom fields. We're going to use those for multi-user
  # Kibana access control.
  mutate {
    add_tag => [ "%{customer}" ]
  }
  mutate {
    remove => [ "customer" ]
  }

  # strip the syslog PRI part and create facility and severity fields.
  # the original syslog message is saved in field %{syslog_raw_message}.
  # the extracted PRI is available in the %{syslog_pri} field.
  #
  # You get %{syslog_facility_code} and %{syslog_severity_code} fields.
  # You also get %{syslog_facility} and %{syslog_severity} fields if the
  # use_labels option is set True (the default) on syslog_pri filter.
  grok {
    type => "syslog"
    pattern => [ "<%{POSINT:syslog_pri}>%{SPACE}%{GREEDYDATA:message_remainder}" ]
    add_tag => "got_syslog_pri"
    add_field => [ "syslog_raw_message", "%{@message}" ]
  }
  syslog_pri {
    type => "syslog"
    tags => [ "got_syslog_pri" ]
  }
  mutate {
    type => "syslog"
    tags => [ "got_syslog_pri" ]
    replace => [ "@message", "%{message_remainder}" ]
  }
  mutate {
    # XXX must not be combined with replacement which uses same field
    type => "syslog"
    tags => [ "got_syslog_pri" ]
    remove => [ "message_remainder" ]
  }

  # strip the syslog timestamp and force event timestamp to be the same.
  # the original string is saved in field %{syslog_timestamp}.
  # the original logstash input timestamp is saved in field %{received_at}.
  grok {
    type => "syslog"
    pattern => [ "%{SYSLOGTIMESTAMP:syslog_timestamp}%{SPACE}%{GREEDYDATA:message_remainder}" ]
    add_tag => "got_syslog_timestamp"
    add_field => [ "received_at", "%{@timestamp}" ]
  }
  mutate {
    type => "syslog"
    tags => [ "got_syslog_timestamp" ]
    replace => [ "@message", "%{message_remainder}" ]
  }
  mutate {
    # XXX must not be combined with replacement which uses same field
    type => "syslog"
    tags => [ "got_syslog_timestamp" ]
    remove => [ "message_remainder" ]
  }
  date {
    type => "syslog"
    tags => [ "got_syslog_timestamp" ]
    # season to taste for your own syslog format(s)
    syslog_timestamp => [ "MMM  d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ]
  }

  # strip the host field from the syslog line.
  # the extracted host field becomes the logstash %{@source_host} metadata
  # and is also available in the field %{syslog_hostname}.
  # the original logstash source_host is saved in field %{logstash_source}.
  grok {
    type => "syslog"
    pattern => [ "%{SYSLOGHOST:syslog_hostname}%{SPACE}%{GREEDYDATA:message_remainder}" ]
    add_tag => [ "got_syslog_host", "%{syslog_hostname}" ]
    add_field => [ "logstash_source", "%{@source_host}" ]
  }
  mutate {
    type => "syslog"
    tags => [ "got_syslog_host" ]
    replace => [ "@source_host", "%{syslog_hostname}" ]
    replace => [ "@message", "%{message_remainder}" ]
  }
  mutate {
    # XXX must not be combined with replacement which uses same field
    type => "syslog"
    tags => [ "got_syslog_host" ]
    remove => [ "message_remainder" ]
  }

  grok {
    type => "nginx_access"
    pattern => "%{COMBINEDAPACHELOG}"
  }
  grok {
    type => "varnishncsa"
    pattern => "%{COMBINEDAPACHELOG}"
  }
}

output {
#  stdout { debug => true debug_format => "json" }

  elasticsearch {
    cluster => "my-logstash-cluster"
  }
}

By the way, you'll notice that a lot of the filters have "type" set, as well as the inputs. This is so you can limit which type a filter acts on. Additionally, you can add multiple outputs that send only certain types to different destinations. For example, you might want to send Varnish and Nginx logs to statsd in addition to Elasticsearch, or maybe you want to send certain events to PagerDuty, or perhaps even to IRC. There are currently over 45 outputs that come with Logstash, and if what you need isn't there, you can always write your own. And Lumberjack isn't the only input; it's just the one we're focusing on here.
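
As a hedged sketch of that kind of routing, the output section could grow a statsd entry that only fires for the nginx_access type. The "response" field comes from the COMBINEDAPACHELOG grok pattern above; double-check the option names against the statsd output docs for your Logstash version.

output {
  elasticsearch {
    cluster => "my-logstash-cluster"
  }

  # count Nginx responses by status code in statsd/Graphite
  statsd {
    type => "nginx_access"
    host => "localhost"
    namespace => "logstash"
    increment => [ "nginx.response.%{response}" ]
  }
}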

The above logstash.conf also assumes that you have Grok installed and that you're running on a multicast-enabled network. Don't get discouraged waiting for Logstash to start, either. It typically takes 70-90 seconds in my experience, although I've seen it take up to two minutes to start up and fully connect to Elasticsearch. I banged my head against the wall on my first day with Logstash trying to figure out what was wrong - and then I went and got another cup of coffee and came back to see it finally running.

Start Shipping with Lumberjack

I know, you're probably thinking one of two things at this point.

  1. Logstash can ship logs, too.
  2. Rsyslog can ship logs and ships with most distros.

And you're right. But when you're shipping a lot of logs, especially on a system with limited resources, you need a lightweight solution that does nothing but ship. Enter Lumberjack.

Remember, Logstash is Java. Running as a shipper, it's probably going to use a significant chunk of your memory. While that might be acceptable on a 32GB server, if your deployment consists mostly of 512MB or 1GB cloud servers for web nodes, then you're giving up a significant share of resources that could be put to better use.

The next option might be to ship using Rsyslog. But even then, I've seen that grow to over 60MB of memory usage, and I've heard people talk about over 100MB of memory usage. For shipping logs.

Both of those scenarios are unacceptable to me. Lumberjack uses a whopping 64K in my experience. (For reference, the busiest server I have at this moment generates >1M entries per day from the Nginx logs.)

First, you'll need to clone the repo, and then it's just the usual "make". However, to make my life a little easier, I like to use a neat little Rubygem called "fpm". With that, I run "make deb" after running "make", and it gives me a Debian package that I can reuse over and over again.

apt-get install rubygems
gem install fpm
export PATH=$PATH:/var/lib/gems/1.8/bin
git clone https://github.com/jordansissel/lumberjack.git
cd lumberjack
make
make deb
dpkg -i lumberjack_0.0.8_amd64.deb

So now that we have Lumberjack, how do we run it? You just need to tell Lumberjack which files to ship and where to ship them. The basic command is "/opt/lumberjack/bin/lumberjack --host your.logstash.host --port port-for-these-logs --ssl-ca-path /etc/ssl/logstash.pub /path/to/your.log /path/to/your/other.log". You can also pass custom fields using "--field key=value". (The custom fields can be useful in a multi-user Kibana environment.)
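
For example, shipping syslog to the first input defined in the logstash.conf above might look like this (the IP and log paths are placeholders):

/opt/lumberjack/bin/lumberjack \
  --host 172.16.0.52 --port 5000 \
  --ssl-ca-path /etc/ssl/logstash.pub \
  --field customer=clientname \
  /var/log/syslog /var/log/auth.log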

To make my life easier, I run the following init script:

#! /bin/sh
### BEGIN INIT INFO
# Provides:          lumberjack-nginx
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start daemon at boot time
# Description:       Enable service provided by daemon.
### END INIT INFO

. /lib/lsb/init-functions

name="lumberjack-nginx"

# Read configuration variable file if it is present
[ -r /etc/default/$name ] && . /etc/default/$name

lumberjack_bin="/opt/lumberjack/bin/lumberjack.sh"
pid_file="/var/run/$name.pid"
cwd=`pwd`

start () {
        command="${lumberjack_bin}"

        if start-stop-daemon --start --quiet --oknodo --pidfile "$pid_file" -b -m -N 19 --exec $command -- $LUMBERJACK_OPTIONS; then
                log_end_msg 0
        else
                log_end_msg 1
        fi
}

stop () {
        start-stop-daemon --stop --quiet --oknodo --pidfile "$pid_file"
}

status () {
        status_of_proc -p $pid_file "" "$name"
}

case $1 in
        start)
                if status; then exit 0; fi
                echo -n "Starting $name: "
                start
                echo "$name."
                ;;
        stop)
                echo -n "Stopping $name: "
                stop
                echo "$name."
                ;;
        restart)
                echo -n "Restarting $name: "
                stop
                sleep 1
                start
                echo "$name."
                ;;
        status)
                status && exit 0 || exit $?
                ;;
        *)
                echo "Usage: $0 {start|stop|restart|status}"
                exit 1
                ;;
esac

exit 0

You might have noticed it's called "lumberjack-nginx". That's because I run a separate daemon for shipping different log formats, such as MySQL, Varnish, Syslog, etc. The options are defined in /etc/default/lumberjack-nginx (or whatever you're shipping.)

LUMBERJACK_OPTIONS="--field customer=clientname --host 172.16.0.52 --port 5001 --ssl-ca-path /etc/ssl/logstash.pub /var/log/nginx/access.log"

Bonus: Kibana

Kibana is awesome. No really. It's a really great front-end for Elasticsearch and makes analyzing your logs much, much more enjoyable. It's super simple to install, too.

First, you'll need to clone the repo and install its dependencies:

git clone --branch=kibana-ruby https://github.com/rashidkpc/Kibana.git /opt/kibana
apt-get install rubygems libcurl4-openssl-dev
export PATH=$PATH:/var/lib/gems/1.8/bin
cd /opt/kibana
bundle install

By the way, there's also a "kibana-ruby-auth" branch for multiuser installs. It seems to work pretty well in my testing, although there have been several bugs in the interface. At last checkout, there is an issue with the "Home" button linking to "index.html", which needs to be changed to "/". I'm not up to speed on Ruby, so I don't know what the "right" fixes are, but I get by.

If you want to play with the multiuser install (not recommended for production at this time), you'll also need the libpam0g-dev package.

After that, you just need to make some minor changes to KibanaConfig.rb (such as pointing it to the right Elasticsearch server and making sure Kibana will listen on the right ports) and run "ruby kibana.rb", or use this handy init script (there's a sample of those KibanaConfig.rb settings right after it):

#!/bin/bash

# Author - Matt Mencel - 2012
# Borrowed heavily from Patrick McKenzie's example
# http://www.kalzumeus.com/2010/01/15/deploying-sinatra-on-ubuntu-in-which-i-employ-a-secretary/
#
# kibana-daemon       Startup script for the Kibana Web UI running on sinatra
# chkconfig: 2345 20 80
#
## Copy to /etc/init.d/kibana and make it executable
## Add to system startup through chkconfig (CentOS/RHEL) or Upstart (Ubuntu)
KIBANA_PATH="/opt/kibana"

ruby1.8 $KIBANA_PATH/kibana-daemon.rb $1
RETVAL=$?

exit $RETVAL
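
For reference, the KibanaConfig.rb tweaks I mentioned usually amount to something like the excerpt below. These constant names are from memory of the kibana-ruby branch, so verify them against your own checkout.

# /opt/kibana/KibanaConfig.rb (excerpt)
module KibanaConfig
  Elasticsearch = "127.0.0.1:9200"   # point this at your Elasticsearch node
  KibanaHost = "0.0.0.0"             # listen on all interfaces
  KibanaPort = 5601
end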

So... what does this have to do with Drupal?

It will make your Drupal logging life easier.

No, really. I'm sure that most people have their Drupal site send all log messages to the database, and that's fine if you have a low-traffic site. But what about when you need to actually search for something, such as the messages associated with a particular issue your users are sporadically experiencing? If you've ever tried to do that through the DB log, you probably realize how painful it can be.

Why not take advantage of another built-in feature of Drupal? Drupal can send its log messages to syslog (via the core Syslog module). If you want to go to a specific file only for Drupal, there's a contrib module for that. With Drupal logging to the filesystem, you take a load off of the database and you're able to easily ship your logs elsewhere. They can be sent to be indexed in Elasticsearch for easy search and analysis through Kibana, or to several other places such as Graphite, Email, IRC, PagerDuty, etc. You get to decide how and where your logs are processed.
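
If you go the syslog route and want Drupal's messages in their own file for shipping, a small rsyslog rule does the trick. This sketch assumes you've pointed the core Syslog module at the local0 facility and that /var/log/drupal exists.

# /etc/rsyslog.d/30-drupal.conf
local0.*    /var/log/drupal/drupal.log
# discard the message afterwards so it doesn't also land in /var/log/syslog
& ~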

And if you have the spare RAM and aggressive log rotation, you could just send your logs to tmpfs and avoid disk load altogether. After all, you are shipping them out to be stored elsewhere.
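
A minimal fstab entry for that looks like the line below; the size is a guess you'd tune to your own log volume and rotation schedule.

# /etc/fstab - keep Nginx logs in RAM; they're shipped off-box anyway
tmpfs  /var/log/nginx  tmpfs  defaults,noatime,size=256m  0  0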

Even if you can't log to a file, Logstash can help. Logstash has an input for Drupal's DBLog. All you have to do is tell it what database to connect to and it handles the rest. It only pulls the log messages that it hasn't received previously. This can be useful if you don't want to deal with configuring Drupal to log to Syslog or another file, or if you're one of those crazy types that runs multiple web heads without configuration management to ensure that everyone is shipping file-based logs properly and consistently.
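
I haven't leaned on it much myself, but the input config looks roughly like the sketch below. Treat the option names and connection string as assumptions and check the drupal_dblog input docs for your Logstash version.

input {
  # hedged sketch: poll Drupal's watchdog table directly
  drupal_dblog {
    databases => [ "mysite", "mysql://drupal:password@127.0.0.1/drupal" ]
    interval  => 10
  }
}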

One situation where I've found shipping Drupal logs useful is debugging. A good Drupal module logs errors (and optional debugging messages) through watchdog. If you're just leaving those watchdogs in DBLog or syslog, you've got a tough time ahead of you searching for issues. If you're having your watchdogs shipped through Logstash, you can have them indexed in Elasticsearch and search through the messages, or you can have them graphed by statsd to see if there's a trend for when your users are hitting errors, or you can have your developers nagged in IRC.

Another useful situation is easily tracking logins and user signups. Now you're able to easily build charts showing when users are signing up or how often they are logging in, as well as figure out if someone is setting up multiple fake accounts.

I love this so far! I want more!

Don't worry, I'll keep blogging about some tips and tricks as time allows. Business has just kept me busy in recent months (which is what it's supposed to do). This post has actually been in a draft state for a few months now, with little additions as time became available. It assumes that your Elasticsearch, Logstash, and Kibana servers are on separate hosts. As usual, adjust to your specific needs.

By the way, I'll also be doing a session on this topic at Drupalcon Portland. You'll definitely want to be there!

Comments

Hello,
Thanks for your post. It's exactly what I'm looking for, not for logging errors but for implementing a user tracking system.

Heads up: The stable, C version of lumberjack has moved into a branch on GitHub. I used:

```
cd lumberjack
git checkout -b c origin/c
make
make deb
```

Is there a Windows version of Lumberjack? Or a similar lightweight Windows process to ship logs to Logstash? Currently I am logging to a file using NLog.

You should update this to include embedded => true in the Logstash elasticsearch output section. They must have changed something recently that affected the defaults, and Kibana will not work since it won't bring up ports 9200 and 9300.

Hi,

I love this post. I have a question. Why don't you use redis ?

As far as I know, Redis doesn't support encryption or compression, both of which I find important in some circumstances. I can deal with a few duplicate events here and there.

I try to weigh added complexity against real benefits, and I couldn't find enough of a benefit to add the workload of managing another application.

Perhaps if I had another use for Redis, I might consider it in the future.

Hi,

Thanks a lot for your post, I would like to know more about multi-user Kibana access control which you mentioned. I tried to search for it, but haven't found anything valuable yet. In our systems it is absolutely mandatory to have access control, meaning both authentication to access Kibana but also controlling what logs can be seen & searched by which user. If you have any information on this topic please tell me about it.
Thanks!

Sorry for being so late in getting back to this. It's been pretty busy around here.

The multi-user option mentioned in the blog post is from the kibana-ruby-auth branch. Going that route, you're able to set the tags that a user is allowed to see. Lumberjack doesn't ship tags, so you can ship a custom field with Lumberjack and have Logstash turn that field into a tag and then remove that field before shipping to Elasticsearch.

Currently, an option with Kibana 3 is to put it behind HTTP Basic authentication with either Nginx or Apache and rewrite to a Kibana installation path that has that user's configuration. If you're needing to limit on the Elasticsearch side (since ES doesn't include authentication), you can set up an Nginx proxy in front of it that rewrites HTTP Basic authentication to a particular user's index.

How do I configure email alerting for any random string in my application server logs, like OutOfMemory, etc., in WebLogic/JBoss logs?

Make sure the logs coming in from those sources are being typed by Logstash as those sources. Then use the grep filter to grep for those strings and add a tag. Then use the email output to email those events that have had that tag added.

(Note that the docs I linked are for 1.1.13. The current version is 1.2.1, and some things like "type" have been deprecated. I haven't had the chance to work with the 1.2.* versions yet, so I'm not sure what the proper answer would be on the current version at this time.)

Hey, great article!

What about the ES mappings for these various log sources? Have you customised them?

Thanks for the post!
I ran into a problem when trying this out with the following error:
Exception in thread "LogStash::Runner" org.jruby.exceptions.RaiseException: (LoadError) no such file to load -- base64

I figured posting here might help others, it was related to running the web component in the same process:
https://logstash.jira.com/browse/LOGSTASH-1310
The fix was to run the web component before the agent as per the workaround in the link above:
java -jar logstash-1.2.2-flatjar.jar web -- agent -f /etc/logstash/logstash-simple.conf

Great article, you have explained everything in detail. In my previous installation I had also set up "redis" as a broker between Logstash and ES.

However, that was a test installation. Now, moving to production, I am planning to use Lumberjack instead of the Logstash shipper. The only thing confusing me is that you are not using "redis" anywhere. Do you have any idea what the benefit of using "redis" is? To me it seems like an extra thing/overhead put in the middle of a beautiful Logstash/ES setup.

The primary benefit of using Redis is to act as a FIFO buffer. The log parsing process in Logstash can be very CPU intensive, so Redis can help to make sure that your messages from a sudden burst in activity actually make it to Logstash.

That said, with Lumberjack you can specify multiple hosts to ship logs to and it will pick a host at random.

This is very useful thanks. But I wonder if you could provide any wisdom on:

Visualizing
Notifications

I'm looking to set up a decent logging system to collect data from several servers. While I can see this would be perfect in terms of collection, and that Kibana will do a great job of searching through the logs, I need a visual overview of the health of, say, the web servers, among other things. I also need notifications when an exception is thrown, etc. On these points the options are many and varied - I'd appreciate any suggestions. Is there anything that can specifically tie into this?

You need to look at something like Nagios or another monitoring system that can alert you when something is triggered; this post is just about setting up centralized logging.

As Johnson said, Nagios would be a very good option for monitoring. However, you can also use the filters in Logstash to route certain data to other places like Statsd, Nagios, PagerDuty, etc.

For example, I usually route HTTP status codes and error message counts from Logstash to Statsd to get a graph in Graphite of trends over time. Certain very critical messages get routed directly to PagerDuty to nag me with emails and text messages.

The Logstash/Kibana/Elasticsearch/Lumberjack combo is very cool for log management, searching, etc. Has anyone tried to add Esper to the combo? I'm thinking of trying it to have it all act like a SIEM platform.

Thanks for the article. Very informative.

We have a lumberjack => logstash => graylog2 setup for our log indexing and searching, with Lumberjack acting as the log shipper.
We are trying to optimize resource usage, especially the memory that lumberjack is using on the app servers.

Below is our lumberjack configuration:
{
  "network": {
    "servers": [ "logstash-server-ip" ],
    "ssl ca": "/mnt/logstash-forwarder/server.crt",
    "ssl key": "/mnt/logstash-forwarder/server.key",
    "timeout": 15
  },
  "files": [
    {
      "paths": [
        "/mnt/fabus8/shared/log/staging.log",
        "/mnt/nginx/logs/error.log",
        "/mnt/nginx/logs/access.log"
      ],
      "fields": { "type": "staging" }
    }
  ]
}

Currently each lumberjack is taking around 430+ MB on each app server.
Are we doing anything wrong here?
Is there any way we can reduce this RAM usage?

Thanks
Abhay

Thanks for the great article. It's really helpful.

In the Logstash documentation (http://logstash.net/docs/1.3.3/tutorials/getting-started-centralized) an architecture is given as below:
Logstash Shipper --> Redis --> Logstash Indexer --> ElasticSearch
My question is: since Logstash has an output plugin for Elasticsearch, can we simplify the architecture to
Logstash Shipper --> Elasticsearch?
Please explain in detail the need for Redis and the Logstash indexer in between.
