Sending Java application logs to Elasticsearch using Fluent Bit

Written by Sudhanshu Prajapati in Fluent Bit on March 29, 2023

Java application logs can be very useful for debugging and understanding what is happening in your application. However, if you have a lot of logs, it can be challenging to analyze and understand them. That’s where Fluent Bit comes in. 

Fluent Bit helps to solve this problem by providing a flexible and configurable logging pipeline that can parse, filter, and route logs to multiple destinations. This allows you to easily segregate and filter logs based on the information contained within them, making it easier to identify and troubleshoot issues.

In this blog, we will demonstrate how to use Fluent Bit to send Java application logs to Elasticsearch (ES).

The Problem of Multiline Logs

To ensure accurate log data is sent to Elasticsearch, it’s crucial to properly configure Fluent Bit to parse log messages. Failure to do so can result in incomplete or incorrect data being sent.

The challenge comes with multiline logs, which occur when information about a single event is written as multiple lines in the log file. 

Without multiline parsing, Fluent Bit will treat each line of a multiline log message as a separate log record. This can lead to:

  • Duplicated logs

  • Loss of context

  • Inability to extract structured data

To handle multiline log messages properly, you need to configure a multiline parser in Fluent Bit. Depending on your log format, you can use one of the built-in multiline parsers (such as java) or define a custom one.
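To make the idea concrete, here is a minimal Python sketch of multiline grouping. It is not Fluent Bit's implementation: the continuation rule (stack-trace frames, "Caused by:" lines, and "... N common frames omitted" lines attach to the preceding record) is an assumption chosen to resemble how a Java stack trace is laid out, and the sample lines are shortened from the example log used later in this post.

```python
import re

# Assumed continuation rule: indented "at ..." frames, indented "... N" lines,
# and "Caused by:" lines belong to the record that precedes them.
CONTINUATION = re.compile(r"^(\s+at\s|\s+\.\.\.\s|Caused by:)")


def group_multiline(lines):
    """Merge continuation lines into the record that precedes them."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # drop empty lines
        if records and CONTINUATION.match(line):
            records[-1] += "\n" + line
        else:
            records.append(line)
    return records


sample = [
    "2023-03-13 16:20:36.405  INFO 1 --- [.31.2.3214:12] org.mongodb.driver.cluster : Exception in monitor thread while connecting to server",
    "",
    "com.mongodb.MongoSocketOpenException: Exception opening socket",
    "    at com.mongodb.internal.connection.SocketStream.initializeSocket( ~[mongodb-driver-core-4.4.2.jar!/:na]",
    "Caused by: connect timed out",
    "    ... 4 common frames omitted",
    "2023-03-13 16:20:29.040  INFO 1 --- [main] in.onecode.lms.LmsApplication : Started LmsApplication in 18.04 seconds (JVM running for 29.112)",
]

records = group_multiline(sample)
for number, record in enumerate(records):
    print(number, repr(record))
```

Without grouping, the six non-empty lines would become six records; with it, the exception and its frames travel together as one record.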

The parser filter, combined with a regex parser, can then be used to extract structured data from the parsed multiline log messages. The extracted fields can be used to enrich your log data.

We’ve covered this problem in more detail, with examples of [how to configure Fluent Bit to properly handle multiline logs](link to come), in another blog post. If you are unfamiliar with the process, you should review that post before proceeding.


For this demo, you will need Docker and Docker Compose installed. If you don’t have them installed already, follow the official Docker Compose installation documentation, which lays out the steps clearly. You also need an Elastic Cloud account; a trial account is enough for this demo, so you can sign up for a trial if you don’t have one.

Once you’re done with the installation, let’s look at the configuration for Fluent Bit. The configuration below is deliberately simple and is not meant for production use.

Configure Fluent Bit

Fluent Bit can be configured using a configuration file or command-line arguments. The configuration file uses a simple syntax that makes complex pipelines easy to manage, and its values can reference environment variables, which is a convenient way to pass in settings without hard-coding them. Once the configuration is set up, Fluent Bit can be run as a standalone process or as a sidecar in containerized environments.

For this demo, we will be going ahead with a configuration file. 


[SERVICE]
	flush 1
	log_level info

So far this file contains only the SERVICE section, which defines the global behavior of the Fluent Bit engine. We still need to define inputs and outputs.

Input & Parser Configuration 

Fluent Bit accepts data from a variety of sources using input plugins. The `tail` input plugin allows you to read from a text log file as though you were running the `tail -f` command.

Add the following to your fluent-bit.conf file.


[SERVICE]
	flush     	1
	log_level 	INFO
	parsers_file  parsers.conf

[INPUT]
	name          	tail
	tag           	*
	path          	/etc/data/test.log
	multiline.parser java
	skip_empty_lines  On
	refresh_interval  5
	read_from_head   true

The path parameter in Fluent Bit’s configuration may need to be adjusted based on the source of your logs. The plugin name, which identifies which plugin Fluent Bit should load, cannot be customized by the user. The tag parameter is optional but can be used for routing and filtering your data, as discussed in more detail below.


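Next, define the parser that the filter will use. Add the following to your parsers.conf file.

[PARSER]
	Name   java_multi
	Format regex
	Regex           	^(?<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3})\s+(?<log_level>\S+)\s+(?<pid>\S+)\s+---\s+\[\s*(?<thread_name>[^\]]+)\]\s+(?<logger_name>[^\s]+)\s+:\s+(?<message>.+)$
	Time_Key        	timestamp
	Time_Format     	%Y-%m-%d %H:%M:%S.%L
	Time_Keep       	On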
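If you want to check the regular expression outside Fluent Bit before wiring it in, a quick Python sketch like the one below can help. It is only an approximation: Python writes named groups as `(?P<name>...)` rather than the `(?<name>...)` form used in the parser definition, and `%f` stands in for Fluent Bit's `%L` fractional-seconds specifier. The group names are taken from the field names that appear in the Fluent Bit output shown later in this post, and the sample lines come from the example log.

```python
import re
from datetime import datetime

# Same pattern as the java_multi parser, rewritten with Python's (?P<name>...) syntax.
LINE_RE = re.compile(
    r"^(?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3})"
    r"\s+(?P<log_level>\S+)\s+(?P<pid>\S+)\s+---\s+\[\s*(?P<thread_name>[^\]]+)\]"
    r"\s+(?P<logger_name>[^\s]+)\s+:\s+(?P<message>.+)$"
)

structured = (
    "2023-03-13 16:20:29.017  INFO 1 --- [           main] "
    "o.s.b.w.embedded.tomcat.TomcatWebServer  : "
    "Tomcat started on port(s): 5000 (http) with context path ''"
)
unstructured = "Spring boot application running in IST timezone :Mon Mar 13 21:50:26 IST 2023"

fields = LINE_RE.match(structured).groupdict()
# Python's %f plays the role of Time_Format's %L here.
parsed_time = datetime.strptime(fields["timestamp"], "%Y-%m-%d %H:%M:%S.%f")

print(fields)
print(parsed_time.isoformat())
print(LINE_RE.match(unstructured))  # a line without the expected layout does not match
```

Lines that do not match, such as the plain "Spring boot application running..." message, are left unparsed, which is why they show up later under the original log key instead of as structured fields.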

Filter Configuration

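The parser filter applies the java_multi parser to the log key of every record. Add the following to your fluent-bit.conf file.

[FILTER]
	name         	parser
	match        	*
	key_name     	log
	parser       	java_multi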

Output Configuration for Elasticsearch

As with inputs, Fluent Bit uses output plugins to send the gathered data to their desired destinations.

To set up your configuration, you will need to gather some information from your Elastic Cloud deployment. See the image below for where to find it on the Elastic Cloud page.

  • HOST_NAME – The Elasticsearch endpoint of your Elastic Cloud deployment.

  • CLOUD_AUTH – The credentials provided to you when you created your Elasticsearch cluster. If you did not make a note of them, you can reset the password.

  • CLOUD_ID – The Cloud ID simplifies sending data to your cluster on Elastic Cloud.

screen capture

Once you have gathered the required information, add the following output sections to your fluent-bit.conf file.

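# optional: send the data to standard output for debugging
[OUTPUT]
	Name stdout
	Match *

# replace HOST_NAME, CLOUD_ID and CLOUD_AUTH with the values gathered above
[OUTPUT]
	Name es
	Match *
	Host HOST_NAME
	Cloud_ID CLOUD_ID
	Cloud_Auth CLOUD_AUTH
	# port used by the Elastic Cloud endpoint
	Port 9243
	tls On
	tls.verify Off
	Suppress_Type_Name On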
Tip: For more details on each parameter of the ES output plugin, see the Fluent Bit documentation for the Elasticsearch output.

The Match * parameter indicates that all of the data gathered by Fluent Bit will be forwarded to the Elastic Cloud instance. We could also match based on a tag defined in the input plugin. tls On ensures that the connection between Fluent Bit and the Elastic Cloud instance is encrypted. Port is set to 9243, the port the Elastic Cloud endpoint listens on in this setup; if you omit it, the es plugin falls back to its own default of 9200.

Note: We have also defined a secondary output that sends all the data to stdout. This is not required for the Elastic Cloud configuration but can be incredibly helpful if we need to debug our configuration.
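To illustrate how Match decides which outputs receive a record, here is a small Python sketch. It only approximates the behavior: `fnmatchcase` also understands `?` and `[...]`, which goes beyond the simple `*` wildcard discussed here, and the `java.*` pattern and the tags are hypothetical examples, not values from this demo.

```python
from fnmatch import fnmatchcase

# Two outputs, as in this post; the "java.*" pattern is a hypothetical alternative to "*".
outputs = [
    {"name": "stdout", "match": "*"},
    {"name": "es", "match": "java.*"},
]


def route(tag, outputs):
    """Return the names of the outputs whose match pattern accepts the tag."""
    return [output["name"] for output in outputs if fnmatchcase(tag, output["match"])]


print(route("java.app", outputs))      # ['stdout', 'es']
print(route("nginx.access", outputs))  # ['stdout']
```

With Match * on both outputs, as in the configuration above, every record reaches both stdout and Elasticsearch.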

Start Sending Your Logs!

For ease of setup, I’ve written the docker-compose file shown below. It runs Fluent Bit locally against an example Java application log, so you have everything you need to get started.

Example log

2023-03-13 16:20:25.995  INFO 1 --- [       	main] o.h.e.t.j.p.i.JtaPlatformInitiator   	: HHH000490: Using JtaPlatform implementation: [org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform]
2023-03-13 16:20:26.017  INFO 1 --- [       	main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
Spring boot application running in IST timezone :Mon Mar 13 21:50:26 IST 2023
2023-03-13 16:20:26.294  INFO 1 --- [       	main] org.mongodb.driver.cluster           	: Cluster created with settings {hosts=[], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='2000 ms'}
2023-03-13 16:20:28.187  WARN 1 --- [       	main] JpaBaseConfiguration$JpaWebConfiguration : is enabled by default. Therefore, database queries may be performed during view rendering. Explicitly configure to disable this warning
2023-03-13 16:20:28.948  INFO 1 --- [       	main] o.s.b.a.e.web.EndpointLinksResolver  	: Exposing 1 endpoint(s) beneath base path '/actuator'
2023-03-13 16:20:29.017  INFO 1 --- [       	main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 5000 (http) with context path ''
2023-03-13 16:20:29.040  INFO 1 --- [       	main] in.onecode.lms.LmsApplication        	: Started LmsApplication in 18.04 seconds (JVM running for 29.112)
2023-03-13 16:20:36.405  INFO 1 --- [.31.2.3214:12] org.mongodb.driver.cluster           	: Exception in monitor thread while connecting to server

com.mongodb.MongoSocketOpenException: Exception opening socket
    at ~[mongodb-driver-core-4.4.2.jar!/:na]
    at ~[mongodb-driver-core-4.4.2.jar!/:na]
    at com.mongodb.internal.connection.DefaultServerMonitor$ServerMonitorRunnable.lookupServerDescription( ~[mongodb-driver-core-4.4.2.jar!/:na]
    at com.mongodb.internal.connection.DefaultServerMonitor$ ~[mongodb-driver-core-4.4.2.jar!/:na]
    at [na:1.8.0_362]
Caused by: connect timed out
    at Method) ~[na:1.8.0_362]
    at ~[na:1.8.0_362]
    at ~[na:1.8.0_362]
    at ~[na:1.8.0_362]
    at ~[na:1.8.0_362]
    at ~[na:1.8.0_362]
    at com.mongodb.internal.connection.SocketStreamHelper.initialize( ~[mongodb-driver-core-4.4.2.jar!/:na]
    at com.mongodb.internal.connection.SocketStream.initializeSocket( ~[mongodb-driver-core-4.4.2.jar!/:na]
    at ~[mongodb-driver-core-4.4.2.jar!/:na]
    ... 4 common frames omitted
2023-03-13 16:20:29.040  INFO 1 --- [       	main] in.onecode.lms.LmsApplication        	: Started LmsApplication in 18.04 seconds (JVM running for 29.112)


version: "3"

services:
  fluent-bit:
    image: fluent/fluent-bit
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - ./parsers.conf:/fluent-bit/etc/parsers.conf
      - ./log/:/etc/data

Put fluent-bit.conf, parsers.conf, and docker-compose.yaml in one directory. Within that directory, create a subdirectory named log and place the test.log file in it. The directory structure should look like this:

├── docker-compose.yaml
├── fluent-bit.conf
├── log
│   └── test.log
└── parsers.conf

Now, run the command below to get things up and running; make sure you’re running the terminal within the same directory.

➜  fluent-bit-demo: $ docker compose up --build
[+] Running 2/2
 ⠿ Network fluent-bit-demo_default     	Created                                                                                                                          	0.1s
 ⠿ Container fluent-bit-demo-fluent-bit-1  Created                                                                                                                          	0.0s
Attaching to fluent-bit-demo-fluent-bit-1
fluent-bit-demo-fluent-bit-1  | Fluent Bit v2.0.9
fluent-bit-demo-fluent-bit-1  | * Copyright (C) 2015-2022 The Fluent Bit Authors
fluent-bit-demo-fluent-bit-1  | * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
fluent-bit-demo-fluent-bit-1  | *
fluent-bit-demo-fluent-bit-1  |
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [fluent bit] version=2.0.9, commit=4c0ca4fc5f, pid=1
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [storage] ver=1.4.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [cmetrics] version=0.5.8
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [ctraces ] version=0.2.7
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [input:tail:tail.0] initializing
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [input:tail:tail.0] multiline core started
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [output:stdout:stdout.0] worker #0 started
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [sp] stream processor started
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [output:es:es.1] worker #0 started
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [output:es:es.1] worker #1 started
fluent-bit-demo-fluent-bit-1  | [2023/03/20 07:48:29] [ info] [input:tail:tail.0] inotify_fs_add(): inode=6815793 watch_fd=1 name=/etc/data/test.log
fluent-bit-demo-fluent-bit-1  | [0] [1678724425.995000000, {"timestamp"=>"2023-03-13 16:20:25.995", "log_level"=>"INFO", "pid"=>"1", "thread_name"=>"main", "logger_name"=>"o.h.e.t.j.p.i.JtaPlatformInitiator", "message"=>"HHH000490: Using JtaPlatform implementation: [org.hibernate.engine.transaction.jta.platform.internal.NoJtaPlatform]"}]
fluent-bit-demo-fluent-bit-1  | [1] [1678724426.017000000, {"timestamp"=>"2023-03-13 16:20:26.017", "log_level"=>"INFO", "pid"=>"1", "thread_name"=>"main", "logger_name"=>"j.LocalContainerEntityManagerFactoryBean", "message"=>"Initialized JPA EntityManagerFactory for persistence unit 'default'"}]
fluent-bit-demo-fluent-bit-1  | [2] [1679298509.680409438, {"log"=>"Spring boot application running in IST timezone :Mon Mar 13 21:50:26 IST 2023
fluent-bit-demo-fluent-bit-1  | "}]
fluent-bit-demo-fluent-bit-1  | [3] [1678724426.294000000, {"timestamp"=>"2023-03-13 16:20:26.294", "log_level"=>"INFO", "pid"=>"1", "thread_name"=>"main", "logger_name"=>"org.mongodb.driver.cluster", "message"=>"Cluster created with settings {hosts=[], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='2000 ms'}"}]
fluent-bit-demo-fluent-bit-1  | [4] [1678724428.187000000, {"timestamp"=>"2023-03-13 16:20:28.187", "log_level"=>"WARN", "pid"=>"1", "thread_name"=>"main", "logger_name"=>"JpaBaseConfiguration$JpaWebConfiguration", "message"=>" is enabled by default. Therefore, database queries may be performed during view rendering. Explicitly configure to disable this warning"}]
fluent-bit-demo-fluent-bit-1  | [5] [1678724428.948000000, {"timestamp"=>"2023-03-13 16:20:28.948", "log_level"=>"INFO", "pid"=>"1", "thread_name"=>"main", "logger_name"=>"o.s.b.a.e.web.EndpointLinksResolver", "message"=>"Exposing 1 endpoint(s) beneath base path '/actuator'"}]
fluent-bit-demo-fluent-bit-1  | [6] [1678724429.017000000, {"timestamp"=>"2023-03-13 16:20:29.017", "log_level"=>"INFO", "pid"=>"1", "thread_name"=>"main", "logger_name"=>"o.s.b.w.embedded.tomcat.TomcatWebServer", "message"=>"Tomcat started on port(s): 5000 (http) with context path ''"}]
fluent-bit-demo-fluent-bit-1  | [7] [1678724429.040000000, {"timestamp"=>"2023-03-13 16:20:29.040", "log_level"=>"INFO", "pid"=>"1", "thread_name"=>"main", "logger_name"=>"in.onecode.lms.LmsApplication", "message"=>"Started LmsApplication in 18.04 seconds (JVM running for 29.112)"}]
fluent-bit-demo-fluent-bit-1  | [8] [1678724436.405000000, {"timestamp"=>"2023-03-13 16:20:36.405", "log_level"=>"INFO", "pid"=>"1", "thread_name"=>".31.2.3214:12", "logger_name"=>"org.mongodb.driver.cluster", "message"=>"Exception in monitor thread while connecting to server"}]
fluent-bit-demo-fluent-bit-1  | [9] [1679298509.680436028, {"log"=>"com.mongodb.MongoSocketOpenException: Exception opening socket
fluent-bit-demo-fluent-bit-1  |     	at ~[mongodb-driver-core-4.4.2.jar!/:na]
fluent-bit-demo-fluent-bit-1  |     	at ~[mongodb-driver-core-4.4.2.jar!/:na]
fluent-bit-demo-fluent-bit-1  |     	at com.mongodb.internal.connection.DefaultServerMonitor$ServerMonitorRunnable.lookupServerDescription( ~[mongodb-driver-core-4.4.2.jar!/:na]
fluent-bit-demo-fluent-bit-1  |     	at com.mongodb.internal.connection.DefaultServerMonitor$ ~[mongodb-driver-core-4.4.2.jar!/:na]
fluent-bit-demo-fluent-bit-1  |     	at [na:1.8.0_362]
fluent-bit-demo-fluent-bit-1  | Caused by: connect timed out
fluent-bit-demo-fluent-bit-1  |     	at Method) ~[na:1.8.0_362]
fluent-bit-demo-fluent-bit-1  |     	at ~[na:1.8.0_362]
fluent-bit-demo-fluent-bit-1  |     	at ~[na:1.8.0_362]
fluent-bit-demo-fluent-bit-1  |     	at ~[na:1.8.0_362]
fluent-bit-demo-fluent-bit-1  |     	at ~[na:1.8.0_362]
fluent-bit-demo-fluent-bit-1  |     	at ~[na:1.8.0_362]
fluent-bit-demo-fluent-bit-1  |     	at com.mongodb.internal.connection.SocketStreamHelper.initialize( ~[mongodb-driver-core-4.4.2.jar!/:na]
fluent-bit-demo-fluent-bit-1  |     	at com.mongodb.internal.connection.SocketStream.initializeSocket( ~[mongodb-driver-core-4.4.2.jar!/:na]
fluent-bit-demo-fluent-bit-1  |     	at ~[mongodb-driver-core-4.4.2.jar!/:na]
fluent-bit-demo-fluent-bit-1  |     	... 4 common frames omitted
fluent-bit-demo-fluent-bit-1  | "}]
fluent-bit-demo-fluent-bit-1  | [10] [1678724429.040000000, {"timestamp"=>"2023-03-13 16:20:29.040", "log_level"=>"INFO", "pid"=>"1", "thread_name"=>"main", "logger_name"=>"in.onecode.lms.LmsApplication", "message"=>"Started LmsApplication in 18.04 seconds (JVM running for 29.112)"}]

Verify the pipeline

Once all the services defined in the docker-compose file are up, you can head over to the Elastic Cloud instance.

The screenshot below shows the logs arriving in the Elastic Cloud instance. The records contain structured fields, and thanks to multiline parsing, each stack trace appears as a single entry.

screen capture
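If you prefer to check from a script rather than the UI, the sketch below builds a search request for records with a given log_level. Several things here are assumptions rather than facts from this setup: the index name `fluent-bit` (used because no Index is set in the output section), CLOUD_AUTH being in `user:password` form, and the host name and credentials, which are placeholders. The request is only constructed, not sent.

```python
import base64
import json
import urllib.request


def build_search_request(host_name, cloud_auth, log_level, index="fluent-bit", port=9243):
    """Build (but do not send) a search request for records with the given log_level."""
    url = f"https://{host_name}:{port}/{index}/_search"
    body = json.dumps({"query": {"match": {"log_level": log_level}}}).encode("utf-8")
    # Basic authentication: base64 of "user:password" (assumed CLOUD_AUTH format).
    token = base64.b64encode(cloud_auth.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and credentials; substitute your own HOST_NAME and CLOUD_AUTH.
request = build_search_request("example.es.example.com", "elastic:changeme", "WARN")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen(request)` would send it; with the example log, a match on WARN should return the JpaBaseConfiguration warning if the records were indexed as expected.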


In conclusion, configuring Fluent Bit to parse log messages correctly is crucial for ensuring accurate and complete log data is sent to Elasticsearch. By properly handling multiline log messages, Fluent Bit can avoid treating each line as a separate log entry and instead extract the desired structured data. 

With Fluent Bit’s powerful parser plugin, it’s possible to extract structured data from log messages and store it in various data stores. This allows for more efficient analysis and querying of log data, making Fluent Bit an essential component of any logging pipeline.

Although Fluent Bit is a potent tool that can be manually configured with ease, managing it can become challenging as your infrastructure scales up. Simplify your pipeline management with Calyptia Core. With its user-friendly interface and simplified pipeline management, you can spend less time configuring monitoring and more time focusing on your business goals.

