Skip to content

Allow Logstash to write its logs in JSON format #4820

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

jordansissel
Copy link
Contributor

@jordansissel jordansissel commented Mar 15, 2016

This is made available by a --log-in-json flag. Default is false.
When false, the old behavior [1] is used.

When true, JSON logs are emitted.

[1] The old behavior is realy two things. First, using Object#inspect to
serialize. Second, to color the output if the IO is a tty.

For #1569

@jordansissel jordansissel changed the title Allow Logstash to write its logs in JSON format [WIP] Allow Logstash to write its logs in JSON format Mar 15, 2016
@@ -90,6 +90,10 @@ class LogStash::Runner < Clamp::Command
I18n.t("logstash.web_api.flag.http_port"),
:attribute_name => :web_api_http_port, :default => 9600

option ["--[no-]log-in-json"], :flag,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts about renaming this flag to --logger.format to make it future proof for setting.yml? This way it translates easily to yml.. See #4499 /cc @jsvd

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be consistent, this pr should add the flag according to the current state of master. right now it should use "-", but if it happens to be merged after logstash.yml, then the pr needs to be changed

@jordansissel
Copy link
Contributor Author

Rebased.

@jordansissel
Copy link
Contributor Author

Hand-test that it still works post-rebase:

% bin/logstash --log-in-json -f /tmp/nonexistant
{"timestamp":"2016-04-17T10:46:13.551000-0700","message":"No config files found: /tmp/nonexistant\nCan you make sure this path is a logstash config file?","level":"error"}

@@ -102,6 +102,10 @@ class LogStash::Runner < Clamp::Command
I18n.t("logstash.runner.flag.allow-env"),
:attribute_name => :allow_env, :default => false

option ["--[no-]log-in-json"], :flag,
I18n.t("logstash.agent.flag.log-in-json"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Translation missing in yml file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix this. Good catch.

@jordansissel
Copy link
Contributor Author

Added tests

@jordansissel
Copy link
Contributor Author

Added help text:

    --[no-]log-in-json            Specify that Logstash should write its own logs in JSON form - one
                                  event per line. If false, Logstash will log using Ruby's
                                  Object#inspect (not easy to machine-parse)
                                   (default: false)

@jordansissel jordansissel changed the title [WIP] Allow Logstash to write its logs in JSON format Allow Logstash to write its logs in JSON format Apr 20, 2016
@jordansissel
Copy link
Contributor Author

Squashed to a single commit.

@ph
Copy link
Contributor

ph commented Apr 20, 2016

@jordansissel I saw we are targeting this for 5.0 But I am wondering if we should merge it in 2.3.X as an experimental flag?

@jordansissel
Copy link
Contributor Author

@ph I'd be ok merging this into 2.3 and 2.x as a flag that defaults to the old behavior. We can have a new PR later that enables JSON by default in master. Thoughts?

@jordansissel
Copy link
Contributor Author

http://build-eu-00.elastic.co/job/Logstash_PR/2671/testReport/ -- Failing tests seem unrelated to this PR.

@ph
Copy link
Contributor

ph commented Apr 20, 2016

@jordansissel
yes +1 to making this this teh default logging mode in 5.0
+1 Make it a flag for 2.X 2.3.

@ph
Copy link
Contributor

ph commented Apr 20, 2016

@jordansissel Yes the faillure are unrelated. I think its time to make a PR on flores with https://github.com/logstash-plugins/logstash-input-beats/blob/master/spec/support/flores_extensions.rb#L29 ;)

@jordansissel
Copy link
Contributor Author

jordansissel commented Apr 20, 2016

@ph ok cool. I'll target this for 2.x, 2.4, 5.0, and master

@jordansissel
Copy link
Contributor Author

Summarizing some offline discussion between @ph, @suyograo, and myself.

Given this is a new feature, we'll target this to Logstash 2.4.0 insetad of 2.3.2 because it's a feature, not a bug fix.

@jordansissel
Copy link
Contributor Author

Merge targets: 2.x, 2.4, 5.0, master

@ph
Copy link
Contributor

ph commented Apr 21, 2016

@jordansissel I found our first serialization errors when running logstash in debug mode.

{"timestamp":"2016-04-21T10:58:04.833000-0400","message":"* Environment: development","level":"debug","file":"logstash/webserver.rb","line":"55","method":"log"}
LogStash::Json::GeneratorError: Cannot serialize instance of: LogStash::Json::GeneratorError
      jruby_dump at /Users/ph/es/logstash/logstash-core/lib/logstash/json.rb:52
              << at /Users/ph/es/logstash/logstash-core/lib/logstash/logging/json.rb:14
              << at /Users/ph/es/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.8.1/lib/cabin/subscriber.rb:10
         publish at /Users/ph/es/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.8.1/lib/cabin/channel.rb:196
            each at org/jruby/RubyHash.java:1342
         publish at /Users/ph/es/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.8.1/lib/cabin/channel.rb:193
     synchronize at org/jruby/ext/thread/Mutex.java:149
         publish at /Users/ph/es/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.8.1/lib/cabin/channel.rb:192
            _log at /Users/ph/es/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.8.1/lib/cabin/mixins/logger.rb:113
  log_with_level at /Users/ph/es/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.8.1/lib/cabin/mixins/logger.rb:90
           fatal at /Users/ph/es/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.8.1/lib/cabin/mixins/logger.rb:75
         execute at /Users/ph/es/logstash/logstash-core/lib/logstash/runner.rb:231
             run at /Users/ph/es/logstash/vendor/bundle/jruby/1.9/gems/clamp-0.6.5/lib/clamp/command.rb:67
             run at /Users/ph/es/logstash/vendor/bundle/jruby/1.9/gems/clamp-0.6.5/lib/clamp/command.rb:132
          (root) at /Users/ph/es/logstash/lib/bootstrap/environment.rb:72

out = File.new("/dev/null", "a")
logger.subscribe(LogStash::Logging::JSON.new(out))
logger.level = :debug
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will surely fails on windows, we can make it cross platform if we can create a dummy io object that implement the minimal interface to be an IO object but doesn't actually write anything to disk.

OR we can use this, which will make a null stream, I believe it will work on windows but I havent tested it?

File.open(File::NULL, "w")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused, why we need this code and the code in spec_helper?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. this setup_json_logging was an older implementation of what is now in spec_helper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it now.

@ph
Copy link
Contributor

ph commented May 9, 2016

Changes LGTM, small comment concerning the use of /dev/null, I think we can go ahead with this.
@jordansissel Running the all plugins tests will probably discover some other issues in the plugins for the serialization that we need to fix.

@andrewvc
Copy link
Contributor

andrewvc commented May 9, 2016

Regarding the serialization to_s issue @ph and @guyboertje have been discussing I think

recursing is fine but we should never call inspect. Inspect can:

  1. Generate massive output by accident
  2. Leak passwords if they're linked in the object graph (this has happened multiple times!)

Ultimately we should strive for a whitelist approach to log inclusion IMHO. I wouldn't even mind it if we traversed plain ruby Hash and Array objects, invoking .to_s on everything else.

@guyboertje
Copy link
Contributor

Just so people know what JrJackson does do...
It knows how the serialize the following:
String
Float
BigDecimal
Time
Array
Hash
Fixnum
Bignum
Date
Symbol
Struct
TrueClass
FalseClass
(I will be adding Exception/Error and Class)

If it finds an object class not in the above list then it tries to invoke these methods, in order, on the object:
to_json_data
to_time
to_h
to_hash
to_a
to_json

Why to_json last?
Because, for the best performance, I do not want a JSON string in this case, I want JSON data structures so Jackson can continue with caching and memoising. This is why to_json_data is first - I want an object author to give me a data structure.
When authors write a to_json method, they very often will write:

def to_json
  JSON.dump(some_structure)
end

This will suspend the JrJackson session, call the JSON library, and return a string for JrJackson to append.

@jordansissel jordansissel force-pushed the feature/log-in-json branch from 1fc388c to b7399f8 Compare May 9, 2016 19:50
@ph
Copy link
Contributor

ph commented May 9, 2016

LGTM @jordansissel

This is made available by a new `--log-in-json` flag. Default is false.
When false, the old behavior [1] is used. When true, JSON logs are
emitted.

[1] The old behavior is realy two things. First, using Object#inspect to
serialize. Second, to color the output if the IO is a tty.

Fixes #1569
Use an internal subscriber to verify that JSON output is valid JSON. The
purpose is to catch any json serialization errors that would occur while
logging.

Also had to update a few logger calls to log values that could be
serialized (Class instances and similar, at this time, fail to serialize
to JSON).
@jordansissel jordansissel force-pushed the feature/log-in-json branch from b7399f8 to 619f1eb Compare May 9, 2016 19:55
@elasticsearch-bot
Copy link

Jordan Sissel merged this into the following branches!

Branch Commits
master af41218, f241dd8, 07887f8
5.0 755e4b5, 3fa057e, 682194b

elasticsearch-bot pushed a commit that referenced this pull request May 9, 2016
This is made available by a new `--log-in-json` flag. Default is false.
When false, the old behavior [1] is used. When true, JSON logs are
emitted.

[1] The old behavior is realy two things. First, using Object#inspect to
serialize. Second, to color the output if the IO is a tty.

Fixes #1569

Fixes #4820
elasticsearch-bot pushed a commit that referenced this pull request May 9, 2016
Use an internal subscriber to verify that JSON output is valid JSON. The
purpose is to catch any json serialization errors that would occur while
logging.

Also had to update a few logger calls to log values that could be
serialized (Class instances and similar, at this time, fail to serialize
to JSON).

Fixes #4820
elasticsearch-bot pushed a commit that referenced this pull request May 9, 2016
elasticsearch-bot pushed a commit that referenced this pull request May 9, 2016
Use an internal subscriber to verify that JSON output is valid JSON. The
purpose is to catch any json serialization errors that would occur while
logging.

Also had to update a few logger calls to log values that could be
serialized (Class instances and similar, at this time, fail to serialize
to JSON).

Fixes #4820
elasticsearch-bot pushed a commit that referenced this pull request May 9, 2016
@jordansissel
Copy link
Contributor Author

I need to backport this PR by hand to 2.x for 2.4. Will open a separate PR for it.

jordansissel added a commit that referenced this pull request May 10, 2016
This is made available by a --log-in-json flag. Default is false.
When false, the old behavior [1] is used.

When true, JSON logs are emitted.

[1] The old behavior is realy two things. First, using Object#inspect to
serialize. Second, to color the output if the IO is a tty.

For #1569

This is a manual backport of PR #4820 into the 2.x branch.
elasticsearch-bot pushed a commit that referenced this pull request May 10, 2016
This is made available by a --log-in-json flag. Default is false.
When false, the old behavior [1] is used.

When true, JSON logs are emitted.

[1] The old behavior is realy two things. First, using Object#inspect to
serialize. Second, to color the output if the IO is a tty.

For #1569

This is a manual backport of PR #4820 into the 2.x branch.

Fixes #5277
@jordansissel
Copy link
Contributor Author

Backport to 2.4 done in #5277.

@suyograo suyograo deleted the feature/log-in-json branch May 11, 2016 15:50
@joshuaspence
Copy link

Will Logstash 2.4 be released any time soon?

@suyograo
Copy link
Contributor

@joshuaspence yes, target is to release it in Aug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants