Commit 552d622

Don't be all things to all people. Focus only on Heroku log format
1 parent 35290f7 commit 552d622

6 files changed

+63
-191
lines changed

README.md

Lines changed: 11 additions & 29 deletions
@@ -1,8 +1,8 @@
 heroku-log-parser
 =======

-A multi-provider [syslog (rfc5424)](http://tools.ietf.org/html/rfc5424#section-6) parser written in Ruby and specifically
-targeting Heroku's log drain format.
+A [syslog (rfc5424)](http://tools.ietf.org/html/rfc5424#section-6) parser written in Ruby and specifically
+targeting Heroku's [http log drain](https://devcenter.heroku.com/articles/labs-https-drains).

 ## Install

@@ -20,35 +20,20 @@ $ bundle install

 ## Usage

-heroku-log-parser is built on the concept of syslog message flavors. In my brief experience with syslog streams,
-everybody seems to do it differently. So it is necessary to handle these inconsistencies on a per-provider
-basis.
-
-Currently the only flavor available is the flavor for Heroku's log-stream.
-
-Create a parser based on your desired flavor:
-
 ```ruby
-log_parser = heroku-log-parser.parser(:heroku).new
-```
-
-A parser is a stateless, regex-based object that accepts a string of data holding one or more syslog messages
-and emits a hash containing the individual parts of a syslog message. For those unwilling to read the spec, the
-list of syslog tokens is as follows (and is stored in the `heroku-log-parser::SYSLOG_KEYS` array):
+msg_str = "156 <40>1 2012-11-30T06:45:26+00:00 heroku web.3 d.73ea7440-270a-435a-a0ea-adf50b4e5f5a - Starting process with command `bundle exec rackup config.ru -p 24405`"

-```ruby
-heroku-log-parser::SYSLOG_KEYS
-#=> [:priority, :syslog_version, :emitted_at, :hostname, :appname, :proc_id, :msg_id, :structured_data, :message]
+HerokuLogParser.parse(msg_str)
+#=> [{:priority=>40, :syslog_version=>1, :emitted_at=>2012-11-30 06:45:26 UTC, :hostname=>"heroku", :appname=>nil, :proc_id=>"web.3", :msg_id=>"d.73ea7440-270a-435a-a0ea-adf50b4e5f5a", :structured_data=>nil, :message=>"Starting process with command `bundle exec rackup config.ru -p 24405`"}]
 ```

-To parse a message packet, invoke the `events` method.
+`HerokuLogParser` is a stateless, regex-based parser that accepts a string of data holding one or more syslog messages
+and returns an array of syslog message properties for each message. For those unwilling to read the spec, the
+list of syslog tokens is as follows (and is stored in the `HerokuLogParser::SYSLOG_KEYS` array):

 ```ruby
-msg_str = "156 <40>1 2012-11-30T06:45:26+00:00 heroku web.3 d.73ea7440-270a-435a-a0ea-adf50b4e5f5a - Starting process with command `bundle exec rackup config.ru -p 24405`"
-log_parser.events(msg_str) do |event|
-  event.inspect
-  #=> {:priority=>40, :syslog_version=>1, :emitted_at=>2012-11-30 06:45:26 UTC, :hostname=>"heroku", :appname=>nil, :proc_id=>"web.3", :msg_id=>"d.73ea7440-270a-435a-a0ea-adf50b4e5f5a", :structured_data=>nil, :message=>"Starting process with command `bundle exec rackup config.ru -p 24405`"}
-end
+HerokuLogParser::SYSLOG_KEYS
+#=> [:priority, :syslog_version, :emitted_at, :hostname, :appname, :proc_id, :msg_id, :structured_data, :message]
 ```

 ## Contributions
@@ -59,12 +44,9 @@ end

 * TESTS!!!!
 * 2nd order parsing. For instance, for parsing a structured message body into key=value pairs (including the structured_data message part)
-* Docs for creating diff flavors
-* Pure Syslog compliant default parser
-* Less hacky flavor dynamic loading

 ## Issues

 Please submit all issues to the project's Github issues.

--[@rwdaigle](https://twitter.com/rwdaigle)
+-- [@rwdaigle](https://twitter.com/rwdaigle)
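
The README's "2nd order parsing" item is still open in this commit. As a rough sketch of what it might look like, the snippet below splits a message body into key=value pairs. `parse_pairs` is a hypothetical helper, not part of heroku-log-parser, and the sample line is made up for illustration; it assumes values are either double-quoted or contain no spaces.

```ruby
# Hypothetical helper (not part of heroku-log-parser): split a message body
# into key=value pairs. Assumes a value is either a double-quoted string or
# a run of non-space characters.
def parse_pairs(message)
  pairs = {}
  message.scan(/([\w.\-]+)=("[^"]*"|\S*)/) do |key, value|
    # Strip one pair of surrounding double quotes, if present.
    pairs[key] = value.gsub(/\A"|"\z/, '')
  end
  pairs
end

# Made-up sample body, for illustration only.
pairs = parse_pairs('at=info method=GET path="/users/1" status=200 service=12ms')
```

Values stay as strings here; converting `status` or `service` to numbers would be a further, application-specific step.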

heroku-log-parser.gemspec

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ require 'heroku-log-parser/version'

 Gem::Specification.new do |s|
   s.name = "heroku-log-parser"
-  s.version = heroku-log-parser::VERSION
+  s.version = HerokuLogParser::VERSION
   s.platform = Gem::Platform::RUBY
   s.author = "Ryan Daigle"
   s.email = ["[email protected]"]

lib/heroku-log-parser.rb

Lines changed: 51 additions & 48 deletions
@@ -1,74 +1,77 @@
-module HerokuLogParser
+class HerokuLogParser

   SYSLOG_KEYS = :priority, :syslog_version, :emitted_at, :hostname, :appname, :proc_id, :msg_id, :structured_data, :message

-  # Never done this before, doesn't feel very good
-  def self.parser(flavor = :heroku)
-    constantize("Parsley::Flavors::#{flavor.to_s.capitalize}")
-  end
-
-  # Shouldn't be referenced directly. Always request a flavor of a parser
-  # Parsley.parser(:heroku).new(syslog_str)
-  class Parser
+  class << self

-    def events(data_str, &block)
+    def parse(data_str)
+      events = []
       lines(data_str) do |line|
         if(matching = line.match(line_regex))
-          yield event_data(matching)
+          events << event_data(matching)
         end
       end
+      events
     end

     protected

-    # Since the Heroku format is the only one I've tested, it's the default until broader
-    # flavor support is added
+    # http://tools.ietf.org/html/rfc5424#page-8
+    # frame <priority>version time hostname <appname-missing> procid msgid [no structured data = '-'] msg
+    # 120 <40>1 2012-11-30T06:45:29+00:00 heroku web.3 d.73ea7440-270a-435a-a0ea-adf50b4e5f5a - State changed from starting to up
     def line_regex
-      @line_regex ||= /\<(\d+)\>(1) (\d\d\d\d-\d\d-\d\dT\d\d:\d\d:\d\d\+00:00) ([a-z0-9-]+) ([a-z0-9\-\_\.]+) ([a-z0-9\-\_\.]+) \- (.*)$/
+      @line_regex ||= /\<(\d+)\>(1) (\d\d\d\d-\d\d-\d\dT\d\d:\d\d:\d\d\+00:00) ([a-z0-9-]+) ([a-z0-9\-\_\.]+) ([a-z0-9\-\_\.]+) (\-) (.*)$/
     end

-    # Break a given packet into individual syslog messages. Default assumes one message per packet
+    # Heroku's http log drains (https://devcenter.heroku.com/articles/labs-https-drains)
+    # utilize octet counting framing (http://tools.ietf.org/html/draft-gerhards-syslog-plain-tcp-12#section-3.4.1)
+    # for transmission of syslog messages over TCP. Properly parse and delimit
+    # individual syslog messages, many of which may be contained in a single packet.
+    #
+    # I am still uncertain if this is the place for transport layer protocol handling. I suspect not.
+    #
     def lines(data_str, &block)
-      yield data_str
+      d = data_str
+      while d && d.length > 0
+        if matching = d.match(/^(\d+) /) # if have a counting frame, use it
+          num_bytes = matching[1].to_i
+          frame_offset = matching[0].length
+          line_end = frame_offset + num_bytes
+          msg = d[frame_offset..line_end]
+          yield msg
+          d = d[line_end..d.length]
+        elsif matching = d.match(/\n/) # Newlines = explicit message delimiter
+          d = matching.post_match
+        else
+          STDERR.puts("Unable to parse: #{d}")
+          return
+        end
+      end
     end

+    # Heroku is missing the appname token, otherwise can treat as standard syslog format
+    def event_data(matching)
+      event = {}
+      event[:priority] = matching[1].to_i
+      event[:syslog_version] = matching[2].to_i
+      event[:emitted_at] = nil?(matching[3]) ? nil : Time.parse(matching[3]).utc
+      event[:hostname] = interpret_nil(matching[4])
+      event[:appname] = nil
+      event[:proc_id] = interpret_nil(matching[5])
+      event[:msg_id] = interpret_nil(matching[6])
+      event[:structured_data] = interpret_nil(matching[7])
+      event[:message] = interpret_nil(matching[8])
+      event
+    end
+
+    private
+
     def interpret_nil(val)
       nil?(val) ? nil : val
     end

     def nil?(val)
       val == "-"
     end
-
-    # Default is to assume simple sequential matching
-    # Comment out until it's actually used
-    # def event_data(matching)
-    #   event = {}
-    #   event[:priority] = matching[1].to_i
-    #   event[:syslog_version] = matching[2].to_i
-    #   event[:emitted_at] = nil?(matching[3]) ? nil : Time.parse(matching[3]).utc
-    #   event[:hostname] = interpret_nil(matching[4])
-    #   event[:appname] = interpret_nil(matching[5])
-    #   event[:proc_id] = interpret_nil(matching[6])
-    #   event[:msg_id] = interpret_nil(matching[7])
-    #   event[:structured_data] = interpret_nil(matching[8])
-    #   event[:message] = interpret_nil(matching[9])
-    #   event
-    # end
-  end
-
-  # Taken from ActiveSupport::Inflector: http://apidock.com/rails/v3.2.8/ActiveSupport/Inflector/constantize
-  # Hate that this is here - how do others do dynamic loading?
-  def self.constantize(camel_cased_word)
-    names = camel_cased_word.split('::')
-    names.shift if names.empty? || names.first.empty?
-
-    constant = Object
-    names.each do |name|
-      constant = constant.const_defined?(name) ? constant.const_get(name) : constant.const_missing(name)
-    end
-    constant
   end
-end
-
-require "parsley/flavors"
+end
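
The new `lines` method implements octet-counted framing: each message is prefixed with its length in bytes and a space. The sketch below shows the same idea in isolation. It is not the commit's code: `split_frames` is a hypothetical name, and it differs from `lines` in two ways that may be worth reviewing. It uses byte offsets rather than character offsets, since the prefix counts octets, and it takes exactly `length` bytes, whereas the inclusive range `d[frame_offset..line_end]` appears to take one extra character.

```ruby
# Illustrative sketch only; split_frames is a hypothetical name and is not
# part of heroku-log-parser. Input is assumed to be a sequence of
# "<byte-count> <message>" frames with nothing between them.
def split_frames(data)
  messages = []
  # Work on a binary copy so offsets are in bytes, not characters.
  rest = data.dup.force_encoding(Encoding::BINARY)
  until rest.empty?
    header = rest.match(/\A(\d+) /)
    break unless header # no length prefix: stop rather than guess
    length = header[1].to_i
    start = header[0].bytesize
    # Take exactly `length` bytes and restore the caller's encoding.
    messages << rest.byteslice(start, length).force_encoding(data.encoding)
    rest = rest.byteslice(start + length, rest.bytesize) || ""
  end
  messages
end

frames = split_frames("6 hello\n4 abc\n")
```

Unlike `lines`, this sketch does not fall back to newline delimiters; it simply stops at the first chunk without a length prefix.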

lib/heroku-log-parser/flavors.rb

Lines changed: 0 additions & 1 deletion
This file was deleted.

lib/heroku-log-parser/flavors/heroku.rb

Lines changed: 0 additions & 60 deletions
This file was deleted.

parsley.gemspec

Lines changed: 0 additions & 52 deletions
This file was deleted.
