add multiline log support for s3 logs #111

Merged
merged 4 commits into from
May 13, 2019
7 changes: 7 additions & 0 deletions aws/logs_monitoring/README.md
@@ -10,6 +10,7 @@ AWS lambda function to ship ELB, S3, CloudTrail, VPC, CloudFront and CloudWatch
- SSL Security
- JSON events providing details about S3 documents forwarded
- Structured meta-information can be attached to the events
- Multiline Log Support (S3 Only)

# Quick Start

@@ -93,3 +94,9 @@ Two environment variables can be used to forward logs through a proxy:
If the test "succeeded", you are all set! The test log will not show up in the platform.

For S3 logs, there may be some latency between the time a first S3 log file is posted and the Lambda function wakes up.

## 6. (optional) Multiline Log support for S3

If there are multiline logs in S3, set the `DD_MULTILINE_LOG_REGEX_PATTERN` environment variable to a regex pattern that matches the start of each new log entry.

- Example: for multiline logs whose entries begin with a date such as `11/10/2014`: `DD_MULTILINE_LOG_REGEX_PATTERN="\d{2}\/\d{2}\/\d{4}"`
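
Before setting the variable, it can help to check a candidate pattern against a few sample lines. The sketch below is only an illustration: the sample lines are invented, and the `^` anchoring mirrors how `lambda_function.py` in this change builds `multiline_regex_start_pattern`.

```python
import re

# The example value from the README, as it would be set in the environment.
pattern = r"\d{2}\/\d{2}\/\d{4}"

# Same anchoring the Lambda function applies: the pattern must match at the start.
start_of_entry = re.compile("^{}".format(pattern))

# Invented sample lines: one two-line traceback sandwiched between two entries.
sample_lines = [
    "11/10/2014 12:00:00 ERROR request failed",
    "Traceback (most recent call last):",
    '  File "app.py", line 3, in handler',
    "11/10/2014 12:00:01 INFO request served",
]

# True marks a line that begins a new log entry; False marks a continuation line.
flags = [bool(start_of_entry.match(line)) for line in sample_lines]
print(flags)  # [True, False, False, True]
```

A pattern that returns `True` for continuation lines is too loose and would break entries apart in the wrong places.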
15 changes: 14 additions & 1 deletion aws/logs_monitoring/lambda_function.py
@@ -55,6 +55,12 @@
# Strip any trailing and leading whitespace from the API key
DD_API_KEY = DD_API_KEY.strip()

# DD_MULTILINE_LOG_REGEX_PATTERN: Datadog Multiline Log Regular Expression Pattern
# Read with .get() so the name is always defined (None when unset);
# s3_handler tests it even when the variable is not configured.
DD_MULTILINE_LOG_REGEX_PATTERN = os.environ.get("DD_MULTILINE_LOG_REGEX_PATTERN")
if DD_MULTILINE_LOG_REGEX_PATTERN:
    multiline_regex = re.compile(r"(?<!^)\s+(?={})(?!.\s)".format(DD_MULTILINE_LOG_REGEX_PATTERN))
    multiline_regex_start_pattern = re.compile(r"^{}".format(DD_MULTILINE_LOG_REGEX_PATTERN))

cloudtrail_regex = re.compile(
    "\d+_CloudTrail_\w{2}-\w{4,9}-\d_\d{8}T\d{4}Z.+.json.gz$", re.I
)
@@ -277,8 +283,15 @@ def s3_handler(event, context, metadata):
            )
            yield structured_line
    else:
        # Check if using multiline log regex pattern
        # and determine whether line or pattern separated logs
        if DD_MULTILINE_LOG_REGEX_PATTERN and multiline_regex_start_pattern.match(data):
            split_data = multiline_regex.split(data)
        else:
            split_data = data.splitlines()

        # Send lines to Datadog
-       for line in data.splitlines():
+       for line in split_data:
            # Create structured object and send it
            structured_line = {
                "aws": {"s3": {"bucket": bucket, "key": key}},
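
The splitting decision in `s3_handler` can be tried in isolation. The following is a minimal sketch, not the function from this change: `split_log_data` is a hypothetical helper, the sample data is invented, and only the two regular expressions are copied from the diff above.

```python
import re


def split_log_data(data, pattern=None):
    """Split raw S3 log text into entries (hypothetical helper for illustration)."""
    if pattern:
        # Only use pattern-based splitting when the data itself starts with the pattern.
        start = re.compile(r"^{}".format(pattern))
        if start.match(data):
            # Split on whitespace that directly precedes the next pattern match.
            splitter = re.compile(r"(?<!^)\s+(?={})(?!.\s)".format(pattern))
            return splitter.split(data)
    # No pattern configured, or the data does not start with it: one entry per line.
    return data.splitlines()


data = (
    "11/10/2014 12:00:00 ERROR request failed\n"
    "Traceback (most recent call last):\n"
    "ValueError: bad input\n"
    "11/10/2014 12:00:01 INFO request served"
)
pattern = r"\d{2}\/\d{2}\/\d{4}"

entries = split_log_data(data, pattern)  # traceback stays attached to its entry
lines = split_log_data(data)             # fallback: plain line splitting

print(len(entries), len(lines))  # 2 4
```

Because the splitter matches any whitespace followed by the pattern, a date that appears mid-line after a space would also start a new entry, so the pattern is best chosen to match only at the start of genuine entries.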