feat(code-mappings): Add code mappings task to post process #40882

Merged
merged 7 commits into from
Nov 3, 2022
11 changes: 11 additions & 0 deletions src/sentry/killswitches.py
@@ -163,6 +163,17 @@ class KillswitchInfo:
"project_id": "A project ID to filter events by.",
},
),
"post_process.derive-code-mappings": KillswitchInfo(
Member:

How does this functionality work? Is there more documentation for this file? Is there a UI for modifying killswitches on the fly?

Member Author:

see WOR-2359

description="""
Prevent deriving code mappings for a project.

In cases where the derive-code-mappings query load is too high, the added load
can cause other queries to be rate limited or slow down post_process tasks.
""",
fields={
"project_id": "A project ID to filter events by.",
},
),
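To answer the question above about how these killswitches behave: a killswitch is backed by an option whose value is a list of context filters, and the switch fires when any filter fully matches the runtime context. A minimal, self-contained sketch (illustrative only; the real matcher lives in src/sentry/killswitches.py and supports more than exact equality):

```python
def killswitch_matches(conditions, context):
    """Return True if any configured condition matches the context.

    `conditions` is the option value: a list of dicts mapping field
    names (e.g. "project_id") to the value to block. An empty list
    means the killswitch is not engaged.
    """
    return any(
        all(str(context.get(field)) == str(value)
            for field, value in condition.items())
        for condition in conditions
    )
```

Under this model an empty option value matches nothing, and setting it to `[{"project_id": "42"}]` would block only that project.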
"reprocessing2.drop-delete-old-primary-hash": KillswitchInfo(
description="""
Drop per-event messages emitted from delete_old_primary_hash. This message is currently lacking batching, and for the time being we should be able to drop it on a whim.
1 change: 1 addition & 0 deletions src/sentry/options/defaults.py
@@ -376,6 +376,7 @@
register("store.symbolicate-event-lpq-never", type=Sequence, default=[])
register("store.symbolicate-event-lpq-always", type=Sequence, default=[])
register("post_process.get-autoassign-owners", type=Sequence, default=[])
register("post_process.derive-code-mappings", type=Sequence, default=[])

# Switch for more performant project counter incr
register("store.projectcounter-modern-upsert-sample-rate", default=0.0)
32 changes: 32 additions & 0 deletions src/sentry/tasks/post_process.py
@@ -595,6 +595,37 @@ def process_rules(job: PostProcessJob) -> None:
return


def process_code_mappings(job: PostProcessJob) -> None:
if job["is_reprocessed"]:
return

from sentry.tasks.derive_code_mappings import derive_code_mappings

try:
event = job["event"]
Member:

Are the stacktraces available at this point? If all stacktraces match a code mapping we will not need to schedule anything.

Member Author:

We should be able to get stacktraces here like this. How would we check that it matches a code mapping?

Member:

Thinking about this more: we should not put the logic here, otherwise we hit the DB for every event.
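For reference, the check floated above could look roughly like this: if every in-app frame filename already falls under some code mapping's stack root, there is nothing left to derive. This is a hypothetical sketch only (the helper and its inputs are illustrative), and as noted, a per-event DB lookup like this does not belong in post_process:

```python
def all_frames_mapped(frame_filenames, stack_roots):
    """True if every frame filename is covered by an existing
    code mapping's stack-root prefix (hypothetical helper)."""
    return all(
        any(filename.startswith(root) for root in stack_roots)
        for filename in frame_filenames
    )
```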

project = event.project

cache_key = f"code-mappings:{project.id}"
project_queued = cache.get(cache_key)
if project_queued is None:
cache.set(cache_key, True, 3600)
Member:

If we use the project_id as the controlling mechanism here, we will need to control fetching the trees per org on the other task at the org ID level. Otherwise, two events from two projects will trigger get_trees_for_org twice in the same hour. Now that I see this here, I can look into adding it tomorrow morning, or feel free to add it. Let's just coordinate about it.

Member:

It may make sense to add logging in case we want to debug the system:
logger.info(f"derive_code_mappings: Events from {project.id} will not have code derivation until {date_time_here}")

Member Author:

Added logging, good point!

Let's talk about your other point tomorrow. Our worst case scenario here is an org with a large number of projects (N) that all get events each hour. I'm not sure how to prevent get_trees_for_org being called N times.

Member:

What could happen within an hour for events for the same org:

  • Event A1 for project A
    • We try to derive code and get_trees_for_org gets called
  • Event A2 for project A (cache prevents calling task)
  • Event B1 for project B
    • We try to derive code and get_trees_for_org gets called a 2nd time

We could check the cache key here (or an extra one) to only allow one event processed per hour per org for now so we can go live.

I will work today on adding a way to track the current GitHub API limit and to control when get_trees_for_org gets called. I need to research memcache and the potential for OOM, since the trees-for-org object becomes quite large.
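The one-event-per-org-per-hour gate described above can be sketched with two cache keys: the existing per-project key plus a per-org key. The sketch below is hypothetical; it uses an in-memory stand-in for Sentry's shared cache, and the `should_derive` helper and org key name are illustrative, not the PR's actual code. A backend with an atomic set-if-absent (e.g. cache.add) would also close the get/set race in the current diff:

```python
import time

_cache = {}  # key -> expiry timestamp; stand-in for the shared cache

def cache_add(key, ttl):
    """Set `key` only if absent or expired; True if we won the slot."""
    now = time.time()
    if _cache.get(key, 0) > now:
        return False
    _cache[key] = now + ttl
    return True

def should_derive(org_id, project_id, ttl=3600):
    # Per-project gate: at most one scheduled task per project per window.
    if not cache_add(f"code-mappings:{project_id}", ttl):
        return False
    # Per-org gate: stops N projects in the same org from each
    # triggering get_trees_for_org within the same window.
    return cache_add(f"code-mappings-org:{org_id}", ttl)
```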


if project_queued or not features.has(
Member Author:

This feature check will incorporate the option / killswitch added in https://github.com/getsentry/getsentry/pull/8777.

We don't need to add separate logic here, which is pretty neat!

Member:

Should we check this earlier? Or only check once an hour?
It's probably good here.

Member Author:

Actually, this brings up a good point. We should add the killswitch check in derive_code_mappings.py as well. Even though it's a duplicate, we might want to stop all processing, and right now anything that's already queued will still go through. I'll send a PR for that.

"organizations:derive-code-mappings", event.project.organization
):
return

if killswitch_matches_context(
"post_process.derive-code-mappings",
{"project_id": project.id},
):
return
derive_code_mappings.delay(project.organization_id, project.id, event.data)
Member:

I prefer only sending event.data so all the logic is derived in there (even though I know it feels like repeating code over there, I find it clearer since it saves looking up the caller's code).

Member Author:

I think this is alright since the only additional param is now the project ID. Otherwise, we would need to send the whole event to derive the project since the project_id isn't contained in event.data.


except Exception:
logger.exception("Failed to process code mappings")


def process_commits(job: PostProcessJob) -> None:
if job["is_reprocessed"]:
return
@@ -777,6 +808,7 @@ def plugin_post_process_group(plugin_slug, event, **kwargs):
process_service_hooks,
process_resource_change_bounds,
process_plugins,
process_code_mappings,
process_similarity,
update_existing_attachments,
fire_error_processed,