feat(bug-prediction): Add a rpc call to fetch group IDs by exc types #93341
Codecov Report
All modified and coverable lines are covered by tests ✅
✅ All tests successful. No failed tests found.
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #93341      +/-   ##
===========================================
+ Coverage   41.28%   88.04%   +46.75%
===========================================
  Files       10264    10302       +38
  Lines      592759   593982     +1223
  Branches    23044    22976       -68
===========================================
+ Hits       244749   522945   +278196
+ Misses     347561    70591   -276970
+ Partials      449      446        -3
src/sentry/seer/fetch_issues/fetch_issues_given_exception_types.py
this'll work but left some extra feedback 👍
            "run_id": run_id,
        },
    )
    return []
b/c this will be a tool call, it might be useful to return something that indicates to the LLM that this error happened, e.g., a dict with {"error": f"No repo was found when searching for issues with exception type {exception_type}. You need to blah blah."}. that way the LLM can use this specific error message to figure out what to do next. maybe that's to never use the tool, maybe something else
That is a good point, I think I'll return:
{"issues": [...]}
or {"error": "Repo does not exist"}
or {"error": "Repo project path config does not exist"}
and then have the tool return more specialized messages such as: message + ". This repo was not configured properly so don't call this tool any more for this repo"
This keeps the function generic enough to be reused by other tools in the future.
            "external_id": external_id,
        },
    )
    return []
same comment re specific error here. not sure what the right message is. but good to be specific for le agent to intelligently iterate
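A minimal sketch of the return shape proposed in the reply above, where a generic function returns either {"issues": [...]} or {"error": "..."} and the tool layer appends the agent-facing guidance. The names `fetch_issues`, `tool_fetch_issues`, and `KNOWN_REPOS` are hypothetical stand-ins for illustration, not the functions or models in this PR:

```python
from typing import Any, Dict

# Hypothetical in-memory stand-in for the repo / path-config lookups.
KNOWN_REPOS: Dict[str, Dict[str, Any]] = {
    "getsentry/sentry": {"has_path_config": True, "issues": [101, 102]},
    "getsentry/misconfigured": {"has_path_config": False, "issues": []},
}

# Guidance text quoted from the reply in this thread.
TOOL_GUIDANCE = (
    ". This repo was not configured properly so don't call this tool "
    "any more for this repo"
)


def fetch_issues(repo_name: str) -> Dict[str, Any]:
    """Generic function: returns {"issues": [...]} or a short {"error": ...}."""
    repo = KNOWN_REPOS.get(repo_name)
    if repo is None:
        return {"error": "Repo does not exist"}
    if not repo["has_path_config"]:
        return {"error": "Repo project path config does not exist"}
    return {"issues": repo["issues"]}


def tool_fetch_issues(repo_name: str) -> Dict[str, Any]:
    """Tool wrapper: specializes the generic error message for the LLM."""
    result = fetch_issues(repo_name)
    if "error" in result:
        return {"error": result["error"] + TOOL_GUIDANCE}
    return result
```

Keeping the short error in the generic function and the longer guidance in the wrapper means another tool could reuse `fetch_issues` and attach its own instructions.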
    # Using raw SQL since data is GzippedDictField which can't be filtered with Django ORM
    query = """
        SELECT * FROM sentry_groupedmessage
        WHERE project_id IN %s
        AND last_seen >= %s
        AND (data::json -> 'metadata' ->> 'type') = %s
        ORDER BY last_seen
        LIMIT %s
    """
    issues = Group.objects.raw(
        query, [tuple(project_ids), date_threshold, exception_type, max_num_issues]
    )
i haven't seen .raw used in sentry. i did see examples where raw SQL is passed using .extra
does something like this work?
query_set = (
Group.objects.extra(
where=["(data::json -> 'metadata' ->> 'type') = %s"],
params=[exception_type],
)
.filter(
project_id__in=project_ids,
last_seen__gte=date_threshold,
)
.order_by("last_seen")[:max_num_issues]
)
There are a few *.objects.raw calls in the repo, just not on the Group object. Your half-ORM / half-SQL approach also works, though .extra is apparently deprecated (or soon to be). I'll take the same approach and use .annotate with RawSQL.
    )

    # Extract IDs from the returned Group objects
    return [issue.id for issue in issues]
there's stuff in sentry that isn't in seer, and would require another call to sentry. maybe makes sense to return a dict representation of the issue? so we have the message and data and stuff. not sure how big it can get
I'm not familiar with the models relating to issues (or any models in the DB for that matter 😂), but I don't see much that's useful in this table. The message column could be useful, but I think the summary in the seer DB overlaps enough with the message. We can look for more useful stuff in this DB if the issue summaries in the seer DB don't yield good enough results.
Add an RPC method to fetch recent issues from the Sentry database by a list of given exception types. This will be used by bug-prediction in Seer to fetch issues.