-
Notifications
You must be signed in to change notification settings - Fork 517
fix: fix display for task information and improve UI for benchmark filtering #3629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: fix display for task information and improve UI for benchmark filtering #3629
Conversation
|
@Samoed Could you take a look when you have some free time? |
|
It looks good. Do you want to add your suggestion here? #3569 (comment) |
No problem, I'll update the PR later. |
|
Hi @Samoed ,I have update the PR,you can take a look when you have some free time. |
|
Ah, this is "select all". That's fine then |
KennethEnevoldsen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally the new changes looks really great (thanks!), but switching from e.g. "Scandinavian" to "Multilingual", give me a few errors:
There is also some oddities when I select all languages or select all domains then it unselects a few tasks. Not sure why this happens. It might be that some tasks don't have any domains annotated)
|
Fixed selection Error I've also noticed that if I deselect a certain |
|
I seem to know the reason, take mteb/mteb/tasks/sts/zho/cmtebsts.py Lines 171 to 193 in 072e6ef
It's domains is an empty list Then task's filter function: Lines 581 to 606 in 072e6ef
if task.metadata.domains is not None and not (
set(task.metadata.domains) & set(domain_select)
):This condition returns |
@KennethEnevoldsen @Samoed Is the above logic a normal business operation or a bug in the filtering process? |
|
I think it can be changed to if task.metadata.domains is not None and len(task.metadata.domains) > 0 and not (
set(task.metadata.domains) & set(domain_select)
): |
Yes, or change it to: if task.metadata.domains and not (
set(task.metadata.domains) & set(domain_select)
): |
|
What was the source of bug with switching benchmarks #3629 (review)? I don't see much changes in f76e60a except for benchmark_tasks.sort()
tasks_to_keep.sort()I'm just curious |
First of all, I initially assigned the 'choices' and 'value' of the component's 'CheckboxGroup' at the time of initialization, but in the 'update_task_list' method only returned 'value', and did not use 'gr.update()' to update 'choices' and 'value' at the same time, so it was an error, because the new 'value' was not in the old 'choices' So I modified 'update_task_list' to make him return 'gr.update()' But this is still a bug because 'update_task_list' uses caching, the cache doesn't seem to be gr.update() working properly, so I defined another caching method so that it returns a list and then return 'gr.update()' |
|
Leaderboard Build Tests |
This comment was marked as off-topic.
This comment was marked as off-topic.
|
The leaderboard test is successful, but it fails on a posthook. I tried to resturn multiple times, but it still fails on a posthook for some reason |
KennethEnevoldsen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great @q275343119!
I did a few tests and couldn't find any way to break it.
I am not sure why the leaderboard test fails (but it is only in the post-hook, so it is not on the code side). I set it to rerun (it might have been an GitHub issue)
Can I ask you to integrate v6 (I believe you can just merge #3605)?
Just to make sure that the changes it compatible with the latest version. If there are any issues here, feel free to leave it for another PR.
Otherwise I think this is all good to merge
|
The leaderboard issue seems to be a memory issue:
|
Yes, I saw that too. Is it because the |
|
It might be that results is just getting too big. Currently, we just git clone the entire repo, but it might be better to just make a shallow git clone @Samoed do you seen any issues with this? |
|
No, I haven't |
|
Anyway, this is probably outside the scope of this PR. I have made an issue on it and will merge this in -- again great work @q275343119 |
a882295
into
embeddings-benchmark:main

Task information.Preview here