Slow speed comparing to Python lxml #1087

Open
gryznar opened this issue May 13, 2024 · 2 comments

Comments

@gryznar

gryznar commented May 13, 2024

I previously used lxml. Unfortunately, parsing with lxml was much faster than with html. Maybe there are places where it could be improved by applying some solutions from lxml? The biggest slowdown is in creating a Document from a String, especially for big sites.

@mosuem mosuem transferred this issue from dart-archive/html Oct 29, 2024
@HosseinYousefi
Member

A PR with some performance improvements was merged recently. Can you check whether the performance is now more comparable? If not, could you provide an example, preferably with a benchmark harness for Python as well, that demonstrates the difference in speed so I can look into it?
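For the Python side, a comparison harness could look roughly like the sketch below. It is only an illustration under stated assumptions: `make_document`, `TagCounter`, `parse_stdlib`, and `benchmark` are hypothetical names, the input is a synthetic page rather than a real site, and the standard library's `html.parser` stands in for the parser so the script runs without extra dependencies. To measure lxml, the parse callable would be swapped for something like `lxml.html.fromstring`.

```python
import statistics
import time
from html.parser import HTMLParser


def make_document(n_rows: int) -> str:
    """Build a synthetic HTML page with n_rows table rows (hypothetical input)."""
    rows = "".join(
        f"<tr><td>{i}</td><td><a href='/item/{i}'>item {i}</a></td></tr>"
        for i in range(n_rows)
    )
    return f"<!DOCTYPE html><html><body><table>{rows}</table></body></html>"


class TagCounter(HTMLParser):
    """Stdlib stand-in parser; counts start tags so the parse does real work."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1


def parse_stdlib(source: str) -> int:
    """Parse the whole string and return the number of start tags seen."""
    parser = TagCounter()
    parser.feed(source)
    parser.close()
    return parser.count


def benchmark(parse, source: str, repeats: int = 5):
    """Time parse(source) `repeats` times; return (best, median) in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        parse(source)
        timings.append(time.perf_counter() - start)
    return min(timings), statistics.median(timings)


if __name__ == "__main__":
    # Assumption: a large synthetic page approximates the "big sites" case.
    page = make_document(20_000)
    # To time lxml instead, pass lxml.html.fromstring as the parse callable.
    best, median = benchmark(parse_stdlib, page)
    print(f"input size: {len(page)} chars")
    print(f"best: {best * 1000:.1f} ms, median: {median * 1000:.1f} ms")
```

Running the same input string through the Dart parser and reporting best and median times for both would make the two numbers directly comparable.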

@gryznar
Author

gryznar commented May 2, 2025

Yeah, I'll try it in my free time. Thanks for the improvements!
