Tracking cache requests #1566


Merged: 10 commits merged into fsspec:master on Apr 11, 2024

Conversation

@betolink (Contributor) commented Apr 4, 2024

As discussed in #1527, this PR adds a few extra variables to the cache implementations to keep track of bytes requested vs. cache hits. The numbers can be accessed directly or via the logs in debug mode.

This is also useful to benchmark the I/O behavior of the different cache implementations.
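The idea can be sketched as a minimal counter-instrumented cache. This is a simplified stand-in, not fsspec's actual block-based implementation; `total_requested_bytes` follows the PR's naming, while `hit_count`/`miss_count` and the class itself are illustrative assumptions:

```python
import logging

logger = logging.getLogger("sketch.cache")


class CountingCache:
    """Simplified read cache that tracks requested bytes and hit/miss counts.

    Illustrative only: fsspec's real caches are block-based and more involved.
    """

    def __init__(self, fetcher, size):
        self.fetcher = fetcher          # callable (start, end) -> bytes
        self.size = size
        self.total_requested_bytes = 0  # bytes callers asked this cache for
        self.hit_count = 0              # reads served entirely from cached data
        self.miss_count = 0             # reads that required a backend fetch
        self.start = self.end = None
        self.cache = b""

    def _fetch(self, start, end):
        end = min(end, self.size)
        self.total_requested_bytes += end - start
        if self.start is not None and start >= self.start and end <= self.end:
            self.hit_count += 1
        else:
            self.miss_count += 1
            self.cache = self.fetcher(start, end)
            self.start, self.end = start, end
            # lazy %-style logging, as requested in the review below
            logger.debug("fetched %i-%i (%i bytes)", start, end, end - start)
        return self.cache[start - self.start : end - self.start]
```

The counters can then be read directly for benchmarking or dumped via the debug log, which is the access pattern the PR description mentions.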

@betolink (Contributor, Author) commented Apr 9, 2024

Hi @martindurant! I think this PR is ready for a review in case you have time this week.

@martindurant (Member) left a comment:

Thank you, this looks good on the whole, particularly the set of tests.

I have some comments for you to consider.

 sstart = i * self.blocksize
 send = min(sstart + self.blocksize, self.size)
-logger.debug(f"MMap get block #{i} ({sstart}-{send}")
+self.total_requested_bytes += send - sstart
+logger.debug(f"MMap get block #{i} ({sstart}-{send})")
@martindurant (Member):

There are a few of these (not your code, but they show up in the diff due to small corrections). Debug calls of the form logger.debug("test %s", arg) are preferred, so that the string doesn't need to be built when debug logging is not enabled.
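The reviewer's point is that with %-style arguments the formatting cost is only paid when a DEBUG-level record would actually be emitted. A minimal illustration (logger name and values are made up):

```python
import logging

logger = logging.getLogger("fsspec.caching.example")
logger.setLevel(logging.INFO)  # DEBUG records are discarded

i, sstart, send = 3, 12288, 16384

# Eager: the f-string is built every time, even though the
# record is thrown away at INFO level.
logger.debug(f"MMap get block #{i} ({sstart}-{send})")

# Lazy (preferred): the message template and arguments are passed
# separately, and interpolation only happens if the record is emitted.
logger.debug("MMap get block #%i (%i-%i)", i, sstart, send)
```

Both calls are no-ops here, but the first still pays for string construction on every invocation, which matters on hot read paths like a cache fetch.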

part = self.cache[start:end]
if end > self.blocksize:
    self.total_requested_bytes += end - self.blocksize
@martindurant (Member):

Same comment: isn't the number of bytes returned by the fetcher the more important number? If we request 100 bytes, we might only get 10 back; the calling code will probably then request more. On the other hand, the various implementations of the request code might keep making new requests until enough bytes arrive.

@betolink (Contributor, Author):

Same as above, perhaps we are missing another variable to keep track of the total_bytes_returned?
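The suggestion could be realized as a thin wrapper around the fetcher that counts what actually comes back. Note that `total_bytes_returned` is the hypothetical name floated in this thread, not an attribute fsspec actually exposes, and the wrapper class is an illustrative sketch:

```python
class ByteCountingFetcher:
    """Wrap a (start, end) -> bytes fetcher and count what it really returns.

    Sketch only: 'total_bytes_returned' is the name proposed in the
    discussion, not an existing fsspec attribute.
    """

    def __init__(self, fetcher):
        self.fetcher = fetcher
        self.total_requested_bytes = 0  # what callers asked for
        self.total_bytes_returned = 0   # what the backend actually delivered

    def __call__(self, start, end):
        self.total_requested_bytes += end - start
        data = self.fetcher(start, end)
        # A short read is possible: the backend may return fewer bytes
        # than requested, which is exactly the reviewer's concern above.
        self.total_bytes_returned += len(data)
        return data
```

Comparing the two counters after a benchmark run would show how often the backend short-reads relative to what the cache asked for.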

@@ -15,26 +21,124 @@ def test_cache_getitem(Cache_imp):


def test_block_cache_lru():
    cache = BlockCache(4, letters_fetcher, len(string.ascii_letters), maxblocks=2)
    """
@martindurant (Member):

This is a comment, not a docstring
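The distinction the reviewer is drawing: a string literal is only a docstring when it is the first statement in a function body; placed after other statements it is just a discarded expression, invisible to help() and __doc__. A small illustration (function names are made up for the example):

```python
def with_docstring():
    """I am a docstring."""
    x = 1
    return x


def with_stray_string():
    x = 1
    """I am just a discarded string expression; use a # comment instead."""
    return x


assert with_docstring.__doc__ == "I am a docstring."
assert with_stray_string.__doc__ is None
```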

@martindurant (Member):

I can't fix your linter failure, since the PR is from your master branch. You can run pre-commit locally, e.g.,

$ pre-commit install
$ pre-commit run -a

@betolink (Contributor, Author):

Thanks @martindurant! I'll work on some minor changes now and will run linting locally before the next push. About modifying code on external forks: it's possible! I think you could pull/edit the PR directly, as long as the PR author checks the "Allow edits and access to secrets by maintainers" box.

@betolink (Contributor, Author):

OK, now it should pass, and I also addressed the comments on log formatting for the cache implementations that do nothing, @martindurant.

@martindurant martindurant merged commit 05e7d80 into fsspec:master Apr 11, 2024
11 checks passed