Skip to content

Unmap/discard/fstrim #2708

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
sheepa opened this issue Aug 24, 2021 · 10 comments
Open

Unmap/discard/fstrim #2708

sheepa opened this issue Aug 24, 2021 · 10 comments
Labels
Good first issue Indicates a good issue for first-time contributors Priority: Medium Indicates than an issue or pull request should be resolved ahead of issues or pull requests labelled Roadmap: New Request Status: Parked Indicates that an issues or pull request will be revisited later

Comments

@sheepa
Copy link

sheepa commented Aug 24, 2021

Is firecracker supporting unmap/discard/fstrim, ie automatic reduction of the image size on the host when a file is deleted on the guest? If it does how do I enable it?

@acatangiu
Copy link
Contributor

Hi @sheepa,

Firecracker does not support automatic reduction of the image size on the host when a file is deleted on the guest.

Firecracker emulates raw block devices with fixed capacity. Furthermore, Firecracker does not monitor or introspect guest operations so it has no information around the contents of guest disks (such as current use %).

What you describe would come with non-negligible overhead and would require guest cooperation to find out which ranges of the backing file can be dropped; so it would add a lot of complexity. What is your usecase? Why is this important?

@raduweiss
Copy link
Contributor

@sheepa we will be tracking this as a new feature request that we have not yet taken on to our roadmap. Folks that could use this, feel free to comment. We will also actively get feedback here from other users as settle our 2022 roadmap.

@raduweiss raduweiss added the Priority: Medium Indicates than an issue or pull request should be resolved ahead of issues or pull requests labelled label Aug 31, 2021
@crappycrypto
Copy link

It seems like a good feature for longer running VMs to be able to release disk space on the host using discard. It might not be the primary use case for firecracker, but I occasionally leave firecracker running for days and it would be nice to be able to release disk space without recreating the VM.

At a first glance the implementation shouldn't be that difficult for a simple file backed disk. A single discard command can be converted into a set of calls like fallocate(fd, FALLOC_FL_PUNCH_HOLE| FALLOC_FL_KEEP_SIZE , offset, len)

Supporting it completely however quickly uncovers more features that might be implemented as well. For example for disks which are backed by a block device on the host we need to use the BLKDISCARD ioctl. And implementing discard also raises the question of whether firecracker should support the write zeroes command as well.

@sheepa
Copy link
Author

sheepa commented Jul 10, 2022

Circling back after almost a year, sorry for the late reply;

What is your usecase? Why is this important? - @raduweiss

For long running processes, if firecracker is to be used as an alternative to qemu it would need unmap, the same goes for memory.

I really love firecracker but without it I cannot use it. Is there a roadmap for this? I cannot seem to find it.

@goireu
Copy link

goireu commented Sep 29, 2023

Hi folks, I have another use case here.

I have an application that may or may not be long lived depending on how the end user configures it (can be configured to shut down on idle, in such case the customer agrees to sometimes cope with a wake-up delay for a reduced bill).
In this scenario, customers which want to go idle also want fast wake up, customers which prefer always-on don't need this.

As a platform maintainer, I really don't want to choose between Firecracker or QEMU depending on customer's configuration, it makes no sense engineering-wise.

In my specific scenario, I don't really care about RAM release because I don't over-provision RAM, but discard is important because the underlying storage is a mutualized CEPH.

@JonathanWoollett-Light JonathanWoollett-Light added the Status: Parked Indicates that an issues or pull request will be revisited later label Oct 30, 2023
@ShadowCurse ShadowCurse added the Good first issue Indicates a good issue for first-time contributors label Oct 16, 2024
@ShadowCurse
Copy link
Contributor

Hi everyone,
We are not currently work on this feature, but we are open for a PR with this change. Some notes on the potential implementation:

  • virtio block has VIRTIO_BLK_F_DISCARD which Firecracker currently does not expose to the guest. If it is exposed to the guest it will allow it to use VIRTIO_BLK_T_DISCARD request type.
  • the discard logic itself can be implemented with https://www.man7.org/linux/man-pages/man2/fallocate.2.html

@mdhaduk
Copy link

mdhaduk commented Apr 3, 2025

Hi, we're a group of students at UT Austin looking to contribute to virtualization-related open-source projects for our class. We came across this issue and would like to work on it if that's okay @ShadowCurse. Before we would proceed, we have some questions about the implementation since this is our first exposure to the Firecracker codebase.

Currently, I'm just stating my interest in contributing but I will follow up within the next 3 days with our understanding of the issue and our potential implementation design.

Thanks.

@LDagnachew
Copy link

LDagnachew commented Apr 7, 2025

Hi there, I also work with @mdhaduk at UT, and we came up with a rough implementation over trying to resolve this issue of implementing the discard logic and wanted to run it through you guys @acatangiu @ShadowCurse.

1. Expose the Discard Feature (VIRTIO_BLK_F_DISCARD)

We need to advertise discard support to the guest by enabling the VIRTIO_BLK_F_DISCARD feature in the device’s available features. For example, during the virtio block device initialization (likely in the device setup file), update the feature negotiation code as follows:

// in /src/vmm/src/devices/virtio/block/virtio/device.rs:
let mut avail_features = (1u64 << VIRTIO_F_VERSION_1)
    | (1u64 << VIRTIO_RING_F_EVENT_IDX)
    | VIRTIO_BLK_F_DISCARD; // Enable discard support

if config.cache_type == CacheType::Writeback {
    avail_features |= 1u64 << VIRTIO_BLK_F_FLUSH;
}

2. Parse Discard Requests

To our understanding, since this is an unimplemented RequestType, we need to add this variant in the RequestType enum.

// in request.rs
pub enum RequestType {
    In,
    Out,
    Flush,
    GetDeviceID,
    Discard,   // New variant for discard
    Unsupported(u32),
}

impl From<u32> for RequestType {
    fn from(value: u32) -> Self {
        match value {
            VIRTIO_BLK_T_IN      => RequestType::In,
            VIRTIO_BLK_T_OUT     => RequestType::Out,
            VIRTIO_BLK_T_FLUSH   => RequestType::Flush,
            VIRTIO_BLK_T_GET_ID  => RequestType::GetDeviceID,
            VIRTIO_BLK_T_DISCARD => RequestType::Discard,  // Map discard request
            t                    => RequestType::Unsupported(t),
        }
    }
}
...

Additionally, we must create a new match arm to handle RequestType::Discard and invoke a helper function that performs the discard operation with the actual logic).

// in request.rs:
pub(crate) fn process(
    self,
    disk: &mut DiskProperties,
    desc_idx: u16,
    mem: &GuestMemoryMmap,
    block_metrics: &BlockDeviceMetrics,
) -> ProcessingResult {
    // ... existing processing logic ...

    match request_type {
        // ... other request types ...
        RequestType::Discard => {
            // Retrieve offset and length from the request descriptor
            let offset = /* extract offset from request */;
            let len = /* extract length from request */;
            self.handle_discard(offset, len)
                .map_err(|e| /* map error accordingly */)?;
        }
        // ... handle other cases ...
    }
}

3. Implement Backend for Discarding

From here, we use the fallocate syscall with FALLOC_FL_PUNCH_HOLE as stated in previous discussions.

// in sync_io.rs:
fn handle_discard(&self, offset: u64, len: u32) -> Result<()> {
    let fd = self.file.as_raw_fd();
    let result = fallocate(
        fd,
        FallocateFlags::FALLOC_FL_PUNCH_HOLE | FallocateFlags::FALLOC_FL_KEEP_SIZE,
        offset as i64,
        len as i64,
    );
    if let Err(e) = result {
        eprintln!("Discard failed: {}", e);
        return Err(Error::IoError);
    }
    Ok(())
}
// define fallocate & flags (not sure where this would go).
use libc::{c_int, off64_t};
use std::io::Error as IoError;

bitflags::bitflags! {
    pub struct FallocateFlags: c_int {
        const FALLOC_FL_KEEP_SIZE = 0x01;
        const FALLOC_FL_PUNCH_HOLE = 0x02;
    }
}

// Wrapper for library call
pub fn fallocate(fd: c_int, mode: FallocateFlags, offset: off64_t, len: off64_t) -> Result<(), IoError> {
    let ret = unsafe { libc::fallocate64(fd, mode.bits(), offset, len) };
    if ret == 0 {
        Ok(())
    } else {
        Err(IoError::last_os_error())
    }
}

Please let us know if there's anything we aren't accounting for in our implementation. We're still figuring out some details (ex. actually fully parsing the command, as we need some logic in the parse() function as well) but I wanted to come out with a rough draft. If needed, I can also make a draft PR.

@ShadowCurse
Copy link
Contributor

Hi @mdhaduk, @LDagnachew, you are welcome to work on this issue. The general idea of changes seems correct, but there are also integration tests you need to think about. Also it seems not all file systems support this, so VIRTIO_BLK_F_DISCARD should not be exposed for those devices. Open a PR when you are ready for a review.

@LDagnachew LDagnachew mentioned this issue Apr 18, 2025
10 tasks
@LDagnachew
Copy link

Hi @ShadowCurse, I submitted a draft PR and would love some comments on what I have so far. Feel free to take a look at your own convenience. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Good first issue Indicates a good issue for first-time contributors Priority: Medium Indicates than an issue or pull request should be resolved ahead of issues or pull requests labelled Roadmap: New Request Status: Parked Indicates that an issues or pull request will be revisited later
Projects
None yet
Development

No branches or pull requests

9 participants