[Tracking/Asahi] Rust abstractions with rebased dependencies #964

Open
wants to merge 69 commits into
base: rust-next


@asahilina commented Feb 11, 2023

This PR tracks the remaining Rust abstractions needed for the Apple Silicon GPU driver. It contains whole abstractions pulled from the rust branch (some with minor changes) and rebased on top of rust-next, followed by additional abstractions and changes I've been writing that depend on them. The commits marked *RFL import are not directly upstreamable, as they lack correct authorship information, but they might be useful as a reference for the authors to upstream them.

At the time of opening this PR, it includes all the remaining DRM and DMA Fence abstractions, and their dependencies. I will add more things (e.g. Platform Device from rust, plus my OF abstractions) as I rebase the driver and pick them out.

The branches are intended to follow the hierarchy Rust-for-Linux:rust-next ← AsahiLinux:gpu/rust-for-next ← AsahiLinux:gpu/rust-for-later.

@asahilina force-pushed the gpu/rust-for-later branch 3 times, most recently from 0c69efe to 4bce6aa, on February 16, 2023 12:17
sulix and others added 9 commits February 17, 2023 00:50
The rust_fmt_argument function is called from printk() to handle the %pA
format specifier.

Since it's called from C, we should mark it extern "C" to make sure it's
ABI compatible.

Cc: [email protected]
Fixes: 247b365 ("rust: add `kernel` crate")
Signed-off-by: David Gow <[email protected]>
Link: Rust-for-Linux#967
Reviewed-by: Björn Roy Baron <[email protected]>
Reviewed-by: Gary Guo <[email protected]>
[Applied `rustfmt`]
Signed-off-by: Miguel Ojeda <[email protected]>
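
A minimal sketch of the ABI point made in the commit message above: a Rust function that is called from C has to be declared `extern "C"` so that it uses the C calling convention. The names `fmt_callback` and `call_from_c` are hypothetical and stand in for the kernel's `rust_fmt_argument` and its C caller.

```rust
use std::os::raw::c_int;

/// Hypothetical callback using the C calling convention, so that C code can
/// invoke it through a plain function pointer.
pub extern "C" fn fmt_callback(value: c_int) -> c_int {
    value + 1
}

/// Stand-in for the C side: it only knows the C-ABI function pointer type.
pub fn call_from_c(cb: extern "C" fn(c_int) -> c_int, arg: c_int) -> c_int {
    cb(arg)
}
```

Passing a plain Rust `fn` where `extern "C" fn` is expected is a type error, which is how the compiler helps catch this class of mismatch.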
This allows downstream consumers to keep track of private data for shmem
mappings. In particular, the Rust abstraction will use this to safely
drop data associated with a mapping when it is unmapped.

Signed-off-by: Asahi Lina <[email protected]>
Reviewed-by: Sven Peter <[email protected]>
Reviewed-by: Eric Curtin <[email protected]>
Signed-off-by: Hector Martin <[email protected]>
While we normally encourage devm usage by drivers, some consumers (and
in particular the upcoming Rust abstractions) might want to manually
manage memory. Export the raw functions to make this possible.

Signed-off-by: Asahi Lina <[email protected]>
Reviewed-by: Sven Peter <[email protected]>
Reviewed-by: Eric Curtin <[email protected]>
Signed-off-by: Hector Martin <[email protected]>
Other functions touching shmem->sgt take the pages lock, so do that here
too. drm_gem_shmem_get_pages() & co take the same lock, so move to the
_locked() variants to avoid recursive locking.

Signed-off-by: Asahi Lina <[email protected]>
This commit provides the build flags for Rust for AArch64. The core Rust
support already in the kernel does the rest.

The Rust samples have been tested with this commit.

[jcunliffe: Arm specific parts taken from Miguel's upstream tree]

Signed-off-by: Miguel Ojeda <[email protected]>
Co-developed-by: Jamie Cunliffe <[email protected]>
Signed-off-by: Jamie Cunliffe <[email protected]>
Reviewed-by: Vincenzo Palazzo <[email protected]>
Signed-off-by: Vincenzo Palazzo <[email protected]>
Reviewed-by: Gary Guo <[email protected]>
Enable the PAC ret and BTI options in the Rust build flags to match
the options that are used when building C.

Signed-off-by: Jamie Cunliffe <[email protected]>
Reviewed-by: Vincenzo Palazzo <[email protected]>
Disable the neon and fp target features to avoid use of the fp & simd
registers. Specifying fp-armv8 causes rustc to warn about an unknown
feature, but the target feature is still passed through to LLVM; this
behaviour is documented as part of the warning. This will be fixed in a
future version of the rustc toolchain.

Signed-off-by: Jamie Cunliffe <[email protected]>
Reviewed-by: Vincenzo Palazzo <[email protected]>
This allows printing the inner data of `Arc` and its friends if the
inner data implements `Display` or `Debug`. It's useful for logging and
debugging purposes.

Signed-off-by: Boqun Feng <[email protected]>
Reviewed-by: Gary Guo <[email protected]>
Reviewed-by: Vincenzo Palazzo <[email protected]>
Reviewed-by: Andreas Hindborg <[email protected]>
Reviewed-by: Björn Roy Baron <[email protected]>
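
As an illustration of the forwarding behaviour described above, here is a small sketch using the standard library's `std::sync::Arc` (not the kernel's `Arc`), which already forwards `Display` and `Debug` to the inner value. The `Point` type and `show` helper are made up for this example.

```rust
use std::fmt;
use std::sync::Arc;

/// Hypothetical payload type with both `Debug` and `Display`.
#[derive(Debug)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

/// Formats the `Arc` itself; both impls forward to the inner `Point`.
pub fn show(p: &Arc<Point>) -> (String, String) {
    (format!("{}", p), format!("{:?}", p))
}
```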
This both demonstrates the usage of different print formats in Rust and
serves as a selftest for the `Display` and `Debug` implementations of
`Arc` and its friends.

Signed-off-by: Boqun Feng <[email protected]>
Reviewed-by: Björn Roy Baron <[email protected]>
Reviewed-by: Finn Behrens <[email protected]>
Reviewed-by: Vincenzo Palazzo <[email protected]>
Reviewed-by: Gary Guo <[email protected]>
Reviewed-by: Andreas Hindborg <[email protected]>
asahilina and others added 17 commits February 17, 2023 01:03
This module is intended to contain functions related to kernel
timekeeping and time. Initially, this just wraps ktime_get() and
ktime_get_boottime() and returns them as core::time::Duration instances.

Signed-off-by: Asahi Lina <[email protected]>
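
A rough userspace sketch of the idea, assuming the kernel clock value is a signed nanosecond count. `fake_ktime_get` is a hypothetical stand-in for the bindgen-generated `ktime_get()`, and clamping negative values to zero is an assumption of this sketch, not something the commit states.

```rust
use core::time::Duration;

/// Hypothetical stand-in for the C `ktime_get()`: nanoseconds as a signed
/// 64-bit value.
fn fake_ktime_get() -> i64 {
    1_500_000_000
}

/// Converts a nanosecond count into a `Duration`. Negative inputs are
/// clamped to zero (an assumption made for this sketch).
pub fn ktime_to_duration(ns: i64) -> Duration {
    Duration::from_nanos(ns.max(0) as u64)
}

/// Sketch of a `ktime_get()` wrapper returning a `Duration`.
pub fn now() -> Duration {
    ktime_to_duration(fake_ktime_get())
}
```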
This makes it mirror the way expect_ident() works, and means we can more
easily push the result back into the token stream.

Signed-off-by: Asahi Lina <[email protected]>
This makes things like concat_idents!(bindings::foo, bar) work.
Otherwise, there is no way to concatenate two idents and then use the
result as part of a type path.

Signed-off-by: Asahi Lina <[email protected]>
Modules can (and usually do) have multiple alias tags, in order to
specify multiple possible device matches for autoloading. Allow this by
changing the alias ModuleInfo field to an Option<Vec<String>>.

Note: For normal device IDs this is autogenerated by modpost (which is
not properly integrated with Rust support yet), so it is useful to be
able to manually add device match aliases for now, and should still be
useful in the future for corner cases that modpost does not handle.

This pulls in the expect_group() helper from the rfl/rust branch
(with credit to authors).

Co-developed-by: Miguel Ojeda <[email protected]>
Co-developed-by: Finn Behrens <[email protected]>
Co-developed-by: Sumera Priyadarsini <[email protected]>
Signed-off-by: Asahi Lina <[email protected]>
Add simple 1:1 wrappers of the C ioctl number manipulation functions.
Since these are macros we cannot bindgen them directly, and since they
should be usable in const context we cannot use helper wrappers, so
we'll have to reimplement them in Rust. Thankfully, the C headers do
declare defines for the relevant bitfield positions, so we don't need
to duplicate that.

Signed-off-by: Asahi Lina <[email protected]>
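
The reimplementation described above can be sketched with `const fn`s. This is not the code from the commit: the function names are illustrative, and the field widths are hard-coded to the asm-generic layout (an assumption; the commit takes the bit positions from the C headers, and some architectures use a different layout).

```rust
use core::mem::size_of;

// Field widths of the asm-generic ioctl encoding (assumed values).
pub const IOC_NRBITS: u32 = 8;
pub const IOC_TYPEBITS: u32 = 8;
pub const IOC_SIZEBITS: u32 = 14;
pub const IOC_DIRBITS: u32 = 2;

pub const IOC_NRSHIFT: u32 = 0;
pub const IOC_TYPESHIFT: u32 = IOC_NRSHIFT + IOC_NRBITS;
pub const IOC_SIZESHIFT: u32 = IOC_TYPESHIFT + IOC_TYPEBITS;
pub const IOC_DIRSHIFT: u32 = IOC_SIZESHIFT + IOC_SIZEBITS;

pub const IOC_NONE: u32 = 0;
pub const IOC_WRITE: u32 = 1;
pub const IOC_READ: u32 = 2;

/// Equivalent of the C `_IOC()` macro.
pub const fn ioc(dir: u32, ty: u32, nr: u32, size: usize) -> u32 {
    (dir << IOC_DIRSHIFT)
        | (ty << IOC_TYPESHIFT)
        | (nr << IOC_NRSHIFT)
        | ((size as u32) << IOC_SIZESHIFT)
}

/// Equivalent of `_IO(ty, nr)`.
pub const fn io(ty: u32, nr: u32) -> u32 {
    ioc(IOC_NONE, ty, nr, 0)
}

/// Equivalent of `_IOR(ty, nr, T)`.
pub const fn ior<T>(ty: u32, nr: u32) -> u32 {
    ioc(IOC_READ, ty, nr, size_of::<T>())
}

/// Equivalent of `_IOW(ty, nr, T)`.
pub const fn iow<T>(ty: u32, nr: u32) -> u32 {
    ioc(IOC_WRITE, ty, nr, size_of::<T>())
}

/// Equivalent of `_IOWR(ty, nr, T)`.
pub const fn iowr<T>(ty: u32, nr: u32) -> u32 {
    ioc(IOC_READ | IOC_WRITE, ty, nr, size_of::<T>())
}

/// Equivalent of `_IOC_DIR(n)`.
pub const fn ioc_dir(n: u32) -> u32 {
    (n >> IOC_DIRSHIFT) & ((1 << IOC_DIRBITS) - 1)
}

/// Equivalent of `_IOC_TYPE(n)`.
pub const fn ioc_type(n: u32) -> u32 {
    (n >> IOC_TYPESHIFT) & ((1 << IOC_TYPEBITS) - 1)
}

/// Equivalent of `_IOC_NR(n)`.
pub const fn ioc_nr(n: u32) -> u32 {
    (n >> IOC_NRSHIFT) & ((1 << IOC_NRBITS) - 1)
}

/// Equivalent of `_IOC_SIZE(n)`.
pub const fn ioc_size(n: u32) -> usize {
    ((n >> IOC_SIZESHIFT) & ((1 << IOC_SIZEBITS) - 1)) as usize
}
```

Because these are `const fn`s, an ioctl number can be computed in a `const` item, which is the property the commit message asks for.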
This mirrors the standard library's alloc::sync::Arc::downcast().

Based on the Rust standard library implementation, ver 1.62.0,
licensed under "Apache-2.0 OR MIT", from:

    https://github.com/rust-lang/rust/tree/1.62.0/library/alloc/src

For copyright details, please see:

        https://github.com/rust-lang/rust/blob/1.62.0/COPYRIGHT

Signed-off-by: Asahi Lina <[email protected]>
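
For reference, this is how the standard library method that the commit mirrors behaves. The sketch uses `std::sync::Arc`, not the kernel's `Arc`, and `as_u32` is a hypothetical helper.

```rust
use std::any::Any;
use std::sync::Arc;

/// Tries to recover a concrete `u32` from a type-erased `Arc`.
/// Returns `None` if the erased value is of a different type.
pub fn as_u32(value: Arc<dyn Any + Send + Sync>) -> Option<u32> {
    value.downcast::<u32>().ok().map(|v| *v)
}
```

On failure `downcast()` hands the original `Arc` back in the `Err` variant, so the caller does not lose the reference.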
This is the Rust equivalent to ERR_PTR(), for use in C callbacks.
Marked as #[allow(dead_code)] for now, since it does not have any
consumers yet.

Signed-off-by: Asahi Lina <[email protected]>
Add a function to create `Error` values out of a kernel error return,
which safely upholds the invariant that the error code is well-formed
(negative and greater than -MAX_ERRNO). If a malformed code is passed
in, it will be converted to EINVAL.

Imported from rust-for-linux/rust as authored by Miguel and Fox with
refactoring from Wedson.

Co-developed-by: Miguel Ojeda <[email protected]>
Co-developed-by: Fox Chen <[email protected]>
Co-developed-by: Wedson Almeida Filho <[email protected]>
Signed-off-by: Miguel Ojeda <[email protected]>
Signed-off-by: Fox Chen <[email protected]>
Signed-off-by: Wedson Almeida Filho <[email protected]>
Signed-off-by: Asahi Lina <[email protected]>
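
A simplified sketch of the range check described above. The `Error` type here is a stand-in for the kernel crate's type, `MAX_ERRNO` and `EINVAL` are hard-coded to their usual Linux values, and treating `-MAX_ERRNO` itself as valid (as the C `IS_ERR_VALUE()` macro does) is an assumption of this sketch.

```rust
pub const MAX_ERRNO: i32 = 4095;
pub const EINVAL: i32 = 22;

/// Stand-in error type. Invariant: the wrapped value is a valid negative errno.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Error(i32);

impl Error {
    /// Builds an `Error` from a kernel return code, falling back to
    /// `-EINVAL` when the code is not a well-formed errno.
    pub fn from_kernel_errno(errno: i32) -> Error {
        if errno < -MAX_ERRNO || errno >= 0 {
            // Malformed input: keep the invariant by substituting EINVAL.
            return Error(-EINVAL);
        }
        Error(errno)
    }

    /// Returns the errno as the negative integer C code expects.
    pub fn to_kernel_errno(self) -> i32 {
        self.0
    }
}
```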
Add a to_result() helper to convert kernel C return values to a Rust
Result, mapping >=0 values to Ok(()) and negative values to Err(...),
with Error::from_kernel_errno() ensuring that the errno is within range.

Imported from rust-for-linux/rust, originally developed by Wedson as part of the
AMBA device driver support.

Co-developed-by: Wedson Almeida Filho <[email protected]>
Signed-off-by: Wedson Almeida Filho <[email protected]>
Signed-off-by: Asahi Lina <[email protected]>
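
The mapping can be sketched as follows. `Error` is a simplified stand-in for the kernel crate's type, and the range check is inlined rather than delegated to `Error::from_kernel_errno()`.

```rust
const MAX_ERRNO: i32 = 4095;
const EINVAL: i32 = 22;

/// Simplified stand-in for the kernel crate's error type.
#[derive(Debug, PartialEq, Eq)]
pub struct Error(pub i32);

/// Maps a C-style return value to a `Result`: zero and positive values
/// become `Ok(())`, negative values become `Err`, with out-of-range codes
/// replaced by `-EINVAL`.
pub fn to_result(ret: i32) -> Result<(), Error> {
    if ret >= 0 {
        Ok(())
    } else if ret < -MAX_ERRNO {
        Err(Error(-EINVAL))
    } else {
        Err(Error(ret))
    }
}
```

With such a helper, a call into C can be chained with `?`, for example `to_result(ret)?;` inside a function that returns a `Result`.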
Some kernel C API functions return a pointer which embeds an optional
`errno`. Callers are supposed to check the returned pointer with
`IS_ERR()` and if this returns `true`, retrieve the `errno` using
`PTR_ERR()`.

Create a Rust helper function to implement the Rust equivalent:
transform a `*mut T` to `Result<*mut T>`.

Lina: Imported from rust-for-linux/linux, with subsequent refactoring
and contributions squashed in and attributed below. Replaced usage of
from_kernel_errno_unchecked() with an open-coded constructor, since this
is the only user anyway.

Co-developed-by: Boqun Feng <[email protected]>
Signed-off-by: Boqun Feng <[email protected]>
Co-developed-by: Miguel Ojeda <[email protected]>
Signed-off-by: Miguel Ojeda <[email protected]>
Co-developed-by: Fox Chen <[email protected]>
Signed-off-by: Fox Chen <[email protected]>
Co-developed-by: Gary Guo <[email protected]>
Signed-off-by: Gary Guo <[email protected]>
Signed-off-by: Sven Van Asbroeck <[email protected]>
Signed-off-by: Asahi Lina <[email protected]>
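
A userspace sketch of the `IS_ERR()`/`PTR_ERR()` logic described above, assuming the usual Linux convention that error pointers occupy the top `MAX_ERRNO` addresses. `Error` is a simplified stand-in type, and `err_ptr` is a hypothetical helper included only so the decoder can be exercised.

```rust
const MAX_ERRNO: usize = 4095;

/// Simplified stand-in for the kernel crate's error type.
#[derive(Debug, PartialEq, Eq)]
pub struct Error(pub i32);

/// Turns a possibly error-encoding pointer into a `Result`.
pub fn from_kernel_err_ptr<T>(ptr: *mut T) -> Result<*mut T, Error> {
    let addr = ptr as usize;
    // IS_ERR(): true when the address is >= (unsigned long)-MAX_ERRNO.
    if addr >= 0usize.wrapping_sub(MAX_ERRNO) {
        // PTR_ERR(): reinterpret the address as a negative errno.
        return Err(Error(addr as isize as i32));
    }
    Ok(ptr)
}

/// Hypothetical ERR_PTR() equivalent: encodes a negative errno as a pointer.
pub fn err_ptr<T>(errno: i32) -> *mut T {
    errno as isize as usize as *mut T
}
```

Note that a null pointer is not an error pointer under this scheme, so it is returned as `Ok`.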
Add a helper macro to easily return C result codes from a Rust function
that calls functions which return a Result<T>.

Imported from rust-for-linux/rust, originally developed by Wedson as
part of file_operations.rs.

Co-developed-by: Wedson Almeida Filho <[email protected]>
Signed-off-by: Wedson Almeida Filho <[email protected]>
Co-developed-by: Fox Chen <[email protected]>
Signed-off-by: Fox Chen <[email protected]>
Co-developed-by: Miguel Ojeda <[email protected]>
Signed-off-by: Miguel Ojeda <[email protected]>
Signed-off-by: Asahi Lina <[email protected]>
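
A simplified sketch of what such a macro can look like. The macro name `from_result_sketch!`, the `Error` type, and the `parse_len`/`callback` functions are all hypothetical; the macro in the commit may differ in detail.

```rust
/// Simplified stand-in for the kernel crate's error type (negative errno).
#[derive(Debug)]
pub struct Error(pub i32);

/// Runs the body in a closure so `?` can be used inside it, then flattens
/// the `Result` into a C-style integer return code.
macro_rules! from_result_sketch {
    ($($body:tt)*) => {{
        let result = (|| -> Result<i32, Error> { $($body)* })();
        match result {
            Ok(value) => value,
            Err(err) => err.0,
        }
    }};
}

/// Hypothetical fallible helper returning a `Result`.
fn parse_len(input: &str) -> Result<i32, Error> {
    input.parse::<i32>().map_err(|_| Error(-22))
}

/// Hypothetical C callback: returns a length on success or a negative errno.
pub extern "C" fn callback(valid: bool) -> i32 {
    from_result_sketch! {
        let len = parse_len(if valid { "16" } else { "oops" })?;
        Ok(len)
    }
}
```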
This is a subset of the Rust standard library `alloc` crate,
version 1.66.0, licensed under "Apache-2.0 OR MIT", from:

    https://github.com/rust-lang/rust/tree/1.66.0/library/alloc/src

The file is copied as-is, with no modifications whatsoever
(not even adding the SPDX identifiers).

For copyright details, please see:

    https://github.com/rust-lang/rust/blob/1.66.0/COPYRIGHT

Signed-off-by: Asahi Lina <[email protected]>
This is a subset of the Rust standard library `alloc` crate,
version 1.66.0, licensed under "Apache-2.0 OR MIT", from:

    https://github.com/rust-lang/rust/tree/1.66.0/library/alloc/src

The file is copied as-is, with no modifications whatsoever
(not even adding the SPDX identifiers).

For copyright details, please see:

    https://github.com/rust-lang/rust/blob/1.66.0/COPYRIGHT

Signed-off-by: Asahi Lina <[email protected]>
Add some missing fallible methods that we need.

They are all marked as:

    #[stable(feature = "kernel", since = "1.0.0")]

for easy identification.

Lina: Extracted from 487d757 ("rust: alloc: add some `try_*`
methods we need") in rust-for-linux/rust.

Signed-off-by: Miguel Ojeda <[email protected]>
Signed-off-by: Asahi Lina <[email protected]>
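
To illustrate what a fallible `try_*` method looks like, here is a sketch built on the stable `Vec::try_reserve()` from the standard library. The free function `try_push` is hypothetical and is not the method added by this commit.

```rust
use std::collections::TryReserveError;

/// Pushes a value, reporting allocation failure instead of aborting.
pub fn try_push<T>(v: &mut Vec<T>, value: T) -> Result<(), TryReserveError> {
    // Reserve space first; this is the only step that can fail.
    v.try_reserve(1)?;
    // Capacity is now guaranteed, so this push does not reallocate.
    v.push(value);
    Ok(())
}
```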
The XArray is an abstract data type which behaves like a very large
array of pointers. Add a Rust abstraction for this data type.

The initial implementation uses explicit locking on get operations and
returns a guard which blocks mutation, ensuring that the referenced
object remains alive. To avoid excessive serialization, users are
expected to use an inner type that can be efficiently cloned (such as
Arc<T>), and eagerly clone and drop the guard to unblock other users
after a lookup.

Future variants may support using RCU instead to avoid mutex locking.

This abstraction also introduces a reservation mechanism, which can be
used by alloc-capable XArrays to reserve a free slot without immediately
filling it, and then do so at a later time. If the reservation is
dropped without being filled, the slot is freed again for other users,
which eliminates the need for explicit cleanup code.

Signed-off-by: Asahi Lina <[email protected]>
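
The reservation mechanism can be modelled in userspace with a guard type whose `Drop` impl releases the slot. This is only a toy model under stated assumptions: `SlotMap` and `Reservation` are hypothetical names, a `Mutex<BTreeMap>` replaces the real XArray, and lookups clone the value instead of returning a guard.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Toy stand-in for an alloc-capable XArray. `None` marks a reserved slot.
pub struct SlotMap<T> {
    slots: Mutex<BTreeMap<usize, Option<T>>>,
}

/// A reserved but not yet filled slot. Dropping it frees the slot.
pub struct Reservation<'a, T> {
    map: &'a SlotMap<T>,
    index: usize,
    filled: bool,
}

impl<T: Clone> SlotMap<T> {
    pub fn new() -> Self {
        SlotMap { slots: Mutex::new(BTreeMap::new()) }
    }

    /// Reserves the lowest free index without storing a value yet.
    pub fn reserve(&self) -> Reservation<'_, T> {
        let mut slots = self.slots.lock().unwrap();
        let mut index = 0;
        while slots.contains_key(&index) {
            index += 1;
        }
        slots.insert(index, None);
        Reservation { map: self, index, filled: false }
    }

    /// Looks up a filled slot, cloning the value so the lock is held briefly.
    pub fn get(&self, index: usize) -> Option<T> {
        self.slots.lock().unwrap().get(&index).cloned().flatten()
    }
}

impl<T> Reservation<'_, T> {
    pub fn index(&self) -> usize {
        self.index
    }

    /// Fills the reserved slot, consuming the reservation.
    pub fn store(mut self, value: T) {
        self.map.slots.lock().unwrap().insert(self.index, Some(value));
        self.filled = true;
    }
}

impl<T> Drop for Reservation<'_, T> {
    fn drop(&mut self) {
        // An unfilled reservation gives its slot back automatically.
        if !self.filled {
            self.map.slots.lock().unwrap().remove(&self.index);
        }
    }
}
```

Storing an `Arc<T>` as the value type keeps the clone in `get()` cheap, which matches the usage pattern the commit message recommends.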
TODO: This isn't abstracted properly yet

Signed-off-by: Asahi Lina <[email protected]>
Apple Silicon SoCs (M1, M2, etc.) have a GPU with an ARM64 firmware
coprocessor. The firmware and the GPU share page tables in the standard
ARM64 format (the firmware literally sets the base as its TTBR0/1
registers). TTBR0 covers the low half of the address space and is
intended to be per-GPU-VM (GPU user mappings and kernel-managed
buffers), while TTBR1 covers the upper half and is global (firmware
code, data, management structures shared with the AP, and a few
GPU-accessible data structures).

In typical Apple fashion, the permissions are interpreted differently
from traditional ARM PTEs. By default, firmware mappings use Apple SPRR
permission remapping. The firmware only uses that for its own
code/data/MMIO mappings, and those pages are not accessible by the GPU
hardware. We never need to touch/manage these mappings, so this patch
does not support them.

When a specific bit is set in the PTEs, permissions switch to a
different scheme which supports various combinations of firmware/GPU
access. This is the mode intended to be used by AP GPU drivers, and what
we implement here.

The prot bits are interpreted as follows:

- IOMMU_READ and IOMMU_WRITE have the usual meaning.

- IOMMU_PRIV creates firmware-only mappings (no GPU access)
- IOMMU_NOEXEC creates GPU-only structures (no FW access)
- Otherwise structures are accessible by both GPU and FW

- IOMMU_MMIO creates Device mappings for firmware
- IOMMU_CACHE creates Normal-NC mappings for firmware (cache-coherent
  from the point of view of the AP, but slower)
- Otherwise creates Normal mappings for firmware (this requires manual
  cache management on the firmware side, as it is not coherent with the
  SoC fabric)
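
The interpretation listed above can be sketched as a pair of classification functions. This is an illustration, not the driver code: the enum and function names are made up, the `IOMMU_*` values are assumed to match `include/linux/iommu.h`, and giving `IOMMU_PRIV` precedence over `IOMMU_NOEXEC` (and `IOMMU_MMIO` over `IOMMU_CACHE`) simply follows the order of the list.

```rust
// Assumed to match the kernel's IOMMU prot flag values.
pub const IOMMU_READ: u32 = 1 << 0;
pub const IOMMU_WRITE: u32 = 1 << 1;
pub const IOMMU_CACHE: u32 = 1 << 2;
pub const IOMMU_NOEXEC: u32 = 1 << 3;
pub const IOMMU_MMIO: u32 = 1 << 4;
pub const IOMMU_PRIV: u32 = 1 << 5;

/// Who may access the mapping.
#[derive(Debug, PartialEq, Eq)]
pub enum Visibility {
    FirmwareOnly,
    GpuOnly,
    Shared,
}

/// Memory type as seen by the firmware coprocessor.
#[derive(Debug, PartialEq, Eq)]
pub enum FwMemType {
    Device,
    NormalNc,
    Normal,
}

pub fn visibility(prot: u32) -> Visibility {
    if prot & IOMMU_PRIV != 0 {
        Visibility::FirmwareOnly
    } else if prot & IOMMU_NOEXEC != 0 {
        Visibility::GpuOnly
    } else {
        Visibility::Shared
    }
}

pub fn fw_mem_type(prot: u32) -> FwMemType {
    if prot & IOMMU_MMIO != 0 {
        FwMemType::Device
    } else if prot & IOMMU_CACHE != 0 {
        FwMemType::NormalNc
    } else {
        FwMemType::Normal
    }
}
```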

GPU-only mappings (textures/etc) are expected to use IOMMU_CACHE and are
seemingly coherent with the CPU (or otherwise the firmware/GPU already
issue the required cache management operations when correctly
configured).

There is a GPU-RO/FW-RW mode, but it is not currently implemented (it
doesn't seem to be very useful for the driver). There seems to be no
real noexec control (i.e. for shaders) on the GPU side. All of these
mappings are implicitly noexec for the firmware.

Drivers are expected to fully manage per-user (TTBR0) page tables, but
ownership of shared kernel (TTBR1) page tables is shared between the
firmware and the AP OS. We handle this by simply using a smaller IAS to
drop down one level of page tables, so the driver can install a PTE in
the top-level (firmware-initialized) page table directly and just add an
offset to the VAs passed into the io_pgtable code. This avoids having to
have any special handling for this here. The firmware-relevant data
structures are small, so we do not expect to ever require more VA space
than one top-level PTE covers (IAS=36 for the next level, 64 GiB).
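
The 64 GiB figure can be checked with a little arithmetic, assuming the standard ARM64 long-descriptor layout with a 16K granule: a 14-bit page offset, and 8-byte descriptors giving 2048 entries (11 bits) per table. The constant and function names below are illustrative only.

```rust
/// 16K pages: 14 bits of page offset.
pub const PAGE_SHIFT: u32 = 14;
/// A 16K table holds 2048 eight-byte descriptors, i.e. 11 bits per level.
pub const BITS_PER_LEVEL: u32 = PAGE_SHIFT - 3;

/// Address bits resolved by `levels` levels of tables plus the page offset.
pub const fn va_bits(levels: u32) -> u32 {
    PAGE_SHIFT + levels * BITS_PER_LEVEL
}

/// Bytes of VA space covered by a subtree with `levels` levels of tables.
pub const fn span_bytes(levels: u32) -> u64 {
    1u64 << va_bits(levels)
}
```

Two levels below a top-level entry give 14 + 2 × 11 = 36 bits, which matches the IAS=36 and 64 GiB quoted above.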

Only 16K page mode is supported. The coprocessor MMU supports huge pages
as usual for ARM64, but the GPU MMU does not, so we do not enable them.

Signed-off-by: Asahi Lina <[email protected]>
6 participants