/*!
Backend for [HLSL][hlsl] (High-Level Shading Language).

# Supported shader model versions:
- 5.0
- 5.1
- 6.0

# Layout of values in `uniform` buffers

WGSL's ["Internal Layout of Values"][ilov] rules specify how each WGSL
type should be stored in `uniform` and `storage` buffers. The HLSL we
generate must access values in that form, even when it is not what
HLSL would use normally.

The rules described here only apply to WGSL `uniform` variables. WGSL
`storage` buffers are translated as HLSL `ByteAddressBuffer`s, for
which we generate `Load` and `Store` method calls with explicit byte
offsets. WGSL pipeline inputs must be scalars or vectors; they cannot
be matrices, which is where the interesting problems arise.

## Row- and column-major ordering for matrices

WGSL specifies that matrices in uniform buffers are stored in
column-major order. This matches HLSL's default, so one might expect
things to be straightforward. Unfortunately, WGSL and HLSL disagree on
what indexing a matrix means: in WGSL, `m[i]` retrieves the `i`'th
*column* of `m`, whereas in HLSL it retrieves the `i`'th *row*. We
want to avoid translating `m[i]` into some complicated reassembly of a
vector from individually fetched components, so this is a problem.

However, with a bit of trickery, it is possible to use HLSL's `m[i]`
as the translation of WGSL's `m[i]`:

- We declare all matrices in uniform buffers in HLSL with the
  `row_major` qualifier, and transpose the row and column counts: a
  WGSL `mat3x4<f32>`, say, becomes an HLSL `row_major float3x4`. (Note
  that WGSL and HLSL type names put the row and column in reverse
  order.) Since the HLSL type is the transpose of how WebGPU directs
  the user to store the data, HLSL will load all matrices transposed.

- Since matrices are transposed, an HLSL indexing expression retrieves
  the "columns" of the intended WGSL value, as desired.

- For vector-matrix multiplication, since `mul(transpose(m), v)` is
  equivalent to `mul(v, m)` (note the reversal of the arguments), and
  `mul(v, transpose(m))` is equivalent to `mul(m, v)`, we can
  translate WGSL `m * v` and `v * m` to HLSL by simply reversing the
  arguments to `mul`.
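
To make the indexing and argument-reversal points concrete, here is a
minimal Rust sketch that models them on the CPU. The `WgslMat3x4` and
`HlslFloat3x4` types and both functions are hypothetical stand-ins
written for this illustration; they are not part of naga or HLSL.

```rust
// Illustrative model only: every name here is hypothetical and exists
// solely for this sketch.

/// A WGSL `mat3x4<f32>`: three columns of four components each.
pub struct WgslMat3x4 {
    pub columns: [[f32; 4]; 3],
}

/// The HLSL `row_major float3x4` reading the same bytes: three rows of
/// four components each, so row `i` holds WGSL column `i`.
pub struct HlslFloat3x4 {
    pub rows: [[f32; 4]; 3],
}

/// WGSL `m * v`: the columns of `m`, weighted by the components of `v`.
pub fn wgsl_mat_times_vec(m: &WgslMat3x4, v: [f32; 3]) -> [f32; 4] {
    let mut out = [0.0; 4];
    for col in 0..3 {
        for row in 0..4 {
            out[row] += m.columns[col][row] * v[col];
        }
    }
    out
}

/// HLSL `mul(v, m)`: row vector times matrix,
/// `out[c] = v[0] * m[0][c] + v[1] * m[1][c] + v[2] * m[2][c]`.
pub fn hlsl_mul_vec_mat(v: [f32; 3], m: &HlslFloat3x4) -> [f32; 4] {
    let mut out = [0.0; 4];
    for r in 0..3 {
        for c in 0..4 {
            out[c] += v[r] * m.rows[r][c];
        }
    }
    out
}

fn main() {
    let columns = [
        [1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
    ];
    let wgsl = WgslMat3x4 { columns };
    // The same data, reinterpreted with rows and columns swapped.
    let hlsl = HlslFloat3x4 { rows: columns };
    let v = [1.0, 0.5, 2.0];

    // Indexing: HLSL `m[1]` yields what WGSL calls column 1.
    assert_eq!(hlsl.rows[1], wgsl.columns[1]);
    // Multiplication: WGSL `m * v` matches HLSL `mul(v, m)`.
    assert_eq!(wgsl_mat_times_vec(&wgsl, v), hlsl_mul_vec_mat(v, &hlsl));
}
```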

## Padding in two-row matrices

An HLSL `row_major floatKx2` matrix has padding between its rows that
the WGSL `matKx2<f32>` matrix it represents does not. HLSL stores all
matrix rows [aligned on 16-byte boundaries][16bb], whereas WGSL says
that the columns of a `matKx2<f32>` need only be [aligned as required
for `vec2<f32>`][ilov], which is [eight-byte alignment][8bb].

To compensate for this, any time a `matKx2<f32>` appears in a WGSL
`uniform` variable, whether directly as the variable's type or as part
of a struct/array, we actually emit `K` separate `float2` members, and
assemble/disassemble the matrix from its columns (in WGSL; rows in
HLSL) upon load and store.
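
The mismatch is easy to see by computing offsets. The following Rust
sketch uses hypothetical helper functions, written only for this
illustration, that encode the two alignment rules quoted above; it is
not how naga computes layouts.

```rust
// Illustrative model only: these helpers are hypothetical.

/// Size in bytes of one `vec2<f32>` / `float2`.
const VEC2_SIZE: u32 = 8;

/// Byte offset of column `i` in a WGSL `matKx2<f32>`: columns are
/// `vec2<f32>` values aligned to 8 bytes, so they are tightly packed.
pub const fn wgsl_column_offset(i: u32) -> u32 {
    i * VEC2_SIZE
}

/// Byte offset of row `i` in an HLSL `row_major floatKx2` in a constant
/// buffer: every row starts on a 16-byte boundary.
pub const fn hlsl_row_offset(i: u32) -> u32 {
    i * 16
}

fn main() {
    // Walk the three columns of a `mat3x2<f32>`.
    for i in 0..3 {
        println!(
            "column {}: WGSL offset {}, HLSL row_major offset {}",
            i,
            wgsl_column_offset(i),
            hlsl_row_offset(i)
        );
    }
    assert_eq!(wgsl_column_offset(1), 8);
    assert_eq!(hlsl_row_offset(1), 16);
}
```

Every column after the first lands at a different offset in the two
layouts, which is the mismatch that the separate `float2` members work
around.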

For example, the following WGSL struct type:

```ignore
struct Baz {
    m: mat3x2<f32>,
}
```

is rendered as the HLSL struct type:

```ignore
struct Baz {
    float2 m_0; float2 m_1; float2 m_2;
};
```

The `wrapped_struct_matrix` functions in `help.rs` generate HLSL
helper functions to access such members, converting between the stored
form and the HLSL matrix types appropriately. For example, for reading
the member `m` of the `Baz` struct above, we emit:

```ignore
float3x2 GetMatmOnBaz(Baz obj) {
    return float3x2(obj.m_0, obj.m_1, obj.m_2);
}
```

We also emit an analogous `Set` function, as well as functions for
accessing individual columns by dynamic index.
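
As a rough model of what these helpers do, the following Rust sketch
mirrors the `Get` function above and adds a setter and a dynamically
indexed column setter. The Rust names are hypothetical, and the HLSL
that naga actually emits may differ in naming and detail.

```rust
// Illustrative model only: a Rust stand-in for the HLSL helpers. All
// names are hypothetical.

/// Stored form: one two-component member per WGSL column.
#[derive(Debug, Default, PartialEq)]
pub struct Baz {
    pub m_0: [f32; 2],
    pub m_1: [f32; 2],
    pub m_2: [f32; 2],
}

/// Stand-in for HLSL `float3x2`: three rows of two components.
pub type Float3x2 = [[f32; 2]; 3];

/// Assemble the matrix from its stored members.
pub fn get_mat_m_on_baz(obj: &Baz) -> Float3x2 {
    [obj.m_0, obj.m_1, obj.m_2]
}

/// Disassemble the matrix into the stored members.
pub fn set_mat_m_on_baz(obj: &mut Baz, mat: Float3x2) {
    obj.m_0 = mat[0];
    obj.m_1 = mat[1];
    obj.m_2 = mat[2];
}

/// Store one column chosen at run time; out-of-range indices are
/// ignored in this sketch.
pub fn set_column_m_on_baz(obj: &mut Baz, column: [f32; 2], index: u32) {
    match index {
        0 => obj.m_0 = column,
        1 => obj.m_1 = column,
        2 => obj.m_2 = column,
        _ => {}
    }
}

fn main() {
    let mut baz = Baz::default();
    set_mat_m_on_baz(&mut baz, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]);
    set_column_m_on_baz(&mut baz, [7.0, 8.0], 1);
    assert_eq!(get_mat_m_on_baz(&baz), [[1.0, 2.0], [7.0, 8.0], [5.0, 6.0]]);
}
```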

[hlsl]: https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl
[ilov]: https://gpuweb.github.io/gpuweb/wgsl/#internal-value-layout
[16bb]: https://github.com/microsoft/DirectXShaderCompiler/wiki/Buffer-Packing#constant-buffer-packing
[8bb]: https://gpuweb.github.io/gpuweb/wgsl/#alignment-and-size
*/

mod conv;
mod help;
mod keywords;
mod storage;
mod writer;

use std::fmt::Error as FmtError;
use thiserror::Error;

use crate::{back, proc};

#[derive(Clone, Debug, Default, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct BindTarget {
    pub space: u8,
    pub register: u32,
    /// If the binding is an unsized binding array, this overrides the size.
    pub binding_array_size: Option<u32>,
}

// Using `BTreeMap` instead of `HashMap` so that the map itself can be hashed.
pub type BindingMap = std::collections::BTreeMap<crate::ResourceBinding, BindTarget>;

/// An HLSL shader model version.
#[allow(non_snake_case, non_camel_case_types)]
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq, PartialOrd)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub enum ShaderModel {
    V5_0,
    V5_1,
    V6_0,
    V6_1,
    V6_2,
    V6_3,
    V6_4,
    V6_5,
    V6_6,
    V6_7,
}

impl ShaderModel {
    pub const fn to_str(self) -> &'static str {
        match self {
            Self::V5_0 => "5_0",
            Self::V5_1 => "5_1",
            Self::V6_0 => "6_0",
            Self::V6_1 => "6_1",
            Self::V6_2 => "6_2",
            Self::V6_3 => "6_3",
            Self::V6_4 => "6_4",
            Self::V6_5 => "6_5",
            Self::V6_6 => "6_6",
            Self::V6_7 => "6_7",
        }
    }
}

impl crate::ShaderStage {
    pub const fn to_hlsl_str(self) -> &'static str {
        match self {
            Self::Vertex => "vs",
            Self::Fragment => "ps",
            Self::Compute => "cs",
        }
    }
}

impl crate::ImageDimension {
    const fn to_hlsl_str(self) -> &'static str {
        match self {
            Self::D1 => "1D",
            Self::D2 => "2D",
            Self::D3 => "3D",
            Self::Cube => "Cube",
        }
    }
}

/// Shorthand result used internally by the backend
type BackendResult = Result<(), Error>;

#[derive(Clone, Debug, PartialEq, thiserror::Error)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub enum EntryPointError {
    #[error("mapping of {0:?} is missing")]
    MissingBinding(crate::ResourceBinding),
}

/// Configuration used in the [`Writer`].
#[derive(Clone, Debug, Hash, PartialEq, Eq)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct Options {
    /// The HLSL shader model to be used.
    pub shader_model: ShaderModel,
    /// Map from resource bindings to HLSL binding locations.
    pub binding_map: BindingMap,
    /// Don't fail on missing bindings; instead, generate HLSL using a
    /// fallback bind target derived from the resource's group and binding.
    pub fake_missing_bindings: bool,
    /// Add special constants to `SV_VertexID` and `SV_InstanceID`,
    /// to make them work like in Vulkan/Metal, with help of the host.
    pub special_constants_binding: Option<BindTarget>,
    /// Bind target of the push constant buffer.
    pub push_constants_target: Option<BindTarget>,
    /// Should workgroup variables be zero initialized (by polyfilling)?
    pub zero_initialize_workgroup_memory: bool,
}

impl Default for Options {
    fn default() -> Self {
        Options {
            shader_model: ShaderModel::V5_1,
            binding_map: BindingMap::default(),
            fake_missing_bindings: true,
            special_constants_binding: None,
            push_constants_target: None,
            zero_initialize_workgroup_memory: true,
        }
    }
}

impl Options {
    fn resolve_resource_binding(
        &self,
        res_binding: &crate::ResourceBinding,
    ) -> Result<BindTarget, EntryPointError> {
        match self.binding_map.get(res_binding) {
            Some(target) => Ok(target.clone()),
            None if self.fake_missing_bindings => Ok(BindTarget {
                space: res_binding.group as u8,
                register: res_binding.binding,
                binding_array_size: None,
            }),
            None => Err(EntryPointError::MissingBinding(res_binding.clone())),
        }
    }
}

/// Reflection info for entry point names.
#[derive(Default)]
pub struct ReflectionInfo {
    /// Mapping of the entry point names.
    ///
    /// Each item in the array corresponds to an entry point index. The real
    /// entry point name may be different if one of the reserved words is used.
    ///
    /// Note: Some entry points may fail translation because of missing bindings.
    pub entry_point_names: Vec<Result<String, EntryPointError>>,
}

#[derive(Error, Debug)]
pub enum Error {
    #[error(transparent)]
    IoError(#[from] FmtError),
    #[error("A scalar with an unsupported width was requested: {0:?}")]
    UnsupportedScalar(crate::Scalar),
    #[error("{0}")]
    Unimplemented(String), // TODO: Error used only during development
    #[error("{0}")]
    Custom(String),
    #[error("overrides should not be present at this stage")]
    Override,
}

#[derive(Default)]
struct Wrapped {
    zero_values: crate::FastHashSet<help::WrappedZeroValue>,
    array_lengths: crate::FastHashSet<help::WrappedArrayLength>,
    image_queries: crate::FastHashSet<help::WrappedImageQuery>,
    constructors: crate::FastHashSet<help::WrappedConstructor>,
    struct_matrix_access: crate::FastHashSet<help::WrappedStructMatrixAccess>,
    mat_cx2s: crate::FastHashSet<help::WrappedMatCx2>,
    math: crate::FastHashSet<help::WrappedMath>,
}

impl Wrapped {
    fn clear(&mut self) {
        self.zero_values.clear();
        self.array_lengths.clear();
        self.image_queries.clear();
        self.constructors.clear();
        self.struct_matrix_access.clear();
        self.mat_cx2s.clear();
        self.math.clear();
    }
}

pub struct Writer<'a, W> {
    out: W,
    names: crate::FastHashMap<proc::NameKey, String>,
    namer: proc::Namer,
    /// HLSL backend options
    options: &'a Options,
    /// Information about entry point arguments and result types.
    entry_point_io: Vec<writer::EntryPointInterface>,
    /// Set of expressions that have associated temporary variables
    named_expressions: crate::NamedExpressions,
    wrapped: Wrapped,

    /// A reference to some part of a global variable, lowered to a series of
    /// byte offset calculations.
    ///
    /// See the [`storage`] module for background on why we need this.
    ///
    /// Each [`SubAccess`] in the vector is a lowering of some [`Access`] or
    /// [`AccessIndex`] expression to the level of byte strides and offsets. See
    /// [`SubAccess`] for details.
    ///
    /// This field is a member of [`Writer`] solely to allow re-use of
    /// the `Vec`'s dynamic allocation. The value is no longer needed
    /// once HLSL for the access has been generated.
    ///
    /// [`Storage`]: crate::AddressSpace::Storage
    /// [`SubAccess`]: storage::SubAccess
    /// [`Access`]: crate::Expression::Access
    /// [`AccessIndex`]: crate::Expression::AccessIndex
    temp_access_chain: Vec<storage::SubAccess>,
    need_bake_expressions: back::NeedBakeExpressions,
}