Message | Time | |
---|---|---|
25 Jan 2023 | ||
One follow-up here: why would barriers be necessary on SIMD architectures? My understanding is if you have an if/else statement and at least one thread runs on each branch, then SIMD will take all threads in a workgroup down both branches and just ignore the irrelevant branch. Wouldn't everything then be synchronized anyway, at which point wouldn't the barriers be unneeded? (I'm positive my understanding of SIMD is wrong here, but it'd be useful to know how it is wrong!) | 11:51:26 | |
The scope of a workgroupBarrier is all invocations in the workgroup. It is likely that the SIMD width of the hardware you are running on would only include a subset of those invocations (depends on the GPU and the workgroup size), and therefore a barrier is necessary. In the future (post V1), we will have subgroups which will likely align to the SIMD width of the hardware, and a subgroup barrier to synchronize between them would be inexpensive (or free). | 11:59:26 | |
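A minimal WGSL sketch (not from the chat; bindings, names and sizes are illustrative) of why this matters: a workgroup of 256 invocations typically spans several hardware SIMD groups, so an explicit barrier is needed before reading values written by other invocations.

```wgsl
// Hypothetical example: 256 invocations per workgroup, which usually spans
// several hardware SIMD groups, so an explicit barrier is needed.
var<workgroup> tile : array<f32, 256>;

@group(0) @binding(0) var<storage, read> input : array<f32>;
@group(0) @binding(1) var<storage, read_write> output : array<f32>;

@compute @workgroup_size(256)
fn main(@builtin(local_invocation_index) lid : u32,
        @builtin(global_invocation_id) gid : vec3<u32>) {
  // Each invocation stages one element into workgroup memory.
  tile[lid] = input[gid.x];

  // Without this, invocations in other SIMD groups may not yet have
  // written their element when we read a sibling's value below.
  workgroupBarrier();

  // Read a value written by a (possibly distant) sibling invocation.
  output[gid.x] = tile[255u - lid];
}
```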
In reply to @jrprice:matrix.org Ok, this makes a lot of sense. I can imagine that 256 (the max) well exceeds the size of the actual SIMD width, at which point the WebGPU API needs to manage the barriers. How expensive is synchronization currently? Is it something to be avoided unless necessary, or is it pretty cheap and can be used without thinking about it? | 12:12:02 | |
In reply to @ghadeer.abousaleh:matrix.org Correct, that's the only one that deals with endianness. And shmookey found the right minutes (to my recollection). Basically we don't see the demand for a big-endian GPU programming model. And adding one significantly increases testing complexity. | 15:25:47 | |
In reply to @stronglynormal:matrix.org Re: storageBarrier vs. workgroupBarrier. I filed an issue so that it gets answered (hopefully well) in a more permanent place - see my reply at https://github.com/gpuweb/gpuweb/issues/3774#issuecomment-1403887129 Basically, neither can be used to coordinate access across workgroups. | 16:31:41 | |
In reply to @stronglynormal:matrix.org Re: "Are workgroup barriers faster than storage barriers? Are n atomic operations on a workgroup variable faster than n atomic operations across different threads on a storage variable?" I think it depends a lot on your application (sorry). Structurally, the workgroup address space is only visible from that workgroup, but storage buffer memory is ultimately addressable from any invocation on the device. Depending on how the GPU is architected, it's possible for access to storage buffers to be slower. Implicitly there's the question of: is there enough work that can be done within a workgroup that it's worth temporarily moving the data from the storage buffer into workgroup memory, and then copying results back out? It's very app dependent. | 17:35:59 | |
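A hedged sketch of the staging pattern mentioned above - copy a slice from a storage buffer into workgroup memory, work on it locally, then write results back. The bindings, names and the trivial neighbour sum are made up for illustration.

```wgsl
// Hypothetical staging pattern: storage -> workgroup memory -> storage.
const WG_SIZE : u32 = 64u;

@group(0) @binding(0) var<storage, read> src : array<u32>;
@group(0) @binding(1) var<storage, read_write> dst : array<u32>;

var<workgroup> scratch : array<u32, WG_SIZE>;

@compute @workgroup_size(64)
fn main(@builtin(local_invocation_index) lid : u32,
        @builtin(workgroup_id) wid : vec3<u32>) {
  let base = wid.x * WG_SIZE;

  // Stage this workgroup's slice into workgroup memory.
  scratch[lid] = src[base + lid];

  // Make the staged values visible to every invocation in the workgroup.
  // Note: this does NOT synchronize with other workgroups.
  workgroupBarrier();

  // Trivial local computation that reads another invocation's staged value.
  let sum = scratch[lid] + scratch[(lid + 1u) % WG_SIZE];

  // Copy the result back out to storage.
  dst[base + lid] = sum;
}
```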
26 Jan 2023 | ||
Hey folks, I know at some point isInf and isNan were available but were later deprecated. My question is how one could try to implement them. Is there a constant that one can check against to implement this functionality? | 17:32:27 | |
Unfortunately there is no reliable and portable way - which is the reason they were removed from core. | 17:38:48 | |
your best bet is checking before the nan is created | 18:53:51 | |
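A small hypothetical WGSL example of that advice - guard the operation that would produce the NaN, rather than trying to detect the NaN afterwards:

```wgsl
// Hypothetical helper: avoid producing a NaN instead of testing for one.
fn safe_normalize(v : vec3<f32>) -> vec3<f32> {
  let len = length(v);
  // A zero-length vector would make v / len produce NaNs, so test the
  // input before the division rather than the result after it.
  if (len > 0.0) {
    return v / len;
  }
  return vec3<f32>(0.0);
}
```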
27 Jan 2023 | ||
In reply to @ben-clayton:matrix.org Thanks | 09:45:45 | |
Another question. Today I just started my Chrome Canary and it seems it was automatically updated; I see that the deprecated support for SPIR-V has been removed. Is there a chance this can be enabled as a developer flag? I'm working on a project that would take a long time to move completely to WGSL. I also tried WebGPU on Mac, but it doesn't seem to be supported yet. Any suggestion? | 09:47:56 | |
Dawn and Naga both provide offline tooling to convert from SPIR-V to WGSL. | 09:49:12 | |
I believe some have successfully built both as WASM, which enables SPIR-V consumption in the browser. | 09:49:59 | |
Hmm, what do you mean? Actually my project is a C++ -> WebAssembly initiative | 09:50:49 | |
I do have it in my todos to try Naga as a runtime porting layer | 09:51:15 | |
but even going through SPIR-V in WebAssembly returns Unsupported sType (SType::ShaderModuleSPIRVDescriptor). Expected (SType::ShaderModuleWGSLDescriptor) | 09:52:47 | |
Tint is the compiler for Dawn. It can be fetched and built from https://dawn.googlesource.com/tint, and it has an executable that can consume SPIR-V and emit WGSL. | 09:53:46 | |
Thanks for the help. Let me try that out. I'm already using Dawn as a library for the C++ app. Should be easy to try :) | 09:54:39 | |
(That Tint repo is a cut-down version of Dawn - it's updated with Dawn's changes, daily) | 09:55:28 | |
I have a line in a shader that looks like let light = lighting.data[i]; where lighting is a uniform buffer. Tint expands that in HLSL into a call to a helper function that copies the entire buffer - it takes lighting by value, which ultimately compiles into hundreds of consecutive MOVs. By contrast, the similar materials buffer is read-only-storage, and the helper function generated for it seems to take the buffer by reference. Is this expected behavior / how should I be writing this sort of thing? If it's expected, are these semantics in the spec? | 16:49:56 | |
As this is specific to Dawn / Tint - the Dawn chat room might be a better place to ask this. That said - to help me reproduce - what does your lighting uniform buffer look like? | 16:56:40 | |
Ah whoops wrong channel - the buffer looks something like this: | 17:01:28 | |
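The actual buffer snippet above was redacted; as a stand-in, here is a hypothetical WGSL declaration of the kind being discussed (the struct, names and array size are invented, not from the chat), with the dynamically indexed read that reportedly triggers the by-value copy in Tint's HLSL output:

```wgsl
// Hypothetical reconstruction of the kind of binding being discussed.
struct Light {
  position : vec4<f32>,
  color    : vec4<f32>,
}

struct Lighting {
  count : u32,
  data  : array<Light, 64>,
}

@group(0) @binding(0) var<uniform> lighting : Lighting;

fn shade(i : u32) -> vec4<f32> {
  // Dynamically indexing an array inside a uniform buffer - the access that
  // reportedly makes Tint's HLSL backend copy the whole buffer by value.
  let light = lighting.data[i];
  return light.color;
}
```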
I have to go AFK for a little bit. If possible, please can you file a minimal repro at crbug.com/tint? I'm curious to see these large-data copies. | 17:03:00 |