1. When ti.sync() is called from Python scope, is it equivalent to cudaDeviceSynchronize or cudaStreamSynchronize?
2. I can call ti.sync() in kernel scope. Is that equivalent to __syncthreads() in CUDA?
Thanks!
Currently, ti.sync() corresponds to cudaStreamSynchronize() on the CUDA backend.
I found the interfaces in these two places:
taichi/program/program.cpp
taichi/llvm/llvm_program.cpp
Taichi uses the CUDA Driver API, so it's actually cuStreamSynchronize().
Calling ti.sync() within a @ti.kernel is undefined behavior; I'm not sure whether it has side effects.
I initialized with ti.init(print_kernel_nvptx=True, arch=ti.cuda) and didn't see any related sync barrier instruction in the PTX kernel.