Issue with the CUDA driver

Hi! Since I’m new to Taichi, I thought it would be appropriate to discuss my problem here before opening an issue on GitHub, to make sure I’m not doing anything fundamentally wrong.

I wrote this simple program for measuring performance:

import taichi as ti
import timeit

@ti.data_oriented
class Laplacian2d:
    @ti.kernel
    def run(self, in_phi: ti.template(), out_lap: ti.template()):
        # 5-point stencil in the (i, j) plane, applied at every k
        for i, j, k in out_lap:
            out_lap[i, j, k] = (
                -4 * in_phi[i, j, k]
                + in_phi[i - 1, j, k]
                + in_phi[i + 1, j, k]
                + in_phi[i, j - 1, k]
                + in_phi[i, j + 1, k]
            )

def main(arch):
    # Select the backend before allocating any fields
    ti.init(arch=arch)

    shape = (128, 128, 128)
    nt = 1000

    phi = ti.field(float, shape=shape)
    lap = ti.field(float, shape=shape)

    laplacian = Laplacian2d()

    start = timeit.default_timer()
    for _ in range(nt):
        laplacian.run(phi, lap)
    stop = timeit.default_timer()

    print(f"Elapsed time: {stop - start} s")

if __name__ == "__main__":
    # CUDA backend, where the error shows up
    main(ti.cuda)

The kernel compiles and runs fine, but at the end, when it comes to releasing device memory (I guess), it throws:

[cuda_driver.h:operator()@80] CUDA Error CUDA_ERROR_CONTEXT_IS_DESTROYED: context is destroyed while calling stream_synchronize (cuStreamSynchronize)

Any idea?

I’m using Python 3.8 and CUDA 11.0 on a Linux machine.
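
A side note on the kernel as posted: it reads in_phi[i - 1, ...], in_phi[i + 1, ...], in_phi[i, j - 1, ...] and in_phi[i, j + 1, ...] on boundary cells too, where those indices fall outside the field. Whether that has anything to do with the error above is not established here. As a reference for checking results on a small grid, here is a minimal pure-Python sketch of the same stencil restricted to interior points. The function name laplacian_2d_interior and the choice to leave boundary cells at 0.0 are assumptions made for illustration, not part of the original program.

```python
def laplacian_2d_interior(phi, nx, ny, nz):
    """Pure-Python reference for the 5-point stencil used in Laplacian2d.run.

    phi is a nested list indexed as phi[i][j][k]. Only interior points in
    i and j are updated, so no neighbor index leaves the grid. Boundary
    cells stay at 0.0 (an assumption; the original kernel does not say
    how boundaries should be treated).
    """
    lap = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(nz):
                lap[i][j][k] = (
                    -4 * phi[i][j][k]
                    + phi[i - 1][j][k]
                    + phi[i + 1][j][k]
                    + phi[i][j - 1][k]
                    + phi[i][j + 1][k]
                )
    return lap


# Small check: for phi = i*i + j*j the discrete Laplacian is exactly 4
# at every interior point.
n = 5
phi = [[[float(i * i + j * j) for _ in range(n)] for j in range(n)] for i in range(n)]
lap = laplacian_2d_interior(phi, n, n, n)
print(lap[2][2][0])
```

Comparing a small Taichi field against this reference (interior points only) would be one way to confirm the kernel computes what is intended, independently of the shutdown error.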

I ran into the same problem. Did you solve it?