[PyCUDA] Tricks to avoid device2device data copy when slicing the gpuarray?


黄 瓒

Hi All,

@inducer THANK YOU for providing PyCUDA.

Since cudaMalloc can be time-consuming, and it seems that even slicing a gpuarray triggers such an allocation in PyCUDA, are there any tricks to avoid frequent GPU memory operations?

Regards,
Peter



_______________________________________________
PyCUDA mailing list
[hidden email]
https://lists.tiker.net/listinfo/pycuda

Re: Tricks to avoid device2device data copy when slicing the gpuarray?

Andreas Kloeckner
黄 瓒 <[hidden email]> writes:

> Hi All,
>
> @inducer<https://github.com/inducer> THANK YOU for providing PyCUDA.
>
> Since cudaMalloc can be time-consuming, and it seems that even slicing a gpuarray triggers such an allocation in PyCUDA, are there any tricks to avoid frequent GPU memory operations?

Slicing a GPUArray involves no allocations. PyCUDA also includes a
memory pool which can help avoid redundant allocations.
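To illustrate both points, a minimal sketch (untested here, since it needs a CUDA-capable GPU with PyCUDA installed): a contiguous slice of a `GPUArray` is a view into the same device buffer, and repeated temporaries can be routed through `pycuda.tools.DeviceMemoryPool` so their memory is recycled rather than re-allocated with cudaMalloc each time.

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.gpuarray as gpuarray
from pycuda.tools import DeviceMemoryPool

# A contiguous slice is a view into a's existing device allocation:
# no cudaMalloc and no device-to-device copy take place here.
a = gpuarray.arange(1024, dtype=np.float32)
view = a[128:256]  # shares a's device memory

# For allocations you genuinely need over and over, use a memory pool
# so freed blocks are reused instead of going back through cudaMalloc.
pool = DeviceMemoryPool()
for _ in range(10):
    tmp = gpuarray.zeros(1024, dtype=np.float32, allocator=pool.allocate)
    # ... use tmp ...
    del tmp  # block returns to the pool, not to cudaFree
```

The same `allocator=pool.allocate` argument is accepted by the other `gpuarray` constructors (e.g. `empty`, `to_gpu`), so an entire workload can share one pool.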

Andreas
