What's new in this version: CUDA Toolkit Major Component Versions: CUDA Components: - Starting with CUDA 11, the various components in the toolkit are versioned independently
CUDA Driver: - Running a CUDA application requires a system with at least one CUDA-capable GPU and a driver that is compatible with the CUDA Toolkit. See Table 2. For more information on the various GPU products that are CUDA capable, see NVIDIA's website. - Each release of the CUDA Toolkit requires a minimum version of the CUDA driver. The CUDA driver is backward compatible, meaning that applications compiled against a particular version of CUDA will continue to work on subsequent (later) driver releases.
General CUDA: - Stream-ordered memory allocator enhancements
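The minimum-driver rule above can be sketched as a simple version comparison. This is an illustrative Python sketch, not an NVIDIA tool: the helper names are hypothetical, and the minimum of 465.19.01 (the Linux figure reported for CUDA 11.3.0 GA) is an assumption that should be confirmed against Table 2.

```python
def parse_driver_version(text):
    """Turn a dotted driver version such as '465.19.01' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def driver_is_sufficient(installed, minimum):
    """Return True when the installed driver is at least the required minimum.

    Because the driver is backward compatible, any later driver release
    also satisfies the requirement.
    """
    return parse_driver_version(installed) >= parse_driver_version(minimum)


# Assumed minimum for CUDA 11.3.0 GA on Linux; check Table 2 for your platform.
ASSUMED_MIN_DRIVER = "465.19.01"

print(driver_is_sufficient("470.57.02", ASSUMED_MIN_DRIVER))  # newer driver
print(driver_is_sufficient("460.32.03", ASSUMED_MIN_DRIVER))  # older driver
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "465.9" as newer than "465.19".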
CUDA Graph Enhancements: - Enhancements to make stream capture more flexible: Functionality to provide read-write access to the graph and the dependency information of a capturing stream while the capture is in progress. See cudaStreamGetCaptureInfo_v2() and cudaStreamUpdateCaptureDependencies(). - User object lifetime assistance: Functionality to assist user code in lifetime management for user-allocated resources referenced in graphs. This is useful when graphs, their derivatives, and their asynchronous executions have an unknown or unbounded lifetime that is not under the control of the code that created the resource, such as libraries under stream capture. See cudaUserObjectCreate() and cudaGraphRetainUserObject(). - Graph Debug: New API to produce a DOT graph output from a given CUDA Graph
New Stream Priorities: - The CUDA Driver API cuCtxGetStreamPriorityRange() now exposes a total of 6 stream priorities, up from the 3 exposed in prior releases - Expose driver symbols in runtime API - New CUDA Driver API cuGetProcAddress() and CUDA Runtime API cudaDriverGetEntryPoint() to query the memory addresses for CUDA Driver API functions - Support for virtual aliasing across kernel boundaries - Added support for Ubuntu 20.04.2 on x86_64 and Arm SBSA platforms
CUDA Tools: CUDA Compilers: - cu++filt demangler tool - NVRTC versioning changes - Preview support for alloca()
Nsight Eclipse Plugin: - Eclipse versions 4.10 to 4.14 are currently supported in CUDA 11.3
CUDA Libraries: cuFFT Library: - cuFFT shared libraries are now linked statically against libstdc++ on Linux platforms - Improved performance of certain sizes (multiples of large powers of 3, powers of 11) in SM86
cuSPARSE Library: - Added new routine cusparseSpSV for sparse triangular solver with better performance. The new Generic API supports: - CSR storage format - Non-transpose, transpose, and transpose-conjugate operations - Upper, lower fill mode - Unit, non-unit diagonal type - 32-bit and 64-bit indices - Uniform data type computation
NVIDIA Performance Primitives (NPP): - Added nppiDistanceTransformPBA functions
Deprecated Features: - The following features are deprecated in the current release of the CUDA software. The features still work in the current release, but their documentation may have been removed, and they will become officially unsupported in a future release. We recommend that developers employ alternative solutions to these features in their software.
CUDA Libraries: - cuSPARSE: cusparseScsrsv2_analysis, cusparseScsrsv2_solve, cusparseXcsrsv2_zeroPivot, and cusparseScsrsv2_bufferSize have been deprecated in favor of cusparseSpSV
Tools: - Nsight Eclipse Plugin: Docker support is deprecated in Eclipse 4.14 and earlier versions as of CUDA 11.3, and Docker support will be dropped for Eclipse 4.14 and earlier in a future CUDA Toolkit release.
Resolved Issues: General CUDA: - Historically, the CUDA driver has serialized most APIs operating on the same CUDA context between CPU threads. In CUDA 11.3, this has been relaxed for kernel launches such that the driver serialization may be reduced when multiple CPU threads are launching CUDA kernels into distinct streams within the same context.
cuRAND Library: - Fixed inconsistency between random numbers generated by GPU and host generators when CURAND_ORDERING_PSEUDO_LEGACY ordering is selected for certain generator types
CUDA Math API: - Previous releases of CUDA were potentially delivering incorrect results in some Linux distributions for the following host Math APIs: sinpi, cospi, sincospi, sinpif, cospif, sincospif. If passed huge inputs like 7.3748776e+15 or 8258177.5 the results were not equal to 0 or 1. These have been corrected with this release.
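To see why those inputs should yield exact results, note that 7.3748776e+15 is an integer and 8258177.5 is a half-integer, so sin(pi*x) and cos(pi*x) must have magnitude exactly 0 or 1. The Python sketch below is not the CUDA implementation; it is one assumed way to get exact answers, by reducing the argument modulo 2 before multiplying by pi. It assumes finite inputs.

```python
import math


def sinpi(x):
    """sin(pi * x), reducing x modulo 2 first so huge inputs stay exact."""
    r = math.fmod(x, 2.0)  # exact for floats: r keeps x's sign and |r| < 2
    if r == math.floor(r):  # integer input: result is exactly zero
        return 0.0
    if 2.0 * r == math.floor(2.0 * r):  # half-integer: exactly +1 or -1
        return 1.0 if r in (0.5, -1.5) else -1.0
    return math.sin(math.pi * r)


def cospi(x):
    """cos(pi * x), using the same reduction (cosine is even, so use |x|)."""
    r = math.fmod(abs(x), 2.0)  # r is in [0, 2)
    if r in (0.5, 1.5):  # half-integer input: result is exactly zero
        return 0.0
    if r == 0.0:
        return 1.0
    if r == 1.0:
        return -1.0
    return math.cos(math.pi * r)


print(sinpi(8258177.5), cospi(8258177.5))          # -1.0 0.0
print(sinpi(7.3748776e+15), cospi(7.3748776e+15))  # 0.0 1.0
```

Multiplying a huge argument by pi first and then calling sin or cos loses the fractional part to rounding, which is the kind of error the release note describes.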
Known Issues: cuBLAS Library: - The planar complex matrix descriptor for batched matmul has inconsistent interpretation of batch offset - Mixed precision operations with reduction scheme CUBLASLT_REDUCTION_SCHEME_OUTPUT_TYPE (might be automatically selected based on problem size by cublasSgemmEx() or cublasGemmEx() too, unless CUBLAS_MATH_DISALLOW_REDUCED_PRECISION_REDUCTION math mode bit is set) not only stores intermediate results in output type but also accumulates them internally in the same precision, which may result in lower than expected accuracy. Please use CUBLASLT_MATMUL_PREF_REDUCTION_SCHEME_MASK or CUBLAS_MATH_DISALLOW_REDUCED_PRECISION_REDUCTION if this results in numerical precision issues in your application.
cuFFT Library: - cuFFT planning and plan estimation functions may not restore the correct context, which affects CUDA driver API applications - Plans with strides, primes larger than 127 in FFT size decomposition and total size of transform including strides bigger than 32GB produce incorrect results
cuSOLVER Library: - For values N<=16, cusolverDn[S|D|C|Z]syevjBatched performs an out-of-bounds access and may deliver the wrong result. The workaround is to pad the matrix A with a diagonal matrix D such that the dimension of [A 0 ; 0 D] is bigger than 16. The diagonal entry D(j,j) must be bigger than the maximum eigenvalue of A, for example, norm(A, 'fro'). After the syevj, W(0:n-1) contains the eigenvalues and A(0:n-1,0:n-1) contains the eigenvectors.
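The padding workaround can be sketched on the host as follows. This is a minimal pure-Python illustration of building [A 0 ; 0 D], not cuSOLVER code: the function name, the list-of-lists layout, and the +1.0 margin that keeps D(j,j) strictly above norm(A, 'fro') are all assumptions made for the example.

```python
import math


def pad_for_syevj(A, min_dim=17):
    """Return the block matrix [A 0; 0 D] with dimension at least min_dim.

    A is a square symmetric (or Hermitian) matrix given as a list of rows.
    Each D(j,j) is set above the Frobenius norm of A, which bounds the
    magnitude of every eigenvalue of A.
    """
    n = len(A)
    fro = math.sqrt(sum(abs(v) ** 2 for row in A for v in row))
    d = fro + 1.0  # assumed margin so D(j,j) is strictly larger than any eigenvalue
    m = max(n, min_dim)
    padded = [[0.0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            padded[i][j] = A[i][j]  # top-left block is A
    for k in range(n, m):
        padded[k][k] = d  # bottom-right block is the diagonal matrix D
    return padded


A = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues are 1 and 3
P = pad_for_syevj(A)
print(len(P), P[0][:2], P[2][2])
```

Because the eigenvalues are returned in ascending order and every entry of D exceeds the largest eigenvalue of A, the padding values land after the first n entries of W, which is why W(0:n-1) still holds the eigenvalues of A.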