What's new in this version:

CUDA Compiler:
Resolved Issues:
- Previously, when using recent versions of the VS 2019 host compiler, a call to pow(double, int) or pow(float, int) in host or device code sometimes caused build failures. This issue has been resolved.
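For illustration, the kind of call affected by the pow issue looks roughly like the following. This is a host-only C++ sketch, not code from the release notes; the helper names compound and scale_pow2 are hypothetical, and the same std::pow calls could equally appear inside device code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: a pow(double, int) call, i.e. a floating-point base
// raised to an integer exponent.
double compound(double rate, int periods) {
    assert(periods >= 0);
    return std::pow(1.0 + rate, periods);
}

// Hypothetical helper: a pow(float, int) call. The result is cast back to
// float explicitly so the code does not depend on which overload is chosen.
float scale_pow2(float x, int n) {
    return static_cast<float>(std::pow(2.0f, n)) * x;
}
```

Both helpers pass an int exponent directly, which is the call pattern the note describes.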
cuSOLVER:
New Features:
- A new singular value decomposition routine (GESVDR) has been added. GESVDR computes a partial spectrum with random sampling and is an order of magnitude faster than GESVD.
- libcusolver.so no longer links libcublas_static.a; instead, it depends on libcublas.so. This reduces the binary size of libcusolver.so but breaks backward compatibility: users must link libcusolver.so with the correct version of libcublas.so.
cuSPARSE:
New Features:
- New Tensor Core-accelerated Block Sparse Matrix-Matrix Multiplication (cusparseSpMM) and introduction of the Blocked-Ellpack storage format
- New algorithms for CSR/COO Sparse Matrix-Vector Multiplication (cusparseSpMV) with better performance
- New algorithm (CUSPARSE_SPMM_CSR_ALG3) for Sparse Matrix-Matrix Multiplication (cusparseSpMM) with better performance, especially for small matrices
- New routine for Sampled Dense Matrix-Dense Matrix Multiplication (cusparseSDDMM), which deprecates cusparseConstrainedGeMM and provides better performance
- Better accuracy of cusparseAxpby, cusparseRot, and cusparseSpVV for bfloat16 and half regular/complex data types
- All routines support NVTX annotation for enhancing the profiler timeline on complex applications
Deprecations:
- cusparseConstrainedGeMM has been deprecated in favor of cusparseSDDMM
- cusparseCsrmvEx has been deprecated in favor of cusparseSpMV
- The COO Array of Structure (CooAoS) format has been deprecated, including cusparseCreateCooAoS, cusparseCooAoSGet, and its support for cusparseSpMV
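For readers unfamiliar with the operation behind cusparseSDDMM: it evaluates a dense product A * B only at the positions already present in a sparse output C, that is C = alpha * (A * B) ∘ spy(C) + beta * C. The following host-only C++ sketch shows that arithmetic on the CPU. It does not call cuSPARSE; the Csr struct, the function name sddmm_reference, and the row-major dense layout are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative CSR container (not a cuSPARSE type). The sparsity pattern is
// fixed; only `values` is updated.
struct Csr {
    int rows;
    int cols;
    std::vector<int> row_offsets;   // size rows + 1
    std::vector<int> col_indices;   // size nnz
    std::vector<float> values;      // size nnz
};

// CPU reference: C = alpha * (A * B) sampled at C's nonzero positions + beta * C.
// A is rows x k and B is k x cols, both dense and row-major (an assumption).
void sddmm_reference(float alpha, const std::vector<float>& A,
                     const std::vector<float>& B, int k, float beta, Csr& C) {
    assert(A.size() == static_cast<std::size_t>(C.rows) * k);
    assert(B.size() == static_cast<std::size_t>(k) * C.cols);
    assert(C.row_offsets.size() == static_cast<std::size_t>(C.rows) + 1);
    for (int i = 0; i < C.rows; ++i) {
        for (int p = C.row_offsets[i]; p < C.row_offsets[i + 1]; ++p) {
            const int j = C.col_indices[p];
            float dot = 0.0f;
            for (int t = 0; t < k; ++t) {
                dot += A[i * k + t] * B[t * C.cols + j];
            }
            C.values[p] = alpha * dot + beta * C.values[p];
        }
    }
}
```

Only the stored entries of C are touched, so the cost scales with nnz * k rather than with the full rows * cols * k of a dense product.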
Known Issues:
- Calling cusparseDestroySpVec, cusparseDestroyDnVec, cusparseDestroySpMat, cusparseDestroyDnMat, or cusparseDestroy with a NULL argument could cause a segmentation fault on Windows
Resolved Issues:
- cusparseAxpby, cusparseGather, cusparseScatter, cusparseRot, cusparseSpVV, and cusparseSpMV now support zero-size matrices
- cusparseCsr2cscEx2 now correctly handles empty matrices (nnz = 0)
- cusparseXcsr2csr_compress now uses the 2-norm for the comparison of complex values instead of only the real part
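The difference between comparing the 2-norm and comparing only the real part matters for values with a small real part and a large imaginary part. The sketch below illustrates this in host-only C++. It is not the cuSPARSE implementation; the function names and the strict "greater than threshold" rule are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Assumed old behaviour: keep an entry if the magnitude of its real part
// exceeds the threshold, ignoring the imaginary part.
bool keep_by_real_part(std::complex<float> v, float tol) {
    return std::abs(v.real()) > tol;
}

// Behaviour described in the note: keep an entry if its 2-norm
// sqrt(re^2 + im^2) exceeds the threshold.
bool keep_by_norm(std::complex<float> v, float tol) {
    return std::abs(v) > tol;
}

// Illustrative pruning pass over a list of values using the 2-norm rule.
std::vector<std::complex<float>> prune_by_norm(
        const std::vector<std::complex<float>>& vals, float tol) {
    assert(tol >= 0.0f);
    std::vector<std::complex<float>> kept;
    for (const auto& v : vals) {
        if (keep_by_norm(v, tol)) {
            kept.push_back(v);
        }
    }
    return kept;
}
```

For example, the value 0 + 5i with a threshold of 1 would be dropped by the real-part rule but is kept by the 2-norm rule.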
Extended functionalities for cusparseSpMV:
- Support for the CSC format
- Support for regular/complex bfloat16 data types for both uniform and mixed-precision computation
- Support for mixed regular-complex data type computation
- Support for deterministic and non-deterministic computation
NPP:
New Features:
- New APIs added to compute the Distance Transform using the Parallel Banding Algorithm (PBA): nppiDistanceTransformPBA_xxxxx_C1R_Ctx(), where xxxxx specifies the input and output combination (8u16u, 8s16u, 16u16u, 16s16u, 8u32f, 8s32f, 16u32f, 16s32f), and nppiSignedDistanceTransformPBA_32f_C1R_Ctx()
Resolved Issues:
- Fixed an issue in which Label Markers added zero pixels as an object region
nvJPEG:
New Features:
- The nvJPEG decoder adds new APIs to support region-of-interest (ROI) based decoding for the batched hardware decoder: nvjpegDecodeBatchedEx() and nvjpegDecodeBatchedSupportedEx()
cuFFT:
Resolved Issues:
- Previously, reduced performance of power-of-2 single-precision FFTs was observed on GPUs with the sm_86 architecture. This issue has been resolved.
- Large prime factors in the size decomposition and real-to-complex or complex-to-real FFT types no longer cause cuFFT plan functions to fail
CUPTI:
Deprecations early notice:
- The following functions are scheduled to be deprecated in 11.3 and will be removed in a future release:
  - NVPW_MetricsContext_RunScript and NVPW_MetricsContext_ExecScript_Begin from the header nvperf_host.h
  - cuptiDeviceGetTimestamp from the header cupti_events.h