Article on Dr.-Ing. Nicolas Weber

https://www.mergian.de/categories/article/

VEDA: Best practices to use hybrid programming on the NEC SX-Aurora TSUBASA
https://www.mergian.de/2022/sxaurora-veda/
Sat, 12 Nov 2022 00:00:00 +0000
The Vector Engine Driver API (VEDA) was developed to enable easy porting of existing CUDA applications to NEC’s SX-Aurora TSUBASA. While the API enables a smooth transition between the different architectures, there are unique features that require special attention to achieve optimal performance. In this article, we present multiple methods to improve your code. First, we explain how to use C++ function overloading and templates. Second, we show how to make the best use of the unique features of VEDAdeviceptrs.

SOL: Reducing the Maintenance Overhead for Integrating Hardware Support into AI Frameworks
https://www.mergian.de/2022/sxaurora-sol/
Sun, 01 May 2022 00:00:00 +0000
The increased interest in Artificial Intelligence (AI) raised the need for highly optimized and sophisticated AI frameworks. Starting with the Lua-based Torch, many frameworks have emerged over time, such as Theano, Caffe, Chainer, CNTK, MxNet, PyTorch, DL4J, or TensorFlow. All of these provide a high-level scripting API that allows users to easily design neural networks and run them on various kinds of hardware. What the user usually does not see is the high effort put into these frameworks to provide peak execution performance.

AVEO-VEDA: Hybrid Programming for the NEC Vector Engine
https://www.mergian.de/2021/sxaurora-aveo-veda/
Wed, 14 Jul 2021 00:00:00 +0000
Hybrid programming is a state-of-the-art method for incorporating compute accelerators such as GPUs or vector processors into applications that run on a host system.
The main reason for hybrid programming is that compute accelerators are well suited for compute- and memory-heavy tasks but perform poorly in code sections dominated by control flow. Therefore, the latter are usually executed on CPUs, while the compute-heavy parts are offloaded to accelerators.