The Luxembourg SuperComputing Competence Center is hosting a webinar on 18 February 2026.
Training description
Due to the enormous growth in the size and number of datasets, professionals need tools and techniques to work with big data. Data analysis and preprocessing of small datasets can be done on a single computer, usually sequentially, with typical libraries such as NumPy, Pandas, Scikit-learn, and Matplotlib. Large datasets (1 TB and more), on the other hand, cannot fit into the RAM of a single computer and be processed sequentially. In such cases, parallel and distributed techniques are used, with typical libraries including Dask, Apache Spark, RAPIDS, CuPy, and cuDF, to name a few. In this workshop, participants will become familiar with the basic concepts of High-Performance Data Analytics (HPDA) and the libraries used for the analysis of big datasets.
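As a minimal illustration of this shift, the short sketch below contrasts a sequential Pandas analysis with an equivalent Dask version that splits the data into partitions and processes them in parallel; the file names and column names are hypothetical and serve only as an example.

# Sequential analysis with Pandas: the entire dataset must fit in RAM.
import pandas as pd

df = pd.read_csv("measurements.csv")                      # hypothetical single file
mean_per_sensor = df.groupby("sensor_id")["value"].mean()

# The same analysis with Dask: the data is split into partitions that are
# processed in parallel and never loaded into memory all at once.
import dask.dataframe as dd

ddf = dd.read_csv("measurements-*.csv")                   # hypothetical set of files, read lazily
mean_per_sensor = ddf.groupby("sensor_id")["value"].mean().compute()  # .compute() triggers the parallel run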
Target audience
This workshop is designed for anyone who would like to learn more about HPDA and the techniques used in the Python ecosystem. Participants should ideally have basic knowledge of Python programming and some experience in analyzing datasets.
GPU Compute Resources
During the training, participants will have access to the MeluXina supercomputer. For more information about MeluXina, please refer to the MeluXina Overview and the MeluXina – Getting Started Guide. Communication will take place via Microsoft Teams and email. All training content will be provided in advance on GitHub.
Agenda
This half-day course will be held online on 18 February 2026, from 01:00 PM to 05:00 PM Central European Time (CET).
Important: Limited spots available (25 participants max)!
Registration will close a week before the training date, on 11 February 2026.
Contact person for more information: Aleksandra RANCIC – aleksandra.rancic[at]uni.lu