Speaker
Description
Processing massive scientific datasets is challenging. Avoiding memory bottlenecks without rewriting existing software often means breaking the data up and analyzing it in smaller chunks, a process that is both inefficient and ill-suited to exploiting the full scientific value of the data. To address this problem, at FZJ we co-develop the Python library Heat [https://github.com/helmholtz-analytics/heat]. Heat can be used as a backend for your NumPy/SciPy-based code to intuitively distribute massive, memory-intensive operations across multi-CPU, multi-GPU clusters. In this discussion I can briefly show an example of Heat usage, but most of all I'm curious about your compute and memory bottlenecks, and about potential collaborations. Feel free to bring your own code!
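To give a flavour of the NumPy-like usage the description refers to, here is a minimal sketch. The array shape, the choice of split axis, and the covariance computation are illustrative assumptions, not taken from the abstract; only the general pattern (NumPy-style operations on arrays chunked across MPI processes via a split axis) reflects Heat's public API.

```python
# Minimal sketch of distributed NumPy-like usage with Heat (illustrative only).
# Run with e.g.: mpirun -n 4 python heat_example.py
import heat as ht

# Create a large array distributed across MPI processes along axis 0 (split=0),
# so each process holds only a slice of the rows in its own memory.
x = ht.random.randn(100_000, 512, split=0)

# NumPy-style operations work directly on the distributed array; communication
# is handled internally where a reduction crosses the split axis.
col_means = x.mean(axis=0)                       # global column means
centered = x - col_means                         # broadcasting, still distributed
cov = centered.T @ centered / (x.shape[0] - 1)   # distributed matrix product

# Gather the (small) result as a NumPy array for local inspection.
print(cov.numpy().shape)
```

The same script runs unchanged on a laptop or across a cluster; only the number of MPI processes (and, if desired, the device) changes.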