Showing posts from February, 2020

Parallel processing libraries in Python - Dask vs Ray

At work, I'm enabling numerical computations and transformations on N-dimensional tensors that cannot fit in a single machine's memory. This is "big data" on the order of TBs to PBs. The data are numerical arrays, e.g. np.array, and the data processing and analysis code is written in Python. I looked at two libraries that enable parallel computing within the Python ecosystem, Dask and Ray, and compared them.

Dask: Dask is a flexible library for parallel computing in Python.

Ray: Ray is a fast and simple framework for building and running distributed applications.

(Originally published 02/25/2020, updated 03/03/2021 to reflect new features in the libraries.)

Dimension | Dask | Ray | Color Code Justification
Efficient Data Sharing Between Nodes | Source: Data sharing happens via TCP messages between workers. Every time dat
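To make the problem concrete, here is a minimal sketch of the chunk-and-combine pattern for data that is too large to hold at once: split the work into chunks, process each chunk independently, then reduce the partial results. It uses only Python's standard library, not the Dask or Ray API, and the function names (`chunk_ranges`, `partial_sum_of_squares`, `chunked_sum_of_squares`) are hypothetical ones chosen for illustration. A thread pool stands in for the workers that a distributed scheduler would place on separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(n_items, chunk_size):
    """Yield (start, stop) index pairs that together cover range(n_items)."""
    for start in range(0, n_items, chunk_size):
        yield start, min(start + chunk_size, n_items)


def partial_sum_of_squares(bounds):
    """Process one chunk; only this chunk's values are generated at a time."""
    start, stop = bounds
    return sum(x * x for x in range(start, stop))


def chunked_sum_of_squares(n_items, chunk_size, max_workers=4):
    """Map the per-chunk function over all chunks, then reduce the partials."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = pool.map(partial_sum_of_squares,
                            chunk_ranges(n_items, chunk_size))
        return sum(partials)


# Hypothetical sizes, chosen only to keep the example quick to run.
total = chunked_sum_of_squares(n_items=1_000_000, chunk_size=100_000)
```

In this sketch each worker returns only a small partial result, so little data moves between workers. How much data must move, and how it is transported, is the kind of cost the "Efficient Data Sharing Between Nodes" dimension is concerned with.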

[Snow Sports] (02/02/20 - 02/07/20) Snowboarding in Zermatt, Switzerland

We spent 5 days skiing in Zermatt! (with a rest day in between!)

Skiing details

We weren't blessed with the weather for the first couple of days; it was stormy and very windy, so the Matterhorn side was closed. We could ski on the Sunnegga side though! We were definitely anxious to see the Matterhorn, but it was nowhere to be found during those two cloudy days. From the fourth day (third day skiing), the weather became great, and we were able to see the Matterhorn! One day was spent just enjoying skiing near the Matterhorn glacier. Another day was spent going to the Italian side and back. And another day was spent taking an off-piste lesson with Bruno, who was great! He showed us some good off-piste runs near Rothorn that we could not have found on our own. He also equipped us with avalanche transceivers, although the avalanche risk was pretty low that day. My favorite part was the mere fact that I was skiing with the Matterhorn so close by. I also very much enjoyed learning abou