Disco: Erlang/Python MapReduce #2
Please check Part 1 here if you have not already.

Part 2: The Payload (Python Jobs)

Preparing the Data (files)

Disco Distributed Filesystem (DDFS) is a great low-level component of Disco. DDFS is designed with huge data in mind, so it made more sense to use it in my experiment as opposed to any other type of storage, for example, HDFS. Moreover, we can even store job results in DDFS, which we are going to do below....
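
To make the idea of putting files into DDFS a bit more concrete, here is a minimal sketch. It assumes Disco's Python client is installed and a master is reachable. `make_tag` and `push_files` are hypothetical helpers, not part of Disco; only the `DDFS` class and its `push` method come from the `disco.ddfs` module. The character set allowed in tag names is an assumption based on the DDFS documentation, so check it against your Disco version.

```python
import re

# Assumption: DDFS tag names may only contain letters, digits, '_', '-', '@' and ':'.
_TAG_RE = re.compile(r"^[A-Za-z0-9_\-@:]+$")


def make_tag(*parts):
    """Hypothetical helper: join name parts with ':' into one tag name."""
    tag = ":".join(parts)
    if not _TAG_RE.match(tag):
        raise ValueError("invalid DDFS tag name: %r" % tag)
    return tag


def push_files(tag, paths, master=None):
    """Hypothetical helper: push local files to DDFS under the given tag."""
    # Imported lazily so make_tag() stays usable without Disco installed.
    from disco.ddfs import DDFS

    # With master=None the client falls back to its configured default master.
    ddfs = DDFS(master=master)
    ddfs.push(tag, paths)
```

A call could look like `push_files(make_tag("data", "logs"), ["./access.log"])`, where both the tag and the file path are made-up examples. Grouping related inputs under one tag keeps it easy to hand the whole set to a job later.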