HDFS is a block-based file system that spans multiple nodes in a cluster and stores user data in files. It presents a traditional hierarchical file organization, so users and applications can create, rename, move, and remove files and directories. It also presents a streaming interface through which any executable or script can be run as a job in the MapReduce framework. HDFS does not support hard or soft links. Files follow a write-once model: a reader can seek to any offset in a file, but a writer cannot write at an arbitrary offset or modify existing data in place, so changing a file means appending to it or replacing it as a whole. HDFS is accessed programmatically through its client API rather than through the operating system's file system layer, so it cannot be mounted natively like an ordinary file system. All HDFS communication is layered on top of TCP/IP.
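
To make the block-based layout concrete, the following is a minimal sketch in Python, not HDFS code. It assumes a block size of 128 MiB (the actual value is configurable per cluster) and a naive round-robin assignment of blocks to nodes; real HDFS placement is more involved. The names `split_into_blocks`, `assign_blocks`, and `block_for_offset`, as well as the node names, are hypothetical and exist only for illustration.

```python
from typing import List, Tuple

MIB = 1024 * 1024
BLOCK_SIZE = 128 * MIB  # assumed block size; configurable in a real cluster


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        # Every block is full-sized except possibly the last one.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def assign_blocks(num_blocks: int, nodes: List[str]) -> List[str]:
    """Toy round-robin assignment of blocks to nodes (not the real HDFS policy)."""
    if not nodes:
        raise ValueError("at least one node is required")
    return [nodes[i % len(nodes)] for i in range(num_blocks)]


def block_for_offset(offset: int, block_size: int = BLOCK_SIZE) -> int:
    """Index of the block that holds the byte at the given file offset."""
    if offset < 0:
        raise ValueError("offset must be >= 0")
    return offset // block_size


if __name__ == "__main__":
    blocks = split_into_blocks(300 * MIB)
    owners = assign_blocks(len(blocks), ["node-a", "node-b"])  # hypothetical node names
    for (offset, length), node in zip(blocks, owners):
        print(f"offset={offset} length={length} node={node}")
```

Under these assumptions a 300 MiB file occupies three blocks, the last of which holds only 44 MiB, and a read at a given byte offset maps to exactly one block via `block_for_offset`.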