Pig can run in two execution modes. The modes depends on where the pig script is going to run and where the data is residing. The data can be stored in a single machine, i.e. local file system or it can be stored in a distributed environment like typical Hadoop Cluster environment. We can run pig programs in three different modes.
First one is non interactive shell also known as script mode. In this we have to create a file, load the code in the file and execute the script. Second one is grunt shell, it is an interactive shell for running pig commands. Third one is embedded mode, in this we use JDBC to run SQL programs from Java.
Pig Local mode
In this mode, the pig runs on single JVM and accesses the local file system. This mode is best suitable for dealing with the smaller data sets. In this, parallel mapper execution is not possible because the earlier versions of the Hadoop versions are not thread safe.
By providing –x local, we can get in to pig local mode of execution. In this mode, pig always looks for the local file system path when data is loaded. $pig –x local implies that it’s in local mode.
Pig Map reduce mode
In this mode, we could have proper Hadoop cluster setup and Hadoop installations on it. By default, the pig runs on MR mode. Pig translates the submitted queries into Map reduce jobs and runs them on top of Hadoop cluster. We can say this mode as a Map reduce mode on a fully distributed cluster.