# Pig Development Guide

## 1. Simple example

 1. Upload the data to HDFS

```
[hadoop@uhadoop-******-master1 pig]$ hadoop fs -put /etc/passwd /user/hadoop/passwd
```

 2. Start the Pig Grunt shell

```
[hadoop@uhadoop-******-master1 pig]$ pig
```

 3. Load the data, splitting each line on `:`, and print it with `dump`

```
grunt> A = load 'passwd' using PigStorage(':');
grunt> dump A;
```

The output looks like this:

```
(root,x,0,0,root,/root,/bin/bash)
……
```
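
To make the delimiter's role concrete, the following plain-Java sketch mimics how a `:`-delimited line becomes the fields of the tuple shown above. It is only an illustration: the names `PasswdSplitSketch`, `splitFields`, and `asTuple` are hypothetical, and Pig's actual `PigStorage` loader is not implemented this way.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not how PigStorage is implemented.
class PasswdSplitSketch {

    // Split one input line on the delimiter. The limit of -1 keeps trailing
    // empty fields, so every line yields the same number of columns.
    static List<String> splitFields(String line, char delimiter) {
        String regex = Pattern.quote(String.valueOf(delimiter));
        return Arrays.asList(line.split(regex, -1));
    }

    // Render the fields in the (f1,f2,...) form that dump prints for a tuple.
    static String asTuple(List<String> fields) {
        return "(" + String.join(",", fields) + ")";
    }

    public static void main(String[] args) {
        String line = "root:x:0:0:root:/root:/bin/bash";
        // Prints (root,x,0,0,root,/root,/bin/bash)
        System.out.println(asTuple(splitFields(line, ':')));
    }
}
```

Passing a different delimiter character to `splitFields` corresponds to changing the argument of `PigStorage(...)` in the `load` statement.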

## 2. Using a UDF

- Prepare data

Contents of the `student` file:

    any 9 5
    bob 8 4

Upload the `student` file to HDFS:

```
hdfs dfs -put student /user/root/student
```

- Sample code

``` java
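package myudfs;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Eval UDF that converts its first argument to upper case.
public class UPPER extends EvalFunc<String>
{
    @Override
    public String exec(Tuple input) throws IOException {
        // Return null for a missing, empty, or null input so Pig emits a null field.
        if (input == null || input.size() == 0 || input.get(0) == null)
            return null;
        try {
            String str = (String) input.get(0);
            return str.toUpperCase();
        } catch (Exception e) {
            // Wrap any failure (for example a non-string argument) in an IOException.
            throw new IOException("Caught exception processing input row ", e);
        }
    }
}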
```

- Compile and package

```
cd myudfs
javac -cp $PIG_HOME/pig-0.12.0-cdh5.4.4.jar UPPER.java
cd ..
jar -cf myudfs.jar myudfs
```

- Test script `upper.pig`

``` pig
REGISTER myudfs.jar;
A = LOAD 'student' AS (name: chararray, age: int, gpa: float);
B = FOREACH A GENERATE myudfs.UPPER(name);
DUMP B;
```

- Run the script

```
pig upper.pig
```

- Output

    (ANY 9 5)
    (BOB 8 4)
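
The output keeps the whole line because `LOAD` without a `USING` clause falls back to `PigStorage` with a tab delimiter. Assuming the fields in `student` are separated by spaces as displayed above, each line arrives as a single `name` field, and `UPPER` upper-cases all of it. The plain-Java sketch below mimics that behavior; the names `UpperOutputSketch`, `firstField`, and `upper` are hypothetical stand-ins, not Pig APIs.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names are Pig APIs.
class UpperOutputSketch {

    // Return the first field of a line under the given delimiter.
    // The limit of -1 keeps empty fields instead of dropping them.
    static String firstField(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter), -1)[0];
    }

    // Same null handling as the UPPER UDF: null in, null out.
    static String upper(String value) {
        return value == null ? null : value.toUpperCase();
    }

    public static void main(String[] args) {
        // Space-separated line read with a tab delimiter:
        // the line is one field, so the whole line is upper-cased.
        System.out.println("(" + upper(firstField("any 9 5", "\t")) + ")");
        // Tab-separated line read with a tab delimiter:
        // only the name field is upper-cased.
        System.out.println("(" + upper(firstField("any\t9\t5", "\t")) + ")");
    }
}
```

If the file were tab-separated instead, or the script loaded it with `USING PigStorage(' ')`, only the name would reach the UDF, as the second call in the sketch suggests.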
