So far in this big data series we have covered the following topics. You can click on a hyperlink to go to a specific topic.
- Technology Stack
- Hadoop Distributed File System (HDFS)
- MapReduce
- Installing Hadoop (Single Node)
- Installing Apache Hadoop (Multi Node)
- Big Data: Troubleshooting, Administering and Optimizing Hadoop
So what is Pig? Pig is a high-level data scripting language.
The major components of Pig are:
- Runtime engine
- Language: Pig Latin
Execution tools and modes:
- Grunt shell or command line
- Local or MapReduce mode
- Interactive or batch
Below is the Pig Latin script.
Let's look at the script a little closer. The first three lines, A, B and C, load data stored in the cluster. D and E are the filters.
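A minimal sketch of a script with that shape is shown here. Only the relation names A through E come from the description above; the file names, delimiters, schemas and filter conditions are assumptions made up for illustration.

```pig
-- Illustrative only: file names, delimiters, and schemas are assumptions.
-- A, B and C each load one data set.
A = LOAD 'customers.csv' USING PigStorage(',') AS (cust_id:int, name:chararray, country:chararray);
B = LOAD 'orders.csv' USING PigStorage(',') AS (order_id:int, cust_id:int, amount:double);
C = LOAD 'products.csv' USING PigStorage(',') AS (product_id:int, title:chararray, price:double);

-- D and E filter the loaded relations.
D = FILTER A BY country == 'US';
E = FILTER B BY amount > 100.0;

-- Print the filtered relations to the console.
DUMP D;
DUMP E;
```

Saved to a file (say, a hypothetical example.pig), a script like this could be run in batch with `pig -x local example.pig` or `pig -x mapreduce example.pig`, or typed statement by statement at the Grunt shell for interactive use.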
Pig vs. SQL
SQL (Declarative)
- Queries are written from the inside out
- Complex queries can get difficult to comprehend
Pig Latin (Procedural)
- Data flows step by step
- Clean and easy to write
- Data can be stored at any point in the pipeline
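To make the contrast concrete, here is a small sketch of the same task written both ways. The table, file and field names are hypothetical, and the SQL in the comment is only a rough equivalent.

```pig
-- Rough SQL equivalent (declarative, read from the inside out):
--   SELECT country, COUNT(*)
--   FROM (SELECT * FROM customers WHERE age >= 18) t
--   GROUP BY country;

-- Pig Latin: each step names its result, so the data flow reads top to bottom.
-- File names and schema below are assumptions for illustration.
customers = LOAD 'customers.csv' USING PigStorage(',')
            AS (cust_id:int, name:chararray, country:chararray, age:int);
adults = FILTER customers BY age >= 18;

-- An intermediate result can be stored in the middle of the pipeline.
STORE adults INTO 'adults_out' USING PigStorage(',');

by_country = GROUP adults BY country;
counts = FOREACH by_country GENERATE group AS country, COUNT(adults) AS n;
STORE counts INTO 'counts_out' USING PigStorage(',');
```

The first STORE writes the filtered rows out before the grouping step runs, which is the kind of mid-pipeline checkpoint that a single nested SQL query does not offer directly.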
Pig Latin
- A collection of statements
- Statements are built using operators and expressions
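As a rough illustration of how statements, operators and expressions fit together, consider the short sketch below. The file name and schema are assumptions, not taken from the original post.

```pig
-- Each statement ends with a semicolon and usually assigns a relation to an alias.
-- File name and schema are assumptions for illustration.
orders = LOAD 'orders.csv' USING PigStorage(',')
         AS (order_id:int, cust_id:int, amount:double, qty:int);

-- FILTER is a relational operator; "amount > 0.0 AND qty >= 1" is a boolean expression.
valid = FILTER orders BY amount > 0.0 AND qty >= 1;

-- FOREACH ... GENERATE is an operator; "amount * qty" is an arithmetic expression.
totals = FOREACH valid GENERATE order_id, amount * qty AS total;

DUMP totals;
```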
