hadoop - what is difference between Pig and Hive? - Stack Overflow

Apache Pig and Hive are two projects that layer on top of Hadoop and provide a higher-level language for using Hadoop's MapReduce library. Apache Pig provides a scripting language (Pig Latin) for describing operations like reading, filtering, transforming, joining, and writing data, which are exactly the operations that MapReduce was originally designed for. Rather than expressing these operations in thousands of lines of Java code that uses MapReduce directly, Pig lets users express them in a language not unlike a bash or Perl script. Pig is excellent for prototyping and rapidly developing MapReduce-based jobs, as opposed to coding MapReduce jobs in Java itself.
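To make the "scripting" style concrete, here is a minimal sketch in plain Python of the kind of step-by-step dataflow a Pig script describes. It is not Pig and does not run on Hadoop; the sample records, the function names, and the 100-byte threshold are all made up for illustration. The comments show roughly which Pig Latin statement each step would correspond to.

```python
from collections import defaultdict

# Hypothetical sample data: (username, bytes_transferred).
# Pig Latin, roughly: logs = LOAD 'logs.tsv' AS (username:chararray, bytes:int);
records = [
    ("alice", 120),
    ("bob", 40),
    ("alice", 300),
    ("carol", 500),
    ("bob", 250),
]


def filter_large(rows, min_bytes):
    """Keep only rows above the threshold.
    Pig Latin, roughly: big = FILTER logs BY bytes > 100;"""
    return [(name, size) for name, size in rows if size > min_bytes]


def group_by_user(rows):
    """Collect the byte counts of each user into one list.
    Pig Latin, roughly: grouped = GROUP big BY username;"""
    groups = defaultdict(list)
    for name, size in rows:
        groups[name].append(size)
    return dict(groups)


def sum_per_user(groups):
    """Aggregate each group to a single total.
    Pig Latin, roughly: totals = FOREACH grouped GENERATE group, SUM(big.bytes);"""
    return {name: sum(sizes) for name, sizes in groups.items()}


# Each intermediate result is named and passed on, one step at a time.
large = filter_large(records, 100)
grouped = group_by_user(large)
totals = sum_per_user(grouped)

print(sorted(totals.items()))
```

The point of the sketch is the shape of the program: a sequence of named intermediate relations, each derived from the previous one, which is how a Pig script reads.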
If Pig is "scripting for Hadoop", then Hive is "SQL queries for Hadoop". Apache Hive offers an even more specific and higher-level language for querying data by running Hadoop jobs, rather than scripting, step by step, the operation of several MapReduce jobs on Hadoop. The language is, by design, extremely SQL-like. Hive is still intended as a tool for long-running, batch-oriented queries over massive data; it is not "real-time" in any sense. Hive is an excellent tool for analysts and business development types who are accustomed to SQL-like queries and Business Intelligence systems; it lets them use your shiny new Hadoop cluster to run ad-hoc queries or generate report data across the data stored in Hadoop.
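For contrast, the declarative style can be sketched with a single SQL query. The example below uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not Hive, and the table name, column names, and sample rows are hypothetical. A query of this general shape is the kind of thing an analyst would write in Hive's SQL-like language, where it would be executed as batch Hadoop jobs rather than answered immediately.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a Hive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transfers (username TEXT, bytes INTEGER)")
conn.executemany(
    "INSERT INTO transfers VALUES (?, ?)",
    [("alice", 120), ("bob", 40), ("alice", 300), ("carol", 500), ("bob", 250)],
)

# One declarative statement says *what* result is wanted;
# the engine decides how to filter, group, and aggregate.
query = """
    SELECT username, SUM(bytes) AS total_bytes
    FROM transfers
    WHERE bytes > 100
    GROUP BY username
    ORDER BY username
"""

rows = conn.execute(query).fetchall()
conn.close()

print(rows)
```

Nothing here names an intermediate step: the filter, grouping, and aggregation are all part of one query, which is the main difference in feel from a step-by-step script.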
