Hadoop backup and recovery solutions : learn the best strategies for data recovery from Hadoop backup clusters and troubleshoot problems / Gaurav Barot, Chintan Mehta, Amij Patel.
- Edition:
- 1st edition
- Publication:
- Birmingham, England ; Mumbai, [India] : Packt Publishing, 2015.
- Series:
- Community experience distilled.
- Format/Description:
- Book
- 1 online resource (206 p.)
- Subjects:
- Apache Hadoop.
- Electronic data processing -- Distributed processing.
- Form/Genre:
- Electronic books.
- Language:
- English
- System Details:
- text file
- Summary:
- If you are a Hadoop administrator and you want to get a good grounding in how to back up large amounts of data and manage Hadoop clusters, then this book is for you.
- Contents:
- Cover; Copyright; Credits; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Knowing Hadoop and Clustering Basics; Understanding the need for Hadoop; Apache Hive; Apache Pig; Apache HBase; Apache HCatalog; Understanding HDFS design; Getting familiar with HDFS daemons; Scenario 1 – writing data to the HDFS cluster; Scenario 2 – reading data from the HDFS cluster; Understanding the basics of Hadoop cluster; Summary;
Chapter 2: Understanding Hadoop Backup and Recovery Needs; Understanding the backup and recovery philosophies; Replication of data using DistCp; Updating and overwriting using DistCp; The backup philosophy; Changes since the last backup; The rate of new data arrival; The size of the cluster; Priority of the datasets; Selecting the datasets or parts of datasets; The timelines of data backups; Reducing the window of possible data loss; Backup consistency; Avoiding invalid backups; The recovery philosophy;
Knowing the necessity of backing up Hadoop; Determining backup areas – what should I back up?; Datasets; Block size – a large file divided into blocks; Replication factor; A list of all the blocks of a file; A list of DataNodes for each block – sorted by distance; The ACK package; The checksums; The number of under-replicated blocks; The secondary NameNode; Active and passive nodes in second generation Hadoop; Hardware failure; Software failure; Applications; Configurations; Is taking backup enough?;
Understanding the disaster recovery principle; Knowing a disaster; The need for recovery; Understanding recovery areas; Summary; Chapter 3: Determining Backup Strategies; Knowing the areas to be protected; Understanding the common failure types; Hardware failure; Host failure; Using commodity hardware; Hardware failures may lead to loss of data; User application failure; Software causing task failure; Failure of slow-running tasks; Hadoop's handling of failing tasks; Task failure due to data;
Bad data handling – through code; Hadoop's skip mode; Learning a way to define the backup strategy; Why do I need a strategy?; What should be considered in a strategy?; Filesystem check (fsck); Filesystem balancer; Upgrading your Hadoop cluster; Designing network layout and rack awareness; Most important areas to consider while defining a backup strategy; Understanding the need for backing up Hive metadata; What is Hive?; Hive replication; Summary; Chapter 4: Backing Up Hadoop; Data backup in Hadoop; Distributed copy;
Architectural approach to backup
- Notes:
- Includes index.
- Description based on online resource; title from PDF title page (ebrary, viewed August 5, 2015).
- Contributor:
- Mehta, Chintan, author.
- Patel, Amij, author.
- ISBN:
- 1-78328-905-8
- OCLC:
- 915154145