Big Data - PySpark Developer - Hadoop
Bengaluru East, Karnataka, India
3 days ago
Applicants: 0
3 weeks left to apply
Job Description
Skillset required:
1. Excellent knowledge of UNIX/Linux operating systems.
2. Knowledge of core Java is a plus, but not mandatory.
3. Good understanding of OS concepts, process management, and resource scheduling.
4. Basics of networking, CPU, memory, and storage.
5. Strong shell-scripting skills.

Roles and responsibilities:
1. Deploying and maintaining a Hadoop cluster, adding and removing nodes using cluster monitoring tools such as Cloudera Manager, configuring NameNode high availability, and keeping track of all running Hadoop jobs.
2. Implementing, managing, and administering the overall Hadoop infrastructure.
3. Knowledge of all components in the Hadoop ecosystem, such as Apache Spark, Apache Hive, HBase, Kafka, Sqoop, YARN, ZooKeeper, etc.
4. Taking care of the day-to-day running of Hadoop clusters.
5. Working closely with the database, network, BI, and application teams to ensure that all big data applications are highly available and performing as expected.
6. Capacity planning and estimating requirements for lowering or increasing the capacity of the Hadoop cluster.
7. Ensuring that the Hadoop cluster is up and running at all times.
8. Monitoring cluster connectivity and performance.
9. Managing and reviewing Hadoop log files.
10. Backup and recovery tasks.
11. Resource and security management.
12. Troubleshooting application errors and ensuring they do not recur.
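The shell-scripting and log-review duties listed above can be illustrated with a small self-contained sketch. The log file, its path, and its line format below are hypothetical stand-ins for real Hadoop NameNode logs, not part of this posting; a real script would point at the cluster's actual log directory.

```shell
#!/bin/sh
# Hypothetical sketch of a daily log-review task: count WARN and ERROR
# lines in a (sample) NameNode log and flag the file if errors appear.
# The sample log written here stands in for a real Hadoop log.

LOG=/tmp/namenode.sample.log
cat > "$LOG" <<'EOF'
2024-05-01 10:00:01 INFO  FSNamesystem: Roll Edit Log
2024-05-01 10:00:02 WARN  DataNode: Slow BlockReceiver write
2024-05-01 10:00:03 ERROR NameNode: Failed to connect to JournalNode
2024-05-01 10:00:04 INFO  FSNamesystem: Checkpoint complete
EOF

# Count occurrences of each severity level.
warns=$(grep -c ' WARN ' "$LOG")
errors=$(grep -c ' ERROR ' "$LOG")
echo "warnings=$warns errors=$errors"

# Alert if any errors were logged (a real script might page the on-call admin).
if [ "$errors" -gt 0 ]; then
    echo "ALERT: $errors error(s) found in $LOG"
fi
```

In practice the same pattern (grep counts plus a threshold check) is what a cron-driven monitoring script for Hadoop log files tends to look like before being wired into an alerting channel.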
Additional Information
- Company Name
- Infosys
- Industry
- N/A
- Department
- N/A
- Role Category
- Scala Developer
- Job Role
- Entry level
- Education
- No Restriction
- Job Types
- Remote
- Gender
- No Restriction
- Notice Period
- Less Than 30 Days
- Year of Experience
- 1 - Any Yrs
- Job Posted On
- 3 days ago
- Application Ends
- 3 weeks left to apply