ITEC874 - Big Data Technologies - Data Lake Architecture - IT Assessment Answer

Assessment Task:

Part 1. Data Lake Components

In the lecture, you were introduced to the high-level concepts of the whys and whats of a Data Lake. The goal of this assignment is to take a deep dive into the architecture of a Data Lake and provide a Design Pattern for the problem of organizing a collection of datasets that holds a vast amount of data gathered from various private/open data islands. Your design should specify the following components in some detail:

Data Ingestion Component

a. You need to research and identify the different types of data (from structured to unstructured) and of data ingestion (e.g., batch, micro-batch, real-time), and briefly explain them.

b. Identify the existing Big Data Technologies and Tools for ingesting big data, e.g., Hortonworks DataFlow [1].
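
To make the ingestion modes named in (a) concrete, the following minimal Python sketch groups an incoming stream of records into fixed-size micro-batches. It is an illustration only and is not tied to any particular ingestion tool; the function name `micro_batches` and the `batch_size` parameter are hypothetical. Processing each record as it arrives would correspond to real-time ingestion, and collecting the whole stream before processing would correspond to batch ingestion.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group a stream of records into micro-batches of at most batch_size items.

    Hypothetical helper for illustration; real tools usually also flush a
    batch after a time window, which is omitted here.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


if __name__ == "__main__":
    for batch in micro_batches(range(7), batch_size=3):
        print(batch)
```

With seven records and a batch size of three, the sketch yields two full batches and one final batch holding the remaining record.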

Data Organization Component

a. You need to research and compare various techniques for organizing data, e.g., Directory Structure, Version Control and Database Management Systems.

b. Identify the existing Database Management Systems for each category, e.g. MySQL in Relational DBs and MongoDB in NoSQL document-oriented DBs.

Data Security and Governance Component

a. You need to research and identify the requirements for governing data access rights and the rights to define and modify data.

b. Identify the existing trust, security, and privacy issues in Big Data.

Indexing and Search Component

a. You need to research the topic of “Federated Search” and identify technologies that facilitate the simultaneous search of multiple searchable resources.

b. Identify the existing Big Data Technologies and Tools for indexing and searching big data, e.g., Elasticsearch [2] and some research outcomes [3, 4].
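
As a rough illustration of the idea behind federated search in (a), the sketch below sends one query to several independent in-memory sources and merges the hits into a single ranked list. Everything here is hypothetical: `make_source`, `federated_search`, and the term-count scoring are stand-ins for illustration, not the API of Elasticsearch or any other product. It also assumes all sources score hits on the same scale, which real federated systems generally cannot assume.

```python
from typing import Callable, Dict, List, Tuple

Hit = Tuple[str, float]  # (document id, relevance score)
Source = Callable[[str], List[Hit]]


def make_source(docs: Dict[str, str]) -> Source:
    """Build a toy searchable resource over an in-memory document set."""

    def search(query: str) -> List[Hit]:
        terms = query.lower().split()
        hits: List[Hit] = []
        for doc_id, text in docs.items():
            words = text.lower().split()
            # Toy relevance: how often the query terms occur in the document.
            score = float(sum(words.count(term) for term in terms))
            if score > 0:
                hits.append((doc_id, score))
        return hits

    return search


def federated_search(
    query: str, sources: Dict[str, Source], limit: int = 10
) -> List[Tuple[str, str, float]]:
    """Query every source and merge the hits, best score first."""
    merged: List[Tuple[str, str, float]] = []
    for name, search in sources.items():
        for doc_id, score in search(query):
            merged.append((name, doc_id, score))
    # Sort by descending score, then by source and id for a stable order.
    merged.sort(key=lambda hit: (-hit[2], hit[0], hit[1]))
    return merged[:limit]


if __name__ == "__main__":
    sources = {
        "private": make_source({"p1": "graph search in data lake", "p2": "batch ingestion"}),
        "open": make_source({"o1": "graph graph database"}),
    }
    print(federated_search("graph", sources))
```

The sketch queries the sources one after another for simplicity; a real deployment would typically issue the queries concurrently and normalise scores before merging.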

Analytics Component

a. You need to research and compare the techniques for analysing the data (from structured to unstructured) and extracting insight from it.

b. Identify the existing Big Data Technologies and Tools for analysing big data, e.g., SAS Tools [5] (such as SAS Text-Analytics [6]), Microsoft [7] ML Platform, Amazon [8] ML Platform, and Apache Mahout [9].

Visualization Component

a. You need to research and identify the techniques for visualizing the data.

b. Identify the existing Big Data Technologies and Tools for visualizing big data, e.g., SAS [10] Visual Analytics. Other examples include D3.JS [11] and VIS.JS [12].

 

Part 2. Data Lake Architecture

Design Patterns are formalized best practices that one can use to solve common problems when designing a system. Refer to the Data Lake components in Part 1, and propose a Data Lake architecture for the problem of graph search in big graph databases.
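
As a point of reference for what "graph search" can mean at its simplest, the sketch below runs a breadth-first search over a small in-memory adjacency list and returns a shortest path between two nodes. It is a minimal illustration under the assumption that the graph fits in memory; the function name `shortest_path` and the sample graph are hypothetical, and a big graph database would require distributed storage and indexing that this sketch does not attempt.

```python
from collections import deque
from typing import Dict, List, Optional


def shortest_path(
    graph: Dict[str, List[str]], start: str, goal: str
) -> Optional[List[str]]:
    """Breadth-first search for a shortest path in an unweighted directed graph.

    Returns the list of nodes from start to goal, or None if goal is unreachable.
    """
    if start == goal:
        return [start]
    parents: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour in parents:
                continue  # already visited
            parents[neighbour] = node
            if neighbour == goal:
                # Walk the parent links back to the start node.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


if __name__ == "__main__":
    sample = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
    print(shortest_path(sample, "a", "e"))
```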

 
