What is Big Data?
As the name implies, Big Data is an enormous volume of data that is complex and difficult to store, maintain, or access in a standard file system using traditional data-processing applications.
Characteristics of Big Data
A regular file system with a typical data processing application faces the following challenges:
Volume – The amount of data arriving from different sources is huge and grows day by day.
Velocity – Data arrives at high speed; a system with a single processor, limited RAM, and limited storage cannot process it fast enough.
Variety – Data arrives from different sources in different formats: structured, semi-structured, and unstructured.
This is where Big Data technology comes into the picture.
Data Types
Structured Data: Data that is presented in a tabular format and stored in an RDBMS (Relational Database Management System)
Semi-structured Data: Data that does not have a formal data model and is stored in XML, JSON, etc.
Unstructured Data: Data that does not have a pre-defined data model like video, audio, image, text, weblogs, system logs, etc.
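The three categories above can be illustrated with a short sketch. This is a hedged example, not a canonical definition: the table, JSON document, and byte string below are all made up for illustration.

```python
import json
import sqlite3

# Structured: tabular rows with a fixed schema in a relational store
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE emp (name TEXT, dept INTEGER)")
db.execute("INSERT INTO emp VALUES ('Ramesh', 10)")
structured = db.execute("SELECT name, dept FROM emp").fetchall()

# Semi-structured: no fixed schema, but self-describing (JSON here)
semi = json.loads('{"name": "Tanya", "skills": ["SQL", "Hadoop"]}')

# Unstructured: raw bytes with no data model (e.g. an image or audio clip)
unstructured = b"\x89PNG\r\n\x1a\n..."  # placeholder bytes, not a real image

print(structured)      # [('Ramesh', 10)]
print(semi["skills"])  # ['SQL', 'Hadoop']
```

Note how the structured row must match its declared schema, the JSON document carries its own field names, and the byte string carries no interpretable model at all.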
What is HADOOP?
HADOOP Ecosystem & Core Components
Hadoop Sub Components
Big Data Technologies
There are various technologies on the market from different vendors, including Amazon, IBM, and Microsoft, for handling big data.
The Big Data Problem
To understand the big data problem, let’s consider some of the examples below:
Example-1: Consider three tables: tab1 has 100 records, tab2 has 1,000, and tab3 has 100,000 records.
Now consider the following queries:
SELECT COUNT(*) FROM TAB1;
SELECT COUNT(*) FROM TAB2;
SELECT COUNT(*) FROM TAB3;
Among these queries, the first runs fastest, followed by the second and then the third. Even though the algorithm (COUNT) is the same, processing slows down as the data volume grows (from 100 records in TAB1 to 100,000 in TAB3). This is the first problem with a traditional RDBMS.
Problem-1: With an Increase in DATA VOLUME, processing speed decreases.
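A minimal sketch of Problem-1, using an in-memory SQLite database (chosen here only because it ships with Python): the same COUNT(*) query is run over tables of growing size, mirroring tab1/tab2/tab3 above. The measured times will vary by machine; the point is that the work grows with the row count.

```python
import sqlite3
import time

# Build three tables of 100 / 1,000 / 100,000 rows, as in the example.
db = sqlite3.connect(":memory:")
for name, rows in [("tab1", 100), ("tab2", 1000), ("tab3", 100000)]:
    db.execute(f"CREATE TABLE {name} (id INTEGER)")
    db.executemany(f"INSERT INTO {name} VALUES (?)",
                   ((i,) for i in range(rows)))

# Run the identical COUNT(*) algorithm over each table and time it.
counts = {}
for name in ["tab1", "tab2", "tab3"]:
    start = time.perf_counter()
    counts[name] = db.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
    elapsed = time.perf_counter() - start
    print(f"{name}: {counts[name]} rows counted in {elapsed:.6f}s")
```

Running this shows the count completing, with the larger tables generally taking longer even though the query text barely changes.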
Example-2: Now consider the following 3 queries with the same table (i.e., same number of records for all of the 3 queries):
SELECT COUNT(*) FROM TAB1;
SELECT AVG(SAL) FROM TAB1;
SELECT STDDEV(SAL) FROM TAB1;
Even though the record count is the same, the first query still runs fastest, followed by the second and then the third. Hence, as the complexity of the algorithm increases, the processing time increases. This is the second problem with a traditional RDBMS.
Problem-2: With the increase in DATA COMPLEXITY, processing speed decreases.
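The three aggregates can be compared in a small sketch. SQLite (used here for illustration) has no built-in STDDEV, so the standard deviation is computed client-side with the standard library; the salary values are invented for the example.

```python
import sqlite3
import statistics

# Same table for all three queries; only the aggregate changes.
# COUNT needs one running counter; AVG also accumulates a sum;
# STDDEV additionally needs each value's squared deviation from the mean.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tab1 (sal REAL)")
db.executemany("INSERT INTO tab1 VALUES (?)",
               [(s,) for s in (1000, 2000, 3000)])

count = db.execute("SELECT COUNT(*) FROM tab1").fetchone()[0]
avg = db.execute("SELECT AVG(sal) FROM tab1").fetchone()[0]

# SQLite lacks STDDEV, so pull the column and compute it in Python
# (population standard deviation).
sals = [row[0] for row in db.execute("SELECT sal FROM tab1")]
stddev = statistics.pstdev(sals)

print(count, avg, round(stddev, 2))  # 3 2000.0 816.5
```

Each aggregate does strictly more work per row than the previous one, which is why the processing time climbs even though the row count stays fixed.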
Example-3: Consider the following example of a Facebook Friends relationship:
USER | FRIEND
Ramesh | Suresh
Suresh | Ravi
Suresh | Tanya
Tanya | Ramesh
Even though this data is stored in a structured format (rows and columns), answering a question like "what is the relationship between Tanya and Ravi?" is very difficult with a normal SQL query. Data with such record-to-record relationships is called graph data, and it cannot be processed easily with a traditional RDBMS. This is another problem.
Problem-3: Traditional RDBMS can’t handle All Kinds of Structured Data.
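A sketch of why this is a graph problem: treating the friendship rows above as edges, the Tanya-to-Ravi question becomes a breadth-first search over friend-of-friend links, something a plain (non-recursive) SQL query cannot express cleanly.

```python
from collections import deque

# The friendship table above, as a list of edges.
edges = [("Ramesh", "Suresh"), ("Suresh", "Ravi"),
         ("Suresh", "Tanya"), ("Tanya", "Ramesh")]

# Build an adjacency map; friendship is treated as mutual.
graph = {}
for user, friend in edges:
    graph.setdefault(user, set()).add(friend)
    graph.setdefault(friend, set()).add(user)

def path(start, goal):
    """Return the shortest friend chain from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        chain = queue.popleft()
        if chain[-1] == goal:
            return chain
        for nxt in graph.get(chain[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(chain + [nxt])
    return None

print(path("Tanya", "Ravi"))  # ['Tanya', 'Suresh', 'Ravi']
```

So Tanya and Ravi are connected through their mutual friend Suresh, a fact that falls out of a graph traversal in a few lines but would require awkward self-joins in traditional SQL.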
Example-4: Consider the following table structure:
NAME | DOB | SAL | DEPT | GENDER | PHOTO
Ramesh | 01-01-1988 | 10000 | 10 | M | (image)
Tanya | 01-01-1989 | 10000 | 20 | F | (image)
In this case, even though we can store an employee's photo in the table, we cannot validate the photo with traditional SQL. For example, if I store Tanya's photo against Ramesh's record, the RDBMS will still store it, and traditional SQL gives me no way to check that I am actually storing a female photo for a male record. This is another problem: an RDBMS cannot handle all kinds of data.
Problem-4: Traditional RDBMS can’t handle all VARIETIES of data
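A small sketch of Problem-4, again using SQLite for illustration: an RDBMS will happily store any bytes in a BLOB column. The "photo" below is a placeholder byte string, not a real image; the point is that SQL offers no way to validate what the blob actually depicts.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE emp (name TEXT, gender TEXT, photo BLOB)")

# The wrong photo stored against Ramesh's record -- the database cannot tell.
tanya_photo = b"\xff\xd8fake-jpeg-of-tanya"  # invented placeholder bytes
db.execute("INSERT INTO emp VALUES (?, ?, ?)", ("Ramesh", "M", tanya_photo))

stored = db.execute(
    "SELECT photo FROM emp WHERE name = 'Ramesh'").fetchone()[0]
print(stored == tanya_photo)  # True: accepted with no content validation
```

The insert succeeds and the blob round-trips byte-for-byte; nothing in the relational layer relates the photo's content to the GENDER column.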
Apart from the above processing problems, a traditional RDBMS is also limited in storage capacity. Consider Facebook, for example: billions of events (posting a comment, posting a status update, updating a profile picture, likes, shares, etc.) happen every day, many per second. It is not feasible for a traditional RDBMS to store and process all of that data.
Hence Big Data can be defined as the combination of Huge Volume Data and Complex Data.