I have just enrolled in a Data Science course on Udemy and I have learned a lot of good stuff.

I know you’ve heard it many times: “Look at this one, this is the job of the future!” The simplest thing I can do is explain why it’s interesting to learn about Data Science. It is an extremely useful skill for the future.

The principle is that the more data there is, the more work there is for Data Scientists. Let’s look at the amount of data created in the world in the past and in the present, and at the estimates for the future.

Humans created 130 Exabytes of data from the beginning of humanity until 2005. OK, that number probably means nothing to you. Don’t worry, it was the same for me. Let’s go back to the basics.

# Measuring data

The starting point is 1 byte (1 B), which is the space a hard drive needs to store one letter. For example, the letter “S” = 1 byte (1 B).

You go to the next level and multiply 1 byte (1 B) by 1000, which gives you 1 Kilobyte (1 kB). A page of a book contains between 2000 and 5000 letters, so we can say that half a page of text is about 1 Kilobyte (1 kB).

You go to the next level and multiply 1 Kilobyte (1 kB) by 1000, which gives you 1 Megabyte (1 MB). A 500-page book is about 1 Megabyte (1 MB).

You go to the next level and multiply 1 Megabyte (1 MB) by 1000, which gives you 1 Gigabyte (1 GB). An encoded human genome fits in 1 Gigabyte (1 GB).

You go to the next level and multiply 1 Gigabyte (1 GB) by 1000, which gives you 1 Terabyte (1 TB). If you took a picture with an HD camera every hour of every day for 80 years, all of those pictures would fit in 1 Terabyte (1 TB).

You go to the next level and multiply 1 Terabyte (1 TB) by 1000, which gives you 1 Petabyte (1 PB). If you turned all the trees of the Amazon rainforest into paper and wrote text on both sides of every sheet, all of that paper would represent between 1 and 2 Petabytes (1-2 PB).

You go to the next level and multiply 1 Petabyte (1 PB) by 1000, which gives you 1 Exabyte (1 EB). All existing data on planet Earth is contained in 1 Exabyte (1 EB).
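If you like to see things in code, here is a minimal Python sketch of the same ladder of units. The names `UNITS`, `to_bytes` and `human_readable` are my own for this illustration and do not come from the course. It assumes the decimal convention used above, where each level is 1000 times the previous one.

```python
# Decimal (SI) units as used in this article: each step is a factor of 1000.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB"]


def to_bytes(value, unit):
    """Convert a value in the given unit (e.g. 130, "EB") to bytes."""
    return value * 1000 ** UNITS.index(unit)


def human_readable(num_bytes):
    """Express a byte count using the first unit in which the value is below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the last unit even if the value is still 1000 or more.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(human_readable(1))                    # one letter
print(human_readable(1_000_000))            # a 500-page book
print(human_readable(to_bytes(130, "EB")))  # data created until 2005
```

Note that some tools count in steps of 1024 instead of 1000, so the numbers your computer shows may differ slightly from this sketch.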

# More more more data

I think you now have a better idea of how we measure the amount of data on a hard drive. Earlier, I told you that humans created 130 Exabytes (130 EB) of data from the beginning of humanity until 2005.

In 2010, this increased to 1200 Exabytes (1200 EB). In 2015, it increased to 7900 Exabytes (7900 EB). The forecast for 2020 is that it will increase to 40 900 Exabytes (40 900 EB). You can see that data creation in the world is growing very, very fast.
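To put a number on “very, very fast”, here is a small Python sketch that computes how much the figures above grow from one date to the next. It uses only the four values quoted in this article and treats them as comparable measurements of the same quantity, which is an assumption on my part. The function name `growth_between` is hypothetical, and the yearly rate is the constant rate that would give the same overall growth over each period.

```python
# Figures quoted in this article, in Exabytes (the 2020 value is a forecast).
data_eb = {2005: 130, 2010: 1200, 2015: 7900, 2020: 40900}


def growth_between(series):
    """Return (start_year, end_year, factor, annual_rate) for consecutive points."""
    years = sorted(series)
    results = []
    for start, end in zip(years, years[1:]):
        factor = series[end] / series[start]
        # Constant yearly rate that would produce the same overall factor.
        annual_rate = factor ** (1 / (end - start)) - 1
        results.append((start, end, factor, annual_rate))
    return results


for start, end, factor, rate in growth_between(data_eb):
    print(f"{start}-{end}: x{factor:.1f} overall, about {rate:.0%} per year")
```

Each five-year period multiplies the amount of data several times over, which is the trend the rest of this post relies on.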

With a graph, it’s easier to visualize all that.

The blue line on the graph corresponds to the quantity of data that machines (computers) can store. As you can see, there is much more data than computers can store.

The red line corresponds to the amount of data that Data Scientists can process. As you can see, there is much more data than Data Scientists can process.

Another important point is that the gap between the machines and the Data Scientists will increase over time.

There are very few Data Scientists in the world, and because they’re rare, their salaries are high.

As companies increasingly seek out Data Scientists, universities and engineering schools are beginning to offer this type of training.

As the amount of data increases, companies’ demand for Data Scientists to process it also increases. This demand is so enormous that, within a dozen years, everyone is expected to know the basics of Data Science, just as with programming today.

I advise you to do some research on Data Science. You’ll see that it can be used in any industry, and it’s really interesting.