History of Big Data
Roger Mougalas introduced the term Big Data in 2005. However, the practice of collecting data and searching through it to understand it has existed for far longer. Some of the earliest traces date back about 7,000 years.
It began when accounting was introduced in Mesopotamia to record herding and crop growth. These fundamentals continued to develop over time. In 1662, John Graunt, often called the father of statistics, published his analysis of mortality records in London. Graunt’s aim was to raise awareness of the impact of a deadly plague. His book “Natural and Political Observations Made upon the Bills of Mortality” is widely regarded as one of the first statistical analyses of data, and it offers insight into the causes of death in seventeenth-century England.
After Graunt’s contributions, accounting principles continued to develop, but no major change happened until the late 19th century. The modern information era began when Herman Hollerith invented a punch-card tabulating machine in 1889 in an effort to organize census data. The next notable event came in 1937, when Franklin D. Roosevelt was president of the United States. After Congress passed the Social Security Act in 1935, the government had to keep track of millions of Americans. For this extensive data project, the government hired IBM to build a punch-card-reading system.
‘Colossus’, developed by the British in 1943 to decipher Nazi codes during World War II, was among the very first data processing machines. It searched intercepted messages for patterns that recurred in sequence. The machine saved a great deal of time: work that took weeks by hand took it just a few hours. It read messages at a rate of five thousand characters per second.
This line of work was followed by the creation of the National Security Agency (NSA) in the United States in 1952. The task of NSA employees was to decrypt messages intercepted during the Cold War. By this stage, machines were advanced enough to collect and analyze information independently and automatically. In 1965, the United States government planned the first data centre to store millions of tax returns and fingerprint sets. Each record was to be transferred to magnetic tape and stored systematically in a central location. The project was discontinued, but it is still considered the beginning of large-scale electronic data storage.
In 1989, the British computer scientist Tim Berners-Lee invented the World Wide Web. He wanted to create an information sharing system using hypertext. From the beginning of the 1990s, data creation accelerated as more devices gained access to the Internet.
Around the time Roger Mougalas introduced the term Big Data in 2005, Hadoop was created and then developed largely at Yahoo, with the goal of indexing the whole World Wide Web. Hadoop is open source, and many businesses today use it to process huge amounts of data. During this period, rapidly growing social networks started generating huge amounts of data daily, and governments and businesses began establishing big data projects. For instance, the Indian government created the largest biometric database to store fingerprints and iris scans of its residents. In 2010, Eric Schmidt, an American businessman and software engineer, gave a speech at the Techonomy conference in Lake Tahoe, California, in which he said that 5 exabytes of data had been produced from the beginning of time until 2003, and that the same amount was by then being generated every two days.
Examples of Big Data
- Stock exchanges produce many terabytes of data daily, and predictive analytics on this data is widely used in stock trading. Using machine learning and artificial intelligence techniques, data from Twitter, Facebook and other social media platforms is analyzed to predict a company's behaviour or the market sentiment around it. Traders use these predictions to decide when and where to place their orders.
- The social media website Facebook reportedly stores 500+ terabytes of new data daily in the form of photos, videos, comments and message conversations. Businesses filter useful information out of this data and use it to frame efficient social media marketing strategies. With big data, they can target social media campaigns at potential customers more effectively.
- A single jet engine can produce 10+ terabytes of data in 30 minutes of flight. With thousands of flights daily, the total reaches many petabytes. By collecting the unstructured data created by aircraft maintenance sensors and combining it with onboard and ground data, technology such as MongoDB can provide in-flight intelligence to operate the aircraft more safely and efficiently.
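
The jump from terabytes per flight to petabytes per day in the last example can be checked with a back-of-envelope calculation. The sketch below is only illustrative: the 10 TB per 30 minutes figure comes from the text, while the number of flights per day and the average flight duration are hypothetical values chosen for the example, not measured data.

```python
# Back-of-envelope estimate of daily aircraft sensor data volume.
TB_PER_HALF_HOUR = 10      # from the text: 10+ TB every 30 minutes
FLIGHTS_PER_DAY = 5_000    # assumption: stands in for "thousands of flights daily"
AVG_FLIGHT_HOURS = 2.0     # assumption: hypothetical average flight duration
TB_PER_PB = 1_000          # decimal units: 1 PB = 1,000 TB


def daily_volume_pb(flights, hours, tb_per_half_hour=TB_PER_HALF_HOUR):
    """Return the estimated data volume in petabytes for one day."""
    half_hours = hours * 2                         # 30-minute intervals per flight
    total_tb = flights * half_hours * tb_per_half_hour
    return total_tb / TB_PER_PB


print(daily_volume_pb(FLIGHTS_PER_DAY, AVG_FLIGHT_HOURS))  # 200.0
```

Under these assumed inputs the estimate is 200 petabytes per day, which is consistent with the claim of "many petabytes". Different assumptions about flight count or duration scale the result linearly.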