
(Luckey Sun / Flickr)

Ontario’s information and privacy commissioner Ann Cavoukian has been appointed as the executive director of Ryerson’s Institute for Privacy and Big Data. But what exactly is Big Data? The Ryersonian’s Alexa Huffman takes a deeper look into the complicated topic.

What is Big Data?
Big data is a term that describes a large amount of data, both unstructured and structured, that is difficult to process with traditional software systems. It encompasses the challenges of capturing, curating, storing, searching, sharing and transferring the data, as well as analyzing and visualizing it. The size of data is measured in bytes, from gigabytes to exabytes. Big data sets can amount to a petabyte or even an exabyte, or more.
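
To give a sense of those scales, here is a minimal sketch in Python. It assumes decimal (SI) units, where each step up is a factor of 1,000; the function name describe_size and the sample figure are purely illustrative and do not come from any of the sources cited in this article.

```python
# Decimal (SI) byte units: each unit is 1,000 times the previous one.
# (Assumption: some systems use binary units of 1,024 instead.)
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

GIGABYTE = 10**9
PETABYTE = 10**15
EXABYTE = 10**18


def describe_size(num_bytes):
    """Return a short human-readable size, e.g. '2.5 PB' (hypothetical helper)."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop once the value fits in this unit, or at the largest unit we know.
        if size < 1000 or unit == UNITS[-1]:
            return f"{size:g} {unit}"
        size /= 1000


if __name__ == "__main__":
    print(describe_size(PETABYTE))
    print(describe_size(2_500_000_000_000_000))  # an illustrative data set size
    print("Gigabytes in one exabyte:", EXABYTE // GIGABYTE)
```

On these definitions, a single exabyte holds a billion gigabytes, which helps explain why data at this scale is hard to handle with traditional software.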

Big data consists of different records from millions of people, not all from the same source. There can be web data, sales data, social media information and mobile data, to name a few. The data can often be incomplete or inaccessible. That’s where big data methods come in, to help organize it.

Why do companies want to analyze big data?
Two words: business analytics. According to Edd Dumbill of O’Reilly Radar, big data can give valuable insight into certain business areas, such as product development or customer service. But big data can be a problem for businesses because the tools to analyze massive data sets aren’t commonplace yet.

Data sets have grown in size because today’s technology can store large amounts of data. That data comes from mobile devices, wireless sensor networks, software logs, cameras, microphones and more.

Where did the term Big Data come from?
According to a New York Times article, the term is believed to have started in the high-tech community during the 1990s.

But the actual history goes back much further than that, to the 1940s. That is when people first tried to quantify the “information explosion,” or the growth rate of the volume of data, says a Forbes magazine article. At the time, this was only seen in university libraries.

The quantification continued throughout the 20th century. In 1961, physicist Derek Price charted the growth of scientific knowledge by looking at the growth in the number of scientific journals.

By 1967, there was already a call to keep information storage requirements to a minimum and to increase the rate of information transmission through a computer. The debate on privacy concerns began in the 1970s. Japan started studying the volume of information that circulated in the country. Researchers found information supply was increasing faster than information consumption and society was moving towards an era of detail-oriented information.

In 1996, digital storage became more cost-effective for storing data than paper, according to R.J.T. Morris and B.J. Truskowski in “The Evolution of Storage Systems,” published in the IBM Systems Journal in 2003.

In 1997, the first article in the Association for Computing Machinery digital library used the term “big data,” says the Forbes magazine article.

Are they developing software to deal with it?
They are, and the software continues to develop. Huge, financially sound corporations like Walmart and Google are already using big data. Yahoo, Amazon and Facebook are using it as well.

The highly personalized social media experience and advertising Facebook is known for? That’s thanks to big data. Facebook looked at the large number of signals from users’ actions and those of their friends to create an entirely new service.

Mike Volpi writes in an article that cloud-based architecture and open source software are bringing big data processing to more companies than ever before. Many companies will even use cloud resources to help with their own software.

To learn what Ryerson is doing in terms of big data, click here.

Alexa Huffman is a former reporter with the Ryersonian. She graduated from Ryerson University in 2014 with a Bachelor of Journalism.