Big Data For Everyone

I wouldn’t be a true Millennial if I didn’t follow the crowd and catch Big Data fever. Although there is still no common definition of what Big Data is, it looks like everyone can take part in the journey to becoming a Big Data scientist / analyst / strategist / architect / groupie. In this new era we can see psychologists programming – did Freud expect that? No, but Goffman predicted it in a way.

In this post I will try to explain how Big Data is used in everyday life by everyone, without them even realizing it. We learn to draw conclusions based on our experiences with things that we cannot directly name. Complicated? Not really. Ever checked someone’s profile on Facebook, formed a picture of them in your head, and then met them in real life and realized they are completely different – or, more excitingly, that your social media research had predicted their attributes accurately? After screening many Facebook profiles and doing plenty of social media research, we have slowly learned to find patterns. After years of Facebook creeping, we now know that having a zillion friends doesn’t necessarily mean someone is a social butterfly in real life.


The other day I came across a very fun personality assessment, developed by the University of Cambridge’s Psychometrics Centre. It’s an awesome app because it puts data behind what is clear to everyone – one’s personality can be predicted based on what they like. Let’s skip the question of whether the virtual self is equal to our real self and go to the fun stuff. According to the findings of the test, I come across as intelligent because I like Nutella, competitive because I like AC/DC, and I must be Catholic because I liked America’s Next Top Model ten years ago.

Amazing deduction, I would say.

What we learn from this experiment is the following:

  • Not every piece of data collected along the way should be used in making predictions. Until we find our way to the relevant data, Big Data deduction is biased and in some cases even dangerous.
  • Social media profile privacy settings are important. Update them. Delete stupid things.
  • What we share is not necessarily who we are, but only we know that. Others don’t; they take it as current and relevant.
  • It is a good example of what is missing in Big Data – sound judgment.
  • For now, machines, programming languages and clever software cannot replace human reasoning.
  • Common sense inferences are important.

Even though academics and the public have voiced strong disagreement with apps like this because of their influence on people’s self-esteem and so on, I think it’s brilliant, even though it’s wrong. My point here is a bit different: we have all been doing this all the time. We have become masters of Facebook creeping and of making assumptions about what people are like based on their Facebook profiles. We have done it so many times that our predictions have become accurate! How is this different from Big Data analysis?

Goffman says that we perceive people based on what is already known about them or whatever we have already heard about them. This statement is about 60 years old, but in my opinion it is still valid. We just no longer organise our prejudices about people around what we heard at the butcher’s or in the gossip corner. Now we organise our assumptions about people around what we learn from their Facebook profiles – and it is not only the photos they upload or the content they share. It is about the pages they’ve liked, the comments they’ve posted, the videos they’ve shared, and whatever content they’ve been tagged in.

Of course, the usual argument follows: the Facebook you is not the real you. But will this stop us from creeping on other people’s profiles? I don’t think so. Will it stop your employer from Facebook creeping you before they hire you? Not sure.

It is not valid, it is not right, but it exists. Therefore – protect your profiles; nothing is really private on social media. And most of all – be authentic, be you. More on the dangers of false self-representation on social media in a future post, I promise.

One does not need to join the Big Data fever, because everyone is already a part of it. People just make it complicated with ambiguous definitions, expensive software and complicated programming languages. It seems like Big Data is dealing with everything and everyone. I am not sure about everything, but with regard to the Everyone part, it has a long way to go, so it should slow down in its implementations. People are not machines; let’s keep it that way.

