
Blog entry # 3 - Network Analysis and course insights


Network Analysis is very powerful!
If you thought that conventional Data Analysis was breaking all boundaries in finding significant new insights in data that we were not able to track and see before, then network analysis will truly blow your mind.

In an era of constant growth in social media use and networking between people, everyone seems to want to connect and share things with each other. In fact, everyone seems to want to be part of a group or, in our case, a network. Network analysis lets us see even deeper connections between individuals, because we can analyze different aspects of their connections: how important they are to each other and to the network, how often they interact with each other, and how well they are connected to others.

Don't hurt my Ego-network!
A network is made of nodes, the individuals in a network (e.g. people or objects), and edges, the connections between these nodes. There are many different types of networks, and each one is defined by its context and semantics. For example, we can have a friendship network between people.

In this network we can see that each person is connected to a different number of other people. The number of connections a node has is called its degree. A friendship network will usually be undirected, which means that the edges have no specific direction (friendships are usually mutual). In other examples we see directed networks, in which an interaction between nodes has a specific direction (e.g. emails, or compensation such as salaries and bonuses in a work network). In directed networks each node has an in-degree and an out-degree: how many connections are incoming and how many are outgoing. For example, on Twitter, people can follow other people, so for any person there is the number of people following them (in-degree) and the number of people they are following (out-degree).
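To make degree, in-degree and out-degree concrete, here is a minimal sketch in plain Python. The names and edge lists are invented for illustration and are not taken from the course material.

```python
from collections import Counter

# Hypothetical friendship network (undirected): each pair is a mutual tie.
friendships = [("Ana", "Ben"), ("Ana", "Cy"), ("Ben", "Cy"), ("Cy", "Dee")]

def degrees(edges):
    """Count how many connections each node has in an undirected network."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

# Hypothetical follow network (directed): each pair is (follower, followed).
follows = [("Ana", "Ben"), ("Cy", "Ben"), ("Dee", "Ben"), ("Ben", "Ana")]

def in_out_degrees(edges):
    """Return (in_degree, out_degree) counters for a directed network."""
    in_deg, out_deg = Counter(), Counter()
    for source, target in edges:
        out_deg[source] += 1  # the follower gains an outgoing connection
        in_deg[target] += 1   # the followed person gains an incoming one
    return in_deg, out_deg

friend_degree = degrees(friendships)
in_deg, out_deg = in_out_degrees(follows)

print("Cy's degree:", friend_degree["Cy"])
print("Ben's followers (in-degree):", in_deg["Ben"])
print("Accounts Ben follows (out-degree):", out_deg["Ben"])
```

In the directed example, a person with many followers but few followed accounts has a high in-degree and a low out-degree.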


There are networks in which the relationships between nodes do not all affect the network equally. These are called weighted networks, in which one edge can be stronger than another. For example, from our final exam, in an international trade network the countries are the nodes, the trade relationships between them are the edges, and the trade volumes are the edge weights. A country that imports more than others will have a larger total import weight, while it might have a smaller export weight. Its overall weight defines its effect on the network with regard to trade.
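The trade example can be sketched as a weighted, directed edge list. The country labels and volumes below are made up for illustration; summing the weights per node gives what is often called the node's strength.

```python
from collections import defaultdict

# Hypothetical trade flows: (exporter, importer, volume). Numbers are invented.
trade = [
    ("A", "B", 50.0),
    ("A", "C", 20.0),
    ("B", "A", 10.0),
    ("C", "B", 5.0),
]

def strengths(flows):
    """Sum edge weights per country, separately for exports and imports."""
    exports = defaultdict(float)
    imports = defaultdict(float)
    for exporter, importer, volume in flows:
        exports[exporter] += volume
        imports[importer] += volume
    return exports, imports

exports, imports = strengths(trade)

# Overall weight of a country: everything it sends plus everything it receives.
countries = set(exports) | set(imports)
total = {c: exports[c] + imports[c] for c in countries}

for country in sorted(total):
    print(country, "exports:", exports[country],
          "imports:", imports[country], "total:", total[country])
```

Here country B imports the most while exporting little, which is the kind of asymmetry described above.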

Example for distinguishing one community within a larger network

Another interesting measure is the distance between nodes. Each step from one node to another is called a hop, and the smallest number of hops between two nodes is referred to as the shortest path or the geodesic distance. If we analyze the distances between nodes, we can find the best way to reach destinations and identify the most "connected" nodes with the betweenness centrality and closeness centrality measures. Betweenness shows how often a node lies on the shortest paths between other nodes, and closeness shows how near a node is, on average, to all the other nodes. These insights are all related to what we know as the six degrees of separation theory, which was popularized in the 1960s by an American psychologist named Stanley Milgram. The theory says that any two people (nodes) picked at random on earth can reach each other through a chain of at most six connections (edges). There are many experiments and fun facts about the six degrees of separation, such as the Six Degrees of Kevin Bacon and other games, and it has even been used in recent political articles.
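Hop counts and closeness can be sketched with a breadth-first search. This is a minimal illustration on an invented five-person network, assuming an unweighted, connected, undirected graph; tools such as Gephi compute these measures for us.

```python
from collections import deque

# Hypothetical undirected network stored as an adjacency list.
graph = {
    "Ana": ["Ben", "Cy"],
    "Ben": ["Ana", "Cy"],
    "Cy": ["Ana", "Ben", "Dee"],
    "Dee": ["Cy", "Eli"],
    "Eli": ["Dee"],
}

def hop_distances(graph, start):
    """Breadth-first search: fewest hops from start to every reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances

def closeness(graph, node):
    """Closeness centrality: reachable nodes divided by total hop distance."""
    distances = hop_distances(graph, node)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0

print("Hops from Ana:", hop_distances(graph, "Ana"))
for person in graph:
    print(person, round(closeness(graph, person), 3))
```

The node with the highest closeness is the one that can reach everyone else in the fewest hops on average.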

There are some very interesting software tools for analyzing networks, such as Gephi, SocNetV, Centrifuge and EgoNet. With this software, in addition to the measures mentioned earlier, we can also analyze things such as the importance or influence of a node through eigenvector centrality (e.g. in a citation network, importance is based not only on the number of citations but also on how well a node is connected to other well-connected nodes), groupings such as age groups with a clustering algorithm, and the density of connections (the number of connections in a network as a fraction of the maximum possible number of connections).

All of these tools and measures, which were once hard to find or even define, are now a few clicks away and available to everyone. Network analysis has really proved to be a very effective part of Data Analysis.
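As a rough sketch of two of these measures, the code below computes density and an eigenvector centrality score on an invented undirected network. The power iteration shown is one common way to approximate eigenvector centrality and is not necessarily what Gephi or the other tools do internally.

```python
# Hypothetical undirected network stored as an adjacency list.
graph = {
    "Ana": ["Ben", "Cy"],
    "Ben": ["Ana", "Cy"],
    "Cy": ["Ana", "Ben", "Dee"],
    "Dee": ["Cy", "Eli"],
    "Eli": ["Dee"],
}

def density(graph):
    """Existing edges as a fraction of the maximum possible number of edges."""
    n = len(graph)
    # Each undirected edge appears in two adjacency lists, so halve the sum.
    edge_count = sum(len(neighbors) for neighbors in graph.values()) / 2
    max_edges = n * (n - 1) / 2
    return edge_count / max_edges if max_edges else 0.0

def eigenvector_centrality(graph, iterations=100):
    """Approximate eigenvector centrality by repeated neighbor-score sums."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        # A node's new score is its own score plus its neighbors' scores.
        # Keeping the own score leaves the ranking unchanged and avoids
        # oscillation on some graph shapes.
        updated = {
            node: scores[node] + sum(scores[nb] for nb in graph[node])
            for node in graph
        }
        norm = sum(value * value for value in updated.values()) ** 0.5
        scores = {node: value / norm for node, value in updated.items()}
    return scores

scores = eigenvector_centrality(graph)
print("Density:", density(graph))
for person, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(person, round(score, 3))
```

A node scores highly when its neighbors score highly, which is why being connected to well-connected nodes matters more than the raw number of connections.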



WELL A-L-O-T-!

Many things that were once hard to understand or even imagine make a lot more sense with Data Analysis, whether it is a way to maintain a competitive advantage, a new way of working with streaming data to bring more value to a business, or even a way to discover more about the day-to-day processes and functions we do at home at Datameer.



Peter Sondergaard, Senior Vice President at Gartner, once said:

"Information is the oil of the 21st century, and analytics is the combustion engine."

We believe in this statement: the more content and data we have, and the more new ways we have to analyze that data, the less we need to work by intuition alone. This can improve our decision-making process as managers, our commitment to our organization and society, and the way we impact our colleagues' success and our own. Obviously, people still need intuition in their decisions, but it should be combined with data to lead to better results. Without information we cannot do anything: we cannot understand, reason, plan or forecast. Once we know how to use our information in the best way, our operations will improve and we will be able to evolve further.
To extend the car metaphor: just as an engine cannot run without oil, oil is of little use without an engine. In the same way, information needs to be well analyzed with modern tools before our organization can make the best use of it.

“The more you know, the more you know you don't know.” (attributed to Aristotle)

Does size really matter? Yes it does!
Big Data is defined by many metrics. One of them is its sheer size, which has never before been available to us. With more data we can draw better and new conclusions that will help us understand new things about how we run our business.

Better decision making
With Big Data, we can find out important information that can make our lives better, such as how to choose the right restaurant or how to pair food better :) [dedicated to Prof. Ram]. This might seem like a small point that does not affect us enormously, but does it not? Really? Think again.
Self-discrepancy theory, taken from consumer behavior, states that people have an actual self and an ideal self. People think of themselves in terms of what they are versus what they want to be (Who am I today? What do I think of myself? versus Who would I like to become?), and the gap between the two is what builds motivation or self-esteem. Each individual has multiple selves, and different selves have different needs and perceptions. They get activated at different times and therefore need to be targeted differently. We mention this topic because the things that affect our life, lifestyle and leisure time are actually very important. They reflect on our inner self and make us want to act in certain ways, so that people perceive us the way we want. This is a powerful tool, and if we understand these behaviors better, we can use them to target our customers with various new solutions and offerings. Big Data is a key factor in gaining that knowledge.

Different food pairings do not work with Indian food

What if Google was one of us?
In order to analyze this huge amount of data, we have many different tools that can help with searching for and collecting the data, with cleaning it, and with sharing it with others in more comprehensible and visual ways. One of the biggest sources of data is obviously Google, and with it comes Google Analytics, a powerful tool for analyzing what happens on our websites. With the growing dependency on the internet and mobile, an organization must have a functioning website that offers its services or products. The website should answer users' needs and provide a good and unique experience that will make users come back and interact with the company more in the future.

It is therefore very important to be able to analyze the various activities happening on the website and to see the small differences and important patterns that affect our goals, whether they are more profit, more traffic, or simply more awareness. Web analytics has become a very competitive field, with tools provided by big companies and by smaller ones, all directed at making the best out of something as simple as a website. Through A/B testing, we are now able to see how small changes in our content impact user engagement, which in turn impacts our organization's success and how we reach our goals. One of the best tools on the web today comes from Optimizely, providing interesting features for better understanding our users, to the extent of heat maps and more.
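To illustrate what an A/B test measures, here is a minimal sketch that compares the conversion rates of two page versions with a two-proportion z-test. The visitor and conversion counts are invented, and this is a textbook calculation rather than the method Google Analytics or Optimizely necessarily use.

```python
import math

def ab_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test: is version B's conversion rate different from A's?"""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the assumption that both versions perform the same.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Hypothetical experiment: 1,000 visitors saw each version of the page.
rate_a, rate_b, z, p_value = ab_test(100, 1000, 130, 1000)

print(f"Version A converts {rate_a:.1%}, version B converts {rate_b:.1%}")
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value suggests the difference between the two versions is unlikely to be due to chance alone, although the threshold to act on is a business decision.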

Heat maps used at UXCam


A very important aspect of doing a good analysis of our data is to always ask the right questions. We need to remember that before there is data, there should be a company strategy, which should be the basis of everything. We need to think about what our company's goals are, which KPIs will measure our success, and how we can measure them. One way is to use the 5 W’s (What - Who - When - Where - Why), which makes it easier to analyze our customers/users.

What are people doing on our website (conversions, reading…)? Who are they (age, gender)? When are they doing these actions (time of day, period, certain events)? Where are they coming from (which countries, which referrals)? And most importantly, why are they here, and why do they leave?

So finally, with Big Data and Data Analysis tools, we can save the world, we can be innovative, we can work better, we can be more successful, we can be happy. Let's use Big Data better!

Porter's 4 Forces, OUT.
