Social Network Analysis - Group of Authors
2.7.1 Facebook Case Study
The first important step in analyzing any kind of data set in Python is importing libraries. The data to be analyzed can either be scraped directly from the respective site or accessed through the API provided by the website [17]. Choosing the data depends mainly on the need: why do we need to analyze the data? What is the purpose? What kind of problem are we solving? [18]
Step 1: Import libraries
Each library provides its own built-in functions, which make Python convenient to code in.
Figure 2.11 Code blocks for importing libraries.
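The exact import list in Figure 2.11 is not reproduced here; the following is a minimal sketch of the libraries such a study typically relies on. The choice of pandas, NetworkX, and matplotlib is an assumption based on the steps that follow.

```python
# Assumed imports for this case study: pandas for tabular data,
# NetworkX for graph analysis, matplotlib for visualization.
import pandas as pd
import networkx as nx
import matplotlib
matplotlib.use("Agg")  # headless backend so the examples run without a display
import matplotlib.pyplot as plt

print(nx.__version__)  # confirm the environment is set up
```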
Step 2: Read data
Pandas is used to retrieve the data and makes it convenient to explore even a very large data set.
Figure 2.12 Code block for reading data.
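A sketch of the read step, assuming the data arrives as an edge-list-style CSV; the actual file and column names appear only in Figure 2.12, so an in-memory stand-in is used here.

```python
import io
import pandas as pd

# Stand-in for a file such as "facebook_edges.csv" (name is hypothetical):
# each row records a friendship link between two user ids.
raw = io.StringIO("source,target\n0,1\n0,2\n1,2\n2,3\n")
df = pd.read_csv(raw)

print(df.head())   # quick look at the first rows
print(df.shape)    # (number of rows, number of columns)
```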
Step 3: Data cleaning
Data cleaning means removing the noise (NaN values, missing data) [19]. Data quality has a strong impact on the model, so using data with less noise is recommended for better results. Missing values can be imputed with the mean, the median, and so on [20–22]; the right choice depends entirely on the type of data.
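The two fixes mentioned above, dropping noisy rows and mean imputation, can be sketched on a toy frame (the column names here are illustrative only):

```python
import numpy as np
import pandas as pd

# Toy data with one missing value (NaN) to clean.
df = pd.DataFrame({"user_id": [1, 2, 3, 4],
                   "friend_count": [10.0, np.nan, 30.0, 20.0]})

# Option 1: drop rows containing missing data.
dropped = df.dropna()

# Option 2: impute the missing value with the column mean.
imputed = df.fillna({"friend_count": df["friend_count"].mean()})

print(imputed["friend_count"].tolist())
```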
Step 4: Read input
read_edgelist is a built-in function in the NetworkX library; more details about it can be found on the NetworkX documentation website [23].
Figure 2.13 Code block for reading edge list.
Step 5: Visualizing the network
Figure 2.14 Visualization of Facebook users.
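The drawing code behind Figure 2.14 is not shown in the text; the sketch below assumes the standard NetworkX approach of a force-directed (spring) layout rendered with matplotlib, demonstrated on a small built-in graph rather than the full Facebook network.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
import networkx as nx

# Small stand-in network; the chapter visualizes the Facebook graph instead.
G = nx.karate_club_graph()

# spring_layout computes 2-D positions; a fixed seed makes the plot reproducible.
pos = nx.spring_layout(G, seed=42)
nx.draw(G, pos, node_size=30, width=0.5)
plt.savefig("network.png", dpi=150)
plt.close()
```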
Step 6: Centrality measures
Figure 2.15 Code block for centrality measures.
Figure 2.16 Visualization of centrality measures on Facebook users.
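The measures computed in Figure 2.15 can be sketched with NetworkX's standard centrality functions. A star graph is used here because the centre node's scores are easy to verify by hand: it lies on every shortest path and is adjacent to every other node.

```python
import networkx as nx

# Star graph: node 0 is connected to nodes 1..4.
G = nx.star_graph(4)

deg = nx.degree_centrality(G)       # fraction of other nodes each node touches
bet = nx.betweenness_centrality(G)  # share of shortest paths through each node
clo = nx.closeness_centrality(G)    # inverse of average distance to others
eig = nx.eigenvector_centrality(G)  # influence via well-connected neighbours

# The centre of a star is maximally central under every measure.
print(deg[0], bet[0], clo[0])
```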