Social Network Analysis - Group of Authors
2.7.1 Facebook Case Study
The first important step in analyzing any kind of data set in Python is importing libraries. The data to be analyzed can either be scraped directly from the respective site or accessed through the API provided by the website [17]. Choosing the data depends mainly on the need: why do we need to analyze the data? What is the purpose? What kind of problem are we solving? [18]
Step 1: Import libraries
Each library provides its own built-in functions, which make Python convenient to code in.
Figure 2.11 Code blocks for importing libraries.
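The exact import list in Figure 2.11 is not reproduced here; the following is a minimal sketch of the libraries such a study typically relies on. The choice of pandas, NetworkX, and matplotlib is an assumption based on the steps that follow.

```python
# Assumed imports for this case study: pandas for tabular data,
# NetworkX for graph analysis, matplotlib for visualization.
import pandas as pd
import networkx as nx
import matplotlib
matplotlib.use("Agg")  # headless backend so the examples run without a display
import matplotlib.pyplot as plt

print(nx.__version__)  # confirm the environment is set up
```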
Step 2: Read data
Pandas is used to retrieve the data and makes it convenient to explore even a very large data set.
Figure 2.12 Code block for reading data.
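A sketch of the read step, assuming the data arrives as an edge-list-style CSV; the actual file and column names appear only in Figure 2.12, so an in-memory stand-in is used here.

```python
import io
import pandas as pd

# Stand-in for a file such as "facebook_edges.csv" (name is hypothetical):
# each row records a friendship link between two user ids.
raw = io.StringIO("source,target\n0,1\n0,2\n1,2\n2,3\n")
df = pd.read_csv(raw)

print(df.head())   # quick look at the first rows
print(df.shape)    # (number of rows, number of columns)
```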
Step 3: Data cleaning
Data cleaning means removing the noise (NaN values, missing data) [19]. Data quality has a strong impact on the model, so using data with less noise is recommended for better results. Missing values can be imputed with the mean, the median, and so on [20–22]; the right choice depends entirely on the type of data.
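The two fixes mentioned above, dropping noisy rows and mean imputation, can be sketched on a toy frame (the column names here are illustrative only):

```python
import numpy as np
import pandas as pd

# Toy data with one missing value (NaN) to clean.
df = pd.DataFrame({"user_id": [1, 2, 3, 4],
                   "friend_count": [10.0, np.nan, 30.0, 20.0]})

# Option 1: drop rows containing missing data.
dropped = df.dropna()

# Option 2: impute the missing value with the column mean.
imputed = df.fillna({"friend_count": df["friend_count"].mean()})

print(imputed["friend_count"].tolist())
```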
Step 4: Read input
read_edgelist is a built-in function in the NetworkX library; more details about it can be found on the NetworkX documentation website [23].
Figure 2.13 Code block for reading edge list.
Step 5: Visualizing the network
Figure 2.14 Visualization of Facebook users.
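The drawing code behind Figure 2.14 is not shown in the text; the sketch below assumes the standard NetworkX approach of a force-directed (spring) layout rendered with matplotlib, demonstrated on a small built-in graph rather than the full Facebook network.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
import networkx as nx

# Small stand-in network; the chapter visualizes the Facebook graph instead.
G = nx.karate_club_graph()

# spring_layout computes 2-D positions; a fixed seed makes the plot reproducible.
pos = nx.spring_layout(G, seed=42)
nx.draw(G, pos, node_size=30, width=0.5)
plt.savefig("network.png", dpi=150)
plt.close()
```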
Step 6: Centrality measures
Figure 2.15 Code block for centrality measures.
Figure 2.16 Visualization of centrality measures on Facebook users.
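The measures computed in Figure 2.15 can be sketched with NetworkX's standard centrality functions. A star graph is used here because the centre node's scores are easy to verify by hand: it lies on every shortest path and is adjacent to every other node.

```python
import networkx as nx

# Star graph: node 0 is connected to nodes 1..4.
G = nx.star_graph(4)

deg = nx.degree_centrality(G)       # fraction of other nodes each node touches
bet = nx.betweenness_centrality(G)  # share of shortest paths through each node
clo = nx.closeness_centrality(G)    # inverse of average distance to others
eig = nx.eigenvector_centrality(G)  # influence via well-connected neighbours

# The centre of a star is maximally central under every measure.
print(deg[0], bet[0], clo[0])
```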