USING NETWORK ANALYSIS FOR RAPID, TRANSPARENT, AND RIGOROUS THEMATIC ANALYSIS: A CASE STUDY OF ONLINE DISTANCE LEARNING

In thematic analysis, theme construction can be performed manually by the researcher or automatically by a computer. Both methods have strengths and weaknesses. This article introduces a strategy that combines the roles of both researcher and computer to construct themes from qualitative data in a rapid, transparent, and rigorous manner. The strategy uses network analysis and is demonstrated through a case study on students' perceptions of the online distance learning they experienced during the COVID-19 pandemic. The theme-construction strategy consists of four systematic phases, namely (1) determining the unit of analysis and coding; (2) constructing the code co-occurrence matrix; (3) conducting network analysis; and (4) generating, reviewing, and reporting the themes. The strategy is successfully demonstrated in generating themes from the data with modularity value Q = 0.34. The application of network analysis in this strategy allows researchers to automatically generate themes from qualitative data using mathematical algorithms, represent these themes visually using a network graph, and interpret the themes to answer the research questions.


INTRODUCTION
Thematic analysis is a popular method of qualitative data analysis. This method has been widely used in various research fields, including education (Brennan et al., 2019; Farrell & Seery, 2019; Kristanto, 2018; Littenberg-Tobias & Reich, 2020). Researchers' decision to choose thematic analysis is mostly due to its accessibility and flexibility (Braun & Clarke, 2012) and its lack of requirements for advanced qualitative research skills (Holloway & Todres, 2003). Moreover, thematic analysis does not depend on any particular theory or epistemology; hence, it is flexible enough to be used within any theoretical and epistemological approach (Braun & Clarke, 2006). Thus, the method can suitably answer a wide range of research questions (King & Brooks, 2018).
One of the main purposes of thematic analysis is to generate themes from a set of data. The word "generate" deserves emphasis here because generating themes requires the researcher's active involvement. Themes are not readily available in the data; they need to be constructed by the researchers themselves (Clarke & Braun, 2013). Generating themes usually starts with repeatedly reading over the data, followed by coding. A theme can be constructed by collecting, comparing, and arranging relevant codes (Nowell et al., 2017). Several strategies can be employed to generate themes (Ryan & Bernard, 2003). A researcher can use a mind map, coding manual, template, table, thematic network (Attride-Stirling, 2001), or affinity diagram (Haskins Lisle et al., 2020). These strategies give the researcher the flexibility to interact continuously with the data and thus not lose the context of the information being analyzed. The context is important for the researcher to generate themes that truly represent the data.
Even though the aforementioned strategies offer advantages, they draw their fair share of criticism as well. Theme-generating strategies carried out manually by researchers deserve criticism for at least two reasons. First, the strategies are potentially influenced by the researcher's own subjectivity (Jackson & Trochim, 2002). Theme generation done entirely by the researcher can result in a biased method, which means the generated themes may not reflect the data accurately. Second, such strategies usually take a long time and are impractical (Watkins, 2017). This is unfortunate because much qualitative research has limited time before being disseminated in order to provide immediate impact not only to the stakeholders but also to wider society. To overcome these drawbacks, computers can be of assistance in thematic analysis.
Thematic analysis of a set of data can be done quickly and automatically with the help of a computer. Concept mapping-based software, like Leximancer, can help the researcher generate themes from a set of data using statistical analysis (Jackson & Trochim, 2002). In this way, the researcher's subjectivity and bias can be avoided. Furthermore, the software can increase the transparency and replicability of the data analysis process (Thompson, 2002).
However, computer-assisted data analysis also has several disadvantages. The first disadvantage is its inability to thoroughly make sense of the data together with the context (Fielding & Lee, 2002). If the thematic analysis is entirely done using computer software, then it will be hard for the researcher to connect the themes with the context the themes originated from. Second, computer-assisted thematic analysis can potentially ignore important themes within the data (Lee et al., 2018). Considering these two weaknesses, the researcher must be in charge of the thematic analysis to ensure deep sense-making of the data.
Generating representative themes from a set of data is a process that requires attention to detail and a thorough understanding of context. It is painstakingly inefficient when done entirely by a human, yet worryingly ineffective when done automatically by computers. The process of generating themes needs a balanced mix of the interpretative role of a researcher and the mechanistic role of a computer (Lee et al., 2018), to minimize the drawbacks and maximize the advantages of both strategies.
This article describes a theme-generating strategy for thematic analysis that involves the roles of both the researcher and the computer. The strategy uses network analysis to categorize codes into several themes in an automatic, transparent, and rigorous way. In the explanation that follows, we give a brief review of network analysis and its use in thematic analysis, as well as the strategy we offer.

Network Analysis
Network analysis is founded on graph theory, the field of mathematics that studies networks (Harary, 2018; Scott & Carrington, 2014). A network is a set of points joined in pairs by lines (Newman, 2010). Four concepts are necessary to understand networks and will be important once we get down to thematic analysis, namely node, edge, degree, and weight.
Nodes are the points in a network. The number of nodes determines the size of the network; the more nodes it contains, the larger the network. Edges are the lines connecting pairs of nodes. Nodes and edges are the building blocks of a network. The characteristic of each node in a network can be identified through its degree, which indicates the number of edges connected to the node. The higher the degree of a node, the more edges are connected to it.
Another important metric related to networks is weight, which is a value assigned to an edge. The total weight of all edges connected to a node is called the weighted degree of the node. For networks with large-scale structure, it is useful to detect the communities or clusters that constitute them. In the context of network analysis, a community can be defined as a group of nodes that are closely connected to one another within the group but weakly connected to nodes outside it. One effective approach to detecting communities in a network is modularity optimization (Brandes et al., 2008; Good et al., 2010; Newman, 2006). Modularity denotes the density of edges within communities, as compared with the density expected in an equivalent network with its edges placed at random. The value ranges from -1 to 1. The higher the modularity value, the stronger the connectedness among nodes within the same community and the weaker the connectedness with nodes outside it. The formula for calculating the modularity of a weighted network is presented in Formula (1) (Newman, 2004).
$$Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j) \qquad (1)$$
where $A_{ij}$ represents the weight of the edge between nodes $i$ and $j$, $k_i = \sum_j A_{ij}$ is the sum of the weights of all the edges connected to node $i$, $c_i$ is the community containing node $i$, the function $\delta(u, v)$ equals 1 if $u = v$ and 0 otherwise, and $m = \frac{1}{2} \sum_{i,j} A_{ij}$.
Network analysis has been used in many research methodologies, such as bibliographic coupling (Kessler, 1963) and co-citation analysis (McCain, 1990; Small, 1973). For example, Chen et al. (2019) used bibliographic coupling to identify intellectual communities in the field of international research collaboration. To that end, they used the documents that represent the research field as the nodes. If two documents cite the same group of documents, then the two documents are connected by an edge. Furthermore, the network in their study is a weighted network. The weight of the edge connecting two nodes states the similarity index of the topics discussed in both nodes. Following modularity-based community detection, they identified five communities within the network. The method of bibliographic coupling has been applied in several studies in the field of education, such as distance learning (Lund et al., 2018), mathematics education (Drijvers et al., 2020), and information and communication technology education (Stopar & Bartol, 2019).
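To make Formula (1) concrete, the following is a minimal sketch that evaluates the modularity of a given partition of a small weighted network. The function name, the toy edge weights, and the node labels are assumptions made for illustration only; they are not data from this study, and the code is not the implementation used by any particular software.

```python
from itertools import product


def weighted_modularity(edge_weights, community_of):
    """Modularity Q of a partition of an undirected weighted network.

    edge_weights: {(i, j): weight}, each undirected edge listed once, no self-loops.
    community_of: {node: community label}.
    """
    # Symmetric weight lookup so that A[i, j] == A[j, i].
    A = {}
    for (i, j), w in edge_weights.items():
        A[i, j] = A.get((i, j), 0) + w
        A[j, i] = A.get((j, i), 0) + w

    nodes = list(community_of)
    # Weighted degree k_i: total weight of the edges attached to node i.
    k = {i: sum(A.get((i, j), 0) for j in nodes) for i in nodes}
    two_m = sum(k.values())  # equals 2m, twice the total edge weight

    total = 0.0
    for i, j in product(nodes, repeat=2):
        if community_of[i] == community_of[j]:  # delta(c_i, c_j) = 1
            total += A.get((i, j), 0) - k[i] * k[j] / two_m
    return total / two_m


# Hypothetical toy network: two tight triangles joined by one weak edge.
toy_edges = {
    ("a", "b"): 5, ("a", "c"): 5, ("b", "c"): 5,
    ("d", "e"): 5, ("d", "f"): 5, ("e", "f"): 5,
    ("c", "d"): 1,
}
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two = weighted_modularity(toy_edges, two_groups)
q_one = weighted_modularity(toy_edges, one_group)
print(q_two, q_one)
```

In this toy case, splitting the nodes into the two triangles gives a clearly positive Q (29/62, roughly 0.47), whereas placing all nodes in a single community gives a Q of zero up to floating-point rounding, which illustrates why a higher modularity signals a better-separated community structure.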

The Use of Network Analysis as a Theme-Generating Strategy
Network analysis can be implemented as one of the strategies to generate themes from qualitative data. To do this, the codes constructed from the data are depicted as nodes. The relationship between codes is denoted by their co-occurrence within a unit of analysis (LeCompte & Schensul, 1999). Thus, if two codes overlap on the same unit of analysis, then the nodes symbolizing the codes are connected with an edge. Two codes that co-occur in multiple units of analysis are more closely related than two codes that co-occur in only one unit of analysis, or in none at all. Therefore, the edges are weighted to denote the number of units of analysis in which the two codes co-occur.
To summarize, the networks involved in the theme-generating process use codes as nodes and the relationships between the codes (their co-occurrence within units of analysis) as weighted edges. We then use a modularity optimization algorithm to detect the communities within the networks. These communities represent the themes of the qualitative data.
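The mapping from coded data to a weighted network can be sketched in a few lines. In the sketch below, each unit of analysis is represented as the set of codes assigned to it, and the weight of an edge is the number of units in which both codes appear. The coded units and the function name are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded data: one set of codes per unit of analysis (e.g., a sentence).
coded_units = [
    {"asynchronous video", "content explanation"},
    {"asynchronous video", "content explanation", "module"},
    {"deadlines", "internet connection problem"},
    {"exam context"},  # a code that never co-occurs yields no edge
]


def build_weighted_edges(units):
    """Count, for every pair of codes, the units in which both codes appear."""
    weights = Counter()
    for codes in units:
        # sorted() gives each undirected pair a single canonical orientation.
        for pair in combinations(sorted(codes), 2):
            weights[pair] += 1
    return weights


edges = build_weighted_edges(coded_units)
for (code_a, code_b), weight in sorted(edges.items()):
    print(f"{code_a} -- {code_b}: weight {weight}")
```

Because the pairs are unordered, the resulting network is undirected, in contrast to a network whose edges are defined by the chronological order of codes.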
At least one group of researchers, Pokorny et al. (2018), has used a strategy similar to the one we propose. They also used codes as nodes and utilized modularity optimization to identify themes within the network. However, they define the edges differently: they use the chronological location of the codes, which results in a directed network. In our opinion, this method needs more time and thoroughness when translating the codes and the relationships between them into a network, especially for large-scale qualitative datasets.

RESEARCH METHOD
This study is a case study aimed at demonstrating how thematic analysis can utilize network analysis to generate themes from qualitative data. Case studies have frequently been used in research to demonstrate a strategy, technique, or method (Burns & Schubotz, 2009; Zhao & Logan, 2002). To describe the proposed theme-generating strategy, data on students' perceptions of the online distance learning they experienced during the COVID-19 pandemic were used.
The subjects of the study are 283 mathematics education students in a private university in Yogyakarta. Out of those, 80.21% are female and 19.79% are male. They are distributed among first-year students (26.5%), second-year students (31.1%), third-year students (21.9%), and fourth-year students (20.5%). The age ranges from 18 to 38.2 years old with the average M = 20.61 and standard deviation SD = 1.67.
The data on the students' perceptions were collected using an electronic questionnaire developed and administered using Google Forms. Three items are open-ended questions asking about students' meaningful experiences, their perceptions, and additional comments regarding online distance learning.
The questionnaire underwent a validation process and a try-out before being distributed to the targeted students. The qualitative data acquired from the students' written responses were analyzed using thematic analysis. The thematic analysis consists of six stages, namely (1) getting familiar with the data; (2) formulating initial codes; (3) generating (initial) themes; (4) reviewing the resulting themes; (5) defining and naming the themes; and (6) writing the report (Braun & Clarke, 2006). In this study, network analysis was employed in stage 3 to construct themes.

Findings
This section describes and interprets the results of the analysis during the theme-generating process. The results are presented chronologically according to the theme-generating phases.

Phase 1: Determining Unit of Analysis and Coding
In the first phase, we read the data repeatedly to understand its context. Afterwards, we chose the unit of analysis. We chose the sentence as the unit of analysis because most of the sentences in the dataset contain one or more interconnected ideas. This choice is important because the theme-generating strategy conducted in the next phases depends on the relationships between codes, which are captured by their co-occurrence within the unit of analysis.
Next, we coded the data using the software package Atlas.ti. Other than Atlas.ti, readers can use other qualitative data analysis software, such as NVivo (Jackson & Bazeley, 2019), RQDA (Chandra & Shang, 2019), and Dedoose (Salmona et al., 2019). The coding was conducted in a bottom-up manner, that is, with an inductive approach. From this process, we acquired 103 initial codes.

Phase 2: Constructing Code Co-Occurrence Matrix
A code co-occurrence matrix was constructed based on the initial codes. The codes are represented as the column and row headings of the matrix. Out of the 103 codes, 91 codes appeared together with other codes in the same unit of analysis, leaving 12 isolated codes. The value of each cell below the main diagonal of the matrix represents the co-occurrence frequency of the corresponding pair of codes. The number of cells that contain values is 434.

Figure 2. A Snippet of the Code Co-Occurrence Matrix
As an example, Figure 2 shows the co-occurrence frequency of 10 codes. The cell that connects content explanation to asynchronous video contains the number 48. This means the two codes co-occur 48 times in the data. By the same interpretation, asynchronous video and exam context never overlapped in the data. We then used the frequency of code co-occurrence as the strength of the connection between codes. The larger the frequency, the stronger the connection between the two codes.
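The structure of such a matrix can be illustrated with a short sketch that fills only the cells below the main diagonal, counts the cells that contain values, and lists the isolated codes. Only the value 48 is taken from the example above; the remaining counts, the selection of codes, and the function name are assumptions made for illustration.

```python
codes = ["asynchronous video", "content explanation", "deadlines", "exam context"]

# Co-occurrence counts keyed by unordered code pairs.
# Only the value 48 comes from the article; the other value is invented.
pair_counts = {
    frozenset({"content explanation", "asynchronous video"}): 48,
    frozenset({"deadlines", "content explanation"}): 3,
}


def lower_triangle(codes, pair_counts):
    """Return the matrix rows; row r holds the cells for columns 0..r-1."""
    return [
        [pair_counts.get(frozenset({codes[r], codes[c]}), 0) for c in range(r)]
        for r in range(len(codes))
    ]


matrix = lower_triangle(codes, pair_counts)

# Cells that contain a value correspond to edges of the network.
nonzero_cells = sum(1 for row in matrix for value in row if value > 0)

# Codes that never co-occur with any other code are isolated.
connected = {code for pair, count in pair_counts.items() if count > 0 for code in pair}
isolated = [code for code in codes if code not in connected]

for code, row in zip(codes, matrix):
    print(f"{code:22s}", row)
print("non-empty cells:", nonzero_cells, "| isolated codes:", isolated)
```

Storing only the lower triangle avoids counting each undirected pair twice, which is why the number of non-empty cells equals the number of edges in the resulting network.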

Phase 3: Conducting Network Analysis
The third phase is conducting network analysis on the code co-occurrence table generated in the previous phase. This phase starts by transforming the code co-occurrence table into a network. In this study, the network was constructed using the network analysis software Gephi (Bastian et al., 2009). The network represents the connectedness of the resulting codes. As a simple illustration, Figure 3 presents the network of the codes generated from a data segment. In the network, nodes represent codes, while the edges connecting two codes signify their co-occurrence.

Figure 3. Example of Codes Co-occurrence and Their Corresponding Network
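One possible way to hand a co-occurrence table to network analysis software is to write it out as an edge list. The sketch below writes a CSV file with Source, Target, Weight, and Type columns, which is the edge-table layout that Gephi's spreadsheet import is generally documented to accept; readers should verify the column names against their own Gephi version. The edge weights and the function name are hypothetical, and this is not necessarily the procedure followed in this study.

```python
import csv
import io

# Hypothetical weighted edges: (code, code) -> co-occurrence frequency.
edges = {
    ("asynchronous video", "content explanation"): 48,
    ("deadlines", "internet connection problem"): 12,
}


def write_edge_list(edges, handle):
    """Write an undirected, weighted edge list as CSV to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight", "Type"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight, "Undirected"])


# Written to an in-memory buffer here; pass an open file to save to disk instead.
buffer = io.StringIO()
write_edge_list(edges, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To save the edge list to disk, the same function can be called with a file opened via `open("edges.csv", "w", newline="", encoding="utf-8")`, where the file name is again only an example.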
After the network was constructed, the next step was the computation of several important metrics: the number of nodes and edges, degree, weighted degree, modularity, and the number of communities. All the metrics were calculated using Gephi, and the results are described in Table 1. Table 1 shows that the resulting network contains 91 nodes, each representing a code, and seven communities with modularity value Q = 0.34. This means that the 91 nodes can be categorized into seven communities. Community detection in this network was done using the Louvain algorithm (Blondel et al., 2008) in Gephi. To obtain these seven communities, we used a resolution of 0.7. In the community detection algorithm, the resolution value influences how many communities are generated: a greater resolution results in fewer communities (but more nodes within each community).
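For readers who prefer a scripted workflow, the same kinds of metrics can be computed outside Gephi. The sketch below uses the Python library NetworkX (version 2.7 or later is assumed for `louvain_communities`) on a hypothetical toy network; the code names are borrowed from this study, but the edge weights are invented. Note that the `resolution` parameter in NetworkX appears to run in the opposite direction to the one described above for Gephi (in NetworkX, values above 1 favor smaller communities), so a Gephi resolution value should not be assumed to carry over unchanged.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

# Hypothetical weighted co-occurrence edges: two tight groups of codes
# joined by a single weak tie. The weights are invented for illustration.
weighted_edges = [
    ("task context", "deadlines", 5),
    ("task context", "internet connection problem", 5),
    ("deadlines", "internet connection problem", 5),
    ("discussion", "student interaction", 5),
    ("discussion", "instructor-student interaction", 5),
    ("student interaction", "instructor-student interaction", 5),
    ("internet connection problem", "discussion", 1),
]

G = nx.Graph()
G.add_weighted_edges_from(weighted_edges)

# Basic network metrics.
n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()
degree = dict(G.degree())                          # number of attached edges
weighted_degree = dict(G.degree(weight="weight"))  # total attached edge weight

# Louvain community detection; the seed fixes the random node ordering.
communities = louvain_communities(G, weight="weight", resolution=1.0, seed=42)
Q = modularity(G, communities, weight="weight")

print(n_nodes, "nodes,", n_edges, "edges,", len(communities), "communities")
for community in communities:
    print(sorted(community))
print("modularity:", round(Q, 2))
```

The communities returned by the algorithm are only candidate themes; as in the strategy described here, they still need to be reviewed against the data by the researcher.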

Phase 4: Generating, Reviewing, and Reporting Themes
The main purpose of applying network analysis in this study is to categorize each code into several themes. The previous phase has identified seven themes. These seven themes along with the codes belonging to them are shown in Table 2.
Table 2 (only the following rows are recoverable)
Theme    Member codes
...      Task context, deadlines, internet connection problem, and 12 other codes
5        Instructor-student interaction, student interaction, discussion, and 2 other codes
6        Autonomous, communication, self-regulation, and 4 other codes
The construction of the seven themes along with their member codes has been completed. Next, each of these themes needs to be checked and reviewed to ensure its validity. The validity of a theme indicates whether its meaning is actually supported by thorough evidence in the data (Braun & Clarke, 2006). When checking and reviewing the themes, we performed several steps. First, we moved some of the codes in a certain theme to more appropriate themes. Second, the order of the themes was adjusted according to the number of codes belonging to each theme.
After each theme had been examined, the process of defining and naming the themes was carried out. Defining and naming each of these themes was done by reading all the codes in the theme repeatedly and matching them back with the data that were paired with the codes. The process of defining and naming the themes resulted in seven names, i.e., (1) instructor's orchestration; (2) assignment and internet constraints; (3) the learning process; (4) the flexibility of online distance learning; (5) interaction among course participants; (6) experiences in the synchronous learning environment; and (7) self- and co-regulation.
The resulting seven themes will be easier to understand in visual representation, such as a graph. By using Gephi, the seven themes along with their member codes are represented as a network graph, as shown in Figure 4. In the graph, each theme is expressed in different colors. The size of the nodes in the graph indicates their weighted degree. Finally, the thickness of the edges represents their weight.

Figure 4. Network Graph of Each Theme
The theme-generating procedure has been described in detail using the data on students' perceptions of online distance learning. The final step of the thematic analysis is to report on the themes that have been constructed so that they can be communicated clearly and logically to the stakeholders (Kiger & Varpio, 2020). Next, we report on the themes found through the strategy described previously.
The first theme is the instructor's orchestration. Students appreciate the instructor's work in designing and managing learning activities. Learning activities and content, such as asynchronous videos, modules, and textbooks, are perceived to help students explore and understand the topics and to help them obtain knowledge and skills that are important for the future. Even so, students still find room for improvement in the instructor's efforts to design lessons and content so that they are easier to understand.
The second theme is assignments and internet constraints. Students feel that the time period to work on and submit assignments is very short. Furthermore, students revealed that internet constraints should be considered in determining the assignment deadlines. The student feelings that dominate this theme were panic, depression, and worry. In addition, students appreciate the instructors who tolerate delays in submitting assignments. Students hope that other instructors will also tolerate assignment submission delays due to internet constraints.
The third theme is the learning process. Related to this theme, several students gave positive responses to the instructor's organization. A number of students also appreciated the instructor's techniques in building affective expressions, for example by using humor in group chats. In addition to positive responses, students also gave negative responses to the instructor's organization, especially regarding the timeliness of the class period and the number of assignments they received. Besides, students also wish for feedback from their instructor so that they know the extent to which they have achieved the learning objectives that have been set.
The fourth theme is the flexibility of online distance learning. In online learning, students feel that they can study at a more flexible time and place and in more flexible conditions. Furthermore, they argue that preparation for synchronous learning is also more flexible than for face-to-face learning, especially for morning lectures. The dominant student feelings in this theme are relaxed and happy. Some students admit that they enjoy learning online because they can be together with their families and do other activities while being in class at the same time, for example listening to music. Nonetheless, a number of students stated that they experienced irregular sleeping hours during the online distance learning period. In addition, according to students, the flexibility of online distance learning poses its own challenges for staying focused on learning activities.
The fifth theme is the interaction among course participants. Students give a positive assessment of the instructors' facilitation of their interaction with other learning participants (the instructor or other students). They feel that their thoughts are appreciated when they have the opportunity to share their thoughts and ideas with other participants and get feedback on them. Students also give positive responses to interactions that occur outside of synchronous learning hours. They appreciated the instructors who replied to their questions outside the synchronous classroom in a timely manner. In addition, students consider that interactions among students outside the synchronous classroom helped them learn. Besides, students perceive that they are more active in discussions in online distance learning. Regarding this theme, students also gave some suggestions related to group chat organization and the accessibility of the instructors.
The sixth theme is the experience in synchronous learning. Students give a positive assessment of the efforts of their instructors in providing the synchronous learning environment. In this learning environment, students can meet directly with instructors and friends. Students appreciate how the instructors develop a sense of concern by greeting them and asking how they are doing during synchronous class sessions. A number of students also gave a positive assessment of the video conferencing conducted in oral exams and teaching practicum.
The last theme is self-regulation and co-regulation. Students admit that distance learning provides a challenge for them to be responsible for their own learning. They should be able to manage their time effectively in order to get the most out of online distance learning. They also recognize that communication and collaboration skills are needed to be successful in an online learning environment. Furthermore, they also realized that they should be able to maintain their motivation in exploring learning content.

Discussion
This article has demonstrated a strategy that applies network analysis to support rapid, transparent, and rigorous theme generation. The strategy begins with determining the unit of analysis and coding, followed by constructing a code co-occurrence matrix. This matrix is then used to conduct network analysis. The network metrics are then determined to formulate the network's characteristics and detect the communities within it. These communities are then interpreted as the resulting themes. The procedure has been demonstrated to be successful in generating themes from the data on the students' perceptions of their learning experience during online distance learning. Furthermore, the procedure was also effective in generating themes from qualitative data in another study (Kristanto & Santoso, 2020).
The theme-generating strategy demonstrated in the present study can save researchers time in analyzing qualitative data. The process of generating themes, which is usually done manually through several complicated steps, is shortened by mathematical algorithms through network analysis. Automatic processes like this are important for qualitative researchers who have limited time to complete their research (Burgess-Allen & Owen-Smith, 2010). Even for researchers who have more time, this strategy is useful because it gives them more time to focus on interpreting the data in order to produce valid and reliable findings (Rose & Johnson, 2020).
The second advantage of the theme generation strategy in this study addresses the need for transparency in qualitative research (Aguinis & Solarino, 2019; Hannes & Macaitis, 2012). The use of quantitative measures and visualization through network analysis in this strategy provides an opportunity for other researchers to replicate such an analytical process and come up with similar conclusions. This transparency is important for consumers of research who are not only curious about the results of the research but also want to know about the process (Tuval-Mashiach, 2017).
Third, the strategy we have proposed has the potential to increase the rigor of thematic analysis. This is because the strategy consists of systematic phases and is underpinned by a mathematical algorithm. The mathematical algorithm used in this strategy has been widely used in research methodology (Frandsen, 2017). Thus, this strategy can minimize the subjectivity and bias introduced by researchers when generating themes from the data (Galdas, 2017).
Even though the strategy we described here has its own merits, we also wish for readers to be cautious of its limitations. The strategy assumes that the codes paired within the same unit of analysis are related to one another. This relationship is then employed to generate a network in order to detect the themes within the data. Therefore, the phase of choosing the unit of analysis and the initial stage of coding are extremely important. A researcher must choose a unit of analysis that contains one idea so that the codes assigned to it are connected in a logical way (Jamieson, 2016). Furthermore, readers should bear in mind that thematic analysis is an iterative and reflective process (Nowell et al., 2017). Consequently, in employing the present computer-assisted strategy to generate themes, a researcher needs to check and review the resulting themes. This is important to ensure that the themes are a valid representation of the overall pattern of the data and in line with existing well-established theories (Bringer et al., 2006; John & Johnson, 2000; A. King, 2010).