Gene Regulatory Networks in Halobacterium

Project By: Akshata Aravind 2020 ISB CMWG

The modeling software used in this guided activity is Cytoscape. The Cytoscape application can be downloaded from here.

Project Overview:

This guided activity provides background on modeling complex regulatory networks, along with a worked example of a research application. In this interactive project, we will build a simplified version of a Halobacterium salinarum gene regulatory network, using the Cytoscape application to visually model data from a research paper. By identifying and using published data and information, we will walk you through how to create a base network that includes a variety of elements (regulatory proteins, environmental factors, activators, and inhibitors).

What is Halobacterium salinarum?

Halobacterium salinarum is an extremely halophilic archaeon that can produce energy in several different ways, which makes it useful for studying the gene regulatory networks behind energy transduction systems. Because of this diversity, scientists use Halobacterium as a model organism, analyzing DNA microarray and proteomic data to identify the factors that regulate its energy pathways.

The overall goal of this tutorial is to show how to recognize qualitative connections between elements and how to use these observations to create a visual model of a network.

Background Information

What are the key characteristics of Halobacterium salinarum that will be addressed and used in this project?

Halobacterium salinarum is capable of growing in both aerobic and anaerobic conditions. This flexibility depends on complex gene regulatory networks and pathways, in which various elements work together to maintain homeostasis and stable energy production. Some examples of biological pathways include isoprenoid synthesis, carotenoid synthesis, and bacteriorhodopsin assembly; these are processes that occur in Halobacterium to produce certain proteins, enzymes, and pigments for energy production. Here are descriptions of the two main ways in which Halobacterium produces energy:

Aerobic energy production: energy from organic compounds

Anaerobic energy production/phototrophy: energy from light, captured by the purple membrane

The purple membrane is made up of multiple copies of bacteriorhodopsin, a complex of the membrane protein bacterioopsin (Bop) and retinal that functions as a light-driven proton pump. Bat, an integral transcription regulator in Halobacterium salinarum, regulates (directly and indirectly) a number of critical genes that encode or influence the pathways behind purple membrane biogenesis, thus regulating anaerobic phototrophy/energy production.

Halobacterium salinarum coordinates the regulation of several intertwined biochemical pathways, including isoprenoid synthesis, carotenoid synthesis, and bacteriorhodopsin assembly, as a means to control energy production.

Let us now begin analyzing and identifying key data elements and connections between certain interactions in Halobacterium salinarum by reading through this research paper on the Coordinate regulation of energy transduction modules in Halobacterium. While reading through this dense research paper, be sure to write notes on any connections between transcription factors, transcriptional regulators, enzymes, and genes. There is a lot of information to go through, so be sure to use the given Figures (Figure 4 and Figure 5) to help make connections and identify interactions.

Background on the Research Paper

The research paper that we will refer to throughout this activity was produced at the Institute for Systems Biology and published in the Proceedings of the National Academy of Sciences. Its authors are: Nitin S. Baliga, Min Pan, Young Ah Goo, Eugene C. Yi, David R. Goodlett, Krassen Dimitrov, Paul Shannon, Ruedi Aebersold, Wailap Victor Ng, and Leroy Hood.

This research paper on the Coordinate regulation of energy transduction modules in Halobacterium delves into the many interconnected biochemical pathways (aerobic and anaerobic) conducted in different strains of Halobacterium (overactive as well as knockout strains) to observe the effects of the bat gene. However, for the purposes of developing a simple Cytoscape model of the gene regulatory networks in Halobacterium, this interactive project will not delve into the intricacies and methods used in this research paper; this project aims to simply use the results of this research paper (and the discovered proteins that interact with each other in Halobacterium networks) to illustrate an overall, generalized network system model.

After reading through this research paper on transcriptional networks in Halobacterium salinarum, create a list of any connections you could identify.

It may be challenging to come up with a network from scratch; however, here is a process that may help simplify this task:

1. Identify which nodes are going to be shown in your network.

2. Identify the interactions between the nodes.

For instance, this research paper says that the gene bat is responsible for encoding the transcriptional regulator, Bat. Another example would be that the gene bop encodes the membrane protein, Bop.

It would be best to create a simple "three-column" table representation of the interactions between identified elements. We will then use this "three column" format to create a usable form of data for Cytoscape.

To the left is an example of how to organize and write down any interactions between genes, transcription factors, enzymes, etc.

Remember to create a key to be able to color-code the functions/types of the elements in the table of interactions. We will use this information to incorporate into our Cytoscape model.
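As a minimal sketch (not part of the original activity), the "three column" table and its key can also be jotted down in Python before moving on to Cytoscape. Only the two interactions quoted above (bat encodes Bat, bop encodes Bop) come from the text; the variable names, the function names, and the type labels in the key are illustrative assumptions.

```python
# "Three column" table: (source, interaction, target).
# Only these two rows come from the examples in the text;
# add the rest from your own reading of the paper.
interactions = [
    ("bat", "encodes", "Bat"),
    ("bop", "encodes", "Bop"),
]

# Key for color-coding: element name -> element type.
# The type labels are assumptions chosen for this sketch.
element_types = {
    "bat": "gene",
    "bop": "gene",
    "Bat": "transcriptional_regulator",
    "Bop": "membrane_protein",
}


def list_nodes(rows):
    """Step 1: collect every element that appears as a source or a target."""
    nodes = set()
    for source, _interaction, target in rows:
        nodes.add(source)
        nodes.add(target)
    return sorted(nodes)


def untyped_nodes(rows, types):
    """Return the nodes that still need an entry in the key."""
    return [node for node in list_nodes(rows) if node not in types]


if __name__ == "__main__":
    for source, interaction, target in interactions:
        print(f"{source:<6}{interaction:<10}{target}")
    print("Nodes:", list_nodes(interactions))
    print("Missing from key:", untyped_nodes(interactions, element_types))
```

Keeping the key next to the table makes it easy to spot an element that appears in an interaction but has not yet been given a type.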

Creating a ".sif" (Simple Interaction File)

Using the "three column" table scheme, we can now create a file with the data on these interactions to use in Cytoscape. The first step is to write the "three column" table in a simple text file, one interaction per line, separating the three entries on each line in a consistent way (for example, with a single tab). After creating the file, save it using the ".sif" (Simple Interaction File) extension. This file can then be imported into Cytoscape to display a network.

Be careful when you are typing up your table. Spelling is important: Cytoscape treats each distinct spelling as a separate node, so a typo will show up as an extra node in your network.

Also be cautious with spaces inside names; if a name contains spaces, it can be split into separate entries and break the "three column" scheme.
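If you would rather generate the file than type it by hand, here is a minimal Python sketch of one way to do it. It assumes a tab between the three entries and, following the caution about spaces, conservatively rejects any name that contains whitespace. The function names and the example file name are hypothetical.

```python
def to_sif_text(interactions):
    """Render (source, interaction, target) rows as tab-separated lines."""
    lines = []
    for row in interactions:
        if len(row) != 3:
            raise ValueError(f"expected 3 entries, got {len(row)}: {row!r}")
        for field in row:
            # Conservative check: refuse empty names and names with whitespace.
            if not field or any(ch.isspace() for ch in field):
                raise ValueError(f"empty name or whitespace in {field!r}")
        lines.append("\t".join(row))
    return "".join(line + "\n" for line in lines)


def write_sif(interactions, path):
    """Save the interactions to a text file; the path should end in .sif."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(to_sif_text(interactions))


if __name__ == "__main__":
    # The two example interactions quoted earlier in this activity.
    example = [("bat", "encodes", "Bat"), ("bop", "encodes", "Bop")]
    print(to_sif_text(example), end="")
```

Calling write_sif(example, "halobacterium_network.sif") would then save the same two lines to a file that can be imported in the next part of the activity.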

Creating a table to label the elements of your network

Using a similar format, create a "two column" table that labels the key elements of your network (for example, the element name in one column and its type in the other) and save it as an Excel file.
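The activity saves this table as an Excel file. As a hedged alternative, the sketch below writes the same "two column" table as comma-separated text using only Python's standard library, on the assumption that a delimited text table is acceptable for your import. The column headers "name" and "type" and the type labels are assumptions made for this sketch.

```python
import csv
import io


def node_table_csv(element_types, key_column="name", value_column="type"):
    """Render a two-column table (element name, element type) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([key_column, value_column])  # header row
    for name in sorted(element_types):
        writer.writerow([name, element_types[name]])
    return buffer.getvalue()


if __name__ == "__main__":
    # Type labels are illustrative; use the categories from your own key.
    types = {
        "bat": "gene",
        "bop": "gene",
        "Bat": "transcriptional_regulator",
        "Bop": "membrane_protein",
    }
    print(node_table_csv(types), end="")
```

The names in the first column need to match the names used in the ".sif" file exactly, since that is how each label is matched to a node.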

Now that you have researched the background information and have created a list of interactions between elements in Halobacterium salinarum, let's translate this written data into a visual network model using Cytoscape. Make sure to have Cytoscape downloaded!

After opening the Cytoscape application, your screen should look like this:

After opening Cytoscape, make sure to install these apps:

  • Color Cast
    Apps -> App Manager -> Search: Color Cast -> Select listing -> Click on Install button

  • yFiles Layout Algorithms
    Apps -> App Manager -> Search: yFiles -> Select listing -> Click on Install button

Plotting the gene regulatory network and interaction network in Cytoscape

Import the interaction file (the ".sif" file you had saved earlier with the "three column" scheme).

File -> Import -> Network from File

Import the elements' information table (the "two column" scheme file you had saved earlier).

File -> Import -> Table from File -> To selected networks only -> Click on the OK button
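Before importing, it can help to confirm that the names in the two files agree, because a label only attaches to a node whose name matches exactly. The following is a minimal sketch of such a check, not a Cytoscape feature. It assumes the ".sif" text uses tabs if any tab is present and spaces otherwise, and the function names are hypothetical.

```python
def sif_nodes(sif_text):
    """Collect node names from SIF text: column 1, plus columns 3 and onward."""
    # Use tabs for the whole file if any tab is present, else any whitespace.
    delimiter = "\t" if "\t" in sif_text else None
    nodes = set()
    for line in sif_text.splitlines():
        fields = [f.strip() for f in line.split(delimiter) if f.strip()]
        if not fields:
            continue  # skip blank lines
        nodes.add(fields[0])
        nodes.update(fields[2:])  # fields[1] is the interaction type
    return nodes


def compare_names(sif_text, table_names):
    """Return (nodes without a table row, table rows without a node)."""
    network = sif_nodes(sif_text)
    table = set(table_names)
    return sorted(network - table), sorted(table - network)


if __name__ == "__main__":
    sif_text = "bat\tencodes\tBat\nbop\tencodes\tBop\n"
    # "bob" is a deliberate typo to show what the check reports.
    unlabeled, unused = compare_names(sif_text, ["bat", "bop", "Bat", "bob"])
    print("In network but not in table:", unlabeled)
    print("In table but not in network:", unused)
```

Both lists should be empty before you import; anything reported here usually points to a typo in one of the two files.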

Let's now use Cytoscape to edit and design the display of our network!

Click on the Style tab to change the style of the network. In this picture, I have selected the Sample3 style; however, there are many different options to choose from.

Change the shape of nodes according to the functional class of proteins. Diamonds will represent the genes. Rectangles will represent proteins. Triangles will represent enzymes. And so on. Scroll down to the Shape row, and select the button in the Mapping column, as circled in the picture. In the Column section, select the table attribute on which the shape will be based.

In the "Mapping Type" section, make sure to select Discrete Mapping.

After selecting Discrete Mapping, scroll through to find the attribute value to which you want to assign a shape. For instance, diamonds can represent the genes, and rectangles can represent proteins.
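Conceptually, a discrete mapping is a lookup table: each distinct value in the chosen column is paired with one visual property. The sketch below illustrates that idea in Python using the shapes named in this activity. It does not talk to Cytoscape; the shape labels, the fallback shape, and the type names are assumptions made for illustration.

```python
# Discrete mapping: one shape per distinct value in the chosen column.
SHAPE_BY_TYPE = {
    "gene": "DIAMOND",
    "protein": "RECTANGLE",
    "enzyme": "TRIANGLE",
}
DEFAULT_SHAPE = "ELLIPSE"  # assumed fallback for values with no entry


def shape_for(element_type):
    """Look up the shape for a type, falling back to the default shape."""
    return SHAPE_BY_TYPE.get(element_type, DEFAULT_SHAPE)


def shapes_for_nodes(element_types):
    """Apply the mapping to every node in a name -> type table."""
    return {name: shape_for(kind) for name, kind in element_types.items()}


if __name__ == "__main__":
    nodes = {"bat": "gene", "Bat": "protein", "retinal": "pigment"}
    for name, shape in shapes_for_nodes(nodes).items():
        print(f"{name}: {shape}")
```

Any value without an entry simply keeps the default, which is why every category you care about needs its own row in the mapping.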

I now want to emphasize the connections between the regulatory genes and the rest of the network. To do this, we can give the gene nodes a thicker border. Going to Border Width and selecting the Mapping icon, I again choose the column I want to pick my attribute from and select the Discrete Mapping type. As shown in this picture, I have set the border width to 2 for the genes.

To have the labels of the nodes fit within each shape and to have the nodes be more distinct, we should change the size of all the nodes in our network. Scroll down to the Size section, and click on Default Column (not the Mapping column like the previous times). I have set the size to 40.0.

Now, let's move on to the edges. To make the edges and connections between all of the elements more visible, we should increase the thickness of our connecting edges. Switch to the Edge tab down below, and click on Edge Width. I have changed my Edge Width to 4.0.

You can also change the shape of the edges. As you may have noticed, we don't have arrows in our edges to direct the flow of the interactions. To make arrows, switch to the Edge Tab down below (if you have not already done so), and select a Target Arrow Shape and Color in the Default Column.

For the last step, let's choose a better way to have our data visually organized. If you go to the top and select the Layout menu, you will find a drop-down list of layout choices. Here are some different ways to display the network data:

Here is our final visual representation of the complex regulatory interactions in Halobacterium salinarum!

To get the image of your network, go to File -> Export -> Network to Image

Congratulations on completing this activity!

Project Analysis

What conclusions can we draw from this model?

With this Cytoscape model, it is simpler to visually characterize the interconnected nature of the complex networks at play in Halobacterium. The model gives a visual representation of the transcriptional proteins that play a significant, influential role in the energy production processes in Halobacterium; by showing how many connecting edges each element has, it helps us identify the most impactful parts of a network (what controls and regulates the entire system).

How does this model help our understanding of Halobacterium and gene regulatory networks?

This model supports our understanding of the complex gene regulatory networks of Halobacterium, and of systems in general, by portraying the significant concepts of systems networks. A key concept that is integral throughout systems biology is the idea of a "ripple effect" throughout networks; every aspect of a system is connected to another pathway, and so on. Affecting a single element in a network can have drastic effects on the system as a whole, creating a ripple effect, which matters even more in increasingly complex network systems.

In even more complex networks, it can be hard to identify a single issue, which is why creating a visual model to display the edges and connections between aspects of a system is significant in identifying problems and therefore spearheading solutions. The creation of a model is a first step in formulating an overall picture of a network scenario, and the model keeps evolving to provide different perspectives on the same system.
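The "ripple effect" can be made concrete with a small sketch: starting from one element, follow the edges outward and collect everything that lies downstream. The Python below does this with a breadth-first search over a "three column" list. The two "encodes" rows come from this activity; the "Bat regulates bop" row is an assumption added only to make the chain longer, and the function name is hypothetical.

```python
from collections import deque


def downstream(interactions, start):
    """Return every node reachable from start by following source -> target."""
    targets = {}
    for source, _interaction, target in interactions:
        targets.setdefault(source, []).append(target)

    reached = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in targets.get(node, []):
            if nxt not in reached and nxt != start:
                reached.add(nxt)
                queue.append(nxt)
    return reached


if __name__ == "__main__":
    example = [
        ("bat", "encodes", "Bat"),
        ("Bat", "regulates", "bop"),  # assumed edge, for illustration only
        ("bop", "encodes", "Bop"),
    ]
    print("Downstream of bat:", sorted(downstream(example, "bat")))
    print("Downstream of bop:", sorted(downstream(example, "bop")))
```

In this toy chain, a change to bat can reach three other elements, while a change to bop reaches only one, which is the kind of difference the visual model lets us see at a glance.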

What insight does the model illustrate that the scientific research paper does not?

The Cytoscape model of system networks in energy production in Halobacterium displays the interconnections between the various transcriptional regulators, pigments, enzymes, and genes, in a way that simple data tables and words cannot; by visually displaying the model, certain influential proteins stand out amongst others (such as the ones highlighted in the picture above).

In the case of Halobacterium, for instance, both the research paper and our model demonstrate the significant role the Bat protein plays in energy production processes. From the research paper alone, we can observe that the Bat protein affects almost all of the biochemical pathways in Halobacterium. However, through the Cytoscape model, we can identify the specific elements that the Bat protein regulates and connects to; this is the power of computational modeling. By selecting particular edges in the model, we can see connections between pathways that aren't fully represented in data tables or words. In the final model specifically, there is an abundance of edges all meeting at the Bat protein node (highlighted in yellow), which visually depicts the impact of the Bat protein in this system of energy production in Halobacterium.
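The observation that many edges meet at one node can also be checked by counting. As a minimal sketch, the Python below counts how many edges touch each node in a "three column" list. The network here is a toy: the "encodes" rows come from this activity, while the "regulates" rows and the names target_gene_1 and target_gene_2 are placeholders, not results from the paper.

```python
from collections import Counter


def edge_counts(interactions):
    """Count how many edges touch each node (as a source or as a target)."""
    counts = Counter()
    for source, _interaction, target in interactions:
        counts[source] += 1
        counts[target] += 1
    return counts


if __name__ == "__main__":
    toy_network = [
        ("bat", "encodes", "Bat"),
        ("bop", "encodes", "Bop"),
        # Placeholder rows, for illustration only:
        ("Bat", "regulates", "bop"),
        ("Bat", "regulates", "target_gene_1"),
        ("Bat", "regulates", "target_gene_2"),
    ]
    for name, count in edge_counts(toy_network).most_common():
        print(f"{name}: {count}")
```

Running the same count on your own ".sif" data gives a quick numerical companion to what the layout shows visually.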


1. Halobacterium salinarum - Group A: Prokaryotic Diversity. [accessed 2020 Sept 12].

2. Baliga NS, Pan M, Goo YA, Yi EC, Goodlett DR, Dimitrov K, Shannon P, Aebersold R, Ng WV, Hood L. Coordinate regulation of energy transduction modules in Halobacterium sp. analyzed by a global systems approach. Proceedings of the National Academy of Sciences of the United States of America. 2002 [accessed 2020 Sept 09];99(23):14913–14918.



The content of these pages was created by students for students with the help of educators and scientists. The views expressed herein are those of the authors and do not necessarily reflect the views of NSF or ISB.