Goal: make a nice network visualization of connections between developmental disorder-associated proteins and explore the network

1. We start by loading the network into Cytoscape. Open Cytoscape and go to File -> Import -> Network from File. Select the file STRING_interactions_Kaplanis_285.tsv and click Open. You should see a table-like view with all the columns from the input file. Every column header starts with a small icon showing how Cytoscape has interpreted that column. The green and red circles mark the two columns that Cytoscape will use as the two ends of each edge when drawing the network; the other columns, marked with a sheet icon, are treated as attributes of the interactions. Make sure the columns node1 and node2 have the green and red circle icons; if not, click on the header to change it. When you are done, click OK.
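
To make the idea behind the green and red circles concrete, here is a minimal Python sketch of what the import conceptually does: node1 and node2 become the two ends of each edge, and every other column becomes an edge attribute. The gene names and values are made up for illustration, and read_network is a hypothetical helper, not a Cytoscape function.

```python
import csv
import io

# Made-up excerpt in the shape of the input file. The real file has more
# rows and possibly more columns; these gene names and values are invented.
SAMPLE_TSV = (
    "node1\tnode2\tdatabase_annotated\tcombined_score\n"
    "GENE_A\tGENE_B\t0.9\t0.95\n"
    "GENE_A\tGENE_C\t0\t0.62\n"
    "GENE_B\tGENE_D\t0\t0.85\n"
)


def read_network(handle, source_col="node1", target_col="node2"):
    """Split a tab-separated table into a node set and an edge list.

    The two chosen columns define the ends of each edge; all remaining
    columns are kept as a dictionary of edge attributes.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    nodes = set()
    edges = []
    for row in reader:
        source = row.pop(source_col)
        target = row.pop(target_col)
        nodes.update((source, target))
        edges.append((source, target, row))  # row now holds only attributes
    return nodes, edges


nodes, edges = read_network(io.StringIO(SAMPLE_TSV))
print(len(nodes), "nodes and", len(edges), "edges")
```

If the wrong columns were marked as the edge ends, the same rows would produce a different set of nodes, which is why it is worth checking the icons before clicking OK.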

2. Familiarize yourself with the view of the network. Zoom in, select individual nodes and edges, move them around, etc. Explore how your node and edge selection changes what is displayed in the Node Table and Edge Table below, compared to when no nodes or edges are selected. You can also select nodes or edges by clicking on rows in the respective Node or Edge Table.

3. The network has too many edges to really see anything. One way to make the network easier to explore is to remove less confident edges or edges with weaker evidence. Use the Filter pane (at the very left of the Cytoscape window) and figure out how to select all edges that DO NOT have evidence from "database_annotated". In STRING, interactions with database_annotated evidence are those imported from high-throughput interactome studies or from other protein interaction databases that manually curate interactions from publications, so they likely correspond to actual biophysical interactions. Once you think you have correctly selected all the interactions you want to remove, click on Edit (at the top of the Cytoscape window) and then "Remove selected nodes and edges" to delete the selected edges from the network. If the network still has too many edges to explore comfortably, you can keep removing edges with the Filter pane, this time using the "combined_score" column, e.g. by removing all edges with a combined score < 0.8.
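
The selection logic of step 3 can be sketched in a few lines of Python. This is only an illustration of the two filter conditions, not how Cytoscape implements its Filter pane. The edges are made up, select_edges_to_remove is a hypothetical helper, and the sketch assumes that a database_annotated value of 0 means "no evidence from this channel"; check how the value is encoded in your Edge Table.

```python
# Made-up edges. ASSUMPTION: database_annotated == 0 means the edge has no
# evidence from curated databases; the real encoding may differ.
edges = [
    {"node1": "GENE_A", "node2": "GENE_B", "database_annotated": 0.9, "combined_score": 0.95},
    {"node1": "GENE_A", "node2": "GENE_C", "database_annotated": 0.0, "combined_score": 0.90},
    {"node1": "GENE_B", "node2": "GENE_D", "database_annotated": 0.5, "combined_score": 0.62},
    {"node1": "GENE_C", "node2": "GENE_D", "database_annotated": 0.7, "combined_score": 0.81},
]


def select_edges_to_remove(edges, min_combined_score=None):
    """Return the edges that the filter would select for deletion.

    An edge is selected if it lacks database evidence or, when a threshold
    is given, if its combined score is below that threshold.
    """
    selected = []
    for edge in edges:
        lacks_database_evidence = edge["database_annotated"] == 0
        too_weak = (
            min_combined_score is not None
            and edge["combined_score"] < min_combined_score
        )
        if lacks_database_evidence or too_weak:
            selected.append(edge)
    return selected


# First pass: only the evidence filter. Second pass: add the score cutoff.
first_pass = select_edges_to_remove(edges)
second_pass = select_edges_to_remove(edges, min_combined_score=0.8)
remaining = [edge for edge in edges if edge not in second_pass]
print(len(first_pass), len(second_pass), len(remaining))
```

Note that the selection is the set of edges to delete, so the conditions are the negation of what you want to keep.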

4. Now that we have removed quite a few edges, let's find a nice layout for our network. Play with the options under the "Layout" menu (at the top of the Cytoscape window). My preferred layout is the prefuse force-directed one, but try several of them to see what they do; maybe you will prefer another one.

5. I would like to explore how the 28 new developmental disorder (DD)-associated proteins are potentially connected to previously identified DD proteins. For this, we first need to know which of the 285 proteins in the network correspond to a new DD gene. Cytoscape allows you to import additional attributes for nodes and edges, which we can then use for visualization. Let's try this. You can import node or edge attributes via File -> Import -> Table from File. Select the file Kaplanis_28_genes_annotated.txt and click Open. You will again see a table that shows the content of the selected file and how Cytoscape interprets each of its columns. To assign this new information to the right node in your network, Cytoscape needs to know which column contains the node identifiers. Cytoscape makes a first guess and puts a key icon in the header of the column that it thinks contains the node identifiers. Our nodes are identified by gene names, so make sure the key icon is in the header of the column that contains the gene names and not the Uniprot ACs. You can then click OK. To check whether Cytoscape has added the columns of this file correctly to the existing Node Table, take a look at the Node Table and scroll to the very right. Clicking on the column header "Kaplanis_gene_new" sorts the table by that column. Click until the rows with a "y" in this column are shown first. There should be 28 rows with a "y" in this column.
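
Why the key column matters can be shown with a small Python sketch of the table import as a key-based join. The node names, Uniprot ACs, and the column names gene_name and uniprot_ac are made up for illustration (only Kaplanis_gene_new comes from the exercise), and import_node_attributes is a hypothetical helper.

```python
import copy

# Node Table of a tiny made-up network, keyed by gene name.
node_table = {
    "GENE_A": {},
    "GENE_B": {},
    "GENE_C": {},
}

# Made-up rows of the attribute file. The column names gene_name and
# uniprot_ac are hypothetical; Kaplanis_gene_new is the column from step 5.
attribute_rows = [
    {"gene_name": "GENE_A", "uniprot_ac": "P00001", "Kaplanis_gene_new": "y"},
    {"gene_name": "GENE_C", "uniprot_ac": "P00003", "Kaplanis_gene_new": "y"},
]


def import_node_attributes(table, rows, key_column):
    """Copy each row's columns onto the node whose name equals the key value.

    Returns a new table and the number of rows that found a matching node.
    """
    result = copy.deepcopy(table)
    matched = 0
    for row in rows:
        node = result.get(row[key_column])
        if node is None:
            continue  # no node with this identifier, so the row is ignored
        node.update({k: v for k, v in row.items() if k != key_column})
        matched += 1
    return result, matched


good_table, good_matches = import_node_attributes(node_table, attribute_rows, "gene_name")
bad_table, bad_matches = import_node_attributes(node_table, attribute_rows, "uniprot_ac")

new_genes = sorted(
    name for name, attrs in good_table.items()
    if attrs.get("Kaplanis_gene_new") == "y"
)
print(good_matches, bad_matches, new_genes)
```

With the gene name column as key, both rows find their node; with the Uniprot AC column as key, nothing matches and no "y" would appear in the Node Table.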

6. We will now learn how to map different node and edge attributes onto the network using the Style pane at the left side of the Cytoscape window. Please note that the Style pane has multiple tabs; the two important ones for us today are called Node and Edge. The Node tab controls how nodes are shown in the network and the Edge tab does the same for edges. Let's try to figure out how to give the nodes that correspond to the 28 new DD genes one color and all other nodes a different color. Find the style option that controls the node color (make sure you are in the Node tab), select the right column (Kaplanis_gene_new), choose "Discrete Mapping" as the Mapping type, and choose a color for the "y". Figure out how to change the default color of the nodes as well. Next, try to change the shape of all nodes or, if you like, choose different shapes for the new and the previously known DD genes. If the node labels (the gene names) are too small, increase their font size.
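
Conceptually, a discrete mapping is a lookup table from attribute values to visual properties, with the default used for every value that has no entry. A minimal Python sketch, with made-up node names and arbitrarily chosen colors (discrete_mapping is a hypothetical helper, not a Cytoscape function):

```python
def discrete_mapping(value, mapping, default):
    """Return the visual property for a value, or the default if unmapped."""
    return mapping.get(value, default)


# Value of the Kaplanis_gene_new column per node; None means the cell is empty.
kaplanis_gene_new = {"GENE_A": "y", "GENE_B": None, "GENE_C": "y"}

NEW_GENE_COLORS = {"y": "#E31A1C"}  # arbitrary red for new DD genes
DEFAULT_COLOR = "#A6CEE3"           # arbitrary light blue for all other nodes

node_colors = {
    node: discrete_mapping(value, NEW_GENE_COLORS, DEFAULT_COLOR)
    for node, value in kaplanis_gene_new.items()
}
print(node_colors)
```

This is why you only need to pick a color for "y": nodes without a value in that column simply fall back to the default color.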

7. Now, let's also visualize some edge attributes. Switch to the Edge tab of the Style pane and try to figure out how to vary the thickness of the edges according to the "combined_score" column, e.g. so that more confident edges (higher score) are drawn thicker. Try to find settings that draw a "pretty" network.
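
For a numeric column such as combined_score, a continuous mapping interpolates between a minimum and a maximum visual value. The following Python sketch shows plain linear interpolation; the score range and the widths are arbitrary choices for illustration, continuous_mapping is a hypothetical helper, and Cytoscape's own mapping editor may behave differently in its details.

```python
def continuous_mapping(value, value_min, value_max, out_min, out_max):
    """Linearly map value from [value_min, value_max] to [out_min, out_max].

    Values outside the input range are clamped to the nearest end.
    """
    if value_max == value_min:
        return out_min  # degenerate range: every value gets the same output
    clamped = min(max(value, value_min), value_max)
    fraction = (clamped - value_min) / (value_max - value_min)
    return out_min + fraction * (out_max - out_min)


# Made-up combined scores that survived a cutoff of 0.8.
scores = [0.8, 0.9, 1.0]

# Arbitrary choice: edge widths between 1 and 5.
widths = [continuous_mapping(score, 0.8, 1.0, 1.0, 5.0) for score in scores]
print(widths)
```

Choosing the input range to match the scores that are actually left in the network (rather than 0 to 1) spreads the widths out more and usually gives a more readable picture.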

8. To further prioritize interesting proteins and interactions, I would like to know, and visualize on the network, how strongly each protein is expressed in brain tissue. To do this, repeat steps 5 and 6: load the gene expression data as additional node attributes using the file GTEx_tissue_expression_Kaplanis_genes_285.txt (make sure the column with the node identifiers is selected as key!) and try to visualize the strength of expression based on the "brain other" column, for example via the size of the node (more highly expressed -> larger node). You are encouraged to keep playing with the node and edge attributes and style options.

9. Once you've had enough fun playing with the style options to make a super stylish network, let's look at one last feature of Cytoscape. I would like to know whether the proteins in my network, or a group of tightly connected proteins within it, have specific functional enrichments. To investigate this, use the "Enrichment Table" (tab at the bottom of the Cytoscape window). You can compute enrichments here for all the nodes in the network or just for your selection. Click on the Filter icon to select the ontologies that you want to use for the enrichment analysis (e.g. Gene Ontology Biological Process and Reactome pathways) and tick the checkbox "Remove redundant terms". Select highly connected clusters of proteins in your network and try to understand whether they are known to exert a certain function.
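
To get a feeling for what "enrichment" means, here is a minimal Python sketch of a one-sided hypergeometric test, a common way to score over-representation of an annotation term in a selection. It is an illustration of the idea under that assumption and not necessarily the exact statistic or background that Cytoscape's Enrichment Table uses; all counts are made up.

```python
from math import comb


def hypergeometric_enrichment_p(population, annotated, selected, overlap):
    """P-value for seeing at least `overlap` annotated proteins in a selection.

    population: number of proteins in the background
    annotated:  number of background proteins that carry the term
    selected:   number of proteins in the selected cluster
    overlap:    number of selected proteins that carry the term
    """
    max_overlap = min(annotated, selected)
    # Count the ways to draw i annotated and (selected - i) other proteins,
    # summed over all i that are at least as extreme as the observed overlap.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favourable / comb(population, selected)


# Made-up example: background of 285 proteins, 30 of which carry the term;
# a cluster of 12 selected proteins contains 6 of them.
p_value = hypergeometric_enrichment_p(
    population=285, annotated=30, selected=12, overlap=6
)
print(f"p = {p_value:.3g}")
```

In practice many terms are tested at once, so enrichment tools typically also correct the p-values for multiple testing before reporting them.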

10. Let's save our network visualization. Under the File Menu you can save this Cytoscape session (try it) and you have different export options (explore the Network to Image and Network to File options).

11. If you still fancy exploring Cytoscape, you can check out the Tools menu, play more with the styles, or think of additional attributes that you would like to add to the network and visualize. Maybe you have an idea where to get this information from?