Introduction
Bipartite graph is a graph whose nodes can be devided into two disjoint sets and edges connects these two (U,V) independent sets. There are weighted and unweighted graph depending on weight of edges.
If one wants to create bipartite graph the first question is what kind of dataset available.
- a matrix that represents a bipartite graph: rows are first or primary vertex set, columns are second or secondary vertex set, cells are weights or 0/1, that is adjacency matrix
- two column edgelist: first vertex set in first column second vertex set in second column and rows represent connections; in weighted case there are recurring pairs of nodes
- three column edgelist (weighted graphs): first vertex set in first column second vertex set in second column and connection weights in third column
The second question is weigted or unweighted graph one need. There is many possibilities to wrangling data.
Direction parameter can be set with {igraph} if it is needed.
For use the exaples above you need to load {igraph} package.
library(igraph)
library(reshape2)
Graph from adjacency matrix
Suppose one have this matrix that represents a weighted matrix.
A <- matrix(c(2,4,0,0,1,2,1,0,0,0,4,5,0,0,1,2,0,0,6,2), byrow = T, nrow = 5, ncol = 4)
rownames(A) <- letters[c(1:nrow(A))]
colnames(A) <- LETTERS[c(1:ncol(A))]
print(A)
## A B C D
## a 2 4 0 0
## b 1 2 1 0
## c 0 0 4 5
## d 0 0 1 2
## e 0 0 6 2
A bipratite graph can be create with {igraph} from A adjacency matrix.
g <- graph.incidence(A, weighted = T)
g
## IGRAPH UNWB 9 11 --
## + attr: type (v/l), name (v/c), weight (e/n)
## + edges (vertex names):
## [1] a--A a--B b--A b--B b--C c--C c--D d--C d--D e--C e--D
Parameters of created graph can be seen. This graph is UN-B which mean that this graph is U = undirected, N = named, W = weighted, B = bipartite, 9 = # of vertex, 11 = # of edges, and there are connections.
This function automatically set names and types of nodes. You can check them.
V(g)$type
## [1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE
V(g)$name
## [1] "a" "b" "c" "d" "e" "A" "B" "C" "D"
FALSE type means primary and TRUE means secondary vertex set.
Graph from two column edge list
The egdelist of graph above defined is the follow. Rows represent connections thats why some row repeated according to weights.
## from to
## 1 a A
## 2 a A
## 3 a B
## 4 a B
## 5 a B
## 6 a B
There are many possibility to create bipartite graph from these I show two. You can choose the simplier…
First step is create an empty graph with directed=F parameter to get the same graph as above. Then create nodes with TRUE and FALSE type parameter and then add edges.
g <- graph.empty(directed = F)
node.out <- unique(edges$from) #stringsAsFactor = F in data frame
node.in <- unique(edges$to) #stringsAsFactor = F in data frame
g <- add.vertices(g,nv=length(node.out),attr=list(name=node.out),type=rep(FALSE,length(node.out)))
g <- add.vertices(g,nv=length(node.in),attr=list(name=node.in),type=rep(TRUE,length(node.in)))
edge.list.vec <- as.vector(t(as.matrix(data.frame(edges))))
g <- add.edges(g,edge.list.vec)
g
## IGRAPH UN-B 9 30 --
## + attr: type (v/l), name (v/c)
## + edges (vertex names):
## [1] a--A a--A a--B a--B a--B a--B b--A b--B b--B b--C c--C c--C c--C c--C
## [15] c--D c--D c--D c--D c--D d--C d--D d--D e--C e--C e--C e--C e--C e--C
## [29] e--D e--D
The other possibility to create graph from data frame and then add nodes type
g <- graph.data.frame(edges, directed = F)
V(g)$type <- V(g)$name %in% edges[,2] #the second column of edges is TRUE type
g
## IGRAPH UN-B 9 30 --
## + attr: name (v/c), type (v/l)
## + edges (vertex names):
## [1] a--A a--A a--B a--B a--B a--B b--A b--B b--B b--C c--C c--C c--C c--C
## [15] c--D c--D c--D c--D c--D d--C d--D d--D e--C e--C e--C e--C e--C e--C
## [29] e--D e--D
You can see that these graphs not real weighted according to igraph’s parameters but them have some repeated edges. But if you get adjacency matrix from graph you get the same weighted adjacency matrix as above. And other graph operations are work same as real weighted graphs.
get.incidence(g)
## A B C D
## a 2 4 0 0
## b 1 2 1 0
## c 0 0 4 5
## d 0 0 1 2
## e 0 0 6 2
Graph from three column edge list
In this case the edge list is similar then two column case but weights are in the third column.
## Var1 Var2 value
## 1 a A 2
## 2 b A 1
## 6 a B 4
## 7 b B 2
## 12 b C 1
## 13 c C 4
The process is the same as two column type but one step more because need to add weights.
g <- graph.data.frame(edges, directed = F)
V(g)$type <- V(g)$name %in% edges[,2] #the second column of edges is TRUE type
E(g)$weight <- as.numeric(edges[,3])
g
## IGRAPH UNWB 9 11 --
## + attr: name (v/c), type (v/l), value (e/n), weight (e/n)
## + edges (vertex names):
## [1] a--A b--A a--B b--B b--C c--C d--C e--C c--D d--D e--D
OR if the edge data frame weight column has name exactly “weight” then {igraph} automatically read it as weights of edges.
names(edges)[3] <- "weight"
g <- graph.data.frame(edges, directed = F)
V(g)$type <- V(g)$name %in% edges[,2] #the second column of edges is TRUE type
g
## IGRAPH UNWB 9 11 --
## + attr: name (v/c), type (v/l), weight (e/n)
## + edges (vertex names):
## [1] a--A b--A a--B b--B b--C c--C d--C e--C c--D d--D e--D
This results real weighted graph as you see. The adjacency matrix is same as earlier.
get.incidence(g, attr = "weight")
## A B C D
## a 2 4 0 0
## b 1 2 1 0
## c 0 0 4 5
## d 0 0 1 2
## e 0 0 6 2
Plot graph
You are ready to plot the resulted graph.
V(g)$color <- V(g)$type
V(g)$color=gsub("FALSE","red",V(g)$color)
V(g)$color=gsub("TRUE","blue",V(g)$color)
plot(g, edge.color="gray30",edge.width=E(g)$weight, layout=layout_as_bipartite)
Be happyR!
See also:
Thank you!! :)
ReplyDelete