Though introduced recently, complex networks research has grown steadily because of its potential to represent, characterize and model a wide range of intricate natural systems and phenomena. Because of the intrinsic complexity and systemic organization of life, complex networks provide an especially promising framework for systems biology investigation. The current article is an up-to-date review of the major developments related to the application of complex networks in biology, with special attention focused on the more recent literature. The main concepts and models of complex networks are presented and illustrated in an accessible fashion. Three main types of networks are covered: transcriptional regulatory networks, protein-protein interaction networks and metabolic networks. The key role of complex networks for systems biology is extensively illustrated by several of the papers reviewed.; FAPESP; CNPq
The relationship between the structure and function of biological networks constitutes a fundamental issue in systems biology. In particular, the structure of protein-protein interaction networks is related to important biological functions. In this work, we investigated how the resilience of these networks is determined by their large-scale features. Four species are taken into account, namely the yeast Saccharomyces cerevisiae, the worm Caenorhabditis elegans, the fly Drosophila melanogaster and the human Homo sapiens. We adopted two entropy-related measurements (degree entropy and dynamic entropy) in order to quantify the overall degree of robustness of these networks. We verified that while they exhibit similar structural variations under random node removal, they differ significantly when subjected to intentional attacks (hub removal). As a matter of fact, more complex species tended to exhibit more robust networks. More specifically, we quantified how six important measurements of network topology (namely clustering coefficient, average degree of neighbors, average shortest path length, diameter, assortativity coefficient, and slope of the power law degree distribution) correlated with the two entropy measurements. Our results revealed that the fraction of hubs and the average neighbor degree contribute significantly to the resilience of networks. In addition...
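The comparison between random node removal and hub removal described above can be sketched in a few lines. This is only an illustration under stated assumptions: degree entropy is taken here to be the Shannon entropy of the degree distribution, the star-shaped toy network and the names `degree_entropy`, `remove_node` and `removal_curve` are hypothetical, and the dynamic entropy used in the abstract is not covered.

```python
import math
import random
from collections import Counter


def degree_entropy(adj):
    """Shannon entropy (in nats) of the degree distribution p(k) of a network.

    `adj` maps each node to the set of its neighbors (assumed representation).
    """
    n = len(adj)
    if n == 0:
        return 0.0
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return sum(-(c / n) * math.log(c / n) for c in counts.values())


def remove_node(adj, node):
    """Return a copy of the network without `node` and its edges."""
    return {u: nbrs - {node} for u, nbrs in adj.items() if u != node}


def removal_curve(adj, steps, targeted, rng):
    """Degree entropy before and after each of `steps` node removals.

    targeted=True removes the current highest-degree node (hub attack);
    targeted=False removes a uniformly random node (random failure).
    """
    curve = [degree_entropy(adj)]
    for _ in range(steps):
        if not adj:
            break
        if targeted:
            victim = max(sorted(adj), key=lambda u: len(adj[u]))
        else:
            victim = rng.choice(sorted(adj))
        adj = remove_node(adj, victim)
        curve.append(degree_entropy(adj))
    return curve


# Hypothetical toy network: one hub (node 0) connected to five leaves.
star = {0: {1, 2, 3, 4, 5}, **{leaf: {0} for leaf in range(1, 6)}}

rng = random.Random(42)
print("random failure:", removal_curve(star, 2, targeted=False, rng=rng))
print("hub attack:    ", removal_curve(star, 2, targeted=True, rng=rng))
```

On a real interactome the same loop would be run over many removals and averaged over random seeds; the toy star only shows that a single hub removal can collapse the degree distribution.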
Complex network theory is a relatively new area of science, inspired by empirical data such as those obtained from biological and social interactions. The area is highly interdisciplinary and has brought together scientists from different fields, such as mathematics, physics, biology, computer science, sociology, epidemiology and many others. One of the fundamental problems in this area is to understand how the organization of complex networks influences dynamical processes such as synchronization, the spreading of epidemics, and failures and attacks. This dissertation presents an analysis of the relationship between the structure and robustness of complex networks through vertex removal. For this study, protein interaction databases were obtained for four species, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens, as well as maps of the highway networks of seven countries: Brazil, Portugal, Poland, Romania, Australia, India and South Africa. The robustness of these networks was studied by simulating failures and attacks according to a vertex-removal dynamics. In this case, the variation in network structure due to this removal was quantified by measurements of the size of the largest connected component...
Residue networks representing 595 nonhomologous proteins are studied. These networks exhibit universal topological characteristics as they belong to the topological class of modular networks formed by several highly interconnected clusters separated by topological cavities. There are some networks that tend to deviate from this universality. These networks represent small-size proteins having <200 residues. This article explains such differences in terms of the domain structure of these proteins. On the other hand, the topological cavities characterizing protein residue networks match very well with protein binding sites. This study investigates the effect of the cutoff value used in building the residue network. For small cutoff values, <5 Å, the cavities found are very large, corresponding almost to the whole protein surface. On the contrary, for large cutoff values, >10.0 Å, only very large cavities are detected and the networks look very homogeneous. These findings are useful for practical purposes as well as for identifying protein-like complex networks. Finally, this article shows that the main topological class of residue networks is not reproduced by random networks growing according to the Erdős-Rényi model or the preferential attachment method of Barabási-Albert. However...
Pattern discovery in protein structures is a fundamental task in computational biology, with important applications in protein structure prediction, profiling and alignment. We propose a novel approach for pattern discovery in protein structures using Particle Swarm-based flying windows over potentially promising regions of the search space. A heuristic search based on Particle Swarm Optimization (PSO) is, however, easily trapped in local optima due to the sparse nature of the problem search space. Thus, we introduce a novel fitness-based stagnation detection technique that effectively and efficiently restarts the search process to escape potential local optima.
The proposed fitness-based method significantly outperforms the commonly-used distance-based method when tested on eight classical and advanced (shifted/rotated) benchmark functions, as well as on two other applications for proteomic pattern matching and discovery. The main idea is to make use of the already-calculated fitness values of swarm particles, instead of their pairwise distance values, to predict an imminent stagnation situation. That is, the proposed fitness-based method does not require any computational overhead of repeatedly calculating pairwise distances between all particles at each iteration. Moreover...
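The cost difference between the two kinds of stagnation test can be illustrated with a small sketch. The abstract does not state the exact stagnation criterion, so the rule used here (the spread of the swarm's fitness values falling below a tolerance) is an assumption, as are the function names and the `tol` parameter.

```python
import math
from itertools import combinations


def fitness_stagnated(fitness_values, tol=1e-6):
    """Fitness-based test: O(n) over values the optimizer has already computed.

    Assumed criterion: the swarm is considered stagnant when the spread
    between its best and worst fitness falls below `tol`.
    """
    return max(fitness_values) - min(fitness_values) < tol


def distance_stagnated(positions, tol=1e-6):
    """Distance-based test: O(n^2 * d), needs every pairwise distance.

    The swarm is considered stagnant when all particles lie within `tol`
    of one another.
    """
    return all(math.dist(p, q) < tol for p, q in combinations(positions, 2))


# Hypothetical swarm that has collapsed near the origin of a sphere function.
swarm_positions = [(0.0, 0.0), (1e-9, 0.0), (0.0, 1e-9)]
swarm_fitness = [sum(x * x for x in p) for p in swarm_positions]

if fitness_stagnated(swarm_fitness):
    print("stagnation detected from fitness values: restart the swarm")
if distance_stagnated(swarm_positions):
    print("stagnation detected from pairwise distances: restart the swarm")
```

The fitness-based test reuses numbers that every PSO iteration produces anyway, which is where the saving comes from. One caveat of this simplified rule is that particles at different positions can share the same fitness, so it is a heuristic signal and not a proof of collapse.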
The brain's structural and functional systems, protein-protein interaction, and gene networks are examples of biological systems that share some features of complex networks, such as highly connected nodes, modularity, and small-world topology. Recent studies indicate that some pathologies present topological network alterations relative to norms seen in the general population. Therefore, methods to discriminate the processes that generate the different classes of networks (e.g., normal and disease) might be crucial for the diagnosis, prognosis, and treatment of the disease. It is known that several topological properties of a network (graph) can be described by the distribution of the spectrum of its adjacency matrix. Moreover, large networks generated by the same random process have the same spectrum distribution, allowing us to use it as a "fingerprint". Based on this relationship, we propose the entropy of a graph spectrum to measure the "uncertainty" of a random graph and the Kullback-Leibler and Jensen-Shannon divergences between graph spectra to compare networks. We also introduce general methods for model selection and network model parameter estimation, as well as a statistical procedure to test the nullity of divergence between two classes of complex networks. Finally...
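A rough sketch of the quantities named above, under simplifying assumptions: the spectral density is approximated by a fixed-width histogram (the abstract does not say how the density is estimated), and the two example spectra come from graphs whose adjacency eigenvalues are known in closed form (a cycle and a complete graph) instead of from random graph models. All function names are hypothetical.

```python
import math


def cycle_spectrum(n):
    """Adjacency eigenvalues of the cycle graph C_n: 2*cos(2*pi*k/n)."""
    return [2.0 * math.cos(2.0 * math.pi * k / n) for k in range(n)]


def complete_spectrum(n):
    """Adjacency eigenvalues of the complete graph K_n: n-1 once, -1 repeated n-1 times."""
    return [float(n - 1)] + [-1.0] * (n - 1)


def histogram(values, lo, hi, bins):
    """Probabilities (summing to 1) of `values` over `bins` equal bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        counts[max(0, min(idx, bins - 1))] += 1
    return [c / len(values) for c in counts]


def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    return sum(-x * math.log(x) for x in p if x > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); infinite if q is 0 where p is not."""
    total = 0.0
    for x, y in zip(p, q):
        if x > 0:
            if y == 0:
                return math.inf
            total += x * math.log(x / y)
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln 2."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Both spectra binned on a common support so the histograms are comparable.
p = histogram(cycle_spectrum(20), -2.0, 19.0, 21)
q = histogram(complete_spectrum(20), -2.0, 19.0, 21)
print("spectral entropy (cycle):   ", entropy(p))
print("spectral entropy (complete):", entropy(q))
print("JS divergence between them: ", js_divergence(p, q))
```

For empirical networks the eigenvalues would come from a numerical eigensolver and the density from a smoother estimator; the histogram is used here only to keep the sketch dependency-free.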
Complex diseases are characterized by being polygenic and multifactorial, which poses a challenge for the search for genes related to them. With the advent of high-throughput genome sequencing and gene expression (transcriptome) measurement technologies, as well as knowledge of protein-protein interactions, complex diseases have been systematically investigated. In particular, based on the Network Medicine paradigm, protein-protein interaction (PPI) networks have been used to prioritize genes related to complex diseases according to their topological features. However, PPI networks are affected by literature bias, in which the most-studied proteins tend to have more connections, degrading the quality of the results. Additionally, methods that use only PPI networks provide only static and non-specific results, since the topologies of these networks are not specific to a given disease. In this work, we developed a methodology to prioritize genes and biological pathways related to a given complex disease through an integrative approach combining PPI network, transcriptomic and genomic data...
Background: Genome-wide libraries of yeast deletion strains have been used to screen for genes that drive phenotypes such as stress response. A surprising observation emerging from these studies is that the genes with the largest changes in mRNA expression during a state transition are not those that drive that transition. Here, we show that integrating gene expression data with context-independent protein interaction networks can help prioritize master regulators that drive biological phenotypes. Results: Genes essential for survival had previously been shown to exhibit high centrality in protein interaction networks. However, the set of genes that drive growth in any specific condition is highly context-dependent. We inferred regulatory networks from gene expression data and transcription factor binding motifs in Saccharomyces cerevisiae, and found that high-degree nodes in regulatory networks are enriched for transcription factors that drive the corresponding phenotypes. We then found that using a metric combining protein interaction and transcriptional networks improved the enrichment for drivers in many of the contexts we examined. We applied this principle to a dataset of gene expression in normal human fibroblasts expressing a panel of viral oncogenes. We integrated regulatory interactions inferred from this data with a database of yeast two-hybrid protein interactions and ranked 571 human transcription factors by their combined network score. The ranked list was significantly enriched in known cancer genes that could not be found by standard differential expression or enrichment analyses. Conclusions: There has been increasing recognition that network-based approaches can provide insight into critical cellular elements that help define phenotypic state. Our analysis suggests that no one network...
The principles underlying protein folding remain one of Nature's puzzles
with important practical consequences for Life. An approach that has gathered
momentum since the late 1990s looks at protein hetero-polymers and their
folding process through the lens of complex network analysis. Consequently,
there is now a body of empirical studies describing topological characteristics
of protein macro-molecules through their contact networks and linking these
topological characteristics to protein folding. The present paper is primarily
a review of this rich area. But it delves deeper into certain aspects by
emphasizing short-range and long-range links, and suggests unconventional
places where "power-laws" may be lurking within protein contact networks.
Further, it considers the dynamical view of protein contact networks. This
closer scrutiny of protein contact networks raises new questions for further
research, and identifies new regularities which may be useful to parameterize a
network approach to protein folding. Preliminary experiments with such a model
confirm that the regularities we identified cannot be easily reproduced through
random effects. Indeed, the grand challenge of protein folding is to elucidate
the process(es) which not only generates the specific and diverse linkage
patterns of protein contact networks...
During the last decade, network approaches became a powerful tool to describe
protein structure and dynamics. Here, we describe first the protein structure
networks of molecular chaperones, then characterize chaperone containing
sub-networks of interactomes called chaperone-networks or chaperomes. We
review the role of molecular chaperones in short-term adaptation of cellular
networks in response to stress, and in long-term adaptation discussing their
putative functions in the regulation of evolvability. We provide a general
overview of possible network mechanisms of adaptation, learning and memory
formation. We propose that changes of network rigidity play a key role in
learning and memory formation processes. A flexible network topology provides a
"learning competent" state. Here, networks may have much less modular
boundaries than locally rigid, highly modular networks, where the learnt
information has already been consolidated in a memory formation process. Since
modular boundaries are efficient filters of information, in the "learning
competent" state, information filtering may be much smaller than after memory
formation. This mechanism restricts high information transfer to the "learning
competent" state. After memory formation...
Computer experiments are performed to investigate why protein contact
networks (networks induced by spatial contacts between amino acid residues of a
protein) do not have shorter average shortest path lengths in spite of their
importance to protein folding. We find that shorter average inter-nodal
distances are no guarantee of finding a global optimum more easily. Results from
the experiments also led to observations which parallel an existing view that
neither short-range nor long-range interactions dominate the protein folding
process. Nonetheless, runs where there was a slight delay in the use of
long-range interactions yielded the best search performance. We incorporate
this finding into the optimization function by giving more weight to
short-range links. This produced results showing that randomizing long-range
links does not yield better search performance than protein contact networks au
naturel even though randomizing long-range links significantly reduces average
path lengths and retains much of the clustering and positive degree-degree
correlation inherent in protein contact networks. Hence there can be
explanations, other than the excluded volume argument, beneath the topological
limits of protein contact networks.; Comment: v2 accepted by European Conference on Artificial Life 2011
Self-avoiding random walks were performed on protein residue networks.
Compared with protein residue networks with randomized links, the probability
of a walk being successful is lower and the length of successful walks shorter
in (non-randomized) protein residue networks. Fewer successful walks and
shorter successful walks point to higher communication specificity between
protein residues, a conceivably favourable attribute for proteins to have. The
use of random walks instead of shortest paths also produced lower node
centrality, lower edge betweenness and lower edge load for (non-randomized)
protein residue networks than in their respective randomized counterparts. The
implications of these properties for protein residue networks are discussed in
terms of communication congestion and network vulnerability. The randomized
protein residue networks have lower network clustering than the
(non-randomized) protein residue networks. Hence, our findings also shed light
on a hitherto neglected aspect: the importance of high network clustering in
protein residue networks. High clustering increases navigability of a network
for local search and the combination of a local search process on a highly
clustered small-world network topology such as protein residue networks reduces
communication congestion and network vulnerability.
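A self-avoiding random walk of the kind described above can be sketched as follows. The success criterion is an assumption (a walk succeeds if it reaches a chosen target residue before running out of unvisited neighbors), and the adjacency-dictionary representation, function names and toy graphs are hypothetical.

```python
import random


def self_avoiding_walk(adj, source, target, rng):
    """Walk from `source` without revisiting nodes.

    Returns the visited path if `target` is reached, or None if the walker
    gets stuck (no unvisited neighbor) first.
    """
    path = [source]
    visited = {source}
    current = source
    while current != target:
        options = sorted(v for v in adj[current] if v not in visited)
        if not options:
            return None
        current = rng.choice(options)
        visited.add(current)
        path.append(current)
    return path


def walk_statistics(adj, source, target, trials, rng):
    """Fraction of successful walks and mean edge count of the successful ones."""
    lengths = []
    for _ in range(trials):
        path = self_avoiding_walk(adj, source, target, rng)
        if path is not None:
            lengths.append(len(path) - 1)
    success_rate = len(lengths) / trials
    mean_length = sum(lengths) / len(lengths) if lengths else None
    return success_rate, mean_length


# Hypothetical toy graphs: a chain of four residues, and a fork with a dead end.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
fork = {0: {1, 2}, 1: {0}, 2: {0}}

rng = random.Random(7)
print("chain:", walk_statistics(chain, 0, 3, 100, rng))
print("fork: ", walk_statistics(fork, 0, 2, 100, rng))
```

Running `walk_statistics` on a residue network and on a link-randomized copy of it would give the two success probabilities and walk lengths that the abstract compares.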
Protein interaction networks aim to summarize the complex interplay of
proteins in an organism. Early studies suggested that the position of a protein
in the network determines its evolutionary rate but there has been considerable
disagreement as to what extent other factors, such as protein abundance, modify
this reported dependence.
We compare the genomes of Saccharomyces cerevisiae and Caenorhabditis elegans
with those of closely related species to elucidate the recent evolutionary
history of their respective protein interaction networks. Interaction and
expression data are studied in the light of a detailed phylogenetic analysis.
The underlying network structure is incorporated explicitly into the
The increased phylogenetic resolution, paired with high-quality interaction
data, allows us to resolve the way in which protein interaction network
structure and abundance of proteins affect the evolutionary rate. We find that
expression levels are better predictors of the evolutionary rate than a
protein's connectivity. Detailed analysis of the two organisms also shows that
the evolutionary rates of interacting proteins are not sufficiently similar to
be mutually predictive.
It appears that meaningful inferences about the evolution of protein
interaction networks require comparative analysis of reasonably closely related
species. The signature of protein evolution is shaped by a protein's abundance
in the organism and its function and the biological process it is involved in.
Its position in the interaction networks and its connectivity may modulate this
but they appear to have only minor influence on a protein's evolutionary rate.; Comment: Accepted for publication in BMC Evolutionary Biology
It has recently been demonstrated that many biological networks exhibit a
scale-free topology where the probability of observing a node with a certain
number of edges (k) follows a power law: i.e. p(k) ~ k^-g. This observation has
been reproduced by evolutionary models. Here we consider the network of
protein-protein interactions and demonstrate that two published independent
measurements of these interactions produce graphs that are only weakly
correlated with one another despite their strikingly similar topology. We then
propose a physical model based on the fundamental principle that (de)solvation
is a major physical factor in protein-protein interactions. This model
reproduces not only the scale-free nature of such graphs but also a number of
higher-order correlations in these networks. A key support of the model is
provided by the discovery of a significant correlation between number of
interactions made by a protein and the fraction of hydrophobic residues on its
surface. The model presented in this paper represents the first physical model
for experimentally determined protein-protein interactions that comprehensively
reproduces the topological features of interaction networks. These results have
profound implications for understanding not only protein-protein interactions
but also other types of scale-free networks.; Comment: 50 pages...
We model the evolution of eukaryotic protein-protein interaction (PPI)
networks. In our model, PPI networks evolve by two known biological mechanisms:
(1) Gene duplication, which is followed by rapid diversification of duplicate
interactions. (2) Neofunctionalization, in which a mutation leads to a new
interaction with some other protein. Since many interactions are due to simple
surface compatibility, we hypothesize there is an increased likelihood of
interacting with other proteins in the target protein's neighborhood. We find
good agreement of the model on 10 different network properties compared to
high-confidence experimental PPI networks in yeast, fruit flies, and humans.
Key findings are: (1) PPI networks evolve modular structures, with no need to
invoke particular selection pressures. (2) Proteins in cells have on average
about 6 degrees of separation, similar to some social networks, such as
human-communication and actor networks. (3) Unlike social networks, which have
a shrinking diameter (degree of maximum separation) over time, PPI networks are
predicted to grow in diameter. (4) The model indicates that evolutionarily old
proteins should have higher connectivities and be more centrally embedded in
their networks. This suggests a way in which present-day proteomics data could
provide insights into biological evolution.; Comment: 22 pages...
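The two growth mechanisms named above can be illustrated with a toy simulation. This is a sketch under loudly stated assumptions, not the authors' model: the probabilities `p_dup`, `p_keep` and `p_assim`, the two-protein seed network and the function name `grow_ppi` are all hypothetical, and no attempt is made to reproduce the ten network properties mentioned in the abstract.

```python
import random


def grow_ppi(n_final, p_dup=0.5, p_keep=0.4, p_assim=0.3, seed=0):
    """Grow a toy PPI network until it has `n_final` proteins.

    Each step is either
      - duplication (probability p_dup): a random protein is copied and the
        copy keeps each of the parent's interactions with probability p_keep
        (a crude stand-in for rapid diversification), or
      - neofunctionalization: a random protein gains an interaction with a
        random target and, with probability p_assim each, with the target's
        neighbors (a stand-in for surface compatibility in a neighborhood).
    p_dup must be positive, otherwise the network never grows.
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # assumed seed network: two interacting proteins
    while len(adj) < n_final:
        nodes = sorted(adj)
        if rng.random() < p_dup:
            parent = rng.choice(nodes)
            child = len(adj)  # next unused integer label
            adj[child] = set()
            for nb in sorted(adj[parent]):
                if rng.random() < p_keep:
                    adj[child].add(nb)
                    adj[nb].add(child)
        else:
            u = rng.choice(nodes)
            target = rng.choice(nodes)
            if target == u:
                continue  # no self-interactions in this sketch
            partners = [target] + [
                v for v in sorted(adj[target])
                if v != u and rng.random() < p_assim
            ]
            for v in partners:
                adj[u].add(v)
                adj[v].add(u)
    return adj


network = grow_ppi(200, seed=1)
degrees = sorted((len(nbrs) for nbrs in network.values()), reverse=True)
print("proteins:", len(network), "largest degrees:", degrees[:5])
```

Because proteins are labeled in order of creation, comparing the degree of low-numbered and high-numbered nodes gives a quick, informal look at whether older proteins end up more connected in this toy version.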
Cellular functions are based on the complex interplay of proteins, therefore
the structure and dynamics of these protein-protein interaction (PPI) networks
are the key to the functional understanding of cells. In recent years,
large-scale PPI networks of several model organisms were investigated.
Methodological improvements now allow the analysis of PPI networks of multiple
organisms simultaneously as well as the direct modeling of ancestral networks.
This provides the opportunity to challenge existing assumptions on network
evolution. We utilized present-day PPI networks from integrated datasets of
seven model organisms and developed a theoretical and bioinformatic framework
for studying the evolutionary dynamics of PPI networks. A novel filtering
approach using percolation analysis was developed to remove low confidence
interactions based on topological constraints. We then reconstructed the
ancient PPI networks of different ancestors, for which the ancestral proteomes,
as well as the ancestral interactions, were inferred. Ancestral proteins were
reconstructed using orthologous groups on different evolutionary levels. A
stochastic approach, using the duplication-divergence model, was developed for
estimating the probabilities of ancient interactions from today's PPI networks.
The growth rates for nodes...
The three dimensional structure of a protein is an outcome of the
interactions of its constituent amino acids in 3D space. Considering the amino
acids as nodes and the interactions among them as edges we have constructed and
analyzed protein contact networks at different length scales, long and
short-range. While long and short-range interactions are determined by the
positions of amino acids in primary chain, the contact networks are constructed
based on the 3D spatial distances of amino acids. We have further divided these
networks into sub-networks of hydrophobic, hydrophilic and charged residues.
Our analysis reveals that a significantly higher percentage of assortative
sub-clusters of long-range hydrophobic networks helps a protein in
communicating the necessary information for protein folding on the one hand; on the
other hand the higher values of clustering coefficients of hydrophobic
sub-clusters play a major role in slowing down the process so that necessary
local and global stability can be achieved through intra connectivities of the
amino acid residues. Further, higher degrees of hydrophobic long-range
interactions suggest their greater role in protein folding and stability. The
short-range all-amino-acid networks have a signature of hierarchy. The present
analysis, together with other evidence, suggests that in a protein's 3D conformational
During the last decade, network approaches became a powerful tool to describe
protein structure and dynamics. Here we review the links between disordered
proteins and the associated networks, and describe the consequences of local,
mesoscopic and global network disorder on changes in protein structure and
dynamics. We introduce a new classification of protein networks into
cumulus-type, i.e., those similar to puffy (white) clouds, and stratus-type,
i.e., those similar to flat, dense (dark) low-lying clouds, and relate these
network types to protein disorder dynamics and to differences in energy
transmission processes. In the first class, there is limited overlap between
the modules, which implies higher rigidity of the individual units; there the
conformational changes can be described by an energy transfer mechanism. In the
second class, the topology presents a compact structure with significant
overlap between the modules; there the conformational changes can be described
by multi-trajectories; that is, multiple highly populated pathways. We further
propose that disordered protein regions evolved to help other protein segments
reach rarely visited but functionally-related states. We also show the role of
disorder in spatial games of amino acids; highlight the effects of
intrinsically disordered proteins (IDPs) on cellular networks and list some
possible studies linking protein disorder and protein structure networks.; Comment: 27 pages...