oaqc/inst/CITATION
bibentry(
bibtype = "Article",
title = "Efficient orbit-aware triad and quad census in directed and undirected graphs",
author = c(person(given = "Mark",
family = "Ortmann",
email = "mark.ortmann@uni-konstanz.de"),
person(given = "Ulrik",
family = "Brandes",
email = "ulrik.brandes@uni-konstanz.de")),
journal = "Applied Network Science",
year = "2017",
volume = "2",
number = "1",
pages = "13",
issn = "2364-8228",
doi = "10.1007/s41109-017-0027-2",
header = "To cite oaqc in publications use:",
textVersion = paste(
"Mark Ortmann, Ulrik Brandes (2017).",
"Efficient orbit-aware triad and quad census in directed and undirected graphs.",
"Applied Network Science, 2(1), 13.",
"10.1007/s41109-017-0027-2")
)
oaqc/inst/doc/oaqc.Rmd
---
title: "Introduction to oaqc"
author: "Mark Ortmann"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{oaqc}
%\VignetteEngine{knitr::rmarkdown}
\usepackage[utf8]{inputenc}
---
# Introduction to oaqc
This package provides an efficient algorithm to calculate the **o**rbit-**a**ware **q**uad **c**ensus of a given graph, that is, the frequency distribution of all induced and non-induced non-isomorphic four-node subgraphs (quads) on a node and edge level; see the figure below for the relation between orbits and quads.

## Input
The input can either be
* Edge-List (matrix or data.frame)
* Graph Object ('igraph')
Regardless of the input format, the graph must not contain loops or multi-edges, and the vertex indices have to lie in the range $[0,n-1]$, with $n$ denoting the number of vertices in the graph. Note that if the smallest index is $1$, the algorithm will create an isolated vertex with index $0$.
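These constraints can be checked before the census is computed. The following is a minimal C++ sketch, not part of oaqc: `checkEdgeList` is a hypothetical helper, and it assumes the graph is undirected, so that `(u, v)` and `(v, u)` count as the same edge.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of oaqc): checks an edge list given as two
// parallel vectors of 0-based endpoints. Returns an empty string if the list
// describes a simple graph on the vertices 0..n-1, otherwise a short
// description of the first problem found.
std::string checkEdgeList(const std::vector<int>& source,
                          const std::vector<int>& target, int n) {
    if (source.size() != target.size()) {
        return "source and target differ in length";
    }
    std::set<std::pair<int, int> > seen;
    for (std::size_t i = 0; i < source.size(); ++i) {
        const int u = source[i];
        const int v = target[i];
        if (u < 0 || v < 0 || u >= n || v >= n) {
            return "vertex index out of range";
        }
        if (u == v) {
            return "loop";
        }
        // assumption: undirected graph, so store the smaller endpoint first
        const std::pair<int, int> e(std::min(u, v), std::max(u, v));
        if (!seen.insert(e).second) {
            return "multi-edge";
        }
    }
    return "";
}
```

For the K4 edge list used in this vignette and `n = 4`, the helper returns an empty string; with `n = 3` it reports an out-of-range index because vertex 3 is used.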
## Calculating the orbit-aware quad census
The following code exemplifies the use of this package.
```{r}
library(oaqc)
### k4, pure R
k4 <- data.frame(
source = c(0, 0, 0, 1, 1, 2),
target = c(1, 2, 3, 2, 3, 3)
)
k4orbits <- oaqc(k4, non_ind_freq = F, file = "")
# print(k4orbits)
```
To calculate the non-induced frequencies as well, set `non_ind_freq` to `TRUE`.
Because the orbit-aware frequencies can be very large integers, which R cannot always represent exactly, the results can also be written directly to a file via the `file` argument.
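The size problem can be made concrete. The C++ core keeps the counts as unsigned integers and converts them to `double` when returning them to R, and a `double` represents integers exactly only up to $2^{53}$. The sketch below is illustrative only; `fitsInDouble` is a hypothetical name and not part of the package.

```cpp
#include <cstdint>

// Hypothetical helper (not part of oaqc): returns true if the count survives
// a round trip through double unchanged, i.e. if an R numeric value can hold
// it exactly.
bool fitsInDouble(std::uint64_t count) {
    const double asDouble = static_cast<double>(count);
    // 2^64 itself does not fit into uint64_t, so guard before casting back
    if (asDouble >= 18446744073709551616.0) {
        return false;
    }
    return static_cast<std::uint64_t>(asDouble) == count;
}
```

Under this check, $2^{53}$ is still exact while $2^{53}+1$ is not, which is the kind of count for which writing the results to a file avoids the conversion.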
## Result
For example, the induced frequencies of the nodes in orbit 10 can be accessed as follows.
```{r}
print(k4orbits$n_orbits_ind[,10])
```
oaqc/inst/doc/oaqc.R
## ------------------------------------------------------------------------
library(oaqc)
### k4, pure R
k4 <- data.frame(
source = c(0, 0, 0, 1, 1, 2),
target = c(1, 2, 3, 2, 3, 3)
)
k4orbits <- oaqc(k4, non_ind_freq = F, file = "")
# print(k4orbits)
## ------------------------------------------------------------------------
print(k4orbits$n_orbits_ind[,10])
oaqc/inst/doc/oaqc.html
Introduction to oaqc
This package provides an efficient algorithm to calculate the orbit-aware quad census of a given graph, that is, the frequency distribution of all induced and non-induced non-isomorphic four-node subgraphs (quads) on a node and edge level; see the figure below for the relation between orbits and quads.
Input
The input can either be
Edge-List (matrix or data.frame)
Graph Object ('igraph')
Regardless of the input format, the graph must not contain loops or multi-edges, and the vertex indices have to lie in the range \([0,n-1]\), with \(n\) denoting the number of vertices in the graph. Note that if the smallest index is \(1\), the algorithm will create an isolated vertex with index \(0\).
Calculating the orbit-aware quad census
The following code exemplifies the use of this package.
library(oaqc)
### k4, pure R
k4 <- data.frame(
source = c(0, 0, 0, 1, 1, 2),
target = c(1, 2, 3, 2, 3, 3)
)
k4orbits <- oaqc(k4, non_ind_freq = F, file = "")
# print(k4orbits)
To calculate the non-induced frequencies as well, set non_ind_freq to TRUE.
Because the orbit-aware frequencies can be very large integers, which R cannot always represent exactly, the results can also be written directly to a file via the file argument.
Result
For example, the induced frequencies of the nodes in orbit 10 can be accessed as follows.
print(k4orbits$n_orbits_ind[,10])
## [1] 0 0 0 0
oaqc/tests/oaqc_test.R
stopifnot(require('oaqc'), require('igraph'), getRversion() >= '3.4')
sessionInfo()
packageDescription("oaqc")
### k4, pure R
k4 <- data.frame(
source = c(0, 0, 0, 1, 1, 2),
target = c(1, 2, 3, 2, 3, 3)
)
k4orbits <- oaqc(k4, non_ind_freq = T)
print(k4orbits)
n_orbits <- matrix(0, nrow = 4, ncol = 20)
n_orbits[, 20] <- 1
stopifnot(k4orbits$n_orbits == n_orbits)
e_orbits <- matrix(0, nrow = 6, ncol = 14)
e_orbits[, 14] <- 1
stopifnot(k4orbits$e_orbits == e_orbits)
### k4, igraph
ik4 <- make_graph(t(as.matrix(k4)) + 1) # 'igraph' is 1-based
ik4graph <- oaqc(ik4, non_ind_freq = T)
print(vertex_attr_names(ik4graph))
print(edge_attr_names(ik4graph))
### random ba
n <- 1403
g <- ba.game(n)
print(g)
c <- oaqc(g)
oaqc/src/Graph.cpp
/*
* Graph.cpp
*
* Created on: 14 Sep 2017
* Author: ortmann
*/
#include "Graph.h"
#include <algorithm> // std::sort, std::max
namespace oaqc {
Graph::Graph(const unsigned int n, const unsigned int m, const int* const edges) :
_edges(new Edge[2 * m]), //
_n(n), //
_m(m), //
_inOffset(new unsigned int[n + 1]), //
_outOffset(new unsigned int[n]), //
_mapping(new unsigned int[n]) {
// used to find the highest indexed outgoing edge
_inOffset[_n] = 2 * _m;
createGraph(edges);
}
Graph::~Graph() {
delete[] _mapping;
delete[] _outOffset;
delete[] _inOffset;
delete[] _edges;
}
void Graph::createGraph(const int* const edges) {
bucketSort(edges);
// store the graph
for (unsigned int i = 0; i < _m; ++i) {
const unsigned int n1 = _mapping[edges[i]];
const unsigned int n2 = _mapping[edges[i + _m]];
_edges[_outOffset[n1]].first = n2;
_edges[_outOffset[n1]].second = i;
++_outOffset[n1];
_edges[_outOffset[n2]].first = n1;
_edges[_outOffset[n2]].second = i;
++_outOffset[n2];
}
// sort the edges according to the opposite indices
for (unsigned int i = 0; i < _n; ++i) {
// this works since all ids are unique => second values are never compared
// set the outOffset
std::sort(_edges + _inOffset[i], _edges + _inOffset[i + 1]);
for (unsigned int pos = _inOffset[i]; pos < _inOffset[i + 1]; ++pos) {
if (_edges[pos].first > i) {
_outOffset[i] = pos;
break;
}
}
}
}
void Graph::bucketSort(const int* const edges) {
// calculate node degrees
unsigned int* const deg = new unsigned int[_n]();
for (unsigned int i = 0; i < _m; ++i) {
++deg[edges[i]];
++deg[edges[i + _m]];
}
// init the buckets array
unsigned int maxDegree = 0;
for (unsigned int i = 0; i < _n; ++i) {
maxDegree = std::max(maxDegree, deg[i]);
}
unsigned int* const bucket = new unsigned int[maxDegree + 1]();
// compute size of each bucket
for (unsigned int i = 0; i < _n; ++i) {
++bucket[deg[i]];
}
unsigned int first = 0;
unsigned int size;
// calculate the lower end point of each bucket
for (unsigned int i = 0; i <= maxDegree; ++i) {
size = bucket[i];
bucket[i] = first;
first += size;
}
unsigned int* const reverseMapping = new unsigned int[_n];
for (unsigned int i = 0; i < _n; ++i) {
const unsigned int degree = deg[i];
unsigned int pos = bucket[degree];
_mapping[i] = pos;
reverseMapping[pos] = i;
++bucket[degree];
}
unsigned int m = 0;
for (unsigned int i = 0; i < _n; ++i) {
_inOffset[i] = m;
_outOffset[i] = m;
m += deg[reverseMapping[i]];
}
delete[] bucket;
delete[] deg;
delete[] reverseMapping;
}
} /* namespace oaqc */
oaqc/src/QuadCensus.cpp
/*
* QuadCensus.cpp
*
* Created on: 14 Sep 2017
* Author: ortmann
*/
#include "QuadCensus.h"
#include <algorithm> // std::fill_n
#include "Graph.h"
namespace oaqc {
QuadCensus::QuadCensus(const unsigned int n, const unsigned int m,
const int * const edges) :
_nodeOrbitCount(20), //
_edgeOrbitCount(14), //
_eTriCount(0), //
_nTriCount(0), //
_nNonIndC4Count(0), //
_eNonIndC4Count(0), //
_eOrbits(0), //
_nOrbits(0), //
_neighDeg(0), //
_k3Count(0), //
_2pathCount(0),//
_graph(n,m,edges){
init();
initCounts();
calcK3K4C4();
calcK3RelNonIndCounts();
calcNonInducedFrequencies();
}
QuadCensus::~QuadCensus() {
clear();
}
const unsigned long* QuadCensus::eOrbits() {
return _eOrbits;
}
const unsigned long* QuadCensus::nOrbits() {
return _nOrbits;
}
void QuadCensus::init() {
const unsigned int n = _graph.n();
const unsigned int m = _graph.m();
_eTriCount = new unsigned long[m]();
_nTriCount = new unsigned long[n]();
_eNonIndC4Count = new unsigned long[m]();
_nNonIndC4Count = new unsigned long[n]();
_eOrbits = new unsigned long[_edgeOrbitCount * m]();
_nOrbits = new unsigned long[_nodeOrbitCount * n]();
_neighDeg = new unsigned long[n]();
}
void QuadCensus::clear() {
delete[] _eTriCount;
delete[] _nTriCount;
delete[] _nNonIndC4Count;
delete[] _eNonIndC4Count;
delete[] _eOrbits;
delete[] _nOrbits;
delete[] _neighDeg;
}
void QuadCensus::initCounts() {
for (unsigned int i = 0; i < _graph.n(); ++i) {
const unsigned long deg = _graph.lastOutEdge(i) - _graph.firstInEdge(i);
_2pathCount += choose2(deg);
N_ORBIT(i, 11)= choose3 (deg);
const unsigned int endInd = _graph.lastInEdge(i);
for (unsigned int neigh = _graph.firstInEdge(i); neigh < endInd;
++neigh) {
const unsigned int neighInd = _graph.opInd(neigh);
_neighDeg[i] += _graph.lastOutEdge(neighInd)
- _graph.firstInEdge(neighInd);
_neighDeg[neighInd] += deg;
}
}
}
void QuadCensus::calcK3K4C4() {
int* const innerMark = new int[_graph.n()];
std::fill_n(innerMark, _graph.n(), -1);
int* const outerMark = new int[_graph.n()];
std::fill_n(outerMark, _graph.n(), -1);
unsigned int* const visCount = new unsigned int[_graph.n()]();
unsigned int* const workCount = new unsigned int[_graph.n()]();
for (unsigned int t4 = 1; t4 < _graph.n(); ++t4) {
const unsigned int t4StartIncInd = _graph.firstInEdge(t4);
const unsigned int t4EndIncInd = _graph.lastInEdge(t4);
// mark the edges
for (unsigned int i = t4StartIncInd; i < t4EndIncInd; i++) {
outerMark[_graph.opInd(i)] = _graph.edgeInd(i);
}
// now find K3, K4, C4
for (unsigned int t1EdgeInd = t4StartIncInd; t1EdgeInd < t4EndIncInd;
++t1EdgeInd) {
unsigned int t1 = _graph.opInd(t1EdgeInd);
unsigned int l = outerMark[t1];
outerMark[t1] = -1;
unsigned int t1StartIncInd = _graph.firstInEdge(t1);
unsigned int t1StartOutInd = _graph.firstOutEdge(t1);
// for C4 counting
for (; t1StartIncInd < t1StartOutInd; ++t1StartIncInd) {
const unsigned int pos = _graph.opInd(t1StartIncInd);
++visCount[pos];
++workCount[pos];
}
unsigned int t1EndOutInd = t1StartIncInd;
// for K4 counting
for (unsigned int i = _graph.opInd(t1EndOutInd); i != t4;
i = _graph.opInd(++t1EndOutInd)) {
++visCount[i];
++workCount[i];
innerMark[i] = _graph.edgeInd(t1EndOutInd);
}
for (unsigned int t2EdgeInd = t1StartOutInd;
t2EdgeInd < t1EndOutInd; ++t2EdgeInd) {
const unsigned int t2 = _graph.opInd(t2EdgeInd);
const unsigned int s1 = _graph.edgeInd(t2EdgeInd);
innerMark[t2] = -1;
if (outerMark[t2] == -1) {
continue;
}
const unsigned int s2 = outerMark[t2];
// this is a triangle
++_eTriCount[s1];
++_eTriCount[s2];
++_eTriCount[l];
++_nTriCount[t4];
++_nTriCount[t1];
++_nTriCount[t2];
int t2StartOudInd = _graph.firstOutEdge(t2);
const int t2EndOutInd = _graph.lastOutEdge(t2);
while (t2StartOudInd < t2EndOutInd) {
const int t3 = _graph.opInd(t2StartOudInd);
const int s3 = _graph.edgeInd(t2StartOudInd);
// this is a k4
if (outerMark[t3] >= 0 && innerMark[t3] >= 0) {
++E_ORBIT(s1, 13);
++E_ORBIT(s2, 13);
++E_ORBIT(s3, 13);
++E_ORBIT(l, 13);
++E_ORBIT(innerMark[t3], 13);
++E_ORBIT(outerMark[t3], 13);
++N_ORBIT(t1, 19);
++N_ORBIT(t2, 19);
++N_ORBIT(t3, 19);
++N_ORBIT(t4, 19);
}
++t2StartOudInd;
}
}
}
for (unsigned int t1EdgeInd = t4StartIncInd; t1EdgeInd < t4EndIncInd;
++t1EdgeInd) {
const unsigned int t1 = _graph.opInd(t1EdgeInd);
const unsigned int l = _graph.edgeInd(t1EdgeInd);
unsigned int t1StartIncInd = _graph.firstInEdge(t1);
unsigned int t2;
while ((t2 = _graph.opInd(t1StartIncInd)) != t4) {
const int c4count = visCount[t2] - 1;
if (--workCount[t2] == 0) {
const long val = choose2(c4count + 1);
_nNonIndC4Count[t4] += val;
_nNonIndC4Count[t2] += val;
visCount[t2] = 0;
}
_nNonIndC4Count[t1] += c4count;
_eNonIndC4Count[l] += c4count;
_eNonIndC4Count[_graph.edgeInd(t1StartIncInd)] += c4count;
++t1StartIncInd;
}
}
}
delete[] innerMark;
delete[] outerMark;
delete[] visCount;
delete[] workCount;
}
void QuadCensus::calcK3RelNonIndCounts() {
int* const mark = new int[_graph.n()];
std::fill_n(mark, _graph.n(), -1);
for (unsigned int t3 = 2; t3 < _graph.n(); ++t3) {
unsigned int t3FirstInInd = _graph.firstInEdge(t3);
unsigned int t3EndInInd = _graph.lastInEdge(t3);
for (unsigned int i = t3FirstInInd; i < t3EndInInd; ++i) {
mark[_graph.opInd(i)] = _graph.edgeInd(i);
}
for (unsigned int i = t3FirstInInd; i < t3EndInInd; ++i) {
const unsigned int t1 = _graph.opInd(i);
const int l = mark[t1];
mark[t1] = -1;
unsigned int t1StartOutInd = _graph.firstOutEdge(t1);
unsigned int t2;
while ((t2 = _graph.opInd(t1StartOutInd)) != t3) {
if (mark[t2] >= 0) {
++_k3Count;
const unsigned int s1 = _graph.edgeInd(t1StartOutInd);
const unsigned int s2 = mark[t2];
const unsigned long s1T = _eTriCount[s1];
const unsigned long s2T = _eTriCount[s2];
const unsigned long lT = _eTriCount[l];
const unsigned int t1D = _graph.lastOutEdge(t1) - _graph.firstInEdge(t1);
const unsigned int t2D = _graph.lastOutEdge(t2) - _graph.firstInEdge(t2);
const unsigned int t3D = _graph.lastOutEdge(t3) - _graph.firstInEdge(t3);
// update diamond counts
E_ORBIT(l, 11)+= s1T + s2T;
E_ORBIT(s1, 11)+= s2T + lT;
E_ORBIT(s2, 11)+= s1T + lT;
N_ORBIT(t1, 17)+= s2T;
N_ORBIT(t2, 17)+= lT;
N_ORBIT(t3, 17)+= s1T;
// update paw counts
E_ORBIT(s1, 9)+= t3D;
E_ORBIT(s2, 9)+= t1D;
E_ORBIT(l, 9)+= t2D;
N_ORBIT(t1, 14)+= t2D + t3D;
N_ORBIT(t2, 14)+= t1D + t3D;
N_ORBIT(t3, 14)+= t1D + t2D;
}
++t1StartOutInd;
}
}
}
delete[] mark;
}
void QuadCensus::calcNonInducedFrequencies() {
// for each edge
for (unsigned int i = 0; i < _graph.n(); i++) {
const unsigned int endIndex = _graph.lastInEdge(i);
const unsigned int srcDeg = _graph.lastOutEdge(i) - _graph.firstInEdge(i);
for (unsigned int neighPos = _graph.firstInEdge(i); neighPos < endIndex;
neighPos++) {
const unsigned int tgtIndex = _graph.opInd(neighPos);
const unsigned int eIndex = _graph.edgeInd(neighPos);
const unsigned int tgtDeg = _graph.lastOutEdge(tgtIndex)
- _graph.firstInEdge(tgtIndex);
// set non-induced edge orbit counts
E_ORBIT(eIndex, 12)= choose2(_eTriCount[eIndex] );
E_ORBIT(eIndex, 11)= E_ORBIT(eIndex, 11) - 2 * _eTriCount[eIndex];
E_ORBIT(eIndex, 10)=_eNonIndC4Count[eIndex];
E_ORBIT(eIndex, 9)= E_ORBIT(eIndex, 9) - 2 * _eTriCount[eIndex];
E_ORBIT(eIndex, 8)= _eTriCount[eIndex] * (srcDeg + tgtDeg - 4);
E_ORBIT(eIndex, 7)=
_nTriCount[i] + _nTriCount[tgtIndex] - 2 * _eTriCount[eIndex];
E_ORBIT(eIndex, 6)= choose2(srcDeg - 1) + choose2(tgtDeg - 1);
E_ORBIT(eIndex, 5)=(srcDeg - 1) * (tgtDeg - 1) - _eTriCount[eIndex];
E_ORBIT(eIndex, 4)= _neighDeg[i] + _neighDeg[tgtIndex] - 2 * (srcDeg + tgtDeg) + 2 - 2 * _eTriCount[eIndex];
E_ORBIT(eIndex, 3)= _eTriCount[eIndex] * (_graph.n() - 3);
E_ORBIT(eIndex, 2)=((srcDeg - 1) + (tgtDeg - 1)) * (_graph.n() - 3);
E_ORBIT(eIndex, 1)= _graph.m() - srcDeg - tgtDeg + 1;
E_ORBIT(eIndex, 0)= choose2(_graph.n() - 2);
// set non-induced node orbit counts
N_ORBIT(i, 18)+= choose2(_eTriCount[eIndex]);
N_ORBIT(i, 15)+= _nTriCount[tgtIndex] - _eTriCount[eIndex];
N_ORBIT(i, 12)+= choose2(tgtDeg - 1);
N_ORBIT(i, 10)+= _neighDeg[tgtIndex] - tgtDeg;
N_ORBIT(i, 9)+= (srcDeg - 1) * (tgtDeg - 1) - _eTriCount[eIndex];
N_ORBIT(i, 6)+= tgtDeg - 1;
N_ORBIT(i, 3)+= _graph.m() - (srcDeg + tgtDeg - 1);
N_ORBIT(tgtIndex, 18)+= choose2(_eTriCount[eIndex]);
N_ORBIT(tgtIndex, 15)+= _nTriCount[i] - _eTriCount[eIndex];
N_ORBIT(tgtIndex, 12)+= choose2(srcDeg - 1);
N_ORBIT(tgtIndex, 10)+= _neighDeg[i] - srcDeg;
N_ORBIT(tgtIndex, 9)+=
(srcDeg - 1) * (tgtDeg - 1) - _eTriCount[eIndex];
N_ORBIT(tgtIndex, 6)+= srcDeg - 1;
N_ORBIT(tgtIndex, 3)+= _graph.m() - (srcDeg + tgtDeg - 1);
}
}
// solve remaining non-induced node frequencies
const unsigned long allPairs = choose3(_graph.n() - 1);
for (unsigned int i = 0; i < _graph.n(); ++i) {
const unsigned int deg = _graph.lastOutEdge(i) - _graph.firstInEdge(i);
N_ORBIT(i, 17)-= _nTriCount[i];
N_ORBIT(i, 16)= _nNonIndC4Count[i];
N_ORBIT(i, 14)-= _nTriCount[i] * 4;
N_ORBIT(i, 13)= _nTriCount[i] * (deg - 2);
N_ORBIT(i, 11)= choose3(deg);
N_ORBIT(i, 10)-= deg * (deg - 1) + 2 * _nTriCount[i];
N_ORBIT(i, 8)= _k3Count - _nTriCount[i];
N_ORBIT(i, 7)= _nTriCount[i] * (_graph.n() - 3);
N_ORBIT(i, 5)= N_ORBIT(i, 6);
N_ORBIT(i, 6)= _2pathCount - N_ORBIT(i, 6) - choose2(deg);
N_ORBIT(i, 5)= N_ORBIT(i, 5) * (_graph.n() - 3);
N_ORBIT(i, 4)= choose2(deg) * (_graph.n() - 3);
N_ORBIT(i, 2)= (_graph.m() - deg) * (_graph.n() - 3);
N_ORBIT(i, 1)= choose2(_graph.n() - 2) * deg;
N_ORBIT(i, 0)= allPairs;
}
}
void QuadCensus::calcInducedFrequencies() {
for (unsigned int i = 0; i < _graph.n(); ++i) {
const unsigned int endIndex = _graph.lastInEdge(i);
for (unsigned int neighPos = _graph.firstInEdge(i); neighPos < endIndex;
++neighPos) {
const unsigned int eIndex = _graph.edgeInd(neighPos);
E_ORBIT(eIndex, 12)-= E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 11)-= 4 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 10)-=
E_ORBIT(eIndex, 11) + 2 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 9)-=
E_ORBIT(eIndex, 11) + 2 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 8)-=
E_ORBIT(eIndex, 11) + 4 * E_ORBIT(eIndex, 12)
+ 4 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 7)-=
E_ORBIT(eIndex, 11) + 2 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 6)-=
E_ORBIT(eIndex, 7) + E_ORBIT(eIndex, 8)
+ E_ORBIT(eIndex, 11) + 2 * E_ORBIT(eIndex, 12)
+ 2 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 5)-=
E_ORBIT(eIndex, 8) + E_ORBIT(eIndex, 10)
+ E_ORBIT(eIndex, 11) + 2 * E_ORBIT(eIndex, 12)
+ 2 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 4)-=
2 * E_ORBIT(eIndex, 7) + 2 * E_ORBIT(eIndex, 9)
+ 2 * E_ORBIT(eIndex, 10)
+ 3 * E_ORBIT(eIndex, 11)
+ 4 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 3)-=
E_ORBIT(eIndex, 8) + E_ORBIT(eIndex, 9)
+ E_ORBIT(eIndex, 11) + 2 * E_ORBIT(eIndex, 12)
+ 2 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 2)-=
2 * E_ORBIT(eIndex, 3) + E_ORBIT(eIndex, 4)
+ 2 * E_ORBIT(eIndex, 5)
+ 2 * E_ORBIT(eIndex, 6)
+ 2 * E_ORBIT(eIndex, 7)
+ 3 * E_ORBIT(eIndex, 8)
+ 2 * E_ORBIT(eIndex, 9)
+ 2 * E_ORBIT(eIndex, 10)
+ 3 * E_ORBIT(eIndex, 11)
+ 4 * E_ORBIT(eIndex, 12)
+ 4 * E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 1)-=
E_ORBIT(eIndex, 4) + E_ORBIT(eIndex, 7)
+ E_ORBIT(eIndex, 9) + E_ORBIT(eIndex, 10)
+ E_ORBIT(eIndex, 11) + E_ORBIT(eIndex, 13);
E_ORBIT(eIndex, 0)-=
E_ORBIT(eIndex, 1) + E_ORBIT(eIndex, 2)
+ E_ORBIT(eIndex, 3) + E_ORBIT(eIndex, 4)
+ E_ORBIT(eIndex, 5) + E_ORBIT(eIndex, 6)
+ E_ORBIT(eIndex, 7) + E_ORBIT(eIndex, 8)
+ E_ORBIT(eIndex, 9) + E_ORBIT(eIndex, 10)
+ E_ORBIT(eIndex, 11) + E_ORBIT(eIndex, 12)
+ E_ORBIT(eIndex, 13);
}
}
for (unsigned int i = 0; i < _graph.n(); ++i) {
N_ORBIT(i, 18)-= 3 * N_ORBIT(i, 19);
N_ORBIT(i, 17)-= 3 * N_ORBIT(i, 19);
N_ORBIT(i, 16)-=
3 * N_ORBIT(i, 19) + N_ORBIT(i, 18) + N_ORBIT(i, 17);
N_ORBIT(i, 15)-= 3 * N_ORBIT(i, 19) + 2 * N_ORBIT(i, 17);
N_ORBIT(i, 14)-=
6 * N_ORBIT(i, 19) + 2 * N_ORBIT(i, 18)
+ 2 * N_ORBIT(i, 17);
N_ORBIT(i, 13)-= 3 * N_ORBIT(i, 19) + 2 * N_ORBIT(i, 18);
N_ORBIT(i, 12)-=
3 * N_ORBIT(i, 19) + N_ORBIT(i, 18) + 2 * N_ORBIT(i, 17)
+ N_ORBIT(i, 15) + N_ORBIT(i, 14);
N_ORBIT(i, 11)-=
N_ORBIT(i, 19) + N_ORBIT(i, 18) + N_ORBIT(i, 13);
N_ORBIT(i, 10)-=
6 * N_ORBIT(i, 19) + 2 * N_ORBIT(i, 18)
+ 4 * N_ORBIT(i, 17) + 2 * N_ORBIT(i, 16)
+ 2 * N_ORBIT(i, 15) + N_ORBIT(i, 14);
N_ORBIT(i, 9)-=
6 * N_ORBIT(i, 19) + 4 * N_ORBIT(i, 18)
+ 2 * N_ORBIT(i, 17) + 2 * N_ORBIT(i, 16)
+ N_ORBIT(i, 14) + 2 * N_ORBIT(i, 13);
N_ORBIT(i, 8)-=
N_ORBIT(i, 19) + N_ORBIT(i, 17) + N_ORBIT(i, 15);
N_ORBIT(i, 7)-=
3 * N_ORBIT(i, 19) + 2 * N_ORBIT(i, 18) + N_ORBIT(i, 17)
+ N_ORBIT(i, 14) + N_ORBIT(i, 13);
N_ORBIT(i, 6)-=
3 * N_ORBIT(i, 19) + N_ORBIT(i, 18) + 3 * N_ORBIT(i, 17)
+ N_ORBIT(i, 16) + 3 * N_ORBIT(i, 15)
+ N_ORBIT(i, 14) + N_ORBIT(i, 12) + N_ORBIT(i, 10)
+ 3 * N_ORBIT(i, 8);
N_ORBIT(i, 5)-=
6 * N_ORBIT(i, 19) + 4 * N_ORBIT(i, 18)
+ 4 * N_ORBIT(i, 17) + 2 * N_ORBIT(i, 16)
+ 2 * N_ORBIT(i, 15) + 3 * N_ORBIT(i, 14)
+ 2 * N_ORBIT(i, 13) + 2 * N_ORBIT(i, 12)
+ N_ORBIT(i, 10) + N_ORBIT(i, 9)
+ 2 * N_ORBIT(i, 7);
N_ORBIT(i, 4)-=
3 * N_ORBIT(i, 19) + 3 * N_ORBIT(i, 18) + N_ORBIT(i, 17)
+ N_ORBIT(i, 16) + N_ORBIT(i, 14)
+ 3 * N_ORBIT(i, 13) + 3 * N_ORBIT(i, 11)
+ N_ORBIT(i, 9) + N_ORBIT(i, 7);
N_ORBIT(i, 3)-=
3 * N_ORBIT(i, 19) + 2 * N_ORBIT(i, 18)
+ 2 * N_ORBIT(i, 17) + 2 * N_ORBIT(i, 16)
+ N_ORBIT(i, 15) + N_ORBIT(i, 14) + N_ORBIT(i, 13)
+ N_ORBIT(i, 10) + N_ORBIT(i, 9);
N_ORBIT(i, 2)-=
3 * N_ORBIT(i, 19) + 2 * N_ORBIT(i, 18)
+ 3 * N_ORBIT(i, 17) + 2 * N_ORBIT(i, 16)
+ 3 * N_ORBIT(i, 15) + 2 * N_ORBIT(i, 14)
+ N_ORBIT(i, 13) + 2 * N_ORBIT(i, 12)
+ 2 * N_ORBIT(i, 10) + N_ORBIT(i, 9)
+ 3 * N_ORBIT(i, 8) + N_ORBIT(i, 7)
+ 2 * N_ORBIT(i, 6) + N_ORBIT(i, 5)
+ N_ORBIT(i, 3);
N_ORBIT(i, 1)-=
3 * N_ORBIT(i, 19) + 3 * N_ORBIT(i, 18)
+ 2 * N_ORBIT(i, 17) + 2 * N_ORBIT(i, 16)
+ N_ORBIT(i, 15) + 2 * N_ORBIT(i, 14)
+ 3 * N_ORBIT(i, 13) + N_ORBIT(i, 12)
+ 3 * N_ORBIT(i, 11) + N_ORBIT(i, 10)
+ 2 * N_ORBIT(i, 9) + 2 * N_ORBIT(i, 7)
+ N_ORBIT(i, 5) + 2 * N_ORBIT(i, 4)
+ N_ORBIT(i, 3);
N_ORBIT(i, 0)-=
N_ORBIT(i, 19) + N_ORBIT(i, 18) + N_ORBIT(i, 17)
+ N_ORBIT(i, 16) + N_ORBIT(i, 15) + N_ORBIT(i, 14)
+ N_ORBIT(i, 13) + N_ORBIT(i, 12) + N_ORBIT(i, 11)
+ N_ORBIT(i, 10) + N_ORBIT(i, 9) + N_ORBIT(i, 8)
+ N_ORBIT(i, 7) + N_ORBIT(i, 6) + N_ORBIT(i, 5)
+ N_ORBIT(i, 4) + N_ORBIT(i, 3) + N_ORBIT(i, 2)
+ N_ORBIT(i, 1);
}
}
} /* namespace oaqc */
oaqc/src/Graph.h
/*
* Graph.h
*
* Created on: 14 Sep 2017
* Author: ortmann
*/
#ifndef GRAPH_H_
#define GRAPH_H_
#include <utility> // std::pair
typedef std::pair< unsigned int, unsigned int> Edge;
namespace oaqc {
class Graph {
public:
Graph(const unsigned int n, const unsigned int m,
const int * const edges);
virtual ~Graph();
Edge* const _edges; // TODO private
inline const unsigned int* const getMapping() const{
return _mapping;
}
inline unsigned int n() const {
return _n;
}
inline unsigned int m() const {
return _m;
}
inline unsigned int firstInEdge(const unsigned int v) const {
return _inOffset[v];
}
inline unsigned int lastInEdge(const unsigned int v) const {
return _outOffset[v];
}
inline unsigned int firstOutEdge(const unsigned int v) const {
return _outOffset[v];
}
inline unsigned int lastOutEdge(const unsigned int v) const {
return _inOffset[v + 1];
}
inline unsigned int opInd(const unsigned int pos) const {
return _edges[pos].first;
}
inline unsigned int edgeInd(const unsigned int pos) const {
return _edges[pos].second;
}
private:
unsigned const int _n;
unsigned const int _m;
unsigned int* const _inOffset;
unsigned int* const _outOffset;
unsigned int* const _mapping;
void createGraph(const int * const edges);
void bucketSort(const int * const edges);
};
} /* namespace oaqc */
#endif /* GRAPH_H_ */
oaqc/src/init.cpp
#include "QuadCensus.h"
#include <fstream>
#include <string>
#include <stdlib.h> // for NULL
#include <Rinternals.h>
#include <R_ext/Rdynload.h>
using namespace std;
using namespace oaqc;
static SEXP c_to_r(const unsigned int* const rMap, const unsigned long * orbit,
unsigned int size, unsigned long orbitCount) {
const unsigned long len = size * orbitCount;
SEXP rvec = PROTECT(allocVector(REALSXP, len));
// set dimensions
SEXP dim = PROTECT(allocVector(INTSXP, 2));
INTEGER(dim)[0] = size;
INTEGER(dim)[1] = orbitCount;
setAttrib(rvec, R_DimSymbol, dim);
// copy (cast and transpose) data
double * data = REAL(rvec);
unsigned int j = 0;
for (unsigned int o = 0; o < orbitCount; ++o){
for (unsigned int i = 0; i < size; ++i) {
unsigned int pos = i;
if (rMap != NULL) {
pos = rMap[i];
}
data[j++] = (double) orbit[pos * orbitCount + o];
}
}
UNPROTECT(2);
return rvec;
}
static void write_to_file(string fName, const unsigned int* const rMap,
const unsigned long * orbit, unsigned int size,
unsigned long orbitCount) {
ofstream writer;
// open the writer
writer.open(fName.c_str());
if (!writer.is_open()) {
throw ios_base::failure("cannot open " + fName);
}
for (unsigned int o = 0; o < orbitCount - 1; ++o) {
writer << "orbit_" << o << ";";
}
writer << "orbit_" << (orbitCount - 1) << std::endl;
for (unsigned int i = 0; i < size; ++i) {
unsigned int pos = i;
if (rMap != NULL) {
pos = rMap[i];
}
for (unsigned int o = 0; o < orbitCount - 1; ++o) {
writer << orbit[pos * orbitCount + o] << ";";
}
writer << orbit[pos * orbitCount + orbitCount - 1] << std::endl;
}
// flush and close
writer.flush();
writer.close();
}
static void write_results(SEXP& a_value, SEXP& a_names, unsigned int& sIndex,
const unsigned int n, const unsigned int m, QuadCensus& qc,
string& filePrefix, string rType) {
if (!filePrefix.empty()) {
write_to_file(filePrefix + "_n_orbits_" + rType + ".csv",
qc.getMapping(), qc.nOrbits(), n, qc.getNOrbitCount());
write_to_file(filePrefix + "_e_orbits_" + rType + ".csv", NULL,
qc.eOrbits(), m, qc.getEOrbitCount());
}
SET_STRING_ELT(a_names, sIndex, mkChar(("n_orbits_" + rType).c_str()));
SET_VECTOR_ELT(a_value, sIndex,
c_to_r(qc.getMapping(), qc.nOrbits(), n, qc.getNOrbitCount()));
++sIndex;
SET_STRING_ELT(a_names, sIndex, mkChar(("e_orbits_" + rType).c_str()));
SET_VECTOR_ELT(a_value, sIndex,
c_to_r(NULL, qc.eOrbits(), m, qc.getEOrbitCount()));
++sIndex;
}
extern "C" SEXP entry(SEXP a_n, SEXP a_edges, SEXP a_freqFlag, SEXP a_file) {
const unsigned int n = INTEGER(a_n)[0];
const unsigned int m = length(a_edges) / 2;
const int * edges = INTEGER(a_edges);
string filePrefix(CHAR(STRING_ELT(a_file, 0)));
const bool wNonIndFreq = LOGICAL(a_freqFlag)[0];
unsigned int res_size;
if (wNonIndFreq) {
res_size = 4;
} else {
res_size = 2;
}
QuadCensus qc(n, m, edges);
SEXP value = PROTECT(allocVector(VECSXP, res_size)); // VECSXP == list
SEXP names = PROTECT(allocVector(STRSXP, res_size));
unsigned int pos = 0;
if (wNonIndFreq) {
write_results(value, names, pos, n, m, qc, filePrefix, "non_ind");
}
qc.calcInducedFrequencies();
write_results(value, names, pos, n, m, qc, filePrefix, "ind");
setAttrib(value, R_NamesSymbol, names);
UNPROTECT(2);
return value;
}
#define CALLDEF(name, n) {#name, (DL_FUNC) &name, n}
static const R_CallMethodDef CallEntries[] = {
CALLDEF(entry, 4), { NULL, NULL, 0 } };
extern "C" void R_init_oaqc(DllInfo *dll) {
R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
R_useDynamicSymbols(dll, FALSE);
R_forceSymbols(dll, TRUE);
}
oaqc/src/QuadCensus.h
/*
* QuadCensus.h
*
* Created on: 14 Sep 2017
* Author: ortmann
*/
#ifndef QUADCENSUS_H_
#define QUADCENSUS_H_
#include "Graph.h"
namespace oaqc {
#define N_ORBIT(nIndex,orbit) _nOrbits[nIndex * _nodeOrbitCount + orbit]
#define E_ORBIT(eIndex,orbit) _eOrbits[eIndex * _edgeOrbitCount + orbit]
class QuadCensus {
public:
QuadCensus(const unsigned int n, const unsigned int m,
const int * const edges);
virtual ~QuadCensus();
const unsigned long getNOrbitCount() const{
return _nodeOrbitCount;
}
const unsigned long getEOrbitCount() const {
return _edgeOrbitCount;
}
const unsigned int* getMapping() const{
return _graph.getMapping();
}
const unsigned long* eOrbits();
const unsigned long* nOrbits();
void calcInducedFrequencies();
private:
const unsigned long _nodeOrbitCount;
const unsigned long _edgeOrbitCount;
unsigned long* _eTriCount;
unsigned long* _nTriCount;
unsigned long* _nNonIndC4Count;
unsigned long* _eNonIndC4Count;
unsigned long* _eOrbits;
unsigned long* _nOrbits;
unsigned long* _neighDeg;
unsigned long long _k3Count;
unsigned long long _2pathCount;
const Graph _graph;
void init();
void clear();
void initCounts();
void calcK3K4C4();
void calcK3RelNonIndCounts();
void calcNonInducedFrequencies();
inline unsigned long choose2(const unsigned long val) {
if (val < 1) {
return 0;
}
return (val * (val - 1)) / 2;
}
inline unsigned long choose3(const unsigned long val) {
if (val < 3) {
return 0;
}
return (val * (val - 1) * (val - 2)) / 6;
}
};
} /* namespace oaqc */
#endif /* QUADCENSUS_H_ */
oaqc/NAMESPACE
# Generated by roxygen2: do not edit by hand
export(oaqc)
useDynLib(oaqc, .registration=T)
oaqc/R/oaqc.R
#' @title Orbit-aware Quad Census computation
#' @docType package
#' @name oaqc
#' @useDynLib oaqc, .registration=T
NULL
#' Coerce graph input.
#'
#' @param graph A matrix, data.frame or graph object.
#' @return Edge list matrix.
as.edge_list <- function(graph) {
if (is.matrix(graph)) {
if (ncol(graph) != 2) {
stop(paste("not coercible to edge list:",
"only matrices with 2 columns supported"))
}
graph
} else if (is.data.frame(graph)) {
if (ncol(graph) != 2) {
stop(paste("not coercible to edge list:",
"only data frames with 2 columns supported"))
}
data.matrix(graph)
} else if (inherits(graph, 'igraph')
&& requireNamespace('igraph', quietly = T)) {
igraph::as_edgelist(graph) - 1 # igraph is 1-based
} else {
stop(paste("unrecognized graph type:",
"use matrix/data frame with 2 columns or igraph objects"))
}
}
#' Annotates the igraph object with orbit labels.
#'
#' @param graph Unmodified input graph.
#' @param orbits List with n_orbits, e_orbits matrices.
#' @param non_ind_freq A flag indicating whether non-induced frequencies have to be written or not.
#' @return \code{orbits} if the input is not an igraph, the annotated igraph
#' instead.
annotate_result <- function(graph, orbits, non_ind_freq) {
if (inherits(graph, 'igraph') && requireNamespace('igraph', quietly = T)) {
for (i in 1:ncol(orbits$n_orbits_ind)) {
graph <- igraph::set_vertex_attr(graph,
paste("orbit_ind", i, sep = "_"),
value = orbits$n_orbits_ind[, i])
}
for (i in 1:ncol(orbits$e_orbits_ind)) {
graph <- igraph::set_edge_attr(graph,
paste("orbit_ind", i, sep = "_"),
value = orbits$e_orbits_ind[, i])
}
if (non_ind_freq){
for (i in 1:ncol(orbits$`n_orbits_non_ind`)) {
graph <- igraph::set_vertex_attr(graph,
paste("orbit_non_ind", i, sep = "_"),
value = orbits$`n_orbits_non_ind`[, i])
}
for (i in 1:ncol(orbits$`e_orbits_non_ind`)) {
graph <- igraph::set_edge_attr(graph,
paste("orbit_non_ind", i, sep = "_"),
value = orbits$`e_orbits_non_ind`[, i])
}
}
return(graph)
}
return(orbits)
}
#' Calculates the orbit-aware quad census on an edge and node level, see
#' \code{vignette('oaqc')}.
#'
#' @param graph A matrix, data.frame or graph object.
#' @param non_ind_freq A flag indicating whether non-induced frequencies have to be returned or not.
#' @param file Name (and location) of the file to be written.
#' @return orbit-aware quad census on a node and edge level. Consult
#' \code{vignette('oaqc')} to see the correspondence between orbit and quad.
#' @examples
#' k4 <- data.frame(
#' source = c(0, 0, 0, 1, 1, 2),
#' target = c(1, 2, 3, 2, 3, 3)
#' )
#'
#' k4orbits <- oaqc(k4, non_ind_freq = TRUE)
#' print(k4orbits)
#' @export
oaqc <- function(graph, non_ind_freq=F, file="") {
edges <- as.edge_list(graph)
orbits <- .Call(entry, as.integer(max(edges) + 1), as.integer(edges),
as.logical(non_ind_freq), as.character(file))
return(annotate_result(graph, orbits, as.logical(non_ind_freq)))
}
oaqc/vignettes/oaqc.Rmd
---
title: "Introduction to oaqc"
author: "Mark Ortmann"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{oaqc}
%\VignetteEngine{knitr::rmarkdown}
\usepackage[utf8]{inputenc}
---
# Introduction to oaqc
This package provides an efficient algorithm to calculate the **o**rbit-**a**ware **q**uad **c**ensus of a given graph, that is, the frequency distribution of all induced and non-induced non-isomorphic four-node subgraphs (quads) on a node and edge level; see the figure below for the relation between orbits and quads.

## Input
The input can either be
* Edge-List (matrix or data.frame)
* Graph Object ('igraph')
Regardless of the input format, the graph must not contain loops or multi-edges, and the vertex indices have to lie in the range $[0,n-1]$, with $n$ denoting the number of vertices in the graph. Note that if the smallest index is $1$, the algorithm will create an isolated vertex with index $0$.
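The note on the smallest index follows from how the vertex count is derived: the R wrapper passes the largest index plus one as $n$, so a 1-based edge list yields an extra vertex 0 without any edges. A minimal C++ sketch of that effect, assuming non-negative indices and two parallel endpoint vectors of equal length; `degrees` is a hypothetical helper, not part of the package.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of oaqc): computes the degree of every vertex
// 0..n-1, where n is taken as (largest index) + 1.
// Assumes non-negative indices and vectors of equal length.
std::vector<unsigned int> degrees(const std::vector<int>& source,
                                  const std::vector<int>& target) {
    int maxIndex = -1;
    for (std::size_t i = 0; i < source.size(); ++i) {
        maxIndex = std::max(maxIndex, std::max(source[i], target[i]));
    }
    // n = maxIndex + 1 vertices, all starting with degree zero
    std::vector<unsigned int> deg(static_cast<std::size_t>(maxIndex + 1), 0u);
    for (std::size_t i = 0; i < source.size(); ++i) {
        ++deg[source[i]];
        ++deg[target[i]];
    }
    return deg;
}
```

For a triangle given with 1-based indices, `{1, 1, 2}` and `{2, 3, 3}`, the result has four entries and the entry for vertex 0 is zero, so the census is computed on a triangle plus one isolated vertex.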
## Calculating the orbit-aware quad census
The following code exemplifies the use of this package.
```{r}
library(oaqc)
### k4, pure R
k4 <- data.frame(
source = c(0, 0, 0, 1, 1, 2),
target = c(1, 2, 3, 2, 3, 3)
)
k4orbits <- oaqc(k4, non_ind_freq = F, file = "")
# print(k4orbits)
```
To calculate the non-induced frequencies as well, set `non_ind_freq` to `TRUE`.
Because the orbit-aware frequencies can be very large integers, which R cannot always represent exactly, the results can also be written directly to a file via the `file` argument.
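Judging from the package's C++ source, the `file` value acts as a prefix: node and edge results go to `<prefix>_n_orbits_<type>.csv` and `<prefix>_e_orbits_<type>.csv`, with `<type>` being `ind` or `non_ind`, and each file holds a semicolon-separated table with a header row `orbit_0;orbit_1;...`. The sketch below reproduces only that table layout; `formatOrbitTable` is a hypothetical name, and the sketch ignores the internal vertex relabeling that the package applies to node rows.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of oaqc): formats a row-major table of counts
// (one row per node or edge, one column per orbit) as semicolon-separated
// text with an "orbit_<k>" header row.
std::string formatOrbitTable(const std::vector<unsigned long>& counts,
                             std::size_t rows, std::size_t orbitCount) {
    std::ostringstream out;
    // header row: orbit_0;orbit_1;...;orbit_<orbitCount-1>
    for (std::size_t o = 0; o < orbitCount; ++o) {
        out << "orbit_" << o << (o + 1 < orbitCount ? ";" : "\n");
    }
    // one line per node or edge, values separated by ';'
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t o = 0; o < orbitCount; ++o) {
            out << counts[r * orbitCount + o]
                << (o + 1 < orbitCount ? ";" : "\n");
        }
    }
    return out.str();
}
```

Because the counts are written as integers, values beyond the exact range of an R numeric keep all their digits in the file.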
## Result
For example, the induced frequencies of the nodes in orbit 10 can be accessed as follows.
```{r}
print(k4orbits$n_orbits_ind[,10])
```
oaqc/vignettes/quad_census.svg
[Figure: the eleven quads co-K4, co-diamond, co-C4, co-paw, co-claw, P4, claw, paw, C4, diamond, and K4, with node orbits labeled 1 to 20 and edge orbits labeled 1 to 14.]
Figure: All non-isomorphic subgraphs with four nodes (quads). Node and edge labels refer to the orbits (the different automorphism classes in each quad) and are enumerated such that each orbit is identified with a single quad.
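The orbit labels in the figure start at 1, whereas the C++ core indexes orbits from 0 and stores one row per node or edge; the matrix returned to R is column-major, so figure label $k$ corresponds to column $k$ in R and to orbit index $k-1$ in C++. The following sketch illustrates that relation under these assumptions; `toColumnMajor` and `columnMajorIndex` are hypothetical names, not package functions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of oaqc): transposes a row-major table
// (rows = nodes or edges, cols = orbits) into column-major order and converts
// the counts to double, as an R numeric matrix would store them.
std::vector<double> toColumnMajor(const std::vector<unsigned long>& rowMajor,
                                  std::size_t rows, std::size_t cols) {
    std::vector<double> out(rows * cols);
    std::size_t j = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            out[j++] = static_cast<double>(rowMajor[r * cols + c]);
        }
    }
    return out;
}

// Position of the 1-based figure label `orbitLabel` for item `row` (0-based)
// in the column-major result.
std::size_t columnMajorIndex(std::size_t row, std::size_t orbitLabel,
                             std::size_t rows) {
    return (orbitLabel - 1) * rows + row;
}
```

For K4, every node lies in exactly one induced quad, the K4 itself, so in a 4 x 20 node table only the entries for label 20 are 1, which matches the expectation encoded in the package's test script.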