Modules

Simulation Module

Diffusion of innovation simulation.

This simulation models the processes discussed by Abrahamson and Rosenkopf in [AR1997] and [RA1999].

Author: Christopher Kirkos
Date: 04/19/2011
Version: 0.1

Implementation

disim.disim.fullRegressionAnalysis(outFilePath, expTrialLogOutfile, trickleDirection)[source]

Perform regression analysis on all the identified combinations of values from the [AR1997] paper (With/without boundary conditions, all/high/low network densities, full/limited to 185 peripheral ties).

This function should effectively run the necessary regression analyses to regenerate all analysis tables presented in the paper.

disim.disim.loadCaseLog(expCaseLogOutfilePath)[source]

Regenerate experiment case log structure from output log file.

disim.disim.parseCommandLine()[source]
Available commands:
simulate
    -d, --direction=up/down/both
    -n, --nodes=<integer>
    -t, --trials=<integer>
    -D, --dots=all,wpp
    -P, --pngs=all,wpp
plotstats
    -i, --input-file=caseLogFile.csv
plotnetwork
    -i, --input-file=dotfile.dot
Global options:
    Output directory: -o, --output-dir

simulate runs the simulation.
plotstats takes a case log file (CSV) and produces a graph file (PNG).
plotnetwork takes a DOT file and produces a network visualization (PNG).

disim.disim.run1997ThresholdModel(trickleDirection='down', numberOfNodes=31, trials=100, cpRatio=0.33333333333333331, outFilePath='/home/prima/Development/tmp/disim/out', dots='none', pngs='none')[source]

Runs the initial threshold model from [AR1997]

Parameters:
  • trickleDirection (str) – The direction of trickle simulation. This decides whether the seed adopter is in the core or periphery for the simulation.
  • numberOfNodes (int) – The number of nodes in the generated network.
  • trials (int) – The number of trials to run for each unique set of parameters (Periphery ties, Ai).
  • cpRatio (float) – Ratio of the number of nodes in the core to nodes in the periphery.
  • outFilePath (str) – The path where output files and sub-directories are created.
  • dots (str) – Condition to output DOT files of the influence networks. Possible values are “all”, “wpp”, and “none”. “all” outputs files for each trial, “wpp” for only trials whose resulting graphs have boundary conditions, and “none” for no output.
  • pngs (str) – Condition to output PNG files of the influence networks (same values/conditions as dots argument).

Note

“For each case, we ran 100 trials and calculated the average number of adopters in the focal and non-focal strata” ([AR1997] p. 298)

Graph Generation Module

Module to generate graph structures.

Author: Christopher Kirkos

Implementation

class disim.graphgen.DICorePeriphNxGenerator(numCoreNodes, numPeriphNodes, pties, seed=None, *args, **kwargs)[source]

Generates a core-periphery network like the one discussed in [AR1997] using NetworkX [HSS2008].

“First set of simulations ... core-periphery networks with fully-linked cores ... network density was varied by varying the number of network ties beyond the core” ([AR1997] p. 297).

The core of the network is completely connected and the edges with/between the periphery are generated randomly.

next()[source]
class disim.graphgen.DINetworkGenerator(n)[source]

Abstract base class for DISim network generators. These objects are iterators that return graph objects on each iteration.

next()[source]
disim.graphgen.dissimilarProduct(A, B)[source]

Generator for all combinations of 2 lists where the items are not the same.
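
The behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual source: it assumes "combinations" means ordered pairs from the Cartesian product of the two lists, and the name dissimilar_product is hypothetical.

```python
from itertools import product


def dissimilar_product(a, b):
    """Yield each pair (x, y), x from a and y from b, skipping pairs where x == y.

    Assumption: pairs are ordered, so (2, 3) and (3, 2) are both produced
    when both values occur in both lists.
    """
    for x, y in product(a, b):
        if x != y:
            yield (x, y)


pairs = list(dissimilar_product([1, 2, 3], [2, 3]))
print(pairs)
```

Because it is a generator, pairs are produced lazily, which matters when both lists are the node lists of a large network.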

disim.graphgen.drawAdoptionNetworkGV(G, writeFile=None, writePng=None)[source]

Generates the GraphViz adoption network. Optionally writes the output to DOT and/or PNG files.

This function expects the node attributes ‘adopted’ and ‘influence’ to be pre-populated. If the ‘adopted’ attribute is True, the node is given a different color, showing adoption visually. If the ‘influence’ attribute contains a list of nodes that played a role in a given node’s adoption, then the edges from those nodes to the given target node are highlighted.

Parameters:
  • G (networkx.MultiGraph) – The networkx MultiGraph object. This object should be pre-populated with node attributes. A MultiGraph is needed because this function augments the graph with additional edges to represent information/influence flow visually.
  • writeFile (str) – The filename/path to which to save the DOT file.
  • writePng (str) – The filename/path to which to save the PNG file.
Returns:

pygraphviz.AGraph object augmented with style attributes.

disim.graphgen.drawAdoptionNetworkMPL(G, fnum=1, show=False, writeFile=None)[source]

Draws the network to matplotlib, coloring the nodes based on adoption. Looks for the node attribute ‘adopted’. If the attribute is True, colors the node a different color, showing adoption visually. This function assumes that the node attributes have been pre-populated.

Parameters:
  • G (networkx.Graph) – Any NetworkX Graph object.
  • fnum (int) – The matplotlib figure number. Defaults to 1.
  • show (bool) – Whether to display the figure after drawing.
  • writeFile (str) – A filename/path to save the figure image. If not specified, no output file is written.
disim.graphgen.generateARCorePeriph(numCoreNodes, numPeriphNodes, pties, show=False)[source]

Generates a core-periphery network like the one discussed in [AR1997] using NetworkX [HSS2008].

“First set of simulations ... core-periphery networks with fully-linked cores ... network density was varied by varying the number of network ties beyond the core” ([AR1997] p. 297).

Parameters:
  • numCoreNodes (int) – The number of nodes in the Core (>0).
  • numPeriphNodes (int) – The number of nodes in the Periphery (>0).
  • pties (int) – The number of additional ties to generate in the periphery.
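
A rough sketch of the construction described above, using only the standard library rather than NetworkX. It assumes that nodes are numbered with the core first, that the additional ties are sampled uniformly without replacement from all pairs involving at least one peripheral node, and that the graph is represented as a set of edge tuples. The function name and return shape are hypothetical.

```python
import random
from itertools import combinations


def generate_core_periphery(num_core, num_periph, pties, seed=None):
    """Return (core, periphery, edges) for a core-periphery network.

    The core is fully linked; `pties` extra ties are drawn at random from
    the periphery-periphery and periphery-core pairs (an assumption).
    """
    rng = random.Random(seed)
    core = list(range(num_core))
    periph = list(range(num_core, num_core + num_periph))
    core_set = set(core)

    # Fully linked core: every pair of core nodes is connected.
    edges = set(combinations(core, 2))

    # Candidate ties: every pair that is not a core-core pair.
    candidates = [
        (u, v)
        for u, v in combinations(core + periph, 2)
        if not (u in core_set and v in core_set)
    ]
    if pties > len(candidates):
        raise ValueError("pties exceeds the number of possible peripheral ties")

    edges.update(rng.sample(candidates, pties))
    return core, periph, edges


core, periph, edges = generate_core_periphery(4, 6, 5, seed=1)
print(len(edges))  # 6 core ties plus 5 peripheral ties
```

Passing a seed makes the random ties reproducible across runs, which helps when comparing trials.
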
disim.graphgen.setDefaultNodeAttrs(G)[source]

Helper function to set default attributes on a new network. This function modifies the graph itself.

Parameters:G (networkx.Graph) – A networkx Graph object.

Graph Search Module

Created on May 4, 2011

Author: Christopher Kirkos

Theory

Boundary Pressure Points

“In sum, we define a boundary pressure point as a concentration of social ties linking potential adopters of an innovation in one segment of a network to a potential adopter in another segment of that network.” ([AR1997] p. 300).

“... in the second set of simulations, we operationalized boundary pressure points by counting each non-focal potential adopter that communicates with at least half of the focal potential adopters. We also tried proportions other than one half, and the results did not differ substantially.” ([AR1997] p. 300).
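
The operationalization quoted above can be sketched as a simple count over an adjacency mapping. This is an illustrative sketch only: the dict-of-sets graph representation and the function name are assumptions, and the module itself works on NetworkX graphs.

```python
def find_pressure_points(adjacency, focal, non_focal, proportion=0.5):
    """Return the non-focal nodes tied to at least `proportion` of the focal nodes.

    `adjacency` maps each node to the set of its neighbours (assumed format).
    """
    focal = set(focal)
    needed = proportion * len(focal)
    return [
        node
        for node in sorted(non_focal)
        if len(adjacency.get(node, set()) & focal) >= needed
    ]


# Four focal nodes (0-3) and three non-focal nodes (4-6).
adjacency = {
    4: {0, 1},      # tied to 2 of 4 focal nodes: qualifies
    5: {0},         # tied to 1 of 4 focal nodes: does not qualify
    6: {1, 2, 3},   # tied to 3 of 4 focal nodes: qualifies
}
points = find_pressure_points(adjacency, focal=[0, 1, 2, 3], non_focal=[4, 5, 6])
print(points)
```

The proportion argument mirrors the authors' remark that proportions other than one half were also tried.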

Boundary Weaknesses

“...we operationalized boundary weaknesses by counting each non-focal potential adopter that satisfied two conditions: the potential adopter had to communicate with a focal potential adopter, and it had to have assessed profits high enough such that one adoption would create enough impetus for this potential adopter to adopt” ([AR1997] p. 300).

“A boundary weakness occurs ... when potential adopter F both has ties bridging two sides of a boundary and has a low adoption threshold. A single adoption can cause such a weakly linked potential adopter to adopt...” ([AR1997] p. 300).

“In sum, we define a boundary weakness as a social tie linking a potential adopter of an innovation in one segment of a network to a potential adopter, in another segment of that network, who is highly predisposed to adopting this innovation.” ([AR1997] p. 300).

Implementation

class disim.graphsearch.FalseFilter(*args, **kwargs)[source]

Always returns False when called (all graphs are invalid).

class disim.graphsearch.GraphFilter[source]

An abstract base class defining the structure of GraphFilter objects.

GraphFilter objects act as callable filters for graphs. They are intended to restrict the output of large simulations to only those graphs with properties of interest.

The class instance itself is callable (acts like a function)
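
The callable-filter pattern might look roughly like the sketch below. It is not the module's source: the use of abc, the representation of a graph as a set of edges, and the MinEdgesFilter class are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class GraphFilter(ABC):
    """Base class: instances are called with a graph and return True to keep it."""

    @abstractmethod
    def __call__(self, graph):
        raise NotImplementedError


class TrueFilter(GraphFilter):
    def __call__(self, graph):
        return True  # every graph is valid


class FalseFilter(GraphFilter):
    def __call__(self, graph):
        return False  # no graph is valid


class MinEdgesFilter(GraphFilter):
    """Hypothetical filter: keep graphs with at least `min_edges` edges."""

    def __init__(self, min_edges):
        self.min_edges = min_edges

    def __call__(self, graph):
        return len(graph) >= self.min_edges


# Graphs are represented here as plain sets of edges (an assumption).
graphs = [{(0, 1)}, {(0, 1), (1, 2)}, {(0, 1), (1, 2), (0, 2)}]
keep = MinEdgesFilter(2)
kept = [g for g in graphs if keep(g)]
print(len(kept))
```

Because a filter instance behaves like a function, simulation code can accept any filter without knowing which concrete class it received.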

class disim.graphsearch.TrueFilter(*args, **kwargs)[source]

Always returns True when called (all graphs are valid).

class disim.graphsearch.WPPFilter(weaknessThresh=1, pressurePointThresh=1, targetSegment='periphery', *args, **kwargs)[source]

A GraphFilter that includes only those graphs that have boundary weaknesses or pressure points over a given threshold (number of occurrences in the graph > specified threshold).

disim.graphsearch.clearWPPCache()[source]

Delete all entries in Weaknesses and Pressure Points Cache.

disim.graphsearch.findWeaknessesAndPressurePoints(G, proportion=0.5, targetSegment='periphery', addGraphAttrs=True, ignoreCache=False)[source]

Searches a graph for nodes that match the conditions for boundary weaknesses and boundary pressure points as given by [AR1997].

Parameters:
  • G (networkx.Graph) – The networkx Graph object representing the network
  • A_i (int) – The ambiguity level for this simulation to calculate potential bandwagon pressure. For boundary weakness calculation.
  • proportion (float) – The proportion of nodes from an alternate segment B that are required to neighbor a given target node from segment A (where A and B are disjoint, i.e. A ∩ B = ∅). For pressure point calculation.
  • targetSegment (str) – The attribute of the Graph G that identifies the segment from which to determine weaknesses and pressure points. Specify periphery for trickle down diffusion and core for trickle up diffusion.
  • addGraphAttrs (bool) – Augments the Graph G with node attributes representing weaknesses and pressure points.
  • ignoreCache (bool) – Cause function to recalculate for the given Graph G and update the cache accordingly.
Returns:

A tuple of 2 lists: the first contains the node IDs that were identified as boundary weaknesses, and the second contains the node IDs of pressure points.

Todo

See if there’s a way to automatically determine graph ‘dirtiness’ for cache invalidation.

Plotting Module

Created on Apr 29, 2011

Plotting for DISim.

Author: Christopher Kirkos
disim.plotting.createCoreDiffusionPlot(experimentCaseLog, outFilePath, plotTitle=None)[source]

Function to generate the Core Diffusion vs Peripheral Density plot. (Basically a clone of createPeripheralDiffusionPlot)

disim.plotting.createPeripheralDiffusionPlot(experimentCaseLog, outFilePath, plotTitle=None)[source]

Function to generate the Peripheral Diffusion vs Peripheral Density plot.

disim.plotting.marker_cycle()[source]

Return an infinite, cycling iterator over the available marker symbols.

This is wrapped in a function to make sure that you get a new iterator that starts at the beginning every time you request one. This function is meant for use with Matplotlib.
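
One plausible way to build such an iterator is with itertools.cycle. The marker tuple below is an assumption (the symbols are valid Matplotlib markers, but the module's actual list is not shown in this documentation).

```python
from itertools import cycle, islice

# Assumed marker set; each symbol is a valid Matplotlib marker code.
MARKERS = ("o", "s", "^", "v", "D", "x", "+", "*")


def marker_cycle():
    """Return a fresh, endlessly repeating iterator over MARKERS."""
    return cycle(MARKERS)


first = list(islice(marker_cycle(), 3))
again = list(islice(marker_cycle(), 3))
print(first, again)
```

Each call returns a new iterator, so two plots that each request one both begin with the same first marker.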

Statistics Module

Created on May 2, 2011

Statistic generation for DISim

Author: Christopher Kirkos
disim.stats.calcNxDensity(pties, coreNodes, periphNodes)[source]

Calculate the density of a network with the given number of peripheral ties, core nodes, and peripheral nodes

disim.stats.calcPerpipheralDensity(pties, coreNodes, periphNodes)[source]

Calculate peripheral density, which is the number of ties in the periphery divided by the number of possible ties in the periphery.

All input values are scalars, not arrays/matrices.

Parameters:
  • pties (integer) – The number of ties in the periphery.
  • coreNodes (integer) – The number of nodes in the core.
  • periphNodes (integer) – The number of nodes in the periphery.
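
The calculation can be sketched as below. It assumes that "possible ties in the periphery" follows the definition given for possibleTies in this module (all possible ties minus the core-core ties); the function name is hypothetical.

```python
def peripheral_density(pties, core_nodes, periph_nodes):
    """Ratio of existing peripheral ties to possible peripheral ties.

    Assumption: possible peripheral ties = all possible ties in the network
    minus the possible ties among core nodes.
    """
    n = core_nodes + periph_nodes
    total_possible = n * (n - 1) // 2
    core_possible = core_nodes * (core_nodes - 1) // 2
    periph_possible = total_possible - core_possible
    if periph_possible == 0:
        raise ValueError("network has no possible peripheral ties")
    return pties / periph_possible


# 4 core and 6 peripheral nodes: 45 possible ties, 6 of them core-core.
print(peripheral_density(13, 4, 6))
```
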
disim.stats.minmax(A, newMin=0.0, newMax=1.0)[source]

Min-Max normalization.

Parameters:
  • A (numpy.Array) – A 1D numpy Array object.
  • newMin (float) – The new minimum value.
  • newMax (float) – The new maximum value.
Returns:

A copy/view of the input Array with values normalized.

Return type:

numpy.Array
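
Min-max normalization rescales values linearly so that the smallest maps to newMin and the largest to newMax. The sketch below uses plain Python lists instead of numpy arrays to stay dependency-free; the function name and the error raised for constant input are assumptions.

```python
def minmax_list(values, new_min=0.0, new_max=1.0):
    """Linearly rescale `values` so that min -> new_min and max -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize constant data")
    scale = (new_max - new_min) / (hi - lo)
    return [(v - lo) * scale + new_min for v in values]


print(minmax_list([2, 4, 6]))
print(minmax_list([2, 4, 6], new_min=-1.0, new_max=1.0))
```
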

disim.stats.optimizedCalcNxDensity(pt, cn, pn)[source]

An optimized version of calcNxDensity

disim.stats.optimizedCalcPeriphDensity(pt, cn, pn)[source]

An optimized version of calcPeripheralDensity using the numexpr module.

All input parameters are numpy Arrays.

disim.stats.optimizedMinmax(A, newMin=0.0, newMax=1.0)[source]

An optimized version of minmax using numexpr.

disim.stats.possibleTies(numberOfNodes, numCoreNodes)[source]

Calculates the max number of ties for a network with given node statistics.

Total undirected simple edges = (n*(n-1))/2.

Total peripheral ties = ties between nodes in the periphery and themselves, AND ties between nodes in the periphery and nodes in the core.

Total peripheral ties is calculated as the total possible ties in the network minus the total possible ties between core nodes.

Parameters:
  • numberOfNodes (integer) – The number of nodes in the network.
  • numCoreNodes (integer) – The number of nodes in the core.
Returns:

Tuple of the number of total possible ties in the network, the total possible core ties, and the total possible peripheral ties.
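
The arithmetic above can be written out directly. This is a sketch under the stated formulas, with a hypothetical function name.

```python
def possible_ties(number_of_nodes, num_core_nodes):
    """Return (total, core, peripheral) possible undirected simple ties."""
    total = number_of_nodes * (number_of_nodes - 1) // 2
    core = num_core_nodes * (num_core_nodes - 1) // 2
    peripheral = total - core
    return total, core, peripheral


# 10 nodes with a 4-node core leaves 6 peripheral nodes:
# periphery-periphery ties = 6*5/2 = 15, periphery-core ties = 6*4 = 24.
print(possible_ties(10, 4))
```

The peripheral count agrees with summing the two kinds of peripheral tie separately (15 + 24 = 39 in this example).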

Todo

Write latex set notation to represent this calculation in the docs.

disim.stats.runOLSRegression1997(expTrialLogFilePath, trickleDirection='down', peripheralTieRange=(0, 185), densityRange=None, withBoundaryAnalysis=False, outFilePath=None)[source]

Perform Ordinary Least Squares (OLS) regression on the trial output data to determine which parameters have the most effect on the extent of peripheral diffusion.

Parameters:
  • expTrialLogFilePath (str) – The full path to the experiment’s trial log file.
  • trickleDirection (str) – Either “up” or “down”. If “up”, core diffusion is selected as the dependent variable in the regression analysis. If “down”, peripheral diffusion is selected as the dependent variable.
  • withBoundaryAnalysis (boolean) – Whether or not to include the boundary weaknesses and pressure points in the multiple regression.
  • peripheralTieRange (tuple) – Limit the regression analysis to the records where the number of peripheral ties is within the specified inclusive range (min, max). If None is specified, then the range defaults to the entire range of the sample (no restriction).
  • densityRange (tuple) – Limit regression analysis to the records where the peripheral density is within the specified inclusive range (e.g. (0, 0.5)). If None is specified, then no density restriction is placed on the records.

Note

The original authors only simulated with the number of peripheral ties within range [0,185]. In their second set of simulations, they performed regressions on the same simulation but split the data set by density of peripheral ties. They split it in half with the first set having density <=0.5 and the second set with records having density >0.5. This is the reasoning behind the peripheralTieRange and densityRange function parameters.

disim.stats.standardizeCoeff(A, sample=True)[source]

Element-wise, subtract the mean of the values from each value and divide by the standard deviation.

Parameters:
  • A (numpy.Array) – A numpy array.
  • sample (bool) – Whether the Array A represents a sample or population (True if sample).
Returns:

A copy/view of the input Array with values standardized.

Return type:

numpy.Array
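
Standardization of this kind is the usual z-score transform. The sketch below uses the standard library's statistics module on a plain list instead of a numpy array; the function name is hypothetical, and the sample flag is assumed to select between the sample (n-1) and population (n) standard deviation.

```python
import statistics


def standardize(values, sample=True):
    """Subtract the mean from each value and divide by the standard deviation."""
    mean = statistics.mean(values)
    # stdev uses the n-1 denominator (sample); pstdev uses n (population).
    sd = statistics.stdev(values) if sample else statistics.pstdev(values)
    if sd == 0:
        raise ValueError("standard deviation is zero")
    return [(v - mean) / sd for v in values]


z_sample = standardize([1, 2, 3], sample=True)
z_population = standardize([1, 2, 3], sample=False)
print(z_sample, z_population)
```
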

Data Module

A clone of Robert Axtell’s Data.java.

class disim.data.Data[source]

An object that maintains running mean, variance/standard deviation, min, max, and number of samples.

addDatum(val)[source]

Add a datum, updating the running statistics.

average[source]

The average of the data (accessed as a property).

stdDev[source]

The standard deviation of the data (accessed as a property).

variance[source]

The variance of the data (accessed as a property).
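
A running-statistics object of this kind can be sketched with Welford's online algorithm. This technique is swapped in for illustration and may differ from how Data.java or this module accumulates its sums. The class name, the snake_case method names, and the choice of sample (n-1) variance are assumptions.

```python
import math


class RunningStats:
    """Tracks count, mean, variance, min, and max without storing the samples."""

    def __init__(self):
        self.n = 0
        self._mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean
        self.min = math.inf
        self.max = -math.inf

    def add_datum(self, val):
        """Fold one value into the running statistics (Welford's update)."""
        self.n += 1
        delta = val - self._mean
        self._mean += delta / self.n
        self._m2 += delta * (val - self._mean)
        self.min = min(self.min, val)
        self.max = max(self.max, val)

    @property
    def average(self):
        return self._mean

    @property
    def variance(self):
        # Sample variance (assumption); defined as 0.0 for fewer than 2 samples.
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0

    @property
    def std_dev(self):
        return math.sqrt(self.variance)


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add_datum(value)
print(stats.average, stats.variance, stats.min, stats.max)
```

Exposing average, variance, and the standard deviation as properties matches the attribute-style access described above.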