DATATHON CHALLENGE

Context and description

The study of citation dynamics in scientific production enables the exploration of cognitive networks through data visualization techniques involving large datasets (Brasil Jr. and Carvalho, 2020). This makes it possible to map, through the authors cited, named, and referenced, different ways of knowing, thinking, and mobilizing sociocultural meanings.

This tutorial proposes a property graph-based approach to cultural analytics that operationalizes the notion of intellectual reciprocity: the mutual acknowledgment, citation, and influence among thinkers, especially across gendered, racialized, and geopolitical divides. We focus on Wikipedia as a primary data source, due to its global reach, semi-structured format, and crowdsourced dynamics of inclusion and omission.

Material

  • Dataset: intellectuals_dictatorships.csv – curated list of Latin American and Eastern European intellectuals (with gender, region, field, and interactions)
  • Graph file: intellectuals_network_enriched_styled.gexf – styled graph for Gephi exploration
  • Notebook: intellectuals_graph_tutorial.ipynb – Jupyter notebook covering all steps from data import to graph export

Tools and Software

  • Python 3.8+ – best run in a virtual environment or a Jupyter environment
  • Required Python libraries: pandas, networkx, matplotlib, random (standard library), and the community module from networkx.algorithms
  • Colab / Jupyter environment
  • Gephi (latest version) – for interactive graph visualization, community exploration, and filtering by timeline or influence
  • spaCy or nltk for natural language enrichment
  • Wikipedia API for extending the dataset

Challenge

Mapping Epistemic Reciprocity and Violence: A Graph Analytics Challenge for Intellectual Histories in Latin America

Participants will construct and explore property graphs representing intellectual relationships among scholars, artists, and activists active during Latin America’s dictatorship periods. They will compute reciprocity, visibility, and absence metrics to identify communities of practice, assess mutual recognition, and surface patterns of epistemic violence related to gender, geography, and disciplinary power.

Challenge Goals

Participants are expected to:

  • Develop a nuanced graph that models epistemic interactions
  • Identify forms of mutual recognition, asymmetry, and exclusion
  • Quantify and visualize epistemic violence
  • Reflect critically on the biases of data structures, sources, and visibility

Task 1: Construct a Property Graph

Build a directed, annotated property graph where:

  • Nodes represent intellectuals, institutions, and concepts
  • Edges represent influence, co-citation, collaboration, or translation
  • Node properties include gender, location, discipline, and time period
  • Tools: networkx, pandas, neo4j (optional)
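
A minimal sketch of how such a graph might be assembled with networkx. The column names (name, gender, region, field, period, cites), the semicolon-separated "cites" column, and the three placeholder rows are assumptions made for illustration; the actual layout of intellectuals_dictatorships.csv may differ. The standard-library csv module is used here to keep the sketch small, and pandas.read_csv could load the same rows.

```python
import csv
import io

import networkx as nx

# Hypothetical sample rows; the real columns of
# intellectuals_dictatorships.csv may be named differently.
SAMPLE_CSV = """name,gender,region,field,period,cites
Ana,F,Latin America,Sociology,1970s,Bruno;Carla
Bruno,M,Latin America,Philosophy,1970s,Ana
Carla,F,Eastern Europe,Literature,1980s,
"""


def build_graph(csv_text):
    """Build a directed property graph from CSV text."""
    graph = nx.DiGraph()
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # First pass: one node per intellectual, properties stored as attributes.
    for row in rows:
        graph.add_node(
            row["name"],
            kind="intellectual",
            gender=row["gender"],
            region=row["region"],
            discipline=row["field"],
            period=row["period"],
        )
    # Second pass: one typed, directed edge per listed interaction.
    for row in rows:
        for target in filter(None, row["cites"].split(";")):
            graph.add_edge(row["name"], target, relation="influence")
    return graph


G = build_graph(SAMPLE_CSV)
print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
```

Institutions and concepts could be added the same way with a different "kind" value, and other edge types (co-citation, collaboration, translation) with a different "relation" value.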

Task 2: Compute Reciprocity and Visibility Metrics

Implement and interpret:

  • Reciprocity Index: proportion of bidirectional relationships
  • Visibility Ratio: degree centrality comparison across gender or region
  • Absence Score: quantify structurally missing links (e.g., Latin American women not cited by regional peers)
  • Betweenness Centrality & Community Detection: identify key knowledge brokers and epistemic clusters
  • Suggested formulas and templates will be provided.
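
The metrics above could be computed roughly as follows. This is a sketch on a four-node toy graph, not the official templates: the visibility ratio is taken here as the ratio of mean in-degree centrality between two groups, and the absence score as the share of possible directed links from one group to another that are missing. Both definitions, and the helper names visibility_ratio and absence_score, are assumptions made for illustration.

```python
import networkx as nx

# Toy directed graph with placeholder nodes and a gender property.
G = nx.DiGraph()
G.add_nodes_from([
    ("A", {"gender": "F"}),
    ("B", {"gender": "M"}),
    ("C", {"gender": "F"}),
    ("D", {"gender": "M"}),
])
G.add_edges_from([("A", "B"), ("B", "A"), ("B", "D"), ("D", "B"), ("C", "B")])


def nodes_in_group(graph, attr, group):
    """Return the nodes whose attribute `attr` equals `group`."""
    return [n for n, data in graph.nodes(data=True) if data.get(attr) == group]


def visibility_ratio(graph, attr, group_a, group_b):
    """Mean in-degree centrality of group_a divided by that of group_b."""
    centrality = nx.in_degree_centrality(graph)

    def mean_for(group):
        values = [centrality[n] for n in nodes_in_group(graph, attr, group)]
        return sum(values) / len(values) if values else 0.0

    denominator = mean_for(group_b)
    return mean_for(group_a) / denominator if denominator else float("nan")


def absence_score(graph, attr, source_group, target_group):
    """Share of possible source -> target links that do not exist."""
    sources = nodes_in_group(graph, attr, source_group)
    targets = nodes_in_group(graph, attr, target_group)
    possible = [(s, t) for s in sources for t in targets if s != t]
    if not possible:
        return 0.0
    missing = sum(1 for s, t in possible if not graph.has_edge(s, t))
    return missing / len(possible)


# Reciprocity index: fraction of edges that have a reverse edge.
reciprocity = nx.reciprocity(G)

# Knowledge broker: the node with the highest betweenness centrality.
betweenness = nx.betweenness_centrality(G)
broker = max(betweenness, key=betweenness.get)

print("reciprocity:", reciprocity)
print("visibility F/M:", visibility_ratio(G, "gender", "F", "M"))
print("absence M->F:", absence_score(G, "gender", "M", "F"))
print("broker:", broker)
```

A visibility ratio below 1 means the first group receives fewer incoming links on average than the second; an absence score near 1 means almost none of the possible links between the two groups exist.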

Task 3: Detect Communities and Analyze Epistemic Silences

  • Use Louvain or Label Propagation to detect communities
  • Investigate how gender, region, and discipline affect community formation
  • Identify intellectuals who are isolated, under-linked, or invisible in dominant clusters
  • Bonus: Compare findings with those from Eastern European intellectual networks (cross-regional module)
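
One way to approach these steps is sketched below on a toy graph with placeholder nodes. It assumes a networkx release that provides louvain_communities, runs Louvain on an undirected copy of the graph (a simplification that discards edge direction), and treats "never cited" as an in-degree of zero. These are illustrative choices, not requirements of the challenge.

```python
from collections import Counter

import networkx as nx
from networkx.algorithms import community

# Toy graph: two triangles joined by one link, one node that cites
# but is never cited ("Y"), and one fully isolated node ("X").
G = nx.DiGraph()
G.add_nodes_from([
    ("A", {"gender": "F"}), ("B", {"gender": "M"}), ("C", {"gender": "M"}),
    ("D", {"gender": "M"}), ("E", {"gender": "F"}), ("F", {"gender": "M"}),
    ("X", {"gender": "F"}), ("Y", {"gender": "F"}),
])
G.add_edges_from([
    ("A", "B"), ("B", "C"), ("C", "A"),
    ("D", "E"), ("E", "F"), ("F", "D"),
    ("C", "D"), ("Y", "A"),
])

# Louvain community detection on an undirected view; the seed makes
# the otherwise randomized result repeatable.
communities = community.louvain_communities(G.to_undirected(), seed=42)

# Composition of each community by a node property (here: gender).
composition = [
    Counter(G.nodes[n]["gender"] for n in members) for members in communities
]

# Epistemic silences: nodes with no links at all, and nodes no one cites.
isolated = sorted(nx.isolates(G))
never_cited = sorted(n for n, deg in G.in_degree() if deg == 0)

for members, counts in zip(communities, composition):
    print(sorted(members), dict(counts))
print("isolated:", isolated)
print("never cited:", never_cited)
```

Swapping "gender" for region or discipline gives the corresponding breakdown per community.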

Task 4: Semantic Enrichment (Optional Advanced Track)

  • Use embedding models (e.g., BERT, spaCy) to compute semantic similarity of biographies
  • Detect latent influence not captured by hyperlinks or citations

Goal: reveal alternative epistemic flows and affinities missed in the network structure
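
The idea can be sketched without downloading any model. The example below is a stand-in, not an embedding pipeline: it replaces BERT or spaCy vectors with simple bag-of-words term counts, compares biographies by cosine similarity, and reports pairs that are textually similar but not linked in the graph. The biographies, the link list, the 0.5 threshold, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def vectorize(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[term] * v[term] for term in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def latent_pairs(bios, linked_pairs, threshold=0.5):
    """Return similar-but-unlinked pairs as (name, name, score) tuples."""
    vectors = {name: vectorize(text) for name, text in bios.items()}
    linked = {frozenset(pair) for pair in linked_pairs}
    found = []
    for a, b in combinations(sorted(bios), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold and frozenset((a, b)) not in linked:
            found.append((a, b, round(score, 3)))
    return found


# Hypothetical one-line biographies and existing graph links.
BIOS = {
    "A": "sociologist of labor and exile",
    "B": "poet and translator",
    "C": "sociologist of exile and memory",
}
LINKS = [("A", "B")]

print(latent_pairs(BIOS, LINKS))
```

With a real embedding model, only vectorize and cosine would need to change; the comparison against existing links stays the same.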