Latin America Regional Special Section
Society

Telar and TelarKG: Data-Driven Insights into Chile’s Constitutional Process

As part of the Millennium Institute for Foundational Research on Data, a multidisciplinary team of academics resolved to enrich discussion of the constitutional process in Chile using data-driven techniques.

A year into intense social protests and unrest, Chile held a referendum on October 25, 2020, to decide whether to draft a new constitution. Of those who voted, 78% favored a new constitution and 79% favored having it drafted solely by convention members elected for the role. On May 16, 2021, an election was held in which 155 convention members were chosen, with gender parity and 17 seats reserved for indigenous groups.5 The resulting constitutional convention met on July 4, 2021, to begin drafting a new constitution that, if approved via referendum, would replace the active 1980 constitution adopted during the dictatorship of Augusto Pinochet.

The constitutional process was the subject of much media attention and discussion in Chile. As part of the Millennium Institute for Foundational Research on Data (IMFD), we, a multidisciplinary team of academics from Computer Science, Social Science, Statistics, and Communications, resolved to enrich this discussion using data-driven techniques. Thus we began the project Plataforma Telar (Telar Platform), where “telar” is the Spanish word for a loom. Our goal was to weave together diverse data from disparate sources relating to the constitutional process in order to gain insights and communicate the results to the Chilean public.

Data collection. Our data came from fieldwork, online platforms, and the convention itself. For fieldwork, we organized panels spanning Chile’s social and geographic divisions, with periodic polls, open questions, and focus groups. The resulting data were handled via our own application, which obtains real-time results in a secure environment. In parallel, we developed tools to periodically extract data from the convention website and related online activity. These data were very diverse: the convention provided roll call votes and transcripts of its members' speeches; online newspapers published news related to the process; and from social networks we obtained tweets, Instagram posts, and Facebook posts related to the convention. Data were collected daily and stored in a data lake on the Google Cloud Platform. This process gathered approximately 25GB of data, of which 15% was news and almost 80% was social network data; these figures cover textual and structured data only, with links to the associated multimedia content.

Data analytics. A key analysis within Plataforma Telar was a political ideology map built from the convention members' roll call votes using D-Nominate.4 Analysis of social networks required more sophisticated tools. We sought to bring clarity to the controversial topic of bots on Twitter, showing that they span the political spectrum (although right-wing bots had a greater share) and that they often published messages with emotions distinct from those of regular users. Figure 1 shows a sample of these results. This analysis was done in partnership with the Brazilian bot detector Pegabot (https://pegabot.com.br/), which we used to sample bot and non-bot accounts. Ideology was estimated using the text-based ideal points methodology.6
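To give a feel for how roll call votes can be turned into positions on an ideological axis, here is a minimal sketch. It is not D-Nominate, which fits a probabilistic spatial voting model; as a much simpler stand-in, it takes the leading principal component of a centered vote matrix. The function name, the vote encoding (+1 yes, -1 no, 0 absent), and the toy data are all assumptions made for illustration.

```python
import math


def first_dimension(votes, iterations=100):
    """Score each member on the leading principal axis of the vote matrix.

    votes[i][j] is member i's vote on roll call j: +1 yes, -1 no, 0 absent.
    This is a simplified stand-in for D-Nominate, not the method itself.
    """
    n_members, n_calls = len(votes), len(votes[0])
    # Center each roll call so that unanimous votes carry no signal.
    means = [sum(row[j] for row in votes) / n_members for j in range(n_calls)]
    centered = [[row[j] - means[j] for j in range(n_calls)] for row in votes]
    # Power iteration on (centered * centered^T), starting from a fixed vector.
    scores = [float(i + 1) for i in range(n_members)]
    for _ in range(iterations):
        loadings = [sum(centered[i][j] * scores[i] for i in range(n_members))
                    for j in range(n_calls)]
        scores = [sum(centered[i][j] * loadings[j] for j in range(n_calls))
                  for i in range(n_members)]
        norm = math.sqrt(sum(s * s for s in scores))
        if norm == 0.0:
            return [0.0] * n_members  # no variation in the votes at all
        scores = [s / norm for s in scores]
    # The sign of a principal axis is arbitrary; pin member 0 to the negative side.
    if scores[0] > 0:
        scores = [-s for s in scores]
    return scores


# Toy data (invented): six members whose votes flip at successive cut points.
toy_votes = [
    [1, 1, 1, 1, 1],
    [-1, 1, 1, 1, 1],
    [-1, -1, 1, 1, 1],
    [-1, -1, -1, 1, 1],
    [-1, -1, -1, -1, 1],
    [-1, -1, -1, -1, -1],
]
positions = first_dimension(toy_votes)
```

On this toy matrix the scores recover the order in which the members were listed. Which end of the axis counts as "left" still has to be fixed by hand, for example by checking where a few well-known members fall.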

Concerning regular social media users, we used a community detection algorithm1 to associate social media users with convention members. Using the ideology previously assigned to convention members, we then assigned the same ideology to the users in each member's community, which yielded three factions: left, center, and right. Classifying the ideology of the messages retweeted by each of these communities gave us a measure of how isolated each faction was: if a faction only interacts with messages that align ideologically with itself, we consider it more isolated than one that interacts with messages spanning the political spectrum. Our results showed that the right appeared far more isolated than the left and center factions.
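The two steps described above, propagating a faction label from convention members to the users in their community and then measuring how inward-looking each faction's retweets are, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's code: community assignments are taken as given (for example, from a Louvain-style algorithm), isolation is defined here simply as the share of a faction's retweets that carry its own label, and all names and data are invented.

```python
from collections import Counter


def label_users(community_of, member_faction):
    """Assign each account the majority faction of the members in its community."""
    tallies = {}
    for member, faction in member_faction.items():
        tallies.setdefault(community_of[member], Counter())[faction] += 1
    return {
        account: tallies[community].most_common(1)[0][0]
        for account, community in community_of.items()
        if community in tallies  # communities without any member stay unlabeled
    }


def isolation(retweets, faction_of):
    """Share of each faction's retweets whose message label matches the faction."""
    own, total = Counter(), Counter()
    for account, message_label in retweets:
        faction = faction_of.get(account)
        if faction is None:
            continue  # skip accounts that could not be labeled
        total[faction] += 1
        if message_label == faction:
            own[faction] += 1
    return {faction: own[faction] / total[faction] for faction in total}


# Invented toy data: community ids, member factions, and labeled retweets.
community_of = {"member_a": 0, "member_b": 1, "user_1": 0, "user_2": 1, "user_3": 1}
member_faction = {"member_a": "left", "member_b": "right"}
retweets = [
    ("user_1", "left"), ("user_1", "center"),
    ("user_2", "right"), ("user_2", "center"),
    ("user_3", "right"), ("user_3", "right"),
]

faction_of = label_users(community_of, member_faction)
scores = isolation(retweets, faction_of)
```

A score of 1.0 would mean a faction retweets only messages carrying its own label. The toy numbers are made up and say nothing about the actual findings.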

Figure 1. How the ideological spectrum of messages is distributed across accounts labeled as “likely bots” (red) and “likely regular users” (blue). Bot messages lean to the right (derecha) of the political spectrum, while content by regular users is spread more widely (original in Spanish as it appeared on CNN Chile).

Communication Strategy and Impact

Thanks to a partnership with CNN Chile, our analyses were aired every Monday as part of a weekly program devoted to Plataforma Telar, with more details posted on our website and social media accounts. Our results regularly met with high engagement and were shared by media companies, public personalities, and even convention members. Plataforma Telar thus had a noticeable impact on how people understood the convention (for an analysis of how data-driven political communication impacts public opinion, see Daud2).

TelarKG. Given the diversity, scale, and dynamics of the data, our cloud infrastructure was becoming increasingly unwieldy, with relevant information about particular entities (for example, convention members) scattered across different tables. To organize these data better, we modeled them as a knowledge graph, called TelarKG, which could then be queried using MillenniumDB, an open-source graph database also developed within the IMFD. (See an initial demo at https://telarkg.imfd.cl/query/, and Hyvönen et al.3 for similar efforts.) We also plan to publish the complete TelarKG dump, containing approximately 20GB of data.
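The benefit of moving from scattered tables to a graph can be illustrated with a small sketch. It does not reflect the actual TelarKG schema or the MillenniumDB query language; the table layouts, identifiers, and predicate names below are all hypothetical, and the graph is just an in-memory set of (subject, predicate, object) triples.

```python
from collections import defaultdict


class TripleStore:
    """A tiny in-memory graph of (subject, predicate, object) triples."""

    def __init__(self):
        self._outgoing = defaultdict(list)   # subject -> [(predicate, object)]
        self._incoming = defaultdict(list)   # (predicate, object) -> [subject]

    def add(self, subject, predicate, obj):
        self._outgoing[subject].append((predicate, obj))
        self._incoming[(predicate, obj)].append(subject)

    def describe(self, subject):
        """Everything known about one entity, regardless of source table."""
        return sorted(self._outgoing.get(subject, []))

    def subjects(self, predicate, obj):
        """All entities linked to obj through predicate."""
        return sorted(self._incoming.get((predicate, obj), []))


def build_graph(members, votes, speeches):
    """Fold three hypothetical tables into a single graph keyed by member id."""
    graph = TripleStore()
    for row in members:
        graph.add(row["id"], "name", row["name"])
        graph.add(row["id"], "district", row["district"])
    for row in votes:
        graph.add(row["member"], "voted_" + row["choice"], row["roll_call"])
    for row in speeches:
        graph.add(row["speaker"], "spoke_in", row["session"])
    return graph


# Invented rows standing in for three separate tables.
members = [{"id": "m1", "name": "Member One", "district": "D10"},
           {"id": "m2", "name": "Member Two", "district": "D7"}]
votes = [{"member": "m1", "roll_call": "rc42", "choice": "yes"},
         {"member": "m2", "roll_call": "rc42", "choice": "no"}]
speeches = [{"speaker": "m1", "session": "s7"}]

graph = build_graph(members, votes, speeches)
profile = graph.describe("m1")
supporters = graph.subjects("voted_yes", "rc42")
```

Here a single call to `describe` gathers what would otherwise take a lookup in each table, which is the kind of entity-centric access the paragraph above describes.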

TelarKG enables further techniques that we are now exploring, particularly graph-based learning methods. Combined with text embeddings of social media posts, these methods could, for example, model communities with similar views, predict how members would vote, and identify key topics or disinformation discussed on social media.

Main Challenges

Plataforma Telar presented several challenges. Researchers in the multidisciplinary team had different goals. Engineers prioritized building a well-founded data infrastructure, but this slowed down the analysis. Computer science analysts preferred powerful, black-box tools, while social scientists favored explainable results. We kept the teams engaged by setting a specific goal for the data infrastructure and by prioritizing studies that required a mixture of white- and black-box machinery. Our partnership with CNN Chile involved weekly studies, whose deadlines forced us to stay on top of current affairs and ensured our analysis had immediate impact with the public. However, it also hampered the development of more carefully planned long-term or longitudinal studies. Thankfully, our focus on building a good data infrastructure allowed us to produce good longitudinal results that were not specifically planned for. One important example is that we were among the first groups able to map the Chilean public's growing disengagement from, and distrust of, the convention.

Conclusion

Plataforma Telar stands out for its multidisciplinary approach to studying how society influenced and reacted to important political events. Different disciplines put their skills and knowledge at the project’s disposal to create analyses that enriched the discussion and transformed data into information that could be used and understood by people involved in the process and by the general public. Telar set a new standard for the breadth and quality of political analysis that can be expected in Chile, and the approach has already been replicated by other institutions for other constitutional processes. TelarKG remains open, and it is being used by several research projects in Chile.

Acknowledgment

More than 50 people worked on Plataforma Telar, including students, analysts, directors, and researchers, and without all their help this initiative would not have been realized. This initiative was partially funded by the Millennium Science Initiative Program (Code ICN17_002).

    • 1. Blondel, V. et al. Fast unfolding of communities in large networks. J. Statistical Mechanics: Theory and Experiment (2008).
    • 2. Daud, R.S. The role of political communication in shaping public opinion: A comparative analysis of traditional and digital media. J. Public Representative and Society Provision (2021); 10.55885/jprsp.v1i2.241
    • 3. Hyvönen, E. et al. Integrating Faceted Search with Data Analytic Tools in the User Interface of Parliament Sampo–Parliament of Finland on the Semantic Web. In The Semantic Web: ESWC 2023 Satellite Events. C. Pesquita et al. (Eds.) Lecture Notes in Computer Science. Springer Nature Switzerland, Cham, 2023; 10.1007/978-3-031-43458-7_3
    • 4. Poole, K.T. and Rosenthal, H. Patterns of congressional voting. American J. Political Science 35 (Feb. 1991).
    • 5. Prieto, M. and Verdugo, S. Understanding Chile’s Constitution-making procedure. Intern. J. Constitutional Law 19, 1 (2021); 10.1093/icon/moab025
    • 6. Vafa, K. et al. Text-based ideal points. ACL Anthology (2020); https://aclanthology.org/2020.acl-main.475.pdf
