Notes on Constructing the Audit
These notes describe the audit conducted for the paper “Measuring Geopolitical Risk” by Dario Caldara and Matteo Iacoviello.
Construction of the Index
Each month, the universe of newspaper articles that we use to construct our GPR index contains about [70,000] articles. This is set U.
Of these, only about [0.32%] meet our computer-generated criterion for inclusion in the GPR index. This is set G.
Each month, the ratio G/U (normalized to equal 100 in the 2000-2009 decade) is our computer-generated GPR index.
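The ratio-and-normalize step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the dictionary layout, and the monthly counts are hypothetical, and "normalized to equal 100" is read here as "the average over the base months equals 100", which is an assumption.

```python
def gpr_index(g_counts, u_counts, base_months):
    """Monthly share G/U, rescaled so that its average over base_months is 100.

    g_counts, u_counts: dicts mapping a month label to an article count.
    base_months: month labels forming the normalization period
    (the notes use the 2000-2009 decade).
    """
    share = {m: g_counts[m] / u_counts[m] for m in u_counts}
    base_mean = sum(share[m] for m in base_months) / len(base_months)
    return {m: 100.0 * share[m] / base_mean for m in share}


# Made-up counts for illustration only (not data from the paper).
g = {"2005-01": 200, "2005-02": 240, "2014-08": 330}
u = {"2005-01": 70000, "2005-02": 70000, "2014-08": 70000}
index = gpr_index(g, u, base_months=["2005-01", "2005-02"])
```

With these made-up counts the two base months average to 100 by construction, and a month whose share is 1.5 times the base average gets an index value of 150.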
Design of the audit sample – Benchmark GPR Index – Sampling from GPR articles (set G)
We sample a subset of the articles that are identified as discussing high or rising geopolitical risks.
We code these articles as GPR=1, GPR=0, or GPR=-1, as follows:
1 = the article contains references to high or rising geopolitical risks.
0 = the article contains no references to geopolitical risks, or is uninformative about whether geopolitical risks are rising or falling.
-1 = the article contains references to low or declining geopolitical risks.
Design of the audit sample – Expanded Sample – Sampling from GPRE articles (set E)
We sample from the universe of newspapers a subset of articles that is large enough to include the articles likely to be coded GPR=1. This is set E, and it contains about [ 15% ] of the articles in sample U. The ratio E/U is the GPRE index.
The subset is constructed by sampling articles that contain any of these 4 words (roots): military OR war OR geopolitical OR terroris*.
We code these articles as GPR=1 or GPR=0, as follows:
1 = the article contains references to high or rising geopolitical risks.
0 = the article contains no references to geopolitical risks.
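As a rough illustration of the set E criterion, the four roots can be written as a regular expression. This is only a sketch: the function name is hypothetical, and the matching rules (case-insensitive, whole words, "terroris*" read as any word beginning with "terroris") are assumptions that may differ from how the ProQuest search engine treats plurals and word variants.

```python
import re

# Assumed reading of the four roots: whole-word, case-insensitive matches.
GPRE_PATTERN = re.compile(
    r"\b(military|war|geopolitical|terroris\w*)\b", re.IGNORECASE
)


def in_set_e(text):
    """Return True if the text contains at least one of the four roots."""
    return GPRE_PATTERN.search(text) is not None


print(in_set_e("Terrorists struck the capital"))  # matches "terroris*"
print(in_set_e("A warm welcome for the envoy"))   # "warm" is not "war"
```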
Reading the Articles
We select a sample of 50 out of [ ] months.
For each of the selected months, working with a team of RAs, we extract [ ] random articles from set G and [ ] random articles from set E.
To ensure randomness, we pick the first [ ] articles in the given sets (G and E) that also contain the generic stopwords “a” and “the” and “and” (http://www.ranks.nl/stopwords).
We exclude the Financial Times from the searches since no full text can be saved for replicability purposes.
How to Code Articles – General Principle
If the article discusses or highlights recent, current, or expected future geopolitical risks, terror risks, war risks, terror acts, or current wars, label it as 1.
Example: “A nightmare scenario in the Mideast”, The Washington Post, Dec 14, 2015, https://search.proquest.com/docview/1748535069?accountid=39704: “...The wars in Syria and Iraq and jihadist attacks in the West have obscured yet another Middle East threat: the possibility that slowly escalating violence between Palestinians and Israelis will destroy one of the few remaining zones of relative tranquility between Morocco and Iran”. This article is coded as 1.
How to Code Articles – General Principle (Continued)
If the article discusses declining tensions, label it as -1.
Example of an article coded as -1: “In East Europe, stunning change”, Chicago Tribune, Dec 27, 1989, https://search.proquest.com/docview/282685288?accountid=39704: “Dramatic change swept Eastern Europe in 1989, first gathering force in Poland and Hungary, then spreading to East Germany and Czechoslovakia as the Soviet Union continued to loosen the iron grip with which it has dominated its neighbors since the end of World War II.”
How to Code Articles – Books and Reviews
If the article does not highlight any of these risks, label it as 0. This includes book or movie reviews, unless they highlight risks associated with the work under review.
E.g. “Books of The Times; Black Military History in the U.S.: No Longer the Untold Story”, New York Times, Dec 23, 1989, https://search.proquest.com/docview/427465416?accountid=39704, coded as 0 since it mostly covers the Vietnam War.
E.g. “Books of The Times; A Nuclear Pragmatist Offers Hope”, New York Times, Dec 15, 1988, https://search.proquest.com/docview/427011256?accountid=39704, coded as 1 as the review discusses the author’s assessment of current nuclear risks.
How to Code Articles – Obituaries/Death Notices*
The GPR algorithm excludes articles found in a publication’s obituaries section. Occasionally, newspapers publish obituaries for certain high-profile figures in other categories reserved for news.
If the article does not highlight any of these risks, label it as 0. This includes obituaries, unless they highlight risks associated with the death.
E.g. “Major-General Derrick Wormald, Obituary”, The Times; London, Apr 5, 1994, https://search.proquest.com/docview/318127277?accountid=39704, coded as 0 as it recounts a death with little or no geopolitical significance.
E.g. “The World; Imam’s Death Puts A Region On Edge”, The Times; London, Apr 5, 1994, https://search.proquest.com/docview/422126099?accountid=39704, coded as 1 as the reported individual’s death instigates regional strife.
How to Code Articles – Historical Accounts and Anniversaries
If the article does not highlight any of these risks, label it as 0. This includes anniversaries and historical accounts, unless they highlight current risks associated with them.
E.g. “Moments That Make History”, The Times; London, Dec 31, 2001, https://search.proquest.com/docview/318584735?accountid=39704, coded as 0 as the article historicizes 9/11 without discussing current developments.
E.g. “Analysis: Death From Afar, There’s A Long History Of US Military Mistakes. They Destroyed A Cambodian Town Like That In 1973”, The Guardian, Apr 21, 1999, https://search.proquest.com/docview/245380083?accountid=39704, coded as 1 as it asserts the connection between current and past conflicts.
How to Code Articles – Tensions and Markets
Note: If using the -1/0/1 scale, only code articles as -1 if they discuss easing tensions, not if they discuss market strength despite tensions:
E.g. “Report: Stocks Leap as Fear Over Ukraine Eases”, Wall Street Journal, Aug 19, 2014, https://search.proquest.com/docview/1553962294?accountid=39704, coded as -1 as it explicitly discusses the decline of geopolitical risk.
E.g. “Market Roundup; S&P 500 reaches 2,000, falls back”, Los Angeles Times, Aug 26, 2014, https://search.proquest.com/docview/1555923120?accountid=39704, coded as 0 as it does not discuss the decline or presence of current geopolitical risk factors (it discusses past geopolitical risks).
E.g. “Global turmoil fails to unsettle markets”, The Daily Telegraph, Aug 25, 2014, https://search.proquest.com/docview/1555592597?accountid=39704, coded as 1 as it discusses the presence of current geopolitical risk.
How to Code Articles – War or Military or Terror Trials
Terror trials or war trials are coded as 1 if their account highlights current or recent geopolitical, terrorist, or war risks, and as 0 otherwise.
E.g. “C.I.A. Head Sees More Spy Cases Ahead”, The New York Times, Apr 20, 1994, https://search.proquest.com/docview/429703118?accountid=39704, coded as 0 as it does not discuss whether the trials have geopolitical implications.
E.g. “For Cambodia, It’s Time To Look Ahead--And Back; Elections, Tribunal Stir Up Tensions”, Chicago Tribune, Jul 7, 2003, https://search.proquest.com/docview/419924015?accountid=39704, coded as 1 as it discusses the geopolitical impact of a series of trials.
How to Code Articles – Meetings or Talks
Articles discussing constructive meetings to end wars or to end terrorism should be coded as 0 unless they make explicit references to ongoing tensions or to the risk that this goal will not be achieved.
E.g. “U.S. To Monitor PLO Pledge to End Terrorism”, Boston Globe, Dec 19, 1988, http://search.proquest.com/docview/294491173?accountid=39704, coded as 0 because it does not explicitly reference ongoing tensions.
E.g. “U.S. Presses Mideast Missile Talks”, Washington Post, Dec 28, 1988, https://search.proquest.com/docview/307076045?accountid=39704, coded as 1 because the article discusses tensions surrounding the talks.
How to Code Articles – Appointments, Elections and Nominations
An appointment or reappointment to a military position, or to a position of civilian oversight of the military (e.g. Secretary of State), should be coded as 0 unless the article discusses how the appointment brings or ignites new or renewed geopolitical tensions.
E.g. “Bush's Selections for the United Nations, the C.I.A. and Top Economic Posts”, New York Times, Dec 7, 1988, https://search.proquest.com/docview/427021759?accountid=39704, coded as 0 as it does not discuss the geopolitical impact of the appointment.
E.g. “For Cambodia, It’s Time To Look Ahead--And Back; Elections, Tribunal Stir Up Tensions”, Chicago Tribune, Jul 7, 2003, https://search.proquest.com/docview/419924015?accountid=39704, coded as 1 as it discusses the geopolitical impact of a series of elections.
Construction of audited indices
Based on the audits, we construct the following indices:
The GPRA index is the GPR index (G/U) times the fraction of audited articles in set G in each month that are coded as 1 (GPR_AC/50).
The GPREA index is the expanded GPR index (E/U) times the fraction of audited articles in set E in each month that are coded as 1 (GPRE_AC/50).
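The adjustment amounts to scaling each monthly index value by the share of audited articles coded as 1. A minimal Python sketch follows; the function name and the example numbers are hypothetical, and the default of 50 audited articles per month is taken from the denominators in GPR_AC/50 and GPRE_AC/50.

```python
def audited_index(index_value, n_coded_one, n_audited=50):
    """Scale a monthly index by the fraction of audited articles coded as 1.

    index_value: the computer-generated index for the month (GPR or GPRE).
    n_coded_one: number of audited articles coded as 1 (GPR_AC or GPRE_AC).
    n_audited:   number of audited articles in that month.
    """
    if not 0 <= n_coded_one <= n_audited:
        raise ValueError("n_coded_one must lie between 0 and n_audited")
    return index_value * n_coded_one / n_audited


# Illustrative numbers only: a GPR value of 120 with 45 of 50 articles coded as 1.
gpra = audited_index(120.0, 45)
```

The same function covers both cases: pass the GPR value with GPR_AC to obtain GPRA, or the GPRE value with GPRE_AC to obtain GPREA.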
Instructions for Extracting and coding articles
Navigate to http://search.proquest.com/newsstand/commandline?accountid=39704
Type the search query into the command line search (see next slide for details)***
Select 50 items per page
Save the full text of the articles as a PDF (including index), rename it as YYYY_MM and save
Save the Excel file of the results listing, rename it as YYYY_MM (make sure you’ve cleared the selection from the previous month, otherwise you will save twice as many articles)
Report coding results in column G, and any comments in column H (mark coding results with initials)
Save the edited Excel file as YYYY_MM_FL, where F and L are your first and last initials
About 12.7% of all articles belong to GPRE, the sample of articles containing references to war, terror, military, or geopolitical. Of those, based on human reading of a sample of [ 1,200 ] articles in GPRE, 64.5% mention high or rising risks, so one can reasonably conclude that about 8% of newspaper articles contain references to high or rising war, terror, or geopolitical risks.
About 0.3% of all articles belong to GPR (a subset of GPRE). Of those, 87.3% discuss high or rising risks. Of the remaining articles, about 40% discuss “declining” tensions, rather than something unrelated to geopolitical risks.
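The shares quoted above can be combined as follows. This is only the arithmetic implied by the reported percentages; the last two lines are derived here and are not figures stated in the notes.

```latex
\begin{align*}
\text{share of all articles in GPRE with high or rising risks}
  &\approx 0.127 \times 0.645 \approx 0.082 \quad (\text{about } 8\%) \\
\text{share of all articles in GPR with high or rising risks}
  &\approx 0.003 \times 0.873 \approx 0.0026 \quad (\text{about } 0.26\%) \\
\text{share of GPR articles discussing declining tensions}
  &\approx (1 - 0.873) \times 0.40 \approx 0.051 \quad (\text{about } 5\%)
\end{align*}
```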
The correlations between the indices are as follows:
There is a high correlation between GPR and GPRA, at 0.99, suggesting that false positives are not a problem. False positives are [12.7]% of all articles in the GPR set.
There is a low correlation between GPRE and GPREA, at [ 0.53 ], suggesting that there is more noise when the search is very broad. The fraction of articles in GPRE that belong to GPREA is 64.5%, hence false positives are a bigger problem when the search is broad.
There is a low correlation between GPR and GPRE (0.68), suggesting that a naïve search returns results that are different from those of a detailed search.
There is a higher correlation between GPR and GPREA (0.82) than between GPR and GPRE, suggesting that a sophisticated search is more likely to capture the true underlying measure of articles on geopolitical risks, even though it does not capture ALL articles mentioning geopolitical risk.
Validation: Excluding Some Words
We strive to include in the computer-generated index words that are highly likely to be used when geopolitical tensions are high or rising.
As a result of our audit, we can select all false-positive articles and search their text for patterns in the words.
Comparing true GPRC=1 articles with GPRH=0 articles, the GPRH=0 articles more often contain the following words: Books, History, Museums, Art, Kennedy, Nixon, Movies (or Films), [ add other words ]
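One way to look for such patterns is to compare, word by word, the share of false-positive articles containing the word with the share of true-positive articles containing it. The sketch below is a hypothetical illustration of that comparison, not the procedure used in the audit: the function names, the tokenization, the `min_gap` threshold, and the sample texts are all assumptions, and both article lists are assumed to be non-empty.

```python
import re
from collections import Counter


def doc_frequency(articles):
    """Share of articles in which each lowercase word appears at least once."""
    counts = Counter()
    for text in articles:
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return {word: n / len(articles) for word, n in counts.items()}


def overrepresented(false_pos, true_pos, min_gap=0.2):
    """Words whose article share among false positives exceeds the share
    among true positives by at least min_gap, largest gap first."""
    fp, tp = doc_frequency(false_pos), doc_frequency(true_pos)
    gaps = {w: share - tp.get(w, 0.0) for w, share in fp.items()}
    return sorted((w for w, g in gaps.items() if g >= min_gap),
                  key=lambda w: -gaps[w])


# Made-up snippets standing in for audited article texts.
false_pos = ["A history of war museums", "New books on war history"]
true_pos = ["War fears rise", "Military tensions rise"]
print(overrepresented(false_pos, true_pos, min_gap=0.6))
```

Words surfaced this way are candidates for exclusion terms; each would still need a manual check that excluding it does not remove true positives.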
Instructions for Extracting and coding articles
*** The newspaper code depends on the year (Financial Times removed):
1984 – 1994:
pub.Exact("The Washington Post (pre-1997 Fulltext)" OR "The Globe and Mail" OR "Boston Globe (pre-1997 Fulltext)" OR "New York Times" OR "Wall Street Journal" OR "Chicago Tribune (pre-1997 Fulltext)" OR "Los Angeles Times (pre-1997 Fulltext)" OR "The Guardian" OR "The Daily Telegraph" OR "The Times")
1995 – 1996:
pub.Exact("The Washington Post" OR "The Washington Post (pre-1997 Fulltext)" OR "The Globe and Mail" OR "Boston Globe (pre-1997 Fulltext)" OR "New York Times" OR "Wall Street Journal" OR "Chicago Tribune (pre-1997 Fulltext)" OR "Los Angeles Times (pre-1997 Fulltext)" OR "The Guardian" OR "The Daily Telegraph" OR "The Times")
1997 onward:
pub.Exact("The Washington Post" OR "The Globe and Mail" OR "Boston Globe" OR "New York Times" OR "Wall Street Journal" OR "Chicago Tribune" OR "Los Angeles Times" OR "The Guardian" OR "The Daily Telegraph" OR "The Times")
The search query for August 2014, for example, is:
pub.Exact("The Washington Post" OR "The Globe and Mail" OR "Boston Globe" OR "New York Times" OR "Wall Street Journal" OR "Chicago Tribune" OR "Los Angeles Times" OR "The Guardian" OR "The Daily Telegraph" OR "The Times") AND DTYPE(article OR commentary OR editorial OR feature OR front page article OR front page/cover story OR news OR report OR review) AND(PD(Aug 2014)) AND (military OR war OR geopolitical OR terrorism OR terrorist) AND (a AND about AND above)
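To avoid retyping the query for each month, the string can be assembled programmatically. The sketch below only builds the text to paste into the command line search; it does not call any ProQuest interface. The function and constant names are hypothetical, it covers only the 1997-onward newspaper list, and the three-letter month abbreviations for months other than August are an assumption extrapolated from the "Aug 2014" example.

```python
NEWSPAPERS_1997_ON = [
    "The Washington Post", "The Globe and Mail", "Boston Globe",
    "New York Times", "Wall Street Journal", "Chicago Tribune",
    "Los Angeles Times", "The Guardian", "The Daily Telegraph", "The Times",
]
DOC_TYPES = ("article OR commentary OR editorial OR feature OR "
             "front page article OR front page/cover story OR news OR "
             "report OR review")
KEYWORDS = "military OR war OR geopolitical OR terrorism OR terrorist"
STOPWORDS = "a AND about AND above"
# Assumed date format: only "Aug" is confirmed by the example query.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def build_query(year, month):
    """Return the set E search string for one month (1997 onward only)."""
    if year < 1997:
        raise ValueError("use the pre-1997 newspaper list for earlier years")
    pubs = " OR ".join(f'"{name}"' for name in NEWSPAPERS_1997_ON)
    period = f"{MONTHS[month - 1]} {year}"
    return (f"pub.Exact({pubs}) AND DTYPE({DOC_TYPES}) AND(PD({period})) "
            f"AND ({KEYWORDS}) AND ({STOPWORDS})")


query = build_query(2014, 8)
print(query)
```

For earlier years, the newspaper list would need to be swapped for the era-specific publication names given above.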