1,551
Views
0
CrossRef citations to date
0
Altmetric
Articles

Using the Web to Predict Regional Trade Flows: Data Extraction, Modeling, and Validation

ORCID Icon, ORCID Icon &
Pages 717-739 | Received 29 Sep 2021, Accepted 26 Jun 2022, Published online: 19 Oct 2022

References

  • Ainsworth, S. G., A. Alsum, H. SalahEldeen, M. C. Weigle, and M. L. Nelson. 2011. How much of the web is archived? In Proceedings of the 11th annual International ACM/IEEE Joint Conference on Digital Libraries, 133–36. New York: ACM.
  • Anderson, J. E., and E. Van Wincoop. 2003. Gravity with gravitas: A solution to the border puzzle. American Economic Review 93 (1):170–92. doi: 10.1257/000282803321455214.
  • Antras, P., and D. Chor. 2018. On the measurement of upstreamness and downstreamness in global value chains. Technical report, National Bureau of Economic Research, Cambridge, MA.
  • Arto, I., J. M. Rueda-Cantuche, and G. P. Peters. 2014. Comparing the GTAP-MRIO and WIOD databases for carbon footprint analysis. Economic Systems Research 26 (3):327–53. doi: 10.1080/09535314.2014.939949.
  • Athey, S., and G. W. Imbens. 2019. Machine learning methods that economists should know about. Annual Review of Economics 11 (1):685–725. doi: 10.1146/annurev-economics-080217-053433.
  • Barca, F. 2009. An agenda for a reformed cohesion policy-independent report. Brussels, Belgium: European Commission.
  • Biau, G. 2012. Analysis of a random forests model. Journal of Machine Learning Research 13:1063–95.
  • Blazquez, D., and J. Domenech. 2018. Big data sources and methods for social and economic analyses. Technological Forecasting and Social Change 130:99–113. doi: 10.1016/j.techfore.2017.07.027.
  • Boero, R., B. K. Edwards, and M. K. Rivera. 2018. Regional input–output tables and trade flows: An integrated and interregional non-survey approach. Regional Studies 52 (2):225–38. doi: 10.1080/00343404.2017.1286009.
  • Breiman, L. 2001. Random forests. Machine Learning 45 (1):5–32. doi: 10.1023/A:1010933404324.
  • Caruana, R., N. Karampatziakis, and A. Yessenalina. 2008. An empirical evaluation of supervised learning in high dimensions. In Proceedings of the 25th International Conference on Machine Learning, ICML ’08, 96–103. New York: Association for Computing Machinery. doi: 10.1145/1390156.1390169.
  • Chen, W., B. Los, P. McCann, R. Ortega-Argilés, M. Thissen, and F. van Oort. 2018. The continental divide? Economic exposure to Brexit in regions and countries on both sides of The Channel. Papers in Regional Science 97 (1):25–54. doi: 10.1111/pirs.12334.
  • Chun, Y., H. Kim, and C. Kim. 2012. Modeling interregional commodity flows with incorporating network autocorrelation in spatial interaction models: An application of the US interstate commodity flows. Computers, Environment and Urban Systems 36 (6):583–91. doi: 10.1016/j.compenvurbsys.2012.04.002.
  • Chung, J. 2011. The geography of global Internet hyperlink networks and cultural content analysis. PhD dissertation, University at Buffalo.
  • Crampton, J. W., M. Graham, A. Poorthuis, T. Shelton, M. Stephens, M. W. Wilson, and M. Zook. 2013. Beyond the geotag: Situating “big data” and leveraging the potential of the geoweb. Cartography and Geographic Information Science 40 (2):130–39. doi: 10.1080/15230406.2013.777137.
  • Credit, K. 2022. Spatial models or random forest? Evaluating the use of spatially explicit machine learning methods to predict employment density around new transit stations in Los Angeles. Geographical Analysis 54 (1):58–83. doi: 10.1111/gean.12273.
  • David, H., D. Dorn, and G. H. Hanson. 2013. The China syndrome: Local labor market effects of import competition in the United States. American Economic Review 103 (6):2121–68. doi: 10.1257/aer.103.6.2121.
  • de Mello-Sampayo, F. 2017a. Competing-destinations gravity model applied to trade in intermediate goods. Applied Economics Letters 24 (19):1378–84. doi: 10.1080/13504851.2017.1282109.
  • de Mello-Sampayo, F. 2017b. Testing competing destinations gravity models—Evidence from BRIC International. The Journal of International Trade & Economic Development 26 (3):277–94. doi: 10.1080/09638199.2016.1239752.
  • Devriendt, L., B. Derudder, and F. Witlox. 2008. Cyberplace and cyberspace: Two approaches to analyzing digital intercity linkages. Journal of Urban Technology 15 (2):5–32. doi: 10.1080/10630730802401926.
  • Dietzenbacher, E., B. Los, R. Stehrer, M. Timmer, and G. De Vries. 2013. The construction of world input–output tables in the WIOD project. Economic Systems Research 25 (1):71–98. doi: 10.1080/09535314.2012.761180.
  • Egger, P. 2002. An econometric view on the estimation of gravity models and the calculation of trade potentials. The World Economy 25 (2):297–312. doi: 10.1111/1467-9701.00432.
  • Feenstra, R. C., R. E. Lipsey, H. Deng, A. C. Ma, and H. Mo. 2005. World trade flows: 1962–2000. Technical report, National Bureau of Economic Research, Cambridge, MA.
  • Fingleton, B., H. Garretsen, and R. Martin. 2012. Recessionary shocks and regional employment: Evidence on the resilience of UK regions. Journal of Regional Science 52 (1):109–33. doi: 10.1111/j.1467-9787.2011.00755.x.
  • Flegg, A. T., and C. D. Webber. 2000. Regional size, regional specialization and the FLQ formula. Regional Studies 34 (6):563–69. doi: 10.1080/00343400050085675.
  • Gervais, A., and J. B. Jensen. 2019. The tradability of services: Geographic concentration and trade costs. Journal of International Economics 118:331–50. doi: 10.1016/j.jinteco.2019.03.003.
  • Gómez-Herrera, E. 2013. Comparing alternative methods to estimate gravity models of bilateral trade. Empirical Economics 44 (3):1087–1111. doi: 10.1007/s00181-012-0576-2.
  • Greene, W. 2013. Export potential for US advanced technology goods to India using a gravity model approach. Working Paper 2013-03B, U.S. International Trade Commission, Washington, DC.
  • Guan, D., D. Wang, S. Hallegatte, S. J. Davis, J. Huo, S. Li, Y. Bai, T. Lei, Q. Xue, D. Coffman, et al. 2020. Global supply-chain effects of COVID-19 control measures. Nature Human Behaviour 4 (6):577–87. doi: 10.1038/s41562-020-0896-8.
  • Guns, R., and R. Rousseau. 2014. Recommending research collaborations using link prediction and random forest classifiers. Scientometrics 101 (2):1461–73. doi: 10.1007/s11192-013-1228-9.
  • Halavais, A. 2000. National borders on the world wide web. New Media & Society 2 (1):7–28. doi: 10.1177/14614440022225689.
  • Hale, S. A., T. Yasseri, J. Cowls, E. T. Meyer, R. Schroeder, and H. Margetts. 2014. Mapping the UK webspace: Fifteen years of British universities on the web. In Proceedings of the 2014 ACM Conference on Web Science, 62–70. New York: ACM.
  • Head, K., T. Mayer, and J. Ries. 2009. How remote is the offshoring threat? European Economic Review 53 (4):429–44. doi: 10.1016/j.euroecorev.2008.08.001.
  • Hellmanzik, C., and M. Schmitz. 2015. Gravity and international services trade: The impact of virtual proximity. European Economic Review 77:82–101. doi: 10.1016/j.euroecorev.2015.03.014.
  • Hellmanzik, C., and M. Schmitz. 2017. Taking gravity online: The role of virtual proximity in international finance. Journal of International Money and Finance 77:164–79. doi: 10.1016/j.jimonfin.2017.07.001.
  • Hernández, B., J. Jiménez, and M. J. Martín. 2009. Improved estimation of regional input–output tables using cross-regional methods. International Journal of Information Management 29 (5):362–71. doi: 10.1016/j.ijinfomgt.2008.12.006.
  • Holmberg, K. 2010. Co-inlinking to a municipal Web space: A webometric and content analysis. Scientometrics 83 (3):851–62. doi: 10.1007/s11192-009-0148-1.
  • Holmberg, K., and M. Thelwall. 2009. Local government web sites in Finland: A geographic and webometric analysis. Scientometrics 79 (1):157–69. doi: 10.1007/s11192-009-0410-6.
  • Holzmann, H., W. Nejdl, and A. Anand. 2016. The dawn of today’s popular domains: A study of the archived German Web over 18 years. In Digital Libraries (JCDL), 2016 IEEE/ACM Joint Conference, 73–82. IEEE.
  • Hope, O. 2017. The changing face of the online world. Accessed March 5, 2021. https://www.nominet.uk/changing-face-online-world/.
  • Ijtsma, P., and B. Los. 2020. UK Regions in global value chains. Technical report, Economic Statistics Centre of Excellence (ESCoE), London.
  • Isard, W. 1951. Interregional and regional input–output analysis: A model of a space-economy. The Review of Economics and Statistics 33 (4):318–28. doi: 10.2307/1926459.
  • Isard, W. 1956. Location and space-economy. New York: The Technology Press of Massachusetts Institute of Technology and John Wiley & Sons, Inc.
  • Ivanova, O., D. Kancs, and M. Thissen. 2019. Regional trade flows and input output data for Europe. Technical report, EERI Research Paper Series, Brussels, Belgium.
  • Jackson, A. N. 2017a. JISC UK Web domain dataset (1996–2010) Geoindex. https://doi.org/10.5259/ukwa.ds.2/geo/1
  • Jackson, A. N. 2017b. JISC UK Web domain dataset (1996–2010) Host Link Graph. https://doi.org/10.5259/ukwa.ds.2/host.linkage/1
  • Janc, K. 2015a. Geography of hyperlinks–Spatial dimensions of local government websites. European Planning Studies 23 (5):1019–37. doi: 10.1080/09654313.2014.889090.
  • Janc, K. 2015b. Visibility and connections among cities in digital space. Journal of Urban Technology 22 (4):3–21. doi: 10.1080/10630732.2015.1073899.
  • Jensen, J. B., L. G. Kletzer, J. Bernstein, and R. C. Feenstra. 2005. Tradable services: Understanding the scope and impact of services offshoring [with comments and discussion]. Brookings Trade Forum 2005 (1):75–133. doi: http://www.jstor.org/stable/25058763.
  • Jiang, X., E. Dietzenbacher, and B. Los. 2012. Improved estimation of regional input–output tables using cross-regional methods. Regional Studies 46 (5):621–37. doi: 10.1080/00343404.2010.522566.
  • JISC and the Internet Archive. 2013. JISC UK Web domain dataset (1996–2013). The British Library. https://doi.org/10.5259/ukwa.ds.2/1
  • Jones, B. W., B. Spigel, and E. J. Malecki. 2010. Blog links as pipelines to buzz elsewhere: The case of New York theater blogs. Environment and Planning B: Planning and Design 37 (1):99–111. doi: 10.1068/b35026.
  • Keßler, C. 2017. Extracting central places from the link structure in Wikipedia. Transactions in GIS 21 (3):488–502. doi: 10.1111/tgis.12284.
  • Kimura, F., and H.-H. Lee. 2006. The gravity equation in international trade in services. Review of World Economics 142 (1):92–121. doi: 10.1007/s10290-006-0058-8.
  • Kitsos, A., A. Carrascal-Incera, and R. Ortega-Argilés. 2019. The role of embeddedness on regional economic resilience: Evidence from the UK. Sustainability 11 (14):3800. doi: 10.3390/su11143800.
  • Kleinberg, J., J. Ludwig, S. Mullainathan, and Z. Obermeyer. 2015. Prediction policy problems. The American Economic Review 105 (5):491–95. doi: 10.1257/aer.p20151023.
  • Krüger, M., J. Kinne, D. Lenz, and B. Resch. 2020. The digital layer: How innovative firms relate on the web. Discussion Paper 20-003, ZEW-Centre for European Economic Research. Leibniz-Zentrum für Europäische Wirtschaftsforschung, Mannheim.
  • Kuhn, M. 2008. Building predictive models in R using the caret package. Journal of Statistical Software 28 (5):1–26. doi: 10.18637/jss.v028.i05.
  • Lamonica, G. R., and F. M. Chelli. 2018. The performance of non-survey techniques for constructing sub-territorial input-output tables. Papers in Regional Science 97 (4):1169–1202. doi: 10.1111/pirs.12297.
  • Last, M., O. Maimon, and E. Minkov. 2002. Improving stability of decision trees. International Journal of Pattern Recognition and Artificial Intelligence 16 (2):145–59. doi: 10.1142/S0218001402001599.
  • Lazer, D., A. Pentland, L. Adamic, S. Aral, A.-L. Barabasi, D. Brewer, N. Christakis, N. Contractor, J. Fowler, M. Gutmann, et al. 2009. Social science: Computational social science. Science 323 (5915):721–23. doi: 10.1126/science.1167742.
  • Leamer, E., and R. Stern. 1971. Quantitative international economics. Journal of International Economics 1:359–61.
  • Liaw, A., and M. Wiener. 2002. Classification and regression by randomForest. R News 2 (3):18–22.
  • Lin, J., A. Halavais, and B. Zhang. 2007. The blog network in America: Blogs as indicators of relationships among US cities. Connections 27 (2):15–23.
  • Linnemann, H. 1966. An econometric study of international trade flows. Amsterdam, The Netherlands: Holland Publishing.
  • Los, B., P. McCann, J. Springford, and M. Thissen. 2017. The mismatch between local voting and the local economic consequences of Brexit. Regional Studies 51 (5):786–99. doi: 10.1080/00343404.2017.1287350.
  • Los, B., M. P. Timmer, and G. J. de Vries. 2015. How global are global value chains? A new approach to measure international fragmentation. Journal of Regional Science 55 (1):66–92. doi: 10.1111/jors.12121.
  • Los, B., M. P. Timmer, and G. J. de Vries. 2016. Tracing value-added and double counting in gross exports: Comment. American Economic Review 106 (7):1958–66. doi: 10.1257/aer.20140883.
  • Matter, R. 2009. Economic recovery: Innovation and sustainable growth. Paris: OECD.
  • McCann, P., and R. Ortega-Argilés. 2015. Smart specialization, regional growth and applications to European Union cohesion policy. Regional Studies 49 (8):1291–1302. doi: 10.1080/00343404.2013.799769.
  • Meijers, E., and A. Peris. 2019. Using toponym co-occurrences to measure relationships between places: Review, application and evaluation. International Journal of Urban Sciences 23 (2):246–68. doi: 10.1080/12265934.2018.1497526.
  • Mikkonen, K., and M. Luoma. 1999. The parameters of the gravity model are changing—How and why? Journal of Transport Geography 7 (4):277–83. doi: 10.1016/S0966-6923(99)00024-1.
  • Miller, R. E., and P. D. Blair. 2009. Input–output analysis: Foundations and extensions. New York: Cambridge University Press.
  • Mozolin, M., J.-C. Thill, and E. Lynn Usery. 2000. Trip distribution forecasting with multilayer perceptron neural networks: A critical evaluation. Transportation Research Part B: Methodological 34 (1):53–73. doi: 10.1016/S0191-2615(99)00014-4.
  • Mullainathan, S., and J. Spiess. 2017. Machine learning: An applied econometric approach. Journal of Economic Perspectives 31 (2):87–106. doi: 10.1257/jep.31.2.87.
  • Musso, M., and F. Merletti. 2016. This is the future: A reconstruction of the UK business web space (1996–2001). New Media & Society 18 (7):1120–42. doi: 10.1177/1461444816643791.
  • Ortega, J. L., and I. F. Aguillo. 2008a. Linking patterns in European Union countries: Geographical maps of the European academic web space. Journal of Information Science 34 (5):705–14. doi: 10.1177/0165551507086990.
  • Ortega, J. L., and I. F. Aguillo. 2008b. Visualization of the Nordic academic web: Link analysis using social network tools. Information Processing & Management 44 (4):1624–33. doi: 10.1016/j.ipm.2007.09.010.
  • Ortega, J. L., and I. F. Aguillo. 2009. Mapping world-class universities on the web. Information Processing & Management 45 (2):272–79. doi: 10.1016/j.ipm.2008.10.001.
  • Oshan, T. M. 2020a. Potential and pitfalls of big transport data for spatial interaction models of urban mobility. The Professional Geographer 72 (4):468–80. doi: 10.1080/00330124.2020.1787180.
  • Oshan, T. M. 2020b. The spatial structure debate in spatial interaction modeling: 50 years on. Progress in Human Geography 45 (5):925–50.
  • Owen, A., R. Wood, J. Barrett, and A. Evans. 2016. Explaining value chain differences in MRIO databases through structural path decomposition. Economic Systems Research 28 (2):243–72. doi: 10.1080/09535314.2015.1135309.
  • Paul Lesage, J., and W. Polasek. 2008. Incorporating transportation network structure in spatial econometric models of commodity flows. Spatial Economic Analysis 3 (2):225–45. doi: 10.1080/17421770801996672.
  • Pereira-López, X., A. Carrascal-Incera, and M. Fernández-Fernández. 2020. A bidimensional reformulation of location quotients for generating input–output tables. Spatial Economic Analysis 15 (4):476–93. doi: 10.1080/17421772.2020.1729996.
  • Pontius, R. G., O. Thontteh, and H. Chen. 2008. Components of information for multiple resolution comparison between maps that share a real variable. Environmental and Ecological Statistics 15 (2):111–42. doi: 10.1007/s10651-007-0043-y.
  • Pourebrahim, N., S. Sultana, A. Niakanlahiji, and J.-C. Thill. 2019. Trip distribution modeling with Twitter data. Computers, Environment and Urban Systems 77:101354. doi: 10.1016/j.compenvurbsys.2019.101354.
  • Rabari, C., and M. Storper. 2015. The digital skin of cities: Urban theory and research in the age of the sensored and metered city, ubiquitous computing and big data. Cambridge Journal of Regions, Economy and Society 8 (1):27–42. doi: 10.1093/cjres/rsu021.
  • Ren, Y., T. Xia, Y. Li, and X. Chen. 2019. Predicting socio-economic levels of urban regions via offline and online indicators. PLoS ONE 14 (7):e0219058. doi: 10.1371/journal.pone.0219058.
  • Riddington, G., H. Gibson, and J. Anderson. 2006. Comparison of gravity model, survey and location quotient-based local area tables and multipliers. Regional Studies 40 (9):1069–81. doi: 10.1080/00343400601047374.
  • Salvini, M. M., and S. I. Fabrikant. 2016. Spatialization of user-generated content to uncover the multirelational world city network. Environment and Planning B: Planning and Design 43 (1):228–48. doi: 10.1177/0265813515603868.
  • Sargento, A. L., P. N. Ramos, and G. J. Hewings. 2012. Inter-regional trade flow estimation through non-survey models: An empirical assessment. Economic Systems Research 24 (2):173–93. doi: 10.1080/09535314.2011.574609.
  • Sawyer, C. H., and R. E. Miller. 1983. Experiments in regionalization of a national input–output table. Environment and Planning A: Economy and Space 15 (11):1501–20. doi: 10.1068/a151501.
  • Serrano, M. Á, and M. Boguñá. 2003. Topology of the world trade web. Physical Review E: Statistical, Nonlinear, and Soft Matter Physics 68 (1, Pt. 2):015101. doi: 10.1103/PhysRevE.68.015101.
  • Simini, F., M. C. González, A. Maritan, and A.-L. Barabási. 2012. A universal model for mobility and migration patterns. Nature 484 (7392):96–100. doi: 10.1038/nature10856.
  • Singleton, A., and D. Arribas-Bel. 2021. Geographic data science. Geographical Analysis 53 (1):61–75. doi: 10.1111/gean.12194.
  • Sinha, P., A. E. Gaughan, F. R. Stevens, J. J. Nieves, A. Sorichetta, and A. J. Tatem. 2019. Assessing the spatial sensitivity of a random forest model: Application in gridded population modeling. Computers, Environment and Urban Systems 75:132–45. doi: 10.1016/j.compenvurbsys.2019.01.006.
  • Sulaiman, S., S. Mariyam Shamsuddin, A. Abraham, and S. Sulaiman. 2011. Intelligent web caching using machine learning methods. Neural Network World 21 (5):429–52. doi: 10.14311/NNW.2011.21.025.
  • Thelwall, M. 2000. Who is using the .co.uk domain? Professional and media adoption of the web. International Journal of Information Management 20 (6):441–53. https://www.sciencedirect.com/science/article/pii/S0268401200000384. doi: 10.1016/S0268-4012(00)00038-4.
  • Thelwall, M. 2002a. Evidence for the existence of geographic trends in university web site interlinking. Journal of Documentation 58 (5):563–74. doi: 10.1108/00220410210441586.
  • Thelwall, M. 2002b. The top 100 linked-to pages on UK university web sites: High inlink counts are not usually associated with quality scholarly content. Journal of Information Science 28 (6):483–91. doi: 10.1177/016555150202800604.
  • Thelwall, M., and L. Vaughan. 2004. A fair history of the Web? Examining country balance in the Internet Archive. Library & Information Science Research 26 (2):162–76. doi: 10.1016/j.lisr.2003.12.009.
  • Thelwall, M., L. Vaughan, and L. Björneborn. 2006. Webometrics. Annual Review of Information Science and Technology 39 (1):81–135. doi: 10.1002/aris.1440390110.
  • Thissen, M., T. de Graaff, and F. van Oort. 2016. Competitive network positions in trade and structural economic growth: A geographically weighted regression analysis for European regions. Papers in Regional Science 95 (1):159–80. doi: 10.1111/pirs.12224.
  • Thissen, M., D. Diodato, and F. Van Oort. 2013a. European regional trade flows: An update for 2000–2010. The Hague: PBL Netherlands Environmental Assessment Agency.
  • Thissen, M., D. Diodato, and F. Van Oort. 2013b. Integrated regional Europe: European regional trade flows in 2000. The Hague: PBL Netherlands Environmental Assessment Agency.
  • Thissen, M., M. Lankhuizen, F. van Oort, B. Los, and D. Diodato. 2018. EUREGIO: The construction of a global IO database with regional detail for Europe for 2000–2010. Tinbergen Institute Discussion Papers 18-084/VI, Tinbergen Institute.
  • Timmer, M. P., E. Dietzenbacher, B. Los, R. Stehrer, and G. J. De Vries. 2015. An illustrated user guide to the world input–output database: The case of global automotive production. Review of International Economics 23 (3):575–605. doi: 10.1111/roie.12178.
  • Tinbergen, J. 1962. Shaping the world economy. New York: The Twentieth Century Fund.
  • Többen, J., and T. H. Kronenberg. 2015. Construction of multi-regional input–output tables using the CHARM method. Economic Systems Research 27 (4):487–507. doi: 10.1080/09535314.2015.1091765.
  • Tranos, E., T. Kitsos, and R. Ortega-Argilés. 2021. Digital economy in the UK: Regional productivity effects of early adoption. Regional Studies 55 (12):1924–38. doi: 10.1080/00343404.2020.1826420.
  • Tranos, E., and C. Stich. 2020. Individual internet usage and the availability of online content of local interest: A multilevel approach. Computers, Environment and Urban Systems 79:101371. doi: 10.1016/j.compenvurbsys.2019.101371.
  • Vaughan, L. 2004. Exploring website features for business information. Scientometrics 61 (3):467–77. doi: 10.1023/B:SCIE.0000045122.93018.2a.
  • Vaughan, L., Y. Gao, and M. Kipp. 2006. Why are hyperlinks to business Websites created? A content analysis. Scientometrics 67 (2):291–300. doi: 10.1007/s11192-006-0100-6.
  • Vaughan, L., and G. Wu. 2004. Links to commercial websites as a source of business information. Scientometrics 60 (3):487–96. doi: 10.1023/B:SCIE.0000034389.14825.bc.
  • Wilting, H. C., A. M. Schipper, O. Ivanova, D. Ivanova, and M. A. Huijbregts. 2021. Subnational greenhouse gas and land-based biodiversity footprints in the European Union. Journal of Industrial Ecology 25 (1):79–94. doi: 10.1111/jiec.13042.
  • Yan, X., X. Liu, and X. Zhao. 2020. Using machine learning for direct demand modeling of ridesourcing services in Chicago. Journal of Transport Geography 83:102661. doi: 10.1016/j.jtrangeo.2020.102661.
  • Zook, M. A. 2000. The web of production: The economic geography of commercial Internet content production in the United States. Environment and Planning A: Economy and Space 32 (3):411–26. doi: 10.1068/a32124.