Main Corpus on OECD reports on water between 2009 and 2021



This is a corpus of 55 PDF reports published in the OECD Studies on Water series between 2009 and 2021. It is the basis of a project funded by the e-Science center and the ODISSEI consortium, awarded to ESSB at EUR in 2022. The code produced is available on GitHub at the following links, where the methodology, sampling and results are also listed. Structural Topic Modelling (STM) -- GitHub link: Word frequencies of executive summaries vis-à-vis main texts -- GitHub link: Named Entity Recognition and Semantic Label Mapping -- The data is licensed by the OECD and can be obtained from the OECD on request. The data description is to be found here --
Date made available: 2022
