WoS XML

Access guide to the Web of Science Raw Data (XML)

About

The Web of Science (WoS) XML data provides the raw data behind the Web of Science Database for the years 1900-2022.

Terms and Conditions for Access and Use

  • The Web of Science XML data is intended for non-commercial, academic research.
  • The data is restricted to use by faculty, staff, students, and researchers at the Georgia Institute of Technology.  Because the Data’s publisher must provide prior approval for use or storage on devices physically located outside of the United States, you must first seek and receive written approval from the Library’s Data Scientist Librarian for any use of the Data outside the United States.  Commercial use of the data or derivatives is strictly prohibited.
  • The data and derivatives may not be shared outside of Georgia Institute of Technology including other universities, institutions, government agencies, or corporate entities.
  • You may no longer use the data set if your affiliation with Georgia Tech ends, including graduation, retirement, resignation, or termination.
  • Where users quote and excerpt Licensed Information in their work as permitted by the Agreement, they must appropriately cite and credit Clarivate as the source. Attribution to Clarivate and use of the Licensed Information must not categorize or identify Clarivate as an ‘expert’ in any context, and must ensure that Licensed Information is not misrepresented or taken out of context. Without our prior written consent, the Licensed Information shall not be filed with any securities authorities.

This is a large and complex data set, and the GT Library will continue to evolve its support for the product.  We are currently at Phase 0.

Phase 0:  Spring 2021 

  • A Data Scientist Librarian will mediate and provide access to the data.  End-users are responsible for abiding by the terms and conditions for access and use and for developing their own infrastructure to analyze the file.  (See code examples below.)
  • To request access to the data please contact jay.forrest@library.gatech.edu.

Phase 1:  Summer/Fall 2025 (estimate)

  • Data access will be provided by direct download via a web portal.  End-users will be responsible for abiding by the terms and conditions for access and use and for developing their own infrastructure to analyze the file.  (See code examples below.)

Phase 2:  Fall 2025/Spring 2026 (estimate)

  • Data access will be provided by direct download and via a database solution.  End-users will be responsible for abiding by the terms and conditions for access and use.  End-users will be able to use their own infrastructure or create structured queries via the database solution.
  • The Library will provide end-user training for the database solution.

 

Code Examples
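
Because the data set is large, one common starting point is to stream the XML one record at a time rather than load a whole file into memory. The following is a minimal sketch of that approach using Python's standard library, not an official or Library-supplied method. The element names REC and UID, the sample namespace, and the sample identifiers are assumptions made for illustration; check them against the files you actually receive.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}REC' down to 'REC'."""
    return tag.rsplit("}", 1)[-1]


def iter_record_ids(source, record_tag="REC", id_tag="UID"):
    """Yield the identifier text of each record, one record at a time.

    record_tag and id_tag are assumptions about the file layout; adjust
    them to match the files you actually receive.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != record_tag:
            continue
        # Look for the first identifier element inside this record.
        for child in elem.iter():
            if local_name(child.tag) == id_tag:
                yield child.text
                break
        # Release the finished record's contents to limit memory use.
        elem.clear()


# Tiny made-up sample standing in for a real data file (hypothetical content).
SAMPLE = b"""<records xmlns="http://example.org/hypothetical-schema">
  <REC><UID>EXAMPLE:0001</UID></REC>
  <REC><UID>EXAMPLE:0002</UID></REC>
</records>"""

record_ids = list(iter_record_ids(io.BytesIO(SAMPLE)))
print(record_ids)
```

To run the same function over a file on disk, pass a path or an open binary file object in place of the in-memory sample.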