# GSoC Project Task: Supporting Migration from XML Package
Welcome to the R-xml2 repository! This repository contains tests and examples for working with XML content using the xml2 package in R. Below you'll find information on how to navigate this repository and run the provided tests. This project aims to contribute to the migration efforts from the XML package to the xml2 package in R.
## Table of Contents
- [Introduction](#introduction)
- [Tests Overview](#tests-overview)
- [Installation](#installation)
- [Usage](#usage)
- [Viewing Test Results](#viewing-test-results)
- [Contributing](#contributing)
- [License](#license)
## Introduction
The XML package in R has been in maintenance mode for several years, and there's a push to migrate packages depending on XML to alternatives such as xml2, which is actively developed. This project aims to support this migration effort by contributing patches to packages dependent on XML, implementing the switch to xml2, and documenting example mappings from XML to xml2 code.
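
To illustrate the kind of mapping this project documents, the sketch below extracts the same values once with the XML package and once with xml2. It is an illustrative example, not code taken from this repository; the variable names (`movies_xml`, `titles_old`, `titles_new`) are hypothetical.

```r
library(XML)
library(xml2)

# A small sample document (hypothetical, modelled on the movie data used in the easy test)
movies_xml <- paste0(
  "<movies>",
  "<movie><title>Good Will Hunting</title></movie>",
  "<movie><title>Y tu mama tambien</title></movie>",
  "</movies>"
)

# XML package: parse the string, then apply xmlValue to every node matched by the XPath
doc_old <- XML::xmlParse(movies_xml, asText = TRUE)
titles_old <- XML::xpathSApply(doc_old, "//title", XML::xmlValue)

# xml2 package: parse the string, find the nodes, then extract their text
doc_new <- xml2::read_xml(movies_xml)
titles_new <- xml2::xml_text(xml2::xml_find_all(doc_new, "//title"))

# Both approaches should yield the same titles
all(titles_old == titles_new)
```

In this pairing, `xmlParse()` corresponds to `read_xml()`, and `xpathSApply(..., xmlValue)` corresponds to `xml_find_all()` followed by `xml_text()`.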
## Tests Overview
This repository contains three tests:
- `Easy Test`: Demonstrates basic XML parsing using the xml2 package. It involves extracting specific information from a simple XML document.
- `Medium Test`: Builds upon the concepts covered in the easy test and introduces more advanced XML parsing techniques. It involves replicating a given analysis using functions from the XML package.
- `Hard Test`: The most challenging test, requiring advanced XML parsing skills. It involves writing a custom function to parse XML code without depending on the XML package.

The tests increase progressively in difficulty, so working through them in order covers basic parsing first and custom parsing last.
## Installation
Before running the tests or contributing to this project, ensure that you have the following packages installed:
1. xml2
2. XML
3. stringr
You can install these packages from CRAN with `install.packages(c("xml2", "XML", "stringr"))`.
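
As a slightly more defensive variant, the sketch below installs only the packages that are not already available. The helper `missing_packages()` is a hypothetical name introduced for illustration, and the code assumes a CRAN mirror is configured.

```r
# Packages required by the tests in this repository
required <- c("xml2", "XML", "stringr")

# Hypothetical helper: return the subset of `pkgs` that cannot be loaded
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Install only what is missing
to_install <- missing_packages(required)
if (length(to_install) > 0) {
  install.packages(to_install)
}
```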
## Usage
To run the tests for this project, follow these steps:
1. Clone this repository to your local machine.
2. Open RStudio or any other R environment.
3. Load the required packages with `library(xml2)`, `library(XML)`, and `library(stringr)`.
4. Navigate to the directory where the repository is cloned.
5. Run the desired test script (`easy-test.R`, `medium-test.R`, or `hard-test.R`) in your R environment, for example with `source("easy-test.R")`.
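
The steps above can be sketched as a small helper. This is a minimal sketch: the function names `test_script_path()` and `run_test()` are hypothetical, and it assumes the scripts sit at the top level of the cloned repository under the file names listed above.

```r
library(xml2)
library(XML)
library(stringr)

# Hypothetical helper: build the path to one of the three test scripts
test_script_path <- function(level = c("easy", "medium", "hard"), repo_dir = ".") {
  level <- match.arg(level)
  file.path(repo_dir, paste0(level, "-test.R"))
}

# Hypothetical helper: run a test script from inside the repository directory
run_test <- function(level = "easy", repo_dir = ".") {
  script <- test_script_path(level, repo_dir)
  if (!file.exists(script)) {
    stop("Script not found: ", script)
  }
  # chdir = TRUE makes relative paths inside the script resolve against its own folder
  source(script, echo = TRUE, chdir = TRUE)
}

# Example (assumes the working directory is the cloned repository):
# run_test("easy")
```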
## Viewing Test Results
Click on the specific test to view the results:
- [Easy Test Analysis](https://tushar98644.github.io/R-xml2/output/easy)
- [Medium Test Analysis](https://tushar98644.github.io/R-xml2/output/medium)
- [Hard Test Analysis](https://tushar98644.github.io/R-xml2/output/hard)
## Contributing
Contributions to this repository are welcome! If you have ideas for additional tests, improvements to existing tests, or any other enhancements, feel free to open an issue or submit a pull request.
Before contributing, please review the contribution guidelines.
## License
This project is licensed under the [GNU General Public License v3.0 (GPL-3.0)](LICENSE).
# Easy Test Analysis

The easy test focuses on basic XML parsing using the xml2 package. It involves extracting specific information from a simple XML document. The code snippet below demonstrates how to load the xml2 package and parse a simple XML document to extract the director name for the second movie.

## Setting Up the Environment

### Section 1: Loading Libraries and XML String

```r
library(xml2)     # XML parsing and navigation
library(stringr)  # string manipulation (not used in this snippet)

# The XML document, stored as a character vector with one element per line
xml_string <- c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<movies>',
  '<movie mins="126" lang="eng">',
  '<title>Good Will Hunting</title>',
  '<director>',
  '<first_name>Gus</first_name>',
  '<last_name>Van Sant</last_name>',
  '</director>',
  '<year>1998</year>',
  '<genre>drama</genre>',
  '</movie>',
  '<movie mins="106" lang="spa">',
  '<title>Y tu mama tambien</title>',
  '<director>',
  '<first_name>Alfonso</first_name>',
  '<last_name>Cuaron</last_name>',
  '</director>',
  '<year>2001</year>',
  '<genre>drama</genre>',
  '</movie>',
  '</movies>')
```

- The xml2 library is loaded to handle XML data in R.
- The stringr library is loaded for string manipulation, though it is not used in this snippet.
- An XML string representing a list of movies is defined, including details like title, director, year, and genre.

### Section 2: Parsing the XML Document

```r
doc <- read_xml(paste(xml_string, collapse = ''))
doc
## {xml_document}
## <movies>
## [1] <movie mins="126" lang="eng">\n <title>Good Will Hunting</title>\n <dir ...
## [2] <movie mins="106" lang="spa">\n <title>Y tu mama tambien</title>\n <dir ...
```

- The `read_xml` function from the xml2 package is used to parse the XML string into an XML document object.
- The `paste` function with `collapse = ''` is used to concatenate the elements of `xml_string` into a single string before parsing.
- The parsed XML document is stored in the variable `doc`.
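
A quick way to confirm that parsing worked is to inspect the root element. The sketch below uses a shortened version of the movie document (an assumption made to keep the example small) and checks the root name, the number of child elements, and one attribute per movie.

```r
library(xml2)

# Shortened sample document, one element of the vector per line
xml_lines <- c(
  '<movies>',
  '<movie mins="126" lang="eng"><title>Good Will Hunting</title></movie>',
  '<movie mins="106" lang="spa"><title>Y tu mama tambien</title></movie>',
  '</movies>'
)

# Collapse the vector into one string, then parse it
doc <- read_xml(paste(xml_lines, collapse = ""))

root_name <- xml_name(doc)                      # name of the root element
n_movies  <- xml_length(doc)                    # number of child elements
languages <- xml_attr(xml_children(doc), "lang") # "lang" attribute of each movie
```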

### Section 3: Navigating the XML Document

```r
# Select the second <movie> element by position
tu_mama <- xml_child(doc, search = 2)
tu_mama
## {xml_node}
## <movie mins="106" lang="spa">
## [1] <title>Y tu mama tambien</title>
## [2] <director>\n <first_name>Alfonso</first_name>\n <last_name>Cuaron</last ...
## [3] <year>2001</year>
## [4] <genre>drama</genre>

# List all child nodes of the selected movie
xml_children(tu_mama)
## {xml_nodeset (4)}
## [1] <title>Y tu mama tambien</title>
## [2] <director>\n <first_name>Alfonso</first_name>\n <last_name>Cuaron</last ...
## [3] <year>2001</year>
## [4] <genre>drama</genre>
```

- The `xml_child` function is used to select a specific child node by its index, in this case the second movie.
- The `xml_children` function lists all child nodes of an XML node; here it returns the four children of the second movie.
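
Selecting by position is one option; an XPath expression is another. The sketch below, using a shortened version of the movie document (an assumption for brevity), shows that `xml_child()` with an index and `xml_find_first()` with a positional XPath reach the same node.

```r
library(xml2)

doc <- read_xml(paste0(
  '<movies>',
  '<movie mins="126" lang="eng"><title>Good Will Hunting</title></movie>',
  '<movie mins="106" lang="spa"><title>Y tu mama tambien</title></movie>',
  '</movies>'
))

# Option 1: second child of the root, selected by index
by_index <- xml_child(doc, search = 2)

# Option 2: second <movie> element, selected by XPath
by_xpath <- xml_find_first(doc, "/movies/movie[2]")

# Both nodes carry the same title
title_by_index <- xml_text(xml_child(by_index, "title"))
title_by_xpath <- xml_text(xml_child(by_xpath, "title"))
```

The XPath form tends to be more robust when the root has children of other types, because `movie[2]` counts only `<movie>` elements.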
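
```r
# Select the <director> child of the second movie by name
director <- xml_child(tu_mama, "director")
director
## {xml_node}
## <director>
## [1] <first_name>Alfonso</first_name>
## [2] <last_name>Cuaron</last_name>

# List the nodes contained in <director>
xml_contents(director)
## {xml_nodeset (2)}
## [1] <first_name>Alfonso</first_name>
## [2] <last_name>Cuaron</last_name>
```
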
- The `xml_child` function is used again to select the "director" child node of the selected movie.
- The `xml_contents` function lists all nodes within the "director" node.
- The `xml_text` function extracts the text content of the "director" node, providing the director's name.
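
Because `xml_text()` on the "director" node concatenates the text of all its descendants, the first and last name may run together when the document contains no whitespace between elements. A minimal sketch of one way to get a cleanly separated name is to extract the text of each child and join the parts with a space; the variable names `name_parts` and `director_name` are hypothetical.

```r
library(xml2)

doc <- read_xml(paste0(
  '<movies>',
  '<movie mins="126" lang="eng"><title>Good Will Hunting</title>',
  '<director><first_name>Gus</first_name><last_name>Van Sant</last_name></director></movie>',
  '<movie mins="106" lang="spa"><title>Y tu mama tambien</title>',
  '<director><first_name>Alfonso</first_name><last_name>Cuaron</last_name></director></movie>',
  '</movies>'
))

# Navigate to the director of the second movie
director <- xml_child(xml_child(doc, search = 2), "director")

# Extract the text of each child element, then join with a space
name_parts <- xml_text(xml_children(director))
director_name <- paste(name_parts, collapse = " ")
director_name
```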
