XPath, descendants, and a splash of Sentry #21

Merged
merged 11 commits into from
Sep 3, 2024
38 changes: 37 additions & 1 deletion .github/workflows/gtest.yml
@@ -12,11 +12,47 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Checkout Sentry Native
uses: actions/checkout@v4
with:
repository: getsentry/sentry-native
path: sentry-native
- name: Apt dance
run: sudo apt-get update && sudo apt-get upgrade -yy
- name: Install libcurl
run: sudo apt-get install libcurl4-openssl-dev
- name: Make build directory
run: mkdir gtest-build
- name: CMake
run: cd gtest-build && cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS=-Werror ..
- name: Download Coverity Scan
run: curl https://scan.coverity.com/download/linux64 -d 'token=${{ secrets.COVERITY_TOKEN }}&project=dwd%2Frapidxml' -o coverity.tar.gz
- name: Unpack Coverity
run: mkdir coverity && cd coverity && tar xf ../coverity.tar.gz && ln -s cov-analysis-* current
- name: Make
-  run: cd gtest-build && make
+  run: cd gtest-build && ../coverity/current/bin/cov-build --dir cov-int make
- name: Run Tests
run: cd gtest-build && ./rapidxml-test
- name: Tar up Coverity output
run: cd gtest-build && tar czf ../cov-build-output.tar.gz cov-int
- name: Upload it
run: |
curl --form token=${{ secrets.COVERITY_TOKEN }} \
--form email=dave@cridland.net \
--form file=@cov-build-output.tar.gz \
--form version="vX" \
--form description="RapidXML (Dave's Version)" \
https://scan.coverity.com/builds?project=dwd%2Frapidxml
- name: SonarQube install
uses: SonarSource/sonarcloud-github-c-cpp@v3
- name: Clean build
run: cd gtest-build && make clean
- name: Build Wrapper
run: cd gtest-build && build-wrapper-linux-x86-64 --out-dir sonar-out make
- name: Sonar Scanner
run: cd gtest-build && sonar-scanner --define sonar.cfamily.compile-commands=sonar-out/compile_commands.json --define sonar.projectKey=dwd-github_rapidxml --define sonar.organization=dwd-github
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
18 changes: 16 additions & 2 deletions CMakeLists.txt
@@ -5,6 +5,7 @@ project(rapidxml)
set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
option(RAPIDXML_PERF_TESTS "Enable (very slow) performance tests" OFF)
option(RAPIDXML_SENTRY "Use Sentry (for tests only)" ON)

include(FetchContent)
FetchContent_Declare(
@@ -16,6 +17,11 @@ set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)

FetchContent_MakeAvailable(googletest)

if (RAPIDXML_SENTRY)
set(SENTRY_BACKEND inproc)
add_subdirectory(sentry-native EXCLUDE_FROM_ALL)
endif(RAPIDXML_SENTRY)

enable_testing()
add_executable(rapidxml-test
test/parse-simple.cpp
@@ -29,10 +35,18 @@ add_executable(rapidxml-test
test/perf.cpp
rapidxml_wrappers.hpp
test/iterators.cpp
rapidxml_predicates.hpp
test/xpath.cpp
rapidxml_generator.hpp
test/main.cc
)
- target_link_libraries(rapidxml-test
- GTest::gtest_main
+ target_link_libraries(rapidxml-test PRIVATE
+ GTest::gtest
)
if(RAPIDXML_SENTRY)
target_link_libraries(rapidxml-test PRIVATE sentry)
target_compile_definitions(rapidxml-test PRIVATE DWD_GTEST_SENTRY=1)
endif()
target_include_directories(rapidxml-test
PUBLIC
${CMAKE_CURRENT_SOURCE_DIR}
28 changes: 23 additions & 5 deletions README.md
@@ -15,16 +15,21 @@ It has breaking changes, the largest of which are:
* There is no need for string termination now, so the parse function never NUL-terminates strings, and that option has vanished.
* Return values that were previously bare pointers are now a safe wrapped pointer which ordinarily will check/throw for nullptr.
* append/prepend/insert_node now also have an append/prepend/insert_element shorthand, which will allow an XML namespace to be included if wanted.
- * Parsing data can be done from a container as well as a NUL-terminated buffer. A NUL-terminated buffer is slightly faster still, and will be used if possible.
+ * Parsing data can be done from a container as well as a NUL-terminated buffer. A NUL-terminated buffer remains slightly faster, and will be used if possible (for example, if you pass in a std::basic_string, it'll call c_str() on it and do that).
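
The "use the NUL-terminated buffer if possible" choice can be made at compile time. What follows is only a sketch of that idea, not the fork's actual code: `has_c_str`, `input_kind` and `classify_input` are hypothetical names invented for illustration, and the sketch only reports which path a given container would take.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical detection trait: true when C offers c_str(), which is how a
// NUL-terminated buffer can be obtained from std::basic_string.
template<typename C, typename = void>
struct has_c_str : std::false_type {};

template<typename C>
struct has_c_str<C, std::void_t<decltype(std::declval<C const &>().c_str())>>
    : std::true_type {};

enum class input_kind { nul_terminated, bounded };

// Hypothetical helper: reports which parsing path a container would take.
template<typename C>
constexpr input_kind classify_input(C const &) {
    if constexpr (has_c_str<C>::value) {
        return input_kind::nul_terminated; // c_str() is available
    } else {
        return input_kind::bounded;        // fall back to begin()/end()
    }
}

static_assert(has_c_str<std::string>::value);
static_assert(!has_c_str<std::string_view>::value);
static_assert(!has_c_str<std::vector<char>>::value);

inline void usage_example() {
    std::string text = "<a/>";
    assert(classify_input(text) == input_kind::nul_terminated);
    assert(classify_input(std::string_view{text}) == input_kind::bounded);
}
```

A `std::string_view` lands on the bounded path because it makes no promise that a NUL follows its last character.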

Not breaking, but kind of nice:
* The parse buffer is now treated as const, and will never be mutated. This incurs a slight performance penalty for handling long text values that have an encoded entity late in the string.
- * The iterators library is now included by default, and updated to handle most idiomatic modern C++ operations.
+ * The iterators library is now included by default, and updated to handle most idiomatic modern C++ operations.

Internal changes:
* There is no longer an internal::measure or internal::compare; these just use the std::char_traits<Ch> functions as used by the string_views.
* Reserialization (that is, using the rapidxml::print family on a tree that is mostly or entirely from parsing) is now much faster, and will optimize itself to use simple buffer copies where the data is unchanged from parsing.
- * Alignment of the allocator uses C++11's alignof/std::align, and so sould be more portable.
+ * Alignment of the allocator uses C++11's alignof/std::align, and so should be more portable.
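
For readers who have not met `alignof`/`std::align`, here is a minimal sketch of how the pair can drive a bump allocator. `bump_pool` and `allocate_for` are hypothetical names and this is not the fork's memory pool; it only shows the standard calls doing the alignment arithmetic that would otherwise be hand-written pointer maths.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical fixed-size bump allocator, for illustration only.
class bump_pool {
public:
    bump_pool() = default;
    bump_pool(bump_pool const &) = delete;
    bump_pool & operator=(bump_pool const &) = delete;

    // Returns storage suitably aligned for T, or nullptr if it does not fit.
    template<typename T>
    void * allocate_for() {
        void * p = m_next;
        std::size_t space = m_remaining;
        // std::align moves p up to the next aligned address and shrinks
        // space by the padding it skipped; it returns nullptr on failure.
        if (!std::align(alignof(T), sizeof(T), p, space)) {
            return nullptr;
        }
        m_next = static_cast<unsigned char *>(p) + sizeof(T);
        m_remaining = space - sizeof(T);
        return p;
    }

private:
    alignas(std::max_align_t) unsigned char m_buffer[256];
    unsigned char * m_next = m_buffer;
    std::size_t m_remaining = sizeof(m_buffer);
};

inline void usage_example() {
    bump_pool pool;
    void * c = pool.allocate_for<char>();   // takes a single byte
    void * d = pool.allocate_for<double>(); // padded up to alignof(double)
    assert(c != nullptr && d != nullptr);
    assert(reinterpret_cast<std::uintptr_t>(d) % alignof(double) == 0);
}
```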

New features:
* Instead of the `doc->allocate_node` / `node->append_node` dance, you can now `node->append_element(name, value)`, where `name` can be either a `string` (or `string_view`, etc) or a tuple like {xmlns, local_name}, which will set an xmlns attribute if needed.
* There's an xpathish thing going on in `rapidxml_predicates`, which lets you search for (or iterate through) elements using a trivial subset of XPath.
* You can get access to containerish things in rapidxml_iterators by methods on nodes/documents, as `node.children()`, `node.attributes()` and a new `node.descendants()`.
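
To make the difference between `children()` and `descendants()` concrete, here is a toy sketch of a descendants walk. It does not use the library: `toy_node` and `collect_descendants` are hypothetical, and the sketch assumes a pre-order, depth-first walk that excludes the starting node (as XPath's descendant axis does), which may not match the fork's behaviour in every detail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an XML element; not rapidxml::xml_node.
struct toy_node {
    std::string name;
    std::vector<toy_node> children;
};

// Appends the names of every descendant of node in pre-order:
// each child first, followed by that child's own descendants.
inline void collect_descendants(toy_node const & node, std::vector<std::string> & out) {
    for (auto const & child : node.children) {
        out.push_back(child.name);
        collect_descendants(child, out);
    }
}

inline void usage_example() {
    // <root><a><b/></a><c/></root>
    toy_node root{"root", {toy_node{"a", {toy_node{"b", {}}}}, toy_node{"c", {}}}};
    std::vector<std::string> names;
    collect_descendants(root, names);
    assert(names.size() == 3); // a, b, c; the direct children alone would be a, c
}
```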

### Fun

@@ -36,7 +41,18 @@ for (auto & child : node.children()) {
}
```

- More in tests/iterators.cpp
+ More in [test/iterators.cpp](./test/iterators.cpp)

Of course, in this case it might be simpler to:

```c++
auto xp = rapidxml::xpath::parse("/potato");
for (auto & child : xp->all(node)) {
scream_for(joy);
}
```

More of that in [test/xpath.cpp](./test/xpath.cpp)

For those of us who lose track of the buffer sometimes, clone_node() now takes an optional second argument of "true" if you want to also clone the strings. Otherwise, nodes will use string_views which reference the original parsed buffer.

@@ -47,7 +63,7 @@ std::basic_string_view<Ch>. Typical usage passed in 0, NULL, or nullptr for unwa
and earlier - use C++23 ideally, but you can pass in {} instead. This should probably be a
std::optional<std::basic_string_view<Ch>> instead.
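
A small sketch of the `{}` idiom, and of the `std::optional` alternative mused about above. The functions `matches` and `matches_opt` are hypothetical and are not part of the library; they only mimic the "empty means don't care" convention with a defaulted view parameter.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical filter: an empty view means "match anything".
// Passing {} builds an empty view; passing nullptr must be avoided.
inline bool matches(std::string_view candidate, std::string_view wanted = {}) {
    return wanted.empty() || candidate == wanted;
}

// Hypothetical optional-based variant: "no filter" (nullopt) is now
// distinct from "filter on the empty string".
inline bool matches_opt(std::string_view candidate,
                        std::optional<std::string_view> wanted = std::nullopt) {
    return !wanted || candidate == *wanted;
}

inline void usage_example() {
    assert(matches("potato", {}));                    // unwanted value, spelled {}
    assert(!matches_opt("potato", std::string_view{})); // empty string is a real filter here
}
```

The optional form costs a little more to pass around, but it removes the ambiguity between an absent value and an empty one.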

- ## Changes
+ ## Changes to the original

I needed a library for fast XMPP processing (reading, processing, and reserializing), and this mostly fit the bill. However, not entirely, so this version adds:

@@ -61,6 +77,8 @@ The other thing this fork added was a file of simple tests, which I've recently

The original makes reference to an expansive test suite, but this was not included in the open source release. I'll expand these tests as and when I need to.

The tests use a driver which can optionally use Sentry for performance/error tracking. To enable it, use the CMake option RAPIDXML_SENTRY, clone the [sentry-native](https://github.com/getsentry/sentry-native) repository into the root, and set SENTRY_DSN in the environment when running `rapidxml-test`. None of the submodules are needed, but it'll need libcurl, so `sudo apt install libcurl4-openssl-dev`.

## Pull Requests

Erm. I didn't expect any, so never set up any of the infrastructure for them - this was really a fork-of-convenience for me. Not that they're unwelcome, of course, just entirely unexpected.
11 changes: 8 additions & 3 deletions rapidxml.hpp
@@ -163,6 +163,7 @@ namespace rapidxml
template<typename Ch> class xml_attribute;
template<typename Ch> class xml_document;
template<typename Ch> class children;
template<typename Ch> class descendants;
template<typename Ch> class attributes;

//! Enumeration listing all node types produced by the parser.
@@ -1052,6 +1053,10 @@ namespace rapidxml
return rapidxml::children<Ch>{*this};
}

rapidxml::descendants<Ch> descendants() const {
return rapidxml::descendants<Ch>{optional_ptr<xml_node<Ch>>{const_cast<xml_node<Ch> *>(this)}};
}

rapidxml::attributes<Ch> attributes() const {
return rapidxml::attributes<Ch>{*this};
}
@@ -1095,7 +1100,7 @@ namespace rapidxml
}
for (xml_node<Ch> *child = m_last_node; child; child = child->m_prev_sibling) {
if ((name.empty() || child->name() == name)
- && (!xmlns || child->xmlns() == xmlns)) {
+ && (xmlns.empty() || child->xmlns() == xmlns)) {
return child;
}
}
@@ -1112,7 +1117,7 @@ namespace rapidxml
optional_ptr<xml_node<Ch>> previous_sibling(view_type const & name = {}, view_type const & asked_xmlns = {}) const
{
assert(this->m_parent); // Cannot query for siblings if node has no parent
- if (name)
+ if (!name.empty())
{
view_type xmlns = asked_xmlns;
if (xmlns.empty() && !name.empty()) {
@@ -1122,7 +1127,7 @@ namespace rapidxml
}
for (xml_node<Ch> *sibling = m_prev_sibling; sibling; sibling = sibling->m_prev_sibling)
if ((name.empty() || sibling->name() == name)
- && (!xmlns || sibling->xmlns() == xmlns))
+ && (xmlns.empty() || sibling->xmlns() == xmlns))
return sibling;
return nullptr;
}
80 changes: 80 additions & 0 deletions rapidxml_generator.hpp
@@ -0,0 +1,80 @@
//
// Created by dave on 29/07/2024.
//

#ifndef RAPIDXML_RAPIDXML_GENERATOR_HPP
#define RAPIDXML_RAPIDXML_GENERATOR_HPP

#include <coroutine>
#include <exception>
#include <iterator>
#include <type_traits>
#include <utility>

namespace rapidxml {
template<typename T>
class generator {
public:
using value_pointer = std::remove_reference_t<T> *;
struct handle_type;
struct promise_type {
value_pointer value;

std::suspend_always yield_value(T & v) {
value = &v;
return {};
}

std::suspend_never initial_suspend() {
return {};
}

std::suspend_always final_suspend() noexcept {
return {}; // Stay suspended at the end so done() and destroy() remain valid.
}

void return_void() {}

void unhandled_exception() {
std::terminate();
}

generator get_return_object() {
return generator{handle_type{handle_type::from_promise(*this)}};
}
};

struct handle_type : std::coroutine_handle<promise_type> {
explicit handle_type(std::coroutine_handle<promise_type> && h) : std::coroutine_handle<promise_type>(std::move(h)) {}

T &operator*() {
return *(this->promise().value);
}

void operator++() {
this->resume();
}

bool operator!=(std::default_sentinel_t) const {
return !this->done();
}
};

explicit generator(handle_type h) : m_handle(h) {}

~generator() {
if (m_handle)
m_handle.destroy();
}

handle_type begin() {
return m_handle;
}

std::default_sentinel_t end() {
return std::default_sentinel;
}

private:
handle_type m_handle{};
};
}

#endif //RAPIDXML_RAPIDXML_GENERATOR_HPP