SYSTEM Cited by 1 source

Nokogiri

Nokogiri (nokogiri.org) is a Ruby wrapper around native XML and HTML parsers: libxml2 + libgumbo (HTML5 via Google's Gumbo parser) on CRuby, Xerces on JRuby. It's the dominant high-performance XML / HTML parser in the Ruby ecosystem — and crucially for XML-DSig use cases, it supports XML canonicalisation (C14N), which REXML does not.

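require 'nokogiri'

# XML-DSig namespace URI, bound to the "ds" prefix in the XPath queries below.
DSIG = 'http://www.w3.org/2000/09/xmldsig#'
# Assumption for this excerpt: exclusive C14N; the algorithm normally comes from the document.
canon_algorithm = Nokogiri::XML::XML_C14N_EXCLUSIVE_1_0
doc = Nokogiri::XML(xml) # xml: the raw XML string to verify
noko_sig_element = doc.at_xpath('//ds:Signature', 'ds' => DSIG)
canon_string = noko_sig_element.at_xpath('./ds:SignedInfo', 'ds' => DSIG)
                               .canonicalize(canon_algorithm)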

Nokogiri methods are called on the document (or on a node returned from it), which distinguishes them visually from REXML's REXML::XPath.first(...) idiom. Seeing both styles in one file is a review-time tell for the two-parser code pattern that enables parser-differential attacks.

Error-handling quirk relevant to security

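With its default parse options, which include RECOVER, Nokogiri does not raise exceptions on malformed XML; parse errors accumulate silently on doc.errors. Turning recovery off with STRICT and checking doc.errors explicitly surfaces them: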

doc = Nokogiri::XML(xml) do |config|
  config.options = Nokogiri::XML::ParseOptions::STRICT |
                   Nokogiri::XML::ParseOptions::NONET
end
raise "XML errors when parsing: " + doc.errors.to_s if doc.errors.any?

Strict mode + explicit error check is a partial mitigation that can stop some (but not all) parser-differential exploits against ruby-saml. From the disclosure: "checking for Nokogiri errors could not have prevented the parser differential, but could have stopped at least one practical exploitation of it."

Role in the ruby-saml parser differential

In ruby-saml's xml_security.rb, Nokogiri canonicalises <ds:SignedInfo> (the input to signature verification), looks up the referenced <Assertion> by ID, and canonicalises it for digest hashing. REXML locates the <ds:Signature> element and extracts <ds:SignatureValue> and <ds:DigestValue>. On attacker-crafted input the two parsers' views of the same document diverge, which is the parser differential at the core of CVE-2025-25291 and CVE-2025-25292. See sources/2025-03-15-github-sign-in-as-anyone-bypassing-saml-sso-authentication-with-parser-differentials.

