Fix use-after-free and leak in encoder ident handling #355
Merged
- Fix double Py_XDECREF on `ident` in encoder_listencode_obj when PyDict_DelItem(markers, ident) fails. The ident was XDECREF'd at line 2957 inside the if-block, then again at line 2960 unconditionally. This causes a use-after-free / refcount underflow segfault when the default() callback clears the markers dict.
- Fix ident leak on Py_EnterRecursiveCall failure. When the recursion limit is hit, ident (from PyLong_FromVoidPtr) was never released, leaking ~158 bytes per RecursionError.

Closes simplejson#349

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
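The two ownership mistakes described above can be illustrated with a small toy model. This is only a sketch in Python, not the C code from the patch: `ToyRef`, `finish_buggy`, `finish_fixed`, `recurse_buggy`, and `recurse_fixed` are hypothetical names invented for illustration, and the toy raises an exception where real C code would silently corrupt memory.

```python
class ToyRef:
    """Hypothetical stand-in for an object with a C-style reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1  # the creator owns exactly one reference

    def decref(self):
        # Real Py_XDECREF would not raise here; it would underflow the
        # count or touch freed memory. The toy raises to make that visible.
        if self.refcount <= 0:
            raise RuntimeError("decref on already-released " + self.name)
        self.refcount -= 1


def finish_buggy(ident, delitem_fails):
    """Bug shape: release inside the error branch, then release again."""
    rv = 0
    if delitem_fails:
        ident.decref()
        rv = -1
    ident.decref()  # second release when delitem_fails is true
    return rv


def finish_fixed(ident, delitem_fails):
    """Fix shape: record the error, then release exactly once."""
    rv = -1 if delitem_fails else 0
    ident.decref()
    return rv


def recurse_buggy(ident, limit_hit):
    """Leak shape: early return on failure without releasing ident."""
    if limit_hit:
        return -1
    ident.decref()
    return 0


def recurse_fixed(ident, limit_hit):
    """Fix shape: release ident on the failure path as well."""
    if limit_hit:
        ident.decref()
        return -1
    ident.decref()
    return 0
```

In this model, `finish_buggy(ToyRef("ident"), True)` raises because the reference is released twice, while `recurse_buggy(ToyRef("ident"), True)` returns with the count still at 1, which is the leak. The fixed variants leave the count at 0 on every path.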
3da557e to
aa9182d
Compare
etrepum
approved these changes
Apr 6, 2026
netbsd-srcmastr
pushed a commit
to NetBSD/pkgsrc
that referenced
this pull request
Apr 20, 2026
Version 4.0.1 released 2026-04-18

* Skip uploading Pyodide/wasm wheels to PyPI, which rejects them with "unsupported platform tag 'pyodide_2024_0_wasm32'". The wheels are still built in CI and preserved as workflow artifacts.
  simplejson/simplejson#375

Version 4.0.0 released 2026-04-18

* simplejson 4 requires Python 2.7 or Python 3.8+. Older Python versions (2.5, 2.6, 3.0-3.7) are no longer supported. pip will not install simplejson 4 on unsupported versions.
* The C extension now uses heap types and per-module state instead of static types and global state. This is required for free-threading support and sub-interpreter isolation. The Python-level API is unchanged.
* Full support for Python 3.13+ free-threading (PEP 703). The C extension is now safe to use with the GIL disabled (python3.14t):
  - Converted all static types to heap types with per-module state
  - Added per-object critical sections to scanner and encoder
  - Added free-threading-safe dict operations for Python 3.13+
  - Unified per-module state management and templated parser
  simplejson/simplejson#363
  simplejson/simplejson#364
  simplejson/simplejson#365
  simplejson/simplejson#367
  simplejson/simplejson#369
* Numerous C extension memory safety fixes:
  - Fix use-after-free and leak in encoder ident handling
  - Fix NULL dereferences on OOM in module init and static string init
  - Fix reference leaks in dict encoder (skipkeys item, variable shadowing)
  - Fix member table copy-paste, exception clobbering, missing Py_VISIT
  - Fix error-as-truthy bugs in maybe_quote_bigint and is_raw_json
  - Fix iterable_as_array swallowing MemoryError and KeyboardInterrupt
  - Fix for_json and _asdict swallowing MemoryError, KeyboardInterrupt, and other non-AttributeError exceptions raised by user __getattr__
  simplejson/simplejson#355
  simplejson/simplejson#356
  simplejson/simplejson#357
  simplejson/simplejson#358
  simplejson/simplejson#359
  simplejson/simplejson#360
  simplejson/simplejson#373
* C/Python parity fixes:
  - Fix C scanstring off-by-one bounds checks that caused truncated or boundary \uXXXX escapes to raise "Invalid \\uXXXX escape sequence" instead of "Unterminated string", and report error position at the 'u' instead of the leading backslash. The C and Python decoders now agree on exception class, message, and position across all tested edge cases.
  - Align the Python encoder's dispatch order with the C encoder for objects that define _asdict(). Previously a list/tuple/dict subclass with an _asdict() method encoded as its container type under the Python encoder and as the _asdict() return value under the C encoder; both now check _asdict() before list/tuple/dict. for_json() continues to outrank _asdict() in both.
  - Fix C scanstring raising a plain ValueError ("end is out of bounds") instead of JSONDecodeError for out-of-range end indices. User code with `except JSONDecodeError:` now catches both the C and pure-Python paths consistently.
  simplejson/simplejson#372
* C extension performance and correctness improvements:
  - Add PyDict_Next fast path for unsorted exact-dict encoding, avoiding intermediate items list and N tuple allocations
  - Add indexed fast path for exact list/tuple encoding, avoiding iterator allocation and per-item PyIter_Next overhead
  - Use PyUnicodeWriter as JSON_Accu backend on Python 3.14+, eliminating intermediate string objects and ''.join calls
  - Fix integer overflow in ascii_escape output_size calculation that could cause buffer overwrite on pathologically large strings
  - Fix list encoder separator counter overflow (int to Py_ssize_t)
  - Dead code cleanup (unreachable NULL checks, do-while wrappers)
  simplejson/simplejson#370
* Added Python 3.14 support and updated to cibuildwheel 3.2.1. CI now tests free-threaded (3.14t) and debug builds with -Werror, refcount leak detection, and GIL-disabled mode.
  simplejson/simplejson#343
* Added a ThreadSanitizer (TSan) stress test CI job. Builds a TSan-instrumented free-threaded CPython (cached between runs) and runs a concurrent stress test script against the C extension to catch data races under free-threading.
  simplejson/simplejson#373
* Replace deprecated license classifiers with SPDX license expression
  simplejson/simplejson#347
* Documented RawJSON usage with examples and caveats
  simplejson/simplejson#346
* Added pyproject.toml for PEP 517 build support. setup.py is retained for Python 2.7 wheel builds and backwards compatibility.
* Migrated build_ext import from distutils to setuptools in setup.py. The distutils.errors imports are kept since setuptools vendors distutils on Python 3.12+ where stdlib distutils was removed.
* CI now tests PEP 517 builds (pyproject.toml) alongside the existing setup.py-based builds.
* Added Pyodide (wasm32) wheel builds with C speedups via cibuildwheel. Previously Pyodide users fell back to the pure-Python wheel; now they get the compiled C extension cross-compiled to WebAssembly. Thread and subprocess tests are skipped on Emscripten where those APIs are unavailable.
* Test suite now fails (instead of skipping) when C speedups are missing during cibuildwheel runs, catching broken extension builds early.
* New ``array_hook`` parameter for ``loads()``, ``load()``, and ``JSONDecoder``. Called with each decoded JSON array (as a list), its return value replaces the list. Analogous to ``object_hook`` for dicts. Works with both the Python decoder and C scanner. (Matches CPython 3.15 json module.)
* Trailing comma detection: the decoder now raises ``JSONDecodeError`` with "Illegal trailing comma before end of object/array" for inputs like ``[1,]`` and ``{"a": 1,}`` instead of generic error messages. Both the Python decoder and C scanner are updated. (Matches CPython 3.13+ json module.)
* ``frozendict`` encoding support: when ``frozendict`` is available (CPython 3.15+ PEP 814), it is encoded as a JSON object just like ``dict``. No effect on older Python versions.
* Serialization errors now include ``add_note()`` context on Python 3.11+ (PEP 678), annotating exceptions with the path to the error, e.g. "when serializing list item 1" / "when serializing dict item 'key'". Only applies to the Python encoder.
* New C fast path for ``encode_basestring`` (``ensure_ascii=False``). Previously the non-ASCII string encoder fell back to pure Python; now it has a C implementation matching the existing ``encode_basestring_ascii`` fast path.
  simplejson/simplejson#207
* The Python decoder now rejects non-ASCII digits (e.g. fullwidth ``\uff10``) in JSON numbers, matching the C scanner behavior. The ``NUMBER_RE`` regex was changed from ``\d`` to ``[0-9]``.
* Removed dead single-phase init code for Python 3.3/3.4 from the C extension (these versions are no longer supported).
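The changelog entry about ``NUMBER_RE`` moving from ``\d`` to ``[0-9]`` rests on a property of Python's `re` module: for `str` patterns, ``\d`` matches any Unicode decimal digit, not just ASCII. The sketch below shows that difference using only the standard library. The two patterns are simplified stand-ins chosen for illustration; they are not simplejson's actual ``NUMBER_RE``.

```python
import re

# Simplified, hypothetical stand-ins for the digit part of a number regex.
UNICODE_DIGITS = re.compile(r"\d+")    # matches any Unicode decimal digit
ASCII_DIGITS = re.compile(r"[0-9]+")   # matches ASCII digits only

fullwidth = "\uff11\uff12"  # FULLWIDTH DIGIT ONE, FULLWIDTH DIGIT TWO

# \d accepts the fullwidth digits, so a \d-based number regex would
# treat them as part of a JSON number.
accepted_by_unicode = UNICODE_DIGITS.fullmatch(fullwidth) is not None

# [0-9] rejects them, which is what the JSON grammar requires.
accepted_by_ascii = ASCII_DIGITS.fullmatch(fullwidth) is not None

# Plain ASCII digits are accepted by both patterns.
ascii_ok = ASCII_DIGITS.fullmatch("12") is not None
```

Here `accepted_by_unicode` is true and `accepted_by_ascii` is false, while ordinary input such as `"12"` matches either way.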
netbsd-srcmastr
pushed a commit
to NetBSD/pkgsrc
that referenced
this pull request
May 13, 2026
netbsd-srcmastr
pushed a commit
to NetBSD/pkgsrc
that referenced
this pull request
May 21, 2026
Fixes #349 (part of #348).
Changes
- Fix double `Py_XDECREF` on `ident` (Finding 14): When `PyDict_DelItem(s->markers, ident)` fails in `encoder_listencode_obj`, `ident` was XDECREF'd inside the `if` block, then XDECREF'd again unconditionally. Now uses a single `Py_DECREF` after the error check, matching the correct pattern in `encoder_listencode_dict` (line 3086).
- Fix `ident` leak on `Py_EnterRecursiveCall` failure (Finding 15): `ident` acquired via `PyLong_FromVoidPtr` was never released when `Py_EnterRecursiveCall` fails. Added `Py_XDECREF(ident)` before `return rv`.

Reproducers
Finding 14 segfault (from issue):
Found using cext-review-toolkit.