Fix use-after-free and leak in encoder ident handling #355

Merged: etrepum merged 1 commit into simplejson:master from devdanzin:fix/ident-use-after-free on Apr 6, 2026
@devdanzin

Contributor

Fixes #349 (part of #348).

Changes

  1. Fix double Py_XDECREF on ident (Finding 14): When PyDict_DelItem(s->markers, ident) fails in encoder_listencode_obj, ident was XDECREF'd inside the if block, then XDECREF'd again unconditionally. Now uses a single Py_DECREF after the error check, matching the correct pattern in encoder_listencode_dict (line 3086).

  2. Fix ident leak on Py_EnterRecursiveCall failure (Finding 15): ident acquired via PyLong_FromVoidPtr was never released when Py_EnterRecursiveCall fails. Added Py_XDECREF(ident) before return rv.
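The two refcount problems described above can be modeled in plain Python. This is only a toy sketch: `ToyRef` and the four helper functions are hypothetical names invented for illustration, the real code is C inside `encoder_listencode_obj`, and the exact placement of the release in the C source may differ from what is shown here.

```python
class ToyRef:
    """Stand-in for a PyObject* with a manually managed reference count."""

    def __init__(self):
        self.refcount = 1  # a freshly created object owns one reference

    def decref(self):
        # In C, dropping a reference that is already gone is undefined
        # behavior; the toy model raises instead of crashing.
        if self.refcount == 0:
            raise RuntimeError("refcount underflow (use-after-free in C)")
        self.refcount -= 1


def buggy_cleanup(ident, del_item_ok):
    """Old shape of Finding 14: release in the error branch, then again."""
    if not del_item_ok:
        ident.decref()  # released inside the failure branch...
    ident.decref()      # ...and released a second time unconditionally
    return del_item_ok


def fixed_cleanup(ident, del_item_ok):
    """Fixed shape: exactly one release whether or not deletion succeeded."""
    ident.decref()
    return del_item_ok


def buggy_enter(ident, recursion_ok):
    """Old shape of Finding 15: early return without releasing ident."""
    if not recursion_ok:
        return False  # ident keeps its reference forever: a leak
    ident.decref()
    return True


def fixed_enter(ident, recursion_ok):
    """Fixed shape: the early-return path also releases ident."""
    if not recursion_ok:
        ident.decref()
        return False
    ident.decref()
    return True
```

In this model the failing-delete path of `buggy_cleanup` raises on the second `decref()`, while `buggy_enter` leaves `refcount` at 1 after a failed recursion check. The fixed variants end every path with `refcount == 0`.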

Reproducers

Finding 14 segfault (from issue):

import simplejson._speedups as sp
import decimal

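markers = {}  # shared with the C encoder for circular-reference tracking
class Evil: pass  # a type the encoder has to hand to default()
call_count = [0]
def bad_default(obj):
    call_count[0] += 1
    if call_count[0] <= 1:
        # First call: wipe the encoder's markers so that its later
        # PyDict_DelItem(markers, ident) fails.
        markers.clear()
        return "safe"
    return str(obj)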

c_enc = sp.make_encoder(
    markers, bad_default, sp.encode_basestring_ascii, None, ", ", ": ",
    False, False, True, None, False, False, False, None,
    None, "utf-8", False, False, decimal.Decimal, False,
)
c_enc(Evil(), 0)  # Segfault before fix
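Why clearing `markers` inside the callback makes the later delete fail can be sketched in pure Python. This is a simplified model, not simplejson's code: `encode_default_step` and `clearing_default` are hypothetical names, and `id(obj)` stands in for the key the C encoder builds with `PyLong_FromVoidPtr`.

```python
def encode_default_step(obj, markers, default):
    """Toy version of the marker bookkeeping around a default() call."""
    ident = id(obj)  # stand-in for the C-level ident key
    if ident in markers:
        raise ValueError("Circular reference detected")
    markers[ident] = obj
    replacement = default(obj)  # user callback, free to mutate markers
    del markers[ident]          # KeyError if the callback emptied markers
    return replacement


markers = {}


def clearing_default(obj):
    # Mirrors bad_default from the reproducer: drop every marker entry.
    markers.clear()
    return "safe"


try:
    encode_default_step(object(), markers, clearing_default)
except KeyError:
    outcome = "delete failed"
else:
    outcome = "delete succeeded"
```

Here `outcome` ends up as `"delete failed"`. In the C encoder the same failed delete is the branch where `ident` was released twice before the fix.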

Found using cext-review-toolkit.

- Fix double Py_XDECREF on `ident` in encoder_listencode_obj when
  PyDict_DelItem(markers, ident) fails. The ident was XDECREF'd at
  line 2957 inside the if-block, then again at line 2960 unconditionally.
  This causes a use-after-free / refcount underflow segfault when the
  default() callback clears the markers dict.

- Fix ident leak on Py_EnterRecursiveCall failure. When the recursion
  limit is hit, ident (from PyLong_FromVoidPtr) was never released,
  leaking ~158 bytes per RecursionError.

Closes simplejson#349

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
devdanzin force-pushed the fix/ident-use-after-free branch from 3da557e to aa9182d on April 5, 2026 20:50
etrepum enabled auto-merge on April 6, 2026 16:20
etrepum merged commit b328c4d into simplejson:master on Apr 6, 2026
9 checks passed
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Apr 20, 2026
Version 4.0.1 released 2026-04-18

* Skip uploading Pyodide/wasm wheels to PyPI, which rejects them with
  "unsupported platform tag 'pyodide_2024_0_wasm32'". The wheels are
  still built in CI and preserved as workflow artifacts.
  simplejson/simplejson#375

Version 4.0.0 released 2026-04-18

* simplejson 4 requires Python 2.7 or Python 3.8+. Older Python
  versions (2.5, 2.6, 3.0-3.7) are no longer supported. pip will
  not install simplejson 4 on unsupported versions.

* The C extension now uses heap types and per-module state instead of
  static types and global state. This is required for free-threading
  support and sub-interpreter isolation. The Python-level API is
  unchanged.

* Full support for Python 3.13+ free-threading (PEP 703). The C
  extension is now safe to use with the GIL disabled (python3.14t):
  - Converted all static types to heap types with per-module state
  - Added per-object critical sections to scanner and encoder
  - Added free-threading-safe dict operations for Python 3.13+
  - Unified per-module state management and templated parser
  simplejson/simplejson#363
  simplejson/simplejson#364
  simplejson/simplejson#365
  simplejson/simplejson#367
  simplejson/simplejson#369

* Numerous C extension memory safety fixes:
  - Fix use-after-free and leak in encoder ident handling
  - Fix NULL dereferences on OOM in module init and static string init
  - Fix reference leaks in dict encoder (skipkeys item, variable shadowing)
  - Fix member table copy-paste, exception clobbering, missing Py_VISIT
  - Fix error-as-truthy bugs in maybe_quote_bigint and is_raw_json
  - Fix iterable_as_array swallowing MemoryError and KeyboardInterrupt
  - Fix for_json and _asdict swallowing MemoryError, KeyboardInterrupt,
    and other non-AttributeError exceptions raised by user __getattr__
  simplejson/simplejson#355
  simplejson/simplejson#356
  simplejson/simplejson#357
  simplejson/simplejson#358
  simplejson/simplejson#359
  simplejson/simplejson#360
  simplejson/simplejson#373

* C/Python parity fixes:
  - Fix C scanstring off-by-one bounds checks that caused truncated
    or boundary \uXXXX escapes to raise "Invalid \\uXXXX escape
    sequence" instead of "Unterminated string", and report error
    position at the 'u' instead of the leading backslash. The C and
    Python decoders now agree on exception class, message, and
    position across all tested edge cases.
  - Align the Python encoder's dispatch order with the C encoder for
    objects that define _asdict(). Previously a list/tuple/dict
    subclass with an _asdict() method encoded as its container type
    under the Python encoder and as the _asdict() return value under
    the C encoder; both now check _asdict() before list/tuple/dict.
    for_json() continues to outrank _asdict() in both.
  - Fix C scanstring raising a plain ValueError ("end is out of
    bounds") instead of JSONDecodeError for out-of-range end indices.
    User code with `except JSONDecodeError:` now catches both the
    C and pure-Python paths consistently.
  simplejson/simplejson#372

* C extension performance and correctness improvements:
  - Add PyDict_Next fast path for unsorted exact-dict encoding,
    avoiding intermediate items list and N tuple allocations
  - Add indexed fast path for exact list/tuple encoding, avoiding
    iterator allocation and per-item PyIter_Next overhead
  - Use PyUnicodeWriter as JSON_Accu backend on Python 3.14+,
    eliminating intermediate string objects and ''.join calls
  - Fix integer overflow in ascii_escape output_size calculation
    that could cause buffer overwrite on pathologically large strings
  - Fix list encoder separator counter overflow (int to Py_ssize_t)
  - Dead code cleanup (unreachable NULL checks, do-while wrappers)
  simplejson/simplejson#370

* Added Python 3.14 support and updated to cibuildwheel 3.2.1. CI now
  tests free-threaded (3.14t) and debug builds with -Werror, refcount
  leak detection, and GIL-disabled mode.
  simplejson/simplejson#343

* Added a ThreadSanitizer (TSan) stress test CI job. Builds a
  TSan-instrumented free-threaded CPython (cached between runs) and
  runs a concurrent stress test script against the C extension to
  catch data races under free-threading.
  simplejson/simplejson#373

* Replace deprecated license classifiers with SPDX license expression
  simplejson/simplejson#347

* Documented RawJSON usage with examples and caveats
  simplejson/simplejson#346

* Added pyproject.toml for PEP 517 build support. setup.py is retained
  for Python 2.7 wheel builds and backwards compatibility.

* Migrated build_ext import from distutils to setuptools in setup.py.
  The distutils.errors imports are kept since setuptools vendors
  distutils on Python 3.12+ where stdlib distutils was removed.

* CI now tests PEP 517 builds (pyproject.toml) alongside the existing
  setup.py-based builds.

* Added Pyodide (wasm32) wheel builds with C speedups via cibuildwheel.
  Previously Pyodide users fell back to the pure-Python wheel; now they
  get the compiled C extension cross-compiled to WebAssembly. Thread
  and subprocess tests are skipped on Emscripten where those APIs are
  unavailable.

* Test suite now fails (instead of skipping) when C speedups are missing
  during cibuildwheel runs, catching broken extension builds early.

* New ``array_hook`` parameter for ``loads()``, ``load()``, and
  ``JSONDecoder``. Called with each decoded JSON array (as a list),
  its return value replaces the list. Analogous to ``object_hook``
  for dicts. Works with both the Python decoder and C scanner.
  (Matches CPython 3.15 json module.)

* Trailing comma detection: the decoder now raises ``JSONDecodeError``
  with "Illegal trailing comma before end of object/array" for inputs
  like ``[1,]`` and ``{"a": 1,}`` instead of generic error messages.
  Both the Python decoder and C scanner are updated.
  (Matches CPython 3.13+ json module.)

* ``frozendict`` encoding support: when ``frozendict`` is available
  (CPython 3.15+ PEP 814), it is encoded as a JSON object just like
  ``dict``. No effect on older Python versions.

* Serialization errors now include ``add_note()`` context on Python
  3.11+ (PEP 678), annotating exceptions with the path to the error,
  e.g. "when serializing list item 1" / "when serializing dict item
  'key'". Only applies to the Python encoder.

* New C fast path for ``encode_basestring`` (``ensure_ascii=False``).
  Previously the non-ASCII string encoder fell back to pure Python;
  now it has a C implementation matching the existing
  ``encode_basestring_ascii`` fast path.
  simplejson/simplejson#207

* The Python decoder now rejects non-ASCII digits (e.g. fullwidth
  ``\uff10``) in JSON numbers, matching the C scanner behavior.
  The ``NUMBER_RE`` regex was changed from ``\d`` to ``[0-9]``.

* Removed dead single-phase init code for Python 3.3/3.4 from the
  C extension (these versions are no longer supported).
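The non-ASCII digit item in the changelog above rests on a Python 3 regex detail: in a `str` pattern, `\d` matches any Unicode decimal digit, whereas `[0-9]` matches ASCII digits only. A minimal sketch follows. The two patterns are simplified stand-ins chosen for illustration, not simplejson's actual `NUMBER_RE`.

```python
import re

# Simplified stand-ins for the old and new digit classes.
UNICODE_DIGITS = re.compile(r"\d+")
ASCII_DIGITS = re.compile(r"[0-9]+")

fullwidth = "\uff11\uff12"  # FULLWIDTH DIGIT ONE, FULLWIDTH DIGIT TWO

# \d accepts the fullwidth digits; [0-9] rejects them.
old_accepts = UNICODE_DIGITS.fullmatch(fullwidth) is not None
new_accepts = ASCII_DIGITS.fullmatch(fullwidth) is not None

# Plain ASCII digits are accepted by both patterns.
both_accept_ascii = (
    UNICODE_DIGITS.fullmatch("12") is not None
    and ASCII_DIGITS.fullmatch("12") is not None
)
```

Under Python 3, `old_accepts` is `True` and `new_accepts` is `False`, which is the behavioral difference the changelog entry describes.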
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request May 13, 2026
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request May 21, 2026
Linked issue: Crash: use-after-free and leak in encoder ident handling