From: Khashayar Fereidani <info () fereidani com>
Date: Fri, 19 Jun 2026 09:56:09 +0330
# PHP 8.5.7 `levenshtein()` signed-integer overflow
**Author:** Khashayar Fereidani
**Disclosure Date:** 2026-06-18
**Advisory:** https://fereidani.com/php-857-levenshtein-signed-integer-overflow
**Contact:** https://fereidani.com/contact
## Description
The `levenshtein()` function calculates the Levenshtein distance
between two strings, optionally accepting custom costs for insertion,
replacement, and deletion operations. In PHP 8.5.7, the implementation
lacks proper bounds checking for these cost parameters. When
exceptionally large values (such as `PHP_INT_MAX`) are provided, the
arithmetic operations within the `reference_levdist()` function in
`ext/standard/levenshtein.c` result in a signed-integer overflow. This
triggers undefined behavior in C and causes the function to return a
negative distance, which is mathematically invalid.
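
To make the arithmetic concrete, the following is a minimal sketch of a single table cell, not the PHP source. It assumes a 64-bit `zend_long` (modelled here as `int64_t`) and a two's-complement platform. `wrapping_add` and `levdist_cell` are hypothetical names, and the `c0` candidate and the minimum selection are an assumption about the standard algorithm rather than lines quoted in this advisory. The addition is done on unsigned values so that the demonstration itself stays free of undefined behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a + b reduced modulo 2^64. The final conversion back
 * to int64_t is implementation-defined in C; GCC and Clang wrap it. */
static int64_t wrapping_add(int64_t a, int64_t b)
{
    return (int64_t)((uint64_t)a + (uint64_t)b);
}

/* One cell of the distance table: the minimum of the replace (c0),
 * delete (c1) and insert (c2) candidates, with no overflow check. */
static int64_t levdist_cell(int64_t diag, int64_t up, int64_t left,
                            bool same_char, int64_t cost_ins,
                            int64_t cost_rep, int64_t cost_del)
{
    int64_t c0 = wrapping_add(diag, same_char ? 0 : cost_rep);
    int64_t c1 = wrapping_add(up, cost_del);   /* PHP_INT_MAX + PHP_INT_MAX */
    int64_t c2 = wrapping_add(left, cost_ins);

    if (c1 < c0) {
        c0 = c1;
    }
    if (c2 < c0) {
        c0 = c2;
    }
    return c0;
}
```

For `levenshtein('a', 'b', ...)` with all three costs set to the maximum, the only cell has `diag = 0` and `up = left = INT64_MAX`, so `c1` and `c2` wrap to `-2` and win the minimum, which matches the first proof-of-concept result below.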
## Proof of concept
```php
<?php
/*
* levenshtein() signed-integer overflow
* File: ext/standard/levenshtein.c reference_levdist() lines 47, 50, 53-58
*
* The user-supplied costs (cost_ins / cost_rep / cost_del, all zend_long) are
* added with NO overflow check, e.g.:
* p1[i2] = i2 * cost_ins; // line 47
* p2[0] = p1[0] + cost_del; // line 50
* c1 = p1[i2 + 1] + cost_del;// line 54 <-- PHP_INT_MAX + PHP_INT_MAX
* c2 = p2[i2] + cost_ins; // line 58
*
* Result: signed overflow (undefined behaviour in C) producing a
* NEGATIVE edit distance, a value that is mathematically impossible.
*/
var_dump(levenshtein('a', 'b', PHP_INT_MAX, PHP_INT_MAX, PHP_INT_MAX)); // int(-2) (should be PHP_INT_MAX)
var_dump(levenshtein('a', 'abc', PHP_INT_MAX, PHP_INT_MAX, PHP_INT_MAX)); // int(-4)
var_dump(levenshtein('a', 'b', PHP_INT_MAX, 0, PHP_INT_MAX)); // int(-2)
echo "All three distances are negative => signed overflow (expected >= 0).\n";
```
## Impact
The primary risk associated with this vulnerability is an application
logic flaw. Applications that rely on the `levenshtein()` function to
determine string similarity or calculate distance metrics might fail
to handle negative returns properly (for instance, treating a negative
number as `< threshold`). This can result in unexpected behavior,
incorrect data processing, or bypasses in business logic. Since it
involves integer overflow producing a negative result rather than a
memory corruption issue, the scope is generally limited to logic
disruption rather than arbitrary code execution.
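
The threshold problem described above can be illustrated with a small sketch. Both functions are hypothetical application code, written in C only to keep the examples in one language; nothing here is taken from PHP or from a real application. The naive check accepts any value below the threshold, so a wrapped negative distance counts as a match, while the defensive variant rejects negative distances first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical naive check: a negative distance is always "close enough". */
static bool is_close_naive(int64_t distance, int64_t threshold)
{
    return distance < threshold;
}

/* Hypothetical defensive check: a valid edit distance is never negative,
 * so anything below zero is treated as "not a match". */
static bool is_close_checked(int64_t distance, int64_t threshold)
{
    return distance >= 0 && distance < threshold;
}
```

With a wrapped distance of `-2` and a threshold of `3`, the naive check reports a match and the defensive check does not.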
## Solution
To address this issue, bounds checking should be implemented either on
the cost parameters at the start of the function or on each
intermediate calculation. The overflow-checking arithmetic macros
provided by the Zend Engine can detect an overflow before the addition
is performed:
```c
// Example: Adding overflow safeguards in ext/standard/levenshtein.c
if (UNEXPECTED(ZEND_SIGNED_ADD_OVERFLOWS(p1[i2 + 1], cost_del))) {
    php_error_docref(NULL, E_WARNING, "Levenshtein distance calculation caused an integer overflow");
    // Handle error, e.g., return -1 or cap
}
```
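
The "cap" option mentioned in the comment above could look roughly like the following saturating addition. This is a sketch in portable C that deliberately avoids Zend macros; `saturating_add` is a hypothetical name and `int64_t` stands in for a 64-bit `zend_long`. It checks the operands before adding, so the signed addition that finally runs can never overflow.

```c
#include <stdint.h>

/* Hypothetical helper: returns a + b, clamped to the int64_t range
 * instead of overflowing. */
static int64_t saturating_add(int64_t a, int64_t b)
{
    if (b > 0 && a > INT64_MAX - b) {
        return INT64_MAX;   /* would exceed the maximum: cap high */
    }
    if (b < 0 && a < INT64_MIN - b) {
        return INT64_MIN;   /* would fall below the minimum: cap low */
    }
    return a + b;           /* safe: the result is in range */
}
```

Capping would make the first proof-of-concept call yield the maximum value rather than `-2`. A complete fix would presumably need the same treatment for the `i2 * cost_ins` multiplication, which this sketch does not cover.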
An alternative, proactive measure is to restrict `cost_ins`,
`cost_rep`, and `cost_del` before computing the distance, so that they
cannot exceed `ZEND_LONG_MAX` when scaled by the lengths of the two
strings.
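
One possible shape for such an up-front check is sketched below. The function name and the bound are assumptions, not part of PHP or of this advisory: it requires non-negative costs and verifies that `len1 * cost_del + len2 * cost_ins + cost_rep` fits in a 64-bit signed integer. That bound is deliberately conservative and may reject some inputs that would not actually overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if a * b fits in int64_t, for a >= 0 and b >= 0. */
static bool mul_fits(int64_t a, int64_t b)
{
    return a == 0 || b <= INT64_MAX / a;
}

/* True if a + b fits in int64_t, for a >= 0 and b >= 0. */
static bool add_fits(int64_t a, int64_t b)
{
    return b <= INT64_MAX - a;
}

/* Hypothetical pre-check: returns true only when no intermediate sum in the
 * distance table can exceed INT64_MAX under the conservative bound above. */
static bool levenshtein_costs_are_safe(size_t len1, size_t len2,
                                       int64_t cost_ins, int64_t cost_rep,
                                       int64_t cost_del)
{
    if (cost_ins < 0 || cost_rep < 0 || cost_del < 0) {
        return false;  /* assumption: negative costs are rejected outright */
    }
    if ((uint64_t)len1 > (uint64_t)INT64_MAX ||
        (uint64_t)len2 > (uint64_t)INT64_MAX) {
        return false;  /* length does not fit in the signed type */
    }

    int64_t n1 = (int64_t)len1;
    int64_t n2 = (int64_t)len2;
    if (!mul_fits(n1, cost_del) || !mul_fits(n2, cost_ins)) {
        return false;
    }

    int64_t del_total = n1 * cost_del;
    int64_t ins_total = n2 * cost_ins;
    if (!add_fits(del_total, ins_total)) {
        return false;
    }
    return add_fits(del_total + ins_total, cost_rep);
}
```

A caller could run this check before allocating the table and raise an error when it returns false. Whether rejecting the call or capping the result is the better behaviour is a design decision this sketch leaves open.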
_______________________________________________
Sent through the Full Disclosure mailing list
https://nmap.org/mailman/listinfo/fulldisclosure
Web Archives & RSS: https://seclists.org/fulldisclosure/