CVE-2026-11460 - security flaw in serialization
I haven't seen mention of this on this list yet, so just passing it along. Vulnerability page: https://vuldb.com/cve/CVE-2026-11460 More details: https://gist.github.com/TrebledJ/b7c872f869b5ed7cbd936f71f16c7d75 - Chris
On Mon, Jun 22, 2026, at 11:12 PM, Chris Frey via Boost wrote:
I haven't seen mention of this on this list yet, so just passing it along.
Vulnerability page: https://vuldb.com/cve/CVE-2026-11460
More details: https://gist.github.com/TrebledJ/b7c872f869b5ed7cbd936f71f16c7d75
Isn't this by design and as documented? Boost Serialization does not have checksums/tampering protection. Basically, reading untrusted archives is a no-no because malformed archives lead to undefined behavior. I believe this is documented under some version compatibility paragraphs, and likely under the `archive_flags`? That's a limitation of the scope of the library, but not necessarily in application, because the protection/authentication can be built into a higher layer of the serialization that is based on Boost Serialization archives. Just thinking out loud here, Seth
- Chris
_______________________________________________ Boost mailing list -- boost@lists.boost.org To unsubscribe send an email to boost-leave@lists.boost.org https://lists.boost.org/mailman3/lists/boost.lists.boost.org/ Archived at: https://lists.boost.org/archives/list/boost@lists.boost.org/message/UV6VAGLK...
On 6/22/26 2:59 PM, Seth via Boost wrote:
On Mon, Jun 22, 2026, at 11:12 PM, Chris Frey via Boost wrote:
I haven't seen mention of this on this list yet, so just passing it along.
Vulnerability page: https://vuldb.com/cve/CVE-2026-11460
More details: https://gist.github.com/TrebledJ/b7c872f869b5ed7cbd936f71f16c7d75
Isn't this by design and as documented?
Boost Serialization does not have checksums/tampering protection. Basically, reading untrusted archives is a no-no because malformed archives lead to undefined behavior. I believe this is documented under some version compatibility paragraphs, and likely under the `archive_flags`?
That's a limitation of the scope of the library, but not necessarily in application, because the protection/authentication can be built into a higher layer of the serialization that is based on Boost Serialization archives.
Just thinking out loud here, Seth
- Chris
This has been brought up before. The serialization library does not provide any mechanism to verify that an archive has not been tampered with. I have considered this from time to time and I've concluded that it's not a trivial thing to address.
The first idea which occurs is to append some sort of checksum to an archive when it is closed. This would be unsatisfactory for a number of reasons:
a) It would be pretty easy to tamper with the archive and just replace the checksum with an updated one. This would defeat the system.
b) The API permits the usage of an archive in "streaming" mode, where the consumer reads the archive in parallel with the supplier. One would not be able to know that the archive has been tampered with "on the fly" until the archive is closed. Besides, if the archive stream is tampered with on the fly, the checksum could be updated when the stream is closed.
c) The only way I see to reliably detect that an archive has been tampered with would be to generate the checksum "out of line" and transmit it over a separate channel/message. This would need to be stored in a separate place from the archive itself to avoid it (the checksum) being tampered with.
This would be possible of course. But it doesn't require any changes to the serialization library itself. The scheme would look something like:
i) create the archive as one does now
ii) create the checksum
iii) send the archive
iv) send the checksum on a separate channel
So no library changes needed.
If one had nothing else to do, one could create an archive class which generates the checksum. Then one creates the two archives - the data and the checksum - and sends them separately. Again no library changes needed to do this.
Finally, if you've still got time on your hands, you could make a parallel archive adaptor. This would compose two archives with a "composition archive" which would call the save/load functions of the constituent archives. This would likely be more generally useful, for example if one wanted to stream and store a local copy of an archive simultaneously. Again no changes to the Serialization library itself.
These components - checksum archive, composition archive - would be described/documented as examples in updated/extended documentation of the serialization library built with Boost Book.
Soooooo ... That's my answer
Robert Ramey
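A minimal sketch of the out-of-line checksum and the "composition" idea described above, using only the standard library. Nothing here is Boost.Serialization API: `Fnv1a64`, `BufferSink`, `TeeSink` and `matches_checksum` are hypothetical names, and FNV-1a is a non-cryptographic hash chosen only to keep the example short. It can catch accidental corruption but would not stop a deliberate attacker.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FNV-1a, a simple non-cryptographic 64-bit hash (assumption: good enough
// for a demo; a real scheme would want a keyed MAC or a signature).
class Fnv1a64 {
public:
    void write(const std::string& bytes) {
        for (unsigned char c : bytes) {
            state_ ^= c;
            state_ *= 0x100000001b3ULL;  // FNV prime
        }
    }
    std::uint64_t digest() const { return state_; }
private:
    std::uint64_t state_ = 0xcbf29ce484222325ULL;  // FNV offset basis
};

// Hypothetical sink that stores the bytes; stands in for the data archive.
class BufferSink {
public:
    void write(const std::string& bytes) { data_ += bytes; }
    const std::string& data() const { return data_; }
private:
    std::string data_;
};

// Hypothetical "composition" sink: forwards every write to both constituents.
template <class First, class Second>
class TeeSink {
public:
    TeeSink(First& first, Second& second) : first_(first), second_(second) {}
    void write(const std::string& bytes) {
        first_.write(bytes);
        second_.write(bytes);
    }
private:
    First& first_;
    Second& second_;
};

// Receiver side: recompute the checksum and compare it with the value that
// arrived over the separate channel.
inline bool matches_checksum(const std::string& archive_bytes, std::uint64_t expected) {
    Fnv1a64 h;
    h.write(archive_bytes);
    return h.digest() == expected;
}

inline void example() {
    BufferSink data;
    Fnv1a64 checksum;
    TeeSink<BufferSink, Fnv1a64> tee(data, checksum);
    tee.write("archive bytes");
    // data.data() travels on the main channel, checksum.digest() on another.
    assert(matches_checksum(data.data(), checksum.digest()));
}
```

The same `TeeSink` shape would also cover the "stream and store a local copy" case by pairing two storing sinks.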
On 6/23/26 07:44, Robert Ramey via Boost wrote:
These components - checksum archive, composition archive - would be described/documented as examples in updated/extended documentation of the serialization library built with Boost Book.
Soooooo ... That's my answer
The problem isn't tampering. Tampering implies a hostile man-in-the-middle between trusted endpoints. The problem is that the endpoints cannot, and shouldn't have to, trust each other in the first place. -- Rainer Deyke - rainerd@eldwood.com
On 6/22/26 23:59, Seth via Boost wrote:
On Mon, Jun 22, 2026, at 11:12 PM, Chris Frey via Boost wrote:
I haven't seen mention of this on this list yet, so just passing it along.
Vulnerability page: https://vuldb.com/cve/CVE-2026-11460
More details: https://gist.github.com/TrebledJ/b7c872f869b5ed7cbd936f71f16c7d75
Isn't this by design and as documented?
Boost Serialization does not have checksums/tampering protection. Basically, reading untrusted archives is a no-no because malformed archives lead to undefined behavior. I believe this is documented under some version compatibility paragraphs, and likely under the `archive_flags`?
That's a limitation of the scope of the library, but not necessarily in application, because the protection/authentication can be built into a higher layer of the serialization that is based on Boost Serialization archives.
Using checksums for authentication, or indeed any authentication system whatsoever, is quite problematic for serialization. Data sanitation is not an authentication problem, and no authentication may be possible.
Imagine this scenario:
- User A creates a file and uploads it onto the internet, without any knowledge of who is going to consume the file.
- User B downloads the file and loads it in program C, trusting the security of program C to remain secure in the presence of untrusted data.
- There are no shared secrets between A and B. There are no public keys. These users do not know or trust each other at all.
Somehow this scenario works for image files. It works for HTML files. It even more or less works for Javascript files, and Javascript is a Turing-complete programming language. But it does not work at all for Boost.Serialization archives, and it cannot be made to work for Boost.Serialization archives.
That's a pretty serious restriction for Boost.Serialization. The kind that should go in big bold red text on the main page of the Boost.Serialization documentation. It's not a bug per se, but it means that Boost.Serialization can only be used for a small subset of serialization scenarios, and the user could easily miss this restriction if they don't read the entire documentation of the library.
-- Rainer Deyke - rainerd@eldwood.com
On 23 Jun 2026 09:03, Rainer Deyke via Boost wrote:
On 6/22/26 23:59, Seth via Boost wrote:
On Mon, Jun 22, 2026, at 11:12 PM, Chris Frey via Boost wrote:
I haven't seen mention of this on this list yet, so just passing it along.
Vulnerability page: https://vuldb.com/cve/CVE-2026-11460
More details: https://gist.github.com/TrebledJ/b7c872f869b5ed7cbd936f71f16c7d75
Isn't this by design and as documented?
Boost Serialization does not have checksums/tampering protection. Basically, reading untrusted archives is a no-no because malformed archives lead to undefined behavior. I believe this is documented under some version compatibility paragraphs, and likely under the `archive_flags`?
That's a limitation of the scope of the library, but not necessarily in application, because the protection/authentication can be built into a higher layer of the serialization that is based on Boost Serialization archives.
Using checksums for authentication, or indeed any authentication system whatsoever, is quite problematic for serialization. Data sanitation is not an authentication problem, and no authentication may be possible.
Imagine this scenario:
- User A creates a file and uploads it onto the internet, without any knowledge of who is going to consume the file.
- User B downloads the file and loads it in program C, trusting the security of program C to remain secure in the presence of untrusted data.
- There are no shared secrets between A and B. There are no public keys. These users do not know or trust each other at all.
Somehow this scenario works for image files. It works for HTML files. It even more or less works for Javascript files, and Javascript is a Turing-complete programming language. But it does not work at all for Boost.Serialization archives, and it cannot be made to work for Boost.Serialization archives.
I believe, it works in the same definition of "works" as for the other file types you mention. That is, the user B can download and use the archive, but he has no way of knowing whether the file he downloaded is the same as the one uploaded by user A. This includes possible tampering or unintentional corruption in transit. The user B also doesn't know whether the file he downloaded is safe to use. E.g. is the JavaScript safe to run? Does the image file have some embedded data that exploits a vulnerability in the image parsing library? etc. A parsing library, including Boost.Serialization, must be prepared for malformed content, though. If the downloaded archive is broken (i.e. invalid in terms of the archive format), Boost.Serialization must give the user an error when attempting to read it instead of crashing or corrupting memory. Requirements beyond that, I think, are out of scope for a deserialization library as they typically can't be fulfilled at the deserialization level.
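To illustrate what "give the user an error instead of crashing or corrupting memory" can mean in practice, here is a small standard-library sketch. It is not how Boost.Serialization is implemented; `CheckedReader` and `malformed_input` are hypothetical names, and the length-prefixed little-endian layout is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical error type for malformed input (not a Boost type).
struct malformed_input : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Cursor over an untrusted byte buffer. Every read checks the remaining size
// first, so truncated or corrupted input raises an exception and never
// touches memory outside the buffer. The buffer must outlive the reader.
class CheckedReader {
public:
    explicit CheckedReader(const std::vector<std::uint8_t>& bytes) : bytes_(bytes) {}

    std::uint32_t read_u32() {
        require(4);
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i) {
            // Assemble the value byte by byte, least significant byte first.
            v |= static_cast<std::uint32_t>(bytes_[pos_ + i]) << (8 * i);
        }
        pos_ += 4;
        return v;
    }

    std::string read_string() {
        const std::uint32_t len = read_u32();
        require(len);  // reject a length prefix larger than what is left
        std::string s(bytes_.begin() + pos_, bytes_.begin() + pos_ + len);
        pos_ += len;
        return s;
    }

private:
    void require(std::size_t n) const {
        // pos_ never exceeds bytes_.size(), so the subtraction cannot wrap.
        if (n > bytes_.size() - pos_) throw malformed_input("unexpected end of input");
    }

    const std::vector<std::uint8_t>& bytes_;
    std::size_t pos_ = 0;
};

inline void example() {
    const std::vector<std::uint8_t> good = {2, 0, 0, 0, 0x68, 0x69};
    CheckedReader reader(good);
    assert(reader.read_string() == "hi");
}
```

A length prefix that claims more bytes than the buffer holds ends in `malformed_input`, which the caller can catch and report.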
On 23/06/2026 at 10:07, Andrey Semashev via Boost wrote:
On 23 Jun 2026 09:03, Rainer Deyke via Boost wrote:
On 6/22/26 23:59, Seth via Boost wrote:
On Mon, Jun 22, 2026, at 11:12 PM, Chris Frey via Boost wrote:
I haven't seen mention of this on this list yet, so just passing it along.
Vulnerability page: https://vuldb.com/cve/CVE-2026-11460
More details: https://gist.github.com/TrebledJ/b7c872f869b5ed7cbd936f71f16c7d75
Isn't this by design and as documented?
Boost Serialization does not have checksums/tampering protection. Basically, reading untrusted archives is a no-no because malformed archives lead to undefined behavior. I believe this is documented under some version compatibility paragraphs, and likely under the `archive_flags`?
That's a limitation of the scope of the library, but not necessarily in application, because the protection/authentication can be built into a higher layer of the serialization that is based on Boost Serialization archives.
Using checksums for authentication, or indeed any authentication system whatsoever, is quite problematic for serialization. Data sanitation is not an authentication problem, and no authentication may be possible.
Imagine this scenario:
- User A creates a file and uploads it onto the internet, without any knowledge of who is going to consume the file.
- User B downloads the file and loads it in program C, trusting the security of program C to remain secure in the presence of untrusted data.
- There are no shared secrets between A and B. There are no public keys. These users do not know or trust each other at all.
Somehow this scenario works for image files. It works for HTML files. It even more or less works for Javascript files, and Javascript is a Turing-complete programming language. But it does not work at all for Boost.Serialization archives, and it cannot be made to work for Boost.Serialization archives.
I believe, it works in the same definition of "works" as for the other file types you mention. That is, the user B can download and use the archive, but he has no way of knowing whether the file he downloaded is the same as the one uploaded by user A. This includes possible tampering or unintentional corruption in transit. The user B also doesn't know whether the file he downloaded is safe to use. E.g. is the JavaScript safe to run? Does the image file have some embedded data that exploits a vulnerability in the image parsing library? etc.
A parsing library, including Boost.Serialization, must be prepared for malformed content, though. If the downloaded archive is broken (i.e. invalid in terms of the archive format), Boost.Serialization must give the user an error when attempting to read it instead of crashing or corrupting memory. Requirements beyond that, I think, are out of scope for a deserialization library as they typically can't be fulfilled at the deserialization level.
100% agree. The only security-related requirement we should put on Boost.Serialization, and we should put it, is that no UB be generated on archive loading time. Joaquín M López Muñoz
On 6/23/26 2:23 AM, Joaquin M López Muñoz via Boost wrote:
100% agree. The only security-related requirement we should put on Boost.Serialization, and we should put it, is that no UB be generated on archive loading time.
Joaquín M López Muñoz
I believe that there is no possible undefined behavior when loading archives which have been saved in the same format as that loaded. The only scenarios I could think of where this could occur would be:
a) There is an error in the usage of the library, in that the user code implementing the "saving" of an archive is not consistent with the code implementing the "loading" of that archive.
b) The archive being loaded has been altered from the originally saved one. (I called this tampering.)
This whole concern arose when a user ran some lint-type program which detected a line which used a piece of data from the archive which could be null or something like that. But if such an archive has been created by the library, it could never contain such data. So the only concern that I could think of would be tampering.
Robert Ramey
On 23 Jun 2026 18:55, Robert Ramey via Boost wrote:
On 6/23/26 2:23 AM, Joaquin M López Muñoz via Boost wrote:
I believe that there is no possible undefined behavior when loading archives which have been saved in the same format as that loaded. The only scenario I could think of where this could occur would be:
a) There is an error in the usage of the library in that the user code implementing the "saving" of an archive is not consistent with the code implementing the "loading" of that archive.
Could a) be the case, for example, if the archive being loaded was saved by an older or different version of the program? In particular, if the set of data or types of data that was saved into the archive is different from what is expected by the loader? Does Boost.Serialization report an error in this case? Is the loader application able to tell that this mismatch has happened?
On 6/23/26 9:38 AM, Andrey Semashev via Boost wrote:
On 23 Jun 2026 18:55, Robert Ramey via Boost wrote:
On 6/23/26 2:23 AM, Joaquin M López Muñoz via Boost wrote:
I believe that there is no possible undefined behavior when loading archives which have been saved in the same format as that loaded. The only scenario I could think of where this could occur would be:
a) There is an error in the usage of the library in that the user code implementing the "saving" of an archive is not consistent with the code implementing the "loading" of that archive.
In general, the library cannot always detect when a user uses it in an incorrect manner. Perhaps it does sometimes - I don't know.
Could a) be the case, for example, if the archive being loaded was saved by an older or different version of the program? In particular, if the set of data or types of data that was saved into the archive is different from what is expected by the loader? Does Boost.Serialization report an error in this case? Is the loader application able to tell that this mismatch has happened?
The Boost serialization library includes a provision for versioning archives. Each time the serialization format is changed, the version number should be incremented. Each time an archive is loaded, the original version is available and can be compared against the current version. So provision can/should be made to include conditional code to read previous versions. This is well known and commonly done and, I believe, documented in the reference documentation. No code which follows this procedure should ever suffer undefined behavior.
I can imagine some rare cases where there might still be a problem. One that comes to mind is where one creates an archive on a 32-bit machine that stores an integer greater than 2^16 and then tries to load it on a 16-bit machine. If one had nothing else to do, one could comb through and detect such cases. It's not that easy, as they are really unusual and, as far as I know, there have been no reported problems of this nature. So it would be hard to know if one found all the pathological cases.
Robert Ramey
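The version-conditional loading described above can be sketched without the library. Boost.Serialization hands the stored version number to the user's `serialize` function; the example below only imitates that pattern with a hand-rolled, whitespace-separated text format. `Settings`, `kCurrentVersion`, `save` and `load` are hypothetical names, and the format is an assumption for illustration (titles must be a single word).

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical: bumped every time the stored layout changes.
constexpr unsigned kCurrentVersion = 2;

struct Settings {
    int width = 0;
    int height = 0;
    std::string title;  // field added in version 2
};

// Always writes the current version first, then the fields.
inline std::string save(const Settings& s) {
    std::ostringstream out;
    out << kCurrentVersion << ' ' << s.width << ' ' << s.height << ' ' << s.title;
    return out.str();
}

// Reads the stored version and branches on it, so older archives still load
// and unknown or malformed ones are rejected with an exception.
inline Settings load(const std::string& text) {
    std::istringstream in(text);
    unsigned version = 0;
    Settings s;
    if (!(in >> version >> s.width >> s.height)) {
        throw std::runtime_error("malformed header");
    }
    if (version == 0 || version > kCurrentVersion) {
        throw std::runtime_error("unsupported version");
    }
    if (version >= 2) {
        if (!(in >> s.title)) throw std::runtime_error("missing title");
    } else {
        s.title = "untitled";  // default for archives written before the field existed
    }
    return s;
}

inline void example() {
    Settings s;
    s.width = 3;
    s.height = 4;
    s.title = "demo";
    const Settings back = load(save(s));
    assert(back.title == "demo");
}
```

Loading the version 1 text `"1 640 480"` yields the default title, while a version newer than the loader knows about is refused instead of being read with the wrong layout.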
On Tue, Jun 23, 2026 at 9:14 AM Robert Ramey via Boost <boost@lists.boost.org> wrote:
On 6/23/26 2:23 AM, Joaquin M López Muñoz via Boost wrote:
100% agree. The only security-related requirement we should put on Boost.Serialization, and we should put it, is that no UB be generated on archive loading time.
Joaquín M López Muñoz
I believe that there is no possible undefined behavior when loading archives which have been saved in the same format as that loaded. The only scenario I could think of where this could occur would be:
a) There is an error in the usage of the library in that the user code implementing the "saving" of an archive is not consistent with the code implementing the "loading" of that archive.
b) The archive being loaded has been altered from the originally saved one. (I called this tampering).
Yes, this is exactly the aforementioned precondition that shouldn't exist in a library such as Serialization. We should have no preconditions. The library should exhibit no UB for all possible inputs. We should petition the Alliance to set up fuzzing for security-critical libraries like Serialization. - Christian
Christian Mazakas wrote:
On Tue, Jun 23, 2026 at 9:14 AM Robert Ramey via Boost <boost@lists.boost.org> wrote:
On 6/23/26 2:23 AM, Joaquin M López Muñoz via Boost wrote:
100% agree. The only security-related requirement we should put on Boost.Serialization, and we should put it, is that no UB be generated on archive loading time.
Joaquín M López Muñoz
I believe that there is no possible undefined behavior when loading archives which have been saved in the same format as that loaded. The only scenario I could think of where this could occur would be:
a) There is an error in the usage of the library in that the user code implementing the "saving" of an archive is not consistent with the code implementing the "loading" of that archive.
b) The archive being loaded has been altered from the originally saved one. (I called this tampering).
Yes, this is exactly the aforementioned precondition that shouldn't exist in a library such as Serialization.
We should have no preconditions. The library should exhibit no UB for all possible inputs.
I agree with Christian; deserialization should never be undefined behavior, regardless of input. How the "wrong" input has been produced, or has been brought into being, is immaterial. Bit flips happen. Transmission errors happen. It's not always malicious tampering, but it wouldn't matter even if it were.
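For readers unfamiliar with what a fuzzing setup involves, here is a minimal sketch of a libFuzzer-style harness. `LLVMFuzzerTestOneInput` is the entry point libFuzzer looks for; `parse_record` is a hypothetical stand-in for whatever loading routine is under test, and its one-byte count format is an assumption for the example. The harness treats a thrown exception as a well-defined outcome and leaves everything else (crashes, sanitizer reports) to the fuzzer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical parser under test: one count byte followed by that many
// payload bytes. Anything inconsistent is rejected with an exception.
inline std::vector<std::uint8_t> parse_record(const std::uint8_t* data, std::size_t size) {
    if (size == 0) throw std::invalid_argument("empty input");
    const std::size_t count = data[0];
    if (count > size - 1) throw std::invalid_argument("count exceeds payload");
    return std::vector<std::uint8_t>(data + 1, data + 1 + count);
}

// libFuzzer entry point: called repeatedly with arbitrary byte strings.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    try {
        const std::vector<std::uint8_t> payload = parse_record(data, size);
        // Sanity check on accepted input; a failure here aborts, which the
        // fuzzer records as a finding.
        assert(payload.size() <= size - 1);
    } catch (const std::invalid_argument&) {
        // Rejecting malformed input is the expected, well-defined outcome.
    }
    return 0;
}
```

With Clang, a harness like this is typically built with `-fsanitize=fuzzer,address`, so that memory errors which would otherwise pass silently are reported.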
On 6/23/26 10:07, Andrey Semashev via Boost wrote:
I believe, it works in the same definition of "works" as for the other file types you mention. That is, the user B can download and use the archive, but he has no way of knowing whether the file he downloaded is the same as the one uploaded by user A. This includes possible tampering or unintentional corruption in transit. The user B also doesn't know whether the file he downloaded is safe to use. E.g. is the JavaScript safe to run? Does the image file have some embedded data that exploits a vulnerability in the image parsing library? etc.
Tampering is not an interesting threat. If Alice wants to download a file from Bob and instead receives a tampered file from Charlie, then that is only a problem if Alice trusts Bob more than Charlie. On the open internet, Alice shouldn't trust Bob at all, so a tampered file from Charlie is not more dangerous than the original file from Bob.
If the browser allows hostile Javascript to be dangerous to the user, then that is a serious security issue with the browser. If an image parsing library has an exploitable vulnerability, then that is a serious security issue with both the library and the browser that uses it. These are bugs, critical bugs, that must be fixed immediately.
There is no guarantee that an image downloaded from the internet is the particular image you wanted. However, there are only two possible valid responses a browser can have to a hostile image file. If it's a valid image file that can be displayed, then the browser must display it (even though it may not be the image the user wanted). If it's invalid or otherwise can't be displayed, the browser must cleanly handle the error without corrupting the process. The situation is analogous to the strong exception guarantee: the operation can succeed or fail, but either way it can't leave the system in an unspecified, much less a compromised, state.
Meanwhile, the attitude of Boost.Serialization is that passing invalid data in results in undefined behavior, which means that it leaves the system in a compromised state by default, and there's not even a way to validate the data beforehand. The only choice is to trust the data, which means trusting the data source.
Do you see the difference?
-- Rainer Deyke - rainerd@eldwood.com
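The strong-exception-guarantee analogy above can be made concrete with a parse-then-commit pattern. This is a standard-library sketch only; `Document`, `parse_document` and `try_load` are hypothetical names, and the "count followed by values" text format is an assumption for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct Document {
    std::vector<int> values;
};

// Parses "N v1 v2 ... vN" and throws on anything else. Elements are appended
// one at a time, so a huge bogus count fails at the first missing value
// instead of triggering a huge allocation.
inline Document parse_document(const std::string& text) {
    std::istringstream in(text);
    std::size_t count = 0;
    if (!(in >> count)) throw std::runtime_error("missing count");
    Document doc;
    for (std::size_t i = 0; i < count; ++i) {
        int v = 0;
        if (!(in >> v)) throw std::runtime_error("truncated values");
        doc.values.push_back(v);
    }
    return doc;
}

// Loads into a temporary and commits only on success, so the target is
// either fully replaced or left exactly as it was.
inline bool try_load(Document& target, const std::string& text) {
    try {
        Document fresh = parse_document(text);
        target = std::move(fresh);
        return true;
    } catch (const std::runtime_error&) {
        return false;  // target untouched
    }
}

inline void example() {
    Document doc;
    doc.values.push_back(42);
    const bool ok = try_load(doc, "3 1 2");  // claims 3 values, supplies 2
    assert(!ok);
    assert(doc.values.size() == 1);  // previous state preserved
}
```

The caller sees only two outcomes, a fully loaded document or an unchanged one, which matches the succeed-or-fail behavior the message asks for.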
On 23 Jun 2026 14:12, Rainer Deyke via Boost wrote:
On 6/23/26 10:07, Andrey Semashev via Boost wrote:
I believe, it works in the same definition of "works" as for the other file types you mention. That is, the user B can download and use the archive, but he has no way of knowing whether the file he downloaded is the same as the one uploaded by user A. This includes possible tampering or unintentional corruption in transit. The user B also doesn't know whether the file he downloaded is safe to use. E.g. is the JavaScript safe to run? Does the image file have some embedded data that exploits a vulnerability in the image parsing library? etc.
Tampering is not an interesting threat. If Alice wants to download a file from Bob and instead receives a tampered file from Charlie, then that is only a problem if Alice trusts Bob more than Charlie. On the open internet, Alice shouldn't trust Bob at all, so a tampered file from Charlie is not more dangerous than the original file from Bob.
If the browser allows hostile Javascript to be dangerous to the user, then that is a serious security issue with the browser. If an image parsing library has an exploitable vulnerability, then that is a serious security issue with both the library and the browser that uses it. These are bugs, critical bugs, that must be fixed immediately.
There is no guarantee that an image downloaded from the internet is the particular image you wanted. However, there are only two possible valid responses a browser can have to a hostile image file. If it's a valid image file that can be displayed, then the browser must display it (even though it may not be the image the user wanted). If it's invalid or otherwise can't be displayed, the browser must cleanly handle the error without corrupting the process. The situation is analogous to the strong exception guarantee: the operation can succeed or fail, but either way it can't leave the system in an unspecified, much less a compromised, state.
Meanwhile, the attitude of Boost.Serialization is that passing invalid data in results in undefined behavior, which means that it leaves the system in a compromised state by default, and there's not even a way to validate the data beforehand. The only choice is to trust the data, which means trusting the data source.
Do you see the difference?
By the same logic as the browser, if the archive is valid, according to the archive format specification, Boost.Serialization has to parse it and present the result to the user. Otherwise, it must fail with an error. I'm not a Boost.Serialization user, but I'm assuming that this is currently the case (otherwise, this would be a bug in the library). In this sense, the library already does validate the data. Now, the parsed archive may not be valid from the user's application standpoint (i.e. if the parsed data does not describe a valid state of the application), but I don't think Boost.Serialization is in a position to validate it at that level. Or maybe I'm missing something about Boost.Serialization.
On 6/23/26 14:18, Andrey Semashev via Boost wrote:
By the same logic as the browser, if the archive is valid, according to the archive format specification, Boost.Serialization has to parse it and present the result to the user. Otherwise, it must fail with an error. I'm not a Boost.Serialization user, but I'm assuming that this is currently the case (otherwise, this would be a bug in the library). In this sense, the library already does validate the data.
Now, the parsed archive may not be valid from the user's application standpoint (i.e. if the parsed data does not describe a valid state of the application), but I don't think Boost.Serialization is in a position to validate it at that level. Or maybe I'm missing something about Boost.Serialization.
Obviously Boost.Serialization cannot do anything about application-level invariants it doesn't know about. It is the job of the application to validate data it receives from Boost.Serialization. But there are cases where Boost.Serialization itself invokes undefined behavior before the data ever gets to the user code, like the problems mentioned by the OP of the thread. And again, it's not an error for a function to invoke undefined behavior if its preconditions are not met. It's just a really problematic design decision if the function in question is a deserialization function. -- Rainer Deyke - rainerd@eldwood.com
On Tue, Jun 23, 2026, at 8:03 AM, Rainer Deyke via Boost wrote:
Using checksums for authentication, or indeed any authentication system whatsoever, is quite problematic for serialization. Data sanitation is
Do you mean input sanitation? Input sanitation really isn't applicable here. There are no dangerous things to avoid in a generic way. You **could** have safety limits (like maximum number of container elements, max registered types, max memory allocated, stuff like that). Bounds checking is already in place anyway, correct me if I'm wrong.
not an authentication problem, and no authentication may be possible.
Imagine this scenario:
- User A creates a file and uploads it onto the internet, without any knowledge of who is going to consume the file.
- User B downloads the file and loads it in program C, trusting the security of program C to remain secure in the presence of untrusted data.
- There are no shared secrets between A and B. There are no public keys. These users do not know or trust each other at all.
The idea here is: don't load the file if you cannot trust the source. The corollary is that if you are devising a format for this kind of untrusted exchange, you have to build in your own protection on top of any underlying archive format, whether Boost or not. Note that the same already goes for simple tar archives. Here too, consuming applications can use limits (e.g. path traversal limits, expanded size limits?)
Somehow this scenario works for image files.
It's easier for a special/single purpose fixed format.
It works for HTML files.
I'll argue it clearly doesn't, unless you **only** parse to /dev/null
It even more or less works for Javascript files,
Yeah, no. It clearly works more "less" than "more" here. JSON might be your "more or less" scenario?
But it does not work at all for Boost.Serialization archives, and it cannot be made to work for Boost.Serialization archives.
I'm in favor of building in some basic restrictions along the way, which would certainly reduce the harm of corrupted/malicious archives. In the best case this avoids compromising the consuming code while rejecting the input. It *might* involve disabling support for, say, polymorphic types under "strict" deserialization settings(?), but other than that, nothing too seriously hampering (akin to "body length limits" on HTTP message parsing).
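One of the basic restrictions mentioned in this message, a maximum number of container elements, could look roughly like the following. This is a standard-library sketch, not an existing Boost.Serialization option; `Limits` and `read_bounded` are hypothetical names, and the "count word followed by elements" layout is an assumption for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical knob, comparable to a body length limit in an HTTP parser.
struct Limits {
    std::size_t max_elements = 1024;
};

// Input layout (assumed): words[0] is the element count, the elements follow.
// Both checks run before any allocation, so a bogus count cannot make the
// loader reserve a huge buffer or read past the end of the input.
inline std::vector<std::uint32_t> read_bounded(const std::vector<std::uint32_t>& words,
                                               const Limits& limits) {
    if (words.empty()) throw std::length_error("missing element count");
    const std::size_t count = words[0];
    if (count > limits.max_elements) {
        throw std::length_error("element count exceeds configured limit");
    }
    if (count > words.size() - 1) {
        throw std::length_error("element count exceeds available data");
    }
    return std::vector<std::uint32_t>(
        words.begin() + 1,
        words.begin() + 1 + static_cast<std::ptrdiff_t>(count));
}

inline void example() {
    Limits limits;
    limits.max_elements = 2;
    const std::vector<std::uint32_t> input = {2, 7, 9};
    const std::vector<std::uint32_t> out = read_bounded(input, limits);
    assert(out.size() == 2);
    assert(out[1] == 9);
}
```

The limit is a policy decision for the consuming application; the second check is needed regardless of the limit, since the count itself is untrusted.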
That's a pretty serious restriction for Boost.Serialization. The kind that should go in big bold red text on the main page of the Boost.Serialization documentation. It's not a bug per se, but it means that Boost.Serialization can only be used for a small subset of serialization scenarios, and the user could easily miss this restriction if they don't read the entire documentation of the library.
I'm torn here. The Boost community has tended to assume users know what they are doing. They tend to err on the side of giving the user all the power to optimally tune things for their use-case (I don't think Flyweight has big red notices about DoS attacks, e.g.).
Very similar concerns apply to multiple other libraries: Spirit Parsers (and Boost Parser too, IIRC) *by* default have no limits on variable-length input constructs at all. Building a secure parser in them can range from easy to very involved. Boost Interprocess managed segments have similar UB caveats when mixing versions/architecture or, worse, mapping pages from untrusted sources.
I'd flip this around. I didn't ever get an implied promise of consistency, integrity checks or error detection from ANY part of the Boost Serialization docs. Quite the contrary. There are many spots that have caveats around risking UB if used/versioned improperly.
Seth
On 6/23/26 14:27, Seth via Boost wrote:
On Tue, Jun 23, 2026, at 8:03 AM, Rainer Deyke via Boost wrote:
Using checksums for authentication, or indeed any authentication system whatsoever, is quite problematic for serialization. Data sanitation is
Do you mean input sanitation? Input sanitation really isn't applicable here. There are no dangerous things to avoid in a generic way. You **could** have safety limits (like maximum number of container elements, max registered types, max memory allocated, stuff like that). Bounds checking is already in place anyway, correct me if I'm wrong.
Input sanitation is always applicable when there is input involved. Defining a bunch of maximums isn't really useful. Application-level maximums can be checked after getting the containers from Boost.Serialization, along with all other application invariants. Worst that can happen, assuming no errors in Boost.Serialization, is that Boost.Serialization invokes a failsafe by throwing std::bad_alloc or calling std::terminate. Not a problem for desktop applications.
The idea here is: don't load the file if you cannot trust the source. The corollary is that if you are devising a format for this kind of untrusted exchange, you have to build in your own protection on top of any underlying archive format, whether Boost or not.
"Never trust user input" is one of the cardinal rules of software development. It may not apply if the "user" is a programmer passing obvious nonsense directly into a function, but it always applies if the input comes from a file or a network connection, no matter how "trusted" the computer on the other side of the network. Any data that enters a process by any means, even from a different process that's part of the same application, is untrusted. (I do make an exception for data that originates within the process.) There is no protection that a user can provide on top of Boost.Serialization that prevents Boost.Serialization from invoking undefined behavior. The user can't validate the data /before/ passing it to Boost.Serialization without reimplementing a better version of Boost.Serialization, and the user can't validate the data after it comes out of Boost.Serialization because at that point the process is already compromised.
I'm torn here. The Boost community has tended to assume users know what they are doing. They tend to err on the side of giving the user all the power to optimally tune things for their use-case (I don't think Flyweight has big red notices about DoS attacks e.g.).
Very similar concerns apply to multiple other libraries: Spirit Parsers (and Boost Parser too, IIRC) *by* default have no limits on variable-length input constructs at all. Building a secure parser in them can range from easy to very involved. Boost Interprocess managed segments have similar UB caveats when mixing versions/architecture or, worse, mapping pages from untrusted sources.
Being able to overwhelm a process by sending it into an infinite loop or by consuming a lot of memory isn't very interesting to me. Almost all programs are vulnerable to locking up and crashing. That's usually not a security issue. When it is, you're usually either in an embedded system where user input is very limited or you're on an internet server where DoS attack and defense ultimately comes down to a war of attrition.
Boost.Interprocess does look very dangerous. Obviously so. I don't need a big red warning to know to stay away from it unless I am really sure I can tolerate the risk of using it. I don't think that applies to Boost.Serialization, because writing safe serialization code isn't that hard. -- Rainer Deyke - rainerd@eldwood.com
participants (8)
- Andrey Semashev
- Chris Frey
- Christian Mazakas
- Joaquin M López Muñoz
- Peter Dimov
- Rainer Deyke
- Robert Ramey
- Seth