Boost logo

Boost :

Subject: Re: [boost] [new Warnings policy] MS C4180 on the Maintenance Guidelines
From: Patrick Horgan (phorgan1_at_[hidden])
Date: 2009-11-23 16:24:06


Zachary Turner wrote:
> On Mon, Nov 16, 2009 at 3:08 PM, Patrick Horgan <phorgan1_at_[hidden]> wrote:
>
> ... elision by Patrick ...
> C4996 Microsoft deprecated these functions on their own (and they call
>
>> themselves POSIX compliant), and replaced them with extremely dangerous
>> replacements that they claim are safe. The replacement functions are
>> non-POSIX and aren't available from other compilers, hence non-portable.
>> These warnings should be suppressed. I believe that you can also
>> suppress
>> by defining _CRT_NONSTDC_NO_DEPRECATE before and undef'ing after to do a
>> local suppression.
>>
>>
> Somewhat off topic, but why exactly are these extremely dangerous, and not
> secure?
>
Sorry about the delay in responding to this, but Nick Stoughton says it
much better than I do, so I was waiting for him to get back to me and
let me know if I could just quote him on it. Nicely enough, he did, so
the following is from him, it's the original liaison memo from POSIX to
C when C was considering a proposal to add these interfaces into the lib:

Nick Stoughton wrote:
> SC 22/WG 14 N1160 Austin Group Concerns on PDTR
24731 Stoughton
> 2006-02-27
> Members of the Austin Group have been reviewing the proposed Technical
> Report on "Bounds Checking Functions" over the last year, and wish to
> express their concerns over its direction.
> The proposed interfaces fail to address many of the aspects related to
> buffer overflow and as a result are only suitable for a narrow range
> of applications.
> The basic idea embodied by the proposed interface is not a new one.
> For example, the proposed strcpy_s function is similar to the strlcpy
> function of OpenBSD 2.4 (1998). However, the basic idea has not achieved
> practical consensus; on the contrary, for reasons discussed below it has
> been controversial almost since it was introduced. A Technical Report
> of type 2 does not seem warranted here: the subject is controversial
> rather than being under technical development, and mere publication of
> a TR is unlikely to further consensus.
> The core of the problem is that memory handling in C is complicated and
> error prone. Nobody doubts that improvements in the supporting APIs
> are useful. However, the existing APIs already provide all the means
> to write correct programs, although it is often cumbersome to do so.
> The proposed interfaces don't change that and, to the contrary, can
> make programs even more complex. A better solution would be to take
> the memory handling off the hands of the programmer as much as possible.
> Let's look at the string functions first. Obviously, code like
> void f(char *t, const char *s1, const char *s2) {
> strcpy(t, s1);
> strcat(t, s2);
> }
> is bad. But it is not written this way because it is impossible to
> write correct code. Obviously the length of the target buffer can be
> passed, though this alters the ABI (see below):
> void f(char *t, size_t tlen, const char *s1, const char *s2) {
> if (strlen(s1)+strlen(s2) >= tlen) abort();
> strcpy(t, s1);
> strcat(t, s2);
> }
> This is cumbersome to write and slow which is why programmers don't
do it.
> But, more importantly, this kind of change cannot retroactively made
> because it changes both the API and ABI. New interfaces would have to be
> introduced (give the new function a different name) but then one might
> as well write a better function than this. If f() is in a third party
> library, it cannot change without all the customers of the library
> changing and recompiling/relinking their applications.
> The version using the proposed interfaces has exactly the same problem.
> void f(char *t, rsize_t tlen, const char *s1, const char *s2) {
> if (strcpy_s(t, tlen, s1) != 0 || strcat_s(t, tlen, s2) != 0)
> abort();
> }
> If anything, this code is even less obvious then the previous version
> even though it is likely a bit faster.
> Even using the exception handler to provide the abort requires the ABI
> to change:
> void f(char *t, rsize_t tlen, const char *s1, const char *s2) {
> (void) set_constraint_handler_s(abort_handler_s);
>
Page 1 of 4
> strcpy_s(t, tlen, s1);
> strcat_s(t, tlen, s2);
> }
> Another problem with respect to the string functions being used to fix up
> existing code can be demonstrated with this code sequence:
> char *p = malloc (3 * NAME_LEN);
> strcpy (p, name1);
> strcat (p, name2);
> strcat (p, name3);
> All too often fixed values like NAME_LEN are used which are the basis for
> overflows. A programmer could certainly use
> char *p = malloc(strlen(name1) + strlen(name2) + strlen(name3) + 1);
> if (p != NULL)
> {
> strcpy (p, name1);
> strcat (p, name2);
> strcat (p, name3);
> }
> but this is once again cumbersome and therefore won't be used. Now
> assume the new string functions. The code to correctly handle
> the code (more correct than either of the previous two code sequences,
> this is the goal) could look something like this:
> rsize_t len = 3 * NAME_LEN;
> char *p = NULL;
> again:
> char *p2 = realloc(p, len);
> if (p2 == NULL)
> abort ();
> p = p2;
> if (strcpy_s(p, len, name1) != 0
> || strcat_s(p, len, name2) != 0
> || strcat_s(p, len name3) != 0)
> {
> len += 2;
> goto again;
> }
> Nobody can say that this is more appealing to the programmer and
> it is unlikely that code like this will find its way
> into many programs.
> Once again, this can be written more simply as
> char *p = malloc(3*NAME_LEN);
> set_constraint_handler_s(abort_handler_s);
> strcpy_s(p, NAME_LEN, name1);
> strcat_s(p, strlen(p)+NAME_LEN, name2);
> strcat_s(p, strlen(p)+NAME_LEN, name3);
> but this still has several vulnerabilities and coding weaknesses,
> resulting in different end results for p than previously correctly
> functioning code:
> 1. Is it certain that name1, name2 and name3 really were a
> maximum of NAMELEN on entry to this code? What if name1 was
> NAME_LEN+10, while name2 was NAME_LEN-10?
> 2. The code needs to be aware of the exception handler; if the
> handler is not the abort handler, the code *should* check the
> return code and take appropriate steps if it is to function
> correctly. A simple drop in replacement as above will fail with
> the ignore-handler.
>
Page 2 of 4
> The new string functions are meant as an aid to secure existing code
> bases but the requirement to change ABI (new function parameters,
> additional elements in structures, etc) plus the added complexity of the
> code makes the adoption of these interfaces higher unlikely.
> Instead, a better approach is to eliminate the requirement on the
> programmer to deal with the allocation him/herself. The runtime should
> do this. In this users of systems with the GNU C library can simply use
> char *p;
> if (asprintf(&p, "%s%s%s", name1, name2, name2) == -1)
> abort();
> The runtime makes sure the target string is large enough and that
> error conditions (out-of-memory etc) are recognized. It does leave the
> programmer the responsibility of adding
> free(p);
> when he/she is finished with the result, but is this really harder than
> some of the contortions necessary to use the proposed interfaces?
> The concept of rsize_t and RSIZE_MAX also cause confusion. The lesson
> learned from "640 KiB is enough address space" is that there is no fixed
> limit which people wouldn't want to see lifted over time.
> On 32-bit systems we used to have 2 GiB or up to 3 GiB of address space
> available for user-level code. Nowadays the whole 4 GiB is available
because
> people asked for it. Any RSIZE_MAX chosen for 2 or 3 GiB address spaces
> would prevent using 4 GiB address spaces. The same is true for any other
> limit and it will definitely remain true for 64-bit architectures as
well.
> Any interface introduced solely for the purpose of using rsize_t instead
> of size_t is completely unnecessary. Aside from the problem of picking
> a size, using rsize_t for different sizes like tmpnam_s (for a string)
> wcscpy_s (a wide char string) and qsort_s (a number of element and a
> type size) makes no sense. How can a function reject handling strings
> of, say, 2^20 bytes but allow wcscpy_s to handling 2^22 bytes (on
> platforms with 4 byte wchar_t)?
> All of this functionality can be implemented with the existing
> implementation. It is always possible for the runtime to determine the
> maximum possible string length, for example, by looking at the gaps in
> the address space at startup time. This *dynamically* determined value
> can then be used for sanity checks; no correct program can ever use
> larger values. It is therefore no violation of the ISO C to handle
> these situations as error cases.
> Then there are the stdio and string functions which are now supposed to
> gracefully handle NULL pointers. However, as "drop-in replacements" for
> the original functions, these are likely to lead to security
weaknesses in
> the application where the unmodified program would crash. Programmers far
> too often don't check for errors and so NULL pointers and the like are
> used in places where they shouldn't. Now assume that the stream pointer
> is supposed to be for a stream where security logs are written to.
> If an attacker can overwrite the FILE* value with NULL no more output
> happens and security problems remain unreported and undetected. The only
> valid exception handler in these cases should be the abort_handler.
> There are a myriad of situations like this where unreported invalid
> input can cause problems. There is a problem with the special handling
> of printf("%s", NULL) in the GNU C library, which chose to print the
> string "(null)". In hindsight, this extension probably was not a good
> idea because it hides problems.
> The only reasonable way to handle invalid inputs is by brute force:
> abort the program or at the very least make absolutely sure the user
> notices the problem. A Denial of Service attack is much less severe and
> easier to notice and battle than an attack which causes, for instance,
>
Page 3 of 4
> logging to be disabled.
> The "drop-in" replacement technique introduces subtle semantic changes
> in the way an application will behave under a certain set of input data,
> and these semantic changes may well have unintended side effects that
> make the resulting program less "secure". As a result, any programmer
> altering an application to make changes to the interfaces to use these
> bounds-checking versions *must* do more than simply "drop-in" the new
> version. Since he/she is going to that much work anyway, moving to a
> paradigm using malloc'ed memory would be no harder, and is demonstrably
> safer.
> Another problem can arise in cases where the programmer makes incorrect
> assumptions about buffer sizes. In this case, the programmer believes
> that he/she has mitigated all buffer overflow problems by using the new
> interfaces, but in reality there are still buffer overflows possible in
> the code.
> We are not aware of any study demonstrating that the proposed approach
> leads to safer or more-secure software. On the contrary, when one of
> us very-briefly attempted to investigate the matter in 2002, he found
> that the technique (as embodied by the similar strlcpy/strlcat
> functions of OpenBSD) made C code harder to read and to maintain,
> while not catching any bugs in the code surveyed. The code surveyed
> was OpenSSH_3.0.2p1, the then-current version. For some details
> please see:
> http://sources.redhat.com/ml/libc-alpha/2002-01/msg00096.html
> http://sources.redhat.com/ml/libc-alpha/2002-01/msg00159.html
> In summary, there is no aspect of the proposal which is worth
> standardizing. Either no change is needed for the new interfaces or
> there is no chance that the interfaces will find widespread adoption.
>
Page 4 of 4
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk