mblen

Determines the length of a multibyte character, or whether the multibyte encoding is stateful

#include <stdlib.h>
int mblen( const char *s, size_t maxsize );

The mblen( ) function determines the length in bytes of the multibyte character that its pointer argument references. If the argument points to a valid multibyte character, then mblen( ) returns that character's length in bytes, which is a value greater than zero. If the argument points to a null character ('\0'), then mblen( ) returns 0. A return value of -1 indicates that the argument does not point to a valid multibyte character, or that the multibyte character is longer than the maximum size specified by the second argument. The LC_CTYPE category in the current locale settings determines which byte sequences are valid multibyte characters.
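
As a minimal sketch of how the three kinds of return value can be used together, the following helper counts the multibyte characters in a string. The name count_mb_chars( ) is hypothetical and not part of the standard library; the sketch also assumes that the string begins in the initial shift state.

```c
#include <assert.h>
#include <stdlib.h>

/* count_mb_chars: hypothetical helper, not a standard function.
 * Returns the number of multibyte characters in the string s,
 * or -1 if s contains an invalid multibyte sequence.
 * Assumes that s starts in the initial shift state.
 */
long count_mb_chars( const char *s )
{
  long count = 0;
  int len;

  assert( s != NULL );     // A null pointer would turn the mblen( ) call
                           // into a query about the encoding's state.
  mblen( NULL, 0 );        // Put mblen( ) in the initial shift state.

  while (( len = mblen( s, MB_CUR_MAX )) > 0 )
  {                        // Valid character: skip over it.
    s += len;
    ++count;
  }
  // Here len is 0 (null character reached) or -1 (encoding error).
  return ( len == 0 ) ? count : -1;
}
```

In the default "C" locale, each byte of an ASCII string is one character, so count_mb_chars( "hello" ) yields 5. In other locales the result depends on the encoding that LC_CTYPE selects.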

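The second argument specifies the maximum number of bytes that mblen( ) examines. It need not be greater than the value of the macro MB_CUR_MAX, defined in stdlib.h, which is the maximum length in bytes of a multibyte character in the current locale. Because that value depends on the locale, MB_CUR_MAX is not necessarily a compile-time constant.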
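
The second argument matters most when the input is not null-terminated, for example when only part of a buffer has been filled. The sketch below limits the number of bytes examined to whichever is smaller, the readable bytes or MB_CUR_MAX. The name next_char_len( ) and its avail parameter are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* next_char_len: hypothetical helper, not a standard function.
 * buf points to at least avail readable bytes.
 * Returns the length of the multibyte character at buf, 0 if buf
 * points to a null character, or -1 if no complete, valid character
 * is found within the bytes examined.
 */
int next_char_len( const char *buf, size_t avail )
{
  size_t limit;

  assert( buf != NULL );   // A null pointer has a special meaning
                           // for mblen( ).
  // Never examine more bytes than are readable, nor more than the
  // longest character in the current locale:
  limit = ( avail < MB_CUR_MAX ) ? avail : MB_CUR_MAX;

  if ( limit == 0 )
    return -1;             // Nothing to examine.

  return mblen( buf, limit );
}
```

A return value of -1 from this helper does not distinguish between an invalid sequence and a character that was cut off at the end of the buffer. Where that distinction matters, mbrlen( ) reports a truncated character separately.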

If you pass mblen( ) a null pointer as the first argument, then the return value indicates whether the current multibyte encoding is stateful. This behavior is the same as that of mbtowc( ). If mblen( ) returns 0, then the encoding is stateless. If it returns any other value, the encoding is stateful; that is, the interpretation of a given byte sequence may depend on the shift state.
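
The state query can be wrapped in a small helper, sketched here under the assumption that the locale has already been selected with setlocale( ). The name encoding_is_stateful( ) is hypothetical. According to the C standard, a call with a null pointer also places mblen( ) in the initial shift state.

```c
#include <stdbool.h>
#include <stdlib.h>

/* encoding_is_stateful: hypothetical helper, not a standard function.
 * Returns true if the multibyte encoding of the current locale
 * uses shift states, false otherwise.
 * Side effect: mblen( ) is placed in the initial shift state.
 */
bool encoding_is_stateful( void )
{
  // With a null pointer, the second argument is not used.
  return mblen( NULL, 0 ) != 0;
}
```

In the default "C" locale the encoding is stateless, so the helper returns false there. UTF-8 is also a stateless encoding.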

Example

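#include <stdlib.h>   // mblen( ), MB_CUR_MAX
#include <string.h>   // strlen( ), strncpy( )
#include <wchar.h>    // mbstate_t, mbsinit( ), wcrtomb( ), mbrlen( )

size_t mbsrcat( char * restrict s1, const char * restrict s2,
                mbstate_t * restrict p_s1state, size_t n )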
/* mbsrcat: multibyte string restartable concatenation.
 * Appends s2 to s1, respecting final shift state of destination string,
 * indicated by *p_s1state. String s2 must start in the initial shift state.
 * Returns: number of bytes copied from s2, or (size_t)-1 on encoding error.
 * Max. total length (incl. terminating null byte) is <= n;
 * stores ending state of concatenated string in *p_s1state.
 */
{
  int result;
  size_t i = strlen( s1 );
  size_t j = 0;

  // Sanity check: room for 1 multibyte char + string terminator.
  // Testing n first keeps the unsigned subtraction from wrapping around.
  if ( n < MB_CUR_MAX + 1 || i >= n - ( MB_CUR_MAX + 1 ))
    return 0;                        // Report 0 bytes written.

  // Shift s1 down to initial state:

  if ( !mbsinit( p_s1state ))   // If not initial state, then append
  {                             // shift sequence to get initial state.
    size_t len = wcrtomb( s1+i, L'\0', p_s1state );
    if ( len == (size_t)-1 )
      {                         // Encoding error:
        s1[i] = '\0';           // Try restoring termination.
        return (size_t)-1;      // Report error to caller.
      }
    else
      i += len - 1;             // The count includes the null byte that
                                // wcrtomb( ) stored; overwrite it next.
  }
  // Copy only whole multibyte characters at a time.
  mblen( NULL, 0 );        // Put mblen( ) in the initial shift state.
  // Get length of next char w/o changing *p_s1state:
  while (( result = mblen( s2+j, MB_CUR_MAX )) != 0 )
  {
    if ( result == -1 )
    {                      // Encoding error:
      s1[i] = '\0';        // Terminate now.
      return (size_t)-1;   // Report error to caller.
    }
    if ( (size_t)result > n - ( 1 + i ))
      break;               // No room left for this character.
                           // Next character fits; copy it and update state:
    strncpy( s1+i, s2+j, mbrlen( s2+j, MB_CUR_MAX, p_s1state ));
    i += result;
    j += result;
  }
  s1[i] = '\0';
  return j;
}

See Also

mbrlen( ), mbtowc( )

