Char Class Reference

#include <maxchar.h>

Inheritance diagram for Char:
MaxHeapOperators

Class Description

Represents a single Unicode character.

The main advantage of this class is that it can represent any Unicode character without allocating external memory.

This class is designed so that an "unsigned int*" array can be cast to a "Char*" array. Accordingly, it has no destructor and no virtual functions.
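
The layout constraints described above can be sketched with a stand-in type. `CodePoint` and `FromRaw` below are hypothetical names used only for illustration; they are not part of maxchar.h, and the real `Char` may differ in detail. The sketch checks the properties that keep an array of wrappers layout-compatible with an array of `unsigned int`, then copies raw values byte-wise instead of casting pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a Char-like wrapper: one code point, nothing else.
class CodePoint {
public:
    CodePoint() : ch_(0) {}
    explicit CodePoint(unsigned int ch) : ch_(ch) {}
    unsigned int Code() const { return ch_; }
private:
    unsigned int ch_;
};

// Properties that keep the wrapper the same shape as a bare unsigned int.
static_assert(sizeof(CodePoint) == sizeof(unsigned int), "no padding or hidden members");
static_assert(std::is_standard_layout<CodePoint>::value, "no virtual functions");
static_assert(std::is_trivially_destructible<CodePoint>::value, "no destructor to run");
static_assert(std::is_trivially_copyable<CodePoint>::value, "safe to copy byte-wise");

// Copy raw UCS4 values into wrappers byte-wise; valid because of the asserts above.
inline void FromRaw(const unsigned int* src, CodePoint* dst, std::size_t count) {
    std::memcpy(static_cast<void*>(dst), src, count * sizeof(unsigned int));
}
```

If any of these assertions failed, for example after adding a virtual function, treating an "unsigned int*" array as an array of wrappers would no longer be safe.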

Public Member Functions

  Char ()
  Char (char)
  Create a character with a single-byte Active Code Page character.
  Char (wchar_t)
  Create a character with a single-WCHAR UTF16 character.
  Char (const MaxString &)
  Char (const wchar_t *)
  Char (const char *)
  Char (unsigned int code_page, const char *)
  Char (unsigned int ch)
  Create a character using a UCS4 character.
MaxString  ToMaxString () const
  Convert this character to a MaxString.
size_t  ToUTF16 (wchar_t *string, size_t length) const
  Convert this character to UTF16.
size_t  ToACP (char *string, size_t length) const
  Convert this character to the active code page.
size_t  ToCP (unsigned int cp, char *string, size_t length) const
  Convert this character to the specified code page.
size_t  ToMCHAR (MCHAR *s, size_t length) const
  Convert this character to an MCHAR (Active Code Page or UTF16).
bool  operator== (char) const
  Compare this character to an active code page character.
bool  operator== (wchar_t) const
  Compare this character to a UTF16 character.
bool  operator== (const Char &) const
  Compare this character to another Char.
bool  operator!= (char c) const
  Not equal operator for an active code page character.
bool  operator!= (wchar_t c) const
  Not equal operator for a UTF16 character.
bool  operator!= (const Char &c) const
  Not equal operator for another Char.
Char &  operator= (wchar_t)
  Assignment operator for a UTF16 character.
Char &  operator= (char)
  Assignment operator for an active code page character.
Char &  operator= (const MaxString &)
  Assignment operator for a MaxString.
Char &  operator= (const wchar_t *)
  Assignment operator for a constant UTF16 character.
Char &  operator= (const char *)
  Assignment operator for a constant active code page character.
Char &  Set (unsigned int cp, const char *string)
  Change this character to represent the first character of the specified string, based on the specified code page.
bool  IsNull () const
  Returns true if this character represents a NULL character.
bool  IsValid () const
  Returns true if this character represents a non-NULL character.
unsigned int  CharCode () const
  Returns the character as UTF32.
bool  IsSpace () const
bool  IsDigit () const
bool  IsAlpha () const
bool  IsAlphaNumeric () const
bool  IsPunctuation () const
bool  IsUnicode () const
Char  ToUpper () const
Char  ToLower () const
MCHAR *  IsInCharacterSet (const MCHAR *) const
  Determine if this character is found inside a particular character set.

Static Public Member Functions

static bool  IsValidUTF8 (const void *data, size_t len, bool *pFoundExtendedChar=0, size_t *pPartialCharAtEnd=0)
  Determine if the UTF-8 string is valid.
static unsigned int  UTF8CharLength (unsigned char character)
  Determine the length of a UTF-8 character.
static bool  IsUTF16LeadChar (wchar_t character)
  Determine if the UTF-16 char is a lead character.

Protected Attributes

unsigned int  _ch

Constructor & Destructor Documentation

Char ( )
Char ( char  )

Create a character with a single-byte Active Code Page character.

This constructor is best used with a compile-time constant, because if you specify only the first half of a DBCS character, this function will throw an exception.

Char ( wchar_t  )

Create a character with a single-WCHAR UTF16 character.

This constructor is best used with a compile-time constant, because if you specify only the first half of a UTF16 character, this function will throw an exception.

Char ( const MaxString &  )
Char ( const wchar_t *  )
Char ( const char *  )
Char ( unsigned int  code_page,
const char *   
)
Char ( unsigned int  ch ) [explicit]

Create a character using a UCS4 character.

This constructor is declared "explicit" because "unsigned int" is such a generic type that a "Char" could otherwise be constructed implicitly by mistake.
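
The effect of "explicit" can be checked at compile time. This is a minimal sketch using a hypothetical stand-in type (`CodePoint`) and a hypothetical function (`CodeOf`), not the SDK class itself.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a Char-like type with an explicit UCS4 constructor.
struct CodePoint {
    explicit CodePoint(unsigned int ch) : value(ch) {}
    unsigned int value;
};

// Hypothetical function that takes the wrapper by value.
inline unsigned int CodeOf(CodePoint c) { return c.value; }

// Direct construction from unsigned int is allowed...
static_assert(std::is_constructible<CodePoint, unsigned int>::value,
              "explicit construction works");
// ...but an unsigned int never converts silently, so CodeOf(65u) would not compile.
static_assert(!std::is_convertible<unsigned int, CodePoint>::value,
              "no implicit conversion");
```

A caller therefore has to write `CodeOf(CodePoint(65u))`, which makes the intent to build a character from a raw number visible at the call site.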


Member Function Documentation

MaxString ToMaxString ( ) const

Convert this character to a MaxString.

size_t ToUTF16 ( wchar_t *  string,
size_t  length 
) const

Convert this character to UTF16.

Parameters:
string Buffer that receives the converted characters
length Size of the string buffer, in characters
Returns:
Number of characters converted.
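
To illustrate the buffer-and-length contract, here is a sketch of encoding one UCS4 code point as UTF-16. `EncodeUtf16` is a hypothetical free function, not the SDK's ToUTF16. It uses `char16_t` rather than `wchar_t` so the unit size is the same on every platform. Returning 0 for an invalid code point or a buffer that is too small, and not writing a terminator, are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: writes one code point as UTF-16 into `out`.
// Returns the number of 16-bit units written, or 0 if the code point cannot
// be encoded or the buffer is too small (assumed behavior for this sketch).
inline std::size_t EncodeUtf16(unsigned int cp, char16_t* out, std::size_t length) {
    // Surrogate values and anything above U+10FFFF are not valid code points.
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single unit.
        if (length < 1) return 0;
        out[0] = static_cast<char16_t>(cp);
        return 1;
    }
    // Supplementary planes: a lead and a trail surrogate.
    if (length < 2) return 0;
    cp -= 0x10000;
    out[0] = static_cast<char16_t>(0xD800 + (cp >> 10));
    out[1] = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
    return 2;
}
```

A caller that passes a two-unit buffer can hold any single character, which is why a fixed-size local array is enough for this kind of conversion.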
size_t ToACP ( char *  string,
size_t  length 
) const

Convert this character to the active code page.

Parameters:
string Buffer that receives the converted characters
length Size of the string buffer, in characters
Returns:
Number of characters converted.
size_t ToCP ( unsigned int  cp,
char *  string,
size_t  length 
) const

Convert this character to the specified code page.

Parameters:
cp Code page to convert to
string Buffer that receives the converted characters
length Size of the string buffer, in characters
Returns:
Number of characters converted.
size_t ToMCHAR ( MCHAR *  s,
size_t  length 
) const [inline]

Convert this character to an MCHAR (Active Code Page or UTF16).

Parameters:
s Buffer that receives the converted characters
length Size of the s buffer, in characters
Returns:
Number of characters converted.
{ return ToACP(s, length); };
bool operator== ( char  ) const

Compare this character to an active code page character.

bool operator== ( wchar_t  ) const

Compare this character to a UTF16 character.

bool operator== ( const Char &  ) const

Compare this character to another Char.

bool operator!= ( char  c ) const [inline]

Not equal operator for an active code page character.

{ return !(*this == c); }
bool operator!= ( wchar_t  c ) const [inline]

Not equal operator for a UTF16 character.

{ return !(*this == c); }
bool operator!= ( const Char &  c ) const [inline]

Not equal operator for another Char.

{ return !(*this == c); }
Char& operator= ( wchar_t  )

Assignment operator for a UTF16 character.

Char& operator= ( char  )

Assignment operator for an active code page character.

Char& operator= ( const MaxString &  )

Assignment operator for a MaxString.

Char& operator= ( const wchar_t *  )

Assignment operator for a constant UTF16 character.

Char& operator= ( const char *  )

Assignment operator for a constant active code page character.

Char& Set ( unsigned int  cp,
const char *  string 
)

Change this character to represent the first character of the specified string, based on the specified code page.

Parameters:
cp Code page used to interpret the string
string String to use as the copy source
bool IsNull ( ) const [inline]

Returns true if this character represents a NULL character.

{ return _ch == 0; }
bool IsValid ( ) const [inline]

Returns true if this character represents a non-NULL character.

{ return _ch != 0; }
unsigned int CharCode ( ) const [inline]

Returns the character as UTF32.

{ return _ch; }
bool IsSpace ( ) const
bool IsDigit ( ) const
bool IsAlpha ( ) const
bool IsAlphaNumeric ( ) const
bool IsPunctuation ( ) const
bool IsUnicode ( ) const [inline]
{ return _ch > 0x7e; }
Char ToUpper ( ) const
Char ToLower ( ) const
static bool IsValidUTF8 ( const void *  data,
size_t  len,
bool *  pFoundExtendedChar = 0,
size_t *  pPartialCharAtEnd = 0 
) [static]

Determine if the UTF-8 string is valid.

Parameters:
data Pointer to the buffer whose encoding is validated
len Size of the buffer to validate
pFoundExtendedChar If not null, receives whether a multi-byte character was found.
pPartialCharAtEnd If not null, receives the number of bytes missing from the last character, when that character begins correctly but is cut off by the end of the buffer.
Returns:
True if the string is valid UTF-8, False otherwise
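
The two optional out-parameters can be illustrated with a standalone validator. `ValidateUtf8` and `SeqLength` are hypothetical names, and this is not the SDK's implementation. In particular, returning false for a truncated trailing character while still reporting the missing byte count is an assumption of this sketch; the documentation above does not say which value the real function returns in that case.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: sequence length implied by a lead byte, 0 if invalid.
static unsigned SeqLength(unsigned char b) {
    if (b < 0x80) return 1;
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return 0;  // continuation byte, or a lead byte that is never valid
}

// Hypothetical validator mirroring the parameter list documented above.
inline bool ValidateUtf8(const void* data, std::size_t len,
                         bool* foundExtended = nullptr,
                         std::size_t* missingAtEnd = nullptr) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    if (foundExtended) *foundExtended = false;
    if (missingAtEnd) *missingAtEnd = 0;

    std::size_t i = 0;
    while (i < len) {
        const unsigned n = SeqLength(p[i]);
        if (n == 0) return false;
        if (n > 1 && foundExtended) *foundExtended = true;

        const std::size_t avail = len - i;
        const std::size_t check = n <= avail ? n : avail;
        for (std::size_t k = 1; k < check; ++k) {
            unsigned char lo = 0x80, hi = 0xBF;
            if (k == 1) {
                // The second byte has a narrower range for some lead bytes.
                if (p[i] == 0xE0) lo = 0xA0;       // rejects overlong 3-byte forms
                else if (p[i] == 0xED) hi = 0x9F;  // rejects UTF-16 surrogates
                else if (p[i] == 0xF0) lo = 0x90;  // rejects overlong 4-byte forms
                else if (p[i] == 0xF4) hi = 0x8F;  // rejects values above U+10FFFF
            }
            if (p[i + k] < lo || p[i + k] > hi) return false;
        }
        if (n > avail) {
            // Correct so far, but the buffer ends inside the character.
            if (missingAtEnd) *missingAtEnd = n - avail;
            return false;  // assumption: a truncated buffer is reported as invalid
        }
        i += n;
    }
    return true;
}
```

Reporting the missing byte count is useful when data arrives in chunks: the caller can hold back the partial character and retry once the remaining bytes have been read.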
static unsigned int UTF8CharLength ( unsigned char  character ) [static]

Determine the length of a UTF-8 character.

Parameters:
character First byte of the UTF-8 character to get the length of
Returns:
length of the character in bytes.
static bool IsUTF16LeadChar ( wchar_t  character ) [static]

Determine if the UTF-16 char is a lead character.

In some situations, a UTF-16 character is stored in four bytes (a lead unit followed by a trail unit) instead of two.

Parameters:
character Character to test
Returns:
True if the character is a lead character, False otherwise
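
For reference, the lead-unit test follows directly from the UTF-16 encoding: lead surrogates occupy the range 0xD800 to 0xDBFF and trail surrogates 0xDC00 to 0xDFFF. The helpers below are a hypothetical sketch (`IsLeadSurrogate` and `IsTrailSurrogate` are not SDK names) and use `char16_t` so the unit size does not depend on the platform's `wchar_t`.

```cpp
#include <cassert>

// Hypothetical helper: true for the first unit of a two-unit UTF-16 character.
inline bool IsLeadSurrogate(char16_t unit) {
    return (unit & 0xFC00) == 0xD800;  // top six bits are 110110
}

// Hypothetical helper: true for the second unit of a two-unit UTF-16 character.
inline bool IsTrailSurrogate(char16_t unit) {
    return (unit & 0xFC00) == 0xDC00;  // top six bits are 110111
}
```

Code that walks a UTF-16 buffer can use the lead test to decide whether to advance by one unit or by two.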
MCHAR* IsInCharacterSet ( const MCHAR *  ) const

Determine if this character is found inside a particular character set.

This function differs slightly from strrchr or wcsrchr: it searches for a DBCS or UTF16 character as a whole, so it never matches only part of a multi-unit character in the string.
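
The difference from a unit-by-unit search can be sketched for the UTF-16 case. `FindCodePoint` is a hypothetical helper, not the SDK function. It assumes a null-terminated `char16_t` string and returns the first match; the documentation above does not say whether the real function returns the first or the last occurrence.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns a pointer to the first occurrence of code point
// `cp` in the null-terminated UTF-16 string `set`, or nullptr if it is absent.
inline const char16_t* FindCodePoint(const char16_t* set, unsigned int cp) {
    for (const char16_t* p = set; *p != 0; ) {
        unsigned int current = *p;
        std::size_t units = 1;
        // Decode a surrogate pair as one character. Reading p[1] is safe here
        // because p[0] is not the terminator, so p[1] is at worst the terminator.
        const bool lead = (p[0] & 0xFC00) == 0xD800;
        if (lead && (p[1] & 0xFC00) == 0xDC00) {
            current = 0x10000 + ((p[0] - 0xD800u) << 10) + (p[1] - 0xDC00u);
            units = 2;
        }
        if (current == cp) return p;
        p += units;  // step over the whole character, never into its middle
    }
    return nullptr;
}
```

Because the loop advances by whole characters, searching for the value of a lone lead unit does not match the first half of a surrogate pair, which is the partial match a plain unit-wise search would report.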


Member Data Documentation

unsigned int _ch [protected]