Java provides several classes for character encoding and decoding in the 'java.nio.charset' package. The 'CharsetEncoder' class of this package performs the encoding. This article covers the class, its syntax, its methods, and examples of error handling and optimization techniques.
The 'CharsetEncoder' class belongs to the 'java.nio.charset' package.
The basic function of the class is to convert character sequences into bytes according to a particular character encoding, represented by a 'Charset'. It is commonly used when writing textual data to files, transmitting data over a network, and converting data between character encodings.
'CharsetEncoder' translates character input into byte output. Java represents characters internally as UTF-16 code units, and the encoder converts them into the byte representation of the chosen character encoding (e.g. UTF-8).
public abstract class CharsetEncoder extends Object
Constructors of CharsetEncoder and their descriptions.
Constructor | Modifier | Description |
|---|---|---|
CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar) | protected | Initializes a new encoder for the given Charset with the expected average and maximum number of bytes produced per input character. |
CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement) | protected | Initializes a new encoder for the given Charset with the expected average and maximum number of bytes per character, plus the replacement byte sequence used for characters that cannot be mapped. |
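Because both constructors are protected, they are only reachable from a subclass. The following is a minimal sketch of such a subclass, not a production charset: 'SevenBitSketch', 'SevenBitCharset', 'SevenBitEncoder', and the name 'X-SKETCH-7BIT' are all hypothetical, and decoding is deliberately left out. It overrides 'isLegalReplacement' because the default check relies on the charset's decoder, which this sketch does not provide.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

class SevenBitSketch {

    // Hypothetical charset: maps U+0000..U+007F to one byte each and nothing else.
    static final class SevenBitCharset extends Charset {
        SevenBitCharset() {
            super("X-SKETCH-7BIT", null); // made-up canonical name, no aliases
        }

        @Override
        public boolean contains(Charset cs) {
            return cs instanceof SevenBitCharset;
        }

        @Override
        public CharsetDecoder newDecoder() {
            throw new UnsupportedOperationException("decoding is out of scope for this sketch");
        }

        @Override
        public CharsetEncoder newEncoder() {
            return new SevenBitEncoder(this);
        }
    }

    static final class SevenBitEncoder extends CharsetEncoder {
        SevenBitEncoder(Charset cs) {
            // Four-argument constructor: average and maximum of 1 byte per char, '?' as replacement.
            super(cs, 1.0f, 1.0f, new byte[] {(byte) '?'});
        }

        // Accept any single 7-bit byte instead of decoding it (this sketch has no decoder).
        @Override
        public boolean isLegalReplacement(byte[] repl) {
            return repl.length == 1 && repl[0] >= 0;
        }

        @Override
        protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
            while (in.hasRemaining()) {
                if (!out.hasRemaining()) {
                    return CoderResult.OVERFLOW;
                }
                char c = in.get(in.position()); // peek without consuming
                if (c > 0x7F) {
                    return CoderResult.unmappableForLength(1);
                }
                in.get(); // consume the character
                out.put((byte) c);
            }
            return CoderResult.UNDERFLOW;
        }
    }

    // Encodes text with the sketch encoder; unmappable input surfaces as an unchecked exception.
    static byte[] encode(String text) {
        try {
            ByteBuffer encoded = new SevenBitCharset().newEncoder().encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[encoded.remaining()];
            encoded.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("abc").length + " bytes");
    }
}
```

In everyday code an encoder is normally obtained from an existing charset with 'newEncoder()' rather than by subclassing.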
Methods of CharsetEncoder and their descriptions.
Modifier and Type | Method | Description |
|---|---|---|
float | averageBytesPerChar() | Returns the average number of bytes that will be generated for every input character. |
boolean | canEncode(char c) | Indicates if the specified character can be encoded by this encoder. |
boolean | canEncode(CharSequence cs) | Indicates if the provided character sequence can be encoded by this encoder. |
Charset | charset() | Returns the charset that created this encoder. |
ByteBuffer | encode(CharBuffer in) | Encodes the remaining content of a single input character buffer into a newly allocated byte buffer. |
CoderResult | encode(CharBuffer in, ByteBuffer out, boolean endOfInput) | Writes the results to the specified output buffer after encoding as many characters as possible from the provided input buffer. |
protected abstract CoderResult | encodeLoop(CharBuffer in, ByteBuffer out) | Encodes one or more characters into one or more bytes. |
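CoderResult | flush(ByteBuffer out) | Flushes the encoder, writing any remaining output to the given buffer. |
protected CoderResult | implFlush(ByteBuffer out) | Flushes the encoder; invoked by flush(). The default implementation does nothing and returns CoderResult.UNDERFLOW. |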
protected void | implReset() | Clears any internal state specific to a given charset by resetting this encoder. |
boolean | isLegalReplacement(byte[] repl) | Indicates if the provided byte array is a valid replacement value for this encoder. |
float | maxBytesPerChar() | Returns the maximum number of bytes that can be generated for each input character. |
CharsetEncoder | reset() | Resets the encoder, clearing any internal state. |
byte[] | replacement() | Returns the replacement value for this encoder. |
CharsetEncoder | replaceWith(byte[] newReplacement) | Modifies the replacement value of this encoder. |
In addition to the methods listed above, CharsetEncoder inherits the methods of java.lang.Object.
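A few of the query methods from the table can be exercised on an encoder obtained from a standard charset. The sketch below is illustrative only: the class name 'EncoderInfoSketch' and the helper 'describe' are hypothetical, and the numeric values it prints depend on the JDK's charset implementation.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class EncoderInfoSketch {

    // Builds a one-line summary of an encoder created for the given charset.
    static String describe(Charset cs, String sample) {
        CharsetEncoder encoder = cs.newEncoder();
        return encoder.charset().name()
                + ": average=" + encoder.averageBytesPerChar()
                + ", max=" + encoder.maxBytesPerChar()
                + ", replacementLength=" + encoder.replacement().length
                + ", canEncode(sample)=" + encoder.canEncode(sample);
    }

    public static void main(String[] args) {
        // \u03A9 is the Greek capital omega.
        String sample = "Charset \u03A9 Encoder";
        System.out.println(describe(StandardCharsets.UTF_8, sample));
        System.out.println(describe(StandardCharsets.US_ASCII, sample));
    }
}
```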
In this example, the input string is encoded into bytes using a CharsetEncoder with the UTF-8 character encoding.
It shows how to construct a CharsetEncoder, wrap the input text in a CharBuffer, encode the characters, and output the encoded data. It also includes basic error handling for problems that may arise during encoding.
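A minimal sketch of these steps, assuming the input text "CharsetEncoder Example"; the class name 'BasicEncodeSketch' and the helper 'encodeUtf8' are hypothetical names chosen for illustration. The encoded bytes are decoded again only so that they can be printed as text.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class BasicEncodeSketch {

    // Encodes the text as UTF-8 and returns the resulting bytes.
    static byte[] encodeUtf8(String text) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer input = CharBuffer.wrap(text); // place the input text in a CharBuffer
        try {
            ByteBuffer output = encoder.encode(input); // newly allocated, ready to read
            byte[] bytes = new byte[output.remaining()];
            output.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            // Basic error handling: malformed input (e.g. an unpaired surrogate) ends up here.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = encodeUtf8("CharsetEncoder Example");
        // Decode the bytes again to display them as text.
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```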
CharsetEncoder Example
An encoder can only encode characters that its charset is able to represent. UTF-8 covers every Unicode character, including 'Ω', so unmappable characters arise with narrower charsets such as US-ASCII or ISO-8859-1. To prevent encoding failures, such characters need to be handled explicitly. In this example, the input string contains the symbol 'Ω', which the target charset cannot map. We call the 'onUnmappableCharacter' method with the 'CodingErrorAction.REPLACE' action so that each unmappable character is replaced by the encoder's replacement bytes.
In this example, every 'Ω' is replaced by '?', the fallback character used for error handling.
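A minimal sketch of this replacement behaviour follows. Since UTF-8 can encode 'Ω' directly, the sketch assumes US-ASCII as the target charset so that the character really is unmappable. The class name 'ReplaceUnmappableSketch' and the helper 'encodeWithReplacement' are hypothetical, and the replacement byte is set explicitly with 'replaceWith' rather than relying on the encoder's default.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class ReplaceUnmappableSketch {

    // Encodes text as US-ASCII, substituting '?' for anything the charset cannot map,
    // then decodes the bytes again so the result can be shown as text.
    static String encodeWithReplacement(String text) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] {(byte) '?'});
        try {
            ByteBuffer encoded = encoder.encode(CharBuffer.wrap(text));
            return StandardCharsets.US_ASCII.decode(encoded).toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE configured, but encode(CharBuffer) declares it.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // \u03A9 is the Greek capital omega.
        System.out.println("Encoded String: " + encodeWithReplacement("Charset \u03A9 Encoder"));
    }
}
```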
Encoded String: Charset ? Encoder
Now that we have covered encoding operations with the CharsetEncoder class, the next concern is efficiency and performance when dealing with larger volumes of data.
In this article, we covered the methods and best practices related to the CharsetEncoder class. From syntax and constructors to error handling and optimization techniques, we explored how to use this class for character encoding tasks in Java applications.
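One common approach for larger inputs, offered here as a sketch rather than as a definitive technique, is to encode in chunks through a single fixed-size ByteBuffer with 'encode(CharBuffer, ByteBuffer, boolean)' instead of allocating one large output buffer. The class name 'ChunkedEncodeSketch', its helper methods, and the minimum buffer size of 8 bytes are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class ChunkedEncodeSketch {

    // Encodes text as UTF-8 into sink, reusing one output buffer of bufferSize bytes.
    static void encodeTo(CharSequence text, OutputStream sink, int bufferSize) throws IOException {
        if (bufferSize < 8) {
            throw new IllegalArgumentException("bufferSize must be at least 8 bytes");
        }
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(bufferSize);

        CoderResult result;
        do {
            // Encode as much as fits; OVERFLOW means the output buffer is full.
            result = encoder.encode(in, out, true);
            if (result.isError()) {
                result.throwException(); // surfaces as a CharacterCodingException
            }
            drain(out, sink);
        } while (result.isOverflow());

        do {
            // Let the encoder emit any trailing bytes it still holds.
            result = encoder.flush(out);
            drain(out, sink);
        } while (result.isOverflow());
    }

    // Writes the buffered bytes to the sink and makes the buffer reusable.
    private static void drain(ByteBuffer out, OutputStream sink) throws IOException {
        out.flip();
        sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }

    // Convenience wrapper that collects the encoded bytes in memory.
    static byte[] encodeChunked(String text, int bufferSize) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            encodeTo(text, sink, bufferSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeChunked("Charset \u03A9 Encoder", 8);
        System.out.println(bytes.length + " bytes written");
    }
}
```

When the same encoder is used for several independent inputs, calling 'reset()' between them clears its internal state so that it does not have to be recreated.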