Character SET / ASCII Codes in C Language:
The character set is the set of characters that a programming language such as C or C++ supports. In practice, it is the same as the set of characters supported by the underlying computer system. A computer works on the binary number system, so everything inside it is a number. How, then, can it support characters? Strictly speaking, it doesn't.
To work with characters, we assign a numeric value to every character. For the English alphabet, these codes are standardized, and almost every electronic machine follows the same set. They are called the American Standard Code for Information Interchange, or ASCII codes. They were published by the American National Standards Institute (ANSI) and are also an ISO standard.
This is the reason every electronic device supports the English language by default. For other languages such as Chinese, Japanese, or Hindi, codes are also defined as an ISO standard, and those codes are called Unicode. We will first discuss ASCII codes and then look briefly at Unicode.
ASCII Codes in C Language:
ASCII codes are for the English language. How are the codes defined? Every letter, digit, and symbol has a code. Below is a list of some ASCII codes:

Uppercase letters run from 65 (A) to 90 (Z), lowercase letters from 97 (a) to 122 (z), and digits from 48 (0) to 57 (9).
All the symbols you find on the keyboard form a character set, and every one of them has an ASCII code. That includes special characters such as *, %, $, (, ), [, ], !, and ~.
Generally, we work with letters, digits, and some of the special characters listed above.
You should remember the ASCII codes for uppercase letters, lowercase letters, and digits. A few others are also worth remembering: Enter is 10 (line feed; carriage return is 13), the space bar is 32, and Escape is 27.
It is also important to know where ASCII codes start and end. There are 128 ASCII codes in total, from 0 to 127. To represent any one of them, 7 binary bits are sufficient.
Unicode in C Language:
Now let's discuss Unicode. Unicode covers all languages, so ASCII is a subset of Unicode: English is just one of the languages, and its first 128 codes are the same in both. Unicode was originally designed as a 16-bit (2-byte) code so that it could support all national languages. Each hexadecimal digit represents 4 bits, so a 16-bit code is written as 4 hexadecimal digits (4 × 4 = 16 bits). The standard has since grown beyond 16 bits, and codes outside that range need more than 2 bytes.
A Unicode code is therefore usually written as 4 hexadecimal digits, for example C03A. You can go to the website Unicode.org, where you can find the codes for various languages.
Character Array in C Language:
Now let us understand how a character is represented and what a character array is. Let us see how to declare a character type variable in C and C++:

char is a data type, and we declare a variable named temp. It takes one byte of memory. If we want to store something, we can initialize it with a character such as A. A character constant must be written in single quotes and may contain only a single character:

So, only one character may appear inside the single quotes. Now, what is actually stored in memory? The value 65 is stored, not 'A'. The letter itself is not represented in computer memory. To print this 'A', we simply write:

Here printf receives the value 65, but because the format specifier is %c, it prints A on the screen. If we change it to %d, the specifier for a decimal integer, it displays 65 on the screen.
Character Array:
Now we will create an array of characters. It is declared just like any other array. We will name the array B and give it size 5.

Now let us initialize it.

This is a declaration plus initialization. An array named B is created, and it holds the given letters.

We can also create the array without giving it a size. The same kind of array, of size 5, is created and initialized with these letters. Because we haven't mentioned the size, it is taken from the number of letters we assign.

One more method: with or without the size, we can initialize the array with the ASCII codes of these letters instead of the letters themselves.

We will create one more array of size 5 and give it only two letters. The array now has only 'a' and 'b' stored.
The array size is still five, but only two elements hold valid letters. The remaining elements are set to 0 and are not in use. Next, we will take the same example and explain what strings are.
Strings in C Language:
We want to store a name in an array, so we create an array of characters named boy with size 10 and store 'Rohan' in it:

This is a string: it is used for storing names, words, sentences, or paragraphs. A string is nothing but a set of characters, so the name of a boy, or anything similar, is a string. Now the problem is:

Here the size of the array is 10, but the string has only 5 letters. How do we know where the string ends? That is the important point. When the array is larger than the string stored in it, we need to know which part of the array holds the string.
So, we should either know the length of the string or mark its end point. In C and C++, the end is marked with the null character, '\0'. It is also called the string delimiter, end-of-string marker, or string terminator. In a language like Java, on the other hand, strings do not end with '\0'.
Then how do we know how many letters are valid? In Java, the size of a string is known from its stored length. In C and C++, it is found by searching for the terminating null character, so strings are delimited by '\0'.

Now, this is just an array of characters. How do we make it a string in C / C++? We must also write '\0'.

Now, this becomes a string. Without ‘\0’ it is just an array of characters. This is the difference between an array of characters and a string.
Now let's see the methods for declaring and initializing a string. The above is the 1st method. In the 2nd method, we declare the string without any size, using the same name:

Then what will be the size of this array? It is 6: five elements for the letters of the name, plus one for '\0', which also consumes memory. The next method of declaring and initializing a string is:

We can write the name in double quotes. Only ROHAN is written inside the double quotes, and '\0' is included automatically. This looks better than the previous two methods. One more method of creating a string is:

This is a character pointer. Where is the string created? Not in the heap: we did not call malloc(), and nothing is allocated dynamically. A string literal has static storage duration, so the compiler places it in the program's static data (often a read-only area), and the pointer is made to point there. The arrays created by the earlier methods, when declared inside a function, are created on the stack.

An array is directly accessible to the program by its name. The string literal is accessed indirectly through the pointer y, and it must not be modified through that pointer. Now let us discuss printing a string.

For printing the above string:

Here %s is the format specifier for a string. We just give the name of the array and the string is displayed. Remember that this is not possible for any other type of array, such as an array of int or float. Suppose we want to read a new name; for that, we use scanf:

scanf can also read a string from the keyboard and store its letters in the array, followed by '\0'. Both printf and scanf depend on '\0'. In fact, all the C library functions that are meant for strings depend on '\0'.
Strings in C Language:
A character array, that is, a group or collection of characters, is called a string. When we need to manipulate multiple characters, it is recommended to use strings. Any content within single quotes (' ') is called a character constant, and any content within double quotes (" ") is called a string constant. A character constant always yields an integer value, i.e. the ASCII value of the character. A string constant always yields the base address of the string. A string constant always ends with nul ('\0'). The null character is written as '\0', and its ASCII value is 0.
Syntax: char str[size];
Note: NULL is a macro defined in <stdio.h> (and in <stddef.h>). Its replacement is 0 or (void*)0.
Example: int x=NULL; (accepted cleanly only when NULL is defined as plain 0)
int *ptr=NULL;
nul ('\0') is ASCII character data with an ASCII value of 0.
Declaration of String in C Language:
C does not have a string data type, so strings are represented as character arrays.
Syntax: char string_name[size];
Example: char book[10];
A null character (\0) is added automatically when the compiler assigns a string constant to a character array. So, the size of the array must be at least the number of characters plus 1.
Initialization of String:
We can initialize a String in different ways.
- char str[] = "Cprogramming";
- char str[50] = "Cprogramming";
- char str[] = {'C','p','r','o','g','r','a','m','m','i','n','g','\0'};
- char str[14] = {'C','p','r','o','g','r','a','m','m','i','n','g','\0'};
Memory representation of String in C Language:

Program:
#include <stdio.h>

int main(void)
{
    // declare and initialize string
    char str[] = "Strings";

    // print string
    printf("%s", str);
    return 0;
}
Output: Strings
Properties of Strings in C Language:
- In the declaration of a string, the size must be an integer constant greater than zero.
- In the initialization of a string, if fewer characters are given than the size, the remaining elements are automatically initialized with nul (\0).
- In the initialization of a string, it is not possible to give more characters than the size of the array.
- In the initialization of a string, if we assign a numeric value, the character with that ASCII value is stored.
- In the initialization of a string, specifying the size is optional; in that case, the array gets as many elements as there are initialized characters.
- When working with strings, it is recommended to initialize the data in double quotes.
- A string constant always ends with a '\0' (null) character, so one extra byte of memory is required. A plain character array that is not used as a string does not require the extra byte.
- For character operations, it is recommended to use the %c format specifier.
- For string operations, it is recommended to use the %s format specifier.
- With the %s format specifier, we pass the address of a string; everything from that address up to the null character is printed on the console.
- When a null character occurs in the middle of a string, the complete data is not printed, because the null character indicates the end of the string.
What do you mean by formatted and unformatted functions?
Functions that work with the help of format specifiers are called formatted functions. A formatted function can be applied to any data type. For example: printf(), scanf(), fprintf(), fscanf(), sprintf(), etc.
Functions that do not require a format specifier and apply to a specific data type only are called unformatted functions. For example: puts(), gets(), fputs(), cgets(), getch(), etc. (cgets() and getch() come from the non-standard conio.h).
puts():
It is a predefined unformatted function declared in stdio.h. Using this function, we can print string data on the console. puts() requires one argument of type const char* and returns an integer value. It automatically prints a newline character after the string data.
Syntax: int puts(const char *str);
Example to understand String in C Language:
#include <stdio.h>

int main(void)
{
    char str[] = "Strings";
    puts(str);          // prints the string followed by a newline
    printf("%s", str);  // prints the string without a newline
    return 0;
}
Output:
Strings
Strings
Example to understand String in C Language:
#include <stdio.h>

int main(void)
{
    char str[10];
    printf("Enter a string: ");
    scanf("%9s", str);  // width 9 leaves room for '\0' in str[10]
    printf("input string:%s", str);
    return 0;
}
Output:

With scanf and %s, we cannot read string data properly when the input has multiple words, because scanf treats space, tab, and newline characters as separators: reading stops at the first separator, and \0 is stored after the characters read so far. With the %[^\n] format specifier (a scanset, written without a trailing s), scanf reads the string data up to the next newline character.
gets():
It is a predefined unformatted function declared in stdio.h. With this function we can read string data properly even when it contains multiple words. gets() requires one argument of type (char*) and returns (char*). In gets(), only a newline character is treated as a separator. Note that gets() cannot check the size of the array, so long input overflows it; for this reason it was removed from the language in C11, and fgets() is the usual replacement.
Syntax: char*gets(char*str);
Example to understand String in C Language:
#include <stdio.h>

int main(void)
{
    char str[10];
    printf("Enter a string: ");
    gets(str);  // unsafe: no bounds check; not available in C11 and later
    printf("input string:%s", str);
    return 0;
}
Output:

Example to understand String in C Language:
#include <stdio.h>

int main(void)
{
    char s1[10] = "hello";
    char s2[10] = "welcome";
    puts(s1);
    puts(s2);
    s2 = s1;  // error: an array cannot be assigned with =
    puts(s1);
    puts(s2);
    return 0;
}
Output: the program does not compile. The statement s2=s1; is rejected because an array cannot be assigned with the = operator.

We cannot perform string manipulations directly with operators. When we need to perform any kind of string operation, it is recommended to use the string handling functions or user-defined function logic.
Stringizing Operator (#):
This operator was introduced in ANSI C. It converts a macro argument into a string constant, i.e. the argument text is replaced by the same text within " ". Following is an example.
#include <stdio.h>

#define ABC(xy) printf(#xy "=%d", xy)

int main(void)
{
    int a, b;
    a = 10;
    b = 20;
    ABC(a+b);  // expands to printf("a+b" "=%d", a+b)
    return 0;
}
Output: a+b=30
Token Paste Operator (##):
ANSI C supports this operator. Using it, we can concatenate multiple tokens into a single token. Following is an example.
#include <stdio.h>

#define ABC(x,y) printf("%d", x##y)

int main(void)
{
    int var12 = 120;
    ABC(var, 12);  // var##12 is pasted into the single token var12
    return 0;
}
Output: 120