Machine Learning – Shishir Kant Singh
https://shishirkant.com

Pandas – Get Row Count
https://shishirkant.com/pandas-get-row-count/
Sun, 04 May 2025 15:40:47 +0000

You can get the number of rows in a Pandas DataFrame using len(df.index) or df.shape[0]. The shape attribute reports the dimensions of the DataFrame, and its first element is the row count.

The DataFrame.shape property returns a tuple of (rows, columns). Take the element at index zero, df.shape[0], for the row count, and df.shape[1] for the column count. Alternatively, you can use the DataFrame.count() method, but this is not recommended: it is slower, and it counts only the non-null values in each column, so it can under-report the number of rows.


In this article, I will explain how to get the row count of a DataFrame, with examples.

Key Points –

  • The shape attribute returns a tuple of the form (rows, columns), where the first element represents the number of rows.
  • The len() function can be used to return the number of rows in a DataFrame.
  • len(df.index) is usually marginally faster than df.shape[0], but both are cheap lookups, so the difference rarely matters in practice.
  • When applying filters or conditions, the number of rows can change, and you can use these methods to get the updated count.
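The last point can be illustrated with a minimal sketch. The data below is a hypothetical two-column cut of the sample DataFrame used later in this article, and the threshold of 1000 is an assumption chosen only for the example.

```python
import pandas as pd

# Hypothetical sample data, modelled on the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Discount": [1000, 2300, 1000, 1200, 2500],
})

total_rows = len(df.index)

# Boolean filtering returns a new DataFrame with fewer rows
filtered = df[df["Discount"] > 1000]
filtered_rows = filtered.shape[0]

print("All rows:", total_rows)
print("Rows with Discount > 1000:", filtered_rows)
```

The same counting methods work on the filtered result because it is an ordinary DataFrame.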

1. Quick Examples of Getting the Number of Rows in a DataFrame

If you are in a hurry, below are some quick examples of how to get the number of rows (row count) in a Pandas DataFrame.


# Quick examples of getting the number of rows

# Example 1: Get the row count
# Using len(df.index)
rows_count = len(df.index)

# Example 2: Get the row count
# Using len(df.axes[0])
rows_count = len(df.axes[0])

# Example 3: Get the row count
# Using df.shape[0]
rows_count = df.shape[0]

# Example 4: Get the row count
# Using count(), which counts only non-null values per column
rows_count = df.count().iloc[0]

If you are a Pandas learner, read through the article, as I have explained these examples with sample data to make them easier to understand.

Let's create a Pandas DataFrame from a dictionary of lists, with the column names Courses, Courses Fee, Duration, and Discount.


import pandas as pd
import numpy as np
technologies= {
    'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
    'Courses Fee' :[22000,25000,23000,24000,26000],
    'Duration':['30days','50days','30days', None,np.nan],
    'Discount':[1000,2300,1000,1200,2500]
          }
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

This creates a DataFrame with five rows and four columns; the Duration column holds two missing values (None and NaN).

2. Get Number of Rows in DataFrame

You can use len(df.index) to find the number of rows in a pandas DataFrame. For the sample data, df.index returns RangeIndex(start=0, stop=5, step=1), and passing it to len() gives the count. You can also use len(df), but this is slightly slower than len(df.index) because it makes one more function call internally. Both are usually faster than df.shape[0], although the difference is negligible in practice.

If performance is not a constraint, prefer len(df), as it is neat and easy to read.


# Get the row count using len(df.index)
print(df.index)

# Outputs: 
# RangeIndex(start=0, stop=5, step=1)

print('Row count is:', len(df.index))
print('Row count is:', len(df))

# Outputs:
# Row count is: 5
# Row count is: 5
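To check the performance claim on your own machine, the three approaches can be timed with the standard timeit module. This is only a rough sketch: the 1000-row DataFrame and the 10,000 repetitions are arbitrary assumptions, and the ranking may differ between pandas versions.

```python
import timeit

import pandas as pd

# Arbitrary test data; the size has little effect on these lookups
df = pd.DataFrame({"value": range(1000)})

candidates = {
    "len(df.index)": lambda: len(df.index),
    "len(df)": lambda: len(df),
    "df.shape[0]": lambda: df.shape[0],
}

# Time each approach over the same number of calls
timings = {name: timeit.timeit(func, number=10_000) for name, func in candidates.items()}

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:15s} {seconds:.4f} s for 10,000 calls")
```

All three return the same number, so the choice mostly comes down to readability.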

3. Get Row Count in DataFrame Using len(DataFrame.axes[0])

Pandas also provides the DataFrame.axes property, which returns a list holding the row axis and the column axis of your DataFrame. Access axes[0] and call len(df.axes[0]) to return the number of rows. For the column count, use df.axes[1], for example: len(df.axes[1]).

Here, DataFrame.axes[0] returns the row axis (index), and len() is then used to get the length of that axis, which corresponds to the number of rows in the DataFrame.


# Get the row count using len(df.axes[0])
print(df.axes)

# Output:
# [RangeIndex(start=0, stop=5, step=1), Index(['Courses', 'Courses Fee', 'Duration', 'Discount'], dtype='object')]

print(df.axes[0])

# Output:
# RangeIndex(start=0, stop=5, step=1)

print('Row count is:', len(df.axes[0]))

# Outputs:
# Row count is: 5

4. Using df.shape[0] to Get Rows Count

Pandas DataFrame.shape returns the count of rows and columns; df.shape[0] gives the number of rows and df.shape[1] gives the number of columns.

In the below example, df.shape returns a tuple containing the number of rows and columns in the DataFrame, and df.shape[0] specifically extracts the number of rows. This approach is concise and widely used for obtaining the row count in Pandas DataFrames.


# Get row count using df.shape[0]
df = pd.DataFrame(technologies)
row_count = df.shape[0]  # Returns number of rows
col_count = df.shape[1]  # Returns number of columns
print('Number of rows:', row_count)

# Outputs:
# Number of rows: 5

5. Using df.count() Method

This approach is not recommended because of its performance, but I cover it because it is also one way to get the row count of a DataFrame. Note that count() ignores None and NaN values and reports the number of non-null values in each column. My DataFrame contains 2 None/NaN values in the Duration column, so count() returns 3 instead of 5 for that column in the example below.


# Get count of each column
print(df.count())

# Outputs: 
# Courses        5
# Courses Fee    5
# Duration       3
# Discount       5
# dtype: int64

Now let's see how to get the row count. This works only when the chosen column has no missing values.


# Get count of rows using count()
rows_count = df.count().iloc[0]
rows_count = df[df.columns[0]].count()
print('Number of Rows count is:', rows_count)

# Outputs:
# Number of Rows count is: 5
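The following sketch shows why count() is a risky way to get the row count. It uses a hypothetical two-column cut of the sample data; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Two columns from the sample data; Duration has two missing values
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    "Duration": ["30days", "50days", "30days", None, np.nan],
})

total_rows = len(df)                            # counts every row
non_null_duration = int(df["Duration"].count())  # skips None and NaN
missing_duration = total_rows - non_null_duration

print("Rows:", total_rows)
print("Non-null Duration values:", non_null_duration)
print("Missing Duration values:", missing_duration)
```

Picking the Duration column for count() would report fewer rows than the DataFrame really has, which is why len() or shape is the safer choice.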

Pandas – Cast Column Type
https://shishirkant.com/pandas-cast-column-type/
Sun, 04 May 2025 15:36:45 +0000

While working with a Pandas DataFrame or any table-like data structure, we are often required to change the data type (dtype) of a column, also called type casting; for example, converting from int to string or from string to int. In pandas, you can do this with several methods such as astype(), to_numeric(), convert_dtypes(), and infer_objects(). In this article, I will explain, with examples, how to change or convert data types in a Pandas DataFrame: converting all columns to a specific type, converting one or multiple columns, and converting to numeric types.

Key Points –

  • Apply the astype() method to convert data types directly, specifying the desired dtype.
  • Use the pd.to_numeric() function to convert object columns to numeric types, with options for handling errors and coercing invalid strings.
  • Use the infer_objects() method to automatically infer and convert the types of object columns.
  • Pass a nullable dtype such as "Int64" to astype() when an integer column contains missing values.
  • Use custom functions or mapping techniques for more complex type conversions.

1. Quick Examples of Changing Data Type

Below are some quick examples of converting column data type on Pandas DataFrame.


# Quick examples of converting data types 

# Example 1: Convert all types to best possible types
df2=df.convert_dtypes()

# Example 2: Change All Columns to Same type
df = df.astype(str)

# Example 3: Change Type For One or Multiple Columns
df = df.astype({"Fee": int, "Discount": float})

# Example 4: Ignore errors
df = df.astype({"Courses": int},errors='ignore')

# Example 5: Converts object types to possible types
df = df.infer_objects()

# Example 6: Converts fee column to numeric type
df['Fee'] = pd.to_numeric(df['Fee'])

# Example 7: Convert Fee and Discount to numeric types
df[['Fee', 'Discount']] = df[['Fee', 'Discount']].apply(pd.to_numeric)

Now let's see this with an example. First, create a Pandas DataFrame with the column names Courses, Fee, Duration, and Discount.


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day','40days','35days', '40days','60days','50days','55days'],
    'Discount':[11.8,23.7,13.4,15.7,12.5,25.4,18.4]
    }
df = pd.DataFrame(technologies)
print(df.dtypes)

Yields below output.


# Output:
Courses       object
Fee            int64
Duration      object
Discount     float64
dtype: object

2. DataFrame.convert_dtypes() to Convert Data Type in Pandas

convert_dtypes() has been available on Pandas DataFrame since version 1.0.0. It is a convenient method because it automatically converts each column to the best possible dtype that supports missing values (pd.NA).

Below is the Syntax of the pandas.DataFrame.convert_dtypes().


# Syntax of DataFrame.convert_dtypes
DataFrame.convert_dtypes(infer_objects=True, convert_string=True,
      convert_integer=True, convert_boolean=True, convert_floating=True)

Now, let’s see a simple example.


# Convert all types to best possible types
df2=df.convert_dtypes()
print(df2.dtypes)

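Yields below output. Note that it converted the object columns to the string type and the numeric columns to the nullable Int64 and Float64 types; the exact dtype labels can vary between pandas versions.


# Output:
Courses       string
Fee            Int64
Duration      string
Discount     Float64
dtype: object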

This method is handy when you want to leverage Pandas’ built-in type inference capabilities to automatically convert data types, especially when dealing with large datasets or when you’re unsure about the optimal data type for each column.
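The benefit is easiest to see when a column has missing values. In this minimal sketch (hypothetical three-row data, not the article's DataFrame), a missing fee forces the column to float64, and convert_dtypes() brings it back to an integer type that can hold a missing value.

```python
import pandas as pd

# Hypothetical data: the missing fee makes pandas store Fee as float64
df = pd.DataFrame({
    "Fee": [20000, None, 26000],
    "Courses": ["Spark", "PySpark", None],
})

before = df.dtypes.astype(str).to_dict()

# convert_dtypes() picks nullable extension types where possible
converted = df.convert_dtypes()
after = converted.dtypes.astype(str).to_dict()

print("Before:", before)
print("After: ", after)
```

The missing entry is preserved as pd.NA rather than being filled or dropped.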

3. DataFrame.astype() to Change Data Type in Pandas

In a pandas DataFrame, use the DataFrame.astype() function to convert one or multiple columns to another type; you can also use it to change all columns to the same type. When you call astype() with a single dtype, it changes all columns to that type. To convert specific columns only, you need to name them explicitly.

Below is the syntax of pandas.DataFrame.astype()


# Below is syntax of astype()
DataFrame.astype(dtype, copy=True, errors='raise')

3.1 Change All Columns to Same type in Pandas

df.astype(str) converts all columns of a Pandas DataFrame to strings, as confirmed by printing the data types after the conversion. Each column will be of type object, which is the dtype Pandas traditionally uses for storing strings.


# Change All Columns to Same type
df = df.astype(str)
print(df.dtypes)

Yields below output.


# Output:
Courses      object
Fee          object
Duration     object
Discount     object
dtype: object

3.2 Change Type For One or Multiple Columns in Pandas

To change one or multiple columns, pass astype() a dictionary with the column name as the key and the target type as the value. The example below casts the DataFrame column Fee to int type and Discount to float type.


# Change Type For One or Multiple Columns
df = df.astype({"Fee": int, "Discount": float})
print(df.dtypes)

3.3 Convert Data Type for All Columns in a List

Sometimes you may need to convert a list of DataFrame columns to a specific type, you can achieve this in several ways. Below are 3 different ways that convert columns Fee and Discount to float type.


# Convert data type for all columns in a list
df = pd.DataFrame(technologies)
cols = ['Fee', 'Discount']
df[cols] = df[cols].astype('float')

# By using a loop
for col in ['Fee', 'Discount']:
    df[col] = df[col].astype('float')

# By using apply() & astype() together
df[cols] = df[cols].apply(lambda x: x.astype('float'))

3.4 Raise or Ignore Error when Convert Column type Fails

By default, when you try to change a column to a type that its data does not support, Pandas raises an error. To suppress the error, use the errors parameter, which takes either 'ignore' or 'raise' as its value. In the example below I convert a column that holds string values to int, which is not supported and therefore raises an error; with errors='ignore' the error is suppressed and the original data is returned unchanged.


# Ignores error
df = df.astype({"Courses": int},errors='ignore')

# Generates error
df = df.astype({"Courses": int},errors='raise')
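As an alternative to silencing the error, the failed conversion can be caught explicitly. The sketch below uses hypothetical two-row data in which Fee is stored as text; it is one possible way to handle the failure, not the only one.

```python
import pandas as pd

# Hypothetical data: Fee holds numbers stored as strings
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": ["20000", "25000"],
})

# Converting course names to int cannot succeed, so catch the error
try:
    df.astype({"Courses": int})
    failed = False
except ValueError as exc:
    failed = True
    print("Conversion failed:", exc)

# Converting numeric strings works as expected
converted = df.astype({"Fee": int})
print(converted.dtypes)
```

Catching the exception makes the failure visible, whereas errors='ignore' returns the original data without any signal that nothing was converted.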

4. DataFrame.infer_objects() to Change Data Type in Pandas

Use the DataFrame.infer_objects() method to automatically convert object columns to the type of the data they hold. It checks the data in each object column and converts the column to a more specific dtype where possible. Note that it converts only object columns. For example, if a column with object type holds int or float values, infer_objects() converts it to the respective type.


# Converts object types to possible types
df = pd.DataFrame(technologies)
df = df.infer_objects()
print(df.dtypes)
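The sample DataFrame above has no numeric data hidden in object columns, so the effect is easier to see on data that was deliberately loaded as object. This is a minimal sketch with hypothetical values; forcing dtype="object" stands in for data that arrived untyped.

```python
import pandas as pd

# Force every column to object to imitate untyped input data
df = pd.DataFrame(
    {"Fee": [20000, 25000, 26000], "Courses": ["Spark", "PySpark", "Hadoop"]},
    dtype="object",
)
print(df.dtypes)

# infer_objects() inspects the stored values and picks a better dtype
inferred = df.infer_objects()
print(inferred.dtypes)
```

Only the Fee column changes, because its object values are all integers.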

5. Using pandas.to_numeric() to Convert Numeric Types

pandas.to_numeric() is a top-level pandas function, not a DataFrame method. It is used to convert columns with non-numeric dtypes to the most suitable numeric type.

5.1 Convert Numeric Types

Using pd.to_numeric() is another way to convert a specific column to a numeric type in Pandas. Here's how you can use it to convert the Fee column to a numeric type:


# Converts fee column to numeric type
df['Fee'] = pd.to_numeric(df['Fee'])
print(df.dtypes)

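This code converts the Fee column to a numeric dtype, as confirmed by printing the data types after the conversion. In the sample DataFrame Fee is already int64, so the call matters most when the column holds numbers stored as strings.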

5.2 Convert Multiple Numeric Types using apply() Method

Use to_numeric() along with DataFrame.apply() method to convert multiple columns into a numeric type. The below example converts column Fee and Discount to numeric types.


# Convert Fee and Discount to numeric types
df = pd.DataFrame(technologies)
df[['Fee', 'Discount']] = df[['Fee', 'Discount']].apply(pd.to_numeric)
print(df.dtypes)
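When a column mixes numeric strings with invalid entries, pd.to_numeric() accepts errors="coerce" to turn the invalid entries into NaN. The values below, including the "n/a" placeholder, are hypothetical and chosen only to show the behaviour.

```python
import pandas as pd

# Hypothetical raw data with one entry that is not a number
raw_fees = pd.Series(["20000", "25000", "n/a", "22000"])

# errors="coerce" replaces unparseable values with NaN instead of raising
fees = pd.to_numeric(raw_fees, errors="coerce")

print(fees)
print("Invalid entries:", int(fees.isna().sum()))
```

Because NaN is a float, the resulting column is float64 rather than int64.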

Pandas Drop Rows Based on Column Value
https://shishirkant.com/pandas-drop-rows-based-on-column-value/
Sun, 04 May 2025 15:17:59 +0000

Use the drop() method to delete rows based on a column value in a pandas DataFrame. As part of data cleansing, you are often required to drop rows when a column value matches a static value or the value of another column. In this article, I will explain dropping rows based on column value.

Key Points –

  • Use boolean indexing to build a mask that identifies rows based on conditions on a DataFrame column.
  • The condition inside the boolean indexing can involve any comparison or logical operation.
  • Apply the mask to the DataFrame using the .loc[] indexer or the DataFrame.drop() method.
  • Either assign the result to a new DataFrame or pass inplace=True to modify the original DataFrame, so that it is always clear which object has changed.

Create DataFrame

To run some examples of dropping rows based on column value, let's create a Pandas DataFrame.


# Create pandas DataFrame
import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python"],
    'Fee' :[22000,25000,np.nan,24000],
    'Duration':['30days',None,'55days',np.nan],
    'Discount':[1000,2300,1000,np.nan]
          }
df = pd.DataFrame(technologies)
print("DataFrame:\n", df)

This creates a DataFrame with four rows; the Fee, Duration, and Discount columns each contain missing values.

Delete Rows Using drop()

To delete rows based on specific column values in a Pandas DataFrame, you typically filter the DataFrame using boolean indexing and then reassign the filtered DataFrame back to the original variable or use the drop() method to remove those rows.


# Delete rows using drop()
df.drop(df[df['Fee'] >= 24000].index, inplace = True)
print("Drop rows based on column value:\n", df)

After the drop, only the Spark (Fee 22000) and Hadoop (Fee NaN) rows remain. The Hadoop row is kept because comparing NaN with >= evaluates to False.

In the above example, the drop() method removes the rows where the Fee column is greater than or equal to 24000. We used inplace=True to modify the original DataFrame df.
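Dropping by index and keeping the inverse of the mask are two routes to the same result. The sketch below compares them on a two-column cut of the sample data, without inplace=True so that the original DataFrame stays available.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],
    "Fee": [22000, 25000, np.nan, 24000],
})

# Rows to remove: Fee of 24000 or more (NaN compares as False)
mask = df["Fee"] >= 24000

dropped = df.drop(df[mask].index)  # remove the matching index labels
kept = df[~mask]                   # keep everything that does not match

print(dropped)
print("Same result:", dropped.equals(kept))
```

The inverted-mask form is shorter, while drop() reads more explicitly as a deletion.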

Using loc[]

Using loc[] to drop rows based on a column value involves leveraging the loc[] accessor in pandas to filter rows from a DataFrame according to a condition applied to a specific column, effectively filtering out rows that do not meet the condition.


# Keep only rows where Fee >= 24000 (all other rows are dropped)
df = pd.DataFrame(technologies)
df2 = df[df.Fee >= 24000]
print("Drop rows based on column value:\n", df2)

# Using loc[]
df2 = df.loc[df["Fee"] >= 24000]
print("Drop rows based on column value:\n", df2)

# Output:
#  Drop rows based on column value:
#    Courses      Fee Duration  Discount
# 1  PySpark  25000.0     None    2300.0
# 3   Python  24000.0      NaN       NaN

Delete Rows Based on Multiple Column Values

To delete rows from a DataFrame based on multiple column values, combine the conditions with & (and) or | (or) in a boolean index and keep only the rows that match.


# Delete rows based on multiple column values
df = pd.DataFrame(technologies)
df = df[(df['Fee'] >= 22000) & (df['Discount'] == 2300)]
print("Drop rows based on multiple column values:\n", df)

# Output:
# Drop rows based on multiple column values:
#    Courses      Fee Duration  Discount
# 1  PySpark  25000.0     None    2300.0
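The example above keeps the rows that satisfy both conditions. If the goal is instead to delete exactly those rows, the combined mask can be inverted with ~. A minimal sketch on a three-column cut of the sample data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],
    "Fee": [22000, 25000, np.nan, 24000],
    "Discount": [1000, 2300, 1000, np.nan],
})

# Rows matching both conditions
mask = (df["Fee"] >= 22000) & (df["Discount"] == 2300)

# Invert the mask to delete the matching rows and keep the rest
remaining = df[~mask]
print(remaining)
```

Each condition needs its own parentheses, because & binds more tightly than the comparison operators.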

Delete Rows Based on None or NaN Column Values

When you have None or NaN values in columns, you may need to remove them before you apply some calculations. You can do this using the notnull() function.

Note: You cannot test for None or NaN values with the == or != operators, because NaN never compares equal to anything, including itself.


# Drop rows with None/NaN values
df = pd.DataFrame(technologies)
df2 = df[df.Discount.notnull()]
print("Drop rows based on column value:\n", df2)

# Output:
#  Drop rows based on column value:
#    Courses      Fee Duration  Discount
# 0    Spark  22000.0   30days    1000.0
# 1  PySpark  25000.0     None    2300.0
# 2   Hadoop      NaN   55days    1000.0
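The note about == can be verified directly. This sketch, on a two-column cut of the sample data, compares an equality test with isna() and also shows dropna(subset=...), which is an alternative to the notnull() filter.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],
    "Discount": [1000, 2300, 1000, np.nan],
})

# Comparing with NaN never matches, even for the missing row
eq_mask = df["Discount"] == np.nan

# isna() is the reliable way to find missing values
na_mask = df["Discount"].isna()

# dropna(subset=...) removes rows missing a value in the named columns
cleaned = df.dropna(subset=["Discount"])

print("Matches with ==:", int(eq_mask.sum()))
print("Matches with isna():", int(na_mask.sum()))
print(cleaned)
```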

Using query()

The DataFrame.query() function is primarily used for filtering rows based on a condition, rather than for directly deleting rows. However, you can filter rows using query() and then assign the filtered DataFrame back to the original variable, effectively removing the rows that do not meet the specified condition.


# Delete rows using DataFrame.query()
df = pd.DataFrame(technologies)
df2 = df.query("Courses == 'Spark'")

# Using variables
value = 'Spark'
values = ['Spark', 'PySpark']
df2 = df.query("Courses == @value")

# In place (on a copy, so df stays intact for the examples below)
df3 = df.copy()
df3.query("Courses == 'Spark'", inplace=True)

# Not equals, in & multiple conditions
# (wrap column names that contain spaces in backticks, e.g. `Courses Fee`)
df.query("Courses != 'Spark'")
df.query("Courses in ('Spark','PySpark')")
df.query("Fee >= 23000")
df.query("Fee >= 23000 and Fee <= 24000")

# Other ways to delete rows
df.loc[df['Courses'] == value]
df.loc[df['Courses'] != 'Spark']
df.loc[df['Courses'].isin(values)]
df.loc[~df['Courses'].isin(values)]
df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
df.loc[(df['Discount'] >= 1200) & (df['Fee'] >= 23000)]

df[df["Courses"] == 'Spark']
df[df['Courses'].str.contains("Spark")]
df[df['Courses'].str.lower().str.contains("spark")]
df[df['Courses'].str.startswith("P")]

# Using lambda (applies the same boolean mask to every column)
df.apply(lambda col: col[df['Courses'].isin(['Spark','PySpark'])])
df.dropna()

Based on the Inverse of Column Values

To keep only the rows where the Courses column equals PySpark, delete every row where it does not. The tilde ~ operator inverts the boolean condition.


# Delete rows based on inverse of column values
df = pd.DataFrame(technologies)
df1 = df[~(df['Courses'] == "PySpark")].index
df.drop(df1, inplace = True)
print("Drop rows based on column value:\n", df)

# Output:
# Drop rows based on column value:
#    Courses      Fee Duration  Discount
# 1  PySpark  25000.0     None    2300.0

The above code will drop rows from the DataFrame df where the value in the Courses column is not equal to PySpark. It first finds the index of such rows using the boolean condition and then drops those rows using the drop() method.

Complete Example


import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python"],
    'Fee' :[22000,25000,np.nan,24000],
    'Duration':['30days',None,'55days',np.nan],
    'Discount':[1000,2300,1000,np.nan]
          }
df = pd.DataFrame(technologies)
print(df)

# Using drop() to remove rows
df.drop(df[df['Fee'] >= 24000].index, inplace = True)
print(df)

# Remove rows
df = pd.DataFrame(technologies)
df2 = df[df.Fee >= 24000]
print(df2)

# Reset index after deleting rows
df2 = df[df.Fee >= 24000].reset_index()
print(df2)

# If a column name contains a space,
# wrap the name in quotes inside the brackets,
# for example: df2 = df[df['Courses Fee'] >= 24000]

# Using loc
df2 = df.loc[df["Fee"] >= 24000 ]
print(df2)

# Delete rows based on multiple column values
df2 = df[(df['Fee'] >= 22000) & (df['Discount'] == 2300)]
print(df2)

# Drop rows with None/NaN
df2 = df[df.Discount.notnull()]
print(df2)

Python CGI Programming
https://shishirkant.com/python-cgi-programming/
Thu, 28 Sep 2023 15:11:08 +0000

The Concept of CGI

CGI is an abbreviation for Common Gateway Interface. It is not a language but a set of rules (a specification) that establishes dynamic interaction between a web application and the client application (the browser). CGI programs handle the communication between the web server and the client: whenever the client browser makes a request, it sends it to the web server, and the CGI program returns output to the web server based on the input that the client provides.

Common Gateway Interface (CGI) provides a standard for external gateway programs to interface with information servers such as HTTP servers.

CGI programs generate web pages dynamically, in response to input from the user or by interacting with software on the server.

The Concept of Web Browsing

Have you ever wondered how those blue, underlined pieces of text, commonly known as hyperlinks, are able to take you from one web page or Uniform Resource Locator (URL) to another? What exactly happens when a user clicks on a hyperlink?

Let’s understand the very concept behind web browsing. Web browsing consists of some steps that are as follows:

STEP 1: First, the browser contacts the HTTP server and requests the URL.

STEP 2: The web server then parses the URL.

STEP 3: After that, it looks for the specified filename.

STEP 4: If it finds the file, it sends the file back to the browser.

STEP 5: The web browser receives the response from the web server.

STEP 6: Depending on the server's response, the browser displays either the requested file or an error message.

However, it is possible to set up an HTTP server so that whenever a file in a specific directory is requested, the file is executed as a program rather than sent back. The output of that program is returned to the browser. This mechanism is known as the Common Gateway Interface, abbreviated as CGI. The executed programs are known as CGI scripts, and they can be a C or C++ program, shell script, Perl script, Python script, etc.

The working of CGI

Whenever the client sends a request to the web server, the Common Gateway Interface (CGI) handles the request using external script files. These script files can be written in any language; the chief idea is to retrieve the data efficiently and quickly. The scripts then convert the retrieved data into HTML, which the web server sends back to the client as an HTML-formatted page.

[Figure: architectural diagram representing the working of CGI]

Usage of cgi module

Python provides the cgi module, which contains a number of useful functions and classes. Note that the cgi and cgitb modules were deprecated in Python 3.11 and removed in Python 3.13, so the examples in this article require an older interpreter. The module can be used by importing it in the current program as shown below:

import cgi

Now, we will use cgitb.enable() in our script to install an exception handler that displays a detailed error report in the web browser. The script will look as shown below:

import cgi
import cgitb

cgitb.enable()

Alternatively, we can save the report to a file instead of sending it to the browser, with the help of the following script.

import cgitb
cgitb.enable(display=0, logdir="/path/to/logdir")

The cgitb handler shown above helps throughout script development, because its reports let the user debug the script efficiently. Once the script produces the expected result, the cgitb line can be removed.
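On interpreters where cgitb is no longer available, the same idea of logging errors to a directory can be approximated with the standard traceback module. This is a sketch of a swapped-in technique, not the cgitb implementation; run_with_error_log, broken_handler, and the cgi_errors.log file name are hypothetical names chosen for the example.

```python
import os
import tempfile
import traceback


def run_with_error_log(handler, logdir):
    """Run a request handler; on failure, log the traceback and return an error page."""
    try:
        return handler()
    except Exception:
        log_path = os.path.join(logdir, "cgi_errors.log")
        with open(log_path, "a", encoding="utf-8") as log_file:
            log_file.write(traceback.format_exc())
        # The visitor sees a generic page; details stay in the log
        return "Content-Type: text/html\n\n<h1>Internal error</h1>"


def broken_handler():
    # Deliberately fails to demonstrate the logging path
    return 1 / 0


logdir = tempfile.mkdtemp()
response = run_with_error_log(broken_handler, logdir)
print(response)
```

Keeping the traceback out of the response mirrors the display=0 setting: visitors do not see internal details, while the developer can still read them from the log.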

As mentioned earlier, users submit information through a form. But how can the script obtain that information? To answer this question, let's look at the FieldStorage class of the cgi module. If the form contains non-ASCII characters, pass the encoding keyword parameter, set to the encoding defined for the document. That encoding is usually given in the <META> tag inside the <HEAD> section of the HTML file.

The FieldStorage class is used to read the form data from the environment or the standard input.

A FieldStorage instance can be indexed like a Python dictionary. It supports the in operator, the keys() method, and the built-in len() function. By default it ignores fields whose value is an empty string; to keep such blank values, set the optional keyword parameter keep_blank_values to True when creating the instance.

Let’s see an example:

import cgi

form = cgi.FieldStorage()
if "name" not in form or "add" not in form:
    print("<H1>Input Error!!</H1>")
    print("Please enter the details in the Name and Address fields!")
else:
    print("<p>Name:", form["name"].value)
    print("<p>Address:", form["add"].value)
    # The execution of next lines of code will start here...

In the above snippet of code, we used form["name"], where name is the key, to access the field that the user entered; its value attribute holds the submitted text.

To fetch the string value directly, we can use the getvalue() method. This method also accepts an optional second argument that is returned as the default when the key is not available.

Moreover, if the submitted form has multiple fields with the same name, we should use the form.getlist() function, which returns a list of strings. In the following snippet of code, we read several username fields and join them with commas.

usernames = form.getlist("username")
f_username = ",".join(usernames)

If we want to access the field where a file is uploaded and read that in bytes, we can use the value attribute or the getvalue() method. Let’s see the following snippet of code if the user uploads the file.

file_item = form["userfile"]
if file_item.file:
    # It represents the uploaded file
    count_line = 0
    while True:
        line = file_item.file.readline()
        if not line:
            break
        count_line = count_line + 1

An error can interrupt the program while it reads the content of an uploaded file, for example when the user clicks the Back button or the Cancel button. To detect this case, the FieldStorage class provides the done attribute, which is set to -1.

Furthermore, the items will be objects of the MiniFieldStorage class if the submitted form is in the "old" format. The attributes list, file, and filename are always None in this class.

A form that is submitted with the POST method and also has a query string will contain both FieldStorage and MiniFieldStorage items.

Let's see a list of the FieldStorage attributes in the following table.

FieldStorage Attributes:

S. No. | Attribute | Description
1 | name | The field name.
2 | file | A file(-like) object from which the data can be read as bytes.
3 | filename | The filename on the client side, if the field is a file upload.
4 | type | The content type of the field.
5 | value | The value of the field as a string. For an uploaded file, it reads the file and returns its content as bytes.
6 | headers | A dictionary-like object containing all headers.

In addition to the above, the FieldStorage instance provides several core methods for working with the user's data. Some of them are listed below:

FieldStorage Methods:

S. No. | Method | Description
1 | getfirst() | Returns the first value received for a field.
2 | getvalue() | Works like the dictionary get() method: returns the field value, or a default if the field is missing.
3 | getlist() | Returns the list of values received for a field.
4 | keys() | Works like the dictionary keys() method: returns the field names.
5 | make_file() | Returns a readable and writable temporary file.
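Since the cgi module is absent from recent Python versions, the behaviour of getlist() and getfirst() can be imitated with urllib.parse.parse_qs, which also maps each field name to a list of values. The helper functions below are hypothetical stand-ins written for this sketch, not the FieldStorage methods themselves, and the query string is made up.

```python
from urllib.parse import parse_qs


def getlist(fields, key):
    """Return every value submitted for key, or an empty list."""
    return fields.get(key, [])


def getfirst(fields, key, default=None):
    """Return the first value submitted for key, or default if absent."""
    values = fields.get(key)
    return values[0] if values else default


# A made-up query string with a repeated username field
fields = parse_qs("username=ada&username=grace&city=London")

usernames = ",".join(getlist(fields, "username"))
city = getfirst(fields, "city")
country = getfirst(fields, "country", "unknown")

print(usernames, city, country)
```

Using getfirst() rather than indexing directly avoids surprises when a user submits the same field more than once.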

CGI Program Structure in Python

Let’s understand the structure of a Python CGI Program:

  • The output of a Python CGI script must consist of two sections separated by a blank line.
  • The first section contains headers that tell the client what kind of data follows, and the second section contains the data that is displayed when the script runs.

Let’s have a look at the Python code given below:

print("Content-type: text/html")
print()  # a blank line ends the header section
# now enter the rest of the HTML document
print("<html>")
print("<head>")
print("<title> Welcome to CGI program </title>")
print("</head>")
print("<body>")
print("<h2> Hello World! This is my first CGI program. </h2>")
print("</body>")
print("</html>")

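Now, let's save the above file under a name such as hello.py (avoid naming it cgi.py, which would shadow Python's own cgi module). Once we execute the file through a CGI-enabled web server, we should see an output, as shown below:

Hello World! This is my first CGI program.

The above program is a simple Python script that writes its output to STDOUT, that is, to the screen when run directly; the web server forwards that output to the browser.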
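The two-section structure described above can also be captured in a small helper that joins the headers, a blank line, and the body. build_cgi_response is a hypothetical function written for this sketch, and adding a Content-Length header is an assumption beyond the original example.

```python
def build_cgi_response(body_html, content_type="text/html"):
    """Return the header section, a blank line, and then the body section."""
    headers = [
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body_html.encode('utf-8'))}",
    ]
    # The empty line between the two sections is mandatory
    return "\n".join(headers) + "\n\n" + body_html


response = build_cgi_response("<h2>Hello World!</h2>")

# Split at the first blank line to recover the two sections
header_section, _, body_section = response.partition("\n\n")
print(header_section)
print(body_section)
```

Building the response as one string makes it harder to forget the blank line, which is a common cause of server errors in CGI scripts.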

Understanding the HTTP Header

Several HTTP headers are frequently used in CGI programs. Some of them are listed below:

S. No. | HTTP Header | Description
1 | Content-type | A MIME string that defines the format of the content being returned, for example Content-type: text/html.
2 | Content-length: N | The length, in bytes, of the data being returned. The browser uses this value to report the estimated download time for a file.
3 | Expires: Date | The date on which the information becomes invalid.
4 | Last-modified: Date | The date on which the resource was last modified.
5 | Location: URL | The URL that should be returned instead of the requested URL; it can be used to redirect a request.
6 | Set-Cookie: String | Sets the cookie passed in the string.

The CGI Environment Variables

The web server passes information to a CGI program through a set of predefined environment variables. Some of them are listed in the following table:

S. No. | Environment Variable | Description
1 | CONTENT_TYPE | The data type of the content sent with the request.
2 | CONTENT_LENGTH | The length of the data sent with the request.
3 | HTTP_COOKIE | The cookies sent by the browser, as key and value pairs.
4 | HTTP_USER_AGENT | The type of browser the user is currently using.
5 | REMOTE_HOST | The host name of the client making the request.
6 | PATH_INFO | The extra path information that follows the script name in the URL.
7 | REMOTE_ADDR | The IP address of the visitor.
8 | REQUEST_METHOD | The method used to make the request, most commonly GET or POST.
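A script reads these variables from its environment. The sketch below collects a few of them into a dictionary; read_request is a hypothetical helper, and the fake_environ values are made up so that the example runs without a web server. QUERY_STRING is a standard CGI variable that does not appear in the table above.

```python
import os
from urllib.parse import parse_qs


def read_request(environ):
    """Collect a few CGI environment variables into a plain dictionary."""
    return {
        "method": environ.get("REQUEST_METHOD", "GET"),
        "remote_addr": environ.get("REMOTE_ADDR", "unknown"),
        "user_agent": environ.get("HTTP_USER_AGENT", "unknown"),
        # QUERY_STRING holds the part of the URL after the question mark
        "fields": parse_qs(environ.get("QUERY_STRING", "")),
    }


# Made-up values standing in for what a web server would provide;
# a real script would call read_request(os.environ)
fake_environ = {
    "REQUEST_METHOD": "GET",
    "REMOTE_ADDR": "203.0.113.7",
    "HTTP_USER_AGENT": "ExampleBrowser/1.0",
    "QUERY_STRING": "name=Ada&add=London",
}

request = read_request(fake_environ)
print(request)
```

Passing the environment in as an argument keeps the function easy to test, since no real server variables are needed.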

How to Debug CGI Scripts?

To debug a CGI script, call the test() function of the cgi module from the script. This takes a single statement:

cgi.test()

Pros and Cons of CGI Programming

Some Pros of CGI Programming:

There are numerous pros of using CGI programming. Some of them are as follows:

  • CGI programs are language-independent. They can be written in any programming language.
  • CGI programs are portable and can work on almost any web server.
  • CGI programs can perform any task, whether it is simple or complex.
  • CGI programs take less time to process simple requests.
  • CGI programs can reduce the cost of development and maintenance.
  • CGI programs can be used to add dynamic communication to web applications.

Some Cons of CGI Programming:

There are a few cons of using CGI programming. Some of them are as follows:

  • CGI programs can be complex, which makes them harder to debug.
  • The interpreter has to evaluate the CGI script every time it is invoked. As a result, multiple client requests create a lot of overhead on the server.
  • CGI programs are fairly vulnerable, as they are mostly free and easily available with no server security.
  • CGI uses a lot of processing time.
  • The data is not stored in the cache memory while the page loads.
  • CGI applications often have large codebases, mostly in Perl.
]]>
4312
Python Regex Functions https://shishirkant.com/python-regex-functions/?utm_source=rss&utm_medium=rss&utm_campaign=python-regex-functions Thu, 28 Sep 2023 15:02:27 +0000 https://shishirkant.com/?p=4308

A regular expression is a sequence of characters with a highly specialized syntax that we can use to find or match other characters or groups of characters. Regular expressions, or regex, are widely used in the UNIX world.

Import the re Module

# Importing re module
import re

The re module in Python gives full support for Perl-style regular expressions. The re module raises the re.error exception whenever an error occurs while compiling or using a regular expression.

We’ll go over crucial functions utilized to deal with regular expressions.

But first, a minor point: some characters, called metacharacters, have a special meaning when used in a regular expression.

Most letters and characters simply match themselves. For example, the regular expression ‘check’ will match exactly the string ‘check’. (A case-insensitive mode can be enabled, allowing this RE to match ‘Check’ or ‘CHECK’ as well.)

There are some exceptions to this general rule; certain symbols are special metacharacters that don’t match themselves. Rather, they signal that something out of the ordinary should be matched, or they affect other parts of the RE by repeating them or modifying their meaning.

Metacharacters or Special Characters

As the name suggests, there are some characters with special meanings:

Characters    Meaning
.             Dot – It matches any character except the newline character.
^             Caret – It is used to match the pattern from the start of the string. (Starts with)
$             Dollar – It matches at the end of the string, or just before the newline at the end of the string. (Ends with)
*             Asterisk – It matches zero or more occurrences of a pattern.
+             Plus – It matches one or more occurrences of a pattern.
?             Question mark – It matches zero or one occurrence of a pattern.
{}            Curly Braces – It matches an explicitly specified number of occurrences of a pattern.
[]            Brackets – They define a set of characters.
|             Pipe – It matches either of two defined patterns.

Special Sequences:

The ability to match different sets of characters is the first thing regular expressions can do that is not already possible with ordinary string methods. However, regexes would not be much of an improvement if that were their only extra capability. We can also specify that some portions of the RE must be repeated a certain number of times.

The first metacharacter we’ll examine for repetition is *. Instead of matching the literal character ‘*’, * signals that the preceding character can be matched zero or more times rather than exactly once.

ba*t, for example, matches ‘bt’ (zero ‘a’ characters), ‘bat’ (one ‘a’ character), ‘baaat’ (three ‘a’ characters), etc.

Repetitions such as * are greedy: the matching algorithm attempts to repeat the preceding element as many times as possible. If later elements of the pattern fail to match, the matching algorithm backs up and retries with fewer repetitions.
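
A short sketch of the repetition behavior described above. The sample strings and the HTML-like text are illustrative assumptions; the non-greedy form *? is standard re syntax, but it is not covered in the text above.

```python
import re

# '*' applies to the preceding 'a': zero or more occurrences are allowed.
candidates = ["bt", "bat", "baaat", "bet"]
matched = [c for c in candidates if re.fullmatch(r"ba*t", c)]
print(matched)

# Greedy vs. non-greedy: '.*' grabs as much as it can, '.*?' as little as it can.
text = "<b>bold</b> and <i>italic</i>"
greedy = re.search(r"<.*>", text).group()
lazy = re.search(r"<.*?>", text).group()
print(greedy)
print(lazy)
```

The greedy search runs from the first ‘<’ to the last ‘>’ in the text, while the non-greedy one stops at the first ‘>’ it can reach.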

Special Sequences consist of ‘\’ followed by a character listed below. Each character has a different meaning.

Character    Meaning
\d           It matches any digit and is equivalent to [0-9].
\D           It matches any non-digit character and is equivalent to [^0-9].
\s           It matches any whitespace character and is equivalent to [ \t\n\r\f\v].
\S           It matches any character except a whitespace character and is equivalent to [^ \t\n\r\f\v].
\w           It matches any alphanumeric character or underscore and is equivalent to [a-zA-Z0-9_] for ASCII text.
\W           It matches any character that is not alphanumeric or an underscore and is equivalent to [^a-zA-Z0-9_] for ASCII text.
\A           It matches the defined pattern only at the start of the string.
\b           r"\bxt" – It matches the pattern at the beginning of a word in a string.
             r"xt\b" – It matches the pattern at the end of a word in a string.
\B           This is the opposite of \b; it matches where there is no word boundary.
\Z           It matches the pattern only at the end of the string.
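
A few of these sequences in action, as a small sketch. The sample sentence and variable names are made up for illustration.

```python
import re

text = "Order 66 shipped to box_7 next to the textbook"

digits = re.findall(r"\d+", text)        # runs of digits
words = re.findall(r"\w+", text)         # runs of word characters (includes '_')
starts = re.findall(r"\bte\w*", text)    # words beginning with 'te'
ends = re.findall(r"\w*xt\b", text)      # words ending with 'xt'

print(digits)
print(words)
print(starts)
print(ends)
```

Note that ‘textbook’ contains ‘xt’ but does not end with it, so only ‘next’ satisfies the trailing \b.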

RegEx Functions:

  • compile – It is used to turn a regular expression pattern into a regular expression object that can be used in a number of ways for matching patterns in a string.
  • search – It is used to find the first occurrence of a regex pattern in a given string.
  • match – It starts matching the pattern at the beginning of the string.
  • fullmatch – It is used to match the whole string with a regex pattern.
  • split – It is used to split the string based on the regex pattern.
  • findall – It is used to find all non-overlapping matches of a pattern in a string. It returns a list of the matches.
  • finditer – It returns an iterator that yields match objects.
  • sub – It returns a string after substituting occurrences of the pattern with the replacement (all occurrences by default).
  • subn – It works the same as ‘sub’, but it returns a tuple (new_string, num_of_substitutions).
  • escape – It is used to escape special characters in a pattern.
  • purge – It is used to clear the regular expression cache.

1. re.compile(pattern, flags=0)

It is used to create a regular expression object that can be used to match patterns in a string.

Example:

# Importing re module  
import re  

# Defining regEx pattern  
pattern = "amazing"  

# Creating a regEx object
regex_object = re.compile(pattern)  

# String  
text = "This tutorial is amazing!"   

# Searching for the pattern in the string  
match_object = regex_object.search(text)  

# Output  
print("Match Object:", match_object)  


Output:
Match Object: <re.Match object; span=(17, 24), match='amazing'>

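Compiling a pattern and then calling a method on the resulting object:

re_obj = re.compile(pattern)
result = re_obj.search(string)

is equivalent to calling the module-level function directly:

result = re.search(pattern, string)

Note – When the same regular expression is used several times, the re.compile() version of the program can be more efficient, because the compiled object is reused.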
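
As a minimal sketch of that reuse, a pattern can be compiled once and applied to many strings. The list of lines and the re.IGNORECASE flag are assumptions added for illustration.

```python
import re

# Compile once, then reuse the same object for every line.
word_re = re.compile(r"amazing", re.IGNORECASE)

lines = ["This tutorial is amazing!", "Nothing to see here", "AMAZING results"]
hits = [line for line in lines if word_re.search(line)]
print(hits)
```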

2. re.match(pattern, string, flags=0)

  • It starts matching the pattern from the beginning of the string.
  • Returns a match object if a match is found, with information such as start, end, span, etc.
  • Returns None if no match is found.

Parameters

  • pattern:- The expression that is to be matched. It must be a regular expression.
  • string:- The string that is compared to the pattern, starting at the beginning of the string.
  • flags:- Optional modifiers; bitwise OR (|) can be used to combine multiple flags.

Example:

# Importing re module  
import re  

# Our pattern  
pattern = "hello"  

# Returns a match object if found, else None
match = re.match(pattern, "hello world")  

print(match) # Printing the match object  
print("Span:", match.span()) # Return the tuple (start, end)  
print("Start:", match.start()) # Return the starting index  
print("End:", match.end()) # Returns the ending index  

Output:
<re.Match object; span=(0, 5), match='hello'>
Span: (0, 5)
Start: 0
End: 5

Another example of the re.match() method in Python:

  • In the pattern below, “.w*” means any single character followed by zero or more literal “w” characters. It is not the word class “\w”, which needs a backslash.
  • Because re.match() only matches at the beginning of the string, and the first character “L” is followed by “e” rather than by a “w” or a space, the match fails and the else branch runs.

CODE:

import re    
line = "Learn Python through tutorials on shishirkant"  
match_object = re.match( r'.w* (.w?) (.w*?)', line, re.M|re.I)

if match_object:    
    print ("match object group : ", match_object.group())   
    print ("match object 1 group : ", match_object.group(1))
    print ("match object 2 group : ", match_object.group(2))  
else:    
    print ( "There isn't any match!!" )   

Output:
There isn't any match!!
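
The example above finds no match because “.w” is not the word class “\w”. As a sketch, assuming the intent was to capture individual words, a variant using \w+ could look like this (the choice of capturing the second and third words is an assumption):

```python
import re

line = "Learn Python through tutorials on shishirkant"

# \w+ matches a run of word characters; the parentheses capture the 2nd and 3rd words.
match_object = re.match(r"\w+ (\w+) (\w+)", line)

if match_object:
    whole = match_object.group()     # text matched by the entire pattern
    second = match_object.group(1)   # first captured group
    third = match_object.group(2)    # second captured group
    print(whole, "|", second, "|", third)
else:
    print("There isn't any match!!")
```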

3. re.search(pattern, string, flags=0)

The re.search() function looks for the first location where the regular expression matches and returns it. Unlike Python’s re.match(), it scans the whole string rather than only its beginning. If the pattern is matched, the re.search() function returns a match object; otherwise, it returns None.

To use the search() function, we must first import the Python re module. The pattern and the string to search are then passed to the re.search() call.

Here is the description of the parameters –

pattern:- This is the expression that is to be matched. It must be a regular expression.

string:- The string that will be searched for the pattern anywhere within it.

flags:- Optional modifiers such as re.I and re.M. Bitwise OR (|) can be used to combine multiple flags.

Code

import re  

line = "Learn Python through tutorials on shishirkant"

search_object = re.search( r' .*t? (.*t?) (.*t?)', line) 
if search_object:  
    print("search object group : ", search_object.group())  
    print("search object group 1 : ", search_object.group(1)) 
    print("search object group 2 : ", search_object.group(2)) 
else:  
    print("Nothing found!!")  

Output:
search object group : Python through tutorials on shishirkant 
search object group 1 : on 
search object group 2 : shishirkant
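
The flags parameter described above can change what a pattern matches. A small sketch using two standard flags, re.I (ignore case) and re.M (multi-line), on a made-up three-line string:

```python
import re

text = "first line\nPython rocks\npython again"

# Without flags, '^python' only matches at the very start of the string, case-sensitively.
no_flags = re.findall(r"^python", text)

# re.I ignores case; re.M lets '^' match at the start of every line.
with_flags = re.findall(r"^python", text, flags=re.I | re.M)

print(no_flags)
print(with_flags)
```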

4. re.sub(pattern, repl, string, count=0, flags=0)

  • It substitutes occurrences of the matching pattern in the string with ‘repl’.
  • pattern – the regex pattern to be matched.
  • repl – stands for “replacement”; it replaces the pattern in the string.
  • count – controls the maximum number of substitutions. The default, 0, replaces all occurrences.
Example 1:

# Importing re module  
import re  

# Defining parameters  
pattern = "like" # to be replaced  
repl = "love" # Replacement  
text = "I like Shishirkant!" # String 

# Returns a new string with a substituted pattern 
new_text = re.sub(pattern, repl, text)  

# Output  
print("Original text:", text)  
print("Substituted text: ", new_text)  

Output:
Original text: I like Shishirkant! 
Substituted text: I love Shishirkant!

In the above example, the sub() function replaces ‘like’ with ‘love’.

Example 2 – Substituting 3 occurrences of a pattern.

# Importing re package  
import re  

# Defining parameters  
pattern = "l" # to be replaced  
repl = "L" # Replacement  
text = "I like Shishirkant! I also like tutorials!" # String  

# Returns a new string with the substituted pattern  
new_text = re.sub(pattern, repl, text, count=3)  # pass count by keyword; positional use is deprecated since Python 3.13

# Output  
print("Original text:", text)  
print("Substituted text:", new_text)  

Output:
Original text: I like Shishirkant! I also like tutorials! 
Substituted text: I Like Shishirkant! I aLso Like tutorials!

Here, the first three occurrences of ‘l’ are substituted with ‘L’.
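
The sketch below contrasts the default count with count=1 and also shows that repl may refer to captured groups, which is standard re.sub() behavior. The ‘user@example’ string is a made-up input.

```python
import re

text = "I like Shishirkant! I also like tutorials!"

# count=0 (the default) replaces every occurrence.
all_replaced = re.sub(r"like", "love", text)

# count=1 stops after the first replacement.
first_only = re.sub(r"like", "love", text, count=1)

# repl may reference captured groups: \1 and \2 are the first and second groups.
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@example")

print(all_replaced)
print(first_only)
print(swapped)
```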

5. re.subn(pattern, repl, string, count=0, flags=0)

  • subn works the same way as the sub() function.
  • It returns a tuple (new_string, num_of_substitutions)

Example:

# Importing re module  
import re  

# Defining parameters  
pattern = "l" # to be replaced  
repl = "L" # Replacement  
text = "I like Shishirkant! I also like tutorials!" # String  

# Returns a new string with the substituted pattern  
new_text = re.subn(pattern, repl, text, count=3)  # pass count by keyword; positional use is deprecated since Python 3.13

# Output  
print("Original text:", text)  
print("Substituted text:", new_text) 

Output:
Original text: I like Shishirkant! I also like tutorials! 
Substituted text: ('I Like Shishirkant! I aLso Like tutorials!', 3)

    In the above program, the subn function replaces the first three occurrences of ‘l’ with ‘L’ in the string.

    6. re.fullmatch(pattern, string, flags=0)

    • It matches the whole string with the pattern.
    • Returns a corresponding match object.
    • Returns None in case no match is found.
    • On the other hand, the search() function will only search the first occurrence that matches the pattern.

    Example:

    # Importing re module  
    import re   
    # Sample string  
    line = "Hello world"
    
    # Using re.fullmatch()  
    print(re.fullmatch("Hello", line))  
    print(re.fullmatch("Hello world", line))  

    Output:

    None
    <re.Match object; span=(0, 11), match='Hello world'>

    In the above program, only the pattern ‘Hello world’ matches the whole string; ‘Hello’ does not.

    Q. When to use re.findall()?

    Ans. Use Python’s re.findall() function when you want every occurrence of a pattern in a piece of text, not just the first one. It scans the entire string and returns a list of all matches.
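
A brief sketch of re.findall() on a made-up sentence. When the pattern contains one capturing group, findall() returns only the captured part of each match.

```python
import re

line = "Call 555-1234 or 555-9876 before 5 pm"

# Every non-overlapping match of the whole pattern.
numbers = re.findall(r"\d{3}-\d{4}", line)

# With one capturing group, only the group's text is returned for each match.
suffixes = re.findall(r"\d{3}-(\d{4})", line)

print(numbers)
print(suffixes)
```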

    7. re.finditer(pattern, string, flags=0)

    • Returns an iterator that yields all non-overlapping matches of pattern in a string.
    • String is scanned from left to right.
    • Returning matches in the order they were discovered

    # Importing re module  
    import re   
    
    # Sample string  
    line = "Hello world. I am Here!"
    
    # Regex pattern  
    pattern = r'[aeiou]'  
    
    # Using re.finditer()  
    iter_ = re.finditer(pattern, line)  
    
    # Iterating over iter_
    for i in iter_:  
        print(i)  

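    Output:
    <re.Match object; span=(1, 2), match='e'>
    <re.Match object; span=(4, 5), match='o'>
    <re.Match object; span=(7, 8), match='o'>
    <re.Match object; span=(15, 16), match='a'>
    <re.Match object; span=(19, 20), match='e'>
    <re.Match object; span=(21, 22), match='e'>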

    8. re.split(pattern, string, maxsplit=0, flags=0)

    • It splits the string by the occurrences of the pattern.
    • If maxsplit is zero (the default), there is no limit on the number of splits.
    • If maxsplit is one, it splits the string at the first occurrence of the pattern and returns the remainder of the string as the final element.

    Example:

    # Import re module  
    import re    
    
    # Pattern  
    pattern = ' '  
    
    # Sample string  
    line = "Learn Python through tutorials on shishirkant"    
    
    # Using split function to split the string after ' '  
    result = re.split( pattern, line)   
    
    # Printing the result  
    print("When maxsplit = 0, result:", result)  
    
    # When Maxsplit is one  
    result = re.split(pattern, line, maxsplit=1)  
    print("When maxsplit = 1, result =", result)
    
    Output:
    When maxsplit = 0, result: ['Learn', 'Python', 'through', 'tutorials', 'on', 'shishirkant'] 
    When maxsplit = 1, result = ['Learn', 'Python through tutorials on shishirkant']
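
The advantage of re.split() over str.split() shows when the separator is itself a pattern. A small sketch with a made-up string that mixes commas, semicolons, and spaces:

```python
import re

line = "apple, banana;cherry  date"

# Split on any run of commas, semicolons, or whitespace.
parts = re.split(r"[,;\s]+", line)

# maxsplit=2 performs at most two splits; the rest stays in the last element.
limited = re.split(r"[,;\s]+", line, maxsplit=2)

print(parts)
print(limited)
```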

    9. re.escape(pattern)

    • It escapes the special characters in the pattern.
    • The escape function becomes more important when the string contains regular expression metacharacters.

    Example:

    # Import re module  
    import re    
    
    # Pattern  
    pattern = 'https://www.shishirkant.com/'  
    
    # Using escape function to escape metacharacters  
    result = re.escape( pattern)   
      
    # Printing the result  
    print("Result:", result)

      Output:
      Result: https://www\.shishirkant\.com/

      The escape function escapes the metacharacter ‘.’ in the pattern. This is useful when you want metacharacters to be treated as regular characters that match themselves.
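
A short sketch of why escaping matters when searching for literal text. The sample values are invented for illustration.

```python
import re

needle = "3.5"
text = "versions: 3x5, 3.5, 305"

unescaped = re.findall(needle, text)             # '.' matches any character
escaped = re.findall(re.escape(needle), text)    # '.' matches only a literal dot

print(unescaped)
print(escaped)
```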

      10. re.purge()

      • The purge function takes no arguments; it simply clears the regular expression cache.

      Example:

      # Importing re module  
      import re  
      
      # Define some regular expressions  
      pattern1 = r'\d+'  
      pattern2 = r'[a-z]+'  
      
      # Use the regular expressions  
      print(re.search(pattern1, '123abc'))  
      print(re.search(pattern2, '123abc'))  
      
      # Clear the regular expression cache  
      re.purge()  
      
      # Use the regular expressions again  
      print(re.search(pattern1, '456def'))  
      print(re.search(pattern2, '456def'))

          Output:
          <re.Match object; span=(0, 3), match='123'>
          <re.Match object; span=(3, 6), match='abc'>
          <re.Match object; span=(0, 3), match='456'>
          <re.Match object; span=(3, 6), match='def'>

          • First, pattern1 and pattern2 are used to search for matches in the string ‘123abc’.
          • Then the cache is cleared using re.purge().
          • pattern1 and pattern2 are used again to search for matches in the string ‘456def’.
          • Because the regular expression cache has been cleared, the regular expressions are recompiled, and the search in ‘456def’ is performed with the new regular expression objects.

          Matching Versus Searching – re.match() vs. re.search()

          Python has two primary regular expression functions: match and search. The match function looks for a match only where the string starts, whereas the search function looks for a match everywhere in the string.

          CODE:

          # Import re module  
          import re
          
          # Sample string  
          line = "Learn Python through tutorials on shishirkant" 
          
          # Using match function to match 'through'
          match_object = re.match( r'through', line, re.M|re.I)
          if match_object:    
              print("match object group : ", match_object)    
          else:    
              print( "There isn't any match!!")    
          
          # using search function to search  
          search_object = re.search( r'through', line, re.M|re.I)    
          if search_object:    
              print("Search object group : ", search_object)    
          else:    
              print("Nothing found!!")  
          
          Output:
          There isn't any match!!
          Search object group :  <re.Match object; span=(13, 20), match='through'>

          The match function checks whether the string is starting with ‘through’ or not, and the search function checks whether there is ‘through’ in the string or not.

          ]]>
          4308
          Python Regular Expressions – I https://shishirkant.com/python-regular-expressions-i/?utm_source=rss&utm_medium=rss&utm_campaign=python-regular-expressions-i Thu, 28 Sep 2023 06:06:58 +0000 https://shishirkant.com/?p=4305

          Introduction to the Python regular expressions

          Regular expressions (called regex or regexp) specify search patterns. Typical examples of regular expressions are the patterns for matching email addresses, phone numbers, and credit card numbers.

          Regular expressions are essentially a specialized programming language embedded in Python. And you can interact with regular expressions via the built-in re module in Python.

          The following shows an example of a simple regular expression:

          '\d'

          In this example, a regular expression is a string that contains a search pattern. The '\d' is a digit character set that matches any single digit from 0 to 9.

          Note that you’ll learn how to construct more complex and advanced patterns in the next tutorials. This tutorial focuses on the functions that deal with regular expressions.

          To use this regular expression, you follow these steps:

          First, import the re module:

          import re

          Second, compile the regular expression into a Pattern object:

          p = re.compile(r'\d')  # raw string, so Python does not warn about the '\d' escape

          Third, use one of the methods of the Pattern object to match a string:

          s = "Python 3.10 was released on October 04, 2021" 
          result = p.findall(s) 
          
          print(result)

          Output:

          ['3', '1', '0', '0', '4', '2', '0', '2', '1']

          The findall() method returns a list of single digits in the string s.

          The following shows the complete program:

          import re 
          
          p = re.compile(r'\d')  # raw string, so Python does not warn about the '\d' escape
          s = "Python 3.10 was released on October 04, 2021" 
          
          results = p.findall(s) 
          print(results)

          Besides the findall() method, the Pattern object has other essential methods that allow you to match a string:

          Method        Purpose
          match()       Find the pattern at the beginning of a string
          search()      Return the first match of a pattern in a string
          findall()     Return all matches of a pattern in a string
          finditer()    Return all matches of a pattern as an iterator

          Python regular expression functions

          Besides the Pattern class, the re module has some functions that match a string for a pattern:

          • match()
          • search()
          • findall()
          • finditer()

          These functions have the same names as the methods of the Pattern object. Also, they take the same arguments as the corresponding methods of the Pattern object. However, you don’t have to manually compile the regular expression before using it.

          The following example shows the same program that uses the findall() function instead of the findall() method of a Pattern object:

          import re 
          
          s = "Python 3.10 was released on October 04, 2021." 
          results = re.findall(r'\d', s)
          print(results)

          Using the functions in the re module is more concise than the methods of the Pattern object because you don’t have to compile regular expressions manually.

          Under the hood, these functions create a Pattern object and call the appropriate method on it. They also store the compiled regular expression in a cache for speed optimization.

          This means that when you use the same regular expression a second time, these functions do not need to recompile it. Instead, they get the compiled regular expression from the cache.

          Should you use the re functions or methods of the Pattern object?

          If you use a regular expression within a loop, the Pattern object may save a few function calls. However, if you use it outside of loops, the difference is negligible due to the internal cache.

          The following sections discuss the most commonly used functions in the re module, including search(), match(), and fullmatch().

          search() function

          The search() function searches for a pattern within a string. If there is a match, it returns the first Match object or None otherwise. For example:

          import re 
          
          s = "Python 3.10 was released on October 04, 2021." 
          pattern = r'\d{2}'
          match = re.search(pattern, s)
          print(type(match))
          print(match)

          Output:

          <class 're.Match'>
          <re.Match object; span=(9, 11), match='10'>

          In this example, the search() function returns the first two digits in the string s as the Match object.

          Match object

          The Match object provides the information about the matched string. It has the following important methods:

          Method     Description
          group()    Return the matched string
          start()    Return the starting position of the match
          end()      Return the ending position of the match
          span()     Return a tuple (start, end) that specifies the positions of the match

          The following example examines the Match object:

          import re 
          
          s = "Python 3.10 was released on October 04, 2021." 
          result = re.search(r'\d', s)

          print('Matched string:', result.group())
          print('Starting position:', result.start())
          print('Ending position:', result.end())
          print('Positions:', result.span())

          Output:

          Matched string: 3 
          Starting position: 7 
          Ending position: 8 
          Positions: (7, 8)
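
As a further sketch, the same Match methods can be combined with capturing groups. The pattern below, which captures the major and minor parts of the version number, is an assumption added for illustration; group(1) and group(2) are standard Match methods.

```python
import re

s = "Python 3.10 was released on October 04, 2021."
result = re.search(r"(\d+)\.(\d+)", s)

version = result.group()    # the whole match
major = result.group(1)     # first captured group
minor = result.group(2)     # second captured group
span = result.span()        # (start, end) of the whole match

# start() and end() can be used to slice the original string.
sliced = s[result.start():result.end()]

print(version, major, minor, span, sliced)
```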

          match() function

          The match() function returns a Match object if it finds a pattern at the beginning of a string. For example:

          import re 
          
          l = ['Python', 
               'CPython is an implementation of Python written in C', 
               'Jython is a Java implementation of Python', 
                'IronPython is Python on .NET framework'] 
          
          pattern = r'\wython'
          for s in l: 
              result = re.match(pattern,s) 
              print(result)

          Output:

          <re.Match object; span=(0, 6), match='Python'> 
          None 
          <re.Match object; span=(0, 6), match='Jython'> 
          None

          In this example, \w is the word character set that matches any single word character (a letter, a digit, or an underscore).

          The pattern \wython matches any string that starts with a single word character followed by the literal string ython, for example, Python.

          Since the match() function only finds the pattern at the beginning of a string, the following strings match the pattern:

          Python 
          Jython is a Java implementation of Python

          And the following strings don’t match:

          'CPython is an implementation of Python written in C' 
          'IronPython is Python on .NET framework'

          fullmatch() function

          The fullmatch() function returns a Match object if the whole string matches a pattern or None otherwise. The following example uses the fullmatch() function to match a string with four digits:

          import re 
          
          s = "2021" 
          pattern = r'\d{4}'
          result = re.fullmatch(pattern, s) 
          print(result)

          Output:

          <re.Match object; span=(0, 4), match='2021'>

          The pattern '\d{4}' matches a string with four digits. Therefore, the fullmatch() function returns a Match object for the string 2021.

          If the string contains anything besides the four digits, for example when 2021 appears in the middle or at the end of a longer string, fullmatch() returns None. For example:

          import re 
          
          s = "Python 3.10 released in 2021" 
          pattern = r'\d{4}'
          result = re.fullmatch(pattern, s) 
          print(result)

          Output:

          None
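
To see the three functions side by side, the following sketch applies the same four-digit pattern to a few made-up strings and records whether match(), search(), and fullmatch() succeed.

```python
import re

pattern = r"\d{4}"
samples = ["2021", "2021 release", "released in 2021"]

# For each sample: (match at start?, match anywhere?, match whole string?)
results = {
    s: (
        bool(re.match(pattern, s)),
        bool(re.search(pattern, s)),
        bool(re.fullmatch(pattern, s)),
    )
    for s in samples
}

for s, outcome in results.items():
    print(repr(s), outcome)
```

Only the string that consists of exactly four digits passes all three checks.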

          Regular expressions and raw strings

          It’s important to note that Python and regular expressions are different languages, each with its own syntax.

          The re module is the interface between Python and the regular expression language. It behaves like an interpreter between them.

          To construct a pattern, regular expressions often use a backslash '\', for example \d and \w. But this collides with Python’s own use of the backslash as an escape character in string literals.

          For example, suppose you need to match the following string:

          s = '\section'

          In a regular expression, the backslash (\) is a special character. To match a literal backslash, you need to escape it by preceding it with another backslash (\):

          pattern = '\\section'

          So the regular expression must be '\\section'. However, to express this pattern in a regular string literal in Python, you need to use two more backslashes, because each backslash must itself be escaped:

          pattern = '\\\\section'

          Simply put, to match a literal backslash ('\'), you have to write '\\\\' because the regular expression must be '\\' and each backslash must be expressed as '\\' inside a string literal in Python.

          This results in lots of repeated backslashes. Hence, it makes the regular expressions difficult to read and understand.

          A solution is to use the raw strings in Python for regular expressions because raw strings treat the backslash (\) as a literal character, not a special character.

          To turn a regular string into a raw string, you prefix it with the letter r or R. For example:

          import re 
          
          s = '\section' 
          pattern = r'\\section' 
          result = re.findall(pattern, s) 
          
          print(result) 
          Output:

          ['\\section']

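          Note that in Python ‘\section’ and ‘\\section’ are the same string, because \s is not a recognized escape sequence in a string literal (recent Python versions emit a warning for such unrecognized escapes):

          p1 = '\\section' 
          p2 = '\section' 
          
          print(p1==p2) # True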

          In practice, you’ll find the regular expressions constructed in Python using the raw strings.
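
The difference between the two spellings can be checked directly. In this sketch the variable names are illustrative; the raw and regular literals produce the same pattern string.

```python
import re

plain = "\\\\section"   # regular literal: four backslashes in source, two in the string
raw = r"\\section"      # raw literal: two backslashes in source, two in the string

same = plain == raw
length = len(raw)       # 2 backslashes + 7 letters

target = "\\section"    # the text to find: one backslash followed by 'section'
found = re.findall(raw, target)

print(same, length, found)
```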

          Summary

          • A regular expression is a string that contains the special characters for matching a string with a pattern.
          • Use the Pattern object or functions in re module to search for a pattern in a string.
          • Use raw strings to construct regular expression to avoid escaping the backslashes.
          ]]>
          4305
          Nested While Loop in C https://shishirkant.com/nested-while-loop-in-c/?utm_source=rss&utm_medium=rss&utm_campaign=nested-while-loop-in-c Sun, 03 Sep 2023 04:10:33 +0000 https://shishirkant.com/?p=4077

          Nested While Loop in C Programming Language:

          Writing a while loop inside another while loop is called a nested while loop. That is why nested loops are also called “loops inside the loop”. Loops can be nested to any depth, in any combination of the three loop types (for, while, and do-while), depending on the complexity of the given problem.

          In practice, when the loop body itself needs to be repeated n times, we use nested loops. The maximum nesting depth depends on the compiler; the C99 standard only requires an implementation to support at least 127 nesting levels of blocks.

          Nested While Loop Syntax in C Language:

          Following is the syntax to use the nested while loop in C language.

          [Image: Nested While Loop in C Programming Language]

          Note: In the nested while loop, the number of iterations will be equal to the number of iterations in the outer loop multiplied by the number of iterations in the inner loop which is almost the same as the nested for loop. Nested while loops are mostly used for making various pattern programs in C like number patterns or shape patterns.

          Execution Flow of Nested While Loop in C Language:

          The outer while loop executes based on the outer condition and the inner while loop executes based on the inner condition. Now let us understand how the nested while loop executes. First, it checks the outer loop condition; if the outer loop condition fails, the loop terminates.

          If the outer loop condition is true, control enters the loop and first executes the outer loop statements that come before the inner loop. Then it checks the inner loop condition. If the inner while condition is true, control moves inside and executes the inner while loop statements. After executing them, it checks the inner while loop condition again and repeats this process as long as the condition is true. Once the inner while loop condition fails, control moves outside and executes the statements that come after the inner while loop. It then checks the outer while loop condition again, and if it is true, the same process repeats.

          So, the nested loop terminates when the outer while loop condition becomes false.

          Flow Chart of Nested While Loop:

          Please have a look at the following diagram, which represents the flow chart of the nested while loop.

          [Image: Flow Chart of Nested While Loop]

          The flow starts by checking the outer while loop condition. If the outer while loop condition fails, the flow ends. If the outer loop condition is true, it first executes the outer while loop statements, if any. After that, it checks the inner while loop condition. If the inner while loop condition is true, the inner while loop statements are executed. After executing them, it checks the inner while loop condition again, and this inner loop process repeats as long as the condition is true. When the inner while loop condition is false, the remaining outer loop statements are executed. Once they have been executed, it checks the outer while condition again. This is the flow of the nested while loop.

          Example: Write a program (WAP) to print the following format, in which row i lists the numbers 1 to i, for i from 1 up to the number entered.
          Nested While Loop in C Programming Language with Examples
          Program:
          #include <stdio.h>
          int main(void)
          {
              int i, n, in;
              printf("ENTER A NUMBER ");
              scanf("%d", &n);
              i = 1;
              while (i <= n)          /* outer loop: one pass per row */
              {
                  printf("\n");
                  in = 1;
                  while (in <= i)     /* inner loop: prints 1 to i */
                  {
                      printf("%d ", in);
                      in = in + 1;
                  }
                  i = i + 1;
              }
              return 0;
          }

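          Example: Write a program (WAP) to print the following format, in which each row counts down from i to 1, with i running from the number entered down to 1: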
          Nested While Loop in C Language with Examples
          Program:
          #include <stdio.h>
          int main(void)
          {
              int i, n, dn;
              printf("ENTER A NUMBER ");
              scanf("%d", &n);
              i = n;
              while (i >= 1)          /* outer loop: rows start at n and shrink */
              {
                  printf("\n");
                  dn = i;
                  while (dn >= 1)     /* inner loop: prints i down to 1 */
                  {
                      printf("%d ", dn);
                      dn = dn - 1;
                  }
                  i = i - 1;
              }
              return 0;
          }

          Example: Write a program (WAP) to print the following format, which has five rows, each listing the numbers 1 to 5:
          Nested While Loop in C Programming Language
          Program:
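          #include <stdio.h>
          int main(void)
          {
              int a = 1, b = 1;
              while (a <= 5)          /* outer loop: five rows */
              {
                  b = 1;              /* restart the column counter for each row */
                  while (b <= 5)      /* inner loop: prints 1 to 5 */
                  {
                      printf("%d ", b);
                      b++;
                  }
                  printf("\n");
                  a++;
              }
              return 0;
          }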

          In the next article, I am going to discuss the Do While Loop in C Language with Examples. In this article, I tried to explain the Nested While Loop in C Programming Language with Examples. I hope you enjoyed it. I would like to have your feedback. Please post your feedback, questions, or comments about this article.

          ]]>
          4077
          Programming C – What is a Translator https://shishirkant.com/programming-c-what-is-a-translator/?utm_source=rss&utm_medium=rss&utm_campaign=programming-c-what-is-a-translator Sat, 05 Aug 2023 16:11:42 +0000 https://shishirkant.com/?p=3964

          Translators in Programming Languages

          In this article, I am going to discuss What is a Translator and its need in Programming Languages.

          What is a Translator?

          The instructions a user writes are in an English-like programming language; this text is called source code. The computer cannot understand source code directly, because the only code it understands is binary (machine) code. To convert source code into binary code, we use interface software called translators.

          Translators are system software that converts programming language code into binary format. The translators are classified into three types:

          1. Compiler
          2. Interpreter
          3. Assembler

          For better understanding please have a look at the following image.

          What is a Translator

          Compiler and interpreter are both used to convert high-level programs to machine code. Assembler is used to convert low-level programs to machine code.

          Compiler:
          Compiler

          A compiler is system software that translates an entire high-level program into binary format in a single step, before the program runs. It checks the program for errors (syntax, types, limits, ranges, etc.) and reports the lines that contain them; typically no executable is produced until those errors are fixed. Compilation takes time and memory up front, but the compiled program usually runs faster than the same program under an interpreter.

          Interpreter:
          Interpreter

          An interpreter is system software that translates and executes programming language code step by step, i.e. one statement at a time. It reads a statement, executes it, and then moves on to the next one until all the statements are done. If an error occurs, it stops at that statement. Because an error shows up as soon as the faulty statement is reached, an interpreter is often convenient during development.

          Note: The compiler translates the entire source code at once and reports the lines that contain errors, whereas the interpreter works line by line. C and C++ are compiler-based languages. Java, .NET, Python, etc. combine both approaches: the source is first compiled to an intermediate code, which is then interpreted (or compiled further) at run time. The assembler's working style is similar to the compiler's.

          Assembler:

          It is the system software that converts assembly language instructions into binary formats.

          Assembler
          Operating System:

          An Operating System (OS) is an interface between a computer user and the computer hardware. An operating system is software that performs all the basic tasks like file management, memory management, process management, handling input and output, and controlling peripheral devices such as disk drives and printers.

          Loader:

          A loader is a program that loads the machine code of a program into system memory. A locator is a program that assigns specific memory addresses to the machine code that is to be loaded into system memory.

          Linker:

          Usually, a longer program is divided into a number of smaller subprograms called modules, because smaller programs are easier to develop, test, and debug. A linker is a program that links these modules to form a single program. The linker works on machine code, so it takes over after the program has been written in the editor and the compiler has produced the machine code for each module. This process is called linking.

          ]]>
          3964
          Programming C – Introduction to Programming https://shishirkant.com/programming-c-introduction-to-programming/?utm_source=rss&utm_medium=rss&utm_campaign=programming-c-introduction-to-programming Sat, 05 Aug 2023 16:03:56 +0000 https://shishirkant.com/?p=3960 Introduction to Programming Languages:

          Are you aiming to become a software engineer one day? Do you want to develop an application that solves problems and that people all over the world would love to use? Are you passionate enough to take the big step into the world of programming? Then you are in the right place. In this article, you will get a brief introduction to programming languages. As part of this article, we are going to discuss the following pointers.

          1. Program and Programming
          2. Programming Languages
          3. Types of Software
          4. Operating Systems
          5. Compiler, Interpreter, Assembler, Loader, and Linker
          Program and Programming:

          Program: A program is a set of instructions that a computer executes. Programmers create programs by writing code that tells the computer what to do, and then build and run that code with software designed for the purpose, such as Turbo C for 'C' programs.

          Programming: Programming is the implementation of logic to facilitate the specified computing operations and functionality. In simple words, the process of writing a program is called programming.

          What is Software?

          Software is a collection of programs that uses the resources of the hardware components. A program is a set of instructions designed for a particular task.

          A set of programs is called software. Let us understand this with an example, a calculator. For each button, there is a program written behind it. That means a calculator is a collection of programs, and so we can say that a calculator is software.

          Programming Languages

          As per IT standards, software is a digitalized and automated process. Let us understand this with an example, an air conditioner (AC). If you set the timer to switch the AC off after 1 hour, then after 1 hour the AC switches off automatically. Likewise, you set the temperature of the AC using digits. Both of these things are managed by software inside the AC.

          What is Software?
          Types of Software:

          Software is classified into two types, such as System Software and Application Software. For better understanding please have a look at the below image.

          Types of Software
          System Software:

          System Software is general-purpose software designed to provide a platform for other software. It handles the functionality of hardware devices like printers, mobiles, processors, etc. System Software is classified into three types:

          • Operating System: DOS, WINDOWS, LINUX, UNIX
          • System Support: Compiler, Interpreter, Assembler
          • System Development: Linker, Loader, Editor
          Application Software: 

          Application Software is a program or group of programs designed for end-users i.e. designed for a specific task. Application Software does the functionality for business-oriented applications. Application Software is classified into two types:

          • Application-Specific: MS OFFICE, Oracle
          • General Purpose Software: Tally
          What is a language?

          Generally, languages are used to communicate with others. The languages like Odia / English / Marathi / Hindi are called human/regional languages, which are used to communicate with humans. The computer languages are used to write the programs [software] to communicate with the machines.

          What is a language?
          Types of computer languages:

          Basically, computer languages are divided into 3 types.

          1. Machine language: Created with binary code [0, 1]; very difficult for humans to read and write. Example: 11100001
          2. Low level/assembly language: Created with English-like abbreviations called MNEMONICS. Example: ADD, SUB
          3. High-level language: Created with simple English-like words. Example: if, while, return, etc.

          C is a high-level language with low-level features, hence C is also called a middle-level language. High-level features allow designing application software like calculators, calendars, media players, etc., and low-level features are used to design system software like operating systems, device drivers, translators, etc. Hence C is multi-purpose.

          ‘C’ is a high-level/middle-level programming language.


          What is a Programming Language?

          A Programming Language is a formal language comprising a set of instructions that are used to communicate with the computer. Programming Languages are classified into two types:

          • High-Level Programming Language
          • Low-level Programming Language

          For better understanding please have a look at the following image.

          Types of Programming Languages
          High-Level Programming Language: 

          The High-Level Programming Languages are syntactically similar to English and easy to understand. They are user-oriented languages. A high-level program is written as a combination of letters, digits, and symbols; these are called Micro Statements. High-level programming languages are used to develop user interface applications. Examples: C, C++, VC++, JAVA, C#, Swift, Objective C, D-Language

          Low-Level Programming Language:

          The Low-Level Programming Languages are the languages that the system can understand easily. These are system-dependent languages. There are two of them:

          1. Machine Language
          2. Assembly Language
          Machine Language:

          Machine Language is the fundamental language of the system; the system understands it directly, without any translation. It is a machine-oriented language written as sequences of the binary digits 1 and 0.

          Assembly Language:

          The Assembly Language can also be called Symbolic Language. It was introduced to make program code easier to remember than raw binary. In this language, different symbols are used to write the program. Assembly code is not directly understandable to the system, so a translator is required.

          As programmers, even if we know a programming language, we cannot interact with the computer directly, because the computer understands binary code only.

          In this case, a translator is used. When the programmer writes instructions in a programming language, the translator converts that code into binary format, and the resulting binary instructions make up the application or software.

          ]]>
          3960
          C Programming-Introduction to Language https://shishirkant.com/c-programming-introduction-to-language/?utm_source=rss&utm_medium=rss&utm_campaign=c-programming-introduction-to-language Sat, 05 Aug 2023 15:30:55 +0000 https://shishirkant.com/?p=3956 Introduction to Language

          In this article, I am going to give a brief introduction to the language, computer language, programming language and why we need a programming language, and what is the job of a programmer.

          What is language?

          A language is nothing but a set of instructions. Take English or Hindi: these are languages we use to communicate. If we want to communicate with another person, we pass instructions using a particular language. While using a language, however, we need to follow its rules. For example, if I want to speak in English, I have to form sentences, and to form a sentence I need to know the grammar.

          What is computer language?

          A computer language is also a set of instructions. In other words, it is the language in which we write the programs that we give to a computer to understand.

          What is computer language?
          What is the need for computer language?

          If one person wants to communicate with another person, they have to share information, which is nothing but passing instructions. For general communication, people use general languages like English, Telugu, Hindi, etc.

          Why communicate with a machine at all? Consider a simple question: what is the factorial of 5? Everyone can easily answer that it is 120, because it is a simple calculation. If my next question is, what is the factorial of 120? No one can answer quickly, because it is a much more complex operation. A computer performs such complex operations using programs. The computer, however, understands only binary language. That's why there is a need for a programming language.

          What is the need for computer language?

          If a person wants to communicate with the computer, the instructions must ultimately reach it as machine code, because a computer can understand only machine code. Writing machine code by hand is impractical, so we first learn a programming language properly. There are many programming languages, such as C, C++, Java, and C#, and all of these are high-level languages.

          What is a language

          To communicate with a computer, we write programs in a programming language. If you want to communicate with the computer using the C language, you should first learn the C language well. After learning the language, we write programs; a program is a set of instructions. For example: a equals 10, b equals 20, and c equals a plus b. Here we take two variables holding two values, add them, and print the result.

          What is computer language

          The compiler converts all these instructions into binary language, i.e. machine code. Once the machine code is ready, it is passed as input to the computer, which runs it and shows the result on the screen.

          What is programming language

          The computer then produces the output. This is the process of communicating with the computer: we cannot communicate with it directly, so we go through a programming language and a translator. We communicate with the computer in order to perform complex operations in an easy way.

          why we need a programming language

          Imagine that no ATM facility is available and I want to withdraw some money from my account. It is a long process: first I have to visit the bank, then complete several formalities, and only then do I get the money. If a machine is available, I get the money within a minute or two.

          So, machines make our tasks much easier, and this is the reason we communicate with machines. We cannot communicate with the computer directly, because we cannot practically write instructions in binary language. So we first learn a programming language, then write programs in it, and then convert them into binary instructions using the compiler.

          What is an Interface?

          It is not always necessary to be a programmer to communicate with machines. An end user can also communicate with a machine using an interface. An interface lets us perform our tasks without knowing the background details.

          Consider an end user who wants to perform an ATM transaction. The end user communicates through the interface. For example, if the end user understands English, the end user selects English, and all the instructions then appear in English while the operations are performed.

          What is an Interface?

          The end user does not need to know what is happening in the background. For example, when you drive a car and want to go faster, you press the accelerator. The speed increases, but you do not need to know what happens inside the car to make that so.

          Now let us see how this interface communicates with different machines to complete the operation. When the end user enters the amount to withdraw, the ATM communicates with a server machine, which in turn communicates with a database machine. How these machines communicate is not something the end user needs to know.

          The end user leaves the ATM center with the money. If the transaction fails for any reason, the end user contacts the bank management, and the bank management contacts the programmer, because the programmer provides the interface.

          So, as programmers, our job is less about talking to machines ourselves and more about developing applications, that is, providing interfaces through which every end user can easily interact with the machine. That is our motto.

          what is the job of a programmer
          ]]>
          3956