Python Automation and Machine Learning for EM and ICs

An Online Book, Second Edition by Dr. Yougui Liao (2019)

loc[] and iloc[]

The primary way to retrieve data from a DataFrame is by label. Use the loc attribute (short for "location") to specify the rows and columns to access:

df.loc[row_selection, column_selection]  

The loc attribute accepts slice notation, so a bare colon selects all rows or all columns. Unlike an ordinary Python slice, a label slice passed to loc includes both endpoints. You can also pass a list of labels, or a single row or column label, for a more targeted selection.
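The selection forms described above can be sketched as follows. This is a minimal illustration, not code from the book: the DataFrame, its row labels (w1 to w4), and its column names are made up for the example.

```python
import pandas as pd

# Hypothetical data; labels and column names are assumptions for illustration.
df = pd.DataFrame(
    {"voltage": [1.0, 1.2, 1.5, 1.8], "current": [0.10, 0.12, 0.15, 0.18]},
    index=["w1", "w2", "w3", "w4"],
)

# A bare colon selects every row; a single column label returns a Series.
all_rows_one_col = df.loc[:, "voltage"]

# A label slice includes BOTH endpoints ("w2" and "w3").
label_slice = df.loc["w2":"w3", :]

# Lists of labels return a DataFrame, even for a single column.
picked = df.loc[["w1", "w4"], ["current"]]

# A single row label plus a single column label returns one value.
single_cell = df.loc["w3", "current"]

print(label_slice)
print(picked)
print(single_cell)
```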

It is crucial to distinguish between a DataFrame, which can have one or more columns, and a Series. Even a DataFrame with a single column is two-dimensional, whereas a Series is one-dimensional. Both have an index, but only the DataFrame has column headers. When a column is selected as a Series, the column header becomes the Series name. Many functions and methods work on both, but arithmetic behaves differently: an operation between two Series aligns on the index, whereas an operation between a DataFrame and a Series aligns the Series index with the DataFrame's column labels by default.
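A short sketch of these differences, using a made-up two-column DataFrame (the column names a and b are assumptions for illustration):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

col_series = df["a"]    # single brackets: one-dimensional Series
col_frame = df[["a"]]   # double brackets: two-dimensional DataFrame

print(col_series.ndim, col_frame.ndim)  # dimensions of each object
print(col_series.name)                  # the column header becomes the Series name

# Series with Series: values are aligned on the shared row index.
diff_series = df["b"] - df["a"]

# DataFrame with Series: the Series index is matched against the column
# labels, so subtracting the first row works row by row.
diff_rows = df - df.iloc[0]

# To combine a DataFrame with a column-like Series, align on the row index
# explicitly with axis=0.
diff_cols = df.sub(df["a"], axis=0)
```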

For instance, to print only the first matched value in the code on page0008, use:

value = filtered_rows['Yougui'].iloc[0]
print(value)
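Since filtered_rows is defined on another page, the following self-contained sketch rebuilds a comparable situation. The "key" column and the filter condition are hypothetical; only the column name "Yougui" comes from the snippet above. It shows why iloc is the safer choice after filtering: the surviving rows keep their original labels, so position and label no longer agree.

```python
import pandas as pd

# Hypothetical data; only the "Yougui" column name is taken from the text.
df = pd.DataFrame({"key": ["a", "b", "a"], "Yougui": [10, 20, 30]})

# Boolean filtering keeps the original row labels (here 0 and 2).
filtered_rows = df[df["key"] == "a"]

# Guard against an empty result before indexing by position.
if filtered_rows.empty:
    value = None
else:
    value = filtered_rows["Yougui"].iloc[0]   # first match by position

second = filtered_rows["Yougui"].iloc[1]      # second match by position
# filtered_rows["Yougui"].loc[1] would raise KeyError: label 1 was filtered out.

print(value, second)
```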

================================================

This Python script extracts the first and last entries of a specified column in a structured dataset. Using the pandas library, it reads a file containing tabular data, such as a CSV or Excel file, and targets the column labeled "globalsino." It verifies that the column exists, then retrieves the values in the first and last rows with .iloc[0] and .iloc[-1]. This gives a quick look at the boundary values of a dataset, which is useful for identifying trends, verifying data integrity, or initializing further data processing in experimental analyses.
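The original script is not reproduced here, so the following is only a minimal sketch of the approach described. The function name first_and_last and the sample data are assumptions; an in-memory string stands in for the input file so the example runs on its own.

```python
import io
import pandas as pd

# Stand-in for a real file; with a file on disk, pass its path to
# pd.read_csv (or use pd.read_excel for an Excel workbook).
csv_text = "globalsino,other\n3.1,a\n4.7,b\n5.9,c\n"
df = pd.read_csv(io.StringIO(csv_text))


def first_and_last(frame, column):
    """Return the first and last values of `column`, selected by position."""
    if column not in frame.columns:
        raise KeyError(f"column {column!r} not found")
    if frame.empty:
        raise ValueError("the table has no rows")
    col = frame[column]
    return col.iloc[0], col.iloc[-1]


first, last = first_and_last(df, "globalsino")
print(first, last)
```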

============================================

Write to a specific cell (loc[]) in a CSV file. [Figures: code; output]
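As a rough sketch of writing to one cell with loc[] and saving the result, assuming a small table with hypothetical columns "name" and "value" (in-memory text stands in for the CSV files):

```python
import io
import pandas as pd

# Stand-in for the input CSV file; the contents are made up.
csv_text = "name,value\nA,1\nB,2\nC,3\n"
df = pd.read_csv(io.StringIO(csv_text))

# Address the cell by row label and column label, then assign.
df.loc[1, "value"] = 20

# Write the table back out as CSV text; pass a file path to to_csv
# to write a file instead. index=False leaves out the row labels.
out_text = df.to_csv(index=False)
print(out_text)
```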

================================================

loc[] and iloc[], namely explicit index and implicit index (similar to NumPy indexing). [Figures: code; output]
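The difference between the explicit index (loc) and the implicit index (iloc) is easiest to see when the labels are integers that do not match the positions. The Series below is made up for illustration:

```python
import pandas as pd

# Integer labels 1, 3, 5 deliberately differ from positions 0, 1, 2.
s = pd.Series(["a", "b", "c"], index=[1, 3, 5])

by_label = s.loc[1]       # explicit index: the element labeled 1
by_position = s.iloc[1]   # implicit index: the element at position 1

label_slice = s.loc[1:3]      # labels 1 through 3, endpoint included
position_slice = s.iloc[1:3]  # positions 1 and 2, endpoint excluded

# iloc follows the same positional rules as a NumPy array.
arr = s.to_numpy()
numpy_slice = arr[1:3]

print(by_label, by_position)
print(label_slice.tolist(), position_slice.tolist(), numpy_slice.tolist())
```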

===================================================

Get the maximum and minimum value of a column and its index in pandas. [Figures: code; output]
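One possible way to do this, sketched with a hypothetical "temp" column: max() and min() return the values, idxmax() and idxmin() return the row labels, and NumPy's argmax() gives the integer position for use with iloc.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"temp": [21.5, 19.0, 25.3, 22.1]},
    index=["mon", "tue", "wed", "thu"],
)

max_value = df["temp"].max()
min_value = df["temp"].min()

max_label = df["temp"].idxmax()   # row label of the maximum, for use with loc
min_label = df["temp"].idxmin()   # row label of the minimum

max_position = df["temp"].to_numpy().argmax()  # integer position, for iloc

print(max_value, max_label, df.loc[max_label, "temp"])
print(min_value, min_label)
print(df.iloc[max_position])
```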

===================================================

Machine learning: KNN algorithm. [Figures: code; input; output]
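The book's KNN code is not reproduced here. The sketch below is a generic k-nearest-neighbors classifier, written only to show where iloc typically enters such a workflow: separating the feature columns from the label column by position. The data, the column names, and the function knn_predict are all assumptions.

```python
from collections import Counter

import numpy as np
import pandas as pd

# Hypothetical training table: two feature columns, then the class label.
data = pd.DataFrame({
    "x1": [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],
    "x2": [1.0, 0.9, 1.1, 5.0, 5.1, 4.9],
    "label": ["low", "low", "low", "high", "high", "high"],
})

X = data.iloc[:, :-1].to_numpy()   # every column except the last
y = data.iloc[:, -1].to_numpy()    # the last column only


def knn_predict(X_train, y_train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = np.linalg.norm(X_train - query, axis=1)  # Euclidean distance
    nearest = np.argsort(distances)[:k]                  # positions of the k closest
    votes = Counter(y_train[nearest])
    return votes.most_common(1)[0][0]


print(knn_predict(X, y, np.array([1.1, 1.0])))
print(knn_predict(X, y, np.array([4.9, 5.0])))
```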

===================================================

Machine learning: KNN algorithm (version 3, with more functions added). [Figures: code, in two parts; input; output]

===================================================

Write a single cell by rule (add one more cell at the end of a specific column, then write a number into it). [Figures: code, input, and output for each run]

Input CSV file: headersOnly.csv
Output: OutputCSV.csv

The code was then modified three times for further tests, each run reading the output of the previous one:

Input: OutputCSV.csv, output: OutputCSV2.csv
Input: OutputCSV2.csv, output: OutputCSV3.csv
Input: OutputCSV3.csv, output: OutputCSV4.csv

===================================================
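The rule of adding one more cell at the end of a specific column and writing a number into it could be sketched as follows. This is an assumption about the intended behavior, not the book's code: the helper append_to_column and the column names "A" and "B" are hypothetical, and the contents of headersOnly.csv are unknown, so an empty DataFrame with headers only stands in for it. The helper assumes the default 0..n-1 row index that read_csv produces.

```python
import pandas as pd


def append_to_column(df, column, value):
    """Write `value` into the first cell below the last filled cell of `column`."""
    if column not in df.columns:
        raise KeyError(f"column {column!r} not found")
    df = df.reset_index(drop=True)            # work on a copy labeled 0..n-1
    filled = df[column].notna().to_numpy()
    # Row just below the last non-empty cell, or row 0 if the column is empty.
    next_row = int(filled.nonzero()[0][-1]) + 1 if filled.any() else 0
    if next_row >= len(df):
        df = df.reindex(range(len(df) + 1))   # add one empty row at the bottom
    df.loc[next_row, column] = value
    return df


# Stand-in for a headers-only CSV file (hypothetical column names).
table = pd.DataFrame(columns=["A", "B"])

# Chain the runs the way the output of one test feeds the next.
table = append_to_column(table, "A", 1)
table = append_to_column(table, "B", 7)
table = append_to_column(table, "A", 2)
print(table)
```

Saving each intermediate table with to_csv(path, index=False) and reading it back with read_csv would reproduce the chain of input and output files listed above.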